
Purpose of replace_with_xformers_attention() function #115

Open
cramraj8 opened this issue Apr 19, 2024 · 2 comments


cramraj8 commented Apr 19, 2024

Hi @MXueguang,

I wonder what the purpose of having replace_with_xformers_attention() defined in utils.py is, because I am getting the following error:

AttributeError: 'LlamaAttention' object has no attribute 'num_key_value_heads'

Is the self.num_key_value_heads value used in replace_with_xformers_attention() defined somewhere else?

@MXueguang (Contributor) commented

I was trying to use FlashAttention with replace_with_xformers_attention(), but with recent transformers versions I believe LLaMA can use FlashAttention directly by specifying attn_implementation when loading the pretrained model, so this function is no longer necessary.
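For example, a minimal sketch of loading LLaMA with FlashAttention-2 through the attn_implementation argument (assuming a recent transformers release with FlashAttention-2 support and the flash-attn package installed; the checkpoint name below is only illustrative):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustrative checkpoint; substitute the model used in your setup.
model_name = "meta-llama/Llama-2-7b-hf"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,              # FlashAttention-2 expects fp16/bf16
    attn_implementation="flash_attention_2", # replaces the manual monkey-patch
)
```

With this, no monkey-patching of LlamaAttention via replace_with_xformers_attention() is needed.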

@cramraj8 (Author) commented

Got it. Thank you.
