filter flash_attn optional imports loading remote code #30954
Conversation
cc @Rocketknight1 as I think you were working on a related issue
@amyeroberts @Rocketknight1 do you have any good news for us here?
@amyeroberts @Rocketknight1 Hi! Any updates on this? We are working with phi3-vision (https://huggingface.co/microsoft/Phi-3-vision-128k-instruct/blob/main/modeling_phi3_v.py#L52), but unfortunately we encountered the same issue as described in the ticket. Do you have any plans to merge the fix?
Gentle ping @Rocketknight1
# filter out imports under is_flash_attn_2_available blocks to avoid import issues in CPU-only environments
content = re.sub(r"if is_flash_attn_*available\(\):\s*(from flash_attn\s*.*\s*)+", "", content, flags=re.MULTILINE)
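For illustration, a minimal sketch of what this substitution is meant to do; the `content` value is a hypothetical remote modeling source, not taken from the PR:

```python
import re

# hypothetical remote modeling source with a guarded flash_attn import
content = (
    "import torch\n"
    "if is_flash_attn_available():\n"
    "    from flash_attn import flash_attn_func\n"
)

# the guarded block is stripped so that the later dependency scan
# does not treat flash_attn as a required package
content = re.sub(r"if is_flash_attn_*available\(\):\s*(from flash_attn\s*.*\s*)+", "", content, flags=re.MULTILINE)
print(content)  # only "import torch" remains
```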
I believe the regex can be improved to r"if is_flash_attn_[0-9]+_available\(\):\s*(from flash_attn\s*.*\s*)+" to capture the import properly.
I was trying to use this change locally for OrionStarAI/Orion-14B-Chat and the regex did not match the flash_attn imports properly, hence the imports were not removed.
Most probably the same holds for the other models mentioned in the ticket, as the import strings are the same.
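A quick check illustrates the mismatch; the snippet below is a hypothetical example in the shape of the Orion-14B code, not copied from it:

```python
import re

snippet = "if is_flash_attn_2_available():\n    from flash_attn import flash_attn_func\n"

original = r"if is_flash_attn_*available\(\):\s*(from flash_attn\s*.*\s*)+"
suggested = r"if is_flash_attn_[0-9]+_available\(\):\s*(from flash_attn\s*.*\s*)+"

# `_*` only matches underscores, so the version digit in
# `is_flash_attn_2_available` breaks the match
print(re.search(original, snippet))   # None
print(re.search(suggested, snippet))  # <re.Match ...>
```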
Let's try to make it work for as many as possible! AST may be of help here as well
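As a rough sketch of that AST idea (an illustration of the suggestion, not what the PR ends up doing), Python's `ast` module can drop the guarded blocks without any regex:

```python
import ast

source = (
    "import torch\n"
    "if is_flash_attn_2_available():\n"
    "    from flash_attn import flash_attn_func\n"
)

tree = ast.parse(source)
kept = []
for node in tree.body:
    # drop any top-level `if is_flash_attn...available():` block
    if (
        isinstance(node, ast.If)
        and isinstance(node.test, ast.Call)
        and isinstance(node.test.func, ast.Name)
        and node.test.func.id.startswith("is_flash_attn")
        and node.test.func.id.endswith("available")
    ):
        continue
    kept.append(node)

print(ast.unparse(ast.Module(body=kept, type_ignores=[])))  # prints "import torch"
```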
Sorry, but I lost all motivation to finish this PR after 3 months of silence... I'm not good with regular expressions, and I don't have time to explore all the models on the Hub to find every possible pattern.
I'll apply the suggestion from @CuriousPanCake, but I can't promise to look at anything beyond that unless you have specific suggestions.
Sorry on our side for the late review. I'll help you get this merged; it's already good IMO, as it fixes most of the models!
Ping me if you need help to fix the CI / a review 🤗
@ArthurZucker thank you for the review. I fixed the code-style CI issue, but I have no idea about the flax tests (I don't think they are in any way related to my changes)
Hey! The fault for the extremely slow review is mine, and I'm sorry, especially because this issue affected other users besides you, and I really should have gotten it merged before now!
The tl;dr is that I think it's ready to merge as-is, because the only relevant test in our codebase is `is_flash_attn_2_available`. However, I've suggested a change that should catch anything that looks like `is_flash_attn[...]available`, which should cover us if any custom code is calling something similar, without causing unwanted side effects.
Let me know if you're happy with the change, and then we'll try to merge this ASAP - you definitely don't need to trawl through the code after having to wait all this time. Thank you for the PR!
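A minimal sketch of a broadened pattern along those lines; the exact character class is an assumption for illustration, not necessarily the pattern that was merged:

```python
import re

# assumed broadening: match any is_flash_attn...available variant
pattern = r"if is_flash_attn[a-zA-Z0-9_]*available\(\):\s*(from flash_attn\s*.*\s*)+"

for name in ("is_flash_attn_available", "is_flash_attn_2_available", "is_torch_available"):
    snippet = f"if {name}():\n    from flash_attn import flash_attn_func\n"
    print(name, re.search(pattern, snippet) is not None)
# is_flash_attn_available True, is_flash_attn_2_available True, is_torch_available False
```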
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
@eaidova thanks for committing the suggestion, merging now!
What does this PR do?
In the code of some models available as remote code, the flash_attn module is imported optionally without a try/except block.
Examples:
phi3-vision - https://huggingface.co/microsoft/Phi-3-vision-128k-instruct/blob/main/modeling_phi3_v.py#L52
orion-14b - https://huggingface.co/OrionStarAI/Orion-14B-Chat/blob/main/modeling_orion.py#L36
deepseek-moe - https://huggingface.co/deepseek-ai/deepseek-moe-16b-base/blob/main/modeling_deepseek.py#L54
nanoLLAVA- https://huggingface.co/qnguyen3/nanoLLaVA/blob/main/modeling_llava_qwen2.py#L861
Loading such a model with the trust_remote_code flag fails in an environment where the flash_attn package is not installed. It may be problematic to install and import this package in some environments (e.g. machines without CUDA where torch is installed for CPU only). This PR updates the dependency-search logic used when loading dynamic modules.
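For context on why the unguarded block breaks loading: transformers scans a remote module's source for import statements to decide which packages are required, so a plain `from flash_attn import ...` makes flash_attn look like a hard dependency even though the runtime guard would skip it on CPU. The scan below is an illustrative simplification, not the library's exact code:

```python
import re

source = (
    "if is_flash_attn_2_available():\n"
    "    from flash_attn import flash_attn_func\n"
)

# simplified stand-in for the dependency scan: collect top-level package names
imports = re.findall(r"^\s*(?:from|import)\s+([a-zA-Z0-9_]+)", source, flags=re.MULTILINE)
print(imports)  # ['flash_attn'] -> reported as missing on CPU-only machines without the filter
```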
Fixes # (issue)
Before submitting
- Did you read the contributor guideline, Pull Request section?
- Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
- Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.