
filter flash_attn optional imports loading remote code #30954

Merged

Conversation

eaidova
Contributor

@eaidova eaidova commented May 22, 2024

What does this PR do?

In the code of some models available via remote code, the flash_attn module is optionally imported without a try-except block.
Examples:
phi3-vision - https://huggingface.co/microsoft/Phi-3-vision-128k-instruct/blob/main/modeling_phi3_v.py#L52
orion-14b - https://huggingface.co/OrionStarAI/Orion-14B-Chat/blob/main/modeling_orion.py#L36
deepseek-moe - https://huggingface.co/deepseek-ai/deepseek-moe-16b-base/blob/main/modeling_deepseek.py#L54
nanoLLaVA - https://huggingface.co/qnguyen3/nanoLLaVA/blob/main/modeling_llava_qwen2.py#L861

Loading such a model in an environment where the flash_attn package is not installed fails when the trust_remote_code flag is set. Installing and importing this package can be problematic in some environments (e.g. CPU-only environments without CUDA, where torch is installed for CPU only). This PR updates the dependency-search logic used when checking dynamic modules for loading.
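The fix boils down to stripping such guarded imports from the remote module's source before its dependencies are collected. A minimal sketch of the idea (the function name and the exact regex here are illustrative, not the merged implementation):

```python
import re


def filter_flash_attn_imports(content: str) -> str:
    """Drop `from flash_attn ...` imports guarded by an
    is_flash_attn_*_available() check, so flash_attn is not treated
    as a required dependency of the remote module."""
    return re.sub(
        r"if is_flash_attn_[0-9]+_available\(\):\s*(from flash_attn\s*.*\s*)+",
        "",
        content,
        flags=re.MULTILINE,
    )


remote_source = (
    "if is_flash_attn_2_available():\n"
    "    from flash_attn import flash_attn_func\n"
    "\n"
    "x = 1\n"
)
# The guarded import block is removed; the rest of the module is untouched.
print(filter_flash_attn_imports(remote_source))
```

With the guard stripped, the dependency scanner no longer sees `flash_attn` as a required package, so the model can be loaded on CPU-only machines.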

Fixes # (issue)

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@amyeroberts
Collaborator

cc @Rocketknight1 as I think you were working on a related issue

@andrei-kochin

@amyeroberts @Rocketknight1 do you have any good news for us here?

@huggingface huggingface deleted a comment from github-actions bot Jun 28, 2024
@itikhono

itikhono commented Jul 2, 2024

@amyeroberts @Rocketknight1 Hi! any updates on this?
we are working on improving SDPA operation support on openvino side, and using these models for testing our changes:

phi3-vision - https://huggingface.co/microsoft/Phi-3-vision-128k-instruct/blob/main/modeling_phi3_v.py#L52
orion-14b - https://huggingface.co/OrionStarAI/Orion-14B-Chat/blob/main/modeling_orion.py#L36
deepseek-moe - https://huggingface.co/deepseek-ai/deepseek-moe-16b-base/blob/main/modeling_deepseek.py#L54
nanoLLaVA - https://huggingface.co/qnguyen3/nanoLLaVA/blob/main/modeling_llava_qwen2.py#L861

But unfortunately we encountered the same issue as described in the ticket. Do you have any plans to merge the fix?

@amyeroberts
Collaborator

Gentle ping @Rocketknight1

Comment on lines 152 to 153
# filter out imports under is_flash_attn_2_available block for avoid import issues in cpu only environment
content = re.sub(r"if is_flash_attn_*available\(\):\s*(from flash_attn\s*.*\s*)+", "", content, flags=re.MULTILINE)

@CuriousPanCake CuriousPanCake Aug 2, 2024


I believe the regex can be improved to r"if is_flash_attn_[0-9]+_available\(\):\s*(from flash_attn\s*.*\s*)+" to capture the import properly.
I was trying to use this change locally for OrionStarAI/Orion-14B-Chat and the regex did not match the flash_attn imports, hence the imports were not removed.
Most probably the same holds for the other models mentioned in the ticket, as the import strings are the same.
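The distinction matters because `_*` in the original pattern matches only a run of underscores, so a digit such as the `2` in `is_flash_attn_2_available` breaks the match. A quick check of the two pattern variants discussed here:

```python
import re

# Pattern from the PR as originally written vs. the suggested fix.
old_pattern = r"if is_flash_attn_*available\(\):"
new_pattern = r"if is_flash_attn_[0-9]+_available\(\):"

guard = "if is_flash_attn_2_available():"

# `_*` only matches underscores, so the digit `2` prevents a match.
print(bool(re.search(old_pattern, guard)))  # False
print(bool(re.search(new_pattern, guard)))  # True
```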

Collaborator


Let's try to make it work for as many as possible! AST may be of help here as well.

Contributor Author


Sorry, but I lost all motivation to finish this PR after 3 months of silence... I'm not good with regular expressions, and I have no time to explore all models on the hub to find all possible patterns...

I'll apply the suggestion from @CuriousPanCake, but I can't promise to look at anything beyond that unless you have specific suggestions.

Collaborator


Sorry on our side for the late review; I'll help you get this merged. It already fixes most of the models, which is good IMO!

@eaidova eaidova force-pushed the ea/avoid_flash_attn_remote_code branch from c29889a to 844ca87 on August 6, 2024 08:19
@ArthurZucker
Collaborator

Ping me if you need help to fix the CI / a review 🤗

@eaidova
Contributor Author

eaidova commented Aug 7, 2024

@ArthurZucker thank you for the review. I fixed the code-style CI issue, but I have no idea about the flax tests (I do not think they are in any way related to my changes).

Member

@Rocketknight1 Rocketknight1 left a comment


Hey! The fault for the extremely slow review is mine, and I'm sorry, especially because this issue affected other users besides you, and I really should have gotten it merged before now!

The tl;dr is that I think it's ready for merge as-is, because the only relevant test in our codebase is is_flash_attn_2_available. However, I've suggested a change that should catch anything that looks like is_flash_attn[...]available, which should cover us if any custom code is calling something similar, but without causing unwanted side effects.

Let me know if you're happy with the change, and then we'll try to merge this ASAP - you definitely don't need to trawl through the code after having to wait all this time. Thank you for the PR!
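For illustration, a broadened guard pattern in the spirit of that suggestion (the character class here is an assumption for the sketch, not necessarily the exact merged regex) would accept any call of the form `is_flash_attn<...>available()`:

```python
import re

# Hypothetical broadened pattern: match anything that looks like
# is_flash_attn<something>available(), not only numbered variants.
pattern = r"if is_flash_attn[a-zA-Z0-9_]*available\(\):"

for guard in (
    "if is_flash_attn_available():",
    "if is_flash_attn_2_available():",
    "if is_flash_attn_greater_or_equal_2_10():",  # no trailing "available"
):
    print(guard, "->", bool(re.match(pattern, guard)))
```

The third guard does not end in `available()`, so a pattern scoped this way leaves unrelated version checks alone, avoiding the unwanted side effects mentioned above.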

Review comment on src/transformers/dynamic_module_utils.py (outdated, resolved)
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
@Rocketknight1
Member

@eaidova thanks for committing the suggestion, merging now!

@Rocketknight1 Rocketknight1 merged commit cc832cb into huggingface:main Aug 8, 2024
21 checks passed
@eaidova eaidova deleted the ea/avoid_flash_attn_remote_code branch August 18, 2024 04:41