
fix fuyu device_map compatibility #29880

Merged · 1 commit into huggingface:main on Mar 27, 2024

Conversation

@SunMarc (Member) commented Mar 26, 2024

What does this PR do?

This PR fixes the multi-GPU issue users were experiencing in #29848. It happened because a later PR effectively reverted the changes I had previously made to make the model compatible with device_map.
This is not very elegant, since we need to modify the forward a bit, but it is an exception: we are dealing with the boundary between the output of the vision model and the input of the language model.
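For illustration, here is a minimal sketch of the kind of device alignment such a fix needs. This is an assumption-laden sketch, not the actual diff: the function and tensor names (`merge_vision_and_text_embeddings`, `word_embeddings`, `patch_embeddings`, `patch_indices`) are hypothetical.

```python
import torch

def merge_vision_and_text_embeddings(word_embeddings, patch_embeddings, patch_indices):
    # Under device_map="auto", the vision tower and the language model's
    # embedding layer may be placed on different GPUs, so the vision output
    # must be moved onto the text-embedding device before merging.
    patch_embeddings = patch_embeddings.to(word_embeddings.device)
    patch_indices = patch_indices.to(word_embeddings.device)

    merged = word_embeddings.clone()
    # Positions with a non-negative index are image-patch placeholders;
    # overwrite them with the corresponding patch embeddings.
    mask = patch_indices >= 0
    merged[mask] = patch_embeddings[patch_indices[mask]]
    return merged
```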

@HuggingFaceDocBuilderDev commented

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@ArthurZucker (Collaborator) left a comment


Thanks 🤗

@SunMarc merged commit 31c575b into huggingface:main on Mar 27, 2024 (18 checks passed).
@SunMarc deleted the fix-fuyu-device-map branch on March 27, 2024 at 09:18.
@grahamannett (Contributor) commented

FYI, this actually still seems to be broken: image_patches_indices can end up on the wrong device for gather_continuous_embeddings.
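For context, the remaining mismatch can be worked around by moving the indices onto the embeddings' device before the gather. A hedged sketch follows; the method and argument names come from the Fuyu modeling code, but the placement of the cast is an assumption, not a merged fix.

```python
# Inside FuyuForCausalLM.forward, before merging image and text embeddings:
image_patches_indices = image_patches_indices.to(inputs_embeds.device)
inputs_embeds = self.gather_continuous_embeddings(
    word_embeddings=inputs_embeds,
    continuous_embeddings=patch_embeddings,
    image_patch_input_indices=image_patches_indices,
)
```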

@grahamannett (Contributor) commented May 10, 2024

@ArthurZucker and @SunMarc, it might be helpful to add a test for this and to specify what sort of GPU setup the test uses. It took me a few days to pinpoint, but I believe I get an error with device_map='auto' on 8x GPUs but not on 2x (smaller) GPUs.

It seems that with more GPUs, multimodal models are less likely to have the relevant embeddings on the same device for many HF models, and I'm not sure whether that would be considered a bug or is specific to how infer_auto_device_map is used by PreTrainedModel. I haven't looked too deeply into whether other models cast embeddings to the same device, or what the overhead of that is compared to having all embeddings on the same device to start with (from my understanding this could also be an issue with FSDP, since transformer_auto_wrap_policy is used with decoder blocks?).
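To make the GPU-count dependence concrete, here is a rough sketch using accelerate's infer_auto_device_map. It assumes `model` is an already-loaded multimodal PreTrainedModel, and the memory figures are illustrative, not a specific repro.

```python
from accelerate import infer_auto_device_map

# With 2 large GPUs, the vision tower and the text embedding layer often
# land on the same device; with 8 smaller GPUs the model is sliced more
# finely, so cross-device hand-offs between them become much more likely.
device_map_2gpu = infer_auto_device_map(model, max_memory={0: "40GiB", 1: "40GiB"})
device_map_8gpu = infer_auto_device_map(model, max_memory={i: "10GiB" for i in range(8)})

print(device_map_2gpu)  # inspect where vision vs. embedding modules were placed
print(device_map_8gpu)
```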

itazap pushed a commit that referenced this pull request May 14, 2024