Add parallelization support for T5EncoderModel #9082
Conversation
add model parallelism to T5EncoderModel
Very cool! Could you also enable the parallelization tests for these models? You can check how it was done in the initial model parallel PR; here's the commit related to the tests. You can just add the
Thanks for the tip.
This LGTM. Looking into it, it seems we have an error in T5Stack, as it is creating the device map with torch.cuda.device_count() rather than the range of that value like you're doing here. Since we're always passing the device map to T5Stack (it's never used as a standalone model) we don't see it, but it doesn't seem correct. What do you think? If you think this is true, do you mind adding a range in T5Stack so that we can merge it together? Thanks!
Also it would be great if you could run
Yes, you are correct, T5Stack should also use range, since the get_device_map function applies len to it.
Done, and the code quality checks pass.
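The device-map issue discussed above can be sketched as follows. This is a simplified stand-in for the real get_device_map helper in transformers, not its exact implementation; the point is that the helper applies len to its devices argument, so it must receive an iterable of device ids (e.g. range(torch.cuda.device_count())), not the raw device count.

```python
import math

def get_device_map(n_layers, devices):
    """Evenly assign layer indices to the given device ids (simplified sketch)."""
    devices = list(devices)  # len() is applied here, so an int count would fail
    n_blocks = int(math.ceil(n_layers / len(devices)))
    layers = list(range(n_layers))
    return {
        device: layers[i * n_blocks:(i + 1) * n_blocks]
        for i, device in enumerate(devices)
    }

# Buggy call (what T5Stack effectively did): passes a count, not an iterable
# get_device_map(12, torch.cuda.device_count())        # TypeError
# Fixed call, as in this PR: pass the range of device ids
# get_device_map(12, range(torch.cuda.device_count()))
print(get_device_map(12, range(4)))
```

With 12 layers and 4 devices, each device receives a contiguous block of 3 layers.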
Wonderful! |
What does this PR do?
Extend T5EncoderModel to support model parallelization across multiple GPUs.
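Usage would follow the existing T5 model-parallel API. A hedged sketch: the layer count (12, as in t5-base) and the two-GPU split are illustrative assumptions, and the parallelize call itself requires multiple GPUs, so it is only shown commented out here; the runnable part just builds and sanity-checks a device map.

```python
# A device map assigns encoder block indices to GPU ids.
# Splitting 12 blocks (t5-base) across two GPUs is an illustrative choice.
device_map = {
    0: list(range(0, 6)),   # first six encoder blocks on GPU 0
    1: list(range(6, 12)),  # remaining six blocks on GPU 1
}

# Sanity check: every block appears exactly once across all devices.
assigned = sorted(sum(device_map.values(), []))
assert assigned == list(range(12))

# With multiple GPUs available, this would spread the encoder across them:
# from transformers import T5EncoderModel
# model = T5EncoderModel.from_pretrained("t5-base")
# model.parallelize(device_map)
# ... run inference ...
# model.deparallelize()  # move everything back to a single device
print(assigned)
```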
Before submitting
- Did you read the contributor guideline, Pull Request section?
- Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
- Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Who can review?
T5: @patrickvonplaten