Fix: Support model parallelism in HF transformer #3459
Conversation
Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>
Force-pushed from 74ef248 to 7ed7893 (Compare)
if self.model._no_split_modules:
    self.device_map = "auto"
Should we call infer_auto_device_map here?
I think functionality-wise it's the same.
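For context, accelerate's infer_auto_device_map greedily assigns modules to devices based on available memory, which is also what device_map="auto" does under the hood. A simplified pure-Python sketch of that idea (hypothetical names and greedy strategy only, not the actual accelerate API):

```python
def greedy_device_map(layer_sizes, device_capacities):
    """Assign layers to devices in order, moving to the next device
    when the current one runs out of capacity.

    layer_sizes: list of (layer_name, size) pairs, in model order.
    device_capacities: dict mapping device name -> capacity, in priority order.
    """
    device_map = {}
    devices = list(device_capacities)
    di = 0
    free = device_capacities[devices[di]]
    for name, size in layer_sizes:
        # Advance to the next device until the layer fits.
        while size > free:
            di += 1
            if di >= len(devices):
                raise ValueError("model does not fit on the given devices")
            free = device_capacities[devices[di]]
        device_map[name] = devices[di]
        free -= size
    return device_map
```
The real infer_auto_device_map additionally respects `_no_split_modules`, dtype sizes, and CPU/disk offload tiers; this sketch only shows the greedy placement idea.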
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: gavrishp, yuzisun. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
* Support model parallelism in HF transformer
* Support models that don't support split
* fix padding
* fix defaults
* set cuda as default
* set cuda as default
* fix lint
* update automodel
* fix review comment
* update review comment
* update comment

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>
Signed-off-by: tjandy98 <3953059+tjandy98@users.noreply.github.com>
What this PR does / why we need it:
Includes device_map changes, setting it to "auto" to enable model parallelism and prevent OOM errors from trying to fit the model onto a single device.
Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged): Fixes #
Type of changes
Please delete options that are not relevant.
Feature/Issue validation/testing:
Please describe the tests that you ran to verify your changes and relevant result summary. Provide instructions so it can be reproduced.
Please also list any relevant details for your test configuration.
Test A
Test B
Logs
Special notes for your reviewer:
Tested with large models such as Llama 2 70B.
Checklist:
Release note: