Introduce parameter to fix deepspeed crash for RNNS #9489
…ide of deepspeed.initialize
I did look into the possibility of automating this. However, within DeepSpeed the conversion happens without the information needed to do it correctly being passed in: https://github.com/microsoft/DeepSpeed/blob/master/deepspeed/runtime/zero/stage3.py#L663 We therefore need to do the conversion beforehand to ensure everything is set up correctly.
tchaton
left a comment
LGTM!
What does this PR do?
Fixes #9394
By introducing `partition_module`, the user is able to turn off the explicit conversion of the entire module. This leaves DeepSpeed to do the conversion itself.

In most cases it is best to do the conversion upfront, because we can control the parameters passed to the initialisation, and if the user defines layers in `configure_sharded_model` we can also ensure all other layers are converted (which is a requirement).
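As a rough illustration of the switch described above, the following sketch models the decision in plain Python. Only the `partition_module` name comes from this PR; the `ShardingOptions` class and the `who_converts` function are hypothetical stand-ins written for this example and are not part of the Lightning or DeepSpeed API.

```python
from dataclasses import dataclass


@dataclass
class ShardingOptions:
    # Hypothetical container used only for illustration; not the real plugin class.
    # Assumption: the flag defaults to True, i.e. the whole module is converted upfront.
    partition_module: bool = True


def who_converts(opts: ShardingOptions) -> str:
    """Return which side is responsible for converting the module."""
    if opts.partition_module:
        # Explicit conversion of the entire module, done before DeepSpeed takes over.
        return "lightning"
    # Explicit conversion is switched off, so DeepSpeed does the conversion itself.
    return "deepspeed"


default_opts = ShardingOptions()
opt_out = ShardingOptions(partition_module=False)  # user turns the upfront conversion off

print(who_converts(default_opts))
print(who_converts(opt_out))
```

In this toy model, opting out simply hands responsibility to DeepSpeed; the upfront path remains the default because it is the one where the initialisation parameters can be controlled.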