[feat] Add optimizer state sharding (ZeRO) #636
This pull request was exported from Phabricator. Differential Revision: D24317858
Summary: Pull Request resolved: facebookresearch#636. Adds ZeRO optimizer state sharding to MMF from the [fairscale](https://github.com/facebookresearch/fairscale) library. Added to tutorials. It can be used with this option: `optimizer.enable_state_sharding=True`. Differential Revision: D24317858 fbshipit-source-id: bc6cd687f0e580716b71b7a4daf143dc5097cc3e
2466ea9 to bc3d78a
bc3d78a to 72a7537
72a7537 to 90c717d
90c717d to 1ffa8b9
1ffa8b9 to 5b51998
This is awesome. Some comments. Do you plan to add any tests for these?
.circleci/config.yml
@@ -22,6 +22,7 @@ install_dep: &install_dep
    command: |
      source activate mmf
      pip install --upgrade setuptools
      pip install --progress-bar off -r requirements.txt
I am not really a fan of this, as it creates confusion. `setup.py` should by itself be enough to install any dependencies.
mmf/utils/build.py
    optimizer = optimizer_class(parameters, **params)

    if optimizer_config.get("enable_state_sharding", False):
        from fairscale.optim.oss import OSS
We can also do a `try`/`except ImportError` and ask the user to install the fairscale library separately if we don't want an explicit dependency on it. This would also solve some things on the fbcode side: users who want it should include fairscale in their targets. Adding more dependencies always causes more pain.
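For illustration, the optional-import idea above could look roughly like the sketch below. It is not the code that landed in this PR: the helper names `load_optional_class` and `maybe_shard_optimizer_state` are hypothetical, and the `OSS(params=..., optim=..., **defaults)` call reflects fairscale's documented constructor as I understand it, so treat it as an assumption to verify against the installed version.

```python
import importlib


def load_optional_class(module_name, class_name, feature):
    """Import `class_name` from `module_name`, or raise a helpful ImportError.

    Hypothetical helper: keeps the dependency optional and tells the user
    which feature needs it.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"{feature} requires the optional module '{module_name}'. "
            "Please install the package that provides it separately "
            "(for fairscale: `pip install fairscale`)."
        ) from exc
    return getattr(module, class_name)


def maybe_shard_optimizer_state(optimizer_config, optimizer_class, parameters, params):
    """Hypothetical helper: build a plain optimizer, or wrap it when sharding is on."""
    if not optimizer_config.get("enable_state_sharding", False):
        return optimizer_class(parameters, **params)

    # Only reached when the user opted in, so fairscale stays optional.
    OSS = load_optional_class(
        "fairscale.optim.oss", "OSS", "optimizer.enable_state_sharding"
    )
    # Assumption: OSS takes the wrapped optimizer class through `optim=`.
    return OSS(params=parameters, optim=optimizer_class, **params)
```

With this shape, users who never set `optimizer.enable_state_sharding=True` do not need fairscale at all, and those who do get an actionable error message instead of a bare traceback.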
5b51998 to a9f8f65
a9f8f65 to 25fa55e
@@ -226,7 +226,24 @@ def build_optimizer(model, config):
    )

    parameters = get_optimizer_parameters(model, config)
    optimizer = optimizer_class(parameters, **params)

    if optimizer_config.get("enable_state_sharding", False):
It would make sense to write a simple test for this. We have had issues with `build_optimizer` in the past.
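One possible shape for such a test is sketched below. It does not exercise MMF's real `build_optimizer`: `build_optimizer_sketch` is a hypothetical stand-in that mirrors only the branch shown in the diff, the `params` config key is an assumption, and `FakeOSS` replaces fairscale's `OSS` so the test runs without fairscale or a distributed setup.

```python
import sys
import types
import unittest
from unittest import mock


def build_optimizer_sketch(optimizer_class, parameters, optimizer_config):
    """Stand-in for the branch under review, NOT MMF's real build_optimizer."""
    params = optimizer_config.get("params", {})  # assumed config key
    if optimizer_config.get("enable_state_sharding", False):
        from fairscale.optim.oss import OSS

        return OSS(params=parameters, optim=optimizer_class, **params)
    return optimizer_class(parameters, **params)


class FakeOptimizer:
    """Minimal optimizer double that records its arguments."""

    def __init__(self, parameters, **kwargs):
        self.parameters = parameters
        self.kwargs = kwargs


class FakeOSS:
    """Records its arguments; stands in for fairscale's OSS in the test."""

    def __init__(self, params, optim, **kwargs):
        self.params = params
        self.optim = optim
        self.kwargs = kwargs


def fake_fairscale_modules():
    """Build fake module objects so the import inside the branch resolves."""
    oss = types.ModuleType("fairscale.optim.oss")
    oss.OSS = FakeOSS
    optim = types.ModuleType("fairscale.optim")
    optim.oss = oss
    root = types.ModuleType("fairscale")
    root.optim = optim
    return {"fairscale": root, "fairscale.optim": optim, "fairscale.optim.oss": oss}


class TestBuildOptimizerSketch(unittest.TestCase):
    def test_default_is_plain_optimizer(self):
        opt = build_optimizer_sketch(FakeOptimizer, [1, 2], {"params": {"lr": 0.1}})
        self.assertIsInstance(opt, FakeOptimizer)
        self.assertEqual(opt.kwargs, {"lr": 0.1})

    def test_sharding_wraps_in_oss(self):
        config = {"enable_state_sharding": True, "params": {"lr": 0.1}}
        # Temporarily register the fakes; sys.modules is restored on exit.
        with mock.patch.dict(sys.modules, fake_fairscale_modules()):
            opt = build_optimizer_sketch(FakeOptimizer, [1, 2], config)
        self.assertIsInstance(opt, FakeOSS)
        self.assertIs(opt.optim, FakeOptimizer)
        self.assertEqual(opt.params, [1, 2])
        self.assertEqual(opt.kwargs, {"lr": 0.1})
```

Saved as a module, this can be run with `python -m unittest`. A test against the real `build_optimizer` would follow the same two cases, but would need a real model and config in place of the doubles.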
This looks good. Thanks for pushing this. One minor comment about tests, and it should be good to land after that.
This pull request has been merged in 3ae71ec.
Summary:
Adds ZeRO optimizer state sharding to MMF from the fairscale library.
Added to tutorials.
It can be used with this option:
`optimizer.enable_state_sharding=True`
Differential Revision: D24317858