
D-Adaptation and Prodigy contrib implementations #651

Merged — 1 commit merged into google-deepmind:master on Dec 11, 2023

Conversation

@adefazio (Contributor) commented on Dec 1, 2023

Implementations of D-Adaptation AdamW and the related method Prodigy, based on the official PyTorch implementations. I have verified that they give the same outputs as the PyTorch versions on an example problem. Unit tests are included, similar to those used for the other optimizers in contrib.

https://github.com/facebookresearch/dadaptation
https://github.com/konstmish/prodigy
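
For illustration, a minimal usage sketch (not part of the PR; the toy objective, training loop, and default hyperparameters are assumptions): both optimizers are exposed from optax.contrib as standard gradient transformations, so they drop into the usual init/update/apply_updates loop.

```python
# Minimal usage sketch (assumptions: a toy quadratic objective and default
# hyperparameters; both factories follow Optax's GradientTransformation API).
import jax
import jax.numpy as jnp
import optax

def loss(params):
  return jnp.sum(params ** 2)  # toy quadratic stand-in for a real loss

params = jnp.ones(4)
opt = optax.contrib.prodigy()  # or optax.contrib.dadapt_adamw()
opt_state = opt.init(params)

for _ in range(100):
  grads = jax.grad(loss)(params)
  # Params are passed to update so decoupled weight decay can be applied.
  updates, opt_state = opt.update(grads, opt_state, params)
  params = optax.apply_updates(params, updates)
```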

These two new optimizers perform learning rate adaptation, similar to Mechanic and COCOB, two optimizers already included in contrib, but by a different mechanism, and so I think these are relevant to Optax and interesting to the community. D-Adaptation won an ICML outstanding paper award and is already gaining a lot of traction in the ML community, particularly for fine-tuning diffusion models with the Prodigy variant.
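For readers new to the method, a rough sketch of the underlying idea (paraphrasing the D-Adaptation paper, not this PR's code): for a convex, G-Lipschitz objective, the optimal fixed step size for n steps of (sub)gradient descent is

$$\gamma^\star = \frac{D}{G\sqrt{n}}, \qquad D = \lVert x_0 - x_\star \rVert,$$

which depends on the unknown distance $D$ from the initial point to a solution. D-Adaptation maintains a non-decreasing lower-bound estimate $d_k \le D$ that provably grows to within a constant factor of $D$ and uses it in place of $D$, removing the need to tune the learning rate; Prodigy modifies the estimator so that $d_k$ adapts faster.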

@fabianp (Member) commented on Dec 3, 2023

Thanks @adefazio for the contribution!

@vroulet vroulet self-assigned this on Dec 5, 2023
@vroulet vroulet self-requested a review on Dec 5, 2023
@vroulet (Collaborator) left a comment

Thanks for the contribution!
Minor comment: if s and d could have more explicit names, that would help a newcomer investigating the algorithm. But numerous algorithms have terse parameter names like b1 or beta, so it's fine as is too.

Review comments (outdated, all resolved) on optax/contrib/dadapt_adamw.py and optax/contrib/prodigy.py.
@adefazio adefazio requested a review from vroulet on Dec 6, 2023
@vroulet (Collaborator) left a comment

Looks perfect, thank you! Final request: could you squash your commits into one?

@adefazio (Contributor, Author) commented on Dec 7, 2023

I've squashed the commits, thanks for reviewing so quickly!

copybara-service bot merged commit 1a7956d into google-deepmind:master on Dec 11, 2023
6 checks passed