
Add amsgrad optimizer #382

Merged: 6 commits into google-deepmind:master on Nov 3, 2022

Conversation

merajhashemi (Contributor)

Hi,
This PR implements AMSGrad, an extension to Adam that improves its convergence properties.
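
For readers unfamiliar with the method, the key difference from Adam is that AMSGrad keeps a running maximum of the second-moment estimate and normalizes by that maximum instead of the current estimate. A minimal single-tensor sketch of the update rule (illustrative only, not the optax implementation in this PR; bias-correction details vary between implementations):

```python
import jax.numpy as jnp

def amsgrad_step(g, mu, nu, nu_max, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
  """One illustrative AMSGrad step for a single parameter tensor."""
  mu = b1 * mu + (1 - b1) * g       # first-moment estimate, as in Adam
  nu = b2 * nu + (1 - b2) * g**2    # second-moment estimate, as in Adam
  nu_max = jnp.maximum(nu_max, nu)  # AMSGrad: monotone maximum of nu
  update = -lr * mu / (jnp.sqrt(nu_max) + eps)
  return update, mu, nu, nu_max
```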

@merajhashemi (Contributor, Author)

Hi @mkunesch, any chance you could review this? Thanks.

@mtthss (Collaborator) commented Aug 23, 2022

@merajhashemi thanks for the PR!
It fails the CI tests (see above); could you address the failures?

@merajhashemi (Contributor, Author)

@mtthss Done! Could you run the CI again?

@mkunesch self-requested a review on September 11, 2022
@mkunesch self-assigned this on September 12, 2022
@@ -189,6 +189,7 @@ def adam(
b2: float = 0.999,
eps: float = 1e-8,
eps_root: float = 0.0,
amsgrad: bool = False,
@mkunesch (Member) commented on the added amsgrad argument:

Let's introduce a separate alias amsgrad for this rather than adding a flag to Adam. My thinking is that there will be other improvements/modifications to Adam in the future, and we should avoid an accumulation of options in the simple adam setup. Furthermore, I think it would make it easier for users to find amsgrad in optax and more obvious in the code that amsgrad is being used.
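
A rough sketch of what such a standalone alias could look like, assuming a scale_by_amsgrad transformation like the one this PR adds in optax/_src/transform.py (the names and final signature here are illustrative, not the merged API):

```python
import optax

def amsgrad(
    learning_rate: float,
    b1: float = 0.9,
    b2: float = 0.999,
    eps: float = 1e-8,
    eps_root: float = 0.0,
) -> optax.GradientTransformation:
  # Mirrors the structure of optax.adam: a moment-scaling transformation
  # chained with a negated learning-rate scale. `scale_by_amsgrad` is
  # assumed here to be the transformation introduced by this PR.
  return optax.chain(
      optax.scale_by_amsgrad(b1=b1, b2=b2, eps=eps, eps_root=eps_root),
      optax.scale(-learning_rate),
  )
```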

(Four additional review comments on optax/_src/transform.py, all resolved.)
@mkunesch (Member) left a comment

(Sorry, I pressed send too early.)

Thank you so much for this contribution! This is excellent!

I've just added very minor comments.

Could you add a test for this PR? The easiest thing might be to just add amsgrad to the optimizer list in alias_test.py so that it gets tested on a parabola. It might also be nice to test the nu_max behavior explicitly in transform_test.py, but the parabola test is the more important one.
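
In the style of the existing alias tests, such a check might look roughly like the sketch below (assuming the standalone optax.amsgrad alias discussed above; this is not the actual alias_test.py code):

```python
import jax
import jax.numpy as jnp
import optax

def test_amsgrad_on_parabola():
  # Illustrative test in the spirit of alias_test.py; `optax.amsgrad` is
  # the standalone alias this PR is expected to introduce.
  target = jnp.array([1.0, -2.0, 3.0])
  params = jnp.zeros_like(target)
  opt = optax.amsgrad(learning_rate=0.1)
  state = opt.init(params)
  loss = lambda p: jnp.sum((p - target) ** 2)

  for _ in range(1000):
    grads = jax.grad(loss)(params)
    updates, state = opt.update(grads, state)
    params = optax.apply_updates(params, updates)

  assert jnp.allclose(params, target, atol=1e-2)
```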

Also, just to say that we will only be able to merge this after the ICLR deadline, but we can approve it before then so that we can merge immediately on the 29th of September.

Thanks a lot again for this excellent contribution!

@hbq1 (Collaborator) commented Oct 18, 2022

Hi @merajhashemi, thanks a lot for your contribution! Any chance you could address the comments made by @mkunesch, so we can merge this PR in the next version of Optax? Thanks!

@mkunesch (Member) commented Oct 19, 2022

Actually, the changes are quite minimal (introducing a separate alias + adding it to the test). I'd be happy to approve and make these changes upon merging, if that's okay with you, @merajhashemi?

@merajhashemi (Contributor, Author)

Hi @hbq1 @mkunesch

Thanks for the comments.
The reason I kept amsgrad inside Adam was to stay consistent with other libraries; both PyTorch and TensorFlow expose amsgrad as an argument to their Adam optimizers. Still, I could split it out and add the tests over the weekend. How does that sound?
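
For concreteness, both libraries expose it as a constructor flag on Adam, e.g.:

```python
# PyTorch: amsgrad is a flag on torch.optim.Adam.
import torch
model = torch.nn.Linear(4, 2)
torch_opt = torch.optim.Adam(model.parameters(), lr=1e-3, amsgrad=True)

# TensorFlow/Keras: likewise a flag on the Adam optimizer.
import tensorflow as tf
tf_opt = tf.keras.optimizers.Adam(learning_rate=1e-3, amsgrad=True)
```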

@mkunesch (Member) commented Oct 24, 2022

Hi! Ah, thanks a lot for that context - that makes a lot of sense.

I'm probably still leaning towards splitting it out, as we have generally tried to avoid boolean flags in optax for a while now and don't mirror PyTorch and TensorFlow in other optimizers either.

@hbq1: was your 👍 a vote for leaving it as an argument or for splitting it out?

Thanks a lot!

@hbq1 (Collaborator) commented Oct 24, 2022

I like the idea of splitting it into a separate optimiser for clarity 👍

@mkunesch (Member) left a comment

Looks great to me! Thanks a lot for splitting the optimizer from adam.

(I'll make some very minor formatting edits and fix the conflicts with master as I merge the PR)

Thanks a lot for the contribution again!

@mkunesch (Member) commented Nov 1, 2022

(Sorry for the 3 commits to your branch; I had to merge master before importing the PR and messed up the merge in the GitHub editor.)

copybara-service bot merged commit 47fe655 into google-deepmind:master on Nov 3, 2022