PL: --adafactor option #6776
Conversation
examples/lightning_base.py (Outdated)
@@ -137,7 +138,11 @@ def configure_optimizers(self):
                "weight_decay": 0.0,
            },
        ]
-       optimizer = AdamW(optimizer_grouped_parameters, lr=self.hparams.learning_rate, eps=self.hparams.adam_epsilon)
+       if self.hparams.adafactor:
+           optimizer = Adafactor(optimizer_grouped_parameters, lr=self.hparams.learning_rate, scale_parameter=False, relative_step=False)
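For orientation, a minimal sketch of how the full branch plausibly reads once the hunk is applied. Only the Adafactor side is visible above, so the `else` fallback to the existing AdamW path and the imports are assumptions:

```python
from transformers import AdamW
from transformers.optimization import Adafactor  # import path may vary by transformers version

# inside LightningModule.configure_optimizers(), after building
# optimizer_grouped_parameters
if self.hparams.adafactor:
    # Fixed external learning rate: disable Adafactor's relative-step
    # schedule and parameter-scale-dependent LR so --learning_rate is honored.
    optimizer = Adafactor(
        optimizer_grouped_parameters,
        lr=self.hparams.learning_rate,
        scale_parameter=False,
        relative_step=False,
    )
else:
    # Assumed fallback: the pre-existing AdamW path from the removed line.
    optimizer = AdamW(
        optimizer_grouped_parameters,
        lr=self.hparams.learning_rate,
        eps=self.hparams.adam_epsilon,
    )
```

Presumably the matching clarg is registered in the shared parser as something like `parser.add_argument("--adafactor", action="store_true")`.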
I ignore the adam_epsilon clarg here, since the defaults are different. Could add a --adafactor_epsilon clarg, but I'll wait until somebody asks me to. So many clargs!
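As a rough illustration of why the defaults don't line up (values taken from the transformers signatures; treat the exact numbers as an assumption for this version): AdamW's `eps` is a single denominator constant (the examples' `--adam_epsilon` defaults to 1e-8), while Adafactor's `eps` is a pair of regularization constants, default `(1e-30, 1e-3)`, for the squared-gradient and parameter-scale estimates. A hypothetical clarg would therefore need two values:

```python
# Hypothetical --adafactor_epsilon clarg (NOT in this PR); Adafactor's eps is
# a pair, unlike AdamW's single float, so it takes two values:
parser.add_argument("--adafactor_epsilon", nargs=2, type=float, default=[1e-30, 1e-3])
# ...and would be forwarded as eps=tuple(self.hparams.adafactor_epsilon).
```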
Codecov Report
@@            Coverage Diff             @@
##           master    #6776      +/-   ##
==========================================
- Coverage   78.47%   77.48%    -1.00%
==========================================
  Files         157      157
  Lines       28569    28569
==========================================
- Hits        22420    22137      -283
- Misses       6149     6432      +283
Continue to review full report at Codecov.
Thanks for adding this!
This reverts commit 1179f87.
CC @moscow25, @patil-suraj
I used the "External LR" setup and verified that it saves a significant amount of memory on pegasus finetuning (sketched below).
Happy to add to Trainer.
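For reference, a minimal sketch of the "External LR" Adafactor setup; the learning rate value is illustrative and `model` stands in for the pegasus model being finetuned:

```python
from transformers.optimization import Adafactor  # import path may vary by version

# "External LR": the learning rate is supplied from outside rather than by
# Adafactor's built-in relative-step schedule. The memory savings come from
# the factored second-moment estimate (a row vector plus a column vector per
# weight matrix instead of a full matrix) and, with beta1=None (the default),
# from keeping no first-moment buffer at all; AdamW keeps two full-size buffers.
optimizer = Adafactor(
    model.parameters(),   # assumed: `model` is the pegasus model
    lr=1e-3,              # illustrative externally chosen learning rate
    scale_parameter=False,
    relative_step=False,
    warmup_init=False,
)
```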