Allow to set Adam beta1, beta2 in TrainingArgs #5592

gonglinyuan · 2020-07-08T06:30:04Z

Some models set Adam's beta1 and beta2 to values other than the defaults (0.9, 0.999). For example, RoBERTa sets beta2 = 0.98. Users who want to fine-tune RoBERTa and similar models therefore need beta1 and beta2 exposed in TrainingArgs. Another Adam hyperparameter, adam_epsilon, is already in TrainingArgs, so for consistency it would be better if adam_beta1 and adam_beta2 were added as well.
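To illustrate the request, here is a minimal standalone sketch of how the two betas could sit next to adam_epsilon in an arguments dataclass and feed a single Adam update. `SketchTrainingArgs` and `adam_step` are hypothetical names made up for this example, not the library's classes, and the `learning_rate` and `adam_epsilon` defaults shown are assumptions. Only the beta defaults (0.9, 0.999) and the RoBERTa value (0.98) come from the description above.

```python
import math
from dataclasses import dataclass, field


@dataclass
class SketchTrainingArgs:
    """Hypothetical stand-in for TrainingArgs; not the library's class."""

    # Assumed default, chosen only for illustration.
    learning_rate: float = 5e-5
    # Defaults quoted in the PR description: (0.9, 0.999).
    adam_beta1: float = field(default=0.9, metadata={"help": "Beta1 for Adam"})
    adam_beta2: float = field(default=0.999, metadata={"help": "Beta2 for Adam"})
    # Assumed default, chosen only for illustration.
    adam_epsilon: float = field(default=1e-8, metadata={"help": "Epsilon for Adam"})


def adam_step(param, grad, m, v, t, args):
    """One textbook Adam update on a scalar parameter (t is the 1-based step count)."""
    # Exponential moving averages of the gradient and the squared gradient.
    m = args.adam_beta1 * m + (1 - args.adam_beta1) * grad
    v = args.adam_beta2 * v + (1 - args.adam_beta2) * grad * grad
    # Bias correction for the zero-initialised averages.
    m_hat = m / (1 - args.adam_beta1 ** t)
    v_hat = v / (1 - args.adam_beta2 ** t)
    param = param - args.learning_rate * m_hat / (math.sqrt(v_hat) + args.adam_epsilon)
    return param, m, v


# RoBERTa-style setting from the description: only beta2 is overridden.
roberta_args = SketchTrainingArgs(adam_beta2=0.98)
new_param, m, v = adam_step(param=1.0, grad=0.5, m=0.0, v=0.0, t=1, args=roberta_args)
```

With the fields exposed this way, a user overrides only the value that differs from the default and leaves the rest untouched, which is the behaviour this PR asks for.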

codecov · 2020-07-08T07:03:20Z

Codecov Report

Merging #5592 into master will increase coverage by 1.21%.
The diff coverage is 75.00%.

@@            Coverage Diff             @@
##           master    #5592      +/-   ##
==========================================
+ Coverage   76.95%   78.16%   +1.21%     
==========================================
  Files         145      145              
  Lines       25317    25319       +2     
==========================================
+ Hits        19482    19790     +308     
+ Misses       5835     5529     -306

Impacted Files                               Coverage Δ
src/transformers/trainer.py                  37.96% <0.00%> (ø)
src/transformers/trainer_tf.py               16.53% <ø> (ø)
src/transformers/optimization_tf.py          57.65% <100.00%> (ø)
src/transformers/training_args.py            78.00% <100.00%> (+0.44%)
src/transformers/modeling_tf_roberta.py      43.98% <0.00%> (-49.38%)
src/transformers/generation_tf_utils.py      83.20% <0.00%> (-1.76%)
src/transformers/file_utils.py               79.26% <0.00%> (-0.34%)
src/transformers/modeling_tf_mobilebert.py   96.77% <0.00%> (+73.38%)


LysandreJik

LGTM, love this! @julien-c @jplu

jplu · 2020-07-09T14:31:42Z

LGTM! Really nice!!!

julien-c · 2020-07-14T08:07:26Z

I'm fine with this

gonglinyuan added 2 commits July 8, 2020 14:23

Add Adam beta1, beta2 to trainier

ea96ff7

Make style consistent

9bdb1a5

LysandreJik approved these changes Jul 9, 2020


julien-c mentioned this pull request Jul 14, 2020

Add beta 1 and beta 2 option in TrainingArguments for AdamW optimizer. #5699

Closed

LysandreJik merged commit b21993b into huggingface:master Jul 27, 2020

sgugger mentioned this pull request Nov 20, 2020

Document adam betas TrainingArguments #8688

Merged

anaconda121 mentioned this pull request Jun 11, 2022

Not possible to tune adam_beta1 and adam_beta2 parameters using simple transformers, yet there are available to tune using huggingface ThilinaRajapakse/simpletransformers#1423

Closed
