[test schedulers] adjust to test the first step's reading #6429

Merged Aug 27, 2020 (2 commits into huggingface:master from stas00:sched2)

Conversation

@stas00 (Contributor) commented Aug 12, 2020

As I was working on a new scheduler, it was difficult to match numbers, since the first step's reading was dropped by the unwrap_schedule wrappers (they took the measurement after stepping). This PR adjusts the wrappers to take a reading first and then step.

This PR also makes a small refactoring that moves all the unwrapping into the script, so the test just compares two lists (avoiding repeated [l[0] for l in lrs_1] expressions); a sketch of the adjusted wrapper follows.
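
For reference, a minimal sketch of what the adjusted wrapper looks like after this change (the real test code may differ in details; get_lr()[0] assumes a single param group, as in these tests):

    def unwrap_schedule(scheduler, num_steps=10):
        lrs = []
        for _ in range(num_steps):
            # read first, so the lr of the very first step is captured
            lrs.append(scheduler.get_lr()[0])
            scheduler.step()
        return lrs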

The updated table is:

        scheds = {
            get_constant_schedule: ({}, [10.0] * self.num_steps),
            get_constant_schedule_with_warmup: (
                {"num_warmup_steps": 4},
                [0.0, 2.5, 5.0, 7.5, 10.0, 10.0, 10.0, 10.0, 10.0, 10.0],
            ),
            get_linear_schedule_with_warmup: (
                {**common_kwargs},
                [0.0, 5.0, 10.0, 8.75, 7.5, 6.25, 5.0, 3.75, 2.5, 1.25],
            ),
            get_cosine_schedule_with_warmup: (
                {**common_kwargs},
                [0.0, 5.0, 10.0, 9.61, 8.53, 6.91, 5.0, 3.08, 1.46, 0.38],
            ),
            get_cosine_with_hard_restarts_schedule_with_warmup: (
                {**common_kwargs, "num_cycles": 2},
                [0.0, 5.0, 10.0, 8.53, 5.0, 1.46, 10.0, 8.53, 5.0, 1.46],
            ),
            get_polynomial_decay_schedule_with_warmup: (
                {**common_kwargs, "power": 2.0, "lr_end": 1e-7},
                [0.0, 5.0, 10.0, 7.656, 5.625, 3.906, 2.5, 1.406, 0.625, 0.156],
            ),
        }
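
As a sanity check on these expected lists, the linear-with-warmup row can be re-derived by hand. The sketch below assumes common_kwargs is {"num_warmup_steps": 2, "num_training_steps": 10} with an initial lr of 10.0, which is consistent with the numbers above though not shown in this excerpt:

    # re-derive the get_linear_schedule_with_warmup expectations
    def linear_lambda(step, warmup=2, total=10):
        if step < warmup:
            return step / warmup  # linear warmup from 0
        return max(0.0, (total - step) / (total - warmup))  # linear decay

    print([round(10.0 * linear_lambda(s), 2) for s in range(10)])
    # [0.0, 5.0, 10.0, 8.75, 7.5, 6.25, 5.0, 3.75, 2.5, 1.25]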

Unrelated to the changes suggested in this PR, it exposes two minor issues:

  1. We definitely have an off-by-one problem there, as the last step's reading is taken one step too early (which this change exposes): the schedule never completes its intended cycle. This is probably unimportant over hundreds of steps, but it definitely stands out when developing a new scheduler.

To illustrate, see this change in the reported numbers for get_polynomial_decay_schedule_with_warmup:

-                [5.0, 10.0, 7.656, 5.625, 3.906, 2.5, 1.406, 0.625, 0.156, 1e-07],
+                [0.0, 5.0, 10.0, 7.656, 5.625, 3.906, 2.5, 1.406, 0.625, 0.156],

The expected last reading of 1e-07 is not there, and it never was, even before this change.
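
To see why, note that reading before each step samples the lr at steps 0 through num_steps - 1, so the value produced by the final step() call is never observed. With the wrapper sketched above, one extra reading would capture it:

    lrs = []
    for _ in range(num_steps):
        lrs.append(scheduler.get_lr()[0])
        scheduler.step()
    # one extra reading after the final step would observe the end value,
    # e.g. lr_end = 1e-07 for the polynomial decay schedule
    lrs.append(scheduler.get_lr()[0])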

  2. Also, the first step's reading is 0.0 in all schedulers except get_constant_schedule, so the first step does nothing. This could be fixed by adding a min_lr=1e-7 floor to all schedulers, as @sshleifer suggested in one of the recent scheduler-related PRs; see the sketch after this list.
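
A minimal sketch of what such a floor could look like inside a warmup lambda (min_lr_ratio is a hypothetical argument, not something these schedulers currently accept):

    # hypothetical constant-with-warmup lambda with a lower bound,
    # so step 0 yields min_lr_ratio * initial_lr instead of 0.0
    def lr_lambda(current_step, num_warmup_steps=4, min_lr_ratio=1e-8):
        if current_step < num_warmup_steps:
            return max(min_lr_ratio, current_step / max(1, num_warmup_steps))
        return 1.0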

Let me know if this better fits into its own issue, since these points have nothing to do with the PR itself. Or perhaps the two issues are just unimportant...

@codecov bot commented Aug 12, 2020

Codecov Report

Merging #6429 into master will increase coverage by 0.05%.
The diff coverage is n/a.


@@            Coverage Diff             @@
##           master    #6429      +/-   ##
==========================================
+ Coverage   79.89%   79.94%   +0.05%     
==========================================
  Files         153      153              
  Lines       27902    27902              
==========================================
+ Hits        22291    22307      +16     
+ Misses       5611     5595      -16     
Impacted Files Coverage Δ
src/transformers/tokenization_albert.py 28.84% <0.00%> (-58.66%) ⬇️
src/transformers/modeling_tf_distilbert.py 64.47% <0.00%> (-32.95%) ⬇️
src/transformers/tokenization_utils.py 90.40% <0.00%> (+0.40%) ⬆️
src/transformers/tokenization_bert.py 91.51% <0.00%> (+0.44%) ⬆️
src/transformers/configuration_utils.py 96.59% <0.00%> (+0.68%) ⬆️
src/transformers/tokenization_openai.py 84.09% <0.00%> (+1.51%) ⬆️
src/transformers/tokenization_utils_fast.py 94.28% <0.00%> (+2.14%) ⬆️
src/transformers/tokenization_auto.py 97.72% <0.00%> (+2.27%) ⬆️
src/transformers/tokenization_transfo_xl.py 42.48% <0.00%> (+3.75%) ⬆️
src/transformers/generation_tf_utils.py 86.71% <0.00%> (+5.01%) ⬆️
... and 3 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@LysandreJik (Member) left a comment

Great, cleaner! Thanks a lot @stas00

@LysandreJik LysandreJik merged commit dbfe34f into huggingface:master Aug 27, 2020
@stas00 stas00 deleted the sched2 branch August 27, 2020 16:41
Zigur pushed a commit to Zigur/transformers that referenced this pull request Oct 26, 2020
fabiocapsouza pushed a commit to fabiocapsouza/transformers that referenced this pull request Nov 15, 2020
fabiocapsouza added a commit to fabiocapsouza/transformers that referenced this pull request Nov 15, 2020