refactor almost identical tests #6339

Merged 3 commits on Aug 10, 2020

@stas00 stas00 commented Aug 8, 2020

In preparation for adding more schedulers, this PR refactors these almost identical tests.

Unfortunately, pytest.mark.parametrize can't be used here because the tests live in a unittest class, so the one drawback is that the refactoring turns them all into a single test. It would have been nice to parametrize instead.
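
For reference, the standard library offers a partial workaround for the single-test drawback: unittest's subTest context manager. The sketch below is not what this PR does. The schedule functions and the ScheduleTest class are hypothetical stand-ins that return plain lists, so the example runs without torch. It still counts as one test, but each failing case is reported separately and the loop keeps going after a failure.

```python
import io
import unittest


# Hypothetical stand-ins for real scheduler factories. Each returns a plain
# list of learning rates so that the example runs without torch.
def constant_schedule(num_steps, lr=10.0):
    return [lr] * num_steps


def linear_decay_schedule(num_steps, lr=10.0):
    return [lr * (num_steps - step) / num_steps for step in range(num_steps)]


class ScheduleTest(unittest.TestCase):
    num_steps = 4

    def test_schedules(self):
        # Map each factory to its expected learning rates, mirroring the
        # dict-driven loop of the refactored test.
        cases = {
            constant_schedule: [10.0, 10.0, 10.0, 10.0],
            linear_decay_schedule: [10.0, 7.5, 5.0, 2.5],
        }
        for schedule_func, expected in cases.items():
            # Each subTest is reported under its own label, and a failure in
            # one case does not stop the remaining cases from running.
            with self.subTest(schedule=schedule_func.__name__):
                self.assertEqual(schedule_func(self.num_steps), expected)


def run_suite():
    """Run ScheduleTest quietly and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ScheduleTest)
    return unittest.TextTestRunner(stream=io.StringIO()).run(suite)
```

Calling run_suite() returns a result whose testsRun is 1, which is the same "single test" accounting described above; only the failure reporting becomes per-case.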

codecov bot commented Aug 8, 2020

Codecov Report

Merging #6339 into master will decrease coverage by 0.03%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master    #6339      +/-   ##
==========================================
- Coverage   78.37%   78.34%   -0.04%     
==========================================
  Files         148      148              
  Lines       27196    27196              
==========================================
- Hits        21316    21307       -9     
- Misses       5880     5889       +9     
Impacted Files Coverage Δ
src/transformers/generation_tf_utils.py 84.21% <0.00%> (-2.26%) ⬇️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 322dffc...92d3825.

stas00 commented Aug 8, 2020

We could also modify unwrap_schedule and unwrap_and_save_reload_schedule to return a flat list of numbers, and then the loop body would be just:

        for scheduler_func, data in scheds.items():
            kwargs, expected_learning_rates = data

            scheduler = scheduler_func(self.optimizer, **kwargs)
            lrs_1 = unwrap_schedule(scheduler, self.num_steps)
            self.assertListAlmostEqual(lrs_1, expected_learning_rates, tol=1e-2)

            scheduler = scheduler_func(self.optimizer, **kwargs)
            lrs_2 = unwrap_and_save_reload_schedule(scheduler, self.num_steps)
            self.assertListEqual(lrs_1, lrs_2)

but perhaps it'd be less intuitive for those reading the test code.
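
A minimal sketch of what such a flattened helper might look like is given here. It assumes, as the test code in this thread suggests, that the scheduler's get_lr() returns a one-element list per step. FakeScheduler and unwrap_schedule_flat are hypothetical names used only for illustration, not the actual test helpers.

```python
class FakeScheduler:
    """Hypothetical stand-in for a torch LR scheduler with one param group."""

    def __init__(self, base_lr=10.0, warmup_steps=2):
        self.base_lr = base_lr
        self.warmup_steps = warmup_steps
        self.current_step = 0

    def step(self):
        self.current_step += 1

    def get_lr(self):
        # Like torch schedulers, return one learning rate per parameter group.
        scale = min(1.0, self.current_step / self.warmup_steps)
        return [self.base_lr * scale]


def unwrap_schedule_flat(scheduler, num_steps=10):
    """Step the scheduler and collect a flat list of learning rates."""
    lrs = []
    for _ in range(num_steps):
        scheduler.step()
        group_lrs = scheduler.get_lr()
        # The flat form only makes sense with a single parameter group,
        # so the length check moves from the test into the helper.
        assert len(group_lrs) == 1, "expected exactly one parameter group"
        lrs.append(group_lrs[0])
    return lrs
```

With the length check inside the helper, the test no longer needs to index into nested lists, at the cost of hiding the per-group structure from readers of the test.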

sshleifer commented Aug 8, 2020

Does this impact tracebacks in a bad way? Previously I would know which scheduler I broke if test_warmup_constant_scheduler failed.

stas00 commented Aug 8, 2020

That's super-important, @sshleifer, thank you for flagging that!

I added an assert message to make it clear what fails. For example, if I deliberately break the expected data for the sake of a demo, we now get:

        for scheduler_func, data in scheds.items():
            kwargs, expected_learning_rates = data

            scheduler = scheduler_func(self.optimizer, **kwargs)
            lrs_1 = unwrap_schedule(scheduler, self.num_steps)
            self.assertEqual(len(lrs_1[0]), 1)
            self.assertListAlmostEqual(
>               [l[0] for l in lrs_1], expected_learning_rates, tol=1e-2, msg=f"failed for {scheduler_func}"
            )

tests/test_optimization.py:126:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
tests/test_optimization.py:92: in assertListAlmostEqual
    self.assertAlmostEqual(a, b, delta=tol, msg=msg)
E   AssertionError: 2.5 != 3.5 within 0.01 delta (1.0 difference) : failed for <function get_constant_schedule_with_warmup at 0x7f5da6f0bdd0>
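
The pattern in the traceback above can be sketched in isolation as follows. This is an illustrative reconstruction, not the PR's code: get_constant_schedule is a hypothetical stand-in that returns a plain list, the length check in assertListAlmostEqual is an assumption, and the message uses the function's __name__ instead of its repr, which is one possible way to shorten the output.

```python
import unittest


def get_constant_schedule(num_steps, lr=10.0):
    # Hypothetical stand-in for a real scheduler factory.
    return [lr] * num_steps


class ScheduleTest(unittest.TestCase):
    def assertListAlmostEqual(self, list1, list2, tol, msg=None):
        # Assumed shape of the helper: compare element-wise within tol and
        # forward msg so every failure names the scheduler under test.
        self.assertEqual(len(list1), len(list2), msg=msg)
        for a, b in zip(list1, list2):
            self.assertAlmostEqual(a, b, delta=tol, msg=msg)

    def check(self, schedule_func, expected):
        self.assertListAlmostEqual(
            schedule_func(len(expected)),
            expected,
            tol=1e-2,
            msg=f"failed for {schedule_func.__name__}",
        )


def failure_message(schedule_func, expected):
    """Return the assertion message for a mismatch, or None on success."""
    case = ScheduleTest("check")
    try:
        case.check(schedule_func, expected)
    except AssertionError as err:
        return str(err)
    return None
```

Because unittest appends msg to its own diff text by default, the failure still shows the mismatching values and additionally names the scheduler that produced them.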

stas00 commented Aug 8, 2020

Hmm, I'm not sure whether the last commit, which makes the assert message even more specific, was needed.

Alternatively, I can move the code out of the unittest class and then use pytest parametrization, so that it'll be self-documenting on assert failure, à la:

@pytest.mark.parametrize(

@sshleifer

LGTM as is, but won't merge it myself.

@LysandreJik LysandreJik left a comment

Looks good to me, looking forward to new schedulers!

@LysandreJik LysandreJik merged commit 1429b92 into huggingface:master Aug 10, 2020
@stas00 stas00 deleted the opt-refact branch August 10, 2020 16:29
sgugger added a commit that referenced this pull request Aug 11, 2020
* [wip] add get_polynomial_decay_schedule_with_warmup

* style

* add assert

* change lr_end to a much smaller default number

* check for exact equality

* [model_cards] electra-base-turkish-cased-ner (#6350)

* for electra-base-turkish-cased-ner

* Add metadata

Co-authored-by: Julien Chaumond <chaumond@gmail.com>

* Temporarily de-activate TPU CI

* Update modeling_tf_utils.py (#6372)

fix typo: ckeckpoint->checkpoint

* the test now works again (#6371)

* correct pl link in readme (#6364)

* refactor almost identical tests (#6339)

* refactor almost identical tests

* important to add a clear assert error message

* make the assert error even more descriptive than the original bt

* Small docfile fixes (#6328)

* Patch models (#6326)

* TFAlbertFor{TokenClassification, MultipleChoice}

* Patch models

* BERT and TF BERT info


* Update check_repo

* Ci GitHub caching (#6382)

* Cache Github Actions CI

* Remove useless file

* Colab button (#6389)

* Add colab button

* Add colab link for tutorials

* Fix links for open in colab (#6391)

* Update src/transformers/optimization.py

consistently use lr_end=1e-7 default

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* remove dup (leftover from merge)

* convert the test into the new refactored format

* stick to using the current_step as is, without ++

Co-authored-by: M. Yusuf Sarıgöz <yusufsarigoz@gmail.com>
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Alexander Measure <ameasure@gmail.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>