
Fix #4375: Always use trainer.global_step for step #4376

Merged
SkafteNicki merged 10 commits into Lightning-AI:master from moi90:fix-LearningRateMonitor
Nov 22, 2020

Conversation

moi90
Contributor

@moi90 moi90 commented Oct 26, 2020

What does this PR do?

This PR changes LearningRateMonitor to consistently use step=trainer.global_step, independently of logging_interval.

Fixes #4375
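
A minimal sketch of the behavioral change (illustrative only, not the exact diff; `_log_lr` is a hypothetical helper name, not Lightning API):

```python
# Before: with logging_interval="epoch", the monitor logged against the
# epoch count, while logging_interval="step" used trainer.global_step.
# After this PR, trainer.global_step is used regardless of the interval.

def _log_lr(trainer, latest_stat):  # hypothetical helper for illustration
    if trainer.logger is not None and latest_stat:
        trainer.logger.log_metrics(latest_stat, step=trainer.global_step)
```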

Before submitting

  • Was this discussed/approved via a GitHub issue? (not needed for typos and docs improvements)
  • Did you read the contributor guideline, Pull Request section?
  • Did you make sure your PR does only one thing, instead of bundling different changes together? Otherwise, we ask you to create a separate PR for every change.
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?
  • Did you verify new and existing tests pass locally with your changes?
  • If you made a notable change (that affects users), did you update the CHANGELOG?

PR review

  • Is this pull request ready for review? (if not, please submit in draft mode)

Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in GitHub issues, there's a high chance it will not be merged.

@moi90 moi90 closed this Oct 26, 2020
@moi90 moi90 reopened this Oct 26, 2020
@moi90 moi90 marked this pull request as ready for review October 26, 2020 16:25
@codecov

codecov bot commented Oct 26, 2020

Codecov Report

Merging #4376 (2c6e8e5) into master (299de5d) will not change coverage.
The diff coverage is 100%.

@@          Coverage Diff           @@
##           master   #4376   +/-   ##
======================================
  Coverage      93%     93%           
======================================
  Files         117     117           
  Lines        9010    9010           
======================================
  Hits         8383    8383           
  Misses        627     627           

@ydcjeff ydcjeff added the bug (Something isn't working) and logger (Related to the Loggers) labels Oct 27, 2020
@ydcjeff ydcjeff added this to the 1.0.x milestone Oct 27, 2020
@rohitgr7
Contributor

rohitgr7 commented Oct 27, 2020

Is there a problem with logging it against the epoch when logging_interval="epoch"? I am asking because it might not align with the global_step of other logs recorded at the epoch level: those are created at epoch end, while this one is created at epoch start.

@moi90
Contributor Author

moi90 commented Oct 27, 2020

I think it is right to log the learning rate at the beginning of the epoch with the trainer.global_step at that time. It is natural to read the log in chronological order and assume that a logged property like the learning rate applies to all following steps until a new value is logged. Anything else does not make sense in my understanding.
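
For illustration (not part of this PR), a minimal setup where this distinction matters, using the public LearningRateMonitor API:

```python
from pytorch_lightning import Trainer
from pytorch_lightning.callbacks import LearningRateMonitor

# With logging_interval="epoch", the learning rate is recorded once per
# epoch. Logging it against trainer.global_step at epoch start means a
# reader scanning the log in step order sees the value that applies to
# all subsequent steps of that epoch.
lr_monitor = LearningRateMonitor(logging_interval="epoch")
trainer = Trainer(callbacks=[lr_monitor], max_epochs=3)
```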

@edenlightning
Contributor

@moi90 would you be able to take a look at the failing tests?

@moi90
Contributor Author

moi90 commented Nov 6, 2020

It seems that the failing tests were fixed by the changes merged from master.

@edenlightning edenlightning modified the milestones: 1.0.x, 1.0.7 Nov 10, 2020
@Borda Borda modified the milestones: 1.0.7, 1.0.x Nov 11, 2020
@edenlightning edenlightning modified the milestones: 1.0.x, 1.0.7 Nov 13, 2020
@Borda Borda modified the milestones: 1.0.7, 1.0.x Nov 13, 2020
@Borda Borda modified the milestones: 1.0.x, 1.1 Nov 18, 2020
Contributor

@rohitgr7 rohitgr7 left a comment


Tested. LGTM!

Member

@SkafteNicki SkafteNicki left a comment


LGTM

@SkafteNicki SkafteNicki merged commit 8601268 into Lightning-AI:master Nov 22, 2020
@Borda Borda modified the milestones: 1.1, 1.0.x Nov 23, 2020
Borda pushed a commit that referenced this pull request Nov 23, 2020
* Fix #4375: Always use trainer.global_step for step

* Changelog

* Remove superfluous use of "epoch"

* Update Changelog

Co-authored-by: Nicki Skafte <skaftenicki@gmail.com>

(cherry picked from commit 8601268)
@moi90
Contributor Author

moi90 commented Nov 25, 2020

Great, thanks for merging!

@moi90 moi90 deleted the fix-LearningRateMonitor branch November 25, 2020 08:35
Development

Successfully merging this pull request may close these issues.

LearningRateMonitor produces inconsistent logs with logging_interval="epoch"
7 participants