Conversation

@ShadenSmith
Contributor

No description provided.

@ShadenSmith ShadenSmith added the bug Something isn't working label Mar 15, 2020
@ShadenSmith ShadenSmith requested a review from tjruwase March 15, 2020 14:46
@ShadenSmith
Contributor Author

The PR tests aren't triggering for some reason. Investigating.


@jeffra jeffra left a comment

Can we add a test that triggers this bug? Maybe even one that tests all (or most) of these keys? I think a simple save-and-load check should be enough.

@ShadenSmith
Contributor Author

Finally got around to fleshing out some tests. We now explicitly test for DeepSpeed states like global_steps. I also added tests for LR schedulers.
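
For illustration only, here is a minimal sketch of the kind of save-and-load round-trip check discussed above. It is not the test code from this PR: `ToyEngine`, its methods, and the `skipped_steps` field are hypothetical stand-ins rather than DeepSpeed's API, and the checkpoint is just a pickled dict. The idea is that comparing every saved field after a reload would catch a key that is written under one name and read under another.

```python
import os
import pickle
import tempfile


class ToyEngine:
    """Hypothetical stand-in for a training engine (not DeepSpeed's API)."""

    def __init__(self):
        self.global_steps = 0
        self.skipped_steps = 0

    def step(self):
        self.global_steps += 1

    def save_checkpoint(self, path):
        # Persist the engine counters under explicit key names.
        state = {
            "global_steps": self.global_steps,
            "skipped_steps": self.skipped_steps,
        }
        with open(path, "wb") as f:
            pickle.dump(state, f)

    def load_checkpoint(self, path):
        with open(path, "rb") as f:
            state = pickle.load(f)
        # The key names must match the ones used in save_checkpoint.
        # In this toy version, reading state["global_step"] (singular)
        # would raise KeyError.
        self.global_steps = state["global_steps"]
        self.skipped_steps = state["skipped_steps"]


def check_round_trip(num_steps=3):
    """Train a toy engine, checkpoint it, reload it, and compare all state."""
    trained = ToyEngine()
    for _ in range(num_steps):
        trained.step()

    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "ckpt.pkl")
        trained.save_checkpoint(path)
        restored = ToyEngine()
        restored.load_checkpoint(path)

    # Compare every attribute, not just one, so any missed key shows up.
    assert vars(restored) == vars(trained)
    return restored
```

Calling `check_round_trip(5)` returns a restored engine whose `global_steps` equals 5. A real test would swap the toy class for the actual engine and its checkpoint files.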

@ShadenSmith ShadenSmith requested a review from jeffra May 5, 2020 19:00

Labels

bug Something isn't working

Development

Successfully merging this pull request may close these issues.

deepspeed_light.py bug: 'global_step' should be 'global_steps' in _load_checkpoint()

3 participants