Add functions to collate deepspeed zero 3 checkpoints #8701
Codecov Report
@@ Coverage Diff @@
## master #8701 +/- ##
======================================
Coverage 93% 93%
======================================
Files 167 169 +2
Lines 14037 14075 +38
======================================
+ Hits 13008 13045 +37
- Misses 1029 1030 +1
LGTM!
How much of it is our original code?
I found out you can bundle MIT/Apache together under the Apache licence, so I've done that and included Lightning in the licence :)
What does this PR do?
Required for #8691, follow-up from #8397
Adds a utility, as suggested by @kaushikb11, to collate DeepSpeed sharded checkpoints. This is nicer than pointing users to a random gist by me :)
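For readers unfamiliar with what "collating" a sharded checkpoint involves, here is a toy sketch of the general idea in plain Python. It is not the utility added in this PR and does not reflect DeepSpeed's actual on-disk format. It assumes, purely for illustration, that each rank saved a flat slice of one concatenated parameter buffer, and that a separate map of parameter shapes is available. The names `collate_shards`, `shards` and `shapes` are hypothetical.

```python
from math import prod
from typing import Dict, List, Sequence, Tuple


def collate_shards(
    shards: Sequence[Sequence[float]],
    shapes: Dict[str, Tuple[int, ...]],
) -> Dict[str, List[float]]:
    """Merge per-rank flat shards into one flat value list per parameter.

    Assumption (illustrative only): shards are ordered by rank and, once
    concatenated, hold every parameter back to back in the order of `shapes`.
    """
    # Rebuild the single flat buffer by concatenating shards in rank order.
    flat = [value for shard in shards for value in shard]

    total = sum(prod(shape) for shape in shapes.values())
    if len(flat) < total:
        raise ValueError(
            f"shards hold {len(flat)} values but shapes require {total}"
        )

    # Slice the buffer back into named parameters. Values past `total`
    # are treated as alignment padding and dropped.
    state: Dict[str, List[float]] = {}
    offset = 0
    for name, shape in shapes.items():
        numel = prod(shape)
        state[name] = flat[offset:offset + numel]
        offset += numel
    return state


# Two ranks, three values each; the trailing 0.0 stands in for padding.
example_shards = [[1.0, 2.0, 3.0], [4.0, 5.0, 0.0]]
example_shapes = {"weight": (2, 2), "bias": (1,)}
example_state = collate_shards(example_shards, example_shapes)
```

A real implementation would work on tensors, restore each parameter's shape, and write a single checkpoint file that can be loaded without DeepSpeed; the sketch only shows the concatenate-then-slice step.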
Before submitting
PR review
Anyone in the community is welcome to review the PR.
Before you start reviewing, make sure you have read the Review guidelines.
Did you have fun?
Make sure you had fun coding 🙃