forked from bigscience-workshop/Megatron-DeepSpeed
Commit
Floating-point ops counting and reloading (bigscience-workshop#40)
* initial flo count/logging setup (need to fix model parameter count)
* initial flo count/logging setup (need to fix model parameter count)
* 1B3 parameter setup + flos counting
* 1B3 parameter setup + flos counting
* 1B3 parameter setup + flos counting
* 1B3 parameter setup
* 1B3 parameter setup
* synched with latest 13B script
* synched with latest 13B script
* pipe transformer docstring
* improve DS integration evaluation + logging
* use pp engine even for pp=1 (bigscience-workshop#6)
* removed slurm_examples
* flos re-loading
* Update megatron/training.py
  Co-authored-by: Thomas Wang <24695242+thomasw21@users.noreply.github.com>
* Update megatron/data/gpt_dataset.py
  Co-authored-by: Thomas Wang <24695242+thomasw21@users.noreply.github.com>
* Update megatron/utils.py
  Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* Update megatron/utils.py
  Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* formatting fix, reserving bug for somewhere else, adding flo-logging to TB groups
* indentation bug
* fixing possible double counts
* tweaks
* warning for double counts

Co-authored-by: Shaden Smith <shaden.smith@microsoft.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
Co-authored-by: TevenLeScao <uhk85as@jean-zay1.idris.fr>
Co-authored-by: Thomas Wang <24695242+thomasw21@users.noreply.github.com>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
1 parent 3a48503, commit 3b2207f
Showing 5 changed files with 49 additions and 10 deletions.
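
The diff itself is not reproduced here, so the following is only an illustrative sketch of the ideas the commit message names (flos counting, flos re-loading, and guarding against double counts). It is not the code from this commit. It assumes the widely used rough estimate of 6 × parameters × tokens for forward-plus-backward training FLOs. The `FloCounter` class, its fields, and its methods are all hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class FloCounter:
    """Hypothetical running FLO counter; NOT the implementation in this commit."""

    n_params: int                    # model parameter count (assumed known)
    total_flos: float = 0.0          # FLOs accumulated so far
    last_counted_iteration: int = 0  # highest iteration already counted

    def update(self, iteration: int, tokens_in_batch: int) -> bool:
        """Add one iteration's estimated FLOs; return False if it was already counted."""
        if iteration <= self.last_counted_iteration:
            # Guard against double counting, e.g. after resuming from a checkpoint.
            return False
        # Assumption: ~6 FLOs per parameter per token (forward + backward pass).
        self.total_flos += 6.0 * self.n_params * tokens_in_batch
        self.last_counted_iteration = iteration
        return True

    def state_dict(self) -> dict:
        """Values to store alongside a checkpoint so the count survives a restart."""
        return {
            "total_flos": self.total_flos,
            "last_counted_iteration": self.last_counted_iteration,
        }

    def load_state_dict(self, state: dict) -> None:
        """Restore the count; missing keys fall back to zero (older checkpoints)."""
        self.total_flos = float(state.get("total_flos", 0.0))
        self.last_counted_iteration = int(state.get("last_counted_iteration", 0))
```

In this sketch, saving `state_dict()` with the checkpoint and calling `load_state_dict()` on resume lets the running total continue across restarts, while the iteration check keeps a replayed step from being added twice.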