re-introduce: stage3: efficient compute of scaled_global_grad_norm (#5493)

Reverts the previous revert of this feature (nelyahu@bc48371). In addition, includes a bug fix for offload mode.