Commit
Update base for Update on "[Gradient Compression] Allow BatchedPowerSGD to run vanilla allreduce for the first K iterations"

Similar to #50973, allow the batched version to run vanilla allreduce for the first K iterations.

This may be useful if the batched version can be applied to some use cases where the accuracy requirement is not very strict.

Original PR issue: Investigate Applying PowerSGD to Communication Hook for Gradient Compression #47202

Differential Revision: [D26077709](https://our.internmc.facebook.com/intern/diff/D26077709/)

[ghstack-poisoned]
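The warm-start behavior described in this message can be illustrated with a minimal, framework-free sketch. It is not the code from this commit and does not use the PyTorch hook API: `WarmStartState`, `make_warm_start_hook`, `start_compression_iter`, and the two stand-in reducers are hypothetical names chosen for illustration, and the sketch assumes the hook is called exactly once per training iteration.

```python
from dataclasses import dataclass
from typing import Callable, List

Gradient = List[float]
Reducer = Callable[[Gradient], Gradient]


@dataclass
class WarmStartState:
    """Hypothetical hook state: the switch-over point K and an iteration counter."""
    start_compression_iter: int  # K: number of leading iterations that skip compression
    iteration: int = 0


def make_warm_start_hook(state: WarmStartState,
                         vanilla_allreduce: Reducer,
                         compressed_allreduce: Reducer) -> Reducer:
    """Build a hook that uses vanilla allreduce for the first K calls, then compression."""
    def hook(grad: Gradient) -> Gradient:
        use_vanilla = state.iteration < state.start_compression_iter
        state.iteration += 1  # assumption: one hook call per training iteration
        if use_vanilla:
            return vanilla_allreduce(grad)
        return compressed_allreduce(grad)
    return hook


# Stand-in reducers that only record which path was taken (no real communication).
calls: List[str] = []


def fake_vanilla(grad: Gradient) -> Gradient:
    calls.append("vanilla")
    return grad


def fake_compressed(grad: Gradient) -> Gradient:
    calls.append("compressed")
    return grad


state = WarmStartState(start_compression_iter=2)
hook = make_warm_start_hook(state, fake_vanilla, fake_compressed)
for _ in range(4):
    hook([0.1, 0.2])
```

With K set to 2, the first two calls take the uncompressed path and later calls take the compressed one. Keeping the counter in a state object rather than in the hook closure mirrors how such hooks usually carry per-run configuration, though the real implementation may track iterations differently.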