[Gradient Compression] Allow PowerSGD to run vanilla allreduce for the first K iterations #50973

Closed
wants to merge 4 commits

Commits on Jan 23, 2021

  1. [Gradient Compression] Allow PowerSGD to run vanilla allreduce for the first K iterations
    
    This extends the original PowerSGD method to a hybrid approach: vanilla allreduce for the first K iterations, then PowerSGD. The hybrid can further improve accuracy, at the cost of a lower speedup.
    
    Also add more comments on the fields in `PowerSGDState`.
    
    Original PR issue: Investigate Applying PowerSGD to Communication Hook for Gradient Compression #47202
    
    Differential Revision: [D26031478](https://our.internmc.facebook.com/intern/diff/D26031478/)
    
    [ghstack-poisoned]
    wayi committed Jan 23, 2021
    c294bb5
  2. Update on "[Gradient Compression] Allow PowerSGD to run vanilla allreduce for the first K iterations"
    
    wayi committed Jan 23, 2021
    ba4b710
  3. Update on "[Gradient Compression] Allow PowerSGD to run vanilla allreduce for the first K iterations"
    
    wayi committed Jan 23, 2021
    a999628

Commits on Jan 25, 2021

  1. Update on "[Gradient Compression] Allow PowerSGD to run vanilla allreduce for the first K iterations"
    
    wayi committed Jan 25, 2021
    61f32b6
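
The hybrid schedule described in the commit message (vanilla allreduce for the first K iterations, then PowerSGD) can be sketched roughly as below. This is a minimal pure-Python illustration and not the code from this PR: `HybridState`, `start_compress_iter`, `hybrid_hook`, and the `compressed_allreduce` callback are hypothetical names, gradients are modeled as plain lists of floats, and the "allreduce" is simulated in-process rather than run across workers.

```python
from dataclasses import dataclass
from typing import Callable, List

Grad = List[float]


@dataclass
class HybridState:
    """Hypothetical hook state (not the real PowerSGDState fields)."""
    start_compress_iter: int  # K: leading iterations that use vanilla allreduce
    iteration: int = 0        # number of hook invocations so far


def vanilla_allreduce(worker_grads: List[Grad]) -> Grad:
    """Simulate an allreduce that averages gradients element-wise."""
    n = len(worker_grads)
    return [sum(vals) / n for vals in zip(*worker_grads)]


def hybrid_hook(
    state: HybridState,
    worker_grads: List[Grad],
    compressed_allreduce: Callable[[List[Grad]], Grad],
) -> Grad:
    """Use vanilla allreduce for the first K calls, then the compressed path."""
    use_vanilla = state.iteration < state.start_compress_iter
    state.iteration += 1
    if use_vanilla:
        return vanilla_allreduce(worker_grads)
    # Assumption: the caller supplies the compression scheme (e.g. PowerSGD).
    return compressed_allreduce(worker_grads)


# Usage: K = 2, with an uncompressed stand-in for the compressed path.
state = HybridState(start_compress_iter=2)
grads = [[1.0, 2.0], [3.0, 4.0]]
for _ in range(3):
    averaged = hybrid_hook(state, grads, compressed_allreduce=vanilla_allreduce)
```

A larger K keeps more iterations on the exact (uncompressed) path, which is where the accuracy gain and the reduced speedup mentioned in the commit message would both come from.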