signSGD: compressed optimisation for non-convex problems

This repository houses mxnet code for the signSGD paper.

General instructions:

  • Signum is implemented as an official optimiser in mxnet!
  • To use Signum in this codebase, pass 'signum' as the optim command line argument.
  • If you deviate from our suggested hyperparameters, be careful to tune them yourself.
  • Good Signum hyperparameters are typically closer to Adam's than to SGD's!
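For intuition, the Signum update maintains a momentum buffer over raw gradients and steps in the sign direction of that buffer. Below is a minimal NumPy sketch of one update step; the function name and default hyperparameters are illustrative, and the codebase itself uses mxnet's built-in 'signum' optimiser rather than this function.

```python
import numpy as np

def signum_step(w, grad, m, lr=0.0001, beta=0.9):
    """One illustrative Signum step: momentum on gradients, then sign.

    w    -- parameter vector
    grad -- gradient at w
    m    -- momentum buffer (same shape as w)
    """
    m = beta * m + (1.0 - beta) * grad  # exponential moving average of gradients
    w = w - lr * np.sign(m)             # step has fixed magnitude lr per coordinate
    return w, m
```

Because every coordinate moves by exactly lr per step, the learning rate plays a different role than in SGD, which is one reason the suggested hyperparameters look more like Adam's.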

There are four folders:

  1. cifar/ -- code to train ResNet-20 on CIFAR-10.
  2. gradient_expts/ -- code to compute gradient statistics as in Figures 1 and 2. Includes the Welford algorithm.
  3. imagenet/ -- code to train ResNet-50 on ImageNet. Implementation inspired by that of Wei Wu.
  4. toy_problem/ -- a simple example where signSGD is more robust than SGD.
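The Welford algorithm mentioned under gradient_expts/ computes a running mean and variance in a single pass, which is handy for gradient statistics that are too large to store. A minimal sketch (the function name is illustrative, not the repo's actual implementation):

```python
def welford(xs):
    """Welford's online algorithm: one-pass mean and population variance."""
    n, mean, m2 = 0, 0.0, 0.0
    for x in xs:
        n += 1
        delta = x - mean
        mean += delta / n          # update running mean
        m2 += delta * (x - mean)   # accumulate sum of squared deviations
    return mean, (m2 / n if n else float("nan"))
```

Unlike the naive two-pass formula, this stays numerically stable when the mean is large relative to the variance.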

More info can be found within each folder.

Any questions / comments? Don't hesitate to get in touch: