Implementation of Shake-Shake by chainer (Shake-Shake regularization of 3-branch residual networks: https://openreview.net/forum?id=HkO-PCmYl)

nutszebra/shake_shake

What's this

A Chainer implementation of Shake-Shake regularization [1].

Dependencies

git clone https://github.com/nutszebra/shake_shake.git
cd shake_shake
git submodule init
git submodule update

How to run

python main.py -p ./ -g 0 

Details about my implementation

  • Data augmentation
    Train: Images are randomly resized to a side length in the range [32, 36], then a 32x32 patch is cropped at a random position and normalized locally. Horizontal flipping is applied with probability 0.5.
    Test: Images are resized to 32x32 and normalized locally. Total accuracy is computed with single-image testing.

  • Optimization
    Momentum SGD with 0.9 momentum

  • Weight decay
    0.0001

  • Batch size
    128

  • Cosine annealing
    eta_max is 0.2 and eta_min is 0.002. The total number of epochs is 1800.

  • Shake-Shake
    forward: Shake
    backward: Shake
    level: Image
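
The Shake-Shake settings above (forward: Shake, backward: Shake, level: Image) can be illustrated with a minimal plain-Python sketch. It is not the repository's Chainer code: the function names `shake_forward` and `shake_backward` and the nested-list "tensors" are hypothetical, chosen only to show that each image gets its own random mixing coefficient in the forward pass and a freshly drawn, independent coefficient in the backward pass.

```python
import random


def shake_forward(branch1, branch2, rng):
    """Forward 'Shake': blend two branch outputs with one random alpha per image."""
    # level: Image -> draw one coefficient per image, not one per batch.
    alphas = [rng.random() for _ in branch1]
    mixed = [
        [a * x1 + (1.0 - a) * x2 for x1, x2 in zip(img1, img2)]
        for a, img1, img2 in zip(alphas, branch1, branch2)
    ]
    return mixed, alphas


def shake_backward(grad_out, rng):
    """Backward 'Shake': split the upstream gradient with a fresh beta per image."""
    # beta is drawn independently of the alpha used in the forward pass.
    betas = [rng.random() for _ in grad_out]
    grad1 = [[b * g for g in img] for b, img in zip(betas, grad_out)]
    grad2 = [[(1.0 - b) * g for g in img] for b, img in zip(betas, grad_out)]
    return grad1, grad2, betas


rng = random.Random(0)  # fixed seed so the sketch is reproducible
branch1 = [[1.0, 2.0], [3.0, 4.0]]  # hypothetical batch: two images, two features each
branch2 = [[5.0, 6.0], [7.0, 8.0]]
mixed, alphas = shake_forward(branch1, branch2, rng)
grad1, grad2, betas = shake_backward([[1.0, 1.0], [1.0, 1.0]], rng)
```

At test time, [1] replaces the random coefficient with its expected value of 0.5, so both branches contribute equally.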

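The cosine annealing settings listed above can be written as a small schedule function. This is a sketch that assumes a single cosine cycle over all 1800 epochs with no warm restarts, which the description does not state explicitly; the name `cosine_annealing_lr` is hypothetical and not taken from the repository.

```python
import math


def cosine_annealing_lr(epoch, total_epochs=1800, eta_max=0.2, eta_min=0.002):
    """Learning rate at `epoch`, decayed from eta_max to eta_min along a half cosine."""
    # Assumption: one cycle spanning the whole run (no restarts).
    progress = epoch / total_epochs
    return eta_min + 0.5 * (eta_max - eta_min) * (1.0 + math.cos(math.pi * progress))


lr_start = cosine_annealing_lr(0)      # starts at eta_max
lr_middle = cosine_annealing_lr(900)   # halfway between eta_max and eta_min
lr_end = cosine_annealing_lr(1800)     # ends at eta_min
```

The rate changes slowly near the start and end of training and fastest around the midpoint.
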
CIFAR-10 result

network            model (Shake-Shake-Image)   total accuracy (%)
[1]                2x32d                       96.45
[1]                2x64d                       97.02
[1]                2x96d                       97.14
my implementation  2x64d                       96.69

[Figure: loss]

[Figure: total accuracy]

References

[1] Shake-Shake regularization of 3-branch residual networks. https://openreview.net/forum?id=HkO-PCmYl
