Partially Adaptive Momentum Estimation


This repository contains our PyTorch implementation of the Partially Adaptive Momentum Estimation method (Padam) from the paper [Closing the Generalization Gap of Adaptive Gradient Methods in Training Deep Neural Networks].


Prerequisites:

  • PyTorch
  • CUDA


Run the training scripts with python: one script covers the experiments on Cifar10 and another covers the experiments on Cifar100.

Command Line Arguments:

  • --lr: initial learning rate
  • --method: optimization method, e.g., "sgdm", "adam", "amsgrad", "padam"
  • --net: network architecture, e.g., "vggnet", "resnet", "wideresnet"
  • --partial: partially adaptive parameter for the Padam method
  • --wd: weight decay
  • --Nepoch: number of training epochs
  • --resume: whether to resume from a previous training process
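
The arguments listed above could be declared with `argparse` roughly as follows. This is a minimal sketch, not the repository's actual parser: the function name `build_parser`, the argument types, the default values, and the treatment of `--resume` as a boolean switch are all assumptions made for illustration.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the documented flags (types and defaults are assumed)."""
    parser = argparse.ArgumentParser(description="Padam experiments (illustrative sketch)")
    parser.add_argument("--lr", type=float, default=0.1, help="initial learning rate")
    parser.add_argument("--method", type=str, default="padam",
                        help='optimization method, e.g. "sgdm", "adam", "amsgrad", "padam"')
    parser.add_argument("--net", type=str, default="vggnet",
                        help='network architecture, e.g. "vggnet", "resnet", "wideresnet"')
    parser.add_argument("--partial", type=float, default=0.125,
                        help="partially adaptive parameter for the Padam method")
    parser.add_argument("--wd", type=float, default=5e-4, help="weight decay")
    # Default of 200 epochs is an assumption, not taken from the repository.
    parser.add_argument("--Nepoch", type=int, default=200, help="number of training epochs")
    # Modeled here as an on/off switch; the real script may expect a value instead.
    parser.add_argument("--resume", action="store_true",
                        help="resume from a previous training process")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(vars(args))
```

With this sketch, passing `--partial 0.125 --wd 5e-4` yields `args.partial == 0.125` and `args.wd == 0.0005`.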

Usage Examples:

  • Run experiments on Cifar10:
  -  python  --lr 0.1 --method "padam" --net "vggnet"  --partial 0.125 --wd 5e-4
  • Run experiments on Cifar100:
  -  python  --lr 0.1 --method "padam" --net "resnet"  --partial 0.125 --wd 5e-4
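
To illustrate what the `--partial` value in these commands controls, here is a minimal pure-Python sketch of a single Padam update on a list of scalar parameters. It follows the update rule described in the paper (momentum, second-moment estimate, running maximum as in AMSGrad, and a partial exponent `p` on the denominator). The function name `padam_step`, the default `beta1`/`beta2` values, and the placement of `eps` are assumptions; the repository's PyTorch optimizer may differ in details such as bias correction and weight decay.

```python
def padam_step(theta, grad, m, v, v_hat,
               lr=0.1, beta1=0.9, beta2=0.999, partial=0.125, eps=1e-8):
    """One illustrative Padam step; all arguments are equal-length lists of floats.

    Returns the updated (theta, m, v, v_hat) as new lists.
    """
    new_theta, new_m, new_v, new_v_hat = [], [], [], []
    for t_i, g_i, m_i, v_i, vh_i in zip(theta, grad, m, v, v_hat):
        m_i = beta1 * m_i + (1.0 - beta1) * g_i        # first-moment (momentum) estimate
        v_i = beta2 * v_i + (1.0 - beta2) * g_i * g_i  # second-moment estimate
        vh_i = max(vh_i, v_i)                          # running maximum, as in AMSGrad
        # Partially adaptive step: exponent `partial` instead of the usual 1/2.
        # Adding eps inside the power is an assumption made for numerical safety.
        t_i = t_i - lr * m_i / ((vh_i + eps) ** partial)
        new_theta.append(t_i)
        new_m.append(m_i)
        new_v.append(v_i)
        new_v_hat.append(vh_i)
    return new_theta, new_m, new_v, new_v_hat


if __name__ == "__main__":
    theta, m, v, v_hat = padam_step([1.0], [2.0], [0.0], [0.0], [0.0], partial=0.125)
    print(theta)
```

In this sketch, `partial=0.5` gives an AMSGrad-style step without bias correction, while `partial=0` reduces the denominator to 1 and leaves a plain momentum step; values in between interpolate between the two behaviours.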