Reproducing the paper "PADAM: Closing The Generalization Gap of Adaptive Gradient Methods In Training Deep Neural Networks" for the ICLR 2019 Reproducibility Challenge
Updated Apr 13, 2019 - Python
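The Padam paper replaces Adam's fully adaptive denominator with a partially adaptive one, raising the (AMSGrad-style) second-moment estimate to a power p in (0, 1/2]; p = 1/2 recovers AMSGrad and p → 0 approaches SGD with momentum. A minimal NumPy sketch of that update, with illustrative names and hyperparameters (not code from either repository):

```python
import numpy as np

def padam_step(w, grad, state, lr=0.1, b1=0.9, b2=0.999, p=0.125, eps=1e-8):
    """One partially adaptive (Padam-style) update; p is the partial exponent."""
    m, v, v_hat = state
    m = b1 * m + (1 - b1) * grad            # first-moment EMA
    v = b2 * v + (1 - b2) * grad ** 2       # second-moment EMA
    v_hat = np.maximum(v_hat, v)            # AMSGrad-style running max
    w = w - lr * m / (v_hat ** p + eps)     # partially adaptive step
    return w, (m, v, v_hat)

# Toy demo: minimize f(w) = 0.5 * ||w||^2, whose gradient is w itself.
w = np.array([1.0, -2.0])
state = (np.zeros(2), np.zeros(2), np.zeros(2))
for _ in range(200):
    w, state = padam_step(w, w, state)
```

With a small p the denominator varies less across coordinates, which is the paper's proposed mechanism for closing the generalization gap of adaptive methods.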
A compressed adaptive optimizer for training large-scale deep learning models using PyTorch
Effect of Optimizer Selection and Hyperparameter Tuning on Training Efficiency and LLM Performance
Implementation and comparison of the SGD, SGD with momentum, RMSProp, and AMSGrad optimizers on an image classification task using the MNIST dataset
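For reference, the sgd-momentum update these repositories compare against is the classic heavy-ball rule: a velocity accumulates a decayed history of gradients, and the weights move along the velocity. A minimal NumPy sketch (function and variable names are illustrative):

```python
import numpy as np

def sgd_momentum_step(w, grad, velocity, lr=0.1, momentum=0.9):
    """One SGD-with-momentum update: v <- mu*v - lr*g; w <- w + v."""
    velocity = momentum * velocity - lr * grad
    w = w + velocity
    return w, velocity

# Toy demo: minimize f(w) = 0.5 * ||w||^2, whose gradient is w itself.
w = np.array([1.0, -2.0])
v = np.zeros_like(w)
for _ in range(200):
    w, v = sgd_momentum_step(w, w, v)
```

Compared with plain SGD, the accumulated velocity damps oscillation across steep directions and speeds progress along shallow, consistent ones.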