On the Variance of the Adaptive Learning Rate and Beyond
Learning Rate Warmup in PyTorch
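A minimal sketch of linear learning-rate warmup using PyTorch's built-in `LambdaLR` scheduler; the model and `warmup_steps` value are placeholders, not this repo's API:

```python
import torch

model = torch.nn.Linear(10, 2)  # placeholder model
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

warmup_steps = 1000  # assumed hyperparameter

# Scale the base lr linearly from ~0 up to its full value
# over the first warmup_steps steps.
scheduler = torch.optim.lr_scheduler.LambdaLR(
    optimizer,
    lr_lambda=lambda step: min(1.0, (step + 1) / warmup_steps),
)

# In the training loop, call optimizer.step() and then scheduler.step().
```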
ADAM - a question answering system inspired by IBM Watson
RAdam implemented in Keras & TensorFlow
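For reference, a hedged NumPy sketch of the variance rectification at the heart of RAdam (from the paper above); argument names follow the usual Adam conventions and are illustrative, not either repo's API:

```python
import numpy as np

def radam_step(theta, g, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One RAdam update; t is the 1-based step count."""
    m = beta1 * m + (1 - beta1) * g          # first moment
    v = beta2 * v + (1 - beta2) * g * g      # second moment
    m_hat = m / (1 - beta1 ** t)             # bias-corrected first moment

    rho_inf = 2.0 / (1.0 - beta2) - 1.0
    rho_t = rho_inf - 2.0 * t * beta2 ** t / (1.0 - beta2 ** t)

    if rho_t > 4.0:
        # Variance of the adaptive lr is tractable: take the rectified step.
        v_hat = np.sqrt(v / (1 - beta2 ** t))
        r_t = np.sqrt(((rho_t - 4) * (rho_t - 2) * rho_inf)
                      / ((rho_inf - 4) * (rho_inf - 2) * rho_t))
        theta = theta - lr * r_t * m_hat / (v_hat + eps)
    else:
        # Early steps: fall back to plain momentum (no adaptive lr).
        theta = theta - lr * m_hat
    return theta, m, v
```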
Easy-to-use AdaHessian optimizer (PyTorch)
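AdaHessian scales updates by an estimate of the Hessian diagonal rather than the squared gradient. A minimal PyTorch sketch of the Hutchinson estimator it relies on; the function name and structure are assumptions for illustration, not the repo's API:

```python
import torch

def hessian_diag_estimate(loss, params):
    """Hutchinson estimate of diag(H): E[z * (Hz)] with Rademacher z."""
    # Keep the graph so the gradients can be differentiated a second time.
    grads = torch.autograd.grad(loss, params, create_graph=True)
    # Rademacher probe vectors with +/-1 entries.
    zs = [torch.randint_like(p, high=2) * 2.0 - 1.0 for p in params]
    # Hessian-vector products via a second backward pass.
    hvps = torch.autograd.grad(grads, params, grad_outputs=zs)
    # z * (Hz) is an unbiased estimate of the Hessian diagonal.
    return [z * hvp for z, hvp in zip(zs, hvps)]
```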
Toy implementations of some popular ML optimizers using Python/JAX
Partially Adaptive Momentum Estimation method in the paper "Closing the Generalization Gap of Adaptive Gradient Methods in Training Deep Neural Networks" (accepted by IJCAI 2020)
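Padam interpolates between SGD with momentum (p → 0) and Adam (p = 1/2) by raising the second-moment term to a partial power p. A hedged NumPy sketch of the update (argument names are assumptions; the paper additionally applies the AMSGrad max to v_hat, omitted here for brevity):

```python
import numpy as np

def padam_step(theta, g, m, v, t, lr=0.1, beta1=0.9, beta2=0.999,
               p=0.125, eps=1e-8):
    """One Padam update; p in (0, 1/2] controls how adaptive the step is."""
    m = beta1 * m + (1 - beta1) * g
    v = beta2 * v + (1 - beta2) * g * g
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    # Adam divides by v_hat ** 0.5; Padam uses the partial power p instead.
    theta = theta - lr * m_hat / (v_hat ** p + eps)
    return theta, m, v
```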
Quasi-Hyperbolic Rectified DEMON Adam/AMSGrad with AdaMod, gradient centralization, Lookahead, iterate averaging, and decoupled weight decay
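Of the pieces above, the DEMON schedule decays the momentum parameter over training. A sketch, assuming the decay rule from the Demon paper, where the ratio β/(1 − β) is annealed linearly to zero (`beta_init` and the horizon `T` are placeholders):

```python
def demon_beta(t, T, beta_init=0.9):
    """Momentum at step t of T under DEMON decay."""
    frac = 1.0 - t / T  # remaining fraction of training
    return beta_init * frac / ((1.0 - beta_init) + beta_init * frac)
```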
Adam, NAdam and AAdam optimizers
[Python] [arXiv/cs] Paper "An Overview of Gradient Descent Optimization Algorithms" by Sebastian Ruder
AdaShift optimizer implementation in PyTorch
Implementation of a neural network from scratch using only NumPy (conv, FC, and max-pool layers, optimizers, and activation functions)
This is an implementation of Adam: A Method for Stochastic Optimization.
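For reference, a minimal NumPy sketch of the Adam update from Kingma & Ba's paper (names follow the paper's notation, not this repo's exact code):

```python
import numpy as np

def adam_step(theta, g, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update; t is the 1-based step count."""
    m = beta1 * m + (1 - beta1) * g          # biased first-moment estimate
    v = beta2 * v + (1 - beta2) * g * g      # biased second-moment estimate
    m_hat = m / (1 - beta1 ** t)             # bias-corrected moments
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v
```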
TensorFlow/Keras callback implementing arXiv:1712.07628 ("Improving Generalization Performance by Switching from Adam to SGD")
Coding assignment for the 2019 Preferred Networks (PFN) internship
Comprehensive image classification code for training a multilayer perceptron (MLP), LeNet, LeNet-5, conv2, conv4, conv6, VGG11, VGG13, VGG16, VGG19 with batch normalization, ResNet18, ResNet34, ResNet50, and MobileNetV2 on MNIST, CIFAR-10, CIFAR-100, and ImageNet-1K.