Remove that Square Root: A New Efficient Scale-Invariant Version of AdaGrad

This repository documents the code to reproduce the experiments reported in the paper:

Remove that Square Root: A New Efficient Scale-Invariant Version of AdaGrad

In this work, we introduce KATE, a novel optimization algorithm that is a scale-invariant adaptation of AdaGrad. Here we provide a screenshot of KATE's pseudocode from the paper.

In this repository, we compare the performance of KATE with well-known algorithms such as AdaGrad and ADAM on logistic regression, image classification, and text classification problems. If you use this code for your research, please cite the paper as follows:

@article{choudhury2024remove,
  title={Remove that Square Root: A New Efficient Scale-Invariant Version of AdaGrad},
  author={Choudhury, Sayantan and Tupitsa, Nazarii and Loizou, Nicolas and Horvath, Samuel and Takac, Martin and Gorbunov, Eduard},
  journal={arXiv preprint arXiv:2403.02648},
  year={2024}
}

Requirements

The Anaconda environment can be created with the following command:

conda env create -f environment.yml

Logistic Regression

Scale Invariance

In Figure 1 of our paper, we compare the performance of KATE on scaled and unscaled data and empirically demonstrate its scale-invariance property. Please run the code in KATEscaleinvariance.py to reproduce the plots of Figure 1.
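To make the scale-invariance property concrete, the sketch below runs a square-root-free, AdaGrad-style update on a tiny synthetic logistic regression problem, once on the raw features and once with each feature multiplied by a constant. It is not the code in KATEscaleinvariance.py. The update rule in `kate_run` is an illustrative reading of the pseudocode with η = 0 and should be treated as an assumption; the paper remains the authoritative reference. The synthetic data, the `scales` vector and all function names are hypothetical.

```python
import math
import random


def _sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def logistic_loss(w, X, y):
    """Mean of log(1 + exp(-y_i * <x_i, w>)), computed stably."""
    total = 0.0
    for x_i, y_i in zip(X, y):
        margin = y_i * sum(a * b for a, b in zip(x_i, w))
        total += max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))
    return total / len(X)


def logistic_grad(w, X, y):
    """Gradient of the mean logistic loss with respect to w."""
    grad = [0.0] * len(w)
    for x_i, y_i in zip(X, y):
        margin = y_i * sum(a * b for a, b in zip(x_i, w))
        coeff = -y_i * _sigmoid(-margin) / len(X)
        for j, x_ij in enumerate(x_i):
            grad[j] += coeff * x_ij
    return grad


def kate_run(X, y, beta=0.1, eta=0.0, steps=30):
    """Assumed update (not taken from the repository):
    b2 += g^2;  m2 += eta * g^2 + g^2 / b2;  w -= beta * sqrt(m2) / b2 * g.
    """
    d = len(X[0])
    w, b2, m2 = [0.0] * d, [0.0] * d, [0.0] * d
    losses = []
    for _ in range(steps):
        g = logistic_grad(w, X, y)
        for j in range(d):
            gj2 = g[j] * g[j]
            b2[j] += gj2
            if b2[j] == 0.0:  # no gradient signal on this coordinate yet
                continue
            m2[j] += eta * gj2 + gj2 / b2[j]
            w[j] -= beta * math.sqrt(m2[j]) / b2[j] * g[j]
        losses.append(logistic_loss(w, X, y))
    return w, losses


random.seed(0)
n, d = 20, 3
X = [[random.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]
y = [random.choice([-1.0, 1.0]) for _ in range(n)]

scales = [0.125, 1.0, 64.0]  # hypothetical per-feature scaling factors
X_scaled = [[s * v for s, v in zip(scales, row)] for row in X]

w_plain, loss_plain = kate_run(X, y)
w_scaled, loss_scaled = kate_run(X_scaled, y)

max_gap = max(abs(a - b) for a, b in zip(loss_plain, loss_scaled))
print("largest loss gap between scaled and unscaled runs:", max_gap)
```

Under this assumed update, scaling feature j by s_j multiplies the gradient coordinate by s_j and the accumulator b2 by s_j squared, so the step shrinks by exactly 1 / s_j and the predictions, and therefore the losses, coincide. With `eta > 0` the extra term changes with the scaling, so the sketch keeps `eta = 0.0`.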

Robustness of KATE

In Figure 2 of our paper, we compare the performance of KATE with AdaGrad, AdaGradNorm, SGD-Decay and SGD-constant to examine the robustness of KATE. Please run the code in RobustKATE.py to reproduce the plots of Figure 2.
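For readers unfamiliar with the baselines named above, the following sketch writes each one as a single update function and sweeps the step size β on a toy quadratic. It is not the code in RobustKATE.py. The `beta / sqrt(t + 1)` schedule used for SGD-Decay, the toy objective, and the idea of probing robustness through a step-size sweep are assumptions made for illustration, and all function names are hypothetical.

```python
import math


def sgd_constant(w, g, beta, t, state):
    """SGD with a constant step size beta."""
    return [wj - beta * gj for wj, gj in zip(w, g)]


def sgd_decay(w, g, beta, t, state):
    """SGD with step size beta / sqrt(t + 1) (assumed decay schedule)."""
    lr = beta / math.sqrt(t + 1)
    return [wj - lr * gj for wj, gj in zip(w, g)]


def adagrad(w, g, beta, t, state):
    """Coordinate-wise AdaGrad: divide by the root of the summed squared gradients."""
    b2 = state.setdefault("b2", [0.0] * len(w))
    out = []
    for j, (wj, gj) in enumerate(zip(w, g)):
        b2[j] += gj * gj
        out.append(wj - beta * gj / math.sqrt(b2[j]) if b2[j] > 0 else wj)
    return out


def adagrad_norm(w, g, beta, t, state):
    """AdaGradNorm: one scalar accumulator built from the full gradient norm."""
    state["b2"] = state.get("b2", 0.0) + sum(gj * gj for gj in g)
    if state["b2"] == 0.0:
        return list(w)
    lr = beta / math.sqrt(state["b2"])
    return [wj - lr * gj for wj, gj in zip(w, g)]


def final_loss(step, beta, steps=50):
    """Minimise the toy objective f(w) = 0.5 * (w0^2 + 10 * w1^2) from w = (1, 1)."""
    w, state = [1.0, 1.0], {}
    for t in range(steps):
        g = [w[0], 10.0 * w[1]]  # exact gradient of the toy objective
        w = step(w, g, beta, t, state)
    return 0.5 * (w[0] * w[0] + 10.0 * w[1] * w[1])


methods = (sgd_constant, sgd_decay, adagrad, adagrad_norm)
for beta in (0.01, 0.1, 1.0):
    print(beta, {m.__name__: final_loss(m, beta) for m in methods})
```

Each function shares the signature `(w, g, beta, t, state)`, so another method can be added to the sweep by appending it to `methods`.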

Performance of KATE on Real Data

In Figure 3 of our paper, we compare the performance of KATE with AdaGrad, AdaGradNorm, SGD-Decay and SGD-constant on real data. Please run the code in KATEheart.py, KATEaustralian.py and KATEsplice.py to reproduce the performance of KATE on the heart, australian and splice datasets, respectively.
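Datasets with these names are commonly distributed in LIBSVM text format, where each line holds a label followed by `index:value` pairs with 1-based feature indices. Assuming the data is stored that way, which this README does not state, a minimal loader could look like the sketch below. The function name `parse_libsvm` and the sample lines are hypothetical, and the scripts in this repository may load the data differently.

```python
def parse_libsvm(lines, n_features):
    """Parse LIBSVM-style lines into a dense feature matrix and a label list."""
    X, y = [], []
    for line in lines:
        line = line.split("#", 1)[0].strip()  # drop trailing comments and blanks
        if not line:
            continue
        parts = line.split()
        y.append(float(parts[0]))
        row = [0.0] * n_features  # features not listed on the line are zero
        for item in parts[1:]:
            idx, val = item.split(":")
            row[int(idx) - 1] = float(val)  # LIBSVM indices start at 1
        X.append(row)
    return X, y


# Hypothetical two-example sample, not taken from any of the datasets.
sample = ["+1 1:0.5 3:-1", "-1 2:2", ""]
X_demo, y_demo = parse_libsvm(sample, n_features=3)
print(X_demo, y_demo)
```

To read from disk, pass an open file object as `lines`; the number of features has to be supplied because sparse lines may omit trailing zero-valued features.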

Training of Neural Network

In Figure 4 of our paper, we compare the performance of KATE with AdaGrad and ADAM on two tasks.

Image Classification: Training ResNet18 on the CIFAR10 dataset.
Text Classification: BERT fine-tuning on the emotions dataset from the Hugging Face Hub.

Please run the code in train.ipynb to reproduce the plots for these two tasks.
