Code for the paper: Towards Training Without Depth Limits: Large Batch Normalization Without Gradient Explosion.
The repository contains all the necessary code to reproduce the experiments, located inside the `src/` directory:

- `modules`: contains the code defining the neural network modules (`models.py`), as well as various utility functions for training, testing, data loading, and measurements (`data_utils.py`, `utils.py`)
- `theorem_validations`: contains Jupyter notebooks for reproducing the results from the main figures in the paper, i.e. the main theorems
- `environment.yml`: the conda environment containing the necessary packages for reproducing the results; install it using `conda env create -f environment.yml`
For more information about each concept, see the paper.
To reproduce the training accuracy plots, as well as the implicit SGD orthogonality results, run `run.py` on a dataset of your choice, and then plot the resulting `.csv` files containing the results. See `run.py` for the main command line arguments required to execute the script.
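As a rough sketch of the plotting step, the snippet below loads one of the result `.csv` files with pandas and plots a training accuracy curve. The file name (`results.csv`) and column names (`epoch`, `train_accuracy`) are placeholders, not the actual names written by `run.py`; substitute the ones found in the generated files.

```python
# Minimal plotting sketch for the .csv results produced by run.py.
# NOTE: "results.csv", "epoch", and "train_accuracy" are hypothetical
# placeholders; inspect the actual output files for the real names.
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("results.csv")  # hypothetical output file

plt.figure()
plt.plot(df["epoch"], df["train_accuracy"])  # hypothetical column names
plt.xlabel("Epoch")
plt.ylabel("Training accuracy")
plt.tight_layout()
plt.savefig("train_accuracy.png")
```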
Author contact: Alexandru Meterez.