This repository investigates gradient descent in linear networks as described in the paper *Exact solutions to the nonlinear dynamics of learning in deep linear neural networks* by Andrew Saxe, James McClelland, and Surya Ganguli.
Everything is in the self-contained notebook `linear-gradient-descent.ipynb`, which
- explains and implements gradient descent for the linear network,
- trains a linear network to solve a classification task, and
- investigates a learning regime called dynamical isometry.
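The core idea can be sketched in a few lines of numpy: gradient descent on the squared error of a two-layer linear network `y_hat = W2 @ W1 @ x`. This is an illustrative sketch only, not the notebook's code; the layer sizes, random seed, and learning rate below are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hidden, n_out, n_samples = 4, 8, 3, 100

# Hypothetical task: learn a random linear map from Gaussian inputs.
W_target = rng.normal(size=(n_out, n_in))
X = rng.normal(size=(n_in, n_samples))
Y = W_target @ X

# Small random initialisation of the two weight matrices.
W1 = 0.1 * rng.normal(size=(n_hidden, n_in))
W2 = 0.1 * rng.normal(size=(n_out, n_hidden))

lr = 0.1
for _ in range(2000):
    E = Y - W2 @ W1 @ X  # residual error on the batch
    # Gradients of the mean squared error 0.5/N * ||Y - W2 W1 X||^2
    gW2 = -(E @ (W1 @ X).T) / n_samples
    gW1 = -(W2.T @ E @ X.T) / n_samples
    W2 -= lr * gW2
    W1 -= lr * gW1

loss = 0.5 * np.mean((Y - W2 @ W1 @ X) ** 2)
print(loss)
```

Because the hidden layer is wide enough to represent the target map, the product `W2 @ W1` converges to `W_target` and the loss decays toward zero, after the characteristic plateaus that the Saxe et al. analysis describes.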
To run the notebook yourself,
- clone this repo: `git clone ...`
- run `make env` at the top level to build a Python environment with the packages listed in `requirements.txt`
- run `make notebook` to start a local Jupyter notebook server, then browse to `linear-gradient-descent.ipynb` and open it