Skip to content
master
Go to file
Code

README.md

Smoothness vs Gradient norm

A pytorch implementation for the LSTM experiments in the paper: Why Gradient Clipping Accelerates Training: A Theoretical Justification for Adaptivity

This repo is based on the repo at https://github.com/salesforce/awd-lstm-lm. Please refer to the original repo for a detailed setup description.

For data setup, run ./getdata.sh.

To reproduce the result of estimating smoothness vs gradient norm on AWD-LSTM training with PTB, simply run CUDA_VISIBLE_DEVICES=1 python main.py --epochs 2

The smoothness and gradient norm data collected along training is stored as csv files in side the ./ckpts folder.

About

A pytorch implementation for the LSTM experiments in the paper: Why Gradient Clipping Accelerates Training: A Theoretical Justification for Adaptivity

Resources

License

Releases

No releases published

Packages

No packages published
You can’t perform that action at this time.