PyTorch Memory optimizations via gradient checkpointing

This repository contains implementations of various PyTorch models using gradient checkpointing [1], which trades compute for memory and hence allows training bigger/wider models and using larger minibatch sizes.
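
As a rough illustration, the sketch below (a minimal example, not the code in this repository) checkpoints a toy nn.Sequential model with torch.utils.checkpoint.checkpoint_sequential: activations are stored only at segment boundaries, and everything in between is recomputed during the backward pass.

import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint_sequential

# Toy model: 8 linear + ReLU blocks chained together
model = nn.Sequential(
    *[nn.Sequential(nn.Linear(1024, 1024), nn.ReLU()) for _ in range(8)]
)

x = torch.randn(64, 1024, requires_grad=True)

# Split the chain into 2 segments: only the activations at segment
# boundaries are kept in memory, the rest are recomputed in backward()
out = checkpoint_sequential(model, 2, x)
out.sum().backward()

Depending on your PyTorch version, the checkpoint utilities may ask you to pass use_reentrant explicitly; the idea stays the same.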

The application of checkpointing is showcased on various models:

  • ResNet
  • DenseNet
  • LSTM model from the PyTorch examples
  • VNet model, which is used in medical imaging applications

Results of checkpointing on these models are showcased below:

In order to use the models, you need to install PyTorch from the master branch (see the PyTorch installation instructions).

To run checkpointed models and their baseline tests, follow the commands below:

# for checkpointed models
python test_memory_optimized.py

# for baseline models
python test_memory_baseline.py

Tutorial

We provide a tutorial describing how to apply checkpointing to various kinds of models.

A few special kinds of layers, such as batch normalization and dropout, need to be handled carefully when checkpointing. The details for handling them are also covered in the tutorial.
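
For example, here is a hedged sketch (one possible approach, not necessarily the one used in the tutorial) that keeps batch normalization outside the checkpointed segment, so its running statistics are updated only once per iteration, while dropout inside the segment stays consistent between the original and recomputed forward passes because checkpoint preserves the RNG state by default (preserve_rng_state=True).

import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint

class Block(nn.Module):
    def __init__(self, channels):
        super().__init__()
        # BN stays outside the checkpointed segment so its running
        # statistics are not updated twice (forward + recomputation)
        self.bn = nn.BatchNorm2d(channels)
        # Conv / ReLU / dropout go inside the checkpointed segment;
        # their activations are recomputed during the backward pass
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.ReLU(),
            nn.Dropout2d(p=0.1),
            nn.Conv2d(channels, channels, 3, padding=1),
        )

    def forward(self, x):
        x = self.bn(x)
        return checkpoint(self.body, x)

block = Block(16)
y = block(torch.randn(2, 16, 8, 8, requires_grad=True))
y.sum().backward()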

References

[1] Siskind, Jeffrey Mark, and Barak A. Pearlmutter. "Divide-and-Conquer Checkpointing for Arbitrary Programs with No User Annotation." arXiv preprint arXiv:1708.06799 (2017).
