The State of Sparsity in Deep Neural Networks
This directory contains the code accompanying the paper "The State of Sparsity in Deep Neural Networks". All authors contributed to this code.
The layers subdirectory contains implementations of variational dropout and l0 regularization in TensorFlow. The sparse_transformer and sparse_rn50 subdirectories contain code for the Transformer and ResNet-50 experiments from the aforementioned paper. The results subdirectory contains CSV files with the results of every hyperparameter configuration we explored for each model, sparsity technique, and sparsity level.
Build Docker Image
To build a Docker image with all required dependencies, run sudo docker build -t <image_name> . in this directory. The base setup installs TensorFlow with GPU support and is based on Nvidia's CUDA 9.0 image, with all the libraries required to run TensorFlow. To launch the container, run sudo docker run --runtime=nvidia -v ~/:/mount/ -it <image_name>:latest. This command additionally makes your home directory accessible at /mount inside the container.
Decode Transformer Checkpoints

Once inside the container, you'll find all of the code and data needed to decode the WMT English-German 2014 test set and calculate the BLEU score for each of the checkpoints we provide.
Small scripts to decode from Transformer checkpoints trained with each technique are provided in sparse_transformer/decode/. For random pruning checkpoints, use the decode_mp.sh script. For variational dropout, you'll need to pass in the same log alpha threshold that was used to achieve the BLEU score reported in the checkpoint directory; this threshold is provided as the last number in the checkpoint directory name.
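The role of the log alpha threshold can be illustrated with a small sketch. This is not the repo's implementation (the real version lives in the layers subdirectory), and the helper name vd_prune is hypothetical; it only shows the binarization rule: under variational dropout, weights whose learned log alpha exceeds the threshold are treated as pruned.

```python
import numpy as np

def vd_prune(weights, log_alpha, log_alpha_threshold):
    """Zero out weights whose learned log alpha exceeds the threshold.

    Under variational dropout, a large log alpha means the weight has high
    multiplicative noise, i.e. it can be removed without hurting the model.
    """
    mask = (log_alpha < log_alpha_threshold).astype(weights.dtype)
    return weights * mask

weights = np.array([0.5, -1.2, 0.03, 2.0])
log_alpha = np.array([-3.0, 1.2, 4.0, -0.5])
# With a threshold of 0.5, the second and third weights are pruned.
pruned = vd_prune(weights, log_alpha, log_alpha_threshold=0.5)
```

Raising the threshold keeps more weights (lower sparsity); lowering it prunes more aggressively, which is why the threshold used at evaluation must match the one used to produce the reported score.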
The results of decoding from a model checkpoint will be saved in the sparse_transformer/decode/ directory with a name like newstest2014.end.sparse_transformer.... To calculate the BLEU score for these decodes, run sh get_ende_bleu.sh <decode_output>. This script relies on the mosesdecoder project (https://github.com/moses-smt/mosesdecoder) and assumes it is installed at /mount/mosesdecoder inside the container. The output of the script should match the BLEU score reported in the checkpoint directory.
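For readers unfamiliar with the metric itself, a simplified corpus-level BLEU (uniform-weight n-gram precision with a brevity penalty) can be sketched as follows. This is an illustration only, not the Moses scoring script that get_ende_bleu.sh invokes, and it ignores tokenization details that materially affect reported numbers.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def corpus_bleu(references, hypotheses, max_n=4):
    """Simplified corpus BLEU: geometric mean of clipped n-gram
    precisions (n = 1..max_n) times a brevity penalty, scaled to 0-100."""
    matches = [0] * max_n
    totals = [0] * max_n
    ref_len = hyp_len = 0
    for ref, hyp in zip(references, hypotheses):
        ref_len += len(ref)
        hyp_len += len(hyp)
        for n in range(1, max_n + 1):
            hyp_counts = ngrams(hyp, n)
            # Clip hypothesis n-gram counts by the reference counts.
            matches[n - 1] += sum((hyp_counts & ngrams(ref, n)).values())
            totals[n - 1] += sum(hyp_counts.values())
    if min(matches) == 0:
        return 0.0
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # Brevity penalty: exp(1 - ref_len / hyp_len) when the hypothesis is short.
    brevity = min(0.0, 1.0 - ref_len / hyp_len)
    return 100.0 * math.exp(brevity + log_precision)
```

A perfect match scores 100; a hypothesis sharing no n-grams with its reference scores 0.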
Evaluate ResNet-50 Checkpoints

Scripts to evaluate ResNet-50 checkpoints on the ImageNet test set are provided in sparse_rn50/evaluate/. For random pruning checkpoints, use the decode_mp.sh script. You'll similarly need to pass in the log alpha threshold to evaluate variational dropout checkpoints, which was 0.5 for all of our models. This repository does not include the ImageNet dataset, so you'll also need to point these scripts at a local copy of the ImageNet test set stored as TFRecords. The output of the script should match the top-1 accuracy reported in the checkpoint directory.
Calculate Weight Sparsity
To calculate the weight sparsity for a checkpoint, use the checkpoint_sparsity.py script and pass the checkpoint file, sparsity technique, and model ("transformer" or "rn50"). For variational dropout, also pass the same log alpha threshold.
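At its core, the quantity being reported is the fraction of exactly-zero entries across a model's weight tensors. A minimal sketch of that calculation, with a hypothetical helper name (not the checkpoint_sparsity.py script itself, which also handles checkpoint loading and per-technique masking):

```python
import numpy as np

def weight_sparsity(tensors):
    """Fraction of exactly-zero entries across a list of weight arrays."""
    zeros = sum(int(np.sum(t == 0)) for t in tensors)
    total = sum(t.size for t in tensors)
    return zeros / total

# Toy example: 6 zero entries out of 8 gives 75% sparsity.
tensors = [np.array([[0.0, 1.5], [0.0, -2.0]]), np.zeros(4)]
sparsity = weight_sparsity(tensors)
```

For variational dropout checkpoints, the weights would first be masked by the log alpha threshold before counting zeros, which is why the script needs the same threshold used at evaluation.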
The top performing checkpoints for each model and sparsity technique can be downloaded with the following links.
| Model | Sparsity Technique | Sparsity | Top-1 Accuracy | Checkpoint |
| --- | --- | --- | --- | --- |
| ResNet-50 | Magnitude Pruning (extended/non-uniform) | 80% | 76.52 | link |
| ResNet-50 | Magnitude Pruning (extended/non-uniform) | 90% | 75.16 | link |
| ResNet-50 | Magnitude Pruning (extended/non-uniform) | 95% | 72.71 | link |
| ResNet-50 | Magnitude Pruning (extended/non-uniform) | 96.5% | 69.26 | link |