N2N: Network to Network Compression using Policy Gradient Reinforcement Learning (ICLR 2018)

This is the code to run the model compression algorithm described in the paper. It currently supports trained PyTorch models. A model from another deep learning framework has to be converted to PyTorch first. Link to ICLR paper


Running the code requires the following dependencies:

  1. python >= 2.7
  2. pytorch >= 0.2
  3. torchvision >= 0.19
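Before running anything, it can help to confirm that the environment meets these minimums. The following is a minimal sketch, not part of this repository: the helper names `version_tuple`, `meets_minimum`, and `check_environment` are hypothetical, and only the Python and PyTorch minimums are checked.

```python
import sys


def version_tuple(version):
    """Turn '0.4.1' or '0.4.1+cu92' into (0, 4, 1); non-numeric suffixes are ignored."""
    parts = []
    for piece in version.split("+")[0].split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Compare two version strings numerically, component by component."""
    return version_tuple(installed) >= version_tuple(minimum)


def check_environment():
    """Return True if Python >= 2.7 and an importable PyTorch >= 0.2 are present."""
    ok = sys.version_info[:2] >= (2, 7)
    try:
        import torch  # only imported here so the helpers above work without it
        ok = ok and meets_minimum(str(torch.__version__), "0.2")
    except ImportError:
        ok = False
    return ok


if __name__ == "__main__":
    print("environment ok:", check_environment())
```

Comparing parsed tuples rather than raw strings avoids mistakes such as treating "0.10" as older than "0.2".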

How to run

  1. Clone this repository using
git clone
  2. Download teacher models from the links below

  3. Layer removal and layer shrinkage instructions are described below. Additional detailed instructions can be found in the help menu in


Here is an example command to train the layer removal policy on the cifar10 dataset using the resnet-18 model

python removal cifar10 teacherModels/ --cuda True 
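To give a rough idea of what training a layer removal policy involves, here is a minimal, self-contained sketch of policy-gradient (REINFORCE) learning over per-layer keep/remove decisions. It is not the code in this repository: the policy is a set of independent Bernoulli logits rather than a recurrent controller, and `toy_reward` with its `importance` weights is a hypothetical stand-in for actually training and evaluating the reduced network.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def toy_reward(keep, importance):
    """Hypothetical reward: (fraction of layers removed) * (fraction of 'importance' kept).

    In a real run this would come from training the reduced model and
    measuring its accuracy and size against the teacher.
    """
    compression = 1.0 - sum(keep) / float(len(keep))
    accuracy = sum(w for k, w in zip(keep, importance) if k) / float(sum(importance))
    return compression * accuracy


def train_removal_policy(importance, episodes=2000, lr=0.5, seed=0):
    """Learn per-layer keep probabilities with REINFORCE and a moving-average baseline."""
    rng = random.Random(seed)
    logits = [0.0] * len(importance)
    baseline = 0.0
    for _ in range(episodes):
        probs = [sigmoid(t) for t in logits]
        # Sample one keep (1) / remove (0) action per layer.
        keep = [1 if rng.random() < p else 0 for p in probs]
        reward = toy_reward(keep, importance)
        advantage = reward - baseline
        # For a Bernoulli policy, d/dlogit of log pi(action) is (action - prob).
        logits = [t + lr * advantage * (a - p)
                  for t, a, p in zip(logits, keep, probs)]
        baseline = 0.9 * baseline + 0.1 * reward
    return [sigmoid(t) for t in logits]


if __name__ == "__main__":
    keep_probs = train_removal_policy([5, 1, 1, 5, 1, 1])
    print([round(p, 2) for p in keep_probs])
```

A greedy architecture can be read off by keeping every layer whose probability exceeds 0.5. The baseline only reduces the variance of the gradient estimate and does not change its expectation.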


NOTE: To run shrinkage, specify both the teacher model and the reduced model from stage 1

python shrinkage cifar10 teacherModels/ --model Stage1_cifar10/ --cuda True 
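As an illustration of what a shrinkage step does to a model's size, the sketch below applies one shrink factor per convolutional layer and counts the resulting parameters. It assumes, purely for illustration, that shrinking a layer means scaling its number of output channels. The layer tuples and the helpers `conv_params`, `shrink`, and `total_params` are hypothetical and do not come from this repository.

```python
def conv_params(in_ch, out_ch, kernel):
    """Weights plus biases of a square-kernel 2D convolution."""
    return in_ch * out_ch * kernel * kernel + out_ch


def shrink(layers, factors):
    """Scale each layer's output channels by its factor and propagate the input sizes.

    layers:  list of (in_channels, out_channels, kernel_size) tuples
    factors: one value in (0, 1] per layer; use 1.0 to leave a layer unchanged
    """
    shrunk = []
    in_ch = layers[0][0]  # the network input (e.g. RGB) is never shrunk
    for (_, out_ch, kernel), factor in zip(layers, factors):
        new_out = max(1, int(round(out_ch * factor)))
        shrunk.append((in_ch, new_out, kernel))
        in_ch = new_out  # the next layer consumes the shrunk output
    return shrunk


def total_params(layers):
    return sum(conv_params(*layer) for layer in layers)


if __name__ == "__main__":
    original = [(3, 64, 3), (64, 128, 3)]
    reduced = shrink(original, [0.5, 1.0])
    print(total_params(original), total_params(reduced))
```

Halving the first layer also halves the input channels of the second, so the total parameter count drops by roughly half even though only one layer was shrunk.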

Downloading models

All models can be downloaded at this link

Pre-trained teacher models

A teacher model must be specified in order to train.

Pre-trained student models

The pre-trained student models are provided to show the performance of the models described in the paper. They can be tested using

python studentModels/ cifar10

Pre-trained policies

The pre-trained policies are provided to run the transfer learning experiments.

Experiments folder

The experiments folder contains variants of layer removal and shrinkage that were tried for the paper. These are mainly experiments that require substantial modifications to the main code or that were used in earlier iterations of the project. They have to be moved to the main folder before being run. The following describes each experiment:

  1. - Layer removal using the Autoregressive controller
  2. - Layer shrinkage for Non-ResNet convolutional models
  3. - Layer removal for Non-ResNet convolutional models using the bidirectional controller
  4. - Layer removal for Non-ResNet convolutional models using the encoder-decoder controller
  5. - Layer removal using the Actor-Critic controller
  6. - Layer removal for ResNet models using the Autoregressive controller

