Network Slimming

This repository contains the code for the following paper:

Learning Efficient Convolutional Networks through Network Slimming (ICCV 2017).

Zhuang Liu, Jianguo Li, Zhiqiang Shen, Gao Huang, Shoumeng Yan, Changshui Zhang.

The code is based on fb.resnet.torch.

We have now released another [PyTorch implementation], which supports ResNet and DenseNet and is based on Qiang Wang's PyTorch implementation listed below.

Other implementations: [PyTorch] by Qiang Wang, [Chainer] by Daiki Sanno.

Citation:

@inproceedings{Liu2017learning,
	title = {Learning Efficient Convolutional Networks through Network Slimming},
	author = {Liu, Zhuang and Li, Jianguo and Shen, Zhiqiang and Huang, Gao and Yan, Shoumeng and Zhang, Changshui},
	booktitle = {ICCV},
	year = {2017}
}

Introduction

Network Slimming is a neural network training scheme that simultaneously reduces model size, run-time memory footprint, and the number of computing operations, while introducing no accuracy loss and minimal overhead to the training process. The resulting models require no special libraries or hardware for efficient inference.

Approach

Figure 1: The channel pruning process.

We associate a scaling factor (reused from the batch normalization layers) with each channel in the convolutional layers. Sparsity regularization is imposed on these scaling factors during training to automatically identify unimportant channels. The channels with small scaling factor values (in orange) are pruned (left side). After pruning, we obtain compact models (right side), which are then fine-tuned to achieve accuracy comparable to (or even higher than) that of the normally trained full network.


Figure 2: Flow-chart of the network slimming procedure. The dotted line is for the multi-pass version of the procedure.
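
Concretely, the sparsity regularization is an L1 penalty on the batch-normalization scaling factors, whose subgradient can simply be added to the existing gradients after each backward pass. The Lua/Torch sketch below illustrates the idea; the names `model`, `S` and `updateBN` are illustrative only, and the actual training loop lives in main.lua / train.lua:

```lua
-- Sketch: add the L1 subgradient S * sign(gamma) to the gradient of every
-- batch-normalization scaling factor, after the usual backward pass and
-- before the parameter update.
local function updateBN(model, S)
   for _, m in ipairs(model:findModules('nn.SpatialBatchNormalization')) do
      m.gradWeight:add(S, torch.sign(m.weight))
   end
end
```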

Example Usage

This repo holds the example code for VGGNet on the CIFAR-10 dataset.

  1. Prepare the directories to save the results
mkdir vgg_cifar10/
mkdir vgg_cifar10/pruned
mkdir vgg_cifar10/converted
mkdir vgg_cifar10/fine_tune
  2. Train the VGG network with channel-level sparsity; -S is the lambda in the paper, which controls the strength of the sparsity regularization
th main.lua -netType vgg -save vgg_cifar10/ -S 0.0001
  3. Identify a certain percentage of relatively unimportant channels and set their scaling factors to 0 (a conceptual sketch follows this list)
th prune/prune.lua -percent 0.7 -model vgg_cifar10/model_160.t7 -save vgg_cifar10/pruned/model_160_0.7.t7
  4. Rebuild a truly compact network and copy the weights over from the pruned model of the previous step
th convert/vgg.lua -model vgg_cifar10/pruned/model_160_0.7.t7 -save vgg_cifar10/converted/model_160_0.7.t7
  5. Fine-tune the compact network
th main_fine_tune.lua -retrain vgg_cifar10/converted/model_160_0.7.t7 -save vgg_cifar10/fine_tune/
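
Conceptually, the pruning step (3) computes a single global threshold over all batch-normalization scaling factors and zeroes out the channels that fall below it. A minimal Lua/Torch sketch of that idea is shown below; `model` and `percent` are placeholder names, and the actual logic, including extra bookkeeping, is in prune/prune.lua:

```lua
-- Gather the absolute values of all BN scaling factors (gammas) into one vector.
local bns = model:findModules('nn.SpatialBatchNormalization')
local gammas = {}
for _, m in ipairs(bns) do
   table.insert(gammas, m.weight:clone():abs())
end
local all = torch.cat(gammas, 1)

-- Global threshold: the value below which `percent` of all channels fall.
local sorted = torch.sort(all)
local threshold = sorted[math.max(1, math.floor(all:nElement() * percent))]

-- Mask out the pruned channels by zeroing their scaling factors (and shifts).
for _, m in ipairs(bns) do
   local mask = torch.abs(m.weight):gt(threshold):typeAs(m.weight)
   m.weight:cmul(mask)
   m.bias:cmul(mask)
end
```

Zeroing the factors only masks channels in place; step 4 (convert/vgg.lua) is what rebuilds a genuinely smaller network from the masked model.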

Contact

liuzhuangthu at gmail.com