Accelerate Convolutional Neural Networks

This tool aims to accelerate test-time computation and reduce the number of parameters of deep CNNs.

How to use

Use accnn.py to get a new model by specifying an original model and the desired speed-up ratio.

You may provide a JSON file to explicitly control the architecture of the new model. Otherwise, the rank-selection algorithm determines the architecture automatically and saves the resulting configuration to config.json.

accnn.py invokes acc_conv.py and acc_fc.py automatically; both scripts can also be used separately.

Example

### Speed up the whole network

  • Speed up a model by 2x, using rank selection to determine the rank of each layer automatically:

    python accnn.py -m MODEL-PREFIX --save-model new-vgg16 --ratio 2

  • Use your own configuration file, without rank selection:

    python accnn.py -m MODEL-PREFIX --save-model new-model --config YOUR-CONFIG_JSON

### Speed up a single layer

  • Decompose a convolutional layer:

    python acc_conv.py -m MODEL-PREFIX --layer LAYER-NAME --K NUM-FILTER --save-model new-model

  • Decompose a fully connected layer:

    python acc_fc.py -m MODEL-PREFIX --layer LAYER-NAME --K NUM-HIDDEN --save-model new-model

  • Use --help to see more options.
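
For intuition about what K controls when a convolutional layer is decomposed, the following is a minimal NumPy sketch of a rank-K separable factorization of a kernel. It is an illustration under assumptions, not the code in acc_conv.py: the function names `decompose_conv` and `reconstruct_kernel` are hypothetical, the kernel layout is assumed to be (N, C, d, d), and a plain truncated SVD is used to obtain the two factors.

```python
import numpy as np

def decompose_conv(W, K):
    """Factor a kernel W of shape (N, C, d_h, d_w) into a vertical kernel
    of shape (K, C, d_h, 1) and a horizontal kernel of shape (N, K, 1, d_w).
    Hypothetical helper for illustration only."""
    N, C, d_h, d_w = W.shape
    # Arrange the kernel as a matrix M[(c, i), (j, n)] = W[n, c, i, j].
    M = W.transpose(1, 2, 3, 0).reshape(C * d_h, d_w * N)
    U, S, Vt = np.linalg.svd(M, full_matrices=False)
    # Split each singular value evenly between the two factors.
    scale = np.sqrt(S[:K])
    vert = (U[:, :K] * scale).T.reshape(K, C, d_h, 1)
    horiz = (Vt[:K, :] * scale[:, None]).reshape(K, d_w, N)
    horiz = horiz.transpose(2, 0, 1).reshape(N, K, 1, d_w)
    return vert, horiz

def reconstruct_kernel(vert, horiz):
    """Rebuild the (N, C, d_h, d_w) kernel implied by the two factors."""
    return np.einsum("kci,nkj->ncij", vert[..., 0], horiz[:, :, 0, :])
```

With K equal to the full rank of the rearranged matrix the reconstruction is exact; smaller K trades accuracy for fewer operations and parameters.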

Results

The experiments were carried out on a single machine with four Nvidia Titan X GPUs. Top-5 accuracy is evaluated on the ImageNet validation set.

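| Model  | Top-5 accuracy | Theoretical speed-up | CPU speed-up | GPU speed-up |
|--------|----------------|----------------------|--------------|--------------|
| model0 | 89.6%          | 1x                   | 1x           | 1x           |
| model1 | 88.6%          | 2.4x                 | 2.2x         | 1.1x         |
| model2 | 89.8%          | 2.4x                 | 2.2x         | 1.1x         |
| model3 | 87.5%          | 3x                   | 2.6x         | 1.2x         |
| model4 | 89.6%          | 3x                   | 2.6x         | 1.2x         |
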
  • model0 is the original VGG16 model, converted directly from the Caffe Model Zoo
  • model1 is the accelerated model based on config.json
  • model2 is the same as model1, but fine-tuned on the ImageNet training set for 5 epochs
  • model3 is the accelerated model based on rank selection with a 3x speed-up ratio
  • model4 is the same as model3, but fine-tuned on the ImageNet training set for 5 epochs
  • The GPU experiments were run with cuDNN 4
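
One common way to estimate a theoretical speed-up is to count multiply-accumulate operations. The sketch below does this for a single convolutional layer replaced by a d×1 layer with K filters followed by a 1×d layer with N filters. It assumes both layers produce the same number of output positions as the original (stride 1, same padding), and the function name `layer_speedup` is hypothetical. How the figures in the table were computed is not stated here, so treat this as a per-layer illustration only.

```python
def layer_speedup(N, C, d, K):
    """Ratio of multiply-accumulates per output position between a
    (N filters, C channels, d x d) conv layer and its two-layer
    replacement: K filters of d x 1, then N filters of 1 x d.
    Hypothetical helper; assumes stride 1 and same padding."""
    original = N * C * d * d
    decomposed = K * C * d + N * K * d
    return original / decomposed

# A 256-to-256 channel 3x3 layer decomposed with K = 128:
print(layer_speedup(N=256, C=256, d=3, K=128))  # 3.0
```

The ratio simplifies to N·C·d / (K·(C + N)), so for a fixed layer the speed-up is inversely proportional to K.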

Notes

  • This tool has been verified on the VGG-16 model converted from Caffe with the caffe_converter tool.

  • accnn.py only supports networks with a single input and a single output.

  • This tool mainly implements the algorithm of Tai et al. [2], which decomposes a convolutional layer into two convolutional layers, both in the spatial dimensions and across channels. acc_conv.py replaces a (N,d,d) conv. layer (N filters with d×d kernels) with two conv. layers of shape (K,d,1) and (N,1,d).

  • The rank-selection tool is based on the related work of Zhang et al. [1], which uses the product of PCA energy across layers to determine the rank of each layer.
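
The PCA-energy idea can be sketched as a greedy search, loosely following the description in [1]: start from full rank in every layer and repeatedly drop the eigenvalue whose removal costs the least relative energy per unit of computation saved, until the target ratio is reached. This is an illustrative sketch and not necessarily what rank_selection.py does. The function name `select_ranks`, its inputs, and the assumption that a layer's cost scales linearly with its rank are all hypothetical.

```python
import numpy as np

def select_ranks(eigvals, costs, ratio):
    """Greedy rank selection (hypothetical helper).
    eigvals: one 1-D array of PCA eigenvalues per layer, sorted descending.
    costs:   full-rank computational cost of each layer.
    ratio:   target overall speed-up, e.g. 2 for 2x.
    Assumes each layer's cost is proportional to its rank."""
    eigvals = [np.asarray(e, dtype=float) for e in eigvals]
    ranks = [len(e) for e in eigvals]
    per_rank_cost = [c / len(e) for c, e in zip(costs, eigvals)]
    total = float(sum(costs))
    budget = total / ratio
    while total > budget:
        best, best_score = None, np.inf
        for l, e in enumerate(eigvals):
            if ranks[l] <= 1:
                continue  # keep at least rank 1 in every layer
            # Relative energy lost by dropping the smallest kept eigenvalue.
            rel_drop = e[ranks[l] - 1] / e[:ranks[l]].sum()
            score = rel_drop / per_rank_cost[l]
            if score < best_score:
                best, best_score = l, score
        if best is None:
            break  # nothing left to remove
        ranks[best] -= 1
        total -= per_rank_cost[best]
    return ranks
```

For example, with one layer whose eigenvalues decay quickly and one whose eigenvalues are flat, the search removes rank from the fast-decaying layer first:

```python
print(select_ranks([[100, 10, 1, 0.1], [1, 1, 1, 1]], costs=[4, 4], ratio=2))
# [1, 3]
```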

References

[1] Zhang, Xiangyu, et al. "Efficient and accurate approximations of nonlinear convolutional networks." arXiv preprint arXiv:1411.4229 (2014).

[2] Tai, Cheng, et al. "Convolutional neural networks with low-rank regularization." arXiv preprint arXiv:1511.06067 (2015).