
Quantized-CNN for Mobile Devices

Quantized-CNN is a novel framework for convolutional neural networks (CNNs) that delivers simultaneous computation acceleration and model compression at test time. Mobile devices can perform efficient on-site image classification via our Quantized-CNN, with only a negligible loss in accuracy.

Q-CNN is a method for quantizing CNNs. It quantizes the filters in convolutional layers and the weight matrices in fully-connected layers; with the network parameters quantized, the responses of both layer types can be estimated efficiently via approximate inner-product computation. By minimizing the estimation error of each layer's response during parameter quantization, the method better preserves model performance.
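
The idea behind the approximate inner product can be sketched briefly: each weight vector is split into M subvectors, every subvector is replaced by the index of its nearest codeword in a learned sub-codebook, and at test time the inner products between the input's subvectors and all codewords are precomputed once, so that every layer response costs only M table lookups. The C++ sketch below illustrates this under assumed names and data layouts (Codebook, BuildLookupTable, and ApproxInnerProduct are illustrative, not the repository's actual API):

```cpp
// Sketch of the approximate inner product used by product-quantization
// schemes like Q-CNN. Data layout and names are illustrative assumptions.
#include <cstddef>
#include <vector>

struct Codebook {
    std::size_t numSub;    // M: number of subspaces per weight vector
    std::size_t subDim;    // D/M: dimension of each subvector
    std::size_t numWords;  // K: codewords per subspace
    // codewords[m * numWords + k] is the k-th codeword of subspace m.
    std::vector<std::vector<float>> codewords;
};

// Step 1: for one input x, precompute T[m][k] = <x_m, c_{m,k}> once.
std::vector<std::vector<float>> BuildLookupTable(const std::vector<float>& x,
                                                 const Codebook& cb) {
    std::vector<std::vector<float>> table(
        cb.numSub, std::vector<float>(cb.numWords, 0.0f));
    for (std::size_t m = 0; m < cb.numSub; ++m)
        for (std::size_t k = 0; k < cb.numWords; ++k)
            for (std::size_t d = 0; d < cb.subDim; ++d)
                table[m][k] += x[m * cb.subDim + d] *
                               cb.codewords[m * cb.numWords + k][d];
    return table;
}

// Step 2: each response <x, w> then costs only M table lookups and adds,
// where codes[m] is the codeword index assigned to w's m-th subvector.
float ApproxInnerProduct(const std::vector<std::size_t>& codes,
                         const std::vector<std::vector<float>>& table) {
    float sum = 0.0f;
    for (std::size_t m = 0; m < codes.size(); ++m)
        sum += table[m][codes[m]];
    return sum;
}
```

Since the lookup table depends only on the input, it is shared across all output channels of a layer, so its one-off cost is amortized over many responses; this is where the speed-up comes from.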

Steps:
    First, keep the fully-connected layers unchanged and quantize all convolutional layers with error correction (a simplified sketch of the error-correction idea follows this list).
    Second, fine-tune the fully-connected layers of the quantized network on the ILSVRC-12 training set to restore classification accuracy.
    Finally, quantize the fully-connected layers of the fine-tuned network with error correction.
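
The error-correction idea can be illustrated with a much-simplified variant: quantize a weight vector subspace by subspace, and carry each subvector's quantization residual forward into the next subvector before it is quantized, so that later assignments partially compensate for earlier errors. This residual-feedback sketch only conveys the flavor; the paper's scheme minimizes the estimation error of the layer response itself. All names below are hypothetical:

```cpp
// Simplified illustration of error correction during quantization:
// fold each subvector's quantization residual into the next subvector
// before it is quantized. Names are hypothetical, not the repo's API.
#include <cstddef>
#include <limits>
#include <vector>

// Return the index of the codeword nearest to `sub` in squared L2 distance.
std::size_t NearestCodeword(const std::vector<float>& sub,
                            const std::vector<std::vector<float>>& words) {
    std::size_t best = 0;
    float bestDist = std::numeric_limits<float>::max();
    for (std::size_t k = 0; k < words.size(); ++k) {
        float dist = 0.0f;
        for (std::size_t d = 0; d < sub.size(); ++d) {
            const float diff = sub[d] - words[k][d];
            dist += diff * diff;
        }
        if (dist < bestDist) { bestDist = dist; best = k; }
    }
    return best;
}

// Quantize a weight vector subspace-by-subspace with residual feedback.
// codebooks[m] holds the codewords (each of length subDim) of subspace m.
std::vector<std::size_t> QuantizeWithErrorFeedback(
        const std::vector<float>& w, std::size_t subDim,
        const std::vector<std::vector<std::vector<float>>>& codebooks) {
    const std::size_t numSub = w.size() / subDim;
    std::vector<std::size_t> codes(numSub);
    std::vector<float> carry(subDim, 0.0f);  // residual carried forward
    for (std::size_t m = 0; m < numSub; ++m) {
        std::vector<float> sub(subDim);
        for (std::size_t d = 0; d < subDim; ++d)
            sub[d] = w[m * subDim + d] + carry[d];
        codes[m] = NearestCodeword(sub, codebooks[m]);
        for (std::size_t d = 0; d < subDim; ++d)
            carry[d] = sub[d] - codebooks[m][codes[m]][d];
    }
    return codes;
}
```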

Installation

We have prepared a file (500+MB) containing 1k images drawn from the ILSVRC-12 validation set for a more accurate speed-test. You can download it from here, and put it under the "ILSVRC12.227x227.IMG" directory.

For the original AlexNet model, you can download the corresponding model files from here, and put them under the "AlexNet/Bin.Files" directory.

Prior to compilation, you need to install ATLAS and OpenVML, and modify the "CXXFLAGS" and "LDFLAGS" entries in the Makefile, if needed. Also, you should append the corresponding library paths to LD_LIBRARY_PATH in the ~/.bashrc. After that, use "make" to generate the executable file and "make run" to perform the speed-test with the above 1k images.

You can also use our code for single image classification (BMP format). Please refer to "src/Main.cc" for details.

Speed-test

The experiment is carried out on a single desktop PC, equipped with an Intel® Core™ i7-4790K CPU and 32GB RAM. All programs are executed in single-thread mode, without GPU acceleration. Note that the run-time speed comparison may vary under different hardware conditions.

We compare the run-time speed of AlexNet, for which Quantized-CNN's theoretical speed-up is 4.15×. For the baseline, we use the Caffe implementation, compiled with ATLAS (the default BLAS choice). We measure the forward-pass time per image, averaged over 100 batches; each batch contains a single image, since in practice a user usually takes one photo with a cellphone and then feeds it into the ConvNet for classification. The experiment is repeated five times, with the results below (a sketch of the measurement loop appears after the results):

Run    CNN (ms)    Quantized-CNN (ms)    Speed-up
1      167.431     55.346                -
2      168.578     55.382                -
3      166.120     55.372                -
4      172.792     55.389                -
5      164.008     55.250                -
Ave.   167.786     55.348                3.03×

Quantized-CNN achieves a 3.03× speed-up over the Caffe implementation, somewhat below the theoretical 4.15× but still quite acceptable. Meanwhile, our method requires much less memory and storage space, which is critical for mobile applications.
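
For reference, the measurement loop behind the table above can be sketched as follows; Network and forwardPass are hypothetical stand-ins rather than the repository's actual classes (see "src/Main.cc" for the real entry point):

```cpp
// Sketch of the timing methodology described above: batch size 1,
// forward passes timed with a monotonic clock, averaged over 100 batches.
#include <chrono>
#include <cstdio>

struct Image {};
struct Network {};

// Stand-in: the real code would run the ConvNet's forward pass here.
void forwardPass(Network&, const Image&) {}

double AverageForwardTimeMs(Network& net, const Image& img,
                            int batches = 100) {
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < batches; ++i)
        forwardPass(net, img);  // one image per batch, as in the speed-test
    const auto end = std::chrono::steady_clock::now();
    const double totalMs =
        std::chrono::duration<double, std::milli>(end - start).count();
    return totalMs / batches;
}

int main() {
    Network net;
    Image img;
    std::printf("Average forward time: %.3f ms\n",
                AverageForwardTimeMs(net, img));
    return 0;
}
```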

Citation

Please cite our paper if it helps your research:

@inproceedings{wu2016quantized,
  author    = {Jiaxiang Wu and Cong Leng and Yuhang Wang and Qinghao Hu and Jian Cheng},
  title     = {Quantized Convolutional Neural Networks for Mobile Devices},
  booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year      = {2016},
}

About

A simple extension of BNN that quantizes the activation function, with both linear and log quantization variants; its 1-bit quantization is exactly BinaryNet. Zero-mean noise is added during forward propagation.
