
Fractional Skipping: Towards Finer-Grained Dynamic CNN Inference [PDF]

Jianghao Shen, Yonggan Fu, Yue Wang, Pengfei Xu, Zhangyang Wang, Yingyan Lin

In AAAI 2020.

Overview

We present DFS (Dynamic Fractional Skipping), a dynamic inference framework that extends binary layer-skipping options with a "fractional skipping" ability, realized by quantizing layer weights and activations into different bitwidths.
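To make the "fractional skipping" idea concrete, below is a minimal sketch of k-bit uniform quantization (illustrative only; the quantization function actually used in this repo is adapted from the Scalable Methods codebase and may differ in detail):

import torch

def quantize_uniform(x, num_bits):
    # Uniformly quantize a tensor to 2**num_bits levels over its value range,
    # then de-quantize back to float; num_bits >= 32 is treated as full precision.
    if num_bits >= 32:
        return x
    qmin, qmax = x.min(), x.max()
    scale = torch.clamp((qmax - qmin) / (2 ** num_bits - 1), min=1e-8)  # avoid divide-by-zero
    q = torch.round((x - qmin) / scale)  # map onto the integer grid
    return q * scale + qmin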

Highlights:

  • Novel integration of two CNN inference mindsets: dynamic layer skipping and static quantization
  • First introduction of input-adaptive quantization at inference time
  • A better accuracy vs. computational cost tradeoff than SkipNet and other relevant competitors

performance_skipnet

Figure 6: Comparing the accuracy vs. computation percentage of DFS-ResNet74 and SkipNet74 on CIFAR10.

Method

DFS

Figure 1. An illustration of the DFS framework, where C1, C2, C3 denote three consecutive convolution layers, each of which consists of a column of filters represented as cuboids. For each layer, the decision is computed by the corresponding gating network, denoted "Gx". In this example, the first conv layer is executed fractionally with a low bitwidth, the second layer is fully executed, and the third one is skipped.
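The per-layer control flow sketched in Figure 1 can be summarized as follows (a schematic only; run_layer, full_conv, and quantized_conv are hypothetical names, not the repo's API):

def run_layer(x, decision, full_conv, quantized_conv):
    # decision is the option chosen by the layer's gating network
    if decision == "skip":
        return x  # bypass the layer entirely; the input is carried forward
    if decision == "full":
        return full_conv(x)  # full-precision execution
    bits = int(decision)  # e.g. "4" -> fractional execution with 4-bit weights/activations
    return quantized_conv(x, num_bits=bits)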

Gating

Figure 2. An illustration of the RNN gate used in DFS. The output is a skipping probability vector, where the green arrows denote the layer skip options (skip/keep) and the blue arrows represent the quantization options. During inference, the skip/keep/quantization option corresponding to the largest vector element is selected and executed.
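A minimal sketch of such a gate, assuming an LSTM over pooled layer features and an argmax over the option vector at inference (the dimensions and option count are illustrative, not the paper's exact configuration):

import torch
import torch.nn as nn

class RNNGate(nn.Module):
    def __init__(self, input_dim=64, hidden_dim=10, num_options=4):
        super().__init__()
        self.rnn = nn.LSTM(input_dim, hidden_dim, batch_first=True)
        self.proj = nn.Linear(hidden_dim, num_options)  # skip / keep / quantization options

    def forward(self, feat, hidden=None):
        # feat: (batch, 1, input_dim) pooled feature from the previous layer
        out, hidden = self.rnn(feat, hidden)
        probs = torch.softmax(self.proj(out.squeeze(1)), dim=-1)  # skipping probability vector
        decision = probs.argmax(dim=-1)  # at inference, pick the most likely option
        return decision, probs, hidden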

Prerequisites

  • Ubuntu
  • Python 3
  • NVIDIA GPU + CUDA cuDNN

Installation

  • Clone this repo:
git clone https://github.com/Torment123/DFS.git
cd DFS
  • Install dependencies:
pip install -r requirements.txt

Usage

  • Workflow: pretrain the ResNet backbone → train gate → train DFS

0. Data Preparation

  • data.py includes the data preparation for the CIFAR-10 and CIFAR-100 datasets.
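For reference, a standard CIFAR-10 loading setup with the usual ResNet augmentation would look roughly like this (illustrative; the exact transforms and loaders used are defined in data.py):

import torchvision
import torchvision.transforms as transforms
from torch.utils.data import DataLoader

normalize = transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2470, 0.2435, 0.2616))
train_tf = transforms.Compose([
    transforms.RandomCrop(32, padding=4),   # standard CIFAR augmentation
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    normalize,
])
test_tf = transforms.Compose([transforms.ToTensor(), normalize])
train_set = torchvision.datasets.CIFAR10('./data', train=True, download=True, transform=train_tf)
test_set = torchvision.datasets.CIFAR10('./data', train=False, download=True, transform=test_tf)
train_loader = DataLoader(train_set, batch_size=128, shuffle=True, num_workers=4)
test_loader = DataLoader(test_set, batch_size=128, shuffle=False, num_workers=4)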

1. Pretrain the ResNet backbone

We first train a base ResNet model in preparation for the subsequent DFS training stages.

CUDA_VISIBLE_DEVICES=0 python3 train_base.py train cifar10_resnet_38 --dataset cifar10 --save-folder save_checkpoints/backbone

2. Train gate

We then add the RNN gate to the pretrained ResNet. The ResNet parameters are frozen, and only the RNN gate is trained until it reaches a zero skip ratio (i.e., full execution). Set minimum = 100, lr = 0.01, iters = 2000.

CUDA_VISIBLE_DEVICES=0 python3 train_sp_integrate_dynamic_quantization_initial.py train cifar10_rnn_gate_38 --minimum 100 --lr 0.01 --resume save_checkpoints/backbone/model_best.pth.tar --iters 2000 --save-folder save_checkpoints/full_execution

3. Train DFS

After the gate is trained to reach full execution, we unfreeze the backbone's parameters and jointly train them with the gate for the specified skip ratio. Set minimum to the target computation percentage and lr = 0.01.

CUDA_VISIBLE_DEVICES=0 python3 train_sp_integrate_dynamic_quantization.py train cifar10_rnn_gate_38 --minimum <specified computation percentage> --lr 0.01 --resume save_checkpoints/full_execution/checkpoint_latest.pth.tar --save-folder save_checkpoints/DFS
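For example, to target roughly half of the full computation (the value 50 below is purely illustrative, not an author-recommended setting):

CUDA_VISIBLE_DEVICES=0 python3 train_sp_integrate_dynamic_quantization.py train cifar10_rnn_gate_38 --minimum 50 --lr 0.01 --resume save_checkpoints/full_execution/checkpoint_latest.pth.tar --save-folder save_checkpoints/DFS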

Acknowledgement

  • The sequential formulation of the dynamic inference problem follows SkipNet.
  • The quantization function is adapted from Scalable Methods.

License

MIT
