
Fractional Skipping: Towards Finer-Grained Dynamic CNN Inference [PDF]

Jianghao Shen, Yonggan Fu, Yue Wang, Pengfei Xu, Zhangyang Wang, Yingyan Lin

In AAAI 2020.

Overview

We present DFS (Dynamic Fractional Skipping), a dynamic inference framework that extends binary layer-skipping options with a "fractional skipping" ability, realized by quantizing layer weights and activations into different bitwidths.
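To make the "fractional skipping" idea concrete, below is a minimal sketch of k-bit uniform quantization (illustrative only; the quantization function actually used in this repo is adapted from the Scalable Methods codebase and may differ in detail):

import torch

def quantize_uniform(x, num_bits):
    # Uniformly quantize a tensor to 2**num_bits levels over its value range,
    # then de-quantize back to float; num_bits >= 32 is treated as full precision.
    if num_bits >= 32:
        return x
    qmin, qmax = x.min(), x.max()
    scale = torch.clamp((qmax - qmin) / (2 ** num_bits - 1), min=1e-8)  # avoid divide-by-zero
    q = torch.round((x - qmin) / scale)  # map onto the integer grid
    return q * scale + qmin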

Highlights:

  • Novel integration of two CNN inference mindsets: dynamic layer skipping and static quantization
  • First introduction of input-adaptive quantization at inference time
  • A better accuracy vs. computational cost tradeoff than SkipNet and other relevant competitors

performance_skipnet

Figure 6: Comparing the accuracy vs. computation percentage of DFS-ResNet74 and SkipNet74 on CIFAR10.

Method

DFS

Figure 1. An illustration of the DFS framework, where C1, C2, C3 denote three consecutive convolution layers, each of which consists of a column of filters represented as cuboids. For each layer, the decision is computed by the corresponding gating network, denoted "Gx". In this example, the first conv layer is executed fractionally with a low bitwidth, the second layer is fully executed, and the third one is skipped.
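The per-layer control flow sketched in Figure 1 can be summarized as follows (a schematic only; run_layer, full_conv, and quantized_conv are hypothetical names, not the repo's API):

def run_layer(x, decision, full_conv, quantized_conv):
    # decision is the option chosen by the layer's gating network
    if decision == "skip":
        return x  # bypass the layer entirely; the input is carried forward
    if decision == "full":
        return full_conv(x)  # full-precision execution
    bits = int(decision)  # e.g. "4" -> fractional execution with 4-bit weights/activations
    return quantized_conv(x, num_bits=bits)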

Gating

Figure 2. An illustration of the RNN gate used in DFS. The output is a skipping probability vector, where the green arrows denote the layer skip options (skip/keep) and the blue arrows represent the quantization options. During inference, the skip/keep/quantization option corresponding to the largest vector element is selected and executed.
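A minimal sketch of such a gate, assuming an LSTM over pooled layer features and an argmax over the option vector at inference (the dimensions and option count are illustrative, not the paper's exact configuration):

import torch
import torch.nn as nn

class RNNGate(nn.Module):
    def __init__(self, input_dim=64, hidden_dim=10, num_options=4):
        super().__init__()
        self.rnn = nn.LSTM(input_dim, hidden_dim, batch_first=True)
        self.proj = nn.Linear(hidden_dim, num_options)  # skip / keep / quantization options

    def forward(self, feat, hidden=None):
        # feat: (batch, 1, input_dim) pooled feature from the previous layer
        out, hidden = self.rnn(feat, hidden)
        probs = torch.softmax(self.proj(out.squeeze(1)), dim=-1)  # skipping probability vector
        decision = probs.argmax(dim=-1)  # at inference, pick the most likely option
        return decision, probs, hidden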

Prerequisites

  • Ubuntu
  • Python 3
  • NVIDIA GPU + CUDA cuDNN

Installation

  • Clone this repo:
git clone https://github.com/Torment123/DFS.git
cd DFS
  • Install dependencies:
pip install -r requirements.txt

Usage

  • Workflow: pretrain the ResNet backbone → train gate → train DFS

0. Data Preparation

  • data.py includes the data preparation for the CIFAR-10 and CIFAR-100 datasets.
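For reference, a standard CIFAR-10 loading setup with the usual ResNet augmentation would look roughly like this (illustrative; the exact transforms and loaders used are defined in data.py):

import torchvision
import torchvision.transforms as transforms
from torch.utils.data import DataLoader

normalize = transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2470, 0.2435, 0.2616))
train_tf = transforms.Compose([
    transforms.RandomCrop(32, padding=4),   # standard CIFAR augmentation
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    normalize,
])
test_tf = transforms.Compose([transforms.ToTensor(), normalize])
train_set = torchvision.datasets.CIFAR10('./data', train=True, download=True, transform=train_tf)
test_set = torchvision.datasets.CIFAR10('./data', train=False, download=True, transform=test_tf)
train_loader = DataLoader(train_set, batch_size=128, shuffle=True, num_workers=4)
test_loader = DataLoader(test_set, batch_size=128, shuffle=False, num_workers=4)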

1. Pretrain the ResNet backbone

We first train a base ResNet model in preparation for the subsequent DFS training stages.

CUDA_VISIBLE_DEVICES=0 python3 train_base.py train cifar10_resnet_38 --dataset cifar10 --save-folder save_checkpoints/backbone

2. Train gate

We then add the RNN gate to the pretrained ResNet. The ResNet parameters are frozen, and only the RNN gate is trained until it reaches a zero skip ratio (i.e., full execution). Set minimum = 100, lr = 0.01, iters = 2000.

CUDA_VISIBLE_DEVICES=0 python3 train_sp_integrate_dynamic_quantization_initial.py train cifar10_rnn_gate_38 --minimum 100 --lr 0.01 --resume save_checkpoints/backbone/model_best.pth.tar --iters 2000 --save-folder save_checkpoints/full_execution

3. Train DFS

After the gate is trained to reach full execution, we unfreeze the backbone's parameters and jointly train them with the gate for the specified skip ratio. Set minimum to the target computation percentage and lr = 0.01.

CUDA_VISIBLE_DEVICES=0 python3 train_sp_integrate_dynamic_quantization.py train cifar10_rnn_gate_38 --minimum <specified computation percentage> --lr 0.01 --resume save_checkpoints/full_execution/checkpoint_latest.pth.tar --save-folder save_checkpoints/DFS
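For example, to target roughly half of the full computation (the value 50 below is purely illustrative, not an author-recommended setting):

CUDA_VISIBLE_DEVICES=0 python3 train_sp_integrate_dynamic_quantization.py train cifar10_rnn_gate_38 --minimum 50 --lr 0.01 --resume save_checkpoints/full_execution/checkpoint_latest.pth.tar --save-folder save_checkpoints/DFS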

Acknowledgement

  • The sequential formulation of the dynamic inference problem follows SkipNet.
  • The quantization function is adapted from Scalable Methods.

License

MIT
