**Table:**
Methods to reduce numerical precision for AlexNet. Accuracy is measured as Top-5 error on ImageNet. * Not applied to first and/or last layers.
![](2016_methods_to_reduce_numerical_precision_for_AlexNet.png)

source: [Efficient Processing of Deep Neural Networks: A Tutorial and Survey](https://arxiv.org/abs/1703.09039), page 22

**Method**:

binary/ternary weights: BC (BinaryConnect), TC (TernaryConnect), TWN (Ternary Weight Networks), BWN (Binary Weight Networks)

binary/ternary weights and binary/ternary activations: BNN (Binarized Neural Networks), XNOR-Nets

n-bit quantization for weights, activations, gradients: DoReFa-Net, QNN (Quantized Neural Networks), Bitwise NN

-------------------

## BNNs

### Article
[Binarized Neural Networks: Training Neural Networks with Weights and Activations Constrained to +1 or −1](https://arxiv.org/abs/1602.02830)

### Author
- Matthieu Courbariaux
- Itay Hubara
- Daniel Soudry
- Ran El-Yaniv
- Yoshua Bengio

### Time
2016.02
### Method
Binary/Ternary
### Ideas/Contributions
Binarization method: deterministic ($b = \mathrm{sign}(x)$, the one used) or stochastic ($b = +1$ with probability $p = \sigma(x) = \mathrm{clip}(\frac{x+1}{2}, 0, 1)$, otherwise $-1$)
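A minimal sketch of the two binarization rules in plain Python. The function names are hypothetical, and mapping `sign(0)` to +1 is an assumption made so that the output always stays in {-1, +1}.

```python
import random

def hard_sigmoid(x):
    """sigma(x) = clip((x + 1) / 2, 0, 1)."""
    return max(0.0, min(1.0, (x + 1.0) / 2.0))

def binarize_deterministic(x):
    """b = sign(x); sign(0) is mapped to +1 (assumption) so b is in {-1, +1}."""
    return 1 if x >= 0 else -1

def binarize_stochastic(x, rng=random):
    """b = +1 with probability hard_sigmoid(x), otherwise -1."""
    return 1 if rng.random() < hard_sigmoid(x) else -1
```

For inputs outside [-1, 1] the stochastic rule becomes deterministic, since the probability saturates at 0 or 1.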

Binarized activations+weights, during training and test

Gradients propagating method: straight-through estimator
$$q = \mathrm{sign}(r)$$
$$g _ r = g _ q \mathbb{1}_{|r| \leq 1}$$
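The two equations above can be sketched as a forward/backward pair. This is only an illustration on Python lists; the function names and the sample values are made up.

```python
def sign_forward(r):
    """Forward pass: q = sign(r), with sign(0) taken as +1 (assumption)."""
    return [1 if v >= 0 else -1 for v in r]

def sign_backward_ste(r, g_q):
    """Backward pass: g_r = g_q * 1[|r| <= 1].

    The gradient passes through unchanged where |r| <= 1 and is
    cancelled where the pre-activation is saturated.
    """
    return [g if abs(v) <= 1 else 0.0 for v, g in zip(r, g_q)]

r = [-1.5, -0.3, 0.0, 0.8, 2.0]      # pre-activations (illustrative)
g_q = [0.1, 0.2, 0.3, 0.4, 0.5]      # gradient w.r.t. q (illustrative)
q = sign_forward(r)
g_r = sign_backward_ste(r, g_q)
```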

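First layer: the 8-bit input image is processed as 8 bit planes; each bit plane $x^n$ is multiplied with the binary weights, the result is scaled by $2^{n-1}$, and the 8 partial results are summed.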
$$s = x \cdot w^b$$
$$s = \sum_{n=1}^{8}{2^{n-1}(x^n \cdot w^b)}$$
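A small check that the bit-plane sum reproduces the ordinary dot product, assuming unsigned 8-bit inputs and $n = 1$ as the least significant bit. The helper names and sample values are hypothetical.

```python
def bit_plane(x, n):
    """n-th bit plane of a list of 8-bit ints (n = 1 is the least significant bit)."""
    return [(v >> (n - 1)) & 1 for v in x]

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def first_layer_dot(x, w_b):
    """s = sum_{n=1..8} 2^(n-1) * (x^n . w^b), using only binary-valued operands."""
    return sum((1 << (n - 1)) * dot(bit_plane(x, n), w_b) for n in range(1, 9))

x = [0, 17, 128, 255]     # 8-bit pixel values (illustrative)
w_b = [1, -1, -1, 1]      # binary weights
```

Here `first_layer_dot(x, w_b)` and `dot(x, w_b)` both evaluate to 110.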

Weight clipping: (Update) $W \leftarrow \mathrm{clip}(W, -1, 1)$
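A sketch of the update-then-clip step, assuming plain SGD on the real-valued weights (the paper also describes other optimizers). The function names and numbers are illustrative only.

```python
def clip(v, lo=-1.0, hi=1.0):
    return max(lo, min(hi, v))

def sgd_step_with_clip(weights, grads, lr):
    """Update the real-valued weights, then clip them back into [-1, 1]."""
    return [clip(w - lr * g) for w, g in zip(weights, grads)]

w = sgd_step_with_clip([0.95, -0.2, -0.99], [-1.0, 0.5, 2.0], lr=0.1)
```

The binary weights used in the forward pass would then be the signs of these clipped real-valued weights.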

Others: shift-based Batch Normalizing Transform (Batch Norm implemented with bit-shift operations)
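The shift-based variants rely on replacing a multiplier by its approximate power of two, $\mathrm{AP2}(x) = \mathrm{sign}(x) \cdot 2^{\mathrm{round}(\log_2 |x|)}$, so that the multiplication becomes a shift. A rough sketch, using `math.ldexp` as a floating-point stand-in for the binary shift; the function names are hypothetical and mapping 0 to 0 is an assumption.

```python
import math

def ap2(x):
    """Approximate power of two: sign(x) * 2**round(log2(|x|)); 0 maps to 0 (assumption)."""
    if x == 0:
        return 0.0
    exponent = round(math.log2(abs(x)))
    return math.copysign(2.0 ** exponent, x)

def multiply_by_shift(a, x):
    """Approximate a * x as a shifted by round(log2(|x|)) binary places."""
    if x == 0:
        return 0.0
    exponent = round(math.log2(abs(x)))
    shifted = math.ldexp(a, exponent)   # a * 2**exponent, i.e. a binary shift
    return shifted if x > 0 else -shifted
```

For example, a multiplier of 3.0 is approximated by 4.0, so `multiply_by_shift(1.5, 3.0)` gives 6.0 instead of 4.5.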

Others: Shift based AdaMax

Training: Inputs(R), Weights({-1, 1}), Activations({-1, 1})  
Deployment: Inputs(R), Weights({-1, 1}), Activations({-1, 1}) 

### Experiments

Classification test error rates

Network | MNIST | SVHN | CIFAR-10 | ImageNet
--|--|--|--|--
BNNs | 1.40% | 2.53% | 10.15% | -

### Other Results
Assuming we have $M_l$ filters in the $l$-th convolutional layer, we have to store a 4D weight matrix of size $M_l \times M_{l-1} \times k \times k$. Consequently, the number of possible unique filters is $2^{k^2 M_{l-1}}$.


For example, in our ConvNet architecture trained on the CIFAR-10 benchmark, there are only 42% unique filters per layer on average.

Seven Times Faster on GPU at Run-Time

In comparison with 32-bit DNNs, BNNs require 32 times smaller memory size and 32 times fewer memory accesses.
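The 32x figure follows from storing one bit per weight instead of a 32-bit float: 32 binary weights fit in a single 32-bit word. Once values are packed this way, a dot product of ±1 vectors can be computed with XNOR and a population count. A minimal sketch with Python integers standing in for machine words; the encoding (bit 1 means +1) and the function names are assumptions.

```python
def pack_bits(values):
    """Pack +1/-1 values into an int; bit i is 1 when values[i] == +1."""
    word = 0
    for i, v in enumerate(values):
        if v == 1:
            word |= 1 << i
    return word

def xnor_popcount_dot(a_word, b_word, n):
    """Dot product of two packed +1/-1 vectors of length n.

    XNOR marks positions where the signs agree; each agreement adds +1
    and each disagreement adds -1, so the result is 2 * agreements - n.
    """
    mask = (1 << n) - 1
    agreements = bin(~(a_word ^ b_word) & mask).count("1")
    return 2 * agreements - n

a = [1, -1, 1, 1, -1, -1, 1, -1]
b = [1, 1, -1, 1, -1, 1, 1, -1]
result = xnor_popcount_dot(pack_bits(a), pack_bits(b), len(a))
```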

## BNNs(tune, ImageNet)

### Article

Binarized Neural Networks on the ImageNet Classification Task

### Author

- Xundong Wu
- Yong Wu
- Yong Zhao

### Time
2016.04

### Method

binary/ternary

### Ideas/Contributions

First Layer: regular weight convolutional layer

Training: When we use a lower learning rate of 0.001, we observed a weight distribution more uniform or concentrated around 0, indicating the weights are taking more steps to travel between -1 and +1.

dropout ratio of 0.2 was used for the dropout layer

wider than usual layers used for the first and second layers, to avoid information bottleneck associated with binarized networks.

### Experiments

- 13 layer network trained with regular targets achieved 80% top-5 accuracy.
- 13 layer network trained with soft targets. 
- Fine tuning the 13 layer network with combined soft and regular target achieved 84.1% top-5 accuracy

With a moderate size network of 13 layers, we obtained top-5 classification accuracy rate of 84.1 percent on validation set through network distillation, much better than previous published results of 73.2% on XNOR network and 69.1% on binarized GoogleNET.
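The notes do not record how the soft and regular targets were combined. One common form of a distillation loss is a weighted sum of the cross-entropy against the hard label and the cross-entropy against the teacher's softened output. The sketch below assumes that form; `alpha` and `temperature` are hypothetical parameters, not values from the paper.

```python
import math

def softmax(logits, temperature=1.0):
    scaled = [z / temperature for z in logits]
    m = max(scaled)                       # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(target_probs, pred_probs):
    return -sum(t * math.log(p) for t, p in zip(target_probs, pred_probs) if t > 0)

def combined_loss(student_logits, teacher_logits, label, alpha=0.5, temperature=2.0):
    """alpha * CE(hard label) + (1 - alpha) * CE(teacher soft targets)."""
    one_hot = [1.0 if i == label else 0.0 for i in range(len(student_logits))]
    hard = cross_entropy(one_hot, softmax(student_logits))
    soft = cross_entropy(softmax(teacher_logits, temperature),
                         softmax(student_logits, temperature))
    return alpha * hard + (1.0 - alpha) * soft
```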


-----------------

## GXNOR networks

**Article**: [Gated XNOR Networks: Deep Neural Networks with Ternary Weights and Activations under a Unified Discretization Framework](https://arxiv.org/abs/1705.09283)

**Author**: Lei Deng, Peng Jiao, Jing Pei, Zhenzhi Wu, Guoqi Li

**Time**: 2017.5

**Method**: Binary/Ternary

**Ideas/Contributions**:

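1. Weights and activations both lie in the discrete space $\mathbb{Z}_N$; binary and ternary are special cases.
2. A new gradient approximation is given for the ternary quantization function (different from BNN or XNOR-Net).
3. (Discrete state transition, DST method) Weights are updated probabilistically (i.e. what is propagated backward is discrete values rather than continuous real numbers).
4. "Gated" circuits, which favor hardware implementation.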
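One way to read the "gated" idea for the ternary case: an XNOR unit only needs to fire when both the weight and the activation are non-zero, and is skipped otherwise. The sketch below follows that reading; the function name and the returned count of active units are illustrative, not taken from the paper.

```python
def gated_xnor_dot(x, w):
    """Ternary dot product over {-1, 0, 1} without multiplications.

    A position contributes only when both operands are non-zero (the gate
    is open); it then adds +1 if the signs agree and -1 otherwise.
    Returns (dot product, number of active XNOR units).
    """
    total = 0
    active = 0
    for xi, wi in zip(x, w):
        if xi == 0 or wi == 0:
            continue                  # gate closed: no computation needed
        active += 1
        total += 1 if xi == wi else -1
    return total, active

x = [1, 0, -1, 1, -1, 0]
w = [1, 1, 0, -1, -1, 0]
total, active = gated_xnor_dot(x, w)
```

In this example only 3 of the 6 positions require an XNOR operation.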

**Experiments**:

The ternary GXNOR reaches state-of-the-art accuracy on MNIST, CIFAR-10 and SVHN.


---------------

## LBC
**Not yet read**

**Article**: [Local Binary Convolutional Neural Networks](https://arxiv.org/abs/1608.06049)

**Author**: Felix Juefei-Xu, Vishnu Naresh Boddeti, Marios Savvides

**Time**: 2016.08

**Method**: Compress

**Ideas/Contributions**:

Local binary patterns (LBP)

**Experiments**:

- Maintains accuracy close to that of regular CNNs on MNIST, CIFAR-10, SVHN and ImageNet
- Discrete, binarized weights; 9x to 169x fewer parameters (depending on filter size)
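The 9x to 169x range matches filter sizes from 3x3 to 13x13, under the reading that a standard convolution learns $p \cdot q \cdot h \cdot w$ weights while an LBC layer learns only the $m \cdot q$ weights of its 1x1 combination step (the binary filters being fixed). A small sketch under that assumption; the symbol and function names are chosen here for illustration.

```python
def conv_params(p, q, h, w):
    """Learnable weights of a standard conv layer: p input channels, q output channels, h x w filters."""
    return p * q * h * w

def lbc_learnable_params(m, q):
    """Learnable weights of an LBC layer: only the 1x1 weights combining m fixed binary filter responses."""
    return m * q

def savings(p, q, h, w, m):
    return conv_params(p, q, h, w) / lbc_learnable_params(m, q)
```

With `m == p` the ratio reduces to `h * w`, i.e. 9 for 3x3 filters and 169 for 13x13 filters.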

[Low-Precision Batch-Normalized Activations](https://arxiv.org/abs/1702.08231)

Roughly: replaces Batch Normalization and ReLU with low-precision versions to reduce computation and memory.


-------------------

**Article**: [Network Sketching: Exploiting Binary Structure in Deep CNNs](https://arxiv.org/abs/1706.02021)

**Author**: Yiwen Guo, Anbang Yao, Hao Zhao, Yurong Chen

**Time**: 2017.06

**Method**: binary-weight CNNs



------------

## TNNs

**Article**: [Ternary Neural Networks for Resource-Efficient AI Applications](https://arxiv.org/abs/1609.00222)

**Author**: Hande Alemdar, Vincent Leroy, Adrien Prost-Boucle, Frédéric Pétrot

**Time**: 2016.09

**Method**: Binary/Ternary

**Ideas/Contributions**:

Training: Inputs({-1, 0, 1}), Weights(R), Activations({-1, 0, 1})  
Deployment: Inputs({-1, 0, 1}), Weights({-1, 0, 1}), Activations({-1, 0, 1})

Fully discretized (no floating-point arithmetic or multiplications at test time)
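A rough sketch of what a multiplication-free ternary neuron could look like at test time: an integer accumulator followed by a two-threshold ternary activation. The thresholds `t_lo` and `t_hi`, their strict inequalities, and the function name are assumptions for illustration, not details recorded in these notes.

```python
def ternary_neuron(inputs, weights, t_lo, t_hi):
    """Ternary inputs and weights in {-1, 0, 1}; integer-only, no multiplications.

    Output is +1 above t_hi, -1 below t_lo, and 0 in between.
    """
    acc = 0
    for x, w in zip(inputs, weights):
        if x == 0 or w == 0:
            continue                 # zero operand contributes nothing
        acc += 1 if x == w else -1   # same sign adds, opposite sign subtracts
    if acc > t_hi:
        return 1
    if acc < t_lo:
        return -1
    return 0
```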

FPGA and ASIC hardware implementations

**Experiments**:

Network | MNIST | CIFAR10 | SVHN | GTSRB | CIFAR100
--|--|--|--|--|--
TNNs | 1.67 | 12.11 | 2.73 | 0.98 | 48.40


------------

## TNNs

**Article**: [Ternary Neural Networks with Fine-Grained Quantization](https://arxiv.org/abs/1705.01462)

**Author**: Naveen Mellempudi, Abhisek Kundu, Dheevatsa Mudigere, Dipankar Das, Bharat Kaul, Pradeep Dubey

**Time**: 2017.05

**Method**: Compress

**Ideas/Contributions**:

ternary weights, 8/4bit activations, with minimal or no re-training, and yet achieving near state-of-art accuracy.

propose a novel fine-grained quantization (FGQ) method to convert pre-trained models to a ternary representation with minimal loss in test accuracy, without re-training.
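The core of a fine-grained scheme is that each small group of weights gets its own threshold and scaling factor, rather than one pair per layer. The sketch below illustrates only that grouping idea; the per-group threshold uses the 0.7 x mean-absolute-value heuristic from Ternary Weight Networks as a stand-in, which is an assumption and not the threshold selection of the FGQ paper.

```python
def ternarize_group(group):
    """Return (alpha, codes) for one group, with codes in {-1, 0, 1}."""
    mean_abs = sum(abs(v) for v in group) / len(group)
    delta = 0.7 * mean_abs                       # stand-in threshold (assumption)
    kept = [abs(v) for v in group if abs(v) > delta]
    alpha = sum(kept) / len(kept) if kept else 0.0
    codes = [(1 if v > 0 else -1) if abs(v) > delta else 0 for v in group]
    return alpha, codes

def fine_grained_ternarize(weights, group_size):
    """Split weights into consecutive groups and ternarize each independently."""
    return [ternarize_group(weights[i:i + group_size])
            for i in range(0, len(weights), group_size)]

weights = [0.9, -0.05, 0.8, -1.0, 0.01, 0.02, -0.03, 0.001]
groups = fine_grained_ternarize(weights, group_size=4)
```

In this example the second group has much smaller magnitudes; with a single layer-wide threshold it would collapse to all zeros, whereas its own threshold and scale preserve some structure.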

**Experiments**:

achieve classification accuracy (Top-1) of 73.85% with 8-bit activations (2w-8a) and 70.69% with 4-bit activations (2w-4a), on the ImageNet dataset using a pre-trained Resnet-101 model (no re-training).

------------

## TWN

**Article**: [Ternary weight networks](https://arxiv.org/abs/1605.04711)

**Author**: Fengfu Li, Bo Zhang, Bin Liu

**Time**: 2016.05

**Method**: ternary-weight

**Ideas/Contributions**:

Weights are ternarized, with a scaling factor.
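As a reminder of what this looks like, the threshold-based rule as it is usually stated for TWN is written out below. The notation $W^t$, $\Delta$, $\alpha$, $I_\Delta$ and $n$ (the number of weights) is introduced here for illustration, and the factor 0.7 is the paper's approximate threshold rule.

$$W^t_i = \begin{cases} +1, & W_i > \Delta \\ 0, & |W_i| \leq \Delta \\ -1, & W_i < -\Delta \end{cases}$$
$$\Delta \approx 0.7 \cdot \frac{1}{n} \sum_{i=1}^{n} |W_i|, \qquad \alpha = \frac{1}{|I_\Delta|} \sum_{i \in I_\Delta} |W_i|, \qquad I_\Delta = \{ i : |W_i| > \Delta \}$$

The full-precision weights are then approximated as $W \approx \alpha W^t$.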

**Experiments**:

Validation accuracies (%). Results on ImageNet are with ResNet-18 / ResNet-18B.
![Validation accuracies (%). Results on ImageNet are with ResNet-18 / ResNet-18B.](TWN_result.png)

------------

## BFCN

**Article**: [Training Bit Fully Convolutional Network for Fast Semantic Segmentation](https://arxiv.org/abs/1612.00212)

**Author**: He Wen, Shuchang Zhou, Zhe Liang, Yuxiang Zhang, Dieqiao Feng, Xinyu Zhou, Cong Yao

**Time**: 2016.12

**Method**: quantization

**Ideas/Contributions**:

Apply on Semantic Segmentation: propose BFCN, an FCN that has low bit-width weights and activations, which is an extension to the combination of methods from Binarized Neural Network, XNOR-net and DoReFa-net.

replace the convolutional filter in reconstruction with residual blocks to better suit the need of low bit-width network.

propose a novel bit-width decay method to train BFCN with better performance.


**Experiments**:

network |VOC12| Cityscapes| speedup
--|--|--|--
32-bit FCN| 69.8% |62.1%| 1x
2-bit BFCN | 67.0% | 60.3% | 4.1x
1-2 BFCN | 62.8% | 57.4% | 7.8x

## TTQ
**Article**: [Trained Ternary Quantization](https://arxiv.org/abs/1612.01064)

**Authors**: Chenzhuo Zhu, Song Han, Huizi Mao, William J. Dally

**Time**: 2016.12

**Method**: ternary-weight

**Ideas/Contributions**:

During ternarization, the positive and negative scaling factors are different and both are trainable.
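A sketch of the forward quantization step only, assuming a threshold set as a fraction `t` of the largest absolute weight. In TTQ the two scales are learned by backpropagation; here `w_p` and `w_n` are simply passed in, and the default `t=0.05` and the function name are illustrative assumptions.

```python
def ttq_quantize(weights, w_p, w_n, t=0.05):
    """Map each weight to +w_p, 0 or -w_n using the threshold delta = t * max|w|."""
    delta = t * max(abs(v) for v in weights)
    quantized = []
    for v in weights:
        if v > delta:
            quantized.append(w_p)      # positive weights share the scale w_p
        elif v < -delta:
            quantized.append(-w_n)     # negative weights share a different scale w_n
        else:
            quantized.append(0.0)
    return quantized

q = ttq_quantize([1.0, -0.5, 0.02, -0.04, 0.3], w_p=0.8, w_n=1.2)
```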


**Experiments**:

Top1 and Top5 error rate of AlexNet on ImageNet:

Error | Full precision | 1-bit (DoReFa) | 2-bit (TWN) | 2-bit (Ours)
--|--|--|--|--
Top1 | 42.8% | 46.1% | 45.5% | 42.5%
Top5 | 19.7% | 23.7% | 23.2% | 20.3%


## XNOR-Net
**Article**: [XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks](https://arxiv.org/abs/1603.05279)

**Authors**: Mohammad Rastegari, Vicente Ordonez, Joseph Redmon, Ali Farhadi

**Time**: 2016.03

**Method**: binary-weight+binary

**Ideas/Contributions**:

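Adds scaling factors on top of BinaryNets.

Two architectures: Binary-Weight-Networks and XNOR-Net.

B-A-C-P block structure (BatchNorm, binary activation, binary convolution, pooling)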
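The scaling-factor idea for Binary-Weight-Networks can be sketched as $W \approx \alpha B$ with $B = \mathrm{sign}(W)$ and $\alpha$ the mean absolute value of $W$. The example below works on a flat list for simplicity (the paper applies it per filter), and the function names are hypothetical.

```python
def binarize_with_scale(weights):
    """Return (alpha, B) with B = sign(W) and alpha = mean(|W|)."""
    alpha = sum(abs(v) for v in weights) / len(weights)
    b = [1 if v >= 0 else -1 for v in weights]
    return alpha, b

def squared_error(weights, approx):
    return sum((w - a) ** 2 for w, a in zip(weights, approx))

weights = [0.1, -0.3, 0.2, -0.2]
alpha, b = binarize_with_scale(weights)
err_scaled = squared_error(weights, [alpha * v for v in b])   # alpha * sign(W)
err_unscaled = squared_error(weights, b)                      # sign(W) alone
```

In this example the scaled approximation is far closer to the original weights than the bare signs.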

**Experiments**:

This results in 58× faster convolutional operations (in terms of number of the high precision operations) and 32× memory savings.

BWN and XNOR-Net outperform BinaryConnect(BC) and BinaryNet(BNN) in all the epochs by large margin(∼17%).

## QNN
**Article**: [Quantized Neural Networks: Training Neural Networks with Low Precision Weights and Activations](https://arxiv.org/abs/1609.07061)

**Authors**: Itay Hubara, Matthieu Courbariaux, Daniel Soudry, Ran El-Yaniv, Yoshua Bengio

**Time**: 2016.09

**Method**: Quantization

**Ideas/Contributions**:

Relaxes Binarized Neural Networks, moving from binary values to multi-bit quantization.
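To illustrate the step from binary to multi-bit, here is a generic uniform quantizer over a fixed range. It is not the exact quantizer from the QNN paper; the range [-1, 1] and the function name are assumptions.

```python
def quantize_uniform(x, bits, lo=-1.0, hi=1.0):
    """Clip x to [lo, hi] and round it to one of 2**bits evenly spaced levels."""
    steps = (1 << bits) - 1            # number of intervals between lo and hi
    x = max(lo, min(hi, x))
    step = (hi - lo) / steps
    return lo + round((x - lo) / step) * step
```

With `bits=1` the only levels are -1 and +1, which is the binary case; with `bits=2` the levels are -1, -1/3, 1/3 and 1.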

**Experiments**:

![](QNN_experiments.png)

More experiments on RNNs and on GoogLeNet.