is BatchNorm binarized? #714

Open
emiliopaolini opened this issue Oct 18, 2021 · 5 comments
@emiliopaolini

Hi everyone, I have a simple question: if I look at the summary of any model in Larq, it looks like the computation for batch normalization has been binarized. I would like to know how this is done: does it adopt the same strategy as the convolution/dense layers, or am I wrong and batch normalization is not binarized at all? Thank you very much!

@Tombana
Contributor

Tombana commented Oct 20, 2021

Hi @emiliopaolini,

The model summary should show the batch normalization parameters as 32-bit; they are not binarized. When the model is converted to a TFLite file, the batch normalization layer can be fused with the convolution layer, but internally those batchnorm parameters are still 32-bit.
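As a minimal sketch (assuming the standard Larq API, with an illustrative toy model), you can see this directly in the summary: the QuantConv2D kernel is listed as 1-bit, while the BatchNormalization parameters show up in the 32-bit column.

```python
import tensorflow as tf
import larq as lq

# Toy example: a binarized conv followed by a full-precision BatchNorm.
model = tf.keras.Sequential([
    lq.layers.QuantConv2D(
        32, (3, 3),
        input_quantizer="ste_sign",
        kernel_quantizer="ste_sign",
        kernel_constraint="weight_clip",
        use_bias=False,
        input_shape=(32, 32, 3),
    ),
    tf.keras.layers.BatchNormalization(),
])

# The Larq summary reports the conv kernel as 1-bit and the
# batchnorm parameters as 32-bit.
lq.models.summary(model)
```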

@emiliopaolini
Author

Okay, I understand. So, for example, if I wanted to fuse the conv layer and the batch norm, and thus remove the batch normalization layer, do you think it is possible? I know it can be done when the network is not quantized (https://github.com/yaysummeriscoming/Merge_Batch_Norm/blob/master/Remove%20BN%20Transform.pdf). Does the same apply to quantized neural networks? Thank you very much!

@Tombana
Contributor

Tombana commented Oct 20, 2021

Yes, that is possible; TFLite supports this. If the network is not quantized, the batch normalization values will be fused into the weight matrix and the bias vector.
If the network is int8-quantized, then the TFLite specification states that a convolution filter has "per-channel quantization parameters", which means each channel gets a 32-bit scale factor, which effectively includes the batchnorm multiplier.
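For the non-quantized (float) case, the standard fusion looks like this; a minimal numpy sketch (function and variable names here are illustrative, not from TFLite):

```python
import numpy as np

def fuse_conv_bn(W, b, gamma, beta, mean, var, eps=1e-3):
    """Fold BatchNorm parameters into a float convolution.

    W:     conv kernel of shape (kh, kw, in_ch, out_ch)
    b:     conv bias of shape (out_ch,); pass zeros if the conv has no bias
    gamma, beta, mean, var: per-output-channel BatchNorm parameters
    """
    scale = gamma / np.sqrt(var + eps)     # per-channel multiplier
    W_fused = W * scale                    # broadcasts over the output-channel axis
    b_fused = (b - mean) * scale + beta    # fold mean and beta into the bias
    return W_fused, b_fused
```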

@emiliopaolini
Author

So, just to understand: can I apply the same equations that are used in the non-quantized case? And inference is then still performed using the binary fused parameters, right?

@Tombana
Contributor

Tombana commented Oct 21, 2021

For float or int8 convolutions, the batchnorm coefficients can be fused into the weight matrix.
For binary convolutions, it is not possible to fuse these multipliers into the weight matrix, because the weight matrix would then no longer be {-1,+1}-valued (it would no longer be a binary convolution). However, Larq Compute Engine still supports a 'fused' batchnorm in the following sense: after accumulating the -1/+1 products, the batchnorm computation is applied before the results are written back to memory. In other words, the batchnorm is no longer a separate layer, but is fused into the convolution.

So in terms of the equations in your PDF: gamma / sqrt( sigma^2 + epsilon ) becomes a single coefficient. For float and int8 layers, that coefficient is fused into the weights themselves. For binary layers, that coefficient is stored separately, but the computation is still fused together with the convolution.
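Schematically, the binary case looks something like the following numpy sketch (purely illustrative, using a 1x1 "convolution" as a matrix product, not the actual LCE kernel):

```python
import numpy as np

def binary_conv_with_fused_bn(x, W_bin, multiplier, bias):
    """Illustrative binary 1x1 conv with the batchnorm fused in.

    x:          activations of shape (n, in_ch), values in {-1, +1}
    W_bin:      weights of shape (in_ch, out_ch), values in {-1, +1}
    multiplier: per-channel gamma / sqrt(sigma^2 + eps), shape (out_ch,)
    bias:       per-channel fused bias term, shape (out_ch,)
    """
    acc = x @ W_bin                    # integer accumulation of -1/+1 products
    return acc * multiplier + bias     # batchnorm applied in the same pass,
                                       # before the result is written out
```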
