is BatchNorm binarized? #714

Open
emiliopaolini opened this issue Oct 18, 2021 · 5 comments
@emiliopaolini

Hi everyone, I have a simple question: if I look at the summary of any model in Larq, it looks like the computation for batch normalization has been binarized. I would like to know how this is done: does it adopt the same strategy as the convolution/dense layers, or am I wrong and batch normalization is not binarized at all? Thank you very much!

@Tombana
Contributor

Tombana commented Oct 20, 2021

Hi @emiliopaolini,

The model summary should show the batch normalization parameters as 32-bit; they are not binarized. When the model is converted to a TFLite file, the batch normalization layer can be fused with the convolution layer, but internally those batchnorm parameters are still 32-bit.
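As a minimal sketch (assuming the standard Larq API, with an illustrative toy model), you can see this directly in the summary: the QuantConv2D kernel is listed as 1-bit, while the BatchNormalization parameters show up in the 32-bit column.

```python
import tensorflow as tf
import larq as lq

# Toy example: a binarized conv followed by a full-precision BatchNorm.
model = tf.keras.Sequential([
    lq.layers.QuantConv2D(
        32, (3, 3),
        input_quantizer="ste_sign",
        kernel_quantizer="ste_sign",
        kernel_constraint="weight_clip",
        use_bias=False,
        input_shape=(32, 32, 3),
    ),
    tf.keras.layers.BatchNormalization(),
])

# The Larq summary reports the conv kernel as 1-bit and the
# batchnorm parameters as 32-bit.
lq.models.summary(model)
```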

@emiliopaolini
Author

Okay, I understand. So, for example, if I wanted to fuse the conv layer and the batch norm, and thus remove the batch normalization layer, do you think it is possible? I know it can be done when the network is not quantized (https://github.com/yaysummeriscoming/Merge_Batch_Norm/blob/master/Remove%20BN%20Transform.pdf). Does the same apply to quantized neural networks? Thank you very much!

@Tombana
Contributor

Tombana commented Oct 20, 2021

Yes, that is possible; TFLite supports this. If the network is not quantized, the batch normalization values will be fused into the weight matrix and the bias vector.
If the network is int8-quantized, then the TFLite specification states that a convolution filter has "per-channel quantization parameters", which means each channel gets a 32-bit scale factor, which effectively includes the batchnorm multiplier.
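For the non-quantized (float) case, the standard fusion looks like this; a minimal numpy sketch (function and variable names here are illustrative, not from TFLite):

```python
import numpy as np

def fuse_conv_bn(W, b, gamma, beta, mean, var, eps=1e-3):
    """Fold BatchNorm parameters into a float convolution.

    W:     conv kernel of shape (kh, kw, in_ch, out_ch)
    b:     conv bias of shape (out_ch,); pass zeros if the conv has no bias
    gamma, beta, mean, var: per-output-channel BatchNorm parameters
    """
    scale = gamma / np.sqrt(var + eps)     # per-channel multiplier
    W_fused = W * scale                    # broadcasts over the output-channel axis
    b_fused = (b - mean) * scale + beta    # fold mean and beta into the bias
    return W_fused, b_fused
```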

@emiliopaolini
Author

So, just to understand: can I apply the same equations that are used in the non-quantized case? And inference is then still performed using the binary fused parameters, right?

@Tombana
Contributor

Tombana commented Oct 21, 2021

For float or int8 convolutions, the batchnorm coefficients can be fused into the weight matrix.
For binary convolutions, it is not possible to fuse these multipliers into the weight matrix, because the weight matrix would then no longer be {-1,+1}-valued (it would no longer be a binary convolution). However, Larq Compute Engine still supports a 'fused' batchnorm in the following sense: after accumulating the -1/+1 products, the batchnorm computation is applied before the results are written back to memory. In other words, the batchnorm is no longer a separate layer, but is fused into the convolution.

So in terms of the equations in your PDF: gamma / sqrt( sigma^2 + epsilon ) becomes a single coefficient. For float and int8 layers, that coefficient is fused into the weights themselves. For binary layers, that coefficient is stored separately, but the computation is still fused together with the convolution.
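Schematically, the binary case looks something like the following numpy sketch (purely illustrative, using a 1x1 "convolution" as a matrix product, not the actual LCE kernel):

```python
import numpy as np

def binary_conv_with_fused_bn(x, W_bin, multiplier, bias):
    """Illustrative binary 1x1 conv with the batchnorm fused in.

    x:          activations of shape (n, in_ch), values in {-1, +1}
    W_bin:      weights of shape (in_ch, out_ch), values in {-1, +1}
    multiplier: per-channel gamma / sqrt(sigma^2 + eps), shape (out_ch,)
    bias:       per-channel fused bias term, shape (out_ch,)
    """
    acc = x @ W_bin                    # integer accumulation of -1/+1 products
    return acc * multiplier + bias     # batchnorm applied in the same pass,
                                       # before the result is written out
```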
