Hi,
I've noticed that the QuantAct layers preceding IntLayerNorm in the IBertSelfOutput and IBertOutput modules specify a 22-bit activation width, while the QuantAct layer preceding IntLayerNorm in IBertEmbedding specifies a 16-bit activation width.
I couldn't find any mention of these bit width choices in the paper. Could you please explain why these choices have been made?
Thank you!
Those numbers are manually chosen to (1) avoid overflow and (2) minimize accuracy degradation.
We find that activations in Embedding layers are fairly regular and contain fewer outliers, allowing 16-bit quantization without accuracy degradation. In contrast, activations in Transformer layers contain more outliers (sometimes orders of magnitude larger), so assigning 16 bits to them can have a significant impact on accuracy. We find that 22 bits is a large enough bit width to avoid a performance drop while still avoiding overflow in the subsequent IntLayerNorm layers. It would therefore also be fine to use 22 bits in the Embedding layers - that would simply be a more conservative choice.
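To see why outliers make the bit width matter, here is a minimal sketch (not the actual I-BERT code; the function names and sample values are illustrative) of symmetric uniform quantization. A single large outlier stretches the quantization scale, so at 16 bits the typical, small-magnitude activations land on a coarse grid, while 22 bits keeps their resolution:

```python
# Hedged sketch: effect of activation bit width under symmetric uniform
# quantization when an outlier stretches the dynamic range.
# Names and values are illustrative, not taken from the I-BERT source.

def quantize(x, num_bits):
    """Symmetric uniform quantization of a list of floats.

    Returns the integer codes and the scale (step size)."""
    qmax = 2 ** (num_bits - 1) - 1
    scale = max(abs(v) for v in x) / qmax
    return [round(v / scale) for v in x], scale

def max_rel_error(x, q, scale, outlier_threshold=10.0):
    """Worst relative rounding error on the 'typical' (non-outlier) values."""
    return max(
        abs(qi * scale - xi) / abs(xi)
        for xi, qi in zip(x, q)
        if 0 < abs(xi) < outlier_threshold
    )

# Typical activations around ~1.0 plus one outlier ~1000x larger,
# mimicking the outliers described for Transformer activations.
acts = [0.8, -1.2, 0.5, 2.0, -0.9, 1000.0]

q16, s16 = quantize(acts, 16)
q22, s22 = quantize(acts, 22)

print(max_rel_error(acts, q16, s16))  # on the order of 1e-2
print(max_rel_error(acts, q22, s22))  # on the order of 1e-4
```

With the outlier present, the 16-bit grid already introduces percent-level rounding error on the small values, whereas 22 bits keeps the error orders of magnitude smaller - consistent with the observation above that 16 bits suffices only where outliers are rare.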