
A question on the nonlinear layers #1

Open
Kevinpsk opened this issue Dec 10, 2021 · 2 comments

Hi there,

Thanks a lot for releasing the code.
I have a question regarding the nonlinear layers such as GELU, softmax, or even LayerNorm (since it contains an RSQRT). If I understand your code correctly, you use the floating-point implementations of these operations in the QAT model. Does this mean that we are not accurately simulating the quantized behaviour of these layers in the QAT model? Or are these layers implemented as look-up tables (or with full-integer kernels) on hardware devices, such that not simulating them in QAT has minimal impact on quantized model performance? Could you clarify this a bit more?
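To illustrate what I mean, here is a minimal sketch (my own illustration, not code from this repo) of how QDQ / fake quantization typically wraps a float nonlinearity: only the output tensor is snapped to the integer grid, while the op itself still runs in floating point.

```python
# Illustrative sketch only: QAT with QDQ keeps the nonlinearity in float
# and only fake-quantizes its output activation.
import torch
import torch.nn as nn

def fake_quantize(x, scale, zero_point, qmin=0, qmax=255):
    # quantize-dequantize (QDQ): round to the integer grid, then map back to float
    q = torch.clamp(torch.round(x / scale) + zero_point, qmin, qmax)
    return (q - zero_point) * scale

class QDQSoftmax(nn.Module):
    """Softmax computed in float32; only its output is snapped to the quantization grid."""
    def __init__(self, scale=1.0 / 255, zero_point=0):
        super().__init__()
        self.scale, self.zero_point = scale, zero_point

    def forward(self, x):
        y = torch.softmax(x, dim=-1)  # float softmax, not an integer/LUT kernel
        return fake_quantize(y, self.scale, self.zero_point)
```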

Thanks a lot.

akamaster commented Apr 10, 2022

I am also wondering about this. Can anyone from the authors/developers clarify?

fxmarty commented Oct 24, 2022

Yes, it looks like QDQ is used here:

attention_probs = nn.Softmax(dim=-1)(attention_scores)

class QuantLayerNorm(QuantizationHijacker, nn.LayerNorm):

return quantize_model(nn.Sequential(m_dense, m_act), **quant_params)
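
So the nonlinearity itself still runs in floating point, and only its input/output activations are quantize-dequantized. As a rough illustration (my own sketch, not the repo's actual QuantizationHijacker API), a QDQ-style LayerNorm could look like this:

```python
# Rough sketch (illustrative, not the repo's implementation): a "quantized" LayerNorm
# under QDQ still uses float mean/variance/rsqrt internally; only the output
# activation is quantize-dequantized to the target bit-width.
import torch
import torch.nn as nn

class QDQLayerNorm(nn.LayerNorm):
    def __init__(self, normalized_shape, n_bits=8, **kwargs):
        super().__init__(normalized_shape, **kwargs)
        self.n_bits = n_bits

    def forward(self, x):
        y = super().forward(x)  # float LayerNorm (float rsqrt inside)
        # symmetric per-tensor fake quantization of the output
        qmax = 2 ** (self.n_bits - 1) - 1
        scale = y.abs().max().clamp(min=1e-8) / qmax
        return torch.round(y / scale).clamp(-qmax - 1, qmax) * scale
```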
