
A question on the nonlinear layers #1

Open
Kevinpsk opened this issue Dec 10, 2021 · 2 comments

Hi there,

Thanks a lot for releasing the code.
I have a question regarding the nonlinear layers such as GELU, softmax, or even LayerNorm (since it contains an RSQRT). If I understand your code correctly, you use the floating-point implementations of these operations in the QAT model. Does this mean that we are not accurately simulating the quantized behaviour of these layers in the QAT model? Or are these layers implemented as look-up tables (or with full-integer kernels) on hardware devices, such that not simulating them in QAT has minimal impact on quantized model performance? Could you clarify this a bit more?
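To illustrate what I mean, here is a minimal sketch (my own illustration, not code from this repo) of how QDQ / fake quantization typically wraps a float nonlinearity: only the output tensor is snapped to the integer grid, while the op itself still runs in floating point.

```python
# Illustrative sketch only: QAT with QDQ keeps the nonlinearity in float
# and only fake-quantizes its output activation.
import torch
import torch.nn as nn

def fake_quantize(x, scale, zero_point, qmin=0, qmax=255):
    # quantize-dequantize (QDQ): round to the integer grid, then map back to float
    q = torch.clamp(torch.round(x / scale) + zero_point, qmin, qmax)
    return (q - zero_point) * scale

class QDQSoftmax(nn.Module):
    """Softmax computed in float32; only its output is snapped to the quantization grid."""
    def __init__(self, scale=1.0 / 255, zero_point=0):
        super().__init__()
        self.scale, self.zero_point = scale, zero_point

    def forward(self, x):
        y = torch.softmax(x, dim=-1)  # float softmax, not an integer/LUT kernel
        return fake_quantize(y, self.scale, self.zero_point)
```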

Thanks a lot.

akamaster commented Apr 10, 2022

I am also wondering about this. Can anyone from the authors/developers clarify?

fxmarty commented Oct 24, 2022

Yes, it looks like QDQ is used here:

attention_probs = nn.Softmax(dim=-1)(attention_scores)

class QuantLayerNorm(QuantizationHijacker, nn.LayerNorm):

return quantize_model(nn.Sequential(m_dense, m_act), **quant_params)
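
So the nonlinearity itself still runs in floating point, and only its input/output activations are quantize-dequantized. As a rough illustration (my own sketch, not the repo's actual QuantizationHijacker API), a QDQ-style LayerNorm could look like this:

```python
# Rough sketch (illustrative, not the repo's implementation): a "quantized" LayerNorm
# under QDQ still uses float mean/variance/rsqrt internally; only the output
# activation is quantize-dequantized to the target bit-width.
import torch
import torch.nn as nn

class QDQLayerNorm(nn.LayerNorm):
    def __init__(self, normalized_shape, n_bits=8, **kwargs):
        super().__init__(normalized_shape, **kwargs)
        self.n_bits = n_bits

    def forward(self, x):
        y = super().forward(x)  # float LayerNorm (float rsqrt inside)
        # symmetric per-tensor fake quantization of the output
        qmax = 2 ** (self.n_bits - 1) - 1
        scale = y.abs().max().clamp(min=1e-8) / qmax
        return torch.round(y / scale).clamp(-qmax - 1, qmax) * scale
```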
