
Add support for INT8 quantization of fusion_gru op #27330

Closed

wojtuss opened this issue Sep 15, 2020 · 5 comments


wojtuss commented Sep 15, 2020

Please enable INT8 quantization of the fusion_gru op using the oneDNN-based INT8 kernel.
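For background, here is a minimal NumPy sketch of the symmetric INT8 quantization scheme such a kernel builds on. It is illustrative only: the helper names and tensor shapes are hypothetical, not the Paddle or oneDNN API.

```python
import numpy as np

def int8_scale(tensor):
    # Symmetric quantization: map the largest absolute value in the
    # tensor onto the INT8 range [-127, 127].
    return 127.0 / np.abs(tensor).max()

def quantize_int8(tensor, scale):
    # Scale, round to the nearest integer, and clip to INT8.
    return np.clip(np.round(tensor * scale), -127, 127).astype(np.int8)

def dequantize(q_tensor, scale):
    # Recover an FP32 approximation of the original values.
    return q_tensor.astype(np.float32) / scale

# Hypothetical GRU gate weight matrix (shape chosen for illustration).
w_gates = np.random.randn(64, 192).astype(np.float32)
scale = int8_scale(w_gates)
w_int8 = quantize_int8(w_gates, scale)
error = np.abs(w_gates - dequantize(w_int8, scale)).max()
print(f"max per-element quantization error: {error:.6f}")
```

The INT8 kernel performs the gate matrix multiplications on quantized values and applies the scales to recover FP32 outputs, which is what makes VNNI-capable hardware faster at this workload.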

wojtuss self-assigned this Sep 15, 2020

wojtuss commented Sep 17, 2020

After enabling INT8 quantization of the fusion_gru op (a PR will be submitted soon), we achieved the following accuracy results for the QAT GRU model on a CLX 6248 (with VNNI support):

Metric       qat (fp32)   int8       diff
Precision    0.89198      0.89221     0.00023
Recall       0.89449      0.89412    -0.00037
F1 score     0.89323      0.89316    -0.00007

lidanqing-intel (Contributor) commented:

PR #27481 submitted


wojtuss commented Nov 30, 2020

Support for INT8 quantization of the GRU model has been added to Paddle. The only remaining piece is a version of oneDNN with additional GRU INT8 optimizations; the oneDNN upgrade will be submitted in PR #28420.


wojtuss commented Jan 4, 2021

PR #28420 has been merged, so support for INT8 quantization of the GRU model is complete.

wojtuss closed this as completed Jan 4, 2021
