Add QDQBert model and QAT example of SQUAD task #14057

Closed

Conversation

@shangz-ai (Contributor) commented Oct 18, 2021

What does this PR do?

This PR includes:

  1. Add support for the Q/DQ BERT (QDQBERT) model, based on the HF BERT model
    (src/transformers/models/qdqbert/).

The QDQBERT model adds fake quantization operations (pairs of QuantizeLinear/DequantizeLinear ops) to the BERT model at:

  • linear layer inputs and weights
  • matmul inputs
  • residual add inputs

(a short sketch of this pattern follows below).
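As a rough, minimal illustration of the fake-quantization pattern (not the actual QDQBERT code), the pytorch-quantization toolkit can quantize a tensor to 8 bits and immediately dequantize it; in an exported ONNX graph this corresponds to a QuantizeLinear/DequantizeLinear pair. The tensor shape below is just an example:

```python
# Minimal sketch, assuming the pytorch-quantization toolkit is installed.
import torch
from pytorch_quantization import tensor_quant

x = torch.randn(1, 128, 768)  # e.g. activations feeding a matmul

# Quantize to 8 bits and immediately dequantize ("fake quantize").
# When exported to ONNX this becomes a QuantizeLinear/DequantizeLinear pair.
x_fq = tensor_quant.fake_tensor_quant(x, x.abs().max())
```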

The QDQBERT model can be loaded from any checkpoint of the HF BERT model and can perform Quantization Aware Training (QAT) or Post Training Quantization (PTQ) with support from the PyTorch-Quantization toolkit, as sketched below.
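A minimal sketch of that workflow, assuming the pytorch-quantization default-descriptor API and the QDQBertForQuestionAnswering class added by this PR; the checkpoint name is just an example:

```python
# Sketch only: set default quantizers before building the model, then load
# QDQBERT weights from a regular HF BERT checkpoint.
import pytorch_quantization.nn as quant_nn
from pytorch_quantization.tensor_quant import QuantDescriptor
from transformers import QDQBertForQuestionAnswering

# 8-bit per-tensor max calibration for inputs, per-channel for weights.
quant_nn.QuantLinear.set_default_quant_desc_input(
    QuantDescriptor(num_bits=8, calib_method="max")
)
quant_nn.QuantLinear.set_default_quant_desc_weight(
    QuantDescriptor(num_bits=8, axis=(0,))
)

# Any BERT checkpoint works; "bert-base-uncased" is just an example.
model = QDQBertForQuestionAnswering.from_pretrained("bert-base-uncased")
# From here the model can be calibrated (PTQ) or fine-tuned with fake
# quantization enabled (QAT) using the usual training loop.
```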

  2. Add an example of the SQUAD task fine-tuned with the QDQBERT model and run by TensorRT for inference
    (examples/pytorch/question-answering/QAT-qdqbert/).

In the example, we use the QDQBERT model to do Quantization Aware Training on the SQUAD task, starting from a pretrained HF BERT model. TensorRT can then run inference on the exported ONNX model for optimal INT8 performance out of the box (see the export sketch below).
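A minimal export sketch, assuming a QAT-finetuned QDQBertForQuestionAnswering model; the checkpoint path, sequence length, and file name below are placeholders:

```python
# Sketch only: export the fake-quantized model to ONNX so TensorRT can build
# an INT8 engine from the QuantizeLinear/DequantizeLinear nodes.
import torch
from pytorch_quantization.nn import TensorQuantizer
from transformers import QDQBertForQuestionAnswering

# Use PyTorch's own fake-quantization ops so they export as
# QuantizeLinear/DequantizeLinear.
TensorQuantizer.use_fb_fake_quant = True

# Placeholder path to the QAT-finetuned checkpoint.
model = QDQBertForQuestionAnswering.from_pretrained(
    "path/to/qat_checkpoint", return_dict=False
)
model.eval()

seq_len = 384
dummy = torch.ones(1, seq_len, dtype=torch.long)
torch.onnx.export(
    model,
    (dummy, dummy, torch.zeros(1, seq_len, dtype=torch.long)),
    "qdqbert_squad.onnx",
    input_names=["input_ids", "attention_mask", "token_type_ids"],
    output_names=["start_logits", "end_logits"],
    opset_version=13,  # per-channel Q/DQ requires opset >= 13
)
# TensorRT can then build an INT8 engine from qdqbert_squad.onnx
# (for example with trtexec --onnx=qdqbert_squad.onnx --int8).
```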

We also added a module in examples/pytorch/question-answering/ (run_qa.py, trainer_qa.py) for saving the SQUAD-task-specific BERT model as ONNX files, to serve as a consistency check against the QAT-qdqbert example.
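For that consistency check, the baseline (non-quantized) QA model can be exported the same way; a rough sketch, using a public SQUAD-finetuned checkpoint purely as an example:

```python
# Sketch only: export the FP32 SQUAD BERT model to ONNX so its outputs can be
# compared against the QAT-qdqbert ONNX model.
import torch
from transformers import BertForQuestionAnswering

baseline = BertForQuestionAnswering.from_pretrained(
    "bert-large-uncased-whole-word-masking-finetuned-squad", return_dict=False
)
baseline.eval()

dummy = torch.ones(1, 384, dtype=torch.long)
torch.onnx.export(
    baseline,
    (dummy, dummy, torch.zeros(1, 384, dtype=torch.long)),
    "bert_squad_fp32.onnx",
    input_names=["input_ids", "attention_mask", "token_type_ids"],
    output_names=["start_logits", "end_logits"],
    opset_version=13,
)
```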

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link
    to it if that's the case.

A related discussion on this topic: Issue #10639

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@shangz-ai shangz-ai changed the title Add qdqbert model Add QDQBert model Oct 18, 2021
@shangz-ai shangz-ai changed the title Add QDQBert model Add QDQBert model and BERT QAT example Oct 18, 2021
@shangz-ai shangz-ai changed the title Add QDQBert model and BERT QAT example Add QDQBert model and QAT example of SQUAD task Oct 18, 2021
@shangz-ai (Contributor, Author) commented:

Hi, this PR includes both support for the QDQBert model and the QAT example of using QDQBert for the SQUAD task.
I'm not sure whether examples/pytorch/question-answering/QAT-qdqbert/ is the right place for the QAT example; I put it there because it makes the QAT example easier to compare against the regular BERT SQUAD task at examples/pytorch/question-answering/.
Comments are welcome on the QAT example and the other parts as well. :) @LysandreJik

@shangz-ai shangz-ai marked this pull request as ready for review October 18, 2021 23:37
@sgugger (Collaborator) left a comment

Thanks for your PR. Note that it's hard to review because it includes changes from other commits on master (bad rebase?), so it would be better if you could re-open a clean PR from your branch.

Concerning the examples:

  1. I don't think the QAT example should go in the examples maintained by the team, given that it introduces a lot of new code that no one on the team wrote or will be able to maintain properly. It should go in a research project.
  2. The classic QA example should not be touched by this PR. In general, any new functionality should be added to all examples at the same time, which could be done in a separate PR. It's also my understanding that the ONNX conversion won't work for many of the models, but maybe I'm wrong on this.

@shangz-ai (Contributor, Author) replied:

> Thanks for your PR. Note that it's hard to review because it includes changes from other commits on master (bad rebase?), so it would be better if you could re-open a clean PR from your branch.
>
> Concerning the examples:
>
>   1. I don't think the QAT example should go in the examples maintained by the team, given that it introduces a lot of new code that no one on the team wrote or will be able to maintain properly. It should go in a research project.
>   2. The classic QA example should not be touched by this PR. In general, any new functionality should be added to all examples at the same time, which could be done in a separate PR. It's also my understanding that the ONNX conversion won't work for many of the models, but maybe I'm wrong on this.

Thanks for the comments! I'm opening a new PR here: #14066, based on the latest master branch.
The QAT example now goes into transformers/examples/research_projects/qat-qdqbert/, and the classic QA examples are untouched.

@shangz-ai shangz-ai closed this Nov 5, 2021
@shangz-ai shangz-ai deleted the add-nvidia-qdqbert-model branch November 19, 2021 18:47