Regarding quantization of the matmul and softmax operators in PyTorch #2247
Unanswered
xiexiaozheng asked this question in Q&A · 1 comment · 1 reply
-
@alexsu52 Hi, I attempted QAT quantization of a toy model that contains matmul and softmax operators. However, after exporting the model using torch.onnx.export, I noticed that no fake-quantization nodes were inserted after the matmul operator, and none after softmax either. Why is that?
My model code is like this:
[original code snippet not preserved]
The model in ONNX format looks like this:
[exported ONNX graph image not preserved]
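Since the original snippet was not preserved, here is a minimal sketch of the kind of toy model and export flow being described, assuming NNCF's `create_compressed_model` QAT API; the class name, tensor shapes, and config values are illustrative guesses, not the author's actual code:

```python
# Hypothetical reconstruction of the toy model described above, not the
# author's actual code: linear projections feeding a matmul + softmax pair.
import torch
import torch.nn as nn

from nncf import NNCFConfig
from nncf.torch import create_compressed_model


class ToyAttention(nn.Module):
    def __init__(self, dim: int = 16):
        super().__init__()
        self.q_proj = nn.Linear(dim, dim)
        self.k_proj = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        q, k = self.q_proj(x), self.k_proj(x)
        # The pattern in question: matmul followed by softmax.
        scores = torch.matmul(q, k.transpose(-1, -2))
        return torch.softmax(scores, dim=-1)


model = ToyAttention()

# Assumed QAT setup; in a real run one would also register an init data
# loader (nncf.torch.register_default_init_args) and fine-tune the model.
nncf_config = NNCFConfig.from_dict({
    "input_info": {"sample_size": [1, 8, 16]},
    "compression": {"algorithm": "quantization"},
})
compression_ctrl, quantized_model = create_compressed_model(model, nncf_config)

# Export as in the question. The reported observation is that the resulting
# graph contains no FakeQuantize nodes after the MatMul or Softmax outputs.
torch.onnx.export(quantized_model, torch.randn(1, 8, 16), "toy_model.onnx",
                  opset_version=13)
```

Running this and inspecting toy_model.onnx (for example in Netron) reproduces the kind of graph the question refers to.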
-
Hi @xiexiaozheng,
More information about which operations support execution in INT8 precision, and about how a quantized model is transformed from its original precision to low precision, is available here: https://docs.openvino.ai/2023.1/openvino_docs_OV_UG_lpt.html