
Extend supported types by QLinearMatMul (float16, float 8 types) #5473

Merged
28 commits merged into onnx:main on Nov 16, 2023

Conversation

@xadupre (Contributor) commented on Aug 4, 2023

Description

QLinearMatMul operates on quantized types. This PR extends the list of supported quantized types to include the float 8 types, and the list of supported input types to include float16 and bfloat16.
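
As an illustration (not taken from the PR itself), a minimal model exercising the extended signature could look like the sketch below; it assumes an opset where these changes are available (opset 21, shipped with ONNX 1.16) and uses float16 scales to stand in for the newly supported input types:

```python
# Hedged sketch: a QLinearMatMul model with uint8 quantized inputs and
# float16 scales, run through the ONNX reference evaluator.
import numpy as np
from onnx import TensorProto, helper
from onnx.reference import ReferenceEvaluator

node = helper.make_node(
    "QLinearMatMul",
    ["a", "a_scale", "a_zp", "b", "b_scale", "b_zp", "y_scale", "y_zp"],
    ["y"],
)
graph = helper.make_graph(
    [node],
    "qlinear_matmul_f16_scales",
    [
        helper.make_tensor_value_info("a", TensorProto.UINT8, [2, 2]),
        helper.make_tensor_value_info("a_scale", TensorProto.FLOAT16, []),
        helper.make_tensor_value_info("a_zp", TensorProto.UINT8, []),
        helper.make_tensor_value_info("b", TensorProto.UINT8, [2, 2]),
        helper.make_tensor_value_info("b_scale", TensorProto.FLOAT16, []),
        helper.make_tensor_value_info("b_zp", TensorProto.UINT8, []),
        helper.make_tensor_value_info("y_scale", TensorProto.FLOAT16, []),
        helper.make_tensor_value_info("y_zp", TensorProto.UINT8, []),
    ],
    [helper.make_tensor_value_info("y", TensorProto.UINT8, [2, 2])],
)
model = helper.make_model(graph, opset_imports=[helper.make_opsetid("", 21)])

ref = ReferenceEvaluator(model)
feeds = {
    "a": np.array([[1, 2], [3, 4]], dtype=np.uint8),
    "a_scale": np.array(0.05, dtype=np.float16),
    "a_zp": np.array(0, dtype=np.uint8),
    "b": np.array([[5, 6], [7, 8]], dtype=np.uint8),
    "b_scale": np.array(0.05, dtype=np.float16),
    "b_zp": np.array(0, dtype=np.uint8),
    "y_scale": np.array(0.1, dtype=np.float16),
    "y_zp": np.array(0, dtype=np.uint8),
}
print(ref.run(None, feeds)[0])  # quantized uint8 result of a @ b
```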

@xadupre changed the title from "Extend supported types by QLinearConv (int8, float 8 types)" to "Extend supported types by QLinearMatMul (int8, float 8 types)" on Aug 7, 2023
@xadupre xadupre marked this pull request as ready for review August 30, 2023 17:16
@xadupre xadupre requested review from a team as code owners August 30, 2023 17:16
@justinchuby justinchuby added this to the 1.15 milestone Aug 30, 2023
@gramalingam added the "operator" label (Issues related to ONNX operators) on Aug 30, 2023
@gramalingam (Contributor) left a comment

At some point, we should try to turn this into a function "Dequantize => MatMul => Quantize". But fine if that is done separately. (There may be some questions about precision of intermediate values etc. there.)
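
A rough sketch of that decomposition, expressed as a local ONNX function via onnx.helper (hypothetical; not an official ONNX function definition). Note the MatMul runs in the scale's floating-point type, which is where the intermediate-precision questions come in:

```python
# Hedged sketch: QLinearMatMul as Dequantize => MatMul => Quantize.
from onnx import helper

nodes = [
    helper.make_node("DequantizeLinear", ["a", "a_scale", "a_zero_point"], ["a_deq"]),
    helper.make_node("DequantizeLinear", ["b", "b_scale", "b_zero_point"], ["b_deq"]),
    # Intermediate computation happens in the scales' floating-point type.
    helper.make_node("MatMul", ["a_deq", "b_deq"], ["y_deq"]),
    helper.make_node("QuantizeLinear", ["y_deq", "y_scale", "y_zero_point"], ["y"]),
]

qlinear_matmul_fn = helper.make_function(
    domain="local",  # hypothetical local domain, not ai.onnx
    fname="QLinearMatMulDecomposed",
    inputs=["a", "a_scale", "a_zero_point",
            "b", "b_scale", "b_zero_point",
            "y_scale", "y_zero_point"],
    outputs=["y"],
    nodes=nodes,
    opset_imports=[helper.make_opsetid("", 21)],
    attributes=[],
)
```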

xadupre and others added 2 commits September 5, 2023 14:15
@WilliamTambellini commented

Thanks @xadupre. Does this mean int4 is not yet supported?

@xadupre (Contributor, Author) commented on Sep 12, 2023

Int4 is not defined yet in ONNX. Defining it would be the first step before adding it to the list of supported types. It might be worth discussing during one of the SIG meetings.

@gramalingam (Contributor) left a comment

(As per offline discussion, Xavier suggests holding off on this PR's changes until better float 8 quantization support is available. Adding this comment to avoid an accidental merge.)

@justinchuby justinchuby modified the milestones: 1.15, 1.16 Sep 19, 2023
@justinchuby (Contributor) commented
Moved to 1.16

@WilliamTambellini commented

That's unfortunate. What about reducing the scope and at least adding uint8/int8 (no float8)? Fast int8 is already available in most recent (4th-gen) CPUs and in the most recent NVIDIA GPUs.

@xadupre changed the title from "Extend supported types by QLinearMatMul (int8, float 8 types)" to "Extend supported types by QLinearMatMul (float16, float 8 types)" on Oct 27, 2023
@codecov bot commented on Oct 27, 2023

Codecov Report

Attention: 47 lines in your changes are missing coverage. Please review.

Comparison: base (ede2c77) 56.06% vs. head (4eaf601) 56.04%.

| File | Patch % | Missing lines |
| --- | --- | --- |
| onnx/backend/test/case/node/qlinearmatmul.py | 0.00% | 46 |
| onnx/reference/ops/op_qlinear_matmul.py | 50.00% | 1 |
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #5473      +/-   ##
==========================================
- Coverage   56.06%   56.04%   -0.02%     
==========================================
  Files         501      501              
  Lines       29366    29409      +43     
  Branches     4404     4413       +9     
==========================================
+ Hits        16463    16482      +19     
- Misses      12091    12115      +24     
  Partials      812      812              


@xadupre xadupre added this pull request to the merge queue Nov 16, 2023
Merged via the queue into onnx:main with commit b60f694 Nov 16, 2023
37 checks passed
@xadupre xadupre deleted the qm branch November 16, 2023 12:07
xadupre added a commit to microsoft/onnxruntime that referenced this pull request Jan 12, 2024
…hts (#18043)

### Description

Whenever a QuantizeLinear or DequantizeLinear node is inserted, the type of the weights before quantization must be known so that the scale can be created with the expected type. Another option would be to add many CastLike operators, but that would push the burden onto the onnxruntime optimizer.

The PR tries to avoid changing the signature. To do so, it modifies the scale computation to store the result in a numpy array rather than a Python float. The numpy array must have the same type as the weights being quantized, as illustrated by the sketch below.
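
For illustration, a minimal sketch of that idea (a hypothetical helper, not the actual onnxruntime code):

```python
# Hedged sketch: compute a symmetric int8 quantization scale as a numpy
# value that keeps the weights' dtype, instead of a Python float.
import numpy as np

def compute_scale(weights: np.ndarray, qmin: int = -127, qmax: int = 127) -> np.ndarray:
    amax = np.max(np.abs(weights))
    scale = amax / ((qmax - qmin) / 2)
    # Returning np.array(..., dtype=weights.dtype) avoids silently
    # promoting a float16 weight's scale to a Python float (float64).
    return np.array(scale, dtype=weights.dtype)

w = np.random.randn(4, 4).astype(np.float16)
s = compute_scale(w)
assert s.dtype == np.float16  # the scale follows the weight type
```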

The PR adds many `assert` statements to check that the scale's type is neither a Python type nor a float64. These were added to make sure all the code follows the same logic; the lines were kept for the first review.

DequantizeLinear and QuantizeLinear cannot be fully tested with onnx==1.15: PR onnx/onnx#5709 is needed to fix shape inference, and PR onnx/onnx#5473 is needed to support QLinearMatMul with float 16. That explains why some tests are disabled with float 16.

### Motivation and Context

The current quantization tool assumes every weight is float 32. For large models such as LLAMA, weights are usually float 16, and the quantization tool needs to be able to quantize them.
mszhanyi pushed a commit to microsoft/onnxruntime that referenced this pull request Jan 15, 2024
…hts (#18043)

Labels: operator (Issues related to ONNX operators)
Projects: none yet
Development: none yet
4 participants