-
Notifications
You must be signed in to change notification settings - Fork 59
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[IR] Add support for quant_parameter_tensor_names field #2080
Conversation
❌ 114 Tests Failed:
View the top 3 failed test(s) by shortest run time
To view more test analytics, go to the Test Analytics Dashboard |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PR Overview
This PR introduces support for quantization annotations by adding a new field ("quant_parameter_tensor_names") in the Value metadata. Key changes include:
- Updating the GraphProtocol documentation to outline the new quantization annotation behavior.
- Enhancing the deserialization logic by adding support for quantization annotations in graphs.
- Adding new serialization/deserialization utilities (_deserialize_quantization_annotation, _maybe_add_quantization_annotation, and _serialize_tensor_annotation_into) to incorporate quantization annotation data.
Reviewed Changes
File | Description |
---|---|
onnxscript/ir/_protocols.py | Updated docstring to reflect the new quantization annotation support. |
onnxscript/ir/serde.py | Extended graph/node deserialization and serialization to handle quantization annotations. |
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
Comments suppressed due to low confidence (1)
onnxscript/ir/serde.py:1273
- Ensure that there are unit tests covering the new quantization annotation paths in both serialization and deserialization for initializer and non-initializer values.
if input_.name not in from_.initializers:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PR Overview
This PR adds support for the quant_parameter_tensor_names field for quantization annotations in the IR. Key changes include:
- Adding a new test suite to verify the correct serialization and deserialization of quantization annotations.
- Extending the serde module with new functions to deserialize and serialize quantization annotations.
- Updating the GraphProtocol documentation to reflect the new quantization_annotation support.
Reviewed Changes
File | Description |
---|---|
onnxscript/ir/serde_test.py | Introduces unit tests for verifying quantization annotation serialization/deserialization. |
onnxscript/ir/serde.py | Implements the logic to correctly handle quant_parameter_tensor_names in graphs. |
onnxscript/ir/_protocols.py | Updates the documentation to reflect the new quantization_annotation field. |
Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.
Comments suppressed due to low confidence (2)
onnxscript/ir/serde.py:741
- Instead of using an assert to check that the tensor name in the annotation matches the value's name, consider adding explicit error handling (e.g., raising a descriptive exception) to handle any mismatches gracefully in production.
assert proto.tensor_name == value.name
onnxscript/ir/serde.py:1284
- [nitpick] Verify that using the condition 'if input_.name not in from_.initializers' reliably prevents double-adding quantization annotations for inputs. It may be beneficial to ensure that 'from_.initializers' is always set for values requiring annotations.
if input_.name not in from_.initializers:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not so familiar to quantization_annotation
in ONNX, and can't find more interpretation in this PR. Based on what I found in ONNX, it belongs to graph(https://github.com/search?q=repo%3Aonnx%2Fonnx%20quantization_annotation&type=code)? I am not sure if there is a better place for it in IR. cc @gramalingam
Approving this to unblock for the time being, also this does not seem like an irreversible change if needed.
@titaiwangms Thanks. I updated the pr field to explain the design choices. Essentially the info belong to individual values but is stored as a list in the GraphProto like ValueInfoProtos. |
Support quantization_annotation in graph inputs, node in/out and graph outputs.
Two design decisions made are
quant_parameter_tensor_names
information. This is similar to ValueInfoProto where in proto we store a list of proto messages whose keys point tensor names. But the information really belongs to individual values.quantization_annotation
is deserialized into the Value'smeta
field under thequant_parameter_tensor_names
key. Values that are stored under this key will be serialized as quantization annotations. I chose to add a value inmeta
instead of creating a new property in value to avoid over complicating the preperties in Value.Example usage