
[Relax][ONNX] Add frontend support for QuantizeLinear, DequantizeLinear, and DynamicQuantizeLinear#19391

Merged
tlopex merged 2 commits into apache:main from hunghsiangwang:onnx-frontend
Apr 12, 2026

Conversation

@hunghsiangwang
Contributor

Summary

This PR adds Relax ONNX frontend support for:

  • QuantizeLinear
  • DequantizeLinear
  • DynamicQuantizeLinear

The implementation follows existing TVM ONNX frontend patterns and keeps QDQ handling consistent for singleton quantization parameters and optional zero-point inputs.
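For reference, the per-tensor Q/DQ semantics these converters target can be sketched in NumPy (helper names here are illustrative, not TVM APIs; per the ONNX spec, a missing zero point defaults to uint8 zero, which mirrors the importer's optional-input handling):

```python
import numpy as np

def quantize_linear(x, scale, zp=None):
    """ONNX QuantizeLinear reference: y = saturate(round(x / scale) + zp).

    When the optional zero point is omitted, the output defaults to uint8
    with zp = 0 (matching how the importer handles the missing input).
    """
    if zp is None:
        zp = np.uint8(0)
    q = np.round(x / scale) + zp          # round half to even, like NumPy
    return np.clip(q, 0, 255).astype(np.uint8)

def dequantize_linear(x, scale, zp=None):
    """ONNX DequantizeLinear reference: y = (x - zp) * scale."""
    if zp is None:
        zp = np.uint8(0)
    return (x.astype(np.float32) - zp) * np.float32(scale)

x = np.array([0.0, 1.0, 2.55], dtype=np.float32)
q = quantize_linear(x, scale=0.01)        # -> [0, 100, 255]
dq = dequantize_linear(q, scale=0.01)     # round-trips back to ~x
```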

Changes

  • add ONNX frontend converters for QuantizeLinear, DequantizeLinear, and DynamicQuantizeLinear
  • register Q/DQ-related ops in the ONNX converter map
  • handle optional zero-point inputs consistently during import
  • preserve singleton quantization parameter semantics in the QDQ legalization path
  • improve QDQ legalization behavior for imported ONNX models
  • add and update frontend tests for Q/DQ and DynamicQuantizeLinear
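Unlike the static Q/DQ ops, DynamicQuantizeLinear derives scale and zero point from the input's observed range. A NumPy sketch of the ONNX-specified math (a reference for the semantics, not the TVM lowering):

```python
import numpy as np

def dynamic_quantize_linear(x):
    """ONNX DynamicQuantizeLinear reference (uint8 output).

    The input range is widened to include 0 so that 0.0 maps exactly to a
    quantized value, as the spec requires. Assumes a nonzero input range.
    """
    qmin, qmax = 0.0, 255.0
    xmin = min(float(x.min()), 0.0)
    xmax = max(float(x.max()), 0.0)
    scale = (xmax - xmin) / (qmax - qmin)
    zp = np.uint8(np.clip(np.round(qmin - xmin / scale), qmin, qmax))
    y = np.clip(np.round(x / scale) + zp, qmin, qmax).astype(np.uint8)
    return y, np.float32(scale), zp

y, scale, zp = dynamic_quantize_linear(
    np.array([-1.0, 0.0, 1.0, 2.0], dtype=np.float32)
)
# scale = 3/255, zp = 85, y = [0, 85, 170, 255]
```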

Tests

Added or updated tests in tests/python/relax/test_frontend_onnx.py to cover:

  • singleton-qparam QuantizeLinear in opset 10
  • singleton-qparam DequantizeLinear in opset 10
  • optional-zero-point QuantizeLinear in opset 13
  • DynamicQuantizeLinear in opset 11

Validation

Validated with:

  • python -m pytest -n 1 tests/python/relax/test_frontend_onnx.py -k "quantizelinear or dequantizelinear or dynamicquantizelinear" -v

Result:

  • 4 passed

Contributor

@gemini-code-assist (bot) left a comment


Code Review

This pull request implements the ONNX operators QuantizeLinear, DequantizeLinear, and DynamicQuantizeLinear in the Relax frontend. It also enhances the legalization and struct info inference for quantization operations to correctly handle singleton tensors (shape-[1]) as scalars and expands the supported data types for the zero_point parameter. The review feedback indicates that the axis attribute in the v10 implementations of QuantizeLinear and DequantizeLinear is currently hardcoded to 0, whereas it should default to 1 and be retrieved from the operator attributes to comply with the ONNX specification.

Comment on lines +317 to +322
x, scale = inputs[0], inputs[1]
zp = inputs[2] if len(inputs) > 2 and inputs[2] is not None else None
out_dtype = "uint8" if zp is None else zp.struct_info.dtype
if zp is None:
    zp = relax.const(0, out_dtype)
return relax.op.quantize(x, scale, zp, axis=0, out_dtype=out_dtype)
Contributor


high

The axis for quantization is hardcoded to 0. According to the ONNX specification for QuantizeLinear opset 10, there is an axis attribute that defaults to 1. This implementation should handle the axis attribute from attr, similar to how it's done in _impl_v13.

Suggested change

Before:

x, scale = inputs[0], inputs[1]
zp = inputs[2] if len(inputs) > 2 and inputs[2] is not None else None
out_dtype = "uint8" if zp is None else zp.struct_info.dtype
if zp is None:
    zp = relax.const(0, out_dtype)
return relax.op.quantize(x, scale, zp, axis=0, out_dtype=out_dtype)

After:

x, scale = inputs[0], inputs[1]
zp = inputs[2] if len(inputs) > 2 and inputs[2] is not None else None
axis = attr.get("axis", 1)
if hasattr(x.struct_info, "ndim") and x.struct_info.ndim <= 1 and axis == 1:
    axis = 0
out_dtype = "uint8" if zp is None else zp.struct_info.dtype
if zp is None:
    zp = relax.const(0, out_dtype)
return relax.op.quantize(x, scale, zp, axis=axis, out_dtype=out_dtype)
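The axis only matters once scale and zero point are per-channel tensors rather than scalars: it chooses which dimension the parameters broadcast along, so a hardcoded axis pairs channels with the wrong scales or fails outright. A NumPy illustration of per-axis quantization (not TVM code):

```python
import numpy as np

def quantize_per_axis(x, scale, zp, axis):
    """Per-axis QuantizeLinear: scale/zp have length x.shape[axis] and are
    broadcast along that axis before y = saturate(round(x / scale) + zp)."""
    shape = [1] * x.ndim
    shape[axis] = -1
    s = np.asarray(scale, dtype=np.float64).reshape(shape)
    z = np.asarray(zp, dtype=np.float64).reshape(shape)
    return np.clip(np.round(x / s) + z, 0, 255).astype(np.uint8)

x = np.array([[1.0, 2.0, 4.0],
              [2.0, 4.0, 8.0]], dtype=np.float32)
scale = np.array([1.0, 2.0, 4.0])      # one scale per column
zp = np.zeros(3, dtype=np.uint8)

q = quantize_per_axis(x, scale, zp, axis=1)   # spec default: per-column
# q == [[1, 1, 1], [2, 2, 2]]
# With axis=0 the length-3 scale cannot broadcast against x.shape[0] == 2,
# so a hardcoded axis silently breaks (or errors on) per-channel models.
```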

Comment on lines +340 to +344
x, scale = inputs[0], inputs[1]
zp = inputs[2] if len(inputs) > 2 and inputs[2] is not None else None
if zp is None:
    zp = relax.const(0, x.struct_info.dtype)
return relax.op.dequantize(x, scale, zp, axis=0, out_dtype="float32")
Contributor


high

The axis for dequantization is hardcoded to 0. The ONNX specification for DequantizeLinear opset 10 includes an axis attribute that defaults to 1. This should be handled from the attr dictionary, similar to the implementation in _impl_v13.

Suggested change

Before:

x, scale = inputs[0], inputs[1]
zp = inputs[2] if len(inputs) > 2 and inputs[2] is not None else None
if zp is None:
    zp = relax.const(0, x.struct_info.dtype)
return relax.op.dequantize(x, scale, zp, axis=0, out_dtype="float32")

After:

x, scale = inputs[0], inputs[1]
zp = inputs[2] if len(inputs) > 2 and inputs[2] is not None else None
axis = attr.get("axis", 1)
if hasattr(x.struct_info, "ndim") and x.struct_info.ndim <= 1 and axis == 1:
    axis = 0
if zp is None:
    zp = relax.const(0, x.struct_info.dtype)
return relax.op.dequantize(x, scale, zp, axis=axis, out_dtype="float32")
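The rank-1 fallback in the suggestion can be checked against a NumPy reference: a 1-D input has no axis 1, so the per-tensor path must remap the spec default to axis 0 (illustrative code, not the TVM implementation):

```python
import numpy as np

def dequantize_linear(x, scale, zp, axis=1):
    """ONNX DequantizeLinear reference: y = (x - zp) * scale.

    For rank-<=1 inputs the spec's axis=1 default is out of range, so it
    is remapped to axis 0 -- the same fallback the suggested change applies.
    """
    if x.ndim <= 1 and axis == 1:
        axis = 0
    s, z = np.asarray(scale), np.asarray(zp)
    if s.ndim or z.ndim:              # per-axis params: broadcast along axis
        shape = [1] * max(x.ndim, 1)
        shape[axis] = -1
        s = s.reshape(shape) if s.ndim else s
        z = z.reshape(shape) if z.ndim else z
    return ((x.astype(np.float32) - z) * s).astype(np.float32)

y = dequantize_linear(np.array([0, 85, 170, 255], dtype=np.uint8),
                      scale=np.float32(3 / 255), zp=np.uint8(85))
# y is approximately [-1.0, 0.0, 1.0, 2.0]
```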

Member

@tlopex left a comment


LGTM

@tlopex merged commit 645fcf9 into apache:main Apr 12, 2026
6 checks passed