Commit ad1f556

askhade authored and linkerzhang committed
Add clarification for bias quantization in QlinearConv Op spec (#2464)
1 parent d9a73cc commit ad1f556

3 files changed: +10 −3 lines changed

docs/Changelog.md

Lines changed: 3 additions & 1 deletion
```diff
@@ -9735,6 +9735,8 @@ This version of the operator has been available since version 10 of the default
 and computes the quantized output. Each scale and zero-point pair must have same shape.
 It means they must be either scalars (per tensor) or 1-D tensors (per output channel).
 Each input or output and its related zero point must have same type.
+When bias is present it must be quantized using scale = input scale * weight scale and
+zero point as 0.
 
 #### Version
 
@@ -9777,7 +9779,7 @@ This version of the operator has been available since version 10 of the default
 <dt><tt>y_zero_point</tt> : T3</dt>
 <dd>Scale tensor for output 'y'. It's a scalar, which means a per-tensor/layer quantization.</dd>
 <dt><tt>B</tt> (optional) : T4</dt>
-<dd>Optional 1D bias to be added to the convolution, has size of M.</dd>
+<dd>Optional 1D bias to be added to the convolution, has size of M. Bias must be quantized using scale = x_scale * w_scale and zero_point = 0</dd>
 </dl>
 
 #### Outputs
```

docs/Operators.md

Lines changed: 3 additions & 1 deletion
```diff
@@ -10700,6 +10700,8 @@ expect(node, inputs=[x, y], outputs=[z],
 and computes the quantized output. Each scale and zero-point pair must have same shape.
 It means they must be either scalars (per tensor) or 1-D tensors (per output channel).
 Each input or output and its related zero point must have same type.
+When bias is present it must be quantized using scale = input scale * weight scale and
+zero point as 0.
 
 #### Version
 
@@ -10742,7 +10744,7 @@ This version of the operator has been available since version 10 of the default
 <dt><tt>y_zero_point</tt> : T3</dt>
 <dd>Scale tensor for output 'y'. It's a scalar, which means a per-tensor/layer quantization.</dd>
 <dt><tt>B</tt> (optional) : T4</dt>
-<dd>Optional 1D bias to be added to the convolution, has size of M.</dd>
+<dd>Optional 1D bias to be added to the convolution, has size of M. Bias must be quantized using scale = x_scale * w_scale and zero_point = 0</dd>
 </dl>
 
 #### Outputs
```

onnx/defs/nn/defs.cc

Lines changed: 4 additions & 1 deletion
```diff
@@ -778,6 +778,8 @@ a quantized filter, its scale and zero point, and output's scale and zero point,
 and computes the quantized output. Each scale and zero-point pair must have same shape.
 It means they must be either scalars (per tensor) or 1-D tensors (per output channel).
 Each input or output and its related zero point must have same type.
+When bias is present it must be quantized using scale = input scale * weight scale and
+zero point as 0.
 )DOC";
 
 ONNX_OPERATOR_SET_SCHEMA(
@@ -849,7 +851,8 @@ ONNX_OPERATOR_SET_SCHEMA(
         .Input(
             8,
             "B",
-            "Optional 1D bias to be added to the convolution, has size of M.",
+            "Optional 1D bias to be added to the convolution, has size of M. "
+            "Bias must be quantized using scale = x_scale * w_scale and zero_point = 0",
             "T4",
             OpSchema::Optional)
         .Output(
```
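For context, the rule added by this commit means the int32 bias fed to QLinearConv is expected to already be expressed in the accumulator's scale. Below is a minimal sketch (not part of this commit) of how a converter or test might quantize a float bias under that rule, assuming a float bias `b` of length M and the operator's `x_scale`/`w_scale` values; the helper name is illustrative only:

```python
import numpy as np

def quantize_qlinearconv_bias(b, x_scale, w_scale):
    """Quantize a float bias for QLinearConv per the rule in this commit:
    bias scale = x_scale * w_scale, zero point = 0, stored as int32.
    w_scale may be a scalar (per-tensor) or a 1-D array of length M
    (per output channel)."""
    bias_scale = np.asarray(x_scale, dtype=np.float32) * np.asarray(w_scale, dtype=np.float32)
    q = np.round(np.asarray(b, dtype=np.float32) / bias_scale)
    info = np.iinfo(np.int32)
    return np.clip(q, info.min, info.max).astype(np.int32)

# Example with per-tensor scales
b = np.array([0.25, -0.5, 1.0], dtype=np.float32)
print(quantize_qlinearconv_bias(b, x_scale=0.02, w_scale=0.05))  # [ 250 -500 1000]
```

Using x_scale * w_scale as the bias scale lets the bias be added directly to the int32 accumulator of the quantized convolution without rescaling, which is why the spec fixes the bias zero point at 0.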
