Describe the issue
ONNX Runtime CPUExecutionProvider produces different results for a constant DequantizeLinear -> Reshape pattern when graph optimization is enabled.
The model contains an int8 constant tensor:
[12, -45, 7, -3]
with scale 0.02, followed by DequantizeLinear and Reshape.
With ORT_DISABLE_ALL, the output is correct:
[[0.24, -0.9], [0.14, -0.06]]
With ORT_ENABLE_ALL, the negative dequantized values become 0:
[[0.24, 0.0], [0.14, 0.0]]
Since the two runs differ only in the graph optimization level, a graph optimization (possibly constant folding) appears to change the semantics of the model.
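For reference, the expected values can be checked by hand from the ONNX definition of DequantizeLinear, y = (x - zero_point) * scale, where zero_point defaults to 0 when the optional third input is omitted. The sketch below is plain Python and independent of ONNX Runtime. The helper names dequantize_linear and reshape_2d are illustrative only and are not part of any library.

```python
# Reference arithmetic for DequantizeLinear -> Reshape, independent of ONNX Runtime.
# Helper names are hypothetical; zero_point defaults to 0, matching the case
# where the optional zero-point input is omitted.

def dequantize_linear(quantized, scale, zero_point=0):
    """Apply y = (x - zero_point) * scale element-wise."""
    return [(q - zero_point) * scale for q in quantized]


def reshape_2d(flat, rows, cols):
    """Split a flat list into `rows` rows of `cols` elements (row-major)."""
    if len(flat) != rows * cols:
        raise ValueError("element count does not match target shape")
    return [flat[r * cols:(r + 1) * cols] for r in range(rows)]


weights = [12, -45, 7, -3]          # the int8 constant from the model
expected = reshape_2d(dequantize_linear(weights, 0.02), 2, 2)

# Round only for display; the raw values carry ordinary float error.
rounded = [[round(v, 6) for v in row] for row in expected]
print(rounded)
```

The negative entries come from the same multiplication as the positive ones, so a correct implementation has no reason to treat them differently.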
To reproduce
import sys

import numpy as np
import onnxruntime as ort
from onnx import TensorProto, helper, numpy_helper


def make_model():
    # Constant int8 weights, scalar float32 scale, and the target shape.
    weights = np.array([12, -45, 7, -3], dtype=np.int8)
    scale = np.float32(0.02)
    shape = np.array([2, 2], dtype=np.int64)

    # Constant -> DequantizeLinear -> Reshape; the graph has no inputs.
    nodes = [
        helper.make_node("Constant", [], ["wq"], value=numpy_helper.from_array(weights)),
        helper.make_node("Constant", [], ["scale"], value=numpy_helper.from_array(scale)),
        helper.make_node("Constant", [], ["shape"], value=numpy_helper.from_array(shape)),
        helper.make_node("DequantizeLinear", ["wq", "scale"], ["wf"]),
        helper.make_node("Reshape", ["wf", "shape"], ["y"]),
    ]
    graph = helper.make_graph(
        nodes,
        "g",
        [],
        [helper.make_tensor_value_info("y", TensorProto.FLOAT, [2, 2])],
    )
    model = helper.make_model(graph, opset_imports=[helper.make_opsetid("", 21)])
    model.ir_version = 10
    return model.SerializeToString()


def run(level):
    # Run the model on the CPU provider at the given graph optimization level.
    options = ort.SessionOptions()
    options.graph_optimization_level = level
    sess = ort.InferenceSession(
        make_model(),
        sess_options=options,
        providers=["CPUExecutionProvider"],
    )
    return sess.run(None, {})[0]


disabled = run(ort.GraphOptimizationLevel.ORT_DISABLE_ALL)
enabled = run(ort.GraphOptimizationLevel.ORT_ENABLE_ALL)

print("disable_all:", disabled)
print("enable_all:", enabled)
print("max_abs_diff:", float(np.max(np.abs(disabled - enabled))))
print("PASS=", np.allclose(disabled, enabled))

# Exit code 0 means the mismatch was reproduced; 1 means the outputs agree.
sys.exit(0 if not np.allclose(disabled, enabled) else 1)
Urgency
Expected output
disable_all: [[ 0.24 -0.9 ]
 [ 0.14 -0.06]]
enable_all: [[ 0.24 -0.9 ]
 [ 0.14 -0.06]]
max_abs_diff: 0.0
PASS=True
Actual output
disable_all: [[ 0.24 -0.9 ]
 [ 0.14 -0.06]]
enable_all: [[0.24 0.  ]
 [0.14 0.  ]]
max_abs_diff: 0.8999999761581421
PASS=False
Platform
Linux
OS Version
Linux-6.17.0-20-generic-x86_64-with-glibc2.39
ONNX Runtime Installation
Released Package
ONNX Runtime Version or Commit ID
1.25.1
ONNX Runtime API
Python
Architecture
X86
Execution Provider
Default CPU
Execution Provider Library Version
No response