Description
Describe the issue
After ONNX Runtime optimizes an ONNX model, the session fails to initialize with an error stating that the QuantizeLinear operator received a zero-point of an invalid dtype (int32). The issue does not exist in the original model and is introduced at the Basic graph optimization level (ORT_ENABLE_BASIC).
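For what it's worth, the problem is specific to the optimized graph: a minimal sanity check (sketch) against the model built in the repro script below should pass, since the unoptimized graph itself is well-formed:
import onnx
onnx.checker.check_model(model)  # no error expected: the original (unoptimized) graph is valid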
Error Message:
Traceback (most recent call last):
ort_session = ort.InferenceSession(
File "/usr/local/lib/python3.10/dist-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 419, in __init__
self._create_inference_session(providers, provider_options, disabled_optimizers)
File "/usr/local/lib/python3.10/dist-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 483, in _create_inference_session
sess.initialize_session(providers, provider_options, disabled_optimizers)
onnxruntime.capi.onnxruntime_pybind11_state.InvalidGraph: [ONNXRuntimeError] : 10 : INVALID_GRAPH : This is an invalid model. Type Error: Type 'tensor(int32)' of input parameter (dq0_x_zero_point) of operator (QuantizeLinear) in node (r_out_q) is invalid.
I believe the error is introduced by the QDQPropagationTransformer optimization, specifically at this line. QuantizeLinear does not support zero-points of type int32, while DequantizeLinear does. That line inserts a QuantizeLinear node by copying the parameters of its matching DequantizeLinear node, which is what produces the invalid dtype.
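For reference, the operator schemas show the mismatch directly. The sketch below simply dumps the declared type constraints via onnx.defs (the constraint names and exact type lists come from the installed onnx package):
from onnx import defs

# Print the type constraints of both operators as of opset 20.
# QuantizeLinear's zero-point constraint does not list tensor(int32),
# while DequantizeLinear's does.
for op in ("QuantizeLinear", "DequantizeLinear"):
    schema = defs.get_schema(op, 20)
    print(op)
    for tc in schema.type_constraints:
        print(f"  {tc.type_param_str}: {sorted(tc.allowed_type_strs)}")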
To reproduce
Run the script below (note the commented line near the end):
import onnx.helper as helper
import onnx.numpy_helper as numpy_helper
from onnx import TensorProto
import numpy as np
import onnxruntime as ort
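# DequantizeLinear node whose zero-point initializer (dq0_x_zero_point) is int32:
# this is allowed for DequantizeLinear but not for QuantizeLinear.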
dq0 = helper.make_node(
'DequantizeLinear',
['dq0_x', 'dq0_x_scale', 'dq0_x_zero_point'],
['dq_out'],
)
r = helper.make_node(
'Reshape',
['dq_out', 'shape'],
['r_out'],
)
add = helper.make_node(
'Add',
['in', 'r_out'],
['add_out'],
)
q = helper.make_node(
'QuantizeLinear',
['add_out', 'q_y_scale', 'q_y_zero_point'],
['q_out'],
)
dq1 = helper.make_node(
'DequantizeLinear',
['q_out', 'dq1_x_scale', 'dq1_x_zero_point'],
['out'],
)
initializers = [
numpy_helper.from_array(np.random.rand(1000).astype(np.int32), name='dq0_x'),
numpy_helper.from_array(np.array(0.1, dtype=np.float32), name='dq0_x_scale'),
numpy_helper.from_array(np.array(0, dtype=np.int32), name='dq0_x_zero_point'),
numpy_helper.from_array(np.array(0.1, dtype=np.float32), name='dq1_x_scale'),
numpy_helper.from_array(np.array(0, dtype=np.uint8), name='dq1_x_zero_point'),
numpy_helper.from_array(np.array(0.1, dtype=np.float32), name='q_y_scale'),
numpy_helper.from_array(np.array(0, dtype=np.uint8), name='q_y_zero_point'),
numpy_helper.from_array(np.array([1, 1000], dtype=np.int64), name='shape')
]
graph = helper.make_graph(
nodes=[dq0, r, add, q, dq1],
name='QDQ-bug',
inputs=[helper.make_tensor_value_info('in', TensorProto.FLOAT, [1, 1000])],
outputs=[helper.make_tensor_value_info('out', TensorProto.FLOAT, [1, 1000])],
initializer=initializers
)
opset_imports = [helper.make_operatorsetid("ai.onnx", 20)]
model = helper.make_model(graph, opset_imports=opset_imports)
sess_options = ort.SessionOptions()
##### if you uncomment this line, the code works #######
# sess_options.graph_optimization_level = ort.GraphOptimizationLevel.ORT_DISABLE_ALL
ort_session = ort.InferenceSession(
model.SerializeToString(),
sess_options=sess_options
)
Note: changing to opset ai.onnx 21 also mitigates the error. It is unclear to me why, since QuantizeLinear in opset 21 still does not support an int32 zero-point.
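As a narrower workaround than disabling all optimizations, skipping just the offending transformer should also avoid the error. This is a sketch that relies on the undocumented disabled_optimizers keyword visible in the traceback above, with the transformer name as referenced earlier:
ort_session = ort.InferenceSession(
    model.SerializeToString(),
    sess_options=sess_options,
    # assumption: undocumented kwarg, passed through to _create_inference_session
    disabled_optimizers=["QDQPropagationTransformer"],
)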
Urgency
No response
Platform
Linux
OS Version
Ubuntu 22.04.4 LTS
ONNX Runtime Installation
Released Package
ONNX Runtime Version or Commit ID
1.18.0
ONNX Runtime API
Python
Architecture
X64
Execution Provider
Default CPU
Execution Provider Library Version
No response