Description
Describe the issue
A quantized ONNX model that works correctly with ONNX Runtime 1.19.1 now fails during inference in version 1.21.1 with the following error:
[E:onnxruntime:, sequential_executor.cc:572 ExecuteKernel] Non-zero status code returned while running
QLinearConv node. Name:'/features/features.0/Conv_token_1' Status Message:
/onnxruntime_src/onnxruntime/core/providers/cpu/quantization/qlinearconv.cc:67 static void
onnxruntime::QLinearConv<ActType>::ComputeOffset(onnxruntime::OpKernelContext*, int64_t, ActType&,
ActType&, uint8_t&) [with ActType = unsigned char; int64_t = long int; uint8_t = unsigned char]
IsScalarOr1ElementVector(Y_zero_point) was false. QLinearConv : result zero point must be a scalar or 1D tensor of
size 1
ONNX version: 1.17.0
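A minimal sketch, assuming the same test_model.onnx used in the repro below, to confirm that the y_zero_point of every QLinearConv node in the unoptimized model is a scalar or a 1-element tensor (i.e. that the original graph satisfies the constraint the kernel complains about):
import onnx
from onnx import numpy_helper
model = onnx.load("test_model.onnx")
initializers = {init.name: init for init in model.graph.initializer}
for node in model.graph.node:
    if node.op_type == "QLinearConv":
        # Input 7 of QLinearConv is y_zero_point per the ONNX operator spec.
        zp = initializers.get(node.input[7])
        if zp is not None:
            arr = numpy_helper.to_array(zp)
            print(node.name, node.input[7], arr.shape)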
To reproduce
To reproduce the issue, run the following code:
import onnxruntime as ort
import numpy as np
sess = ort.InferenceSession("test_model.onnx", providers=["CPUExecutionProvider"])
output = sess.run(None, {"input.0": np.random.normal(size=(1, 1, 20, 20)).astype(np.float32)})
print(output)
- 1.19.1 – inference works
- 1.21.1 – inference fails with the error above
Additionally, I’ve found that if I disable ONNX Runtime graph optimization, inference works in 1.21.1. This suggests that the issue may be related to the graph optimizations applied to the quantized model:
sess_options = ort.SessionOptions()
sess_options.graph_optimization_level = ort.GraphOptimizationLevel.ORT_DISABLE_ALL
sess = ort.InferenceSession("test_model.onnx", sess_options, providers=["CPUExecutionProvider"])
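As a possible way to narrow this down, a sketch (assuming an arbitrary output path of my choosing) that dumps the optimized graph via SessionOptions.optimized_model_filepath, so the transformed QLinearConv node and its y_zero_point can be inspected with the check above:
dump_options = ort.SessionOptions()
# Write out the graph produced by the default optimization level.
dump_options.optimized_model_filepath = "test_model_optimized.onnx"
# The reported failure happens at run() time, so session creation
# should still succeed and write the optimized model to disk.
sess_dump = ort.InferenceSession("test_model.onnx", dump_options, providers=["CPUExecutionProvider"])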
Urgency
This is a regression in ONNX Runtime between 1.19.1 and 1.21.1.
Platform
Linux
OS Version
Ubuntu 20.04.6 LTS
ONNX Runtime Installation
Released Package
ONNX Runtime Version or Commit ID
1.21.1
ONNX Runtime API
Python
Architecture
X64
Execution Provider
Default CPU
Execution Provider Library Version
No response