Unable to quantize `torchvision.detection` models

### Describe the issue

I'm trying to quantize a model from `torchvision` that I've exported to `onnx` as follows:

```python
import torch
from torchvision.models.detection import (
    faster_rcnn,
)

model = faster_rcnn.fasterrcnn_resnet50_fpn_v2(
    # `pretrained` / `pretrained_backbone` are deprecated aliases for these
    weights=None, weights_backbone=None
)
model.eval()

inputs = torch.rand(1, 3, 640, 640)

torch.onnx.export(
    model,
    inputs,
    "torchvision.onnx",
)
```

This model works fine in `onnxruntime` as is, but I'd like to try quantizing it.

First, I get `Exception: Incomplete symbolic shape inference` during the preprocessing step:

    $ python -m onnxruntime.quantization.preprocess --input torchvision.onnx --output torchvision_preprocessed.onnx
  
But adding `--skip_symbolic_shape true` generates a model without error.
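
For scripting the reproduction, the same preprocessing call can be wrapped in a small helper. This is only a sketch: `build_preprocess_cmd` and `run_preprocess` are hypothetical names, and the only flags used are the ones shown in the command above.

```python
import subprocess
import sys


def build_preprocess_cmd(src, dst, skip_symbolic_shape=True):
    """Build the argv for the preprocessing CLI invoked above."""
    cmd = [
        sys.executable, "-m", "onnxruntime.quantization.preprocess",
        "--input", src,
        "--output", dst,
    ]
    if skip_symbolic_shape:
        # Workaround for "Incomplete symbolic shape inference".
        cmd += ["--skip_symbolic_shape", "true"]
    return cmd


def run_preprocess(src, dst, skip_symbolic_shape=True):
    """Run the preprocessing step; raises CalledProcessError on failure."""
    subprocess.run(build_preprocess_cmd(src, dst, skip_symbolic_shape), check=True)


# Show the command that would be executed for the files in this issue.
print(" ".join(build_preprocess_cmd("torchvision.onnx", "torchvision_preprocessed.onnx")))
```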

Then, I attempt to quantize as in [the example](https://github.com/microsoft/onnxruntime-inference-examples/blob/main/quantization/image_classification/cpu/run.py):

```
...
    quantize_static(
        "./torchvision_preprocessed.onnx",
        "./torchvision_preprocessed_quantized.onnx",
        data_reader,
    )
...
```
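
The `data_reader` passed above is not shown. A minimal sketch of what such a reader might look like follows. `ListDataReader` is a hypothetical name, the batches are assumed to be arrays already shaped like the export input (`1x3x640x640`), and the input name is an assumption that would need to be read from the exported model (for example via `InferenceSession.get_inputs()`).

```python
try:
    from onnxruntime.quantization import CalibrationDataReader
except ImportError:
    # Fallback so the sketch can be read and run without onnxruntime installed.
    CalibrationDataReader = object


class ListDataReader(CalibrationDataReader):
    """Hypothetical reader that feeds pre-built batches one at a time."""

    def __init__(self, input_name, batches):
        self.input_name = input_name  # assumed: the model's single input name
        self.batches = list(batches)
        self._iter = iter(self.batches)

    def get_next(self):
        # Calibration calls this repeatedly until it returns None.
        batch = next(self._iter, None)
        if batch is None:
            return None
        return {self.input_name: batch}

    def rewind(self):
        # Restart iteration so the reader can be reused for another run.
        self._iter = iter(self.batches)
```

Calibration consumes the reader once, so calling `rewind()` before each repeated `quantize_static` attempt avoids calibrating on an already exhausted reader.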

And I get the following exception:

`onnxruntime.capi.onnxruntime_pybind11_state.RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running ReduceMax node. Name:'/Squeeze_2_output_0_ReduceMax' Status Message:`

Notably, the status message actually _is_ empty.

Running it multiple times produces exceptions at different layers each time:

`onnxruntime.capi.onnxruntime_pybind11_state.RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running ReduceMax node. Name:'/roi_heads/box_roi_pool/Gather_6_output_0_ReduceMax' Status Message:`

But it's always a `ReduceMax`. The weird thing is, looking at the model with [Netron](https://netron.app) I can't find these layers in the model at all. Is that expected?
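
One possible explanation, stated as an assumption rather than a confirmed diagnosis: the default MinMax calibration appears to append `ReduceMin`/`ReduceMax` nodes to a temporary copy of the graph, naming each one after the tensor it observes plus a suffix. If so, the node in the error would not exist in the saved model, but the tensor it is named after would. The sketch below (with the hypothetical helper `parse_failed_node`) extracts that tensor name from the error text so it can be searched for in Netron.

```python
import re

# Matches the "while running <Op> node. Name:'<name>'" part of the error text.
_NODE_RE = re.compile(r"while running (\w+) node\. Name:'([^']*)'")


def parse_failed_node(message):
    """Return (op_type, node_name, tensor_name) from the error text, or None."""
    match = _NODE_RE.search(message)
    if match is None:
        return None
    op_type, node_name = match.groups()
    suffix = "_" + op_type
    # Assumption: calibration nodes are named "<tensor name>_<op type>".
    if node_name.endswith(suffix):
        tensor_name = node_name[: -len(suffix)]
    else:
        tensor_name = node_name
    return op_type, node_name, tensor_name


error = (
    "Non-zero status code returned while running ReduceMax node. "
    "Name:'/Squeeze_2_output_0_ReduceMax' Status Message:"
)
print(parse_failed_node(error))
```

For the first error above this yields the tensor name `/Squeeze_2_output_0`, which is the output to look for in the exported graph.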

Is there something in the nature of the `RCNN` variants used by `torchvision` that prevents quantization? I'm able to quantize other models successfully.

### To reproduce

See above.

### Urgency

_No response_

### Platform

Mac

### OS Version

14.2.1

### ONNX Runtime Installation

Released Package

### ONNX Runtime Version or Commit ID

1.17.0

### ONNX Runtime API

Python

### Architecture

ARM64

### Execution Provider

Default CPU

### Execution Provider Library Version

_No response_

Unable to quantize `torchvision.detection` models #19544
