How to provide calibration data for INT8 quantization with dynamic ONNX input shapes #565

@remobax-hub

Description

How would you like to use ModelOpt
I would like to use TensorRT-Model-Optimizer for explicit INT8 quantization of my ONNX model.
The ONNX model's input shape is (-1, 3, -1, -1), i.e., the batch, height, and width dimensions are dynamic, while the channel dimension is fixed at 3.

My specific question:
What is the recommended shape for the calibration dataset when dealing with dynamic input shapes?

Should I fix the batch/spatial dimensions for calibration data, or can multiple different shapes be provided?

Are there best practices or guidelines for preparing calibration data with dynamically shaped models?

I have reviewed the examples and the documentation but did not find explicit instructions on calibrating models with dynamic input shapes.
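For context, here is what I am currently doing: I fix a single representative shape for the calibration batch and save it as a NumPy array, even though the deployed model will see varying shapes. This is only a sketch of my assumption, not something taken from the ModelOpt docs; the batch size, resolution, and file name below are all values I picked myself, and the random data stands in for real preprocessed images.

```python
import numpy as np

# Assumed fixed calibration shape: one representative batch size and
# resolution, even though the ONNX input is declared (-1, 3, -1, -1).
BATCH, CHANNELS, HEIGHT, WIDTH = 8, 3, 640, 640  # hypothetical values

# Stand-in for real preprocessed images; in practice this should be data
# drawn from the same distribution the deployed model will see.
rng = np.random.default_rng(0)
calib = rng.random((BATCH, CHANNELS, HEIGHT, WIDTH), dtype=np.float32)

# Save as a single fixed-shape array for a calibration data loader.
np.save("calib_data.npy", calib)
print(calib.shape, calib.dtype)  # (8, 3, 640, 640) float32
```

Is fixing one representative shape like this the intended approach, or should the calibration set instead cover several of the shapes expected at inference time?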

Who can help?
If possible, please tag maintainers or experts familiar with INT8 quantization and calibration in ModelOpt.

System information
Container used: nvcr.io/nvidia/tensorrt:25.10-py3
OS: Ubuntu 24.04.3 LTS
CPU architecture: x86_64
GPU name: NVIDIA GeForce RTX 4060 Ti
GPU memory size: 8.0 GB
Number of GPUs: 1
Library versions:
Python: 3.12.3
ModelOpt version: 0.39.0
CUDA: 13.0
PyTorch: 2.9.1+cu128
Transformers: 4.57.1
ONNXRuntime: 1.22.0
TensorRT: 10.13.3.9
TensorRT-LLM: None
Any other details that may help: None at this moment
