Description
How would you like to use ModelOpt
I would like to use TensorRT-Model-Optimizer for explicit INT8 quantization of my ONNX model.
The ONNX model input shape is (-1, 3, -1, -1), i.e., batch size, height, and width are dynamic; only the channel dimension is fixed.
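For completeness, this is how the dynamic dimensions appear when I inspect the graph (a minimal sketch; "model.onnx" is a placeholder path):

```python
import onnx

# Print each graph input's name and per-dimension size; dynamic axes show up
# as symbolic dim_param strings (or 0 when unset) instead of fixed integers.
model = onnx.load("model.onnx")  # placeholder path
for inp in model.graph.input:
    dims = [d.dim_param or d.dim_value for d in inp.type.tensor_type.shape.dim]
    print(inp.name, dims)  # e.g. ['batch', 3, 'height', 'width']
```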
My specific questions:
What is the recommended shape for the calibration dataset when dealing with dynamic input shapes?
Should I fix the batch/spatial dimensions for calibration data, or can multiple different shapes be provided?
Are there best practices or guidelines for preparing calibration data with dynamically shaped models?
I have reviewed the examples and the documentation, but did not find explicit instructions about dynamic input calibration.
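For reference, this is roughly what I am trying now (a minimal sketch: the fixed calibration shape 32x3x640x640 is a placeholder I chose myself, and the `modelopt.onnx.quantization.quantize` call with these arguments reflects my reading of the ONNX PTQ examples, not documented guidance for dynamic shapes):

```python
import numpy as np
from modelopt.onnx.quantization import quantize

# Calibration tensor with batch/height/width fixed to representative values;
# the sizes below are placeholders I picked, not a documented recommendation.
calib_data = np.random.rand(32, 3, 640, 640).astype(np.float32)

# Explicit INT8 PTQ; argument names follow my understanding of the ONNX
# quantization examples and may not be the intended usage for dynamic inputs.
quantize(
    onnx_path="model.onnx",         # placeholder path
    quantize_mode="int8",
    calibration_data=calib_data,
    output_path="model.int8.onnx",  # placeholder path
)
```

It is unclear to me whether fixing all dimensions like this is correct, or whether the calibration set should cover several resolutions.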
Who can help?
If possible, please tag maintainers or experts familiar with INT8 quantization and calibration in ModelOpt.
System information
Container used: nvcr.io/nvidia/tensorrt:25.10-py3
OS: Ubuntu 24.04.3 LTS
CPU architecture: x86_64
GPU name: NVIDIA GeForce RTX 4060 Ti
GPU memory size: 8.0 GB
Number of GPUs: 1
Library versions:
Python: 3.12.3
ModelOpt version: 0.39.0
CUDA: 13.0
PyTorch: 2.9.1+cu128
Transformers: 4.57.1
ONNXRuntime: 1.22.0
TensorRT: 10.13.3.9
TensorRT-LLM: None
Any other details that may help: None at this moment