after converting onnx fp32 to int8 engine with custom calibration, the engine layers still show fp32 #4341

@jinhonglu

Description

I followed the Polygraphy INT8 custom calibration example to build an INT8 engine from my ONNX FP32 model:
https://github.com/NVIDIA/TensorRT/tree/main/tools/Polygraphy/examples/cli/convert/01_int8_calibration_in_tensorrt
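
For context, the linked example supplies calibration data through a data loader script that defines a load_data() generator yielding feed dicts. A minimal sketch of such a script follows. The input name "input", the shape, the batch count, and the random data are all placeholders (assumptions, not taken from this model); real calibration should use representative inputs that match the model's actual input names and shapes.

```python
import numpy as np

# Hypothetical values: replace with the model's real input name and shape.
INPUT_NAME = "input"
INPUT_SHAPE = (1, 3, 224, 224)
NUM_BATCHES = 8


def load_data():
    """Yield one feed dict (input name -> float32 array) per calibration batch."""
    rng = np.random.default_rng(0)
    for _ in range(NUM_BATCHES):
        # Random data is only a stand-in; it will not produce meaningful scales.
        yield {INPUT_NAME: rng.random(INPUT_SHAPE, dtype=np.float32)}
```

A script like this is passed to polygraphy convert with --data-loader-script, together with --int8 and --calibration-cache, as in the linked example.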

After building the engine, I inspected its layers with:
polygraphy inspect model int8.engine --model-type engine --show layers
However, all the layers are still reported as FP32.
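
One way to cross-check the reported precisions is TensorRT's engine inspector from Python. The sketch below is an assumption-laden illustration, not the method used in this report: it assumes the engine was built with detailed profiling verbosity (otherwise the inspector lists layer names only), and it assumes each tensor entry in the JSON carries a "Format/Datatype" field. The helper names engine_info and summarize_datatypes are hypothetical.

```python
import json
from collections import Counter


def engine_info(engine_path):
    """Return the engine inspector's JSON description of a serialized engine."""
    import tensorrt as trt  # imported lazily so the parser below works without TensorRT

    logger = trt.Logger(trt.Logger.WARNING)
    with open(engine_path, "rb") as f:
        engine = trt.Runtime(logger).deserialize_cuda_engine(f.read())
    inspector = engine.create_engine_inspector()
    return inspector.get_engine_information(trt.LayerInformationFormat.JSON)


def summarize_datatypes(engine_info_json):
    """Count the format/datatype strings of all layer output tensors."""
    info = json.loads(engine_info_json)
    counts = Counter()
    for layer in info.get("Layers", []):
        if not isinstance(layer, dict):
            continue  # name-only entries carry no tensor information
        for tensor in layer.get("Outputs", []):
            # Assumed key name; entries without it are counted as "unknown".
            counts[tensor.get("Format/Datatype", "unknown")] += 1
    return counts
```

Calling summarize_datatypes(engine_info("int8.engine")) would then give a count per format string, which makes it easy to see whether any INT8 tensors are present at all.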

I also tried the debug precision tool to find which layers differ, so that I could build a mixed-precision engine. The layers are again reported as FP32, and inference is much slower than with the ONNX FP32 model.
CUDA_VISIBLE_DEVICES=3 polygraphy debug precision fp32_model.onnx --int8 --tactic-sources cublas --verbose -p float32 --calibration-cache int8_calib.cache --check polygraphy run polygraphy_debug.engine --trt --load-inputs golden_input.json --load-outputs golden.json --abs 1e-2

Environment

TensorRT Version: 10.4

NVIDIA GPU: A100

NVIDIA Driver Version: 12.5

CUDA Version:

CUDNN Version:

Operating System:

Python Version (if applicable):

Tensorflow Version (if applicable):

PyTorch Version (if applicable):

Baremetal or Container (if so, version):

Relevant Files

Model link:

Steps To Reproduce

Commands or scripts:

Have you tried the latest release?:

Can this model run on other frameworks? For example run ONNX model with ONNXRuntime (polygraphy run <model.onnx> --onnxrt):

Metadata

Labels

Module:Polygraphy (Issues with Polygraphy)
stale (PR or issue hasn't been responded to by the OP)
triaged (Issue has been triaged by maintainers)
waiting for feedback (Requires more information from author of item to make progress on the issue.)
