Hello,
I want to run inference on a ResNet50 model prequantized to INT8 in PyTorch. Is there a way to deploy this model with TensorRT? As far as I understand, TensorRT only accepts a quantized ONNX model, but PyTorch currently does not support conversion to quantized ONNX.
Is it possible to deploy a model prequantized in TFLite with TensorRT instead?
A future release of TensorRT will be able to import a prequantized model from PyTorch. Such models can be quantized using the toolkit we released earlier.
We don't yet support deploying a prequantized model from TFLite.
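For reference, a minimal sketch of that quantization flow, assuming the toolkit mentioned above is NVIDIA's pytorch-quantization package (the exact API may differ between releases):

```python
import torch
import torchvision
from pytorch_quantization import quant_modules

# Replace standard torch layers (Conv2d, Linear, ...) with fake-quantized
# counterparts before the model is built, so quantizer nodes are inserted
# automatically.
quant_modules.initialize()

model = torchvision.models.resnet50(pretrained=True).eval()

# ... calibrate the quantizers on representative data (and optionally
#     fine-tune with QAT) before exporting the model for TensorRT ...
```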
PyTorch does support exporting a (fake) quantized model to ONNX, although per-channel support hasn't been merged to master yet because it depends on ONNX opset 13. See pytorch/pytorch#42835.
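As a rough illustration (not taken from this thread), a fake-quantized model prepared with the eager-mode QAT workflow can be exported along these lines, assuming a PyTorch build where the per-channel export from pytorch/pytorch#42835 is available; with per-tensor quantization an earlier opset may suffice:

```python
import torch
import torchvision

model = torchvision.models.resnet50(pretrained=True)

# Attach fake-quantization observers (eager-mode QAT workflow).
model.qconfig = torch.quantization.get_default_qat_qconfig("fbgemm")
model.train()
torch.quantization.prepare_qat(model, inplace=True)

# ... run a few calibration / fine-tuning batches here so the observers
#     collect activation and weight ranges ...

model.eval()
dummy = torch.randn(1, 3, 224, 224)
torch.onnx.export(
    model,
    dummy,
    "resnet50_fake_quant.onnx",
    opset_version=13,  # opset 13 is needed for per-channel QuantizeLinear
)
```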