Closed
Describe the documentation issue
Question: Is 16-bit quantization supported by the Python tool?
There is no mention of 16-bit quantization in the documentation, but from the Python tool it looks like I can set `QuantType` to `QInt16` or `QUInt16`.
Is it officially supported? Are there any known issues or limitations?
Page / URL
https://onnxruntime.ai/docs/performance/model-optimizations/quantization.html