Hi, can you share best practices for quantization for CNN models?
Is ModelOpt PTQ the way to go with TensorRT for CNN models (ResNet, RetinaNet, etc.)? I was able to quantize a RetinaNet backbone to INT8, but the lack of examples and documented practices makes me wonder whether that is the right approach.
Thanks