How to create a quantized model in order to try out QNNPACK? #12
We haven't released the tooling for converting floating-point models to quantized (8-bit) models yet. You may use QNNPACK with the two pre-trained quantized models that we released: ResNet-50 and MobileNet v2.
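For reference, here is a minimal sketch of running one of those pre-trained quantized models with Caffe2's Predictor API. The file names and input shape below are assumptions, and the real preprocessing (resize, mean/scale) is model-specific:

```python
import numpy as np
from caffe2.python import workspace

# Load the serialized nets (file names are placeholders).
with open("init_net.pb", "rb") as f:
    init_net = f.read()
with open("predict_net.pb", "rb") as f:
    predict_net = f.read()

# Predictor runs init_net once to materialize the weights,
# then runs predict_net on each call.
predictor = workspace.Predictor(init_net, predict_net)

# NCHW float input; replace with real, model-specific preprocessing.
img = np.random.rand(1, 3, 224, 224).astype(np.float32)
results = predictor.run([img])
print(results[0].shape)
```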
I got this error when I tried the ResNet-50 model you mentioned above.
@biaoxiaoduan In quantized ResNet-50 the input blob is called "gpu_0/data_0" rather than "data".
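In case it helps others hitting the same error, a sketch of feeding the input under the model's actual blob name; the paths are placeholders, and the output blob is read from the net definition rather than guessed:

```python
import numpy as np
from caffe2.python import workspace
from caffe2.proto import caffe2_pb2

# Parse the serialized nets (file names are placeholders).
init_net = caffe2_pb2.NetDef()
with open("init_net.pb", "rb") as f:
    init_net.ParseFromString(f.read())
predict_net = caffe2_pb2.NetDef()
with open("predict_net.pb", "rb") as f:
    predict_net.ParseFromString(f.read())

workspace.RunNetOnce(init_net)  # materialize the weights

# The input must be fed under the model's actual blob name;
# feeding it as "data" is what triggers the error above.
img = np.random.rand(1, 3, 224, 224).astype(np.float32)
workspace.FeedBlob("gpu_0/data_0", img)

workspace.RunNetOnce(predict_net)
# Fetch whatever the net declares as its final output.
out = workspace.FetchBlob(predict_net.external_output[-1])
```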
Hi, I have the same problem. Do you have any idea how to solve it?
@xiezheng-cs As of today there's no out-of-the-box solution for Caffe2 or PyTorch, but the team is working on it.
Hi, do you have any updates on a tool, or any suggestions on how to manually convert an ONNX model to use quantized 8-bit integers?
ONNX doesn't support quantization as of today. Any chance you can convert the model to Caffe2 and quantize it to 8-bit using Caffe2?
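If conversion is an option, something along these lines should produce Caffe2 nets from an ONNX file. This is a sketch assuming the onnx package and Caffe2's ONNX backend are installed; "model.onnx" and the output paths are placeholders:

```python
import onnx
from caffe2.python.onnx.backend import Caffe2Backend

# Load the ONNX model and translate it into a pair of Caffe2 nets.
model = onnx.load("model.onnx")
init_net, predict_net = Caffe2Backend.onnx_graph_to_caffe2_net(model)

# Serialize the nets so they can be loaded with the Caffe2 workspace API.
with open("init_net.pb", "wb") as f:
    f.write(init_net.SerializeToString())
with open("predict_net.pb", "wb") as f:
    f.write(predict_net.SerializeToString())
```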
Do you have any estimate of when quantization will be available in ONNX? Our main model was unfortunately not written for Caffe2.
@hardsetting Quantization support in ONNX is unfortunately on hold right now; sorry, we don't have any ETA on it :(
Luckily I just managed to convert my model to Caffe2. What about the conversion tool from a regular Caffe2 model to a quantized Caffe2 model that @Maratyszcza was referring to? Do you have an ETA for that, or any news?
The latest ETA is one to two months :(
Ok, thank you for the answer.
I have the same problem. Have you solved it?
Hi,
I am excited that Facebook has revealed its own mobile inference framework. After reading the article about QNNPACK, I really want to try it out on my own caffemodel. (I know you have posted a quantized MobileNet v2, and it beats the TFLite one by 2x.) But how can I convert my prototxt and caffemodel to the model format that QNNPACK expects?
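As a possible first step only (this does not cover the quantization part, for which the comments above say no tooling had been released), Caffe2 ships a translator for Caffe prototxt/caffemodel pairs. A sketch, with placeholder paths, that produces a float Caffe2 net:

```python
from google.protobuf import text_format
from caffe2.python.caffe_translator import TranslateModel
from caffe2.proto import caffe_pb2

# Load the Caffe network definition (paths are placeholders).
caffenet = caffe_pb2.NetParameter()
with open("deploy.prototxt") as f:
    text_format.Merge(f.read(), caffenet)

# Load the pretrained Caffe weights.
weights = caffe_pb2.NetParameter()
with open("model.caffemodel", "rb") as f:
    weights.ParseFromString(f.read())

# Translate into a Caffe2 predict net plus the pretrained parameters.
predict_net, params = TranslateModel(caffenet, weights, is_test=True)
```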