Hi, will exporting to QLinear save weights in int8? #81
Comments
The ONNX_QNN backend saves model weights in INT8, which makes the model smaller. Do you have a model file or something to reproduce the bug?
@Tracin Hi, I am trying to quantize a simple resnet18 model using ONNX_QNN. It seems to trace successfully and insert FakeQuantize ops, but it cannot get through the final ONNX_QNN pass:
What could be the reason? More of the log:
BTW, can a model saved by ONNX_QNN run inference via ORT (ONNX Runtime)?
That is definitely not right. Any simple code to reproduce the error?
@Tracin I am not sure whether my code can run on your side, since I modified some code in MQBench to support torch 1.11 (that is the main reason I am trying MQBench: decent support for recent PyTorch). Here is the config I ran:

```yaml
extra_prepare_dict:
extra_qconfig_dict:
  w_observer: MinMaxObserver
  a_observer: EMAMinMaxObserver
  w_fakequantize: FixedFakeQuantize
  a_fakequantize: FixedFakeQuantize
  w_qscheme:
    bit: 8
    # symmetry: False
    symmetry: true
    per_channel: True
    pot_scale: False
  a_qscheme:
    bit: 8
    # symmetry: False
    symmetry: true
    per_channel: False
    pot_scale: False
quantize:
  quantize_type: naive_ptq  # support naive_ptq or advanced_ptq
  cali_batchsize: 16
  # backend: 'Tensorrt'
  backend: 'ONNX_QNN'
  # backend: 'PPLW8A16'
  deploy:
    model_name: 'r18.onnx'
    output_path: './'
    deploy_to_qlinear: true
model:  # architecture details
  type: resnet18  # model name
  kwargs:
    num_classes: 1000
  path: /path-of-pretrained
data:
  path: /path-of-imagenet
  batch_size: 64
  num_workers: 4
  pin_memory: True
  input_size: 224
  test_resize: 256
process:
  seed: 1005
```

Code:
The traced model looks normal in the saved ONNX with the fake-quantize ops. But the problem is that it cannot get past the ONNX_QNN pass. And I found that the commented-out code raises an assertion. Any solution to fix this?
Resnet18 can pass the test in https://github.com/ModelTC/MQBench/blob/main/test/backend/test_backend.py#L120-L131
@Tracin May I ask when pytorch 1.11 will be supported? I do believe your test can run; as I mentioned before, I am running on pytorch 1.11 to upgrade our product. However, I found that the problem is caused by the initializers. The initializers still use the old names:
What might be missing here?
It is difficult to reproduce your code. To keep things simple, can the specific test case pass under your Python env?
@Tracin You can test on pytorch 1.11. A lot of things changed, including ONNX export, fx, etc. I think I know where the problem is.
It will take some time to test. |
@Tracin Yes, torch 1.11 removed a lot of the APIs used in MQBench.
Using the TensorRT backend, will QLinear make the ONNX model smaller?
I got some errors when trying to save to QLinear: