[Bug]: OnnxQuantization #573
@akarym-sl, what's the config JSON you used to run the optimization for quantization?
Here is the config. For the second case, I prepend ["onnx_conv", "onnx_quant"] to the "pass_flows" list: {
"input_model":{
"type":"PyTorchModel",
"config":{
"model_path":"model.pt",
"model_loader":"load_state_dict",
"model_script":"save.py",
"dummy_inputs_func":"get_dummy_inputs",
"io_config":{
"input_names":[
"input"
],
"output_names":[
"output"
],
"dynamic_axes":{
"input":{
"0":"batch"
},
"output":{
"0":"batch"
}
}
}
}
},
"systems":{
"local_system":{
"type":"LocalSystem",
"config":{
"accelerators":[
"cpu"
]
}
}
},
"evaluators":{
"custom_evaluator":{
"metrics":[
{
"name":"custom",
"type":"custom",
"user_config":{
"user_script":"user_script.py",
"batch_size":1,
"dataloader_func":"create_dataloader",
"evaluate_func":"evaluate"
},
"sub_types":[
{
"name":"latency",
"priority":1,
"higher_is_better":false
},
{
"name":"accuracy",
"priority":2,
"higher_is_better":true
}
]
}
]
}
},
"engine":{
"clean_cache":true,
"cache_dir":".cache",
"output_dir":"optimization",
"host":"local_system",
"target":"local_system",
"execution_providers":[
"CPUExecutionProvider"
],
"evaluator":"custom_evaluator",
"evaluate_input_model":false
},
"passes":{
"onnx_conv":{
"type":"OnnxConversion",
"config":{
"target_opset":15
}
},
"onnx_quant":{
"type":"OnnxQuantization",
"config":{
"user_script":"user_script.py",
"dataloader_func":"create_calibrator",
}
},
"onnx_quant_u":{
"type":"OnnxQuantization",
"config":{
"user_script":"user_script.py",
"dataloader_func":"create_calibrator",
"weight_type":"QUInt8",
"activation_type":"QUInt8"
}
}
},
"pass_flows": [["onnx_conv", "onnx_quant_u"]]
}
@akarym-sl, do you mean the accuracy of the model optimized via pass_flows differs depending on whether the default onnx_quant pass is present? From your description, [["onnx_conv", "onnx_quant_u"]] gives a different result than [["onnx_conv", "onnx_quant"], ["onnx_conv", "onnx_quant_u"]], even though both runs contain the same pass group ["onnx_conv", "onnx_quant_u"] and should therefore produce the same model. Is my understanding correct?
Yes, in my understanding, previous passes shouldn't affect the current one. I observe that adding the ["onnx_conv", "onnx_quant"] pass flow affects the accuracy of the ["onnx_conv", "onnx_quant_u"] pass flow. My guess would be that in the second flow the model is not quantized to QUInt8, as it should be, but is instead quantized to QInt8 or not changed at all and loaded from the previous pass.
@trajepl is helping look into this.
Thanks for raising this. It is a bug on the Olive side. I changed the key to the pass name in the following PR and tested with the BERT case; it worked well for me. {
"input_model":{
"type": "PyTorchModel",
"config": {
"hf_config": {
"model_name": "Intel/bert-base-uncased-mrpc",
"task": "text-classification",
"dataset": {
"data_name":"glue",
"subset": "mrpc",
"split": "validation",
"input_cols": ["sentence1", "sentence2"],
"label_cols": ["label"],
"batch_size": 1
}
}
}
},
"evaluators": {
"common_evaluator": {
"metrics":[
{
"name": "accuracy",
"type": "accuracy",
"backend": "huggingface_metrics",
"sub_types": [
{"name": "accuracy", "priority": 1, "goal": {"type": "max-degradation", "value": 0.01}},
{"name": "f1"}
]
},
{
"name": "latency",
"type": "latency",
"sub_types": [
{"name": "avg", "priority": 2, "goal": {"type": "percent-min-improvement", "value": 20}},
{"name": "max"},
{"name": "min"}
]
}
]
}
},
"passes": {
"conversion": {
"type": "OnnxConversion",
"config": {
"target_opset": 13
}
},
"onnx_quant": {
"type": "OnnxQuantization",
"config": {
"data_config": "__input_model_data_config__"
}
},
"onnx_quant_u": {
"type": "OnnxQuantization",
"config": {
"data_config": "__input_model_data_config__",
"weight_type":"QUInt8",
"activation_type":"QUInt8"
}
}
},
"pass_flows": [
["conversion", "onnx_quant_u"]
],
"engine": {
"evaluator": "common_evaluator",
"execution_providers": ["CPUExecutionProvider"],
"cache_dir": "cache",
"output_dir" : "models/bert_ptq_cpu",
"clean_cache": true
}
}
Could you give this PR a try? @akarym-sl #577
git clone https://github.com/microsoft/Olive
cd Olive
pip install .
## Describe your changes
This PR fixes the following issue, where the same pass with different configs appears in one Olive run config. #573
[Root Cause]: Olive uses the pass's class name as the key to identify the pass instance. When there are passes with the same pass class, the first one defined in the Olive run config is always picked. https://github.com/microsoft/Olive/blob/main/olive/engine/engine.py#L436
## Checklist before requesting a review
- [ ] Add unit tests for this change.
- [ ] Make sure all tests can pass.
- [ ] Update documents if necessary.
- [ ] Format your code by running `pre-commit run --all-files`
- [ ] Is this a user-facing change? If yes, give a description of this change to be included in the release notes.
## (Optional) Issue link
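The root cause can be illustrated with a short, self-contained Python sketch. The class and function names below are hypothetical and do not mirror Olive's actual internals; the point is only that a registry searched by pass *class* name always returns the first entry of that class, so a second OnnxQuantization pass with a different config is silently ignored.

```python
# Illustrative sketch of the root cause; names are hypothetical,
# not Olive's real API.

class OnnxQuantization:
    def __init__(self, config):
        self.config = config

# Two passes of the same class with different configs, as in the run config.
passes = {
    "onnx_quant": OnnxQuantization({"weight_type": "QInt8"}),
    "onnx_quant_u": OnnxQuantization({"weight_type": "QUInt8"}),
}

def find_pass_buggy(pass_name):
    # Buggy behavior: match by class name, so the first entry of that
    # class is returned regardless of which pass was requested.
    cls_name = type(passes[pass_name]).__name__
    for p in passes.values():
        if type(p).__name__ == cls_name:
            return p

def find_pass_fixed(pass_name):
    # Fixed behavior: look the pass up by its name from the run config.
    return passes[pass_name]

print(find_pass_buggy("onnx_quant_u").config["weight_type"])  # QInt8 (wrong)
print(find_pass_fixed("onnx_quant_u").config["weight_type"])  # QUInt8
```

This is consistent with the symptom reported above: the second flow appears to reuse the default-quantized (QInt8) model instead of producing a QUInt8 one.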
@akarym-sl, please let us know whether the bug is fixed or not. |
I tested the new version (0.4.0) on the same setup and can confirm that the issue is gone! Therefore, closing the issue. Thank you! |
…598)
## Describe your changes
Unit tests for the same pass with different configs in one Olive config, to cover this case: #573
## Checklist before requesting a review
- [ ] Add unit tests for this change.
- [ ] Make sure all tests can pass.
- [ ] Update documents if necessary.
- [ ] Format your code by running `pre-commit run --all-files`
- [ ] Is this a user-facing change? If yes, give a description of this change to be included in the release notes.
## (Optional) Issue link
What happened?
When running a first OnnxQuantization pass with default parameters and then a pass with QUInt8 weight and activation types, the model parameters are not quantized to QUInt8.
To clarify, running the pass flow [["onnx_conv", "onnx_quant_u"]] yields different accuracy than running the two pass flows [["onnx_conv", "onnx_quant"], ["onnx_conv", "onnx_quant_u"]].
Version?
0.3.1