
[Bug]: OnnxQuantization #573

Closed
akarym-sl opened this issue Sep 15, 2023 · 8 comments
Labels
bug Something isn't working python Pull requests that update Python code

Comments

@akarym-sl

What happened?

When running a first OnnxQuantization pass with default parameters and then a pass with QUInt8 weight and activation types, the model parameters are not quantized to QUInt8.
To clarify, running pass:

  • OnnxConversion->OnnxQuantization (QUInt8)

yields different accuracy than running two passes:

  • OnnxConversion->OnnxQuantization
  • OnnxConversion->OnnxQuantization (QUInt8)

Version?

0.3.1

@akarym-sl akarym-sl added the bug Something isn't working label Sep 15, 2023
@guotuofeng
Collaborator

@akarym-sl , what's the config json you used to run the optimization for quantization?

@guotuofeng guotuofeng added the waiting for response Waiting for response label Sep 17, 2023
@akarym-sl
Author

Here is the config. For the second case, I prepend ["onnx_conv", "onnx_quant"] to the "pass_flows" list.
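In other words, for the second case the "pass_flows" entry in the config below would read:

```json
"pass_flows": [["onnx_conv", "onnx_quant"], ["onnx_conv", "onnx_quant_u"]]
```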

{
    "input_model":{
        "type":"PyTorchModel",
        "config":{
            "model_path":"model.pt",
            "model_loader":"load_state_dict",
            "model_script":"save.py",
            "dummy_inputs_func":"get_dummy_inputs",
            "io_config":{
                "input_names":[
                    "input"
                ],
                "output_names":[
                    "output"
                ],
                "dynamic_axes":{
                    "input":{
                        "0":"batch"
                    },
                    "output":{
                        "0":"batch"
                    }
                }
            }
        }
    },
    "systems":{
        "local_system":{
            "type":"LocalSystem",
            "config":{
                "accelerators":[
                    "cpu"
                ]
            }
        }
    },
    "evaluators":{
        "custom_evaluator":{
            "metrics":[
                {
                    "name":"custom",
                    "type":"custom",
                    "user_config":{
                        "user_script":"user_script.py",
                        "batch_size":1,
                        "dataloader_func":"create_dataloader",
                        "evaluate_func":"evaluate"
                    },
                    "sub_types":[
                        {
                            "name":"latency",
                            "priority":1,
                            "higher_is_better":false
                        },
                        {
                            "name":"accuracy",
                            "priority":2,
                            "higher_is_better":true
                        }
                    ]
                }
            ]
        }
    },
    "engine":{
        "clean_cache":true,
        "cache_dir":".cache",
        "output_dir":"optimization",
        "host":"local_system",
        "target":"local_system",
        "execution_providers":[
            "CPUExecutionProvider"
        ],
        "evaluator":"custom_evaluator",
        "evaluate_input_model":false
    },
    "passes":{
        "onnx_conv":{
            "type":"OnnxConversion",
            "config":{
                "target_opset":15
            }
        },
        "onnx_quant":{
            "type":"OnnxQuantization",
            "config":{
                "user_script":"user_script.py",
                "dataloader_func":"create_calibrator"
            }
        },
        "onnx_quant_u":{
            "type":"OnnxQuantization",
            "config":{
                "user_script":"user_script.py",
                "dataloader_func":"create_calibrator",
                "weight_type":"QUInt8",
                "activation_type":"QUInt8"
            }
        }
    },
    "pass_flows":[["onnx_conv", "onnx_quant_u"]]
}

@guotuofeng
Collaborator

guotuofeng commented Sep 18, 2023

@akarym-sl, do you mean that the accuracy of the model optimized with this pass_flows differs from that of a run without the default onnx_quant?

From your description, it seems the accuracy from [["onnx_conv", "onnx_quant_u"]] differs from [["onnx_conv", "onnx_quant"], ["onnx_conv", "onnx_quant_u"]]. Your point is that two runs of the same pass group ["onnx_conv", "onnx_quant_u"] should give the same result. Is my understanding correct?

@akarym-sl
Author

Yes, in my understanding, previous passes shouldn't affect the current one. I observe that adding the ["onnx_conv", "onnx_quant"] pass flow affects the accuracy of the ["onnx_conv", "onnx_quant_u"] pass flow. My guess is that the model is not quantized to QUInt8 in the second pass, as it should be, but is instead quantized to QInt8, or left unchanged and loaded from the previous pass.

@guotuofeng
Collaborator

@trajepl is helping look into this.

@trajepl
Contributor

trajepl commented Sep 18, 2023

Thanks for raising this. It is a bug on the Olive side.
The root cause: Olive uses the pass's class name (OnnxQuantization) as the key to access the pass instance (onnx_quant, onnx_quant_u).
When the same pass appears with different configs (onnx_quant, onnx_quant_u), only the first one (onnx_quant) is used to run quantization.
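The collision described above can be sketched in a few lines (illustrative names only, not Olive's actual internals): if pass instances are registered under their class name, two differently-configured passes of the same class collapse into one entry, and the first registration wins for every lookup.

```python
# Illustrative sketch of the key-collision bug (not Olive's real code):
# keying a registry by the pass *class name* collapses two
# differently-configured passes of the same class into one entry.
passes = [
    ("onnx_quant", "OnnxQuantization", {"weight_type": "QInt8"}),
    ("onnx_quant_u", "OnnxQuantization", {"weight_type": "QUInt8"}),
]

buggy = {}  # keyed by class name: first registration wins
fixed = {}  # keyed by unique pass name: both configs survive
for pass_name, class_name, config in passes:
    buggy.setdefault(class_name, config)
    fixed[pass_name] = config

# Looking up onnx_quant_u through the class name returns the wrong config:
print(buggy["OnnxQuantization"])  # {'weight_type': 'QInt8'}
print(fixed["onnx_quant_u"])      # {'weight_type': 'QUInt8'}
```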

I changed the key to the pass name in the following PR and tested it with the bert case; it worked well for me.

{
    "input_model":{
        "type": "PyTorchModel",
        "config": {
            "hf_config": {
                "model_name": "Intel/bert-base-uncased-mrpc",
                "task": "text-classification",
                "dataset": {
                    "data_name":"glue",
                    "subset": "mrpc",
                    "split": "validation",
                    "input_cols": ["sentence1", "sentence2"],
                    "label_cols": ["label"],
                    "batch_size": 1
                }
            }
        }
    },
    "evaluators": {
        "common_evaluator": {
            "metrics":[
                {
                    "name": "accuracy",
                    "type": "accuracy",
                    "backend": "huggingface_metrics",
                    "sub_types": [
                        {"name": "accuracy", "priority": 1, "goal": {"type": "max-degradation", "value": 0.01}},
                        {"name": "f1"}
                    ]
                },
                {
                    "name": "latency",
                    "type": "latency",
                    "sub_types": [
                        {"name": "avg", "priority": 2, "goal": {"type": "percent-min-improvement", "value": 20}},
                        {"name": "max"},
                        {"name": "min"}
                    ]
                }
            ]
        }
    },
    "passes": {
        "conversion": {
            "type": "OnnxConversion",
            "config": {
                "target_opset": 13
            }
        },
        "onnx_quant": {
            "type": "OnnxQuantization",
            "config": {
                "data_config": "__input_model_data_config__"
            }
        },
        "onnx_quant_u": {
            "type": "OnnxQuantization",
            "config": {
                "data_config": "__input_model_data_config__",
                "weight_type":"QUInt8",
                "activation_type":"QUInt8"
            }
        }
    },
    "pass_flows": [
        ["conversion", "onnx_quant_u"]
    ],
    "engine": {
        "evaluator": "common_evaluator",
        "execution_providers": ["CPUExecutionProvider"],
        "cache_dir": "cache",
        "output_dir" : "models/bert_ptq_cpu",
        "clean_cache": true
    }
}

Could you try this PR? @akarym-sl #577

git clone https://github.com/microsoft/Olive
cd Olive
pip install .

trajepl added a commit that referenced this issue Sep 18, 2023
## Describe your changes

This PR fixes the following issue, where the same pass with different
configs appears in one Olive run config.
#573

[Root Cause]:
Olive uses the pass's class name as the key to identify the pass instance.
When there are passes with the same pass class, the first one defined in
the Olive run config is always picked.
https://github.com/microsoft/Olive/blob/main/olive/engine/engine.py#L436

## Checklist before requesting a review
- [ ] Add unit tests for this change.
- [ ] Make sure all tests can pass.
- [ ] Update documents if necessary.
- [ ] Format your code by running `pre-commit run --all-files`
- [ ] Is this a user-facing change? If yes, give a description of this
change to be included in the release notes.

## (Optional) Issue link
@guotuofeng
Collaborator

@akarym-sl, please let us know whether the bug is fixed or not.

@guotuofeng guotuofeng added python Pull requests that update Python code and removed waiting for response Waiting for response labels Sep 19, 2023
@akarym-sl
Author

I tested the new version (0.4.0) on the same setup and can confirm that the issue is gone! Therefore, closing the issue. Thank you!

trajepl added a commit that referenced this issue Sep 22, 2023
…598)

## Describe your changes
Unit tests for the same pass with different configs in one Olive config,
to cover this case: #573

## Checklist before requesting a review
- [ ] Add unit tests for this change.
- [ ] Make sure all tests can pass.
- [ ] Update documents if necessary.
- [ ] Format your code by running `pre-commit run --all-files`
- [ ] Is this a user-facing change? If yes, give a description of this
change to be included in the release notes.

## (Optional) Issue link