Add changes for strided calibration #20949
Conversation
Please fill in motivation and context.
/azp run Linux CPU CI Pipeline, Linux CPU Minimal Build E2E CI Pipeline, Linux GPU CI Pipeline, Linux GPU TensorRT CI Pipeline, Linux OpenVINO CI Pipeline, MacOS CI Pipeline, ONNX Runtime Web CI Pipeline, onnxruntime-binary-size-checks-ci-pipeline, Linux QNN CI Pipeline
/azp run Windows CPU CI Pipeline, Windows GPU CI Pipeline, Windows GPU TensorRT CI Pipeline, Windows ARM64 QNN CI Pipeline, orttraining-linux-ci-pipeline, orttraining-linux-gpu-ci-pipeline, orttraining-ortmodule-distributed, Windows x64 QNN CI Pipeline, Linux MIGraphX CI Pipeline, Big Models
/azp run ONNX Runtime React Native CI Pipeline, orttraining-amd-gpu-ci-pipeline, Linux Android Emulator QNN CI Pipeline
Azure Pipelines successfully started running 9 pipeline(s).
Azure Pipelines successfully started running 3 pipeline(s).
Azure Pipelines successfully started running 10 pipeline(s).
@RuomeiMS It looks like this same feature was already implemented in the past using a different option. However, that PR was not tested, and several follow-up PRs broke its functionality: #19416

I think that all we have to do is add the following new line to `collect_data`:

```python
def collect_data(self, data_reader: CalibrationDataReader):
    while True:
        inputs = data_reader.get_next()
        if not inputs:
            break
        self.intermediate_outputs.append(self.infer_session.run(None, inputs))
        if (
            self.max_intermediate_outputs is not None
            and len(self.intermediate_outputs) == self.max_intermediate_outputs
        ):
            self.compute_data()  # ADD this line
            self.clear_collected_data()
```

If you do this, then we may be able to just use the existing `max_intermediate_outputs` option.
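For reference, here is a minimal sketch of a data reader that the `collect_data` loop above can consume. `CalibrationDataReader` and its `get_next()` method are the real onnxruntime.quantization interfaces; the input name, shape, and random data are placeholders for illustration only.

```python
import numpy as np
from onnxruntime.quantization import CalibrationDataReader

class RandomDataReader(CalibrationDataReader):
    """Illustrative reader: yields one feed dict per calibration sample."""

    def __init__(self, num_samples: int, shape=(1, 3, 224, 224)):
        # collect_data pulls these one at a time via get_next().
        self._feeds = iter(
            [{"input": np.random.rand(*shape).astype(np.float32)} for _ in range(num_samples)]
        )

    def get_next(self):
        # Returning None signals exhaustion and breaks the collect_data loop.
        return next(self._feeds, None)
```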
* Read quant data in chunks and calculate/accumulate ranges to resolve OOM
* Add one unit test, `test_stride_effect_on_data_collection`, in ORT style
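A hedged sketch (not the PR's actual test) of the property such a stride test can check: collecting ranges in chunks of any stride should yield the same min/max as a single pass over all the data. All names below are illustrative.

```python
import numpy as np

def ranges_with_stride(samples, stride):
    """Compute the (min, max) range over samples, folding in one chunk at a time."""
    merged = None
    for start in range(0, len(samples), stride):
        chunk = np.concatenate(samples[start:start + stride])
        lo, hi = chunk.min(), chunk.max()
        merged = (lo, hi) if merged is None else (min(merged[0], lo), max(merged[1], hi))
    return merged

def test_stride_effect_on_data_collection():
    samples = [np.random.randn(4).astype(np.float32) for _ in range(8)]
    full = ranges_with_stride(samples, stride=8)  # one pass over everything
    for stride in (1, 2, 3, 5):
        # min/max merging is associative, so the stride must not change the result
        assert ranges_with_stride(samples, stride) == full
```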
Force-pushed from c986b09 to d619de0
onnxruntime/python/tools/quantization/execution_providers/qnn/quant_config.py (Fixed)
Looking great!
We need to make sure all linting errors are fixed before we merge. Could you please run the following in your root onnxruntime directory and fix all issues?
`lintrunner -a`
https://github.com/microsoft/onnxruntime/actions/runs/9478693880/job/26136037422?pr=20949
@microsoft-github-policy-service agree
Force-pushed from d570ab4 to 92ab4bf
Context and motivation:
When quantizing large transformer models, we hit OOM issues as the number of calibration samples grows. To resolve this, this PR adds support for reading quantization data in chunks, calculating ranges for intermediate tensors per chunk, and then accumulating the results into the final ranges.
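To illustrate the idea, here is a sketch of the chunked accumulation approach described above, not the PR's actual implementation; `merge_chunk_ranges`, `running_ranges`, and the tensor names are hypothetical. Per-chunk min/max ranges are folded into running ranges so only one chunk of intermediate outputs is ever resident in memory.

```python
import numpy as np

def merge_chunk_ranges(running: dict, chunk_outputs: list) -> dict:
    """Fold one chunk of intermediate outputs into the running (min, max) ranges."""
    for outputs in chunk_outputs:  # one dict of tensor_name -> ndarray per sample
        for name, value in outputs.items():
            lo, hi = float(np.min(value)), float(np.max(value))
            if name in running:
                prev_lo, prev_hi = running[name]
                running[name] = (min(prev_lo, lo), max(prev_hi, hi))
            else:
                running[name] = (lo, hi)
    return running

# Usage: process calibration data chunk by chunk instead of all at once.
running_ranges: dict = {}
for _ in range(4):  # four chunks of two samples each (synthetic data for illustration)
    chunk = [{"conv1_out": np.random.randn(1, 8)} for _ in range(2)]
    running_ranges = merge_chunk_ranges(running_ranges, chunk)
print(running_ranges["conv1_out"])  # accumulated (min, max) across all chunks
```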