
Add changes for strided calibration #20949

Merged

Conversation

@RuomeiMS (Contributor) commented Jun 6, 2024

Context and motivation:
When quantizing large transformer models, we ran into OOM issues as the number of calibration samples increased. To resolve this, this PR adds support for reading quantization data in chunks, calculating ranges for the intermediate tensors of each chunk, and then accumulating the per-chunk results into the final ranges.
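
For illustration, here is a minimal sketch of the chunked min/max accumulation described above. It is not the PR's actual code: the real change lives in the quantization calibrator (calibrate.py), which runs an augmented model whose intermediate tensors are exposed as outputs. The function and helper names here (calibrate_in_chunks, _fold_chunk, chunk_size) are hypothetical.

```python
import numpy as np
import onnxruntime as ort


def calibrate_in_chunks(model_path, input_batches, chunk_size=16):
    """Accumulate per-output min/max ranges chunk by chunk instead of keeping
    every inference result in memory at once (illustrative sketch only)."""
    session = ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])
    output_names = [o.name for o in session.get_outputs()]
    ranges = {}  # output name -> (running min, running max)

    buffered = []
    for feed in input_batches:  # feed: dict mapping input names to numpy arrays
        buffered.append(session.run(None, feed))
        if len(buffered) == chunk_size:
            _fold_chunk(buffered, output_names, ranges)
            buffered.clear()  # release the chunk before reading more data
    if buffered:  # flush the final partial chunk
        _fold_chunk(buffered, output_names, ranges)
    return ranges


def _fold_chunk(chunk, output_names, ranges):
    """Merge one chunk's min/max values into the running ranges."""
    for outputs in chunk:
        for name, tensor in zip(output_names, outputs):
            lo, hi = float(np.min(tensor)), float(np.max(tensor))
            old = ranges.get(name, (lo, hi))
            ranges[name] = (min(old[0], lo), max(old[1], hi))
```

Because min/max are order-independent, folding ranges chunk by chunk gives the same result as processing all samples at once, while only one chunk of inference outputs is ever held in memory.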

@mailkv23 commented Jun 6, 2024

Please fill in motivation and context

@adrianlizarraga (Contributor)

/azp run Linux CPU CI Pipeline, Linux CPU Minimal Build E2E CI Pipeline, Linux GPU CI Pipeline, Linux GPU TensorRT CI Pipeline, Linux OpenVINO CI Pipeline, MacOS CI Pipeline, ONNX Runtime Web CI Pipeline, onnxruntime-binary-size-checks-ci-pipeline, Linux QNN CI Pipeline

/azp run Windows CPU CI Pipeline, Windows GPU CI Pipeline, Windows GPU TensorRT CI Pipeline, Windows ARM64 QNN CI Pipeline, orttraining-linux-ci-pipeline, orttraining-linux-gpu-ci-pipeline, orttraining-ortmodule-distributed, Windows x64 QNN CI Pipeline, Linux MIGraphX CI Pipeline, Big Models

/azp run ONNX Runtime React Native CI Pipeline, orttraining-amd-gpu-ci-pipeline, Linux Android Emulator QNN CI Pipeline

Azure Pipelines successfully started running 9 pipeline(s).

Azure Pipelines successfully started running 3 pipeline(s).

Azure Pipelines successfully started running 10 pipeline(s).

@adrianlizarraga (Contributor)

@RuomeiMS It looks like this same feature was already implemented in the past using a different option called CalibMaxIntermediateOutputs. Here's the PR: #17029.

However, that PR was not tested, and several follow-up PRs broke its functionality: #19416

I think all we have to do is add the following new line to collect_data() in calibrate.py (around line 397).

    def collect_data(self, data_reader: CalibrationDataReader):
        while True:
            inputs = data_reader.get_next()
            if not inputs:
                break
            self.intermediate_outputs.append(self.infer_session.run(None, inputs))
            if (
                self.max_intermediate_outputs is not None
                and len(self.intermediate_outputs) == self.max_intermediate_outputs
            ):
                self.compute_data()  # ADD this line: fold the buffered outputs into the running tensor ranges
                self.clear_collected_data()  # existing call: drop the buffered outputs so memory stays bounded

If you do this, then we may be able to just use the CalibMaxIntermediateOutputs option. This PR would be pretty small and it would also add useful unit tests.
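
For reference, here is a rough usage sketch of the option being discussed. It assumes CalibMaxIntermediateOutputs is passed through quantize_static's extra_options the same way as the other Calib* options, as introduced in #17029; the exact key and plumbing in the merged code may differ. The data reader class and the model/input names are hypothetical.

```python
import numpy as np
from onnxruntime.quantization import CalibrationDataReader, QuantFormat, quantize_static


class RandomDataReader(CalibrationDataReader):
    """Hypothetical reader that yields a fixed number of random calibration samples."""

    def __init__(self, input_name, shape, num_samples=64):
        self.input_name = input_name
        self.shape = shape
        self.remaining = num_samples

    def get_next(self):
        if self.remaining == 0:
            return None  # signals the calibrator that calibration data is exhausted
        self.remaining -= 1
        return {self.input_name: np.random.rand(*self.shape).astype(np.float32)}


# Hypothetical model and input names; the extra_options key assumes the option from #17029.
quantize_static(
    "model_fp32.onnx",
    "model_int8.onnx",
    RandomDataReader("input", (1, 8)),
    quant_format=QuantFormat.QDQ,
    extra_options={"CalibMaxIntermediateOutputs": 16},  # buffer at most 16 runs before folding into ranges
)
```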

* reading quant data in chunk and calculate/accumulate ranges for resolving OOM
* one unit test test_stride_effect_on_data_collection using ort style
@RuomeiMS force-pushed the ruomeiyan/strided_calibration branch from c986b09 to d619de0 on June 11, 2024 at 21:49
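
The commit above mentions a unit test, test_stride_effect_on_data_collection. As a rough, self-contained illustration of the property such a test would verify (strided/chunked accumulation producing the same ranges as a single pass over all samples), here is a numpy-only sketch; it does not use the actual ORT test harness, and the function name and constants are hypothetical.

```python
import numpy as np


def test_stride_does_not_change_ranges():
    """Chunked min/max accumulation must match the single-pass result exactly."""
    rng = np.random.default_rng(0)
    samples = [rng.normal(size=(4, 8)).astype(np.float32) for _ in range(10)]

    # Single pass over all samples at once.
    full = np.concatenate(samples)
    expected = (float(full.min()), float(full.max()))

    # Strided pass: fold in chunks of 3 samples, clearing the buffer each time.
    lo, hi = np.inf, -np.inf
    for start in range(0, len(samples), 3):
        chunk = np.concatenate(samples[start:start + 3])
        lo, hi = min(lo, float(chunk.min())), max(hi, float(chunk.max()))

    assert (lo, hi) == expected


test_stride_does_not_change_ranges()
```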
@adrianlizarraga re-triggered the same three /azp pipeline groups twice after the force-push; Azure Pipelines successfully started running all of them (3, 9, and 10 pipelines per group).

@adrianlizarraga (Contributor) left a comment


Looking great!
We need to make sure all linting errors are fixed before we merge. Could you please run the following in your root onnxruntime directory and fix all issues?

lintrunner -a

https://github.com/microsoft/onnxruntime/actions/runs/9478693880/job/26136037422?pr=20949

@RuomeiMS (Contributor, Author) commented Jun 19, 2024

> @RuomeiMS please read the following Contributor License Agreement (CLA). If you agree with the CLA, please reply with the following information.
>
> @microsoft-github-policy-service agree [company="{your company}"]
>
> Options:
>
> • (default - no company specified) I have sole ownership of intellectual property rights to my Submissions and I am not making Submissions in the course of work for my employer.
>   @microsoft-github-policy-service agree
> • (when company given) I am making Submissions in the course of work for my employer (or my employer has intellectual property rights in my Submissions by contract or applicable law). I have permission from my employer to make Submissions and enter into this Agreement on behalf of my employer. By signing below, the defined term “You” includes me and my employer.
>   @microsoft-github-policy-service agree company="Microsoft"
>
> Contributor License Agreement

@microsoft-github-policy-service agree

@RuomeiMS (Contributor, Author) commented

> @RuomeiMS the command you issued was incorrect. Please try again.
>
> Examples are:
>
> @microsoft-github-policy-service agree
>
> and
>
> @microsoft-github-policy-service agree company="your company"

@microsoft-github-policy-service agree

@RuomeiMS force-pushed the ruomeiyan/strided_calibration branch from d570ab4 to 92ab4bf on June 19, 2024 at 19:09
@adrianlizarraga re-triggered the same three /azp pipeline groups once more after the force-push; Azure Pipelines successfully started running all of them (3, 9, and 10 pipelines).

@adrianlizarraga adrianlizarraga merged commit 7cf9263 into microsoft:main Jun 21, 2024
85 checks passed