[Quantization] Apply workaround for crash when using histogram-based calibrators by adrianlizarraga · Pull Request #21972 · microsoft/onnxruntime

adrianlizarraga · 2024-09-03T22:44:38Z

Description

Applies a workaround that prevents the histogram-based calibrators (percentile, entropy, distribution) from crashing. The workaround copies inference outputs that come directly from model inputs. The bug is described in issue #21922, "Corrupted value for model outputs that are also model inputs". This PR does not fix the root bug; it provides a workaround to unblock users who rely on histogram-based calibration.
Adds a unit test that runs all histogram-based calibrators to help catch future regressions. Previously, no unit tests ran these calibration methods.
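
A minimal sketch of the kind of copy the workaround describes, not the PR's actual code. The helper name `copy_passthrough_outputs` and its arguments are hypothetical, and the sketch assumes inference outputs arrive as numpy arrays in the same order as the output names. Overwriting the buffer with NaN only simulates the corruption reported in #21922; how the real corruption arises is not claimed here.

```python
import numpy as np


def copy_passthrough_outputs(output_names, input_names, outputs):
    """Copy only the outputs whose names are also model input names (hypothetical helper)."""
    input_name_set = set(input_names)
    fixed = []
    for name, value in zip(output_names, outputs):
        if name in input_name_set:
            # Own the data so later changes to the original buffer cannot affect it.
            fixed.append(np.array(value, copy=True))
        else:
            fixed.append(value)
    return fixed


# "x" is both a model input and an output of the (imagined) augmented model.
shared = np.array([1.0, 2.0, 3.0], dtype=np.float32)
computed = np.array([4.0], dtype=np.float32)
fixed = copy_passthrough_outputs(["x", "y"], ["x"], [shared, computed])

shared[:] = np.nan  # simulated corruption of the original buffer
```

After the simulated corruption, `fixed[0]` still holds the original values because it no longer shares memory with `shared`, while `fixed[1]` is passed through without a copy.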

Motivation and Context

Trying to quantize a model with the percentile, entropy, or distribution calibration methods raises an exception:

  File "/.../site-packages/onnxruntime/quantization/quantize.py", line 691, in quantize
    quantize_static(
  File "/.../site-packages/onnxruntime/quantization/quantize.py", line 525, in quantize_static
    calibrator.collect_data(calibration_data_reader)
  File "/.../site-packages/onnxruntime/quantization/calibrate.py", line 571, in collect_data
    self.collector.collect(clean_merged_dict)
  File "/.../site-packages/onnxruntime/quantization/calibrate.py", line 746, in collect
    return self.collect_value(name_to_arr)
  File "/.../site-packages/onnxruntime/quantization/calibrate.py", line 836, in collect_value
    hist, hist_edges = np.histogram(data_arr, self.num_bins, range=(-threshold, threshold))
  File "<__array_function__ internals>", line 180, in histogram
  File ".../site-packages/numpy/lib/histograms.py", line 793, in histogram
    bin_edges, uniform_bins = _get_bin_edges(a, bins, range, weights)
  File "/.../site-packages/numpy/lib/histograms.py", line 426, in _get_bin_edges
    first_edge, last_edge = _get_outer_edges(a, range)
  File "/.../site-packages/numpy/lib/histograms.py", line 315, in _get_outer_edges
    raise ValueError(
ValueError: supplied range of [nan, nan] is not finite

The calibrators create an augmented model with all tensors (including model inputs) set as model outputs. The data for outputs that are also model inputs is corrupted as described in #21922. The corrupted data sometimes contains NaN values that cause numpy's histogram utilities to raise an exception.
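
The failure can be sketched with numpy alone. The helper `symmetric_histogram` is hypothetical; it assumes the threshold is the largest absolute value in the data, which is a guess at how `threshold` in the traceback is derived rather than a quote of `calibrate.py`.

```python
import numpy as np


def symmetric_histogram(data_arr, num_bins=128):
    """Histogram over a symmetric range, similar to the call in the traceback (hypothetical helper)."""
    # Assumption: threshold is the largest absolute value in the data.
    min_value = float(np.min(data_arr))
    max_value = float(np.max(data_arr))
    threshold = max(abs(min_value), abs(max_value))
    return np.histogram(data_arr, num_bins, range=(-threshold, threshold))


clean = np.array([0.5, -1.25, 2.0], dtype=np.float32)
hist, hist_edges = symmetric_histogram(clean)

# A single NaN propagates through np.min/np.max into the threshold.
corrupted = clean.copy()
corrupted[1] = np.nan
try:
    symmetric_histogram(corrupted)
    error_message = None
except ValueError as exc:
    error_message = str(exc)
```

With clean data every value lands in a bin. With one NaN in the data the range itself becomes non-finite, so `np.histogram` raises `ValueError` before any binning happens.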

yufenglee · 2024-09-06T16:58:47Z

Previously, the returned Python output (Python object) was a copy of the output. Other calibrators, not only the histogram-based ones, should also have been impacted.

adrianlizarraga · 2024-09-06T17:04:32Z

> Previously, the returned Python output (Python object) was a copy of the output. Other calibrators, not only the histogram-based ones, should also have been impacted.

There is only one other calibrator (MinMax). MinMax currently works because it adds ReduceMin/ReduceMax nodes before all graph outputs in the augmented graph, so this bug does not manifest in the MinMax calibrator.

All other calibrators are histogram-based and crash because the augmented model has outputs that are also inputs (no reduce op is added before the outputs).
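
The difference between the two augmentation strategies can be sketched by comparing graph output names. This is a plain-Python illustration, not onnxruntime code: the function names, the tensor names, and the `_ReduceMin`/`_ReduceMax` suffixes are assumptions made for the example.

```python
def minmax_augmented_outputs(tensor_names):
    """Graph outputs when a ReduceMin and a ReduceMax node follow every tensor."""
    outputs = []
    for name in tensor_names:
        # Assumed naming: each reduce node produces a new, distinct tensor.
        outputs.extend([f"{name}_ReduceMin", f"{name}_ReduceMax"])
    return outputs


def histogram_augmented_outputs(tensor_names):
    """Graph outputs when every tensor is exposed directly as a graph output."""
    return list(tensor_names)


def outputs_that_are_inputs(output_names, input_names):
    """Names that appear both as graph outputs and as graph inputs."""
    return sorted(set(output_names) & set(input_names))


model_inputs = ["input_0"]
all_tensors = ["input_0", "conv_out", "relu_out"]

minmax_overlap = outputs_that_are_inputs(
    minmax_augmented_outputs(all_tensors), model_inputs
)
histogram_overlap = outputs_that_are_inputs(
    histogram_augmented_outputs(all_tensors), model_inputs
)
```

Here `minmax_overlap` is empty, while `histogram_overlap` contains `input_0`, which is the kind of output affected by #21922.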

> Previously, the returned Python output (Python object) was a copy of the output.

There was a PR a few months ago that changed the way OrtValues are wrapped into numpy objects. The root bug could be related. In any case, the bug is very easy to reproduce without using the quantization tool. It just requires running a model via onnxruntime. Please refer to the issue I filed for an example model: #21922

Apply workaround for crash when using histogram-based calibrators

1902097

adrianlizarraga added the quantization issues related to quantization label Sep 3, 2024

adrianlizarraga added 5 commits September 4, 2024 10:16

Merge branch 'main' into adrianl/fix-histogram-based-quantization-calibration

84e3312

Merge branch 'main' into adrianl/fix-histogram-based-quantization-calibration

de10074

Add unit test to check that all histogram calibrators actually run

ef80e59

Dont hardcode number of tensors in graph

88a268b

Dont call append multiple times

bffc8b9

adrianlizarraga marked this pull request as ready for review September 6, 2024 00:34

adrianlizarraga requested review from chilo-ms, jywu-msft and yufenglee September 6, 2024 00:34

adrianlizarraga added the ep:QNN issues related to QNN execution provider label Sep 6, 2024

yufenglee reviewed Sep 6, 2024

Comment thread onnxruntime/python/tools/quantization/calibrate.py

yufenglee approved these changes Sep 9, 2024

adrianlizarraga merged commit c7ae9b9 into main Sep 9, 2024

adrianlizarraga deleted the adrianl/fix-histogram-based-quantization-calibration branch September 9, 2024 19:05

adrianlizarraga merged 6 commits into main from adrianl/fix-histogram-based-quantization-calibration
