[Quantization] Apply workaround for crash when using histogram-based calibrators #21972

Merged: adrianlizarraga merged 6 commits into main from adrianl/fix-histogram-based-quantization-calibration on Sep 9, 2024

@adrianlizarraga
Contributor

@adrianlizarraga adrianlizarraga commented Sep 3, 2024

Description

  • Applies a workaround that prevents the histogram-based calibrators (percentile, entropy, distribution) from crashing. The workaround involves copying inference outputs that come directly from model inputs. A description of the bug is here: Corrupted value for model outputs that are also model inputs #21922. This PR does not fix the root bug, but instead provides a workaround to unblock users using histogram-based calibration.
  • Adds a unit test that runs all histogram-based calibrators to help catch future regressions. We didn't have unit tests that ran these calibration methods.
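The copy-based workaround described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the function name and its arguments are hypothetical, it assumes the inference results arrive as a list parallel to the output names, and `copy.deepcopy` stands in for whatever copy the actual patch performs.

```python
import copy


def copy_outputs_that_are_inputs(output_names, outputs, input_names):
    """Return the outputs, copying any value whose name is also a model input.

    Hypothetical helper: outputs that merely forward a model input get their
    own copy, so later reads do not depend on the original buffer.
    """
    input_name_set = set(input_names)
    return [
        copy.deepcopy(value) if name in input_name_set else value
        for name, value in zip(output_names, outputs)
    ]


# Example: "x" is both a model input and an output of the augmented model.
x_data = [1.0, 2.0]
y_data = [3.0]
safe_outputs = copy_outputs_that_are_inputs(["x", "y"], [x_data, y_data], ["x"])
```

Only the aliased outputs are copied, which keeps the extra memory traffic limited to tensors that are affected by the bug.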

Motivation and Context

Trying to quantize a model with the percentile, entropy, or distribution calibration methods raises an exception:

  File "/.../site-packages/onnxruntime/quantization/quantize.py", line 691, in quantize
    quantize_static(
  File "/.../site-packages/onnxruntime/quantization/quantize.py", line 525, in quantize_static
    calibrator.collect_data(calibration_data_reader)
  File "/.../site-packages/onnxruntime/quantization/calibrate.py", line 571, in collect_data
    self.collector.collect(clean_merged_dict)
  File "/.../site-packages/onnxruntime/quantization/calibrate.py", line 746, in collect
    return self.collect_value(name_to_arr)
  File "/.../site-packages/onnxruntime/quantization/calibrate.py", line 836, in collect_value
    hist, hist_edges = np.histogram(data_arr, self.num_bins, range=(-threshold, threshold))
  File "<__array_function__ internals>", line 180, in histogram
  File ".../site-packages/numpy/lib/histograms.py", line 793, in histogram
    bin_edges, uniform_bins = _get_bin_edges(a, bins, range, weights)
  File "/.../site-packages/numpy/lib/histograms.py", line 426, in _get_bin_edges
    first_edge, last_edge = _get_outer_edges(a, range)
  File "/.../site-packages/numpy/lib/histograms.py", line 315, in _get_outer_edges
    raise ValueError(
ValueError: supplied range of [nan, nan] is not finite

The calibrators create an augmented model with all tensors (including model inputs) set as model outputs. The data for outputs that are also model inputs is corrupted as described in #21922. The corrupted data sometimes contains NaN values that cause numpy's histogram utilities to raise an exception.
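The failure mode can be reproduced with NumPy alone. The sketch below is an assumption-laden illustration rather than the code in calibrate.py: the helper name is hypothetical, and the threshold is computed here as the largest absolute value in the array. A single NaN in the data makes the threshold NaN, so the `range` passed to `np.histogram` is not finite.

```python
import numpy as np


def histogram_or_error(data_arr, num_bins=8):
    """Build a symmetric histogram; return (hist, None) or (None, error message)."""
    # NaN propagates through abs/max, so one corrupted element poisons the threshold.
    threshold = float(np.max(np.abs(data_arr)))
    try:
        hist, _ = np.histogram(data_arr, num_bins, range=(-threshold, threshold))
        return hist, None
    except ValueError as exc:
        return None, str(exc)


clean = np.array([0.5, -1.0, 2.0], dtype=np.float32)
corrupted = np.array([0.5, np.nan, 2.0], dtype=np.float32)

clean_hist, clean_err = histogram_or_error(clean)
bad_hist, bad_err = histogram_or_error(corrupted)
```

With the clean array the histogram is built normally; with the corrupted array NumPy raises the same kind of `ValueError` shown in the traceback.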

@adrianlizarraga adrianlizarraga added the "quantization" label (issues related to quantization) Sep 3, 2024
@adrianlizarraga adrianlizarraga marked this pull request as ready for review September 6, 2024 00:34
@adrianlizarraga adrianlizarraga added the "ep:QNN" label (issues related to QNN execution provider) Sep 6, 2024
@yufenglee
Member

Previously, the returned Python output (a Python object) was a copy of the output. Other calibrators, not only the histogram-based ones, should also have been affected.

@adrianlizarraga
Contributor Author

adrianlizarraga commented Sep 6, 2024

> The returned python output(python object) is a copy of the output previously. Not only the histogram-based calibrators, other calibrators should also have been impacted.

There is only one other calibrator (MinMax). MinMax currently works because it adds ReduceMin/ReduceMax nodes before all graph outputs in the augmented graph, so the bug does not manifest in the MinMax calibrator.

All other calibrators are histogram-based and crash because the augmented model has outputs that are also inputs (no reduce op added before outputs).
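The difference between the two augmentation styles can be illustrated with a toy model of graph input and output names. This is only a sketch: the functions and the `_ReduceMin`/`_ReduceMax` output naming are hypothetical and do not reproduce the quantization tool's graph-building code.

```python
def histogram_style_outputs(graph_inputs, intermediate_tensors):
    """Every tensor, including each model input, is exposed directly as an output."""
    return list(graph_inputs) + list(intermediate_tensors)


def minmax_style_outputs(graph_inputs, intermediate_tensors):
    """Each tensor feeds ReduceMin/ReduceMax nodes; only the reduced values are outputs."""
    outputs = []
    for tensor in list(graph_inputs) + list(intermediate_tensors):
        outputs.append(tensor + "_ReduceMin")
        outputs.append(tensor + "_ReduceMax")
    return outputs


def outputs_aliasing_inputs(graph_inputs, graph_outputs):
    """List the graph outputs that are also graph inputs (the pattern that triggers the bug)."""
    input_set = set(graph_inputs)
    return [name for name in graph_outputs if name in input_set]


inputs = ["x"]
intermediates = ["conv_out"]
aliased_hist = outputs_aliasing_inputs(inputs, histogram_style_outputs(inputs, intermediates))
aliased_minmax = outputs_aliasing_inputs(inputs, minmax_style_outputs(inputs, intermediates))
```

In this toy model the histogram-style augmentation leaves `x` as both an input and an output, while the MinMax-style augmentation has no output that is also an input.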

> The returned python output(python object) is a copy of the output previously.

There was a PR a few months ago that changed the way OrtValues are wrapped into numpy objects. The root bug could be related. In any case, the bug is very easy to reproduce without using the quantization tool. It just requires running a model via onnxruntime. Please refer to the issue I filed for an example model: #21922

Comment thread onnxruntime/python/tools/quantization/calibrate.py
@adrianlizarraga adrianlizarraga merged commit c7ae9b9 into main Sep 9, 2024
@adrianlizarraga adrianlizarraga deleted the adrianl/fix-histogram-based-quantization-calibration branch September 9, 2024 19:05