[Quantization] Apply workaround for crash when using histogram-based calibrators #21972
Previously, the returned Python output (a Python object) was a copy of the output. Other calibrators, not only the histogram-based ones, should also have been affected.
There is only one other calibrator (MinMax). MinMax currently works because it adds ReduceMin/ReduceMax before all graph outputs in the augmented graph, so the bug does not manifest in the MinMax calibrator. All other calibrators are histogram-based and crash because the augmented model has outputs that are also inputs (no reduce op is added before the outputs).
There was a PR a few months ago that changed the way OrtValues are wrapped into numpy objects. The root bug could be related. In any case, the bug is very easy to reproduce without using the quantization tool. It just requires running a model via onnxruntime. Please refer to the issue I filed for an example model: #21922
Description
Motivation and Context
Trying to quantize a model with the percentile, entropy, or distribution calibration methods raises an exception:
The calibrators create an augmented model with all tensors (including model inputs) set as model outputs. The data for outputs that are also model inputs is corrupted as described in #21922. The corrupted data sometimes contains NaN values that cause numpy's histogram utilities to raise an exception.
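
To illustrate the failure mode described above, here is a minimal sketch in plain numpy. It does not use onnxruntime or the quantization tool; the `corrupted` array is hand-made stand-in data, and `histogram_or_error` is a hypothetical helper introduced only for this example. It shows that `np.histogram` raises a `ValueError` when it has to auto-detect the bin range from data containing NaN.

```python
import numpy as np


def histogram_or_error(values, bins=8):
    """Hypothetical helper: return (hist, None) on success or (None, message) on failure."""
    try:
        # With no explicit `range`, numpy derives the bin edges from min/max of the data.
        hist, _edges = np.histogram(values, bins=bins)
        return hist, None
    except ValueError as exc:
        # A NaN makes min/max non-finite, so the auto-detected range is rejected.
        return None, str(exc)


# Stand-in for a well-formed tensor collected during calibration.
clean = np.array([0.1, 0.5, 0.9], dtype=np.float32)

# Stand-in for a corrupted tensor (assumption: a single NaN is enough to trigger the error).
corrupted = np.array([0.1, np.nan, 0.9], dtype=np.float32)

clean_hist, clean_err = histogram_or_error(clean)
bad_hist, bad_err = histogram_or_error(corrupted)

print("clean:", clean_hist, clean_err)
print("corrupted:", bad_hist, bad_err)
```

Whether a given calibration run actually hits this path depends on whether the corrupted buffer happens to contain NaN, which matches the "sometimes" in the description.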