Quantize: Implement functional e2e test cases and fix issues found during test #608
Merged
xieofxie reviewed May 13, 2026
timenick reviewed May 13, 2026
xieofxie approved these changes May 14, 2026
timenick approved these changes May 14, 2026
Fixes #504
Fix issue
Issue 1
When the user specifies `--precision int16` together with a config.json that sets neither `weight_type` nor `activation_type`, quantization uses uint8 for both weights and activations. This happens because the config resolution logic takes the default values of the data classes instead of the value the user specified on the CLI. The fix is to load only the values the user actually specified in the config JSON file.
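The failure mode in Issue 1 can be illustrated with a minimal sketch. The `QuantConfig` data class, its default values, the `PRESETS` table, and both `resolve_*` functions are hypothetical names made up for this illustration, not the project's actual code; the relative order of preset and config-file values is also an assumption. The point is only that materializing a data class from a partial JSON object fills in defaults, and those defaults can then shadow what the user asked for on the CLI.

```python
from dataclasses import asdict, dataclass


@dataclass
class QuantConfig:
    # Hypothetical defaults, chosen to mirror the uint8 fallback described above.
    weight_type: str = "uint8"
    activation_type: str = "uint8"
    samples: int = 100


# Hypothetical preset table; only the int16 entry matters for this example.
PRESETS = {"int16": {"weight_type": "int16", "activation_type": "int16"}}


def resolve_buggy(precision: str, config_json: dict) -> dict:
    """Reproduce the bug: data class defaults overwrite the CLI preset."""
    resolved = dict(PRESETS.get(precision, {}))
    # Building the data class fills in defaults for keys the user never wrote,
    # and update() then copies those defaults over the preset values.
    resolved.update(asdict(QuantConfig(**config_json)))
    return resolved


def resolve_fixed(precision: str, config_json: dict) -> dict:
    """Start from defaults, apply the preset, then only keys present in the JSON."""
    resolved = asdict(QuantConfig())
    resolved.update(PRESETS.get(precision, {}))
    resolved.update(config_json)  # only user-specified keys are applied
    return resolved


print(resolve_buggy("int16", {}))  # weight_type falls back to uint8
print(resolve_fixed("int16", {}))  # weight_type stays int16
```

With an empty config object, the buggy variant reports uint8 types while the fixed variant keeps the int16 preset, which is the behavior the `--precision int16` regression case in the test list checks for.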
Issue 2
The `winml quantize` command doesn't take a model id, so calibration always uses RandomDataset. The fix is to add the `--model-id` option.
This doesn't affect the `build` command, because it calls the internal `quantize_onnx` Python method directly, and that method already has a model id parameter.
Issue 3
When the user provides a non-existent output directory, the command fails. The fix is to create that directory.
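A minimal sketch of the kind of fix described for Issue 3, assuming the output option is a file path whose parent directory may not exist yet. The helper name `ensure_output_dir` is hypothetical and not taken from the PR; it only shows the standard-library call that creates missing parent directories.

```python
from pathlib import Path


def ensure_output_dir(output_path: str) -> Path:
    """Create the parent directory of output_path if it is missing."""
    path = Path(output_path)
    # parents=True creates intermediate directories such as missing/nested;
    # exist_ok=True makes the call a no-op when the directory already exists.
    path.parent.mkdir(parents=True, exist_ok=True)
    return path
```

Calling it before any code that changes into or writes to the output directory avoids a failure on a path like `<tmp>/missing/nested/custom.onnx`.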
Add e2e test
Precision routing

| Command | Expected |
| --- | --- |
| `winml quantize -m <tiny> --samples 4` | UINT8; input metadata survives |
| `winml quantize -m <tiny> --precision int8 --samples 4` | UINT8 (preset → uint8/uint8) |
| `winml quantize -m <tiny> --precision int16 --samples 4` | INT16 (skip ORT inference) |
| `winml quantize -m <tiny> --precision w8a16 --samples 4` | UINT16 (skip ORT inference) |
| `winml quantize -m <tiny> --precision int8 --weight-type int8 --activation-type uint8 --samples 4` | INT8 (explicit beats preset) |
| `winml quantize -m <tiny> --precision fp16 --samples 4` | UINT8 (documented silent fallback) |

Calibration method

| Command |
| --- |
| `winml quantize -m <tiny> --method minmax --samples 4` |
| `winml quantize -m <tiny> --method entropy --samples 4` |
| `winml quantize -m <tiny> --method percentile --samples 4` |

Quant options

| Command |
| --- |
| `winml quantize -m <tiny> --per-channel --samples 4` |
| `winml quantize -m <tiny> --symmetric --weight-type int8 --samples 4` |

Per-task calibration datasets (one per `TASK_DATASET_MAPPING` class)
Each row constructs the real `DatasetCalibrationReader` instance for the given task using the HuggingFace preprocessor of the supplied model.

| Command | Expected |
| --- | --- |
| `winml quantize -m <tiny> --task random --samples 4` | RandomDataset path |
| `winml quantize -m <onnx_imgcls> --task image-classification --model-id microsoft/resnet-50 --samples 4` | AutoImageProcessor, ImageDataset path |
| `winml quantize -m <onnx_txtcls> --task text-classification --model-id Intel/bert-base-uncased-mrpc --samples 4` | AutoTokenizer, TextDataset path |
| `winml quantize -m <onnx_objdet> --task object-detection --model-id hustvl/yolos-small --samples 4` | ObjectDetectionDataset path |
| `winml quantize -m <onnx_imgseg> --task image-segmentation --model-id nvidia/segformer-b0-finetuned-ade-512-512 --samples 4` | ImageSegmentationDataset path |
| `winml quantize -m <tiny> --task automatic-speech-recognition --samples 4` | `falling back to RandomDataset` |

Output behavior

| Command | Expected |
| --- | --- |
| `winml quantize -m <tiny> --samples 4` | `<tiny.parent>/<tiny.stem>_qdq.onnx` |
| `winml quantize -m <tiny> -o <tmp>/out/custom.onnx --samples 4` | `-o` path |
| `winml quantize -m <tiny> -o <tmp>/missing/nested/custom.onnx --samples 4` | succeeds; previously raised `FileNotFoundError` from `os.chdir` (Issue 3) |
| `winml quantize -m <tiny_ext>.onnx -o <tmp>/out/quant_ext.onnx --samples 4` | `quant_ext.onnx` and `quant_ext.onnx.data` exist |

Build-config precedence (CLI vs config file)

| Command | bc.json | Expected |
| --- | --- | --- |
| `winml quantize -m <tiny> --config bc.json --samples 4` | `{"quant":{"samples":50,"calibration_method":"entropy"}}` | `Samples: 4` (CLI wins) and `Method: entropy` (config used) |
| `winml quantize -m <tiny> --config bc.json --precision int16 --samples 4` | `{"quant":{}}` | INT16 (regression: explicit `--precision` beats empty config) |

Build-config key absorption sweep
For each `quant.*` key, assert it is consumed when the CLI omits it. Verified by structural inspection of the produced model (not stdout).

| Command | bc.json | Expected |
| --- | --- | --- |
| `winml quantize -m <tiny> --config bc.json --samples 4` | `{"quant":{"weight_type":"int8"}}` | INT8 |
| `winml quantize -m <tiny> --config bc.json --samples 4` | `{"quant":{"per_channel":true}}` | |
| `winml quantize -m <tiny> --config bc.json --samples 4` | `{"quant":{"symmetric":true,"weight_type":"int8"}}` | |
| `winml quantize -m <tiny> --config bc.json --samples 4` | `{"quant":{"task":"automatic-speech-recognition"}}` | `falling back to RandomDataset` (config task flowed to dataset selection) |

Verbose

| Command | Expected |
| --- | --- |
| `winml quantize -m <tiny> -o <out> --samples 4 -v` | vs same without `-v` |

Errors

| Command | Expected |
| --- | --- |
| `winml quantize` | `Missing option .*--model` |
| `winml quantize -m <tmp>/nope.onnx` | `does not exist` |
| `winml quantize -m <tiny> --method gaussian` | `Invalid value for '--method'` |
| `winml quantize -m <tiny> --weight-type float8` | `Invalid value for '--weight-type'` |
| `winml quantize -m <bad>.onnx --samples 4` | `Quantization failed` AND a parse-related substring (parse/protobuf/decode/load/invalid) |

Total: 6 + 3 + 2 + 6 + 4 + 2 + 4 + 1 + 5 = 33 cases. All passing.
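
Two of the error cases above combine a fixed message with a looser pattern, and the checks could be sketched roughly as follows. The helper names are hypothetical, and lowercasing the output before matching is an assumption; the PR does not say whether the real assertions are case-sensitive.

```python
import re

# Substrings taken from the last error case in the list above.
PARSE_HINTS = ("parse", "protobuf", "decode", "load", "invalid")


def looks_like_parse_failure(output: str) -> bool:
    """True when the output reports a failed quantization and hints at a parse problem."""
    lowered = output.lower()
    return "quantization failed" in lowered and any(
        hint in lowered for hint in PARSE_HINTS
    )


def reports_missing_model_option(output: str) -> bool:
    """True when the output matches the 'Missing option .*--model' pattern."""
    return re.search(r"Missing option .*--model", output) is not None
```

Requiring both conditions for the corrupt-model case keeps the test from passing on an unrelated failure that merely prints `Quantization failed`.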