refactor: unify quantization config and integrate quant module into compiler #71

@DingmaomaoBJTU

Summary

The compiler module has its own internal quantization pipeline (calibrate → qdq stages) that duplicates the quant module's functionality with bugs and config mismatches. This needs to be unified.

Key Findings

1. Compiler's internal quantization is broken

CalibrateStage collects calibration ranges via ORT's MinMaxCalibrater, but QDQStage ignores them entirely — it creates a PrecomputedCalibrationReader with all-ones dummy data and passes that to quantize_static(). The calibration work is wasted. The quant module's get_qdq_config() + quantize() does not have this bug.
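A toy sketch in plain Python (no ORT calls) of why the dummy reader matters. It assumes the quantizer derives its ranges from whatever data the reader yields; `minmax_range` and the sample values are made up for illustration.

```python
def minmax_range(batches):
    """Min/max over every value in every batch, the statistic a min-max calibrator tracks."""
    values = [v for batch in batches for v in batch]
    return min(values), max(values)

# Illustrative input values from real calibration samples (made-up numbers).
real_batches = [[-2.0, 0.5], [3.0, 1.0]]
# What an all-ones dummy reader would feed instead.
dummy_batches = [[1.0, 1.0]]

collected = minmax_range(real_batches)   # range the calibration step measured
used = minmax_range(dummy_batches)       # range the Q/DQ step would actually see
```

In this toy case the measured range spans -2.0 to 3.0, while the range seen from the dummy data collapses to a single point, so the scales chosen downstream would not reflect the real inputs.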

2. Duplicate configs with different defaults

QDQConfig (compiler) and WinMLQuantizationConfig (quant) overlap on 4 fields:

  • weight_type: compiler defaulted to int8, quant defaults to uint8. The mismatch caused QNN EP to reject LayerNorm nodes (30 CPU fallback partitions instead of 1 EPContext). Fixed in #232, but the root cause is the duplication.
  • activation_type, per_channel, symmetric: same semantics, different classes.
  • samples: quant defaults to 10, compiler's CalibrationConfig defaults to 100.
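The drift between the two configs can be made visible with a small comparison helper. This is only a sketch: the two dataclasses are hypothetical stand-ins that model just the defaults named above, and `differing_defaults` is an illustrative helper, not part of either module.

```python
from dataclasses import dataclass, fields

@dataclass
class CompilerConfigSketch:      # stand-in for QDQConfig + CalibrationConfig
    weight_type: str = "int8"    # the compiler default before the fix
    samples: int = 100

@dataclass
class QuantConfigSketch:         # stand-in for WinMLQuantizationConfig
    weight_type: str = "uint8"
    samples: int = 10

def differing_defaults(a, b):
    """Return {field: (a_value, b_value)} for shared fields whose values differ."""
    shared = {f.name for f in fields(a)} & {f.name for f in fields(b)}
    return {
        name: (getattr(a, name), getattr(b, name))
        for name in sorted(shared)
        if getattr(a, name) != getattr(b, name)
    }

drift = differing_defaults(CompilerConfigSketch(), QuantConfigSketch())
```

A check like this could run in a test until the duplicate classes are removed, at which point it has nothing left to compare.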

3. compile.quantize=True is misleading

WinMLCompileConfig.to_dict() always emits quantize: True when qdq_config is not None. But DetectStage silently overrides this when Q/DQ nodes already exist (the normal build pipeline path). The flag appears to control behavior but is actually a no-op in production.
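A minimal sketch of the override described above, assuming DetectStage decides by looking for existing Q/DQ nodes. `effective_quantize` is a hypothetical helper; only the ONNX op names `QuantizeLinear` and `DequantizeLinear` are real.

```python
QDQ_OPS = {"QuantizeLinear", "DequantizeLinear"}

def effective_quantize(requested, op_types):
    """Value of the flag after the assumed override: existing Q/DQ nodes win."""
    already_quantized = any(op in QDQ_OPS for op in op_types)
    return requested and not already_quantized

# Normal build pipeline path: the model arrives already quantized.
prequantized = effective_quantize(True, ["QuantizeLinear", "Conv", "DequantizeLinear"])
# Float model: the flag would take effect.
float_model = effective_quantize(True, ["Conv", "Relu"])
```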

Proposed Design

Step 1: Unify quantization config

Make WinMLQuantizationConfig the single source of truth. Remove QDQConfig + CalibrationConfig from compiler. Port missing compiler-only fields (distribution, seed, load_path) to WinMLQuantizationConfig.

Step 2: Replace compiler's calibrate+qdq with quant module call

Replace CalibrateStage + QDQStage with a single QuantizeStage that calls quantize_onnx() from the quant module. Fixes the broken calibration pipeline.
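One possible shape for the stage, sketched with the quantize function injected so that nothing here guesses the real quantize_onnx() signature. The `(model_path, output_path, config)` argument order and the `run` method name are assumptions.

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class QuantizeStage:
    # Injected callable; in the real stage this would be quantize_onnx().
    quantize_fn: Callable[[str, str, Any], Any]

    def run(self, model_path: str, output_path: str, quant_config: Any) -> Any:
        if quant_config is None:
            return None  # no quant config means the stage is skipped
        return self.quantize_fn(model_path, output_path, quant_config)

calls = []

def fake_quantize(model_path, output_path, config):
    """Test double that records its arguments instead of quantizing."""
    calls.append((model_path, output_path, config))
    return {"nodes_quantized": 0}

stage = QuantizeStage(quantize_fn=fake_quantize)
skipped = stage.run("model.onnx", "model.qdq.onnx", None)
result = stage.run("model.onnx", "model.qdq.onnx", {"weight_type": "uint8"})
```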

Step 3: Thread quant_config through CompileContext

Add quant_config: WinMLQuantizationConfig | None to CompileContext. Update context.quantize to check self.quant_config is not None.
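A sketch of the context change, assuming `quantize` is exposed as a property. `QuantConfigStub` is a placeholder for WinMLQuantizationConfig, and `Optional[...]` is used as the equivalent of `| None` so the snippet also runs on older Python versions.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class QuantConfigStub:           # placeholder for WinMLQuantizationConfig
    weight_type: str = "uint8"

@dataclass
class CompileContext:
    quant_config: Optional[QuantConfigStub] = None

    @property
    def quantize(self) -> bool:
        # Derived from the config, so the flag cannot disagree with it.
        return self.quant_config is not None

without_quant = CompileContext()
with_quant = CompileContext(quant_config=QuantConfigStub())
```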

Step 4: Fix serialization round-trip

WinMLCompileConfig.to_dict() / from_dict() should serialize/deserialize via WinMLQuantizationConfig.to_dict() instead of the flat field extraction that caused the int8/uint8 mismatch.
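A round-trip sketch of the nested approach. Both classes are hypothetical stand-ins; the defaults for `activation_type`, `per_channel`, and `symmetric` are assumptions, since only the `weight_type` and `samples` defaults are stated in this issue.

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict, Optional

@dataclass
class QuantConfigSketch:             # stand-in for WinMLQuantizationConfig
    weight_type: str = "uint8"
    activation_type: str = "uint8"   # assumed default
    per_channel: bool = False        # assumed default
    symmetric: bool = False          # assumed default
    samples: int = 10

    def to_dict(self) -> Dict[str, Any]:
        return asdict(self)

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "QuantConfigSketch":
        return cls(**data)

@dataclass
class CompileConfigSketch:           # stand-in for WinMLCompileConfig
    quant_config: Optional[QuantConfigSketch] = None

    def to_dict(self) -> Dict[str, Any]:
        # Delegate to the quant config instead of copying fields one by one.
        nested = self.quant_config.to_dict() if self.quant_config is not None else None
        return {"quant_config": nested}

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "CompileConfigSketch":
        raw = data.get("quant_config")
        nested = QuantConfigSketch.from_dict(raw) if raw is not None else None
        return cls(quant_config=nested)

cfg = CompileConfigSketch(quant_config=QuantConfigSketch(weight_type="int8"))
restored = CompileConfigSketch.from_dict(cfg.to_dict())
```

Because the compile config never names individual quant fields, a field added to the quant config later cannot be dropped or re-defaulted on the way through.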

Step 5: Add build config validation

When both config.quant and config.compile.quant_config are set, raise ValueError — the build pipeline runs quant before compile, so they can't both be active.
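The check itself is small. A sketch, assuming the two settings are reachable as plain values; `validate_build_config` is a hypothetical name.

```python
def validate_build_config(quant, compile_quant_config):
    """Reject build configs that request quantization in both places."""
    if quant is not None and compile_quant_config is not None:
        raise ValueError(
            "config.quant and config.compile.quant_config are mutually exclusive: "
            "the build pipeline runs quant before compile"
        )

def conflicts(quant, compile_quant_config):
    """Convenience wrapper returning True when validation fails."""
    try:
        validate_build_config(quant, compile_quant_config)
    except ValueError:
        return True
    return False
```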

Step 6: Propagate QuantizeResult into build manifest

Thread nodes_quantized, calibration_time_seconds, qdq_insertion_time_seconds from QuantizeResult into the build manifest for observability.
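A sketch of the propagation, assuming the manifest is a plain dict. The `"quantization"` key and `add_quant_metrics` are hypothetical; only the three field names come from this issue.

```python
from dataclasses import dataclass

@dataclass
class QuantizeResultSketch:          # stand-in for QuantizeResult
    nodes_quantized: int
    calibration_time_seconds: float
    qdq_insertion_time_seconds: float

def add_quant_metrics(manifest, result):
    """Return a copy of the manifest with quantization metrics attached."""
    updated = dict(manifest)
    updated["quantization"] = {
        "nodes_quantized": result.nodes_quantized,
        "calibration_time_seconds": result.calibration_time_seconds,
        "qdq_insertion_time_seconds": result.qdq_insertion_time_seconds,
    }
    return updated

original = {"model": "model.onnx"}
manifest = add_quant_metrics(original, QuantizeResultSketch(42, 1.5, 0.25))
```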

Additional: wmk build for pre-exported ONNX models

The build command currently requires a HuggingFace model ID and runs Export → Optimize → Analyze → Quantize → Compile. We also need a path where:

# Build from existing ONNX (skip export/optimize/analyze)
wmk build --onnx model.onnx --device qnn --precision uint8 -o output/

This would:

  • Accept a pre-exported ONNX model as input (--onnx flag)
  • Skip export, optimize, and analyze stages
  • Accept --device (qnn, dml, cpu) and --precision (uint8, int8, fp16, fp32)
  • Map --device + --precision to the appropriate WinMLQuantizationConfig + WinMLCompileConfig
  • Invoke quantize → compile with the unified config

This enables the "I already have an ONNX model, just quantize and compile it for my target device" workflow.
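A rough sketch of the argument handling using argparse. The flag names and choices come from the list above; the `--output` long form, the `plan_stages` helper, and the rule that fp16/fp32 skip the quantize stage are assumptions that would need confirming.

```python
import argparse

DEVICES = ("qnn", "dml", "cpu")
PRECISIONS = ("uint8", "int8", "fp16", "fp32")

def build_parser():
    parser = argparse.ArgumentParser(prog="wmk build")
    parser.add_argument("--onnx", required=True, help="pre-exported ONNX model")
    parser.add_argument("--device", choices=DEVICES, required=True)
    parser.add_argument("--precision", choices=PRECISIONS, required=True)
    parser.add_argument("-o", "--output", required=True, help="output directory")
    return parser

def plan_stages(args):
    """Stages to run for an --onnx build; export/optimize/analyze are skipped."""
    # Assumption: only integer precisions need the quantize stage.
    needs_quant = args.precision in ("uint8", "int8")
    return (["quantize"] if needs_quant else []) + ["compile"]

args = build_parser().parse_args(
    ["--onnx", "model.onnx", "--device", "qnn", "--precision", "uint8", "-o", "output/"]
)
stages = plan_stages(args)
```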

Acceptance Criteria

  • QDQConfig and CalibrationConfig removed from compiler
  • WinMLQuantizationConfig is the single quant config used by both modules
  • Compiler uses quantize_onnx() internally (no more broken calibrate+qdq)
  • wmk build --onnx model.onnx --device qnn --precision uint8 works
  • Build manifest records quantization metrics
  • No regression in wmk perf for all existing models

🤖 Generated with Claude Code

Labels

refactor: Code refactoring
