bug: winml build shows misleading ValueError (opset domain) when disk is full during quantize step #259

@DingmaomaoBJTU

Summary

When disk space is exhausted during winml build, the optimize step writes a corrupted or truncated ONNX file. The subsequent quantize step then fails with ValueError: Failed to find proper ai.onnx domain, which gives no indication that the real cause is a full disk. Users are likely to spend time debugging a code problem that does not exist.

Repro

# Fill disk to near-zero free space, then:
winml build -c config.json -m microsoft/resnet-50 -o model_a_config/

Error

ValueError: Failed to find proper ai.onnx domain in
onnxruntime.quantization.quant_utils.get_opset_version

Stack trace: quantizer.py:166 → quantize.py:909 → quantize.py:702 → quant_utils.py:984 → quant_utils.py:977 → ValueError

Root Cause

The optimize step writes a truncated/zero-byte ONNX output due to OSError (disk full). The corrupted file is passed to the quantizer, which fails trying to parse the opset domain. The OSError is swallowed or not surfaced to the user.
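One way to stop the corrupted file from reaching the quantizer is a guard between the two steps. The sketch below is only an illustration: `assert_usable_model_file` is a hypothetical helper, not an existing winml or onnxruntime function, and it only catches missing or zero-byte files. A partially written file would still pass; a stricter check could load the file with the `onnx` package and confirm that `opset_import` is non-empty.

```python
from pathlib import Path


def assert_usable_model_file(model_path, min_bytes=1):
    """Fail early if the optimize step left no usable output file.

    Hypothetical helper for illustration; not part of winml.
    Returns the file size in bytes when the check passes.
    """
    path = Path(model_path)
    if not path.is_file():
        raise FileNotFoundError(f"Optimized model not found: {path}")

    size = path.stat().st_size
    if size < min_bytes:
        # A zero-byte or tiny file usually means the write was cut short,
        # for example because the disk filled up during the optimize step.
        raise ValueError(
            f"{path} is only {size} bytes; the optimize step probably "
            "failed to write it (is the disk full?)"
        )
    return size
```

Calling this right before quantization would turn the opset-domain ValueError into a message that points at the output file itself.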

Expected behavior

ModelKit should either:

  1. Check available disk space before winml build starts and warn if below a threshold, or
  2. Catch OSError during ONNX file writes and surface a clear error: "Insufficient disk space — unable to write output file."
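Both options can be sketched with the Python standard library. This is a rough illustration under stated assumptions, not the actual winml code: `check_free_space`, `write_output`, and `InsufficientDiskSpaceError` are hypothetical names, and the real write path inside the optimize step may look quite different.

```python
import errno
import shutil
from pathlib import Path


class InsufficientDiskSpaceError(RuntimeError):
    """Hypothetical error type for illustration; not part of winml."""


def check_free_space(output_dir, min_free_bytes):
    """Option 1: raise if the target volume has less than min_free_bytes free."""
    path = Path(output_dir).resolve()
    # disk_usage needs an existing path, so walk up to the nearest ancestor
    # that exists (the output directory may not have been created yet).
    while not path.exists():
        path = path.parent
    free = shutil.disk_usage(path).free
    if free < min_free_bytes:
        raise InsufficientDiskSpaceError(
            f"Only {free} bytes free at {path}; need at least {min_free_bytes}."
        )
    return free


def write_output(path, data):
    """Option 2: write bytes and turn a disk-full OSError into a clear error."""
    try:
        with open(path, "wb") as f:
            f.write(data)
    except OSError as exc:
        if exc.errno == errno.ENOSPC:
            # Remove the partial file so later steps cannot pick it up.
            Path(path).unlink(missing_ok=True)
            raise InsufficientDiskSpaceError(
                f"Insufficient disk space: unable to write output file {path}"
            ) from exc
        raise  # any other OSError is left untouched
```

A preflight call such as `check_free_space("model_a_config/", 20 * 1000**3)` would use the ~20 GB figure from the notes as the threshold; that value is a suggestion from this report, not a measured requirement.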

Notes

  • Severity: P2
  • Reporter: Agency bug bash feedback
  • Context: Full bug bash consumes significant disk — venv (~1.4 GB), HuggingFace cache (~14.4 GB), per-model artifacts (~200–400 MB each). Minimum ~20 GB free disk recommended.
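For anyone sizing a machine, the figures above can be turned into a rough budget. The sketch below only restates the numbers from this report (treating 1 GB as 1000 MB); the function name and the model count in the usage note are assumptions, and actual usage will vary by model.

```python
# Sizes reported in this issue, in MB (1 GB treated as 1000 MB).
VENV_MB = 1_400
HF_CACHE_MB = 14_400
PER_MODEL_MB_RANGE = (200, 400)


def estimate_disk_mb(num_models):
    """Return a (low, high) disk estimate in MB for a run with num_models models.

    Hypothetical helper for illustration; the inputs are the approximate
    figures from the bug bash, not measured guarantees.
    """
    fixed = VENV_MB + HF_CACHE_MB
    low = fixed + num_models * PER_MODEL_MB_RANGE[0]
    high = fixed + num_models * PER_MODEL_MB_RANGE[1]
    return low, high
```

With a hypothetical run of 10 models, `estimate_disk_mb(10)` gives a range of 17,800 to 19,800 MB, which is consistent with the ~20 GB recommendation.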

Metadata

Labels

  • P2: Medium (minor bug or non-critical improvement)
  • bug: Something isn't working
  • dev experience: Developer experience improvements
  • good first issue: Good for newcomers
  • triaged: Issue has been triaged
