Description
Long-running commands that trigger large downloads (calibration datasets, model weights) start the download silently with no upfront warning, no size estimate, and no time estimate. A developer on slow or metered connectivity has no chance to abort before the transfer is underway.
Confirmed reproduction with winml build + calibration dataset:
winml build downloads the entire timm/mini-imagenet dataset (~7 GB — 13 train + 3 validation + 2 test parquet files) even though the config specifies "samples": 10. Quantization took 896 s (~15 min), almost entirely spent downloading. The user sees no warning and no estimated time before this begins.
Steps to Reproduce
winml build -c config.json -m ProsusAI/finbert -o output/
Where config.json specifies:
{ "quant": { "dataset_name": "timm/mini-imagenet", "samples": 10 } }
Expected Behavior
Before any large download begins, the CLI prints a warning with size and estimated time:
⚠ Downloading calibration dataset timm/mini-imagenet (~7.0 GB).
Estimated time on 10 Mbps: ~95 min | 100 Mbps: ~10 min
Press Ctrl+C to cancel.
If the dataset size cannot be determined ahead of time, at minimum print the dataset name and that it may be large, before streaming begins.
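The warning text and its time estimates can be derived from the byte count alone. The following is a minimal sketch, not winml code: `estimate_minutes` and `format_download_warning` are hypothetical helper names, and the link speeds are the two example values from the mock-up above.

```python
from typing import Optional, Sequence


def estimate_minutes(size_bytes: float, link_mbps: float) -> float:
    """Transfer time in minutes for size_bytes over a link of link_mbps megabits/s."""
    bits = size_bytes * 8
    seconds = bits / (link_mbps * 1_000_000)
    return seconds / 60


def format_download_warning(
    name: str,
    size_bytes: Optional[int],
    link_speeds_mbps: Sequence[int] = (10, 100),
) -> str:
    """Build the pre-download warning text (hypothetical helper, not winml API)."""
    if size_bytes is None:
        # Size unknown: fall back to naming the dataset and flagging the risk.
        return (
            f"⚠ Downloading calibration dataset {name} (size unknown, may be large).\n"
            "Press Ctrl+C to cancel."
        )
    size_gb = size_bytes / 1e9
    estimates = " | ".join(
        f"{mbps} Mbps: ~{round(estimate_minutes(size_bytes, mbps))} min"
        for mbps in link_speeds_mbps
    )
    return (
        f"⚠ Downloading calibration dataset {name} (~{size_gb:.1f} GB).\n"
        f"Estimated time on {estimates}\n"
        "Press Ctrl+C to cancel."
    )


if __name__ == "__main__":
    print(format_download_warning("timm/mini-imagenet", 7_000_000_000))
```

For 7 GB the raw arithmetic gives about 93 min at 10 Mbps and about 9 min at 100 Mbps; the figures in the mock-up above are the same estimates rounded up.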
The same pre-download warning should apply to:
- Model weight downloads (winml build -m <huggingface_id>)
- Any other command that triggers a network fetch larger than a configurable threshold (e.g., 500 MB)
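A threshold gate plus a short abort window could be sketched as follows. This is an illustration under assumptions: `should_warn`, `wait_for_abort`, and `DEFAULT_THRESHOLD_BYTES` are hypothetical names, the 500 MB default is the example value given above, and the 3-second pause is the value proposed later in this issue. Treating an unknown size as "warn" is a design choice, not something the issue mandates.

```python
import time
from typing import Callable, Optional

DEFAULT_THRESHOLD_BYTES = 500 * 1_000_000  # 500 MB, the example value from the issue


def should_warn(
    size_bytes: Optional[int],
    threshold_bytes: int = DEFAULT_THRESHOLD_BYTES,
) -> bool:
    """Warn when the size exceeds the threshold, or when it is unknown."""
    if size_bytes is None:
        return True
    return size_bytes > threshold_bytes


def wait_for_abort(
    delay_s: float = 3.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Pause for delay_s seconds. Return True to proceed, False if Ctrl+C was pressed."""
    try:
        sleep(delay_s)
    except KeyboardInterrupt:
        return False
    return True
```

The `sleep` parameter is injectable only so the pause can be tested without actually waiting.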
Actual Behavior
No warning is printed. The download starts immediately and silently inside the quantize StageLive block. The first visible signal is the spinner; the user has no indication of how long it will run or how much data will be transferred.
Root Cause (initial analysis)
The datasets library streams or caches parquet shards for the full dataset split regardless of how many samples are consumed downstream. The quantize_onnx call in _run_quantize_stage (build.py) does not query dataset size before fetching, and StageLive suppresses datasets progress bars to keep the display clean — removing the only secondary signal the user might have seen.
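The missing size query could look roughly like the sketch below. It assumes the `huggingface_hub` package is available and uses `HfApi.dataset_info(..., files_metadata=True)`, which returns per-file sizes in `siblings`. The helper names `total_size_bytes` and `query_dataset_size` are hypothetical, and the sum covers every file in the repo, so it is an upper bound on what a single split would download.

```python
from typing import Iterable, Optional


def total_size_bytes(files: Iterable[object]) -> Optional[int]:
    """Sum the .size attribute of file entries; None if empty or any size is missing."""
    total = 0
    seen = 0
    for f in files:
        size = getattr(f, "size", None)
        if size is None:
            return None
        total += size
        seen += 1
    return total if seen else None


def query_dataset_size(repo_id: str) -> Optional[int]:
    """Look up the total repo size on the Hugging Face Hub (needs network access)."""
    # Imported lazily so total_size_bytes stays usable without the dependency.
    from huggingface_hub import HfApi

    info = HfApi().dataset_info(repo_id, files_metadata=True)
    return total_size_bytes(info.siblings or [])
```

Returning `None` rather than a partial sum lets the caller fall back to the "size unknown" warning instead of printing a misleadingly small number.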
Two independent fixes are needed:
- Pre-download warning (UX): Query the Hugging Face Hub API for dataset / model size before fetching and print a structured warning with size + time estimate. Block for 3 s (or until Ctrl+C) to give the user a chance to abort.
- Lazy / partial download (efficiency): Investigate whether datasets streaming mode or shard-level access can be used to fetch only the N calibration samples without pulling all parquet files first. If feasible, this eliminates the problem for the samples-bounded case entirely.
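For the lazy-download direction, one option to evaluate is `load_dataset(..., streaming=True)` combined with `itertools.islice`. The sketch below is a starting point for that investigation, not a confirmed fix: `take_samples` and `stream_calibration_samples` are hypothetical names, and whether streaming actually avoids fetching all shards for a given dataset still has to be measured.

```python
from itertools import islice
from typing import Iterable, List, TypeVar

T = TypeVar("T")


def take_samples(rows: Iterable[T], n: int) -> List[T]:
    """Consume at most n rows from a (possibly streaming) iterable."""
    return list(islice(rows, n))


def stream_calibration_samples(dataset_name: str, n: int, split: str = "train") -> list:
    """Fetch n rows via datasets streaming mode (needs network access)."""
    # Imported lazily; streaming=True returns an iterable dataset that reads
    # data on demand instead of downloading every parquet file up front.
    from datasets import load_dataset

    ds = load_dataset(dataset_name, split=split, streaming=True)
    return take_samples(ds, n)
```

Note that this takes the first N rows in shard order. If the calibration set needs to be representative, some form of shuffling or shard sampling would have to be added, which may increase the amount of data fetched.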
Environment
Additional Context
This issue affects any command with a slow first-run experience:
| Command | Download trigger | Typical size |
| --- | --- | --- |
| winml build (HF model) | Model weights | 0.1 – 10 GB |
| winml build (calibration) | Dataset parquet shards | 1 – 50 GB |
| winml quantize | Same as above | 1 – 50 GB |
| winml eval | Eval dataset | variable |
A developer on coffee shop WiFi or a metered mobile hotspot is likely to abandon the tool after one silent 15-minute hang. The pre-download warning is a low-cost, high-trust fix that should be prioritized independently of the lazy-download optimization.