fix: suppress ORT native stderr, fix HANDLE bug, clean up warnings #709
Merged
ORT's pybind module writes "Init provider bridge failed." directly to native stderr (fd 2 / Win32 STD_ERROR_HANDLE), bypassing Python's logging/warnings systems so standard filters have no effect. Add _suppress_ep_registration_stderr() in utils/constants.py — the earliest point in the import chain where onnxruntime is first imported (via sysinfo → device.py → constants.py). The context manager redirects fd 2 to /dev/null and also updates Win32 STD_ERROR_HANDLE so both the CRT and Win32 API layers are silenced for the duration. Also applied in session.py around register_to_ort() to cover EP DLL registration warnings on subsequent WinMLSession initialization.
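The fd-level half of that redirect can be sketched as below. This is a minimal illustration, not the code in `utils/constants.py`: the name `redirect_fd2_to_devnull` is hypothetical, and the Win32 `STD_ERROR_HANDLE` update described above is deliberately left out.

```python
import os
from contextlib import contextmanager


@contextmanager
def redirect_fd2_to_devnull():
    """Hypothetical sketch: silence native writes to fd 2 for the duration."""
    saved_fd = os.dup(2)                            # keep a copy of the real stderr
    devnull_fd = os.open(os.devnull, os.O_WRONLY)   # /dev/null, or NUL on Windows
    try:
        os.dup2(devnull_fd, 2)                      # fd 2 now points at the null device
        yield
    finally:
        os.dup2(saved_fd, 2)                        # put the original stderr back
        os.close(devnull_fd)
        os.close(saved_fd)
```

Anything written to fd 2 inside the `with` block, including output from native code that never touches `sys.stderr`, is discarded; Python-level `logging` and `warnings` filters are not involved at all.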
timenick
approved these changes
May 22, 2026
vortex-captain
approved these changes
May 22, 2026
Replace /dev/null redirect with a pipe so ORT native messages are
captured rather than dropped. Each line is re-emitted via
logger.debug('[ORT] ...') with ANSI escape codes stripped.
Because the first onnxruntime import fires during module initialisation
(before any logging handler is registered), messages are also buffered
in _ort_startup_logs. sysinfo() flushes the buffer immediately after
configuring its Rich handler, making the ORT lines visible under
--verbose / --debug.
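A rough sketch of the pipe-based capture described in this commit, under stated assumptions: the names `capture_fd2_via_pipe`, `startup_buffer`, and the logger name are hypothetical, and the ANSI pattern only covers common CSI sequences. The pipe is drained after the block exits, so output larger than the OS pipe buffer would block the writer; a production version would likely need a reader thread.

```python
import logging
import os
import re
from contextlib import contextmanager

logger = logging.getLogger("ort_native_sketch")       # hypothetical logger name
ANSI_ESCAPE = re.compile(r"\x1b\[[0-9;?]*[A-Za-z]")   # common CSI colour/cursor codes
startup_buffer = []                                   # stand-in for a startup log buffer


@contextmanager
def capture_fd2_via_pipe():
    """Hypothetical sketch: capture native stderr and re-emit it via logging."""
    read_fd, write_fd = os.pipe()
    saved_fd = os.dup(2)
    try:
        os.dup2(write_fd, 2)          # native writes to fd 2 now land in the pipe
        yield
    finally:
        os.dup2(saved_fd, 2)          # restore the real stderr first
        os.close(saved_fd)
        os.close(write_fd)            # last write end closed, so the read sees EOF
        with os.fdopen(read_fd, "rb") as pipe_reader:
            raw = pipe_reader.read()
        for line in raw.decode("utf-8", errors="replace").splitlines():
            clean = ANSI_ESCAPE.sub("", line).strip()
            if clean:
                startup_buffer.append(clean)      # kept for replay once handlers exist
                logger.debug("[ORT] %s", clean)
```

Buffering matters because `logger.debug` called before any handler is configured produces no visible output; replaying `startup_buffer` later is what makes the lines visible.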
- HF symlinks UserWarning: INFO → DEBUG (cosmetic, cache still works)
- optimum TasksManager architecture-mismatch WARNING → DEBUG (expected behaviour for WinML models, not actionable)
- tqdm download progress bars: set HF_HUB_DISABLE_PROGRESS_BARS=1 by default (tqdm writes directly to stderr, cannot be routed through Python logging; override with HF_HUB_DISABLE_PROGRESS_BARS=0)

Update test_downgrade_to_info → test_downgrade_to_debug to match.
timenick
reviewed
May 22, 2026
Collaborator
timenick
left a comment
Reviewed the ORT stderr suppression. The intent is good, but I want to flag one real correctness bug (HANDLE truncation) plus four substantive design/coverage issues before this lands.
Python logging filters on a parent logger are NOT applied to records propagated from child loggers (callHandlers bypasses parent.handle()). Change filter target from 'optimum' to 'optimum.exporters.tasks', the actual logger that emits the TasksManager message.
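The propagation behaviour can be reproduced with the standard library alone. In this sketch the `DowngradeToInfo` filter, the `ListHandler`, and the helper `emit_with_filter_on` are hypothetical stand-ins; only the two logger names come from the commit message.

```python
import logging


class DowngradeToInfo(logging.Filter):
    """Hypothetical filter: rewrite WARNING records to INFO."""

    def filter(self, record):
        if record.levelno == logging.WARNING:
            record.levelno = logging.INFO
            record.levelname = "INFO"
        return True


class ListHandler(logging.Handler):
    """Collects (logger name, level name) pairs for inspection."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append((record.name, record.levelname))


def emit_with_filter_on(filter_target):
    parent = logging.getLogger("optimum")
    child = logging.getLogger("optimum.exporters.tasks")
    handler = ListHandler()
    flt = DowngradeToInfo()
    target = logging.getLogger(filter_target)
    parent.addHandler(handler)
    parent.setLevel(logging.DEBUG)
    target.addFilter(flt)
    try:
        # Emitted by the child; reaches the parent's handler by propagation.
        child.warning("architecture mismatch")
    finally:
        target.removeFilter(flt)
        parent.removeHandler(handler)
    return handler.records
```

With the filter on `'optimum'` the record still arrives as `WARNING`, because propagation invokes the parent's handlers without running the parent logger's filters. With the filter on `'optimum.exporters.tasks'` it arrives as `INFO`.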
Header and summary were gated by verbose=True, making them invisible in normal (non -v) mode. self.console is always a valid Rich Console regardless of verbose, so removing the guard makes both sections always visible. Step-by-step progress output (ConsoleWriter) remains verbose-only.
- HF symlinks UserWarning: WARNING → INFO (cosmetic, not actionable)
- optimum TasksManager mismatch: WARNING → INFO (expected behaviour)
- transformers 'weights not used': WARNING → INFO (expected for checkpoint loading, e.g. pooler dropped in sequence classifiers)

Reverts previous DEBUG choice for symlinks/TasksManager.
…ged optim config (#702)

Fixes #697.

## Summary

- **EP/device selection**: replace the large mode-matrix branching with a straightforward `eps × devices` construction filtered against `EP_SUPPORTED_DEVICES`. When `device=auto and ep=auto`, restrict to locally-available pairs; when only one of them is auto, keep all pairs but warn about ones not supported in the local environment instead of silently dropping them.
- **Op-check skipped table**: render a title-only "Skipped - no rule data" table for EP/device pairs with no rule data, so the per-pair output stays consistent.
- **Analysis summary**: fix a key-shape mismatch where `ep_instance_counts` was being rewrapped with `"EP@DEVICE"` string keys while the renderer looked them up by `(ep, device)` tuple, which always missed and printed `0/0/0`. Pass the tuple-keyed dict directly.
- **Counts format**: always show four numbers `S/P/U/Unk` (Supported / Partial / Unsupported / Unknown) in both the OP CHECK table column and the ANALYSIS SUMMARY line. Update column header and legend.
- **Optimization config**: when multiple analysis pairs are selected, merge their `get_optimization_config(ep=…)` outputs into a single config (union of keys). On conflicting values for the same key, log a warning naming each pair's value; the merged config keeps the first pair's value.
- Drop the now-unused `pair_hints` plumbing and redundant local-vs-rule hint lines.

---------

Co-authored-by: Yi Ren <reny@microsoft.com>
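The merge rule in the last-but-one bullet (union of keys, first pair wins, warn on conflict) might look roughly like this. It is a sketch only: `merge_optimization_configs`, the logger name, and the input shape (a dict keyed by `(ep, device)` tuples) are assumptions, not the function added in #702.

```python
import logging
from typing import Any, Dict, List, Tuple

logger = logging.getLogger("optim_merge_sketch")  # hypothetical logger name
Pair = Tuple[str, str]                            # (ep, device)


def merge_optimization_configs(configs: Dict[Pair, Dict[str, Any]]) -> Dict[str, Any]:
    """Union of keys across pairs; on conflict keep the first pair's value."""
    merged: Dict[str, Any] = {}
    seen_by_key: Dict[str, List[Tuple[Pair, Any]]] = {}
    for pair, config in configs.items():          # dicts preserve insertion order
        for key, value in config.items():
            merged.setdefault(key, value)         # first pair to set a key wins
            seen_by_key.setdefault(key, []).append((pair, value))
    for key, seen in seen_by_key.items():
        first_value = seen[0][1]
        if any(value != first_value for _, value in seen[1:]):
            detail = ", ".join(f"{ep}@{device}={value!r}" for (ep, device), value in seen)
            logger.warning("Conflicting values for %r: %s; keeping %r",
                           key, detail, merged[key])
    return merged
```

Keeping the first pair's value makes the result depend on pair ordering, which is why the warning names every pair's value rather than only the winner.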
…706)

## Summary

- Added `WinMLSession._get_precision()`: a best-effort, operator-schema-based precision estimator that runs over the already-loaded `ModelProto` (no extra I/O, no model-name hardcoding). Exposed through `io_config["precision"]`.
- Detection ladder, first match wins:
  1. **QDQ** (`QuantizeLinear` / `DequantizeLinear`): dominant `zero_point` initializer bit width per side; weight-side when the source tensor is an initializer. Returns `int{n}` or `w{w}a{a}`.
  2. **Block-wise quant** (`MatMulNBits` / `GatherBlockQuantized`): schema `bits` attribute + dominant float bit width for activations → `w{w}a{a}`.
  3. **Dominant float dtype** among initializers → `fp32` / `fp16` / `bf16`.
  4. No signal → `None`.
- `_print_model_info` in `commands/perf.py` now shows `Precision:` between `Task:` and `Inputs:`; the line is suppressed when precision is `None` so unknown cases produce no noise.
- 6 new unit tests in `tests/unit/session/test_winml_session.py` cover fp32, fp16, int8 QDQ, mixed `w8a16` QDQ, `MatMulNBits` `w4a16`, and the no-signal-returns-`None` path.

---------

Co-authored-by: hualxie <hualxie@microsoft.com>
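The first-match-wins ladder can be illustrated over pre-extracted summaries instead of a real `ModelProto`. Everything here is hypothetical: the function name, its parameters, and two tie-breaking choices that the summary does not specify (a single-sided QDQ model reports `int{n}`, and a block-wise model with no float initializers falls through to `None`).

```python
from collections import Counter
from typing import Optional, Sequence

FLOAT_BITS = {"fp32": 32, "fp16": 16, "bf16": 16}


def _dominant(values):
    """Most common element, or None for an empty sequence."""
    return Counter(values).most_common(1)[0][0] if values else None


def estimate_precision(
    weight_zp_bits: Sequence[int] = (),      # QDQ zero_point widths, weight side
    activation_zp_bits: Sequence[int] = (),  # QDQ zero_point widths, activation side
    blockwise_bits: Sequence[int] = (),      # `bits` attrs of block-wise quant ops
    float_dtypes: Sequence[str] = (),        # dtypes of float initializers
) -> Optional[str]:
    w = _dominant(weight_zp_bits)
    a = _dominant(activation_zp_bits)
    if w is not None or a is not None:       # rung 1: QDQ
        w = a if w is None else w            # assumption: mirror the side we have
        a = w if a is None else a
        return f"int{w}" if w == a else f"w{w}a{a}"
    bits = _dominant(blockwise_bits)
    dominant_float = _dominant(float_dtypes)
    if bits is not None and dominant_float is not None:   # rung 2: block-wise quant
        return f"w{bits}a{FLOAT_BITS[dominant_float]}"
    return dominant_float                    # rung 3: float dtype; rung 4: None
```

The inputs mirror the six cases listed for the unit tests: float-only models, matching and mixed QDQ widths, a 4-bit block-wise model with fp16 activations, and no signal at all.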
Address PR #709 review comments:

- Extract fd/HANDLE manipulation from constants.py to utils/native_stderr.py
- Fix 64-bit HANDLE truncation: set proper argtypes/restype via ctypes.wintypes
- Fix restore ordering: capture GetStdHandle before dup2, restore after dup2 (UCRT's dup2 for fds 0-2 internally calls SetStdHandle)
- Platform-gate: no-op on non-Windows (no pipe/dup2 overhead on Linux/macOS)
- Public replay_ort_startup_logs() replaces private _ort_startup_logs access
- Add unit tests for capture, ANSI strip, fd restore, Win32 HANDLE restore
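The HANDLE and ordering fixes can be sketched as follows. This is an illustration under assumptions, not the contents of `utils/native_stderr.py`: the function names are hypothetical, and only the Win32 calls `GetStdHandle`/`SetStdHandle` and the `ctypes.wintypes` types are real APIs. Without explicit prototypes, ctypes treats the return value as a C `int`, which truncates a 64-bit `HANDLE`.

```python
import os
import sys
from contextlib import contextmanager

STD_ERROR_HANDLE = -12 & 0xFFFFFFFF  # Win32 defines it as (DWORD)-12


def _load_std_handle_api():
    """Windows only: kernel32 with explicit prototypes for the two calls."""
    import ctypes
    from ctypes import wintypes

    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    kernel32.GetStdHandle.argtypes = [wintypes.DWORD]
    kernel32.GetStdHandle.restype = wintypes.HANDLE    # pointer-sized, not int
    kernel32.SetStdHandle.argtypes = [wintypes.DWORD, wintypes.HANDLE]
    kernel32.SetStdHandle.restype = wintypes.BOOL
    return kernel32


@contextmanager
def suppress_native_stderr_sketch():
    """Hypothetical sketch of the platform gate and restore ordering."""
    if sys.platform != "win32":
        yield                          # no-op: no pipe/dup2 work off Windows
        return
    kernel32 = _load_std_handle_api()
    saved_handle = kernel32.GetStdHandle(STD_ERROR_HANDLE)  # capture BEFORE dup2
    saved_fd = os.dup(2)
    devnull_fd = os.open(os.devnull, os.O_WRONLY)
    try:
        os.dup2(devnull_fd, 2)         # UCRT also updates STD_ERROR_HANDLE here
        yield
    finally:
        os.dup2(saved_fd, 2)           # this changes STD_ERROR_HANDLE again
        kernel32.SetStdHandle(STD_ERROR_HANDLE, saved_handle)  # restore AFTER dup2
        os.close(saved_fd)
        os.close(devnull_fd)
```

Restoring the handle before the final `dup2` would be undone by the CRT, which is the ordering bug named in the review.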
Also bumps version to 0.0.6.
…out_warning # Conflicts: # src/winml/modelkit/session/session.py
- suppress_native_stderr - discards to devnull, for startup/EP noise
- capture_native_stderr - captures via pipe and re-logs, for compilation

Removes the startup buffer/replay mechanism (startup noise is discarded, not captured) and the combined parameter interface.
timenick
approved these changes
May 25, 2026
timenick
reviewed
May 26, 2026
timenick
approved these changes
May 26, 2026
Summary
- ORT's pybind module writes `Init provider bridge failed.` directly to native stderr (fd 2 / Win32 `STD_ERROR_HANDLE`), bypassing Python's `logging`/`warnings` systems entirely
- `utils/native_stderr.py`: a dedicated module for capturing and replaying native stderr output from ORT/QNN
- `suppress_ep_registration_stderr()` context manager redirects fd 2 via pipe, re-emits captured lines through Python logging
- `replay_ort_startup_logs()` public API for deferred replay after logging is configured
- 64-bit HANDLE truncation fixed: proper `argtypes`/`restype` via `ctypes.wintypes` for `GetStdHandle` and `SetStdHandle`
- Restore ordering fixed: UCRT's `dup2` for fds 0-2 internally calls `SetStdHandle`, so `GetStdHandle` must be captured before `dup2` and restored after
- Non-Windows platforms: plain `import onnxruntime` with zero fd overhead
- `constants.py` restored to leaf-level constants-only module

Reference: #477