Skip to content

fix: let HTTPClientConfig and template regen work on non-Linux#291

Merged
nv-alicheng merged 3 commits intomainfrom
fix/viraatc-macos-template-regen
Apr 22, 2026
Merged

fix: let HTTPClientConfig and template regen work on non-Linux#291
nv-alicheng merged 3 commits intomainfrom
fix/viraatc-macos-template-regen

Conversation

@viraatc
Copy link
Copy Markdown
Collaborator

@viraatc viraatc commented Apr 22, 2026

Summary

Two independent issues broke pre-commit run (and any HTTPClientConfig() construction) on macOS. Fixing both:

(A) src/inference_endpoint/endpoint_client/config.py_get_auto_num_workers now catches UnsupportedPlatformError from get_current_numa_node / get_cpus_in_numa_node and falls back to min_workers=10. Matches the existing "NUMA not discoverable" branch. Unbricks the CLI on macOS too.

(B) scripts/regenerate_templates.py_dump_defaults was documented to avoid running model validators, but it called default_factory() on every field, which constructs nested models (and runs their validators) whenever the factory is a BaseModel subclass. Now it recurses into _dump_defaults(factory) when the factory itself is a BaseModel subclass. Factories that dynamically pick a concrete subclass (e.g. TransportConfig.create_default → ZMQTransportConfig) are not types, so they still get called and the resulting instance is walked via the existing branch.

Either fix alone unbricks pre-commit on macOS — (A) makes the validator safe to run, (B) stops the script from running it. Together they're belt-and-suspenders: (A) fixes the product, (B) hardens the tooling against future validators adding side effects.

Template output is unchanged (and platform-independent)

Generated templates are byte-identical before and after this change, and identical across macOS / Linux. _dump_defaults emits class-level declared defaults (e.g. num_workers: -1 — the "auto" sentinel), not the validator-resolved values. Even on Linux before this PR, the resolved value from HTTPClientConfig() was silently discarded because the serializer re-walks the class (not the instance). So the validator's NUMA-aware resolution never landed in the template file — it was only ever a runtime concern. This PR just stops the pointless construction.

Alternatives considered

  • LBYL platform guard (exporting IS_LINUX from cpu_affinity.py and checking it at the top of _get_auto_num_workers): arguably more idiomatic Python for a non-racy condition, and could replace the try/except in (A). Kept the current form for minimal diff; happy to swap in a follow-up.
  • Push fallback into cpu_affinity.py (drop @require_linux from read-only queries, keep it on mutators like pin_loadgen): the most principled shape — "errors for exceptional conditions, sentinels for absent data" — but a broader change that needs AttributeError handling for os.sched_getaffinity on macOS. Deferred.

Test plan

  • Linux: pre-commit run --all-files passes, generated templates are byte-identical
  • Simulated non-Linux (UnsupportedPlatformError patched to always raise): HTTPClientConfig() constructs with num_workers=10; regen script completes without entering the NUMA code path
  • Real macOS: pre-commit run --all-files on darwin

🤖 Generated with Claude Code

Two independent issues that together broke pre-commit (and any
HTTPClientConfig construction) on macOS:

1. HTTPClientConfig()._resolve_defaults calls _get_auto_num_workers when
   num_workers=-1, which invokes get_current_numa_node(). That function
   is @require_linux and raises UnsupportedPlatformError on darwin.
   Fix: catch UnsupportedPlatformError and fall back to min_workers=10,
   matching the existing "NUMA not discoverable" branch.

2. scripts/regenerate_templates.py::_dump_defaults was documented to
   avoid running model validators, but called default_factory() for
   every field — which constructs the nested model (and runs its
   validators) whenever the factory is a BaseModel subclass. Fix:
   when the factory is a BaseModel subclass, recurse into
   _dump_defaults(factory) instead. Factories that dynamically pick a
   concrete subclass (e.g. TransportConfig.create_default -> ZMQTransportConfig)
   are not types, so they still get called as before and the concrete
   subclass is walked via the existing isinstance(default, BaseModel)
   branch.

Verified on Linux: pre-commit passes, templates are byte-identical.
Verified against simulated non-Linux (UnsupportedPlatformError patched
to always raise): HTTPClientConfig() constructs, and the regen script
completes without entering the NUMA code path at all.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@viraatc viraatc requested review from a team and Copilot April 22, 2026 00:40
@github-actions github-actions Bot requested review from arekay-nv and nvzhihanj April 22, 2026 00:40
@github-actions
Copy link
Copy Markdown

github-actions Bot commented Apr 22, 2026

MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes macOS/non-Linux breakages by making HTTP client worker auto-detection resilient to unsupported NUMA probing and by preventing template regeneration from constructing nested Pydantic models (and thus running validators) when extracting defaults.

Changes:

  • Catch UnsupportedPlatformError during NUMA probing in _get_auto_num_workers() and fall back to min_workers.
  • Update scripts/regenerate_templates.py::_dump_defaults() to recurse into BaseModel subclasses used as default_factory instead of calling them.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File Description
src/inference_endpoint/endpoint_client/config.py Makes HTTPClientConfig(num_workers=-1) safe on non-Linux by handling unsupported NUMA probing.
scripts/regenerate_templates.py Avoids running validators during template regeneration by not calling BaseModel class factories.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/inference_endpoint/endpoint_client/config.py
Comment thread scripts/regenerate_templates.py
@viraatc viraatc changed the title fix: let HTTPClientConfig and template regen work on non-Linux [WIP] fix: let HTTPClientConfig and template regen work on non-Linux Apr 22, 2026
Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request improves cross-platform compatibility and prevents side effects during template generation. Specifically, scripts/regenerate_templates.py was updated to recurse into BaseModel subclasses when used as a default_factory to avoid triggering validators. Additionally, src/inference_endpoint/endpoint_client/config.py now handles UnsupportedPlatformError during NUMA probing, allowing the client configuration to be instantiated on non-Linux systems. I have no feedback to provide as the review comments were purely explanatory.

@viraatc viraatc changed the title [WIP] fix: let HTTPClientConfig and template regen work on non-Linux fix: let HTTPClientConfig and template regen work on non-Linux Apr 22, 2026
viraatc and others added 2 commits April 22, 2026 12:14
Adds unit tests for both fixes in this PR:

- tests/unit/endpoint_client/test_http_client_config.py: patches
  get_current_numa_node / get_cpus_in_numa_node to raise
  UnsupportedPlatformError and asserts _get_auto_num_workers and
  HTTPClientConfig() fall back to min_workers=10.

- tests/unit/config/test_regenerate_templates.py: defines a BaseModel
  with a counter-incrementing model_validator used as default_factory,
  and asserts _dump_defaults does not invoke the validator while still
  emitting the nested defaults. Also covers the non-BaseModel callable
  factory path (lambda) to ensure it still gets called.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Personal editor/settings files (.claude.local.md, .nvimrc.lua) were
pulled in by git add -A in the previous commit. They should stay
untracked.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings April 22, 2026 19:14
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated no new comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@nv-alicheng
Copy link
Copy Markdown
Collaborator

Verified on MacOS, thanks!

@nv-alicheng nv-alicheng merged commit ed1704e into main Apr 22, 2026
11 checks passed
@github-actions github-actions Bot locked and limited conversation to collaborators Apr 22, 2026
@viraatc viraatc deleted the fix/viraatc-macos-template-regen branch April 22, 2026 22:03
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants