
feat: Constrain HttpModelClient to single concurrency mode...#439

Open
nabinchha wants to merge 3 commits into main from nm/overhaul-model-facade-guts-pr5


@nabinchha
Contributor

Summary

Fifth PR in the model facade overhaul series (plan, architecture notes). Constrains each HttpModelClient instance to a single execution mode — sync or async — at construction time, eliminating the dual-mode lifecycle complexity that caused transport leaks and cross-mode teardown bugs surfaced during PR-4 review (#426). Adds ModelRegistry.arun_health_check() so health checks use the async path when DATA_DESIGNER_ASYNC_ENGINE=1.

Previous PRs:

Changes

Added

  • ClientConcurrencyMode StrEnum (http_model_client.py) — replaces Literal["sync", "async"] type alias with a proper enum for runtime type identity and IDE autocomplete
  • ModelRegistry.arun_health_check() (registry.py) — async mirror of run_health_check() that calls agenerate / agenerate_text_embeddings / agenerate_image on model facades
  • Async health check dispatch (column_wise_builder.py) — submits arun_health_check() to the background event loop via asyncio.run_coroutine_threadsafe when DATA_DESIGNER_ASYNC_ENGINE=1
  • PR-5 architecture notes (plans/343/model-facade-overhaul-pr-5-architecture-notes.md)
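The async dispatch described above can be sketched with the standard library alone. This is a minimal illustration, not the PR's code: `start_background_loop` stands in for `ensure_async_engine_loop`, `FakeFacade` is a hypothetical facade, and the body of `arun_health_check` here is an assumption that only shows the shape of an async health check submitted from a synchronous caller.

```python
import asyncio
import threading


def start_background_loop() -> asyncio.AbstractEventLoop:
    """Hypothetical stand-in for ensure_async_engine_loop(): a loop on a daemon thread."""
    loop = asyncio.new_event_loop()
    threading.Thread(target=loop.run_forever, daemon=True).start()
    return loop


class FakeFacade:
    """Hypothetical model facade exposing only an async generate method."""

    def __init__(self, healthy: bool = True) -> None:
        self.healthy = healthy

    async def agenerate(self, prompt: str) -> str:
        await asyncio.sleep(0)  # yield to the loop, as a real network call would
        if not self.healthy:
            raise PermissionError("authentication failed")
        return "ok"


async def arun_health_check(facades: dict) -> list:
    """Probe each facade on the async path; any exception propagates to the caller."""
    checked = []
    for alias, facade in facades.items():
        await facade.agenerate("ping")
        checked.append(alias)
    return checked


def run_health_check_on_loop(loop, facades, timeout: float = 300.0) -> list:
    """Submit the coroutine from a sync caller and block with a wall-clock guard."""
    future = asyncio.run_coroutine_threadsafe(arun_health_check(facades), loop)
    return future.result(timeout=timeout)
```

`future.result()` re-raises whatever the coroutine raised, so an authentication failure surfaces in the calling thread just as it would on the sync path.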

Changed

  • HttpModelClient (http_model_client.py) — constructor accepts concurrency_mode parameter; _get_sync_client() / _get_async_client() raise RuntimeError if called in the wrong mode; close() and aclose() simplified to single-mode teardown (cross-mode calls are no-ops)
  • Factory chain: client_concurrency_mode parameter threaded through create_model_client, create_model_registry, and create_resource_provider; derived from the DATA_DESIGNER_ASYNC_ENGINE env var
  • ensure_async_engine_loop (async_concurrency.py) — renamed from _ensure_async_engine_loop (now public, used cross-module)
  • Test helpers (test_anthropic.py, test_openai_compatible.py) — auto-derive concurrency_mode from which mock client is injected
  • PR-4 architecture notes — updated planned follow-on section to reflect PR-5 scope change
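The single-mode contract for HttpModelClient might look roughly like the sketch below. It is an illustration under assumptions: `SingleModeClient`, `FakeSyncClient`, and `FakeAsyncClient` are hypothetical stand-ins (no httpx involved), and the enum uses the `str`/`Enum` mixin instead of `StrEnum` only so the example runs on Python versions before 3.11.

```python
import asyncio
from enum import Enum


class ClientConcurrencyMode(str, Enum):
    """The PR uses StrEnum; the str/Enum mixin here is a portability assumption."""

    SYNC = "sync"
    ASYNC = "async"


class FakeSyncClient:
    """Hypothetical stand-in for a sync HTTP client."""

    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        self.closed = True


class FakeAsyncClient:
    """Hypothetical stand-in for an async HTTP client."""

    def __init__(self) -> None:
        self.closed = False

    async def aclose(self) -> None:
        self.closed = True


class SingleModeClient:
    """Hypothetical client locked to one mode at construction time."""

    def __init__(self, concurrency_mode=ClientConcurrencyMode.SYNC) -> None:
        self._mode = ClientConcurrencyMode(concurrency_mode)
        self._sync_client = None
        self._async_client = None

    def _get_sync_client(self) -> FakeSyncClient:
        if self._mode is not ClientConcurrencyMode.SYNC:
            raise RuntimeError("sync client requested on an async-mode instance")
        if self._sync_client is None:  # lazy initialization
            self._sync_client = FakeSyncClient()
        return self._sync_client

    def _get_async_client(self) -> FakeAsyncClient:
        if self._mode is not ClientConcurrencyMode.ASYNC:
            raise RuntimeError("async client requested on a sync-mode instance")
        if self._async_client is None:  # lazy initialization
            self._async_client = FakeAsyncClient()
        return self._async_client

    def close(self) -> None:
        # No-op in async mode; idempotent in sync mode.
        if self._mode is ClientConcurrencyMode.SYNC and self._sync_client is not None:
            self._sync_client.close()
            self._sync_client = None

    async def aclose(self) -> None:
        # No-op in sync mode; idempotent in async mode.
        if self._mode is ClientConcurrencyMode.ASYNC and self._async_client is not None:
            await self._async_client.aclose()
            self._async_client = None
```

Because each instance owns at most one underlying client, neither teardown method ever needs to reach across modes.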

Fixed

  • Transport leak: close() on a dual-mode client left the async transport open; aclose() never touched the transport at all
  • Cross-mode teardown: close() could not await aclient.aclose(); aclose() had to also handle sync cleanup
  • Health check mode mismatch: async-engine registries ran sync health checks, hitting mode enforcement guards

Attention Areas

Reviewers: Please pay special attention to the following:

  • http_model_client.py — mode enforcement guards in _get_sync_client / _get_async_client and simplified close() / aclose()
  • Factory chain threading: concurrency_mode flows from env var through resource_provider.py → models/factory.py → clients/factory.py → adapter constructors
  • registry.py: arun_health_check() mirrors run_health_check() with async facade methods
  • column_wise_builder.py — async health check dispatch via run_coroutine_threadsafe

Test plan

  • uv run ruff check on all changed source files
  • uv run pytest on all new and updated test files
  • Lifecycle tests: sync close, async aclose, idempotency, cross-mode no-ops
  • Mode enforcement tests: wrong-mode access raises RuntimeError
  • Factory forwarding tests: client_concurrency_mode reaches adapter constructors
  • Async health check tests: success and auth error propagation

Made with Cursor

Constrain each HttpModelClient instance to sync or async at
construction time, eliminating dual-mode lifecycle complexity
that caused transport leaks and cross-mode teardown bugs.

- Add ClientConcurrencyMode StrEnum replacing Literal type alias
- Add concurrency_mode constructor param with mode enforcement
  guards on _get_sync_client / _get_async_client
- Simplify close()/aclose() to single-mode teardown (cross-mode
  calls are no-ops)
- Thread client_concurrency_mode through factory chain from
  DATA_DESIGNER_ASYNC_ENGINE env var
- Add ModelRegistry.arun_health_check() async mirror and wire
  async dispatch in ColumnWiseDatasetBuilder
- Make ensure_async_engine_loop public (used cross-module)
- Fix test helpers to derive concurrency mode from injected client
- Add PR-5 architecture notes
@nabinchha nabinchha requested a review from a team as a code owner March 19, 2026 18:21
@nabinchha nabinchha changed the title feat: Constrain HttpModelClient to single concurrency mode with async health checks feat: Constrain HttpModelClient to single concurrency mode... Mar 19, 2026
@greptile-apps
Contributor

greptile-apps bot commented Mar 19, 2026

Greptile Summary

This PR (PR-5 in the model facade overhaul series) constrains each HttpModelClient instance to a single concurrency mode — sync or async — set at construction time via the new ClientConcurrencyMode enum. It eliminates dual-mode lifecycle complexity (transport leaks, cross-mode teardown) that surfaced in the PR-4 review, and adds ModelRegistry.arun_health_check() so health checks exercise the correct code path when DATA_DESIGNER_ASYNC_ENGINE=1.

Key changes:

  • ClientConcurrencyMode StrEnum replaces Literal["sync", "async"]; constructor validates that injected clients match the declared mode
  • _get_sync_client() / _get_async_client() raise RuntimeError immediately if called in the wrong mode
  • close() and aclose() are simplified to single-mode teardown using the correct if/elif pattern (avoids the double-close bug addressed in PR-4 review)
  • ensure_async_engine_loop renamed from _ensure_async_engine_loop (now public, used cross-module for health check dispatch)
  • client_concurrency_mode threaded through the full factory chain: resource_provider.py → models/factory.py → clients/factory.py → adapter constructors
  • Architecture notes (plans/343/model-facade-overhaul-pr-5-architecture-notes.md) document the design clearly, though the timeout value stated in the notes (timeout=180) does not match the implementation (timeout=300)
  • Test coverage is comprehensive: sync/async lifecycle, mode enforcement, cross-mode no-ops, constructor validation, lazy initialization, factory forwarding, and async health check paths
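As a rough illustration of the factory-chain threading listed above, the sketch below derives the mode from the environment at call time and passes it down. The function names echo the PR, but their signatures and dict return values are hypothetical, and treating only the value `"1"` as "async" is an assumption about how the env var is parsed.

```python
import os
from enum import Enum


class ClientConcurrencyMode(str, Enum):
    """The PR uses StrEnum; the str/Enum mixin here is a portability assumption."""

    SYNC = "sync"
    ASYNC = "async"


def mode_from_env(environ=None) -> ClientConcurrencyMode:
    """Read the flag at call time (not import time) so tests can override it."""
    env = os.environ if environ is None else environ
    # Assumption: only the literal "1" enables the async engine.
    if env.get("DATA_DESIGNER_ASYNC_ENGINE") == "1":
        return ClientConcurrencyMode.ASYNC
    return ClientConcurrencyMode.SYNC


def create_model_client(client_concurrency_mode: ClientConcurrencyMode) -> dict:
    """Hypothetical signature: forwards the mode to the adapter constructor."""
    return {"concurrency_mode": client_concurrency_mode}


def create_model_registry(client_concurrency_mode: ClientConcurrencyMode) -> dict:
    """Hypothetical signature: the registry builds clients in the given mode."""
    return {"client": create_model_client(client_concurrency_mode)}


def create_resource_provider(environ=None) -> dict:
    """Hypothetical signature: the top of the chain, where the env var is read."""
    return {"registry": create_model_registry(mode_from_env(environ))}
```

Reading the variable inside the function rather than at module import keeps the mode decision in one place and makes forwarding easy to test.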

Confidence Score: 4/5

  • PR is safe to merge; the core lifecycle fix is sound and well-tested, with only minor documentation and style issues remaining.
  • The single-mode enforcement design is solid and all previous review issues have been addressed. The if/elif transport teardown pattern is correct. Factory chain threading is complete and tested. The only open items are minor: _transport is typed without Optional (requiring two # type: ignore[assignment] suppressions), the architecture-notes document timeout=180 while the code uses timeout=300, and _SYNC_CLIENT_CASES is used to parametrize async tests which is slightly confusing. None of these affect correctness or runtime behavior.
  • http_model_client.py (_transport type annotation), column_wise_builder.py (timeout value vs. architecture notes)

Important Files Changed

| Filename | Overview |
| --- | --- |
| packages/data-designer-engine/src/data_designer/engine/models/clients/adapters/http_model_client.py | Core of the PR: adds ClientConcurrencyMode enum and enforces single-mode lifecycle in HttpModelClient. Mode guards in _get_sync_client/_get_async_client, simplified close()/aclose() with if/elif pattern, and constructor mismatch validation all look correct. Minor: _transport should be typed Optional to avoid two # type: ignore[assignment] suppressions. |
| packages/data-designer-engine/src/data_designer/engine/dataset_builders/column_wise_builder.py | Health check dispatch now branches on DATA_DESIGNER_ASYNC_ENGINE: sync path calls run_health_check() directly, async path submits arun_health_check() via run_coroutine_threadsafe. Logic is correct; timeout=300 (5 min) is reasonable but contradicts the architecture notes, which say timeout=180. |
| packages/data-designer-engine/src/data_designer/engine/models/registry.py | arun_health_check() is a clean async mirror of run_health_check(): same structure, same skip logic, same exception re-raise, using the async facade methods. No issues found. |
| packages/data-designer-engine/src/data_designer/engine/models/clients/factory.py | client_concurrency_mode parameter correctly threaded to OpenAICompatibleClient and AnthropicClient constructors; LiteLLMBridgeClient intentionally ignores it. Docstring added. No issues. |
| packages/data-designer-engine/src/data_designer/engine/resources/resource_provider.py | DATA_DESIGNER_ASYNC_ENGINE env var correctly read at call time inside create_resource_provider to derive ClientConcurrencyMode and pass it to create_model_registry. No issues. |
| packages/data-designer-engine/tests/engine/models/clients/test_native_http_clients.py | Tests comprehensively cover single-mode lifecycle, mode enforcement, cross-mode no-ops, and constructor validation. _SYNC_CLIENT_CASES name is misleading since it is also used to parametrize async-mode tests (e.g. test_async_aclose_delegates_to_httpx_async_client). |

Sequence Diagram

sequenceDiagram
    participant RP as resource_provider.py
    participant MF as models/factory.py
    participant CF as clients/factory.py
    participant HC as HttpModelClient
    participant CWB as column_wise_builder.py
    participant MR as ModelRegistry
    participant EL as AsyncEngine EventLoop

    RP->>MF: create_model_registry(client_concurrency_mode)
    MF->>CF: create_model_client(client_concurrency_mode)
    CF->>HC: Adapter(concurrency_mode=ASYNC or SYNC)
    Note over HC: Constructor rejects mismatched injected client
    Note over HC: _get_sync_client raises if mode==ASYNC
    Note over HC: _get_async_client raises if mode==SYNC
    Note over HC: close() no-op if async mode
    Note over HC: aclose() no-op if sync mode

    CWB->>CWB: _run_model_health_check_if_needed()
    alt DATA_DESIGNER_ASYNC_ENGINE=1
        CWB->>EL: ensure_async_engine_loop()
        CWB->>EL: run_coroutine_threadsafe(arun_health_check)
        EL->>MR: await arun_health_check(model_aliases)
        MR->>HC: await agenerate / agenerate_text_embeddings / agenerate_image
        CWB->>CWB: future.result(timeout=300)
    else DATA_DESIGNER_ASYNC_ENGINE=0
        CWB->>MR: run_health_check(model_aliases)
        MR->>HC: generate / generate_text_embeddings / generate_image
    end
Prompt To Fix All With AI
This is a comment left during a code review.
Path: plans/343/model-facade-overhaul-pr-5-architecture-notes.md
Line: 1088

Comment:
**Timeout mismatch between architecture notes and implementation**

The architecture notes on this line document `future.result(timeout=180)` with a "3-minute wall-clock guard", but the actual implementation in `column_wise_builder.py` uses `timeout=300` (5 minutes). This makes the architecture notes misleading for future readers.

The [prior review thread](https://github.com/NVIDIA-NeMo/DataDesigner/pull/439) reply also referenced `timeout=180` as the fix. The value in the code (`300`) may be intentional (e.g. health checks for image models can be slow), but the documentation should match.

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: packages/data-designer-engine/src/data_designer/engine/models/clients/adapters/http_model_client.py
Line: 130

Comment:
**`_transport` should be typed as `Optional` to avoid `# type: ignore`**

`self._transport` is inferred as `RetryTransport` (non-optional) from the `__init__` assignment, so both `close()` (line 130) and `aclose()` (line 145) require `# type: ignore[assignment]` when setting it to `None`. This is a code smell — the suppressed error hides a real type inconsistency.

The cleaner fix is to declare the field explicitly as `RetryTransport | None` in `__init__`:

```python
self._transport: RetryTransport | None = create_retry_transport(self._retry_config)
```

This would let both `close()` and `aclose()` set `self._transport = None` cleanly without suppressing type checker warnings.

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: packages/data-designer-engine/tests/engine/models/clients/test_native_http_clients.py
Line: 541-542

Comment:
**Misleading parametrize variable name for async tests**

`_SYNC_CLIENT_CASES` is also used to parametrize multiple async-mode lifecycle and enforcement tests (e.g. `test_async_aclose_delegates_to_httpx_async_client`, `test_async_mode_blocks_sync_methods`, `test_close_is_noop_on_async_mode_client`). The variable name suggests it is exclusive to sync-mode tests, which will confuse future readers.

Consider renaming to `_CLIENT_FACTORY_CASES` (or similar) to make clear it provides the full set of adapter factories that are exercised in both modes.

How can I resolve this? If you propose a fix, please make it concise.

Last reviewed commit: "Merge branch 'main' ..."

- Fix transport double-close in close()/aclose() by delegating
  teardown to the httpx client when one exists (if/elif pattern);
  only close transport directly if no client was ever created
- Reject mismatched client/mode injection in constructor (e.g.
  async_client on a sync-mode instance raises ValueError)
- Add 5-minute wall-clock timeout to future.result() in async
  health check dispatch
- Add constructor validation tests for both mismatch directions
- Update PR-5 architecture notes
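
The if/elif teardown named in the first bullet can be sketched as follows. `FakeTransport`, `FakeClient`, and `SyncModeClient` are hypothetical stand-ins; the sketch assumes, as with an httpx client, that closing the client also closes the transport it wraps, which is why closing both would hit the transport twice.

```python
class FakeTransport:
    """Hypothetical transport that counts how often it is closed."""

    def __init__(self) -> None:
        self.close_calls = 0

    def close(self) -> None:
        self.close_calls += 1


class FakeClient:
    """Hypothetical HTTP client that closes its own transport on close()."""

    def __init__(self, transport: FakeTransport) -> None:
        self._transport = transport

    def close(self) -> None:
        self._transport.close()


class SyncModeClient:
    """Hypothetical sync-mode wrapper showing the if/elif teardown."""

    def __init__(self) -> None:
        self._transport = FakeTransport()
        self._client = None

    def _get_client(self) -> FakeClient:
        if self._client is None:  # lazy initialization
            self._client = FakeClient(self._transport)
        return self._client

    def close(self) -> None:
        if self._client is not None:
            # The client owns the transport: delegate teardown to it.
            self._client.close()
        elif self._transport is not None:
            # No client was ever created: close the transport directly.
            self._transport.close()
        # Dropping both references makes a second close() a no-op.
        self._client = None
        self._transport = None
```

With a plain `if` followed by a second `if`, the first branch would close the transport through the client and the second would close it again; the `elif` guarantees exactly one close on either path.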

Made-with: Cursor