
feat: canonical model client types, protocols, and LiteLLM bridge adapter#359

Merged
nabinchha merged 24 commits into main from nm/overhaul-model-facade-guts-pr1
Mar 5, 2026

Conversation

@nabinchha
Contributor

📋 Summary

Introduces a provider-agnostic ModelClient adapter boundary under engine/models/clients/ as the foundation for replacing LiteLLM with native provider adapters. This is PR-1 of the model facade overhaul described in plans/343/model-facade-overhaul-plan-step-1.md.

PR-1 is intentionally non-invasive: no ModelFacade call-site changes, no provider routing changes, no retry/throttle migration.

🔄 Changes

✨ Added

  • Canonical request/response types (clients/types.py) — ChatCompletionRequest/Response, EmbeddingRequest/Response, ImageGenerationRequest/Response, plus supporting types (Usage, ToolCall, ImagePayload, AssistantMessage)
  • ModelClient protocol (clients/base.py) — Composable protocols for chat completion, embeddings, and image generation with sync/async parity, capability checks, and close/aclose lifecycle methods
  • Canonical error taxonomy (clients/errors.py) — ProviderError with 13 ProviderErrorKind values, deterministic HTTP status mapping (401→AUTHENTICATION, 403→PERMISSION_DENIED, etc.), and proper Python exception chaining via __cause__
  • LiteLLM bridge adapter (clients/adapters/litellm_bridge.py) — Wraps the existing LiteLLM router and normalizes responses into canonical types. Handles all three operations (chat, embeddings, images) with robust parsing for tool calls, image format waterfall (data URIs, base64, URLs, dicts), list-of-blocks content coercion, and usage extraction
  • Architecture notes (model-facade-overhaul-pr-1-architecture-notes.md) — Documents canonical adapter boundary contract, bridge purpose, and planned follow-on
  • 33 unit tests across test_client_errors.py and test_litellm_bridge.py — Covers error mapping (12 parametrized status codes), exception chaining, sync/async completion, embeddings, image generation (chat + diffusion paths), list-content coercion, tool-call argument serialization, empty-choices edge case, and lifecycle methods
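To make the error-taxonomy piece concrete, here is a minimal sketch of the deterministic status mapping and `__cause__` chaining described above. The names (`ProviderErrorKind` members shown, `map_status_to_kind`, `wrap_http_error`) are illustrative assumptions, not the actual API in `clients/errors.py`:

```python
from enum import Enum


class ProviderErrorKind(Enum):
    # Illustrative subset of the 13 kinds described in the summary
    AUTHENTICATION = "authentication"
    PERMISSION_DENIED = "permission_denied"
    RATE_LIMITED = "rate_limited"
    UNKNOWN = "unknown"


# Hypothetical deterministic HTTP status -> kind table
_STATUS_TO_KIND = {
    401: ProviderErrorKind.AUTHENTICATION,
    403: ProviderErrorKind.PERMISSION_DENIED,
    429: ProviderErrorKind.RATE_LIMITED,
}


class ProviderError(Exception):
    def __init__(self, kind: ProviderErrorKind, message: str):
        super().__init__(message)
        self.kind = kind
        self.message = message


def map_status_to_kind(status_code: int) -> ProviderErrorKind:
    # Unrecognized statuses fall back to a catch-all kind
    return _STATUS_TO_KIND.get(status_code, ProviderErrorKind.UNKNOWN)


def wrap_http_error(status_code: int, exc: Exception) -> ProviderError:
    # Preserve the original exception via Python's __cause__ chaining
    err = ProviderError(map_status_to_kind(status_code), str(exc))
    err.__cause__ = exc
    return err
```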

🔍 Attention Areas

⚠️ Reviewers: Please pay special attention to the following:

  • clients/errors.py — _extract_response_text tries structured JSON extraction before falling back to raw text. This matters because httpx response.text contains the full body (including raw JSON), so without this priority order every error message from providers with structured error bodies would be a raw JSON blob
  • clients/adapters/litellm_bridge.py — Image parsing waterfall in _parse_image_payload handles 5+ response formats. Verify coverage against known provider response shapes
  • clients/base.py — the ModelClient protocol includes close/aclose lifecycle methods for PR-2 registry/resource-teardown wiring
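The structured-first priority order for error text can be sketched as follows. This is a hedged approximation of the behavior described above, not the actual `_extract_response_text` implementation; the real helper takes an httpx response object rather than a string:

```python
import json


def extract_response_text(body: str) -> str:
    """Prefer a structured error message; fall back to the raw body.

    Illustrative sketch: try common structured shapes ({"error":
    {"message": ...}} and {"detail": ...}) before returning raw text.
    """
    try:
        payload = json.loads(body)
    except (ValueError, TypeError):
        return body  # not JSON -- raw text is the best we have
    if isinstance(payload, dict):
        error = payload.get("error")
        if isinstance(error, dict) and isinstance(error.get("message"), str):
            return error["message"]
        if isinstance(payload.get("detail"), str):
            return payload["detail"]
    return body
```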

🤖 Generated with AI

@nabinchha nabinchha requested a review from a team as a code owner February 28, 2026 00:11

greptile-apps bot commented Feb 28, 2026

Greptile Summary

This PR establishes a provider-agnostic ModelClient adapter boundary under engine/models/clients/ as PR-1 of the model-facade overhaul. It introduces canonical request/response types (ChatCompletionRequest/Response, EmbeddingRequest/Response, ImageGenerationRequest/Response), a ModelClient composable Protocol hierarchy, a canonical error taxonomy (ProviderError + 13 ProviderErrorKind values with deterministic HTTP-status mapping), a LiteLLMBridgeClient that wraps the existing LiteLLM router, and shared parsing helpers with full sync/async parity. The changes are intentionally non-invasive (no call-site changes to ModelFacade) and are well-covered by 33 unit tests.

Previously flagged critical issues have been verified as resolved:

  • HTTP-date Retry-After parsing now handles both delay-seconds and HTTP-date formats
  • FastAPI list-format detail arrays are correctly extracted in error messages
  • acompletion no longer blocks the event loop (uses aparse_chat_completion_response)
  • Image download failures log at WARNING level for operator visibility
  • total_tokens auto-computation uses coerced token values

One remaining concern:

  • _handle_non_provider_errors sets ProviderError.message = str(exc), which can expose raw JSON blobs for LiteLLM errors that embed full HTTP response bodies in their string representation. The project already has _extract_structured_message and _extract_response_text helpers in errors.py, but they require an HttpResponse object unavailable at this call site.
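The boundary-normalization pattern at issue can be sketched as a context manager. This is an assumption-laden approximation of `_handle_non_provider_errors` (the real helper also maps status codes to ProviderErrorKind); the minimal `ProviderError` here is a stand-in for the canonical type in `clients/errors.py`:

```python
from contextlib import contextmanager


class ProviderError(Exception):
    """Minimal stand-in for the canonical error type."""

    def __init__(self, message: str):
        super().__init__(message)
        self.message = message


@contextmanager
def handle_non_provider_errors():
    try:
        yield
    except ProviderError:
        raise  # already canonical; pass through untouched
    except Exception as exc:
        # str(exc) may embed a raw HTTP body for LiteLLM errors -- the
        # observability concern flagged above; a structured-message
        # extractor would tighten this up.
        raise ProviderError(str(exc)) from exc
```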

Confidence Score: 4/5

  • Safe to merge as a non-invasive PR-1 foundation; one minor style concern about error message formatting.
  • This PR establishes a solid foundation for the model facade overhaul with well-tested, non-invasive changes. All previously-flagged critical issues (event-loop blocking, HTTP-date retry parsing, FastAPI error extraction, token computation) have been resolved. The remaining concern is a style-level issue (error message clarity when exceptions are stringified) that doesn't affect correctness but would improve observability. No existing call-sites are modified, and the new code is comprehensively tested.
  • packages/data-designer-engine/src/data_designer/engine/models/clients/adapters/litellm_bridge.py (error message extraction in _handle_non_provider_errors)

Sequence Diagram

sequenceDiagram
    participant Caller
    participant LiteLLMBridgeClient
    participant _handle_non_provider_errors
    participant LiteLLMRouter
    participant parsing

    Caller->>LiteLLMBridgeClient: completion(ChatCompletionRequest)
    LiteLLMBridgeClient->>_handle_non_provider_errors: enter context
    LiteLLMBridgeClient->>LiteLLMRouter: router.completion(model, messages, **optional_fields)
    alt Provider error
        LiteLLMRouter-->>_handle_non_provider_errors: raises Exception (w/ status_code)
        _handle_non_provider_errors-->>Caller: raises ProviderError(kind, message=str(exc))
    else Success
        LiteLLMRouter-->>LiteLLMBridgeClient: raw LiteLLM response
        LiteLLMBridgeClient->>_handle_non_provider_errors: exit (no error)
        LiteLLMBridgeClient->>parsing: parse_chat_completion_response(response)
        parsing->>parsing: extract_tool_calls, extract_images_from_chat_message, extract_usage
        parsing-->>LiteLLMBridgeClient: ChatCompletionResponse
        LiteLLMBridgeClient-->>Caller: ChatCompletionResponse
    end

    Caller->>LiteLLMBridgeClient: generate_image(ImageGenerationRequest)
    alt messages is not None
        LiteLLMBridgeClient->>LiteLLMRouter: router.completion(model, messages, **image_kwargs)
        LiteLLMRouter-->>LiteLLMBridgeClient: chat response
        LiteLLMBridgeClient->>parsing: extract_images_from_chat_response(response)
    else messages is None
        LiteLLMBridgeClient->>LiteLLMRouter: router.image_generation(prompt, model, **image_kwargs)
        LiteLLMRouter-->>LiteLLMBridgeClient: image response
        LiteLLMBridgeClient->>parsing: extract_images_from_image_response(response)
    end
    parsing-->>LiteLLMBridgeClient: list[ImagePayload]
    LiteLLMBridgeClient-->>Caller: ImageGenerationResponse

Last reviewed commit: 61024c0


@andreatgretel andreatgretel left a comment


(moved to line comments below)


@andreatgretel andreatgretel left a comment


Ran some smoke tests against edge cases in the bridge helpers. Most things held up well -- malformed responses, missing choices, None messages, non-list embeddings all degrade gracefully. Plain text content doesn't false-positive as an image either.

One thing that came up: _extract_usage returns a Usage(input_tokens=None, output_tokens=None, ...) object (not None) when providers return token counts as strings (e.g. "10" instead of 10). That's because _to_int_or_none only handles int and float, so the string passes the is None guard on line 319 but then gets coerced to None by _to_int_or_none. End result: callers can't rely on if usage is not None to check if usage data was actually captured. Probably worth adding string handling to _to_int_or_none.


@andreatgretel andreatgretel left a comment


Thinking ahead to PR-3+: a bunch of the parsing helpers in the bridge (_parse_image_payload, _extract_images_from_chat_message, _extract_usage, _coerce_message_content) will be needed by the native OpenAI adapter too -- the image extraction waterfall in the plan is basically the same logic. Right now they're private functions inside the bridge module, which is supposed to be temporary. Might be worth pulling them into a shared clients/parsing.py (or similar) before PR-2 adds dependencies on the bridge. Easier to move now than later.


@andreatgretel andreatgretel left a comment


Requesting changes for 2 small but meaningful things before we merge:

  1. In LiteLLMBridgeClient, raw router/LiteLLM exceptions can still leak out. At this layer, we should normalize those into ProviderError/ProviderErrorKind so callers only see the canonical error shape.

  2. There’s a small usage parsing edge case: if a provider returns token counts as strings (like "10"), _extract_usage can return a Usage object where token fields end up None. That makes usage is not None look like we captured usage when we didn’t. Can we either parse numeric strings in _to_int_or_none or return None when normalized usage is empty?

Everything else I left is optional cleanup/follow-up.

…provements

- Wrap all LiteLLM router calls in try/except to normalize raw exceptions
  into canonical ProviderError at the bridge boundary (blocking review item)
- Extract reusable response-parsing helpers into clients/parsing.py for
  shared use across future native adapters
- Add async image parsing path using httpx.AsyncClient to avoid blocking
  the event loop in agenerate_image
- Add retry_after field to ProviderError for future retry engine support
- Fix _to_int_or_none to parse numeric strings from providers
- Create test conftest.py with shared mock_router/bridge_client fixtures
- Parametrize duplicate image generation and error mapping tests
- Add tests for exception wrapping across all bridge methods
@nabinchha
Contributor Author

Requesting changes for 2 small but meaningful things before we merge:

  1. In LiteLLMBridgeClient, raw router/LiteLLM exceptions can still leak out. At this layer, we should normalize those into ProviderError/ProviderErrorKind so callers only see the canonical error shape.
  2. There’s a small usage parsing edge case: if a provider returns token counts as strings (like "10"), _extract_usage can return a Usage object where token fields end up None. That makes usage is not None look like we captured usage when we didn’t. Can we either parse numeric strings in _to_int_or_none or return None when normalized usage is empty?

Everything else I left is optional cleanup/follow-up.

Thanks @andreatgretel! Addressed in ec5ed9b

@nabinchha nabinchha requested a review from andreatgretel March 4, 2026 16:56
…larity

- Parse RFC 7231 HTTP-date strings in Retry-After header (used by
  Azure and Anthropic during rate-limiting) in addition to numeric
  delay-seconds
- Clarify collect_non_none_optional_fields docstring explaining why
  f.default is None is the correct check for optional field forwarding
- Add tests for HTTP-date and garbage Retry-After values
- Fix misleading comment about prompt field defaults in _IMAGE_EXCLUDE
- Handle list-format detail arrays in _extract_structured_message for
  FastAPI/Pydantic validation errors
- Document scope boundary for vision content in collect_raw_image_candidates
Comment on lines +120 to +128
def parse_image_payload(raw_image: Any) -> ImagePayload | None:
    try:
        result = resolve_image_payload(raw_image)
        if isinstance(result, str):
            return ImagePayload(b64_data=load_image_url_to_base64(result), mime_type=None)
        return result
    except Exception:
        logger.debug("Unable to parse image payload from response object.", exc_info=True)
        return None


Silent image download failures produce empty response with no diagnostic

parse_image_payload (and its async twin aparse_image_payload) catch every exception from load_image_url_to_base64 and log it only at DEBUG level before returning None. Since parse_image_list filters out None values, a caller receives an ImageGenerationResponse.images = [] when all images were URL-format and every download failed — identical to the response for a provider that simply returned no images.

DALL-E 3 (and similar diffusion APIs) commonly return temporary presigned URLs when response_format="url" is requested. If the network download fails (transient 5xx, expired URL, DNS failure), the entire image result is silently lost. There is no WARNING/ERROR log, no raised exception, and no way for callers to distinguish "no images generated" from "images generated but download failed".

Consider logging at WARNING level so operators can diagnose missing images in production:

def parse_image_payload(raw_image: Any) -> ImagePayload | None:
    try:
        result = resolve_image_payload(raw_image)
        if isinstance(result, str):
            return ImagePayload(b64_data=load_image_url_to_base64(result), mime_type=None)
        return result
    except Exception:
        logger.warning("Failed to parse image payload from response object; image dropped.", exc_info=True)
        return None

Same change applies to aparse_image_payload at line 131.


andreatgretel
andreatgretel previously approved these changes Mar 5, 2026

@andreatgretel andreatgretel left a comment


looks good! all the original feedback was addressed. left two minor nits on the image generation exception scope and the total_tokens coercion order, neither blocking.

@nabinchha

looks good! all the original feedback was addressed. left two minor nits on the image generation exception scope and the total_tokens coercion order, neither blocking.

Thanks @andreatgretel! All should be resolved now.

@nabinchha nabinchha requested a review from andreatgretel March 5, 2026 19:30
@nabinchha nabinchha merged commit 68a7f71 into main Mar 5, 2026
47 checks passed