[Feature] Enforce Agent output_schema for OpenAI-compatible / third-party BaseLlm providers (DeepSeek, NVIDIA NIM, etc.) #6021

@majsterkovic

🔴 Required Information

Is your feature request related to a specific problem?

Yes. In a multi-stage ADK SequentialAgent pipeline we rely on Agent(output_schema=PydanticModel) for mechanical JSON extraction (e.g. a StructuredResearch schema with nested lists, enums, optional fields).

This works reliably with native Google models (Gemini / Gemma via the Google API): ADK passes the schema to the API and the response is constrained at generation time.

It does not work reliably when the same Agent uses a custom BaseLlm backed by OpenAI-compatible third-party endpoints (we use DeepSeek and previously NVIDIA NIM):

  1. ADK forwards response_schema on LlmRequest.config, but there is no official ADK mapping from that schema to provider-specific strict structured-generation APIs.
  2. Our custom connector can only fall back to OpenAI response_format: {"type": "json_object"} — which guarantees some JSON object, not conformance to the Pydantic schema (field names, types, required fields, enums).
  3. Models frequently return invalid shapes anyway: prose prefixes, ```json fences, wrong keys (date instead of period), missing required fields (sentiment), nested dicts instead of lists — causing model_validate_json failures in the agent loop.
  4. Provider-specific hacks (e.g. NVIDIA NIM extra_body.nvext.guided_json) are outside ADK, brittle, and still failed in production (JSON wrapped in markdown despite guided_json).

Concrete production impact: we had to move a structuring stage from NVIDIA Llama 3.3 70B (NIM) back to Gemma via Gemini API solely because output_schema is effectively a first-class feature only for Google-native connectors. DeepSeek stages still need duplicated schema text in prompts + heavy post-normalization.

Example agent setup (works on Gemini, fragile on custom BaseLlm):

structuring = Agent(
    model=get_gemini_structuring(),  # Gemma — reliable output_schema
    name="StructuringAgent",
    output_schema=StructuredResearch,
    output_key="structured",
)

scoring = Agent(
    model=get_deepseek_connector(),  # custom BaseLlm — json_object only
    name="ScoringAgent",
    output_schema=CompanyResponse,
    output_key="company_json",
)

Describe the Solution You'd Like

We would like ADK to treat Agent.output_schema as a portable contract across model backends, not only Google API models.

Minimum viable solution:

  1. Document how output_schema / LlmRequest.config.response_schema should be honored by custom BaseLlm implementations.
  2. Provide an official or reference OpenAI-compatible connector that maps ADK schema → provider capabilities:
    • OpenAI / Azure: response_format: { type: "json_schema", json_schema: { name, schema, strict: true } }
    • NVIDIA NIM: extra_body.nvext.guided_json (or current NIM structured-generation API)
    • DeepSeek / other OpenAI-compat: best-effort strict mode where supported, clear fallback otherwise
  3. Capability flags on BaseLlm (e.g. supports_strict_json_schema, supports_json_object) so ADK can choose strategy and log when schema enforcement is degraded.
  4. Optional but valuable: a single ADK sanitization hook before Pydantic validation (strip markdown fences / prose wrappers) when providers return quasi-JSON — so every custom connector does not reimplement this.
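
A minimal sketch of how the capability flags in item 3 could drive strategy selection and degraded-enforcement logging. `LlmCapabilities` and `choose_structured_output_mode` are hypothetical names used for illustration, not existing ADK APIs; only the flag names come from the proposal above.

```python
import logging
from dataclasses import dataclass

logger = logging.getLogger("structured_output")


@dataclass(frozen=True)
class LlmCapabilities:
    """Hypothetical capability flags a connector could declare."""

    supports_strict_json_schema: bool = False
    supports_json_object: bool = False


def choose_structured_output_mode(caps: LlmCapabilities, model_id: str) -> str:
    """Pick the strongest enforcement mode the connector declares."""
    if caps.supports_strict_json_schema:
        return "strict_schema"
    mode = "json_object" if caps.supports_json_object else "none"
    # Enforcement is degraded: make that visible instead of failing silently.
    logger.warning(
        "output_schema enforcement degraded to %r for model %s", mode, model_id
    )
    return mode


# A connector that only offers json_object falls back and logs a warning.
print(choose_structured_output_mode(LlmCapabilities(supports_json_object=True), "deepseek-chat"))
```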

Ideal API shape (pseudo-code):

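from typing import Literal

from google.adk.models.base_llm import BaseLlm


class OpenAICompatibleLlm(BaseLlm):
    structured_output_mode: Literal["strict_schema", "json_object", "none"] = "strict_schema"

    async def generate_content_async(self, llm_request, stream=False):
        kwargs = {}  # keyword arguments for the provider's chat-completions call
        # adk_schema_to_json_schema is a proposed helper, not an existing ADK function;
        # assumed to return {"name": str, "schema": dict}, or None when no schema is set.
        schema = adk_schema_to_json_schema(llm_request.config.response_schema)
        if schema and self.structured_output_mode == "strict_schema":
            kwargs["response_format"] = {
                "type": "json_schema",
                "json_schema": {"name": schema["name"], "schema": schema["schema"], "strict": True},
            }
        elif schema and self.structured_output_mode == "json_object":
            kwargs["response_format"] = {"type": "json_object"}
        ...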
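
To make the strict_schema branch concrete, here is a hedged sketch of building the response_format payload from a plain JSON Schema dict. It assumes OpenAI's documented strict-mode constraints (every object closed with additionalProperties: false and every property listed in required). `to_strict_response_format` is a hypothetical helper, and the enum values are illustrative only.

```python
import copy
import json


def _close_objects(node) -> None:
    """Recursively close every object schema and mark all its properties required."""
    if isinstance(node, dict):
        if node.get("type") == "object" and "properties" in node:
            node["additionalProperties"] = False
            node["required"] = list(node["properties"])
        for value in node.values():
            _close_objects(value)
    elif isinstance(node, list):
        for item in node:
            _close_objects(item)


def to_strict_response_format(name: str, schema: dict) -> dict:
    """Wrap a JSON Schema in an OpenAI-style strict response_format payload."""
    strict_schema = copy.deepcopy(schema)  # never mutate the caller's schema
    _close_objects(strict_schema)
    return {
        "type": "json_schema",
        "json_schema": {"name": name, "schema": strict_schema, "strict": True},
    }


# Illustrative schema using two field names mentioned in this issue.
schema = {
    "type": "object",
    "properties": {
        "period": {"type": "string"},
        "sentiment": {"type": "string", "enum": ["positive", "neutral", "negative"]},
    },
}
print(json.dumps(to_strict_response_format("CompanyResponse", schema), indent=2))
```

Note that marking every property as required changes the meaning of optional fields; under this assumption they would need to be expressed as nullable types instead.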

For custom providers (NVIDIA NIM), ADK could expose an extension point:

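def apply_structured_output(provider: str, schema: dict, request_kwargs: dict) -> dict:
    if provider == "nvidia_nim":
        # Merge into any existing nvext options instead of overwriting them.
        request_kwargs.setdefault("extra_body", {}).setdefault("nvext", {})["guided_json"] = schema
    ...
    return request_kwargs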
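
One way such an extension point could be made pluggable is a registry keyed by API host. The sketch below is an assumption about shape only: `register_adapter` and `apply_structured_output_for_host` are hypothetical names, and unknown hosts (including DeepSeek here) fall back to json_object.

```python
from typing import Callable
from urllib.parse import urlparse

# An adapter takes (schema_name, json_schema, request_kwargs) and returns new kwargs.
Adapter = Callable[[str, dict, dict], dict]

_ADAPTERS: dict[str, Adapter] = {}


def register_adapter(host: str) -> Callable[[Adapter], Adapter]:
    """Register a structured-output adapter for one API host."""
    def decorator(fn: Adapter) -> Adapter:
        _ADAPTERS[host] = fn
        return fn
    return decorator


@register_adapter("api.openai.com")
def _openai_strict(name: str, schema: dict, kwargs: dict) -> dict:
    response_format = {
        "type": "json_schema",
        "json_schema": {"name": name, "schema": schema, "strict": True},
    }
    return {**kwargs, "response_format": response_format}


@register_adapter("integrate.api.nvidia.com")
def _nim_guided_json(name: str, schema: dict, kwargs: dict) -> dict:
    # Copy nested dicts so the caller's kwargs are left untouched.
    extra_body = dict(kwargs.get("extra_body", {}))
    extra_body["nvext"] = {**extra_body.get("nvext", {}), "guided_json": schema}
    return {**kwargs, "extra_body": extra_body}


def _json_object_fallback(name: str, schema: dict, kwargs: dict) -> dict:
    return {**kwargs, "response_format": {"type": "json_object"}}


def apply_structured_output_for_host(base_url: str, name: str, schema: dict, kwargs: dict) -> dict:
    """Dispatch on the host of base_url; unknown hosts get the json_object fallback."""
    host = urlparse(base_url).hostname or ""
    return _ADAPTERS.get(host, _json_object_fallback)(name, schema, kwargs)
```

Dispatching on the host rather than a provider string keeps the lookup tied to the `base_url` that an `openai.AsyncOpenAI`-based connector already has.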

Impact on your work

  • High impact on production multi-agent pipelines that mix Google models (search/tools) with cheaper external models (reasoning/scoring).
  • Without this, teams must either:
    • (a) duplicate schemas in prompts and maintain custom sanitizers/normalizers per provider, or
    • (b) route all structured stages to Google models, increasing cost/quota pressure on GEMINI_API_KEY.

Timeline: not a hard deadline, but this is a practical blocker for using ADK as a provider-agnostic orchestration layer.

Willingness to contribute

Yes — we can contribute a reference BaseLlm implementation and regression tests (Pydantic schema + OpenAI-compat provider matrix) if the ADK team defines the intended extension interface.


🟡 Recommended Information

Describe Alternatives You've Considered

| Alternative | Why insufficient |
| --- | --- |
| Prompt-only schema (embed model_json_schema() in instructions) | Models still drift; duplicates schema; does not enforce at decode time. |
| response_format: json_object only | Returns any JSON object; frequent field/type mismatches; requires heavy normalize_llm_output() in application code. |
| NVIDIA guided_json in custom connector | Provider-specific; outside ADK; still returned prose + fenced JSON in our CI/production runs. |
| LiteLLM proxy | We migrated away from LiteLLM to direct ADK connectors; proxy adds another failure layer and does not standardize ADK output_schema semantics. |
| Use Gemma/Gemini for all structured stages | Works today, but defeats purpose of using DeepSeek/NIM for cost/latency; concentrates quota on one API key. |
| Post-hoc JSON extraction + Pydantic repair | Fragile, hard to test, masks model quality issues; every team reimplements the same sanitizers. |

Proposed API / Implementation

  1. BaseLlm structured output contract

    • When Agent(output_schema=Model) is set, ADK always populates LlmRequest.config.response_schema consistently (already happens).
    • Add documented obligation: connectors should use strict schema enforcement when possible.
  2. Reference connector: OpenAICompatibleLlm in ADK (or contrib/)

    • Map Pydantic / JSON Schema → OpenAI Structured Outputs (json_schema + strict: true).
    • Pluggable StructuredOutputAdapter per host (api.openai.com, api.deepseek.com, integrate.api.nvidia.com).
  3. Validation pipeline

    Agent.output_schema
      → LlmRequest.config.response_schema
      → Provider adapter (strict / guided_json / json_object)
      → Optional ADK sanitize_json_response()
      → Pydantic model_validate_json (existing ADK behavior)
    
  4. Developer-visible telemetry

    • Log when falling back from strict_schema → json_object → none, with model id and provider.
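
The optional sanitize_json_response() step in the pipeline above could look roughly like this. It is a sketch under assumptions, not ADK behavior: the function body is hypothetical, `json.loads` stands in for Pydantic's model_validate_json, and the sample values are illustrative.

```python
import json
import re

# Matches a fenced block, with or without a "json" language tag.
_FENCE_RE = re.compile(r"```(?:json)?\s*(.*?)```", re.DOTALL | re.IGNORECASE)


def sanitize_json_response(text: str) -> str:
    """Strip markdown fences and prose wrappers around a single JSON object."""
    fenced = _FENCE_RE.search(text)
    if fenced:
        text = fenced.group(1)
    # Keep only the outermost {...} span to drop any remaining prose.
    start, end = text.find("{"), text.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in model response")
    return text[start : end + 1]


raw = 'Oto wynik analizy...\n```json\n{"period": "2024-Q4", "sentiment": "neutral"}\n```'

try:
    json.loads(raw)  # the raw response starts with prose, so parsing fails
except json.JSONDecodeError as exc:
    print(f"raw response rejected at column {exc.colno}")

# After sanitization the same payload parses; schema validation would follow here.
print(json.loads(sanitize_json_response(raw)))
```

A hook like this only repairs wrapping; it does not fix wrong keys or missing fields, which still surface at schema validation.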

Additional Context

Environment

  • ADK Python, google.adk.agents.llm_agent.Agent, google.adk.models.base_llm.BaseLlm
  • Providers tested: Gemini API (gemini-2.5-flash, gemma-4-31b-it), DeepSeek (deepseek-chat, deepseek-reasoner), NVIDIA NIM (meta/llama-3.3-70b-instruct)
  • Pattern: custom OpenAICompatibleLlm subclassing BaseLlm, using openai.AsyncOpenAI with custom base_url

Observed failure modes (NVIDIA Llama + custom connector, pre-migration)

  • Response: "Oto wynik analizy...\n```json\n{...}\n```" (a Polish prose prefix, "Here is the analysis result...", followed by fenced JSON)
  • ADK validates raw string → JSONDecodeError at column 1
  • Even with guided_json, field aliases (date vs period) and missing enum fields required app-side normalizers
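
For illustration, an app-side normalizer of the kind described above might look like the following. The function name echoes our normalize_llm_output(), but this body is a simplified, hypothetical sketch; the alias map and required-field set are illustrative, not our production code.

```python
REQUIRED_FIELDS = {"period", "sentiment"}  # illustrative subset of a schema
FIELD_ALIASES = {"date": "period"}  # wrong key emitted by the model -> schema key


def normalize_llm_output(payload: dict) -> dict:
    """Rename known aliases and fail loudly on required fields that are still missing."""
    normalized = {}
    for key, value in payload.items():
        canonical = FIELD_ALIASES.get(key, key)
        # If the model emitted both spellings, the canonical key wins.
        if canonical in normalized and key != canonical:
            continue
        normalized[canonical] = value
    missing = sorted(REQUIRED_FIELDS - set(normalized))
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    return normalized


print(normalize_llm_output({"date": "2024-Q4", "sentiment": "neutral"}))
```

Every alias added to such a map is a schema mismatch that generation-time enforcement would have prevented, which is why we would rather not maintain it per provider.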

What works today (Google-native only)

  • Agent(output_schema=StructuredResearch) + Gemini(model="gemma-4-31b-it") → stable structured extraction in E2E CI (149 tests, live pipeline)

What remains fragile (DeepSeek)

  • Agent(output_schema=CompanyResponse) + custom BaseLlm → relies on json_object + prompt schema duplication + Pydantic validators

Happy to share a minimal reproduction repository / test case if useful.

Labels: models ([Component] Issues related to model support)