fix(oci): sanitize tool JSON Schema for Gemini Proto validator#283
Merged
Conversation
OCI's V1 OpenAI-compat transport (OCIChatCompletionsModel) serves every non-Cohere-R / non-DAC model including google.gemini-*. Pydantic emits type: ["string", "null"] for Optional[str] and type: "any" for Any, and Gemini's Proto schema validator rejects both with a hard 400: Unknown name 'type' at 'tools[0].function_declarations[N]...': Proto field is not repeating, cannot start list. The new sanitize_oci_tool_schema() helper mirrors OCIUtils.sanitize_schema from Oracle's own langchain-oci package - the validated reference pattern used in production against this endpoint. It collapses list types to a single string, remaps "any" -> "object", strips metadata keys (title / const / x-* / default:None), promotes const -> enum, and defaults missing items on arrays. Recurses correctly through properties / \$defs / definitions so user-defined field names survive. Wired into _sanitize_tools_for_oci() and applied unconditionally to tools[*].function.parameters in both complete() and stream() - matching langchain-oci's design; the stripped keys are pure metadata that non-Gemini vendors ignore. The earlier speculative response_format strip (vendor-gated additionalProperties / patternProperties / dependencies) was targeting a different OCI endpoint and has been removed; only the now-validated \$ref inline for Gemini response_format is preserved. Verified live against oci:google.gemini-2.5-flash in us-chicago-1: a tool with Optional[str], Optional[int], Any, and a nested array-of-objects (the deepest-nesting shape from the issue's error path) clears the validator and the call completes. 12 unit tests cover the helper + wiring; 2 live integration tests prove the end-to-end fix. Fixes #281 Signed-off-by: Federico Kamelhar <federico.kamelhar@oracle.com>
starlette 1.2.0 (2026-05-28) makes its TestClient require ``httpx2``
instead of ``httpx``. Locus pins ``httpx>=0.27,<1.0`` (because
``OCIRequestSigner`` and ``BearerAuth`` subclass the soon-removed
top-level ``httpx.Auth``), so the resolver can't satisfy starlette
1.2's TestClient transitively, and ``tests/unit/test_a2a_protocol.py``
+ ``tests/unit/test_server_app_full.py`` fail at collection time
with::
ModuleNotFoundError: No module named 'httpx2'
StarletteDeprecationWarning: Using ``httpx`` with
``starlette.testclient`` is deprecated; install ``httpx2``
instead.
Cap ``starlette<1.2`` alongside the existing ``httpx<1.0`` pin.
Lift both together when ``OCIRequestSigner`` / ``BearerAuth`` are
ported to the httpx 2.x auth API.
This is a drive-by fix bundled with #281 to unblock its CI; the
break is unrelated to the schema-sanitisation work in that PR — it
just landed on PyPI between main's last green CI (11:35Z) and
#281's first CI run (12:03Z).
Verified locally: 102/102 tests in the two affected files collect
and pass under starlette 1.0.0.
Signed-off-by: Federico Kamelhar <federico.kamelhar@oracle.com>
4 tasks
fede-kamel
added a commit
that referenced
this pull request
May 28, 2026
…t safety net (#284) Two Gemini-on-V1-transport fixes that landed since b22: * fix(agent): force summary call when assistant content is empty (closes #280) — #282 * fix(oci): sanitize tool JSON Schema for Gemini Proto validator (closes #281) — #283 Plus one drive-by ``starlette<1.2`` cap to unblock CI after upstream broke the existing ``httpx<1.0`` pin on 2026-05-28. No breaking changes. The schema sanitiser is applied unconditionally on the V1 (OpenAI-compat) transport, but the strip-list mirrors ``langchain_oci.common.utils.OCIUtils.sanitize_schema`` — the keys it removes are pure metadata that non-Gemini vendors ignore. The empty-content safety net costs at most one extra LLM call per agent run, and only when the model already returned an iteration with no tool calls AND no content. Uses ``auxiliary_model`` if set so operators can route the recovery call to a cheaper model. Signed-off-by: Federico Kamelhar <federico.kamelhar@oracle.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
OCI's V1 OpenAI-compat transport (
OCIChatCompletionsModel) serves every non-Cohere-R / non-DAC model on OCI, includinggoogle.gemini-*. Pydantic emitstype: ["string", "null"]forOptional[str]andtype: "any"forAny— both rejected by Gemini's Proto schema validator with a hard 400 before the model gets a turn:This PR ports the validated sanitisation pattern from Oracle's own
langchain-oci(OCIUtils.sanitize_schemainlibs/oci/langchain_oci/common/utils.py) into Locus, so every tool'sfunction.parametersschema is normalised before the V1 call leaves Locus.Fixes #281.
What changed (commit
88c4d79)New helper —
locus.core.structured.sanitize_oci_tool_schema():type: ["string", "null"]→ first non-null type (Gemini's Proto requires a single string).type: "any"→"object"(Gemini doesn't know"any").title,const,default: None, anyx-*namespace.const: "x"→enum: ["x"](OCI accepts enum, not const).items→{"type": "object"}whentype == "array".properties/$defs/definitionsas dicts of named sub-schemas, so user-defined field names (e.g. a field literally namedtitle) survive even though schema-leveltitleis stripped.Wiring —
OCIChatCompletionsModel._sanitize_tools_for_oci()is invoked from bothcomplete()andstream(), applied unconditionally totools[*].function.parameters. Matcheslangchain-oci's design — the stripped keys are pure metadata that non-Gemini vendors ignore.Removed — the earlier speculative
_requires_gemini_schema_strip+_munge_response_format_for_geminikeyword strip (vendor-gatedadditionalProperties/patternProperties/dependencies). It was targeting a different OCI endpoint and didn't match the observed bug shape. The now-validated$refinline for Geminiresponse_formatis preserved.Drive-by (commit
6ee95fc) —chore(deps): cap starlette <1.2The first CI run on this PR failed with
ModuleNotFoundError: No module named 'httpx2'intests/unit/test_a2a_protocol.pyandtests/unit/test_server_app_full.py. Neither file is touched by the #281 work.Root cause:
starlette 1.2.0was published on 2026-05-28 (betweenmain's last green CI at 11:35Z and this PR's first CI run at 12:03Z) and makes its TestClient requirehttpx2instead ofhttpx. Locus pinshttpx<1.0(becauseOCIRequestSignerandBearerAuthsubclass the soon-removed top-levelhttpx.Auth), so the resolver can't satisfy starlette 1.2's TestClient transitively.Capped
starlette<1.2alongside the existinghttpx<1.0pin; verified the 102 affected tests collect + pass under starlette 1.0.0. Lift both pins together when the auth classes are ported to httpx 2.x.If you'd prefer the dep pin in its own PR, happy to split — but bundling unblocks CI on this one in the same round-trip.
Why mirror langchain-oci
The
langchain-ocipackage ships Oracle's own LangChain integration against the same OCI Generative AI endpoint Locus's V1 transport hits. Itssanitize_schemais the reference pattern used in production by every Oracle-hosted LangChain agent. Re-deriving the sanitiser independently would just create a parallel set of mistakes; copying the proven pattern keeps the bug surface narrow.Tests
Unit (
tests/unit/test_oci_tool_schema_sanitize.py, 12 tests, all passing):test_optional_field_collapses_type_list_to_single_string— Pydantic-generatedOptional[str]schema clears the sanitiser.test_explicit_type_list_collapses_to_first_non_null— handcrafted["string","null"]/["null","number"]/ degenerate["null"].test_type_any_normalises_to_object.test_metadata_keys_stripped—title/const/x-*/default:Noneall gone;const: "search"promoted toenum.test_user_field_named_title_survives_recursion— defends theproperties/$defsrecursion seam.test_array_without_items_gets_default_items.test_does_not_mutate_input.test_nested_recursion_through_items_properties— the exact deepest-nesting shape from the issue's error path._sanitize_tools_for_oci: applies to each tool, returnsNone/[]for empty inputs, passes through non-function tools, doesn't mutate caller's list.Integration (
tests/integration/test_oci_gemini_tool_schema_live.py, gated byRUN_LIVE_OCI=1):test_gemini_accepts_optional_field_tool_schema— sends a tool withOptional[str],Optional[int],Any, and a nested array-of-objects to real Gemini 2.5 Flash; pre-fix raised 400, post-fix completes.test_gemini_accepts_clean_schema_unchanged— control: a tool with no Optional fields still works (sanitiser is idempotent on clean schemas).Both integration tests verified live against
oci:google.gemini-2.5-flashinus-chicago-1.Regression sweep: 149 OCI-related unit tests passing (
test_oci_*.py+test_tools_*.py), no breakage from removing the speculativeresponse_formatstrip.Test plan
uv run pytest tests/unit/test_oci_tool_schema_sanitize.py -v— 12/12RUN_LIVE_OCI=1 uv run pytest tests/integration/test_oci_gemini_tool_schema_live.py -v— 2/2 live against real Geminiuv run pytest tests/unit/test_oci_openai_compat*.py tests/unit/test_tools_*.py— 149/149, no regressionsuv run pytest tests/unit/test_a2a_protocol.py tests/unit/test_server_app_full.py— 102/102 with starlette<1.2 pin