fix(proxy): pre-validate strict function tool schemas by Soju06 · Pull Request #658 · Soju06/codex-lb

Soju06 · 2026-05-15T13:17:10Z

Summary

Strict-mode schema violations on a function tool (e.g. tools[i].function.parameters with strict: true but missing additionalProperties: false) currently reach the upstream Codex backend, which closes the WebSocket with close_code=1000. codex-lb surfaces this as a generic 502 server_error / upstream_rejected_input. Real OpenAI returns a deterministic 400 invalid_function_parameters for the same payload (OpenAI Structured Outputs docs, github/github-mcp-server#376).

This PR fixes two related bugs around strict function tools:

/v1/responses returns 5xx where real OpenAI returns 4xx. Well-behaved retry/failover loops classify 5xx as transient and retry into a flood for a permanently-broken request. The skill notes this exact pattern: "the 502-vs-400 quirk is particularly nasty because well-behaved retry loops classify 5xx as transient." enforce_strict_text_format already pre-validates text.format.json_schema; function tools were the only un-validated strict surface left.
/v1/chat/completions silently masks the same violation. _normalize_chat_tools builds the responses-side tool item by explicit field selection but never copies the strict field, so chat-completions clients with violating schemas appear to succeed (200 + tool_calls). This both (a) diverges from real OpenAI and /v1/responses, and (b) lets schema bugs hide indefinitely from the client.

Root Cause

Endpoint	Same invalid strict tool payload	codex-lb (pre-fix)	Real OpenAI
`/v1/chat/completions`	`_normalize_chat_tools` drops `strict` → upstream gets schema without strict mode	200 + `tool_calls` (strict silently lost)	400 `invalid_function_parameters`
`/v1/responses`	strict pass-through → upstream rejects (`close_code=1000`)	502 `upstream_rejected_input`	400 `invalid_function_parameters`

Both paths go through to_responses_request() → enforce_strict_* pre-validation. The fix is local and symmetric:

Add validate_strict_function_tool_schema (mirrors the existing validate_strict_json_schema, distinct envelope code=invalid_function_parameters).
Add enforce_strict_function_tools_format(request, *, param_template=...) and call it in both endpoint handlers + normalize_responses_request_payload. The chat handler passes the chat-style param_template="tools[{index}].function.parameters"; everyone else uses the default tools[{index}].parameters.
Stop dropping strict on chat function tools — preserve it so the responses-side enforcement applies via the shared coercion pipeline.

Changes

app/core/openai/strict_schema.py — new validate_strict_function_tool_schema() re-using the shared _find_violation walker.
app/modules/proxy/request_policy.py — new enforce_strict_function_tools_format(); wired into normalize_responses_request_payload().
app/modules/proxy/api.py — call enforce_strict_function_tools_format() in both v1_responses and v1_chat_completions handlers (the chat handler passes the chat-style param template).
app/core/openai/chat_requests.py — _normalize_chat_tools() now preserves the function-level strict boolean.
scripts/verify_v1_responses_openai_sdk.py — case_tool_call_streaming now uses a spec-compliant strict schema (additionalProperties: false). Suite is now deterministic 5/5.

Spec

openspec/changes/validate-strict-function-tool-schema/specs/responses-api-compat/spec.md — ADDED Requirement: Strict function tool parameter schemas are pre-validated. Five scenarios covering missing additionalProperties, additionalProperties:true, missing required, compliant strict schema, and strict:false/omitted skip.
openspec/changes/validate-strict-function-tool-schema/specs/chat-completions-compat/spec.md — ADDED Requirement: _normalize_chat_tools preserves the function tool strict flag. Four scenarios covering compliant strict 200, violating strict 400, strict=false preservation, and built-in tool unaffected.

Tests

16 new unit cases in tests/unit/test_strict_schema_validation.py:
- Validator (6): missing additionalProperties, additionalProperties=true, missing required, valid schema, nested violation, anonymous-name fallback.
- Enforce hook (6): rejection envelope (code/param/type/message), correct index reporting, strict=false skipped, omitted strict skipped, compliant strict accepted, chat-style param template.
- Chat coercion (4): strict=true preserved, strict=true violating pre-validates, strict=false preserved, omitted strict has no key.
2 new integration cases in tests/integration/test_openai_compat_features.py:
- test_v1_chat_completions_rejects_strict_function_tool_violation (asserts tools[0].function.parameters).
- test_v1_responses_rejects_strict_function_tool_violation (asserts tools[0].parameters).

Validation

✅ openspec validate validate-strict-function-tool-schema → "is valid"

✅ Targeted sweep (per codex-lb-development skill §5): 241 passed

pytest tests/unit/test_strict_schema_validation.py tests/unit/test_chat_request_mapping.py \
  tests/unit/test_openai_requests.py tests/integration/test_proxy_chat_completions.py \
  tests/integration/test_proxy_responses.py tests/integration/test_openai_client_compat.py \
  tests/integration/test_openai_compat_features.py

✅ SDK-level e2e (skill §5): 20 passed (tests/e2e/test_openai_sdk_compat.py tests/e2e/test_v1_responses_openai_sdk.py)
✅ uvx ruff check . && uvx ruff format --check . && uv run ty check: all green

✅ Live four-cell matrix on codex-lb-dev-server with a real account: 8/8 PASS

Endpoint	Cell	strict	additionalProperties	Result
`/v1/chat/completions`	A	true	missing	✓ 400 `invalid_function_parameters`, `tools[0].function.parameters`
`/v1/chat/completions`	B	true	`false`	✓ 200
`/v1/chat/completions`	C	false	missing	✓ 200
`/v1/chat/completions`	D	omitted	missing	✓ 200
`/v1/responses`	A	true	missing	✓ 400 `invalid_function_parameters`, `tools[0].parameters`
`/v1/responses`	B	true	`false`	✓ 200
`/v1/responses`	C	false	missing	✓ 200
`/v1/responses`	D	omitted	missing	✓ 200

✅ scripts/verify_v1_responses_openai_sdk.py against the dev container: 5/5 PASS (was 4/5 pre-fix because the script itself sent a spec-violating payload).

Notes

Not a regression — 1.17.0 and earlier had the same behavior. This is the inverse of the 502→400 fix surface that fix(proxy): make /v1/responses streams parseable by the OpenAI SDK #639 / mask-single-http-previous-response-miss / treat-upstream-overloaded-as-retryable already addressed for other code paths; function tools were just the last un-validated strict surface.
tool_choice, tool-type registry, and built-in tools (web_search, image_generation, …) are unchanged. Only function tools with strict:true and a non-None parameters are touched.
The param_template parameter keeps the error envelope's param field aligned with whatever shape the client originally sent, mirroring real OpenAI's behavior across the two endpoints.

Strict-mode JSON schema violations on a function tool (e.g. `strict:true` with no `additionalProperties:false`) reached the upstream Codex backend, which closed the WebSocket with `close_code=1000` and surfaced as a generic `502 upstream_rejected_input`. Real OpenAI returns deterministic `400 invalid_function_parameters` for the same payload — and a 5xx on a permanently-broken request triggers retry/failover loops in clients. Two parts: 1. Add `validate_strict_function_tool_schema` + `enforce_strict_function_tools_format` mirroring the existing `text.format.json_schema` strict pre-validator. Wired into both `/v1/responses` and `/v1/chat/completions` handlers plus `normalize_responses_request_payload` for the WS path. Param shape matches the source endpoint: native responses callers see `tools[i].parameters`; chat-completions callers see `tools[i].function.parameters`. 2. Stop dropping the function-tool `strict` field in `_normalize_chat_tools`. Previously chat-completions silently masked strict tool schema violations (200 + tool_calls), diverging from both real OpenAI and `/v1/responses`. With the flag preserved, the chat path enters the same strict pre-validation pipeline. Side note: `scripts/verify_v1_responses_openai_sdk.py::case_tool_call_streaming` was sending a strict tool schema without `additionalProperties:false`, matching the new 400. Fix the script to use a spec-compliant strict schema — verify suite is now deterministic 5/5. Tests: 16 new unit cases in `test_strict_schema_validation.py` (validator + enforce + chat coercion preservation) and 2 new integration cases covering both endpoints' 400 envelope shape. Validation: - openspec validate validate-strict-function-tool-schema → "is valid" - targeted sweep: 241 passed - SDK e2e: 20 passed - uvx ruff check / ruff format --check / uv run ty check: all green - live four-cell matrix on `codex-lb-dev-server` with a real account: 8/8 PASS (A: 400 + correct code/param both endpoints; B/C/D: 200) - verify_v1_responses_openai_sdk.py vs the dev container: 5/5 PASS (was 4/5 due to the script's own spec-violating payload)

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f7ebcfb198

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-05-15T13:21:12Z

+    """
+    if not request.tools:
+        return
+    for index, tool in enumerate(request.tools):


Use original tool index in chat strict-schema errors

enforce_strict_function_tools_format builds param from the index in request.tools, but for chat-completions that list has already been normalized by _normalize_chat_tools, which can drop entries (for example function tools with missing/empty names). In that case a strict-schema violation is reported against tools[i].function.parameters with a shifted i, so clients receive a deterministic 400 that points to the wrong input element and cannot reliably map the error back to their original payload.

Useful? React with 👍 / 👎.

Codex review on PR #658 flagged that `enforce_strict_function_tools_format` on the chat-completions path walked `responses_payload.tools` — which is the output of `_normalize_chat_tools`, a normalizer that silently drops invalid entries (non-dict tools, function tools with missing/empty `name`). When the normalizer dropped any earlier entry, the rejected `param` ("tools[i].function.parameters") referenced an index that no longer matched the client's inbound payload. Refactor: - Switch `enforce_strict_function_tools_format` to take a raw `tools: list[JsonValue]` plus a `nested: bool` flag. `nested=False` reads parameters/strict from the flat /v1/responses shape; `nested=True` reads them from the chat-completions `tool["function"][...]` shape. - Chat handler now validates the *raw* `payload.tools` list before `to_responses_request()` runs, so the reported index always lines up with what the client sent. Compliant payloads continue through the existing coercion path unchanged. - /v1/responses handler and `normalize_responses_request_payload` are updated to pass `responses.tools` directly (flat shape, default template). Regression coverage: - `tests/unit/test_strict_schema_validation.py`: - `test_chat_strict_violation_param_uses_original_index_when_normalizer_drops` - `test_chat_strict_enforcement_runs_against_raw_payload_in_handler_path` - existing chat coercion tests adjusted to call the new signature. - `tests/integration/test_openai_compat_features.py`: - `test_v1_chat_completions_strict_violation_param_uses_original_index` posts [dropped, violating] and asserts `param == "tools[1].function.parameters"`. Validation: - ruff check, ruff format --check, ty check: clean - pytest tests/unit tests/integration: 2180 passed - live docker /v1/chat/completions + /v1/responses 9-cell matrix (4 original + 1 shifted-index regression + 4 responses): 9/9 PASS - verify_v1_responses_openai_sdk.py: 5/5 PASS Refs: PR #658 codex review feedback.

Soju06 · 2026-05-16T00:03:16Z

Addressed codex P2 feedback in 48d7889.

Root cause — enforce_strict_function_tools_format was walking responses_payload.tools (the output of _normalize_chat_tools), which silently drops invalid entries (non-dict tools, function tools with missing/empty name). When an earlier entry was dropped, the reported param ("tools[i].function.parameters") referenced a shifted i that no longer matched the client's inbound payload.

Fix — refactored the helper to take a raw tools: list[JsonValue] plus nested: bool instead of a ResponsesRequest. The chat handler now validates the raw payload.tools list before to_responses_request() runs _normalize_chat_tools, so the reported index always lines up with the inbound request. The /v1/responses handler is unaffected behaviorally (passes responses.tools directly).

Regression coverage

Unit:
- test_chat_strict_violation_param_uses_original_index_when_normalizer_drops — payload [dropped@0, violation@1], asserts param == 'tools[1].function.parameters'.
- test_chat_strict_enforcement_runs_against_raw_payload_in_handler_path — confirms compliant payload validates pre-coercion.
Integration: test_v1_chat_completions_strict_violation_param_uses_original_index exercises the same scenario through the FastAPI handler.

Validation

ruff / ty: clean
pytest tests/unit tests/integration: 2180 passed
Live docker matrix on codex-lb-dev-server with a real account (Pro): 9/9 PASS (4 chat × strict combos + new shifted-index regression cell + 4 responses × strict combos)
scripts/verify_v1_responses_openai_sdk.py: 5/5 PASS

Soju06 · 2026-05-16T01:29:35Z

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 48d7889758

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-05-16T01:33:38Z

+        if tool.get("type") != "function":
+            continue


Treat missing tool type as function during strict validation

The chat pre-validator now skips any tool whose type is not exactly "function", but _normalize_chat_tools still coerces nested function tools to "function" when type is omitted ("type": tool_type or "function" in app/core/openai/chat_requests.py). With the new flow (validate raw chat tools before normalization), a payload like { "function": {..., "strict": true, "parameters": <invalid schema>} } bypasses pre-validation, gets normalized into a strict function tool, and can still fail upstream as a 5xx instead of the intended local 400 invalid_function_parameters.

Useful? React with 👍 / 👎.

Second codex review pass on PR #658 (commit 48d7889) flagged that the chat-side pre-validator's `tool.get("type") != "function"` guard is out of sync with `_normalize_chat_tools`. The normalizer coerces any tool whose `function` value is a dict into a function tool — including ones with no top-level `type` (`"type": tool_type or "function"` in chat_requests.py:198). That meant a payload like {"function": {"strict": true, "parameters": <invalid>}} was skipped by pre-validation, normalized into a strict function tool, sent upstream, and rejected with a 5xx instead of the intended local 400 `invalid_function_parameters`. Fix: in the `nested=True` branch of `enforce_strict_function_tools_format`, anchor detection on the presence of a `tool["function"]` dict — exactly the rule `_normalize_chat_tools` uses. `type" check stays in the flat (`nested=False`) branch for the /v1/responses path, where the request model already requires an explicit type. Regression coverage: - Unit (tests/unit/test_strict_schema_validation.py): - test_chat_strict_violation_when_type_omitted_but_function_dict_present - test_chat_strict_skips_tool_without_function_wrapper_even_with_type - test_responses_path_still_requires_type_function - Integration (tests/integration/test_openai_compat_features.py): - test_v1_chat_completions_strict_violation_when_type_omitted_in_chat_tool Validation: - ruff / format / ty: clean - pytest tests/unit tests/integration: 2184 passed - Live docker matrix on codex-lb-dev-server with real account: 12/12 PASS — A-H chat (incl. F=type-omitted strict violation -> 400, G=type-omitted compliant -> 200, H=type-omitted strict=false -> 200) + A-D responses - verify_v1_responses_openai_sdk.py: 5/5 PASS Refs: PR #658 codex review pass 2.

Soju06 · 2026-05-16T01:44:59Z

Addressed second codex review pass (P2 on request_policy.py:280) in 008d502.

Root cause — chat pre-validator's tool.get("type") != "function" guard was out of sync with _normalize_chat_tools, which coerces any tool with "function": {...} into a function tool ("type": tool_type or "function" at chat_requests.py:198). A payload like {"function": {"strict": true, "parameters": <invalid>}} (no top-level type) was skipped by pre-validation, normalized into a strict function tool, sent upstream, and surfaced as a misleading 5xx instead of the local 400.

Fix — in the nested=True branch of enforce_strict_function_tools_format, anchor function-tool detection on the presence of the tool["function"] dict (matching the normalizer's rule). The flat /v1/responses branch (nested=False) keeps requiring an explicit type == "function" since the request model already enforces it.

Regression coverage (3 unit + 1 integration):

test_chat_strict_violation_when_type_omitted_but_function_dict_present — exact payload codex described, asserts 400 with param=tools[0].function.parameters.
test_chat_strict_skips_tool_without_function_wrapper_even_with_type — symmetric guardrail.
test_responses_path_still_requires_type_function — defensive check that flat-shape behavior didn't shift.
test_v1_chat_completions_strict_violation_when_type_omitted_in_chat_tool — same scenario via FastAPI handler.

Validation

ruff / format / ty: clean
pytest tests/unit tests/integration: 2184 passed
Live docker matrix (A–H chat + A–D responses): 12/12 PASS, including the new F (type-omitted nested strict violation → 400), G (type-omitted compliant → 200), and H (type-omitted strict=false → 200).
verify_v1_responses_openai_sdk.py: 5/5 PASS

@codex review

chatgpt-codex-connector · 2026-05-16T01:47:48Z

Codex Review: Didn't find any major issues. 🚀

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Release 1.18.0 — v1.17.0 → main, 27 PR. ## Pre-merge verification (release-gate e2e) - ✅ CI 18/18 (mergeStateStatus CLEAN) - ✅ baseline release-gate-smoke: 10/10 (`v1.models`, chat non-stream/stream/tool_call, responses non-stream/stream, unknown-msg-key drop, transient codes, dashboard 72-byte, alembic index) - ✅ per-PR e2e extra: 14/14 (#658 strict tool × chat valid/invalid/type-omitted + responses invalid, #650-652 trim replay markers, #516 prev-resp miss mask, #558 prolite plan, #600 token_expired, #635 device poll, #647 frontend 72B, #594 account selection, #640 alembic head, #653/#654 conv archive, #655/#656 weekly pace) - ✅ pytest 562 passed / 3 PG-only skip (unit + integration suite) - ✅ SDK e2e 20/20 (`tests/e2e/test_openai_sdk_compat.py` + `tests/e2e/test_v1_responses_openai_sdk.py`) - ✅ `scripts/verify_v1_responses_openai_sdk.py` 5/5 against live container with real account - ✅ ruff check + ruff format --check + uv run ty check — All clean - ✅ prod parity probe: #637 (unknown-key drop) + #658 (strict tool 400) both confirmed fixed vs prod 1.17.0 ## Behavior change to note in announcements - **#658**: `strict: true` function tool schemas missing `additionalProperties: false` (or other strict-mode requirements) now return a local **400 `invalid_function_parameters`** with `param=tools[N].function.parameters` (chat) / `tools[N].parameters` (responses). Pre-1.18.0 these were silently forwarded upstream; 1.18.0 enforces OpenAI's documented strict-mode contract locally before forwarding. OpenSpec changes shipped in this release (to be archived after tag publish): - deactivate-on-token-expired (#600) - drop-unknown-message-fields (#637) - mask-single-http-previous-response-miss (#516) - normalize-v1-responses-openai-sdk-stream (#639) - start-device-oauth-polling (#635) - support-prolite-plan-capacity (#558) - treat-upstream-overloaded-as-retryable (#601) - validate-password-length (#598) - validate-strict-function-tool-schema (#658)

Archive changes shipped in v1.18.0 and v1.18.1 per AGENTS.md "release 이후 archive" workflow. All 12 changes verified shipped in production at 10.0.0.113 (image 1.18.1, sha256:0431a474432d). 1.18.0 (9 changes): - deactivate-on-token-expired (#600) - drop-unknown-message-fields (#637) - mask-single-http-previous-response-miss (#516) - normalize-v1-responses-openai-sdk-stream (#639) - start-device-oauth-polling (#635) [delta header fixed: MODIFIED -> ADDED] - support-prolite-plan-capacity (#558) [delta header fixed: MODIFIED -> ADDED] - treat-upstream-overloaded-as-retryable (#601) - validate-password-length (#598) - validate-strict-function-tool-schema (#658) 1.18.1 (3 changes): - harden-long-codex-websocket-turns (#586) - suppress-duplicate-side-effect-tool-calls (#586) - hotfix-db-pool-pre-ping-and-firewall-cache-ttl (#679) Specs updated: 9 capabilities (admin-auth, api-firewall, chat-completions-compat, chat-completions-extensions, database-backends, frontend-architecture, responses-api-compat, usage-refresh-policy). openspec validate --specs: 19/19 passed. Co-authored-by: Hermes Agent <hermes-agent@nous.ai>

chatgpt-codex-connector Bot reviewed May 15, 2026

View reviewed changes

Komzpa mentioned this pull request May 15, 2026

Tested live aggregate stack and suggested merge order #554

Closed

chatgpt-codex-connector Bot reviewed May 16, 2026

View reviewed changes

Komzpa added the 🤖 codex: ok [@codex review] says no issues found. label May 16, 2026

Soju06 merged commit 0998cac into main May 17, 2026
18 checks passed

Soju06 deleted the fix/validate-strict-function-tool-schema branch May 17, 2026 04:56

Soju06 mentioned this pull request May 16, 2026

chore(main): release 1.18.0 #599

Merged

Soju06 mentioned this pull request May 17, 2026

chore(openspec): archive 12 shipped changes from 1.18.0 + 1.18.1 #682

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(proxy): pre-validate strict function tool schemas#658

fix(proxy): pre-validate strict function tool schemas#658
Soju06 merged 3 commits into
mainfrom
fix/validate-strict-function-tool-schema

Soju06 commented May 15, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

chatgpt-codex-connector Bot May 15, 2026

Uh oh!

Soju06 commented May 16, 2026

Uh oh!

Soju06 commented May 16, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

chatgpt-codex-connector Bot May 16, 2026

Uh oh!

Soju06 commented May 16, 2026

Uh oh!

chatgpt-codex-connector Bot commented May 16, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

Soju06 commented May 15, 2026

Summary

Root Cause

Changes

Spec

Tests

Validation

Notes

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot May 15, 2026

Choose a reason for hiding this comment

Uh oh!

Soju06 commented May 16, 2026

Uh oh!

Soju06 commented May 16, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot May 16, 2026

Choose a reason for hiding this comment

Uh oh!

Soju06 commented May 16, 2026

Uh oh!

chatgpt-codex-connector Bot commented May 16, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants