fix(codex): make codex provider usable from Claude Code CLI (#16, #51)#59

Merged
CaddyGlow merged 3 commits into main from fix/issue-51-codex-metadata-strip
Apr 13, 2026

Conversation

@CaddyGlow
Owner

Summary

Make the /codex provider actually usable from the Claude Code CLI. Closes #51 and addresses the feature request in #16 ("use codex credential in claude code").

When ANTHROPIC_BASE_URL points Claude Code at ccproxy's /codex endpoint, several independent bugs combined to break the flow end-to-end: the very first request failed with HTTP 400, and even after that, multi-turn conversations with tool calls were silently dropped or mangled on the way to the Codex Responses API. This PR fixes all of them.

What was broken

  1. metadata rejected by upstream (use in claude code #51): the Anthropic -> Responses converter copied Anthropic's metadata.user_id into the Responses body as metadata, which chatgpt.com/backend-api/codex/responses rejects with 400 {"detail":"Unsupported parameter: metadata"}. Every Claude Code -> codex request failed.
  2. {"detail": "..."} errors became 502s: convert_anthropic_to_openai_error raised ValidationError on Codex's FastAPI-style error shape, so upstream 400s were upgraded to 502s in the non-streaming error path.
  3. Only the last user message was sent: _build_responses_payload_from_anthropic_request dropped the entire conversation history and emitted just the last user text as input. Multi-turn chats effectively started from scratch each turn.
  4. Tool-use cycles didn't round-trip: assistant tool_use and user tool_result blocks were never translated into Responses API function_call / function_call_output items, so tool usage from Claude Code broke as soon as a tool was involved.
  5. Interleaved assistant text/tool_use ordering was lost: when the builder was extended to emit tool items, assistant text was collapsed into one message before all function_calls, which the Responses API treats as a different conversation shape.
  6. Long call_ids exceeded OpenAI's 64-char limit: some upstream ids are longer than 64 chars, causing the Responses API to reject function_call/function_call_output pairs.
  7. LegacyCustomTool was silently dropped: the custom-tool mapping only accepted Tool, so Claude Code's legacy tool shape was skipped.
  8. tool_use streaming event violated Anthropic spec: content_block_start for a tool_use was emitted with the full input attached. Per the Anthropic streaming spec, input must start as {} and the arguments JSON is streamed via input_json_delta.partial_json. Official Anthropic SDKs ignore the inline input, so downstream consumers never saw tool arguments.
  9. OpenAI tool_call continuation chunks were non-spec: continuation chunks re-emitted id, type, and function.name. Per the OpenAI Chat streaming spec, those fields only appear on the first chunk for a given tool call. Some strict consumers rejected the chunks.
  10. Delta accumulator concatenated identity fields: the Codex Responses -> Chat adapter re-sends id/type/name/call_id on every chunk. The generic string-concat branch in accumulate_delta merged them into "shellshell..." / "fc_abc_fc_abc...", breaking ChatCompletionChunk validation.

What this PR does

Codex adapter / error handling (#51)

  • ccproxy/plugins/codex/adapter.py: add "metadata" to the unsupported-key strip list in _sanitize_provider_body, alongside max_tokens/temperature/etc. Validated live: outgoing body no longer contains metadata, upstream returns 200.
  • ccproxy/services/adapters/simple_converters.py: convert_anthropic_to_openai_error now coerces non-Anthropic error shapes (FastAPI {"detail": "..."}, arbitrary dicts) into a minimal Anthropic ErrorResponse envelope instead of raising ValidationError.
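The two fixes can be sketched as follows (illustrative only: UNSUPPORTED_KEYS here is a subset of the real strip list, and the error envelope is shown as a plain dict rather than ccproxy's ErrorResponse model):

```python
# Keys the Codex Responses upstream rejects with HTTP 400 (subset, per this PR).
UNSUPPORTED_KEYS = {"metadata", "max_tokens", "temperature"}

def sanitize_provider_body(body: dict) -> dict:
    """Drop unsupported keys before the upstream call."""
    return {k: v for k, v in body.items() if k not in UNSUPPORTED_KEYS}

def coerce_to_anthropic_error(payload: dict) -> dict:
    """Map non-Anthropic error shapes (e.g. FastAPI {"detail": "..."})
    into a minimal Anthropic-style error envelope instead of raising."""
    if payload.get("type") == "error":
        return payload  # already Anthropic-native
    detail = payload.get("detail")
    message = detail if isinstance(detail, str) else str(payload)
    return {"type": "error", "error": {"type": "api_error", "message": message}}
```

With this shape, an upstream 400 stays a 400 on the way back to the client instead of being upgraded to a 502 by a failed ValidationError parse.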

Anthropic -> Responses conversion (#16)

  • ccproxy/llms/formatters/anthropic_to_openai/requests.py:
    • New _build_responses_input_items translates the full message list into Responses API input items (message / function_call / function_call_output), preserving interleaved text and tool_use ordering within assistant turns.
    • New deterministic _clamp_call_id (call_ + sha1) keeps tool_use.id and tool_result.tool_use_id paired after clamping to 64 chars.
    • Accept LegacyCustomTool alongside Tool in the custom-tool mapping (both Chat and Responses paths).
    • Small dedup helpers (_block_type, _block_field) so dict/pydantic branches share one code path.
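The deterministic clamp can be sketched like this (a simplified stand-in for _clamp_call_id; the call_ prefix, sha1 hash, and 64-char limit follow the PR description):

```python
import hashlib

OPENAI_CALL_ID_MAX = 64  # Responses API limit on function_call ids

def clamp_call_id(call_id: str) -> str:
    """Deterministically shorten ids over 64 chars so that tool_use.id and
    tool_result.tool_use_id still map to the same clamped value."""
    if len(call_id) <= OPENAI_CALL_ID_MAX:
        return call_id
    return "call_" + hashlib.sha1(call_id.encode()).hexdigest()
```

Because the mapping is a pure function of the original id, both sides of a tool_use / tool_result pair clamp to the same string without any shared state.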

Streaming spec alignment

  • ccproxy/llms/formatters/common/streams.py: emit_anthropic_tool_use_events now emits content_block_start with input={} and streams arguments via input_json_delta.partial_json, per the Anthropic streaming spec.
  • ccproxy/llms/formatters/openai_to_openai/streams.py: tool_call continuation chunks no longer re-emit id/name. Only the first chunk carries them, per the OpenAI Chat streaming spec.
  • ccproxy/llms/models/openai.py: FunctionCall.name is now str | None to support the above.
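The spec-aligned Anthropic event sequence can be sketched as (simplified: the real converter streams partial_json incrementally as argument fragments arrive; here the whole arguments JSON goes out in one delta):

```python
import json

def emit_tool_use_events(index: int, tool_id: str, name: str, input_obj: dict):
    """Per the Anthropic streaming spec: content_block_start carries input={},
    and the arguments JSON is streamed via input_json_delta.partial_json."""
    yield {"type": "content_block_start", "index": index,
           "content_block": {"type": "tool_use", "id": tool_id,
                             "name": name, "input": {}}}
    yield {"type": "content_block_delta", "index": index,
           "delta": {"type": "input_json_delta",
                     "partial_json": json.dumps(input_obj)}}
    yield {"type": "content_block_stop", "index": index}
```

Official SDKs ignore any input attached to the start event, which is why attaching the full input there made tool arguments invisible downstream.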

Delta accumulation

  • ccproxy/services/adapters/delta_utils.py: identity/discriminator fields (index, type, id, name, call_id) are overwritten instead of merged. Comment explains the provider behavior driving this and why index must be in the list (int-add would double non-zero indices).
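The overwrite-vs-merge rule can be sketched as (a simplified stand-in for accumulate_delta; the identity field names are taken from the bullet above):

```python
# Fields that re-appear verbatim on every chunk and must be overwritten,
# never merged (string-concat or int-add).
IDENTITY_FIELDS = {"index", "type", "id", "name", "call_id"}

def accumulate_delta(acc: dict, delta: dict) -> dict:
    """Fold a streaming delta into an accumulator: identity fields overwrite,
    other strings concatenate (e.g. function arguments), dicts recurse."""
    for key, value in delta.items():
        if key in IDENTITY_FIELDS or key not in acc:
            acc[key] = value
        elif isinstance(value, dict):
            accumulate_delta(acc[key], value)
        elif isinstance(value, str):
            acc[key] += value
        else:
            acc[key] = value
    return acc
```

With index in IDENTITY_FIELDS, a provider that re-sends index=1 on every chunk keeps index 1; a generic int-add branch would have doubled it.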

Test plan

New / updated tests:

  • tests/plugins/codex/unit/test_adapter.py::TestCodexAdapter::test_sanitize_provider_body_strips_metadata
  • tests/unit/services/adapters/test_simple_converters.py - error converter covers Anthropic-native, FastAPI detail, and arbitrary-dict shapes
  • tests/unit/llms/formatters/test_anthropic_to_openai_helpers.py:
    • ..._request_tool_cycle - full user -> assistant(text+tool_use) -> user(tool_result) cycle
    • ..._request_long_call_id - deterministic clamp keeps the pair intact
    • ..._request_legacy_custom_tools - LegacyCustomTool accepted
    • ..._request_tool_result_mixed_content - list-form tool_result with text+image parts
    • ..._request_pending_text_after_tool_result - text after tool_result in same user message is flushed
    • ..._request_assistant_interleaved_ordering - interleaved text/tool_use preserves order
  • tests/unit/llms/formatters/test_openai_to_anthropic_chat_response.py and test_streaming_converters_samples.py - tool_use now carries input={} at start and arguments via input_json_delta
  • tests/unit/services/adapters/test_delta_utils.py - identity fields (id/name/type/call_id) not concatenated across chunks
  • make pre-commit passes
  • Validated live against make dev by launching claude with ANTHROPIC_BASE_URL=http://127.0.0.1:8000/codex - the metadata 400 is gone and multi-turn tool flows round-trip correctly

Issues

…er (#51)

When Claude Code CLI points at ccproxy's /codex endpoint, the
anthropic.messages -> openai.responses converter copies Anthropic's
metadata.user_id into the Responses payload as "metadata". The Codex
upstream (chatgpt.com/backend-api/codex/responses) rejects this with
"Unsupported parameter: metadata", so every Claude Code -> codex
request was failing with HTTP 400.

- codex adapter: add "metadata" to the unsupported-key strip list in
  _sanitize_provider_body so it is removed before the upstream call,
  same as max_tokens/temperature.
- simple_converters: convert_anthropic_to_openai_error now coerces
  non-Anthropic error shapes (e.g. Codex's FastAPI-style
  {"detail": "..."}) into a minimal ErrorResponse envelope instead of
  raising ValidationError. Without this, upstream 400s were being
  upgraded to 502s in the non-streaming error path.

Adds regression tests for both the metadata strip and the error
converter (three shapes: Anthropic-native, FastAPI detail, and
arbitrary dict).

The OpenAI Responses -> Anthropic and OpenAI Chat -> Anthropic stream
converters were emitting ContentBlockDeltaEvent with
delta=TextBlock(type="text") instead of delta=TextDelta(type="text_delta").
The Pydantic model accepts either (TextBlock is a tolerated fallback),
but the real Anthropic wire protocol and the Claude Code CLI's SDK
parser require type="text_delta". The effect was that Claude Code CLI
pointed at ccproxy's /codex endpoint received a 200 OK stream, parsed
message_start/content_block_start/content_block_stop/message_stop
correctly, but silently dropped every text delta — the user saw nothing.

Adds two regression tests pinning the on-the-wire type to text_delta
for both the Responses and Chat converters, including a check against
model_dump(by_alias=True) so the serialized payload can't drift.

When Claude Code CLI targets /codex, conversations with history and
tool-use cycles were dropped or mangled, and tool streaming events did
not match the official specs. Rewrite the Anthropic -> Responses input
translation and align tool streaming with Anthropic/OpenAI specs.

- anthropic_to_openai/requests.py: translate the full message list
  into Responses API input items (message / function_call /
  function_call_output), preserving interleaved text and tool_use
  ordering within assistant turns. Add a deterministic
  _clamp_call_id so tool_use/tool_result pairs stay intact when ids
  exceed OpenAI's 64-char limit. Accept LegacyCustomTool alongside
  Tool in the custom-tool mapping.

- common/streams.py: emit tool_use content_block_start with empty
  input {} and stream arguments via input_json_delta.partial_json,
  per Anthropic streaming spec. Official SDKs ignore input attached
  directly to the start event.

- openai_to_openai/streams.py: tool_call continuation chunks no
  longer re-emit id/name. Per OpenAI Chat streaming spec, those
  fields only appear on the first chunk for a given tool call.

- models/openai.py: FunctionCall.name is now Optional to support
  the continuation chunks above.

- services/adapters/delta_utils.py: identity fields (index, type,
  id, name, call_id) are overwritten instead of merged. Without
  this, providers that re-send these per chunk (e.g. the Codex
  Responses->Chat adapter) produced "shellshell..." /
  "fc_abc_fc_abc..." and broke downstream validation.

Tests cover the full tool cycle, interleaved assistant ordering,
list-form tool_result content, pending user text after a tool
result, long call_id clamping, LegacyCustomTool acceptance, tool_use
streaming events, and delta_utils identity handling.
Copilot AI review requested due to automatic review settings April 13, 2026 12:14
@CaddyGlow CaddyGlow merged commit 1913bc3 into main Apr 13, 2026
21 checks passed
Contributor

Copilot AI left a comment

Pull request overview

This PR fixes multiple interoperability issues preventing Claude Code CLI from successfully using ccproxy’s /codex provider end-to-end, including request sanitization, error-shape handling, Anthropic→Responses message/tool translation, streaming spec compliance, and streaming delta accumulation correctness.

Changes:

  • Strip unsupported metadata (and other keys) from Codex-bound payloads and coerce non-Anthropic upstream error shapes to avoid misclassifying upstream 4xx as 5xx.
  • Rebuild Anthropic→OpenAI Responses request conversion to include full conversation history and properly round-trip tool-use cycles (with deterministic call-id clamping).
  • Align streaming event shapes (Anthropic tool input streaming + text delta types; OpenAI tool-call continuation chunks) and harden delta accumulation to avoid concatenating identity fields.

Reviewed changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
ccproxy/plugins/codex/adapter.py Sanitizes outgoing Codex payloads by stripping unsupported keys like metadata.
ccproxy/services/adapters/simple_converters.py Makes Anthropic→OpenAI error conversion resilient to non-Anthropic error payloads (e.g., FastAPI detail).
ccproxy/llms/formatters/anthropic_to_openai/requests.py Reworks Anthropic→Responses request building to preserve history + tool interleaving; clamps long tool call IDs; accepts legacy tool definitions.
ccproxy/llms/formatters/common/streams.py Emits Anthropic tool-use streaming events per spec (input={} then input_json_delta).
ccproxy/llms/formatters/openai_to_anthropic/streams.py Ensures Anthropic wire streaming uses text_delta events (not TextBlock(type="text")).
ccproxy/llms/formatters/openai_to_openai/streams.py Adjusts tool-call streaming chunk emission to better match OpenAI chat streaming expectations.
ccproxy/llms/models/openai.py Makes FunctionCall.name optional to support spec-compliant continuation chunks.
ccproxy/services/adapters/delta_utils.py Prevents identity/discriminator fields from being concatenated during streaming delta accumulation.
tests/plugins/codex/unit/test_adapter.py Adds regression coverage for stripping metadata in Codex adapter sanitization.
tests/unit/services/adapters/test_simple_converters.py Adds coverage for error-shape coercion in convert_anthropic_to_openai_error.
tests/unit/services/adapters/test_delta_utils.py Adds regression coverage preventing concatenation of tool-call identity fields across chunks.
tests/unit/llms/formatters/test_anthropic_to_openai_helpers.py Adds coverage for Responses request construction: tool cycles, long call IDs, legacy tools, ordering, mixed tool_result content.
tests/unit/llms/formatters/test_openai_to_anthropic_chat_response.py Updates assertions for tool-use streaming input behavior (input_json_delta).
tests/unit/llms/formatters/test_streaming_converters_samples.py Adds/updates sample-based streaming tests for tool-use input streaming and text_delta wire type.
Comments suppressed due to low confidence (1)

ccproxy/llms/formatters/openai_to_openai/streams.py:400

  • The initial tool_call chunk now uses id=state.call_id or state.id, but later chunks in this adapter still emit id based on state.id in other branches. If call_id and id differ, clients/accumulators may see the tool call identifier change mid-stream. Consider consistently using the same identifier for all emitted Chat tool_call chunks (or omitting id after the first chunk).
                    if not state.initial_emitted:
                        tool_call = openai_models.ToolCallChunk(
                            index=state.index,
                            id=state.call_id or state.id,
                            type="function",
                            function=openai_models.FunctionCall(
                                name=state.name or "",
                                arguments=arguments or "",
                            ),
                        )


Comment on lines 445 to 452
                    if state.initial_emitted:
                        tool_call = openai_models.ToolCallChunk(
                            index=state.index,
                            id=state.id,
                            type="function",
                            function=openai_models.FunctionCall(
                                name=state.name or "",
                                arguments=delta_segment,
                            ),
                        )

Copilot AI Apr 13, 2026


Tool-call continuation chunks still set type="function". Per the OpenAI Chat streaming spec (and as noted in the PR description), id/type/function.name should only be present on the first chunk for a given tool call; subsequent chunks should generally include only function.arguments (plus index). Consider omitting type on continuation chunks here to avoid strict client parsers rejecting the stream.



Development

Successfully merging this pull request may close these issues.

use in claude code

2 participants