feat: forward cache routing headers for Responses API prompt caching by zicochaos · Pull Request #165 · mcowger/plexus

zicochaos · 2026-04-15T01:08:38Z

Summary

- forward session_id and x-client-request-id headers from incoming Responses API requests to upstream providers for prompt cache routing
- add /v1/codex/responses route alias so Codex CLI and pi-agent clients can connect through plexus without URL mismatch
- fall back to prompt_cache_key from the request body when client headers are absent

Details

OpenAI's Codex backend uses session_id and x-client-request-id headers to route requests to the same backend server that holds cached prompt prefixes. Without these headers, prompt_cache_key in the body is ineffective and every request is a full-price cache miss.
The official Codex CLI (codex-rs/codex-api/src/requests/headers.rs) sends these headers unconditionally. Third-party clients (pi-agent, hermes-agent) also send them. Plexus was silently dropping them in setupHeaders().
Cache routing headers are captured in responses.ts route handler, attached to UnifiedChatRequest.cacheRoutingHeaders, and forwarded by dispatcher.ts setupHeaders(). When client headers are absent, prompt_cache_key from the body is used as fallback.
The /v1/codex/responses alias is needed because pi-agent's openai-codex-responses wire API appends /codex/responses to the base URL. The handler is shared -- no code duplication.
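The capture, fallback, and forward steps described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the helper names `resolveCacheRoutingHeaders` and `applyCacheRoutingHeaders` are hypothetical, incoming header keys are assumed to be lowercased already, and the fallback is assumed to populate `session_id` from `prompt_cache_key` (the description does not say which header receives the fallback value).

```typescript
// Hypothetical sketch. Only session_id, x-client-request-id, prompt_cache_key,
// and cacheRoutingHeaders come from the PR; every other name is illustrative.
type HeaderMap = Record<string, string | string[] | undefined>;

const CACHE_ROUTING_HEADER_NAMES = ["session_id", "x-client-request-id"] as const;

interface ResponsesBody {
  prompt_cache_key?: string;
}

// Header values may arrive as a string or an array; keep the first non-empty one.
function firstValue(value: string | string[] | undefined): string | undefined {
  const v = Array.isArray(value) ? value[0] : value;
  return v !== undefined && v.trim() !== "" ? v : undefined;
}

// Capture step: collect the routing headers the client sent (keys assumed lowercase).
function resolveCacheRoutingHeaders(
  incoming: HeaderMap,
  body: ResponsesBody,
): Record<string, string> | undefined {
  const resolved: Record<string, string> = {};
  for (const name of CACHE_ROUTING_HEADER_NAMES) {
    const value = firstValue(incoming[name]);
    if (value !== undefined) resolved[name] = value;
  }
  // Fallback step (assumption): with no client headers, reuse prompt_cache_key
  // as the session_id value.
  if (Object.keys(resolved).length === 0 && body.prompt_cache_key) {
    resolved["session_id"] = body.prompt_cache_key;
  }
  return Object.keys(resolved).length > 0 ? resolved : undefined;
}

// Forward step: merge the captured headers into the upstream header set.
function applyCacheRoutingHeaders(
  upstream: Record<string, string>,
  cacheRoutingHeaders?: Record<string, string>,
): Record<string, string> {
  return { ...upstream, ...(cacheRoutingHeaders ?? {}) };
}
```

In this sketch the route handler would store the result of `resolveCacheRoutingHeaders` on the request's `cacheRoutingHeaders` field, and the dispatcher would call `applyCacheRoutingHeaders` while building upstream headers.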

Files changed

| File | Change |
| --- | --- |
| `packages/backend/src/types/unified.ts` | add `cacheRoutingHeaders` to `UnifiedChatRequest` interface |
| `packages/backend/src/routes/inference/responses.ts` | capture incoming `session_id`/`x-client-request-id` headers with `prompt_cache_key` fallback; extract handler for dual-route registration; add `/v1/codex/responses` alias |
| `packages/backend/src/services/dispatcher.ts` | forward `cacheRoutingHeaders` to upstream in `setupHeaders()` |

Configuration note

No config schema changes needed. Providers that support the Responses API can already be configured using the existing api_base_url record format:

clawbay-direct:
  api_key: "ca_v1.YOUR_TOKEN"
  api_base_url:
    responses: "https://api.theclawbay.com/backend-api/codex"
  models:
    gpt-5.4:
      access_via: ["responses"]

This makes getProviderTypes() return ['responses'], the incoming type matches directly, pass-through optimization activates, and the cache routing headers are forwarded.
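To illustrate why the record form of api_base_url leads to pass-through, here is a rough sketch of the matching logic as described. It is not the actual `getProviderTypes()` implementation: `providerTypesFromRecord` and `canPassThrough` are hypothetical names, and the assumption is simply that the record's keys are the provider's supported API types.

```typescript
// Hypothetical sketch of the type-matching idea; not the real plexus code.
// Assumption: each key of the api_base_url record names a supported API type.
function providerTypesFromRecord(apiBaseUrl: Record<string, string>): string[] {
  return Object.keys(apiBaseUrl);
}

// Pass-through is possible when the incoming API type is one the provider
// speaks natively, so no request translation is needed.
function canPassThrough(incomingType: string, providerTypes: string[]): boolean {
  return providerTypes.includes(incomingType);
}

const types = providerTypesFromRecord({
  responses: "https://api.theclawbay.com/backend-api/codex",
});
console.log(types); // ["responses"]
console.log(canPassThrough("responses", types)); // true
```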

OAuth-based providers (api_base_url: oauth://) also work -- the OAuth path handles codex responses natively.

Verification

bun run typecheck -- zero type errors from changed files (all errors are pre-existing in test files, frontend, and index.ts bundle types)
cd packages/backend && bun test src/routes/inference/__tests__/auth.test.ts src/services/__tests__/dispatcher-failover.test.ts -- 35 tests pass, 0 failures
deployed to local plexus instance at 192.168.66.12:4000 against live config/SQLite:
- /v1/codex/responses accepts requests and returns responses correctly
- pi-agent with openai-codex-responses wire API (baseUrl http://192.168.66.12:4000/v1) successfully connects through plexus to theclawbay OAuth provider
- session_id and x-client-request-id headers forwarded to upstream
- no regression on existing /v1/responses endpoint -- pass-through optimization active for responses-to-responses routing
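For anyone wanting to repeat the manual check, a request like the ones above could be built as follows. This is only a sketch: `buildCodexSmokeRequest` is a hypothetical helper, the bearer-token `authorization` header is an assumption about how the local instance is secured, and the body is a minimal Responses API payload. It mirrors how a codex-style client appends /codex/responses to its base URL.

```typescript
// Hypothetical smoke-test helper; not part of this PR.
interface SmokeRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

function buildCodexSmokeRequest(
  baseUrl: string, // e.g. http://192.168.66.12:4000/v1
  apiKey: string, // assumption: the instance expects a bearer token
  sessionId: string,
): SmokeRequest {
  // Codex-style clients append /codex/responses to the configured base URL.
  const url = `${baseUrl.replace(/\/+$/, "")}/codex/responses`;
  return {
    url,
    init: {
      method: "POST",
      headers: {
        "content-type": "application/json",
        authorization: `Bearer ${apiKey}`,
        // Cache routing headers that plexus should now forward upstream.
        session_id: sessionId,
        "x-client-request-id": sessionId,
      },
      body: JSON.stringify({
        model: "gpt-5.4",
        input: "ping",
        prompt_cache_key: sessionId,
      }),
    },
  };
}

const req = buildCodexSmokeRequest("http://192.168.66.12:4000/v1", "YOUR_KEY", "demo-session");
console.log(req.url); // http://192.168.66.12:4000/v1/codex/responses
```

Sending it is then `await fetch(req.url, req.init)`; whether the headers reached the upstream still has to be confirmed on the provider side or in plexus logs.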

zicochaos · 2026-04-15T01:17:15Z

The claude-review check failure is unrelated to this PR -- it's an OIDC token permissions issue in the Claude Code Review workflow (added in #164). The error is:

Could not fetch an OIDC token. Did you remember to add `id-token: write` to your workflow permissions?

The workflow likely needs id-token: write in its permissions block and/or the ANTHROPIC_API_KEY secret configured for fork PRs.

mcowger

Overall looks good, but it appears you accidentally captured some changes to the workflow definition.

Can you remove those (thanks for the note about the token fix) and then I'll merge?

Clients (Codex CLI, pi-agent, hermes-agent) send session_id and x-client-request-id headers for server-side cache routing. Without these headers, upstream providers (theclawbay, OpenAI) cannot route requests to the same backend server that holds cached prompt prefixes, causing every request to be a cache miss.

Changes:
- types/unified.ts: add cacheRoutingHeaders to UnifiedChatRequest
- routes/inference/responses.ts: capture session_id and x-client-request-id from incoming request headers, with fallback to prompt_cache_key from the body
- services/dispatcher.ts: forward cache routing headers to upstream in setupHeaders()

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

The Codex CLI and pi-agent with openai-codex-responses wire API send requests to /v1/codex/responses. Register the same handler on both /v1/responses and /v1/codex/responses so codex clients can go through plexus without URL mismatch.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

zicochaos · 2026-04-15T07:48:14Z

Removed the workflow file changes -- force-pushed with only the 3 intended files:

packages/backend/src/types/unified.ts
packages/backend/src/routes/inference/responses.ts
packages/backend/src/services/dispatcher.ts

Those workflow files came from a sync merge with upstream before branching. Sorry about that.


github-actions bot force-pushed the main branch from 7a4d4a3 to 4abdd08 · April 15, 2026 03:06

mcowger requested changes Apr 15, 2026


Sebastian Bochna and others added 2 commits April 15, 2026 08:47

zicochaos force-pushed the feat/forward-cache-routing-headers branch from 6d8eb96 to 543b95c · April 15, 2026 07:48

mcowger merged commit aaac2c6 into mcowger:main Apr 15, 2026
1 check failed

github-actions bot pushed a commit that referenced this pull request Apr 16, 2026

Merge pull request #165 from zicochaos/feat/forward-cache-routing-hea…

1014f6f

…ders
