WIP: Codex WebSocket transport + ChatGPT routing + AUTH_TOKEN strip + self-loop guard#33

Merged
lis186 merged 12 commits into main from codex-ws-merged on May 22, 2026

@lis186 lis186 commented May 22, 2026

Draft. Combines #29 + shhtheonlyperson#6 + 2 new commits:

  • a5d28f0 Strip AUTH_TOKEN ?token= from forwarded URLs and entry logs
  • 06e0f56 Hard-block self-loop on CHATGPT_BASE_URL

Excludes b6c7cac (PR #6 evidence files in docs/pr-6-screenshots/).

npm test: 480 pass / 0 fail.

Pending: manual verification of ChatGPT-auth Codex (Y6-W2), API-key Codex abnormal close (normalizeCloseCode), Claude regression, and AUTH_TOKEN strip end-to-end. Do NOT merge until those are done.

shhtheonlyperson and others added 11 commits May 8, 2026 23:00
- Arm idle timer in wss.handleUpgrade callback so a stalled upstream
  (accepts TCP, never sends 101) is bounded by IDLE_TIMEOUT_MS instead
  of hanging forever.
- Cap client→upstream send buffer at CCXRAY_WS_MAX_QUEUE_BYTES (default
  4 MiB) and close 1009 on overflow; the previous queue was unbounded.
- Destroy the upstream HTTP request/response on unexpected-response so
  the underlying socket doesn't leak. ws library hands ownership to the
  user once a listener is attached.
- Drop the unreachable clientQueue path: clientWs is OPEN inside the
  handleUpgrade callback, so only client→upstream ever needs buffering.
- Clamp WS close reasons to 120 bytes (spec cap is 123); the ws library
  throws RangeError on overflow.
- Cover the new behavior with tests: pre-handshake stall timeout, auth
  token gating, accepted bearer, non-OpenAI 404, subprotocol forwarding.
- Document ws-proxy.js / openai-session.js modules and the
  CCXRAY_WS_IDLE_TIMEOUT_MS / CCXRAY_WS_MAX_QUEUE_BYTES tunables.
- Comment detectOpenAISession's intentional behavior: header session_id
  is honored even when parsedBody is null (covers WS upgrades and
  body-less HTTP retries).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
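
Two of the bullets in the commit message above (the 120-byte close-reason clamp and the capped client→upstream send buffer) can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from server/ws-proxy.js: `clampCloseReason` and `BoundedSendQueue` are hypothetical names, and only the 120-byte clamp, the 4 MiB default, and close code 1009 come from the commit message.

```javascript
// Hypothetical constants mirroring the values named in the commit message.
const MAX_REASON_BYTES = 120;
const DEFAULT_MAX_QUEUE_BYTES = 4 * 1024 * 1024; // 4 MiB

// Truncate a close reason to at most maxBytes of UTF-8 without
// splitting a multi-byte character.
function clampCloseReason(reason, maxBytes = MAX_REASON_BYTES) {
  const text = String(reason ?? '');
  if (Buffer.byteLength(text, 'utf8') <= maxBytes) return text;
  let out = '';
  let used = 0;
  for (const ch of text) { // iterates by code point
    const size = Buffer.byteLength(ch, 'utf8');
    if (used + size > maxBytes) break;
    out += ch;
    used += size;
  }
  return out;
}

// Byte-capped buffer for frames that arrive before the upstream is open.
class BoundedSendQueue {
  constructor(maxBytes = DEFAULT_MAX_QUEUE_BYTES) {
    this.maxBytes = maxBytes;
    this.bytes = 0;
    this.items = [];
  }

  // Returns false when the frame would exceed the cap; the caller is
  // then expected to close the client socket with code 1009.
  push(data) {
    const size = Buffer.byteLength(data);
    if (this.bytes + size > this.maxBytes) return false;
    this.items.push(data);
    this.bytes += size;
    return true;
  }

  // Flush everything through the given send function and reset.
  drain(send) {
    for (const item of this.items) send(item);
    this.items = [];
    this.bytes = 0;
  }
}
```

In this sketch a caller would do something like `if (!queue.push(frame)) clientWs.close(1009, clampCloseReason('queue overflow'))`, where `clientWs` stands for the accepted client WebSocket.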
ccxray accepts `?token=<AUTH_TOKEN>` as an alternative to the
`Authorization: Bearer` header (server/auth.js). The token was previously
preserved on the URL when ccxray forwarded the request upstream and when
it wrote `url` into log entries on disk and SSE broadcasts. That meant a
client authenticating with `?token=...` would send ccxray's secret to
OpenAI/Anthropic/ChatGPT on every request, and the same secret would be
persisted to `~/.ccxray/logs/{id}_req.json` indefinitely.

This was preexisting in the HTTP forward path. PR #29 added a WebSocket
upgrade path that inherited the same bug. Fix both with a single helper.

- Add server/url-sanitize.js: `stripAuthParams(url)` deletes ccxray's
  own auth query params (currently just `token`). Upstream API keys
  travel in Authorization headers, not query params, so this never
  affects upstream auth.
- Apply in server/forward.js at the upstream path build, console log,
  and three entry-record sites.
- Apply in server/ws-proxy.js at the upstream WS URL build and both
  entry/reqLog record sites.
- Test: 10 cases covering single param, mixed params, empty value,
  repeated params, substring-name protection, encoding round-trip, and
  non-string inputs.
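
The behaviour described for `stripAuthParams` can be approximated as below. This is a sketch, not the contents of server/url-sanitize.js: the string-based filtering (chosen so that other parameters keep their original encoding) and the pass-through of non-string inputs are assumptions made for illustration.

```javascript
// ccxray's own auth query params; per the commit message, currently just `token`.
const AUTH_PARAMS = new Set(['token']);

function stripAuthParams(url) {
  // Assumption: non-string inputs are returned untouched.
  if (typeof url !== 'string') return url;

  const hashAt = url.indexOf('#');
  const hash = hashAt === -1 ? '' : url.slice(hashAt);
  const beforeHash = hashAt === -1 ? url : url.slice(0, hashAt);

  const queryAt = beforeHash.indexOf('?');
  if (queryAt === -1) return url;

  const path = beforeHash.slice(0, queryAt);
  const kept = beforeHash
    .slice(queryAt + 1)
    .split('&')
    .filter((pair) => {
      const rawName = pair.split('=', 1)[0];
      let name;
      try {
        name = decodeURIComponent(rawName);
      } catch {
        name = rawName; // malformed escape: compare the raw name
      }
      // Exact-name match only, so `mytoken` or `token_id` survive.
      return !AUTH_PARAMS.has(name);
    });

  // Surviving pairs are re-joined verbatim, so their encoding is preserved.
  return path + (kept.length ? '?' + kept.join('&') : '') + hash;
}

console.log(stripAuthParams('/v1/messages?token=s3cret&trace=keepme'));
```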
PR #6 promoted `chatgpt_base_url` to a first-class launcher config
(injected on every codex spawn). The existing startup self-loop guard
in startServer() only checked the agent's primary upstream
(`UPSTREAMS[upstreamFamily]`), so a misconfigured `CHATGPT_BASE_URL`
pointing at ccxray would only emit a warning from `resolveChatGPTUpstream`
and then silently loop the proxy into itself, burning CPU until the
process was killed.

Extend the candidate list so it also includes `UPSTREAMS.openaiChatGPT`
when the source is user-configured (`CHATGPT_BASE_URL` or
`CODEX_CHATGPT_BASE_URL`), and skips it when the source is the built-in
default (`chatgpt.com:443`) which can never loop.

The override path (`--allow-upstream-loop` / `CCXRAY_ALLOW_UPSTREAM_LOOP`)
still applies; the guard refuses startup by default.

Tests:
- New: exits with helpful error when CHATGPT_BASE_URL self-loops.
- New: built-in ChatGPT default never triggers the guard.
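
The extended guard described above can be pictured roughly like this. It is a sketch under assumptions: the shape of the `upstreams` object, the `source` field values, the set of local hostnames, and the function names are all hypothetical, and the real check lives in startServer().

```javascript
// Hosts treated as "this machine"; an assumption for the sketch.
const LOCAL_HOSTS = new Set(['localhost', '127.0.0.1', '[::1]', '0.0.0.0']);

function pointsAtSelf(upstreamUrl, listenPort) {
  const u = new URL(upstreamUrl);
  const defaultPort = u.protocol === 'https:' ? 443 : 80;
  const port = u.port === '' ? defaultPort : Number(u.port);
  return LOCAL_HOSTS.has(u.hostname) && port === listenPort;
}

// upstreams: { primary: { url }, openaiChatGPT: { url, source } }
function findSelfLoops(upstreams, listenPort) {
  const candidates = [['primary', upstreams.primary.url]];
  // The built-in default (chatgpt.com:443) can never loop, so the ChatGPT
  // upstream is only a candidate when the user configured it.
  if (upstreams.openaiChatGPT.source !== 'default') {
    candidates.push(['openaiChatGPT', upstreams.openaiChatGPT.url]);
  }
  return candidates
    .filter(([, url]) => pointsAtSelf(url, listenPort))
    .map(([name]) => name);
}

// Refuses startup by default; allowLoop models the documented override.
function checkStartup(upstreams, listenPort, allowLoop = false) {
  const loops = findSelfLoops(upstreams, listenPort);
  if (loops.length > 0 && !allowLoop) {
    throw new Error(
      `upstream(s) ${loops.join(', ')} point back at this proxy on port ${listenPort}`
    );
  }
  return loops;
}
```

Here `allowLoop` stands in for `--allow-upstream-loop` / `CCXRAY_ALLOW_UPSTREAM_LOOP`; how the real code reads those is not shown in this thread.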
…ader forwarding

Three integration tests that lock in behaviors verified during PR #33 sign-off:

- test/auth-token-strip.e2e.test.js — spawns ccxray with AUTH_TOKEN set against
  a fake Anthropic upstream, sends `?token=...&trace=keepme`, and asserts the
  secret never reaches the upstream URL, SSE broadcasts, disk entry logs, or
  console output, while non-auth params are preserved (covers a5d28f0).

- test/socket-error-survival.e2e.test.js — exercises both the client-abort
  mid-SSE path and the upstream `socket.destroy()` path against a slow fake
  upstream, asserting the proxy stays alive (follow-up probe returns 200) and
  stderr contains no uncaughtException trace (covers efd4a70).

- test/websocket-headers-forward.e2e.test.js — opens a real WebSocket through
  ccxray to a fake WS upstream with `chatgpt-account-id` set, and asserts the
  custom and openai-beta headers reach upstream intact, host is rewritten,
  and ChatGPT routing transforms `/v1/realtime` to `/backend-api/codex/realtime`
  (covers PR #29 + 0ff5507).

npm test: 480 → 483 pass, 0 fail.
Discovered during PR #33 verification: real codex CLI (ChatGPT-auth) sends
its main session traffic as a WebSocket upgrade on POST /v1/responses with
`openai-beta: responses_websockets=*`, not on /v1/realtime. Both paths are
already routed to the openai upstream by config.js, so this is a no-op for
runtime behavior, but the assumption that /v1/realtime is the primary codex
path is wrong and easy to encode into future routing changes.

Also note the `chatgpt-account-id` header routes to CHATGPT_BASE_URL via
getUpstreamForRequestAndHeaders, which is how ChatGPT-auth codex sessions
end up at chatgpt.com/backend-api/codex instead of api.openai.com.

lis186 commented May 22, 2026

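PR #33 verification evidence

Summary

PR #33 (codex-ws-merged, HEAD 06e0f56) covers several changes: the Codex WebSocket transport, ChatGPT routing, the AUTH_TOKEN strip, and the self-loop guard. This round fully verified four areas, and all of them passed:

  • D. AUTH_TOKEN strip: ?token=<secret> is stripped in all four places (the upstream request, SSE broadcasts, on-disk entries, and console output); non-auth query parameters (for example ?trace=keepme) are preserved.
  • B. Socket error survival: a client abort mid-stream (simulating Ctrl+C in codex) and an upstream destroying its TCP socket mid-stream (simulating an Anthropic EPIPE). In both scenarios the proxy stays up, the follow-up probe still returns 200, and stderr shows no uncaughtException.
  • A. ChatGPT-auth WS upgrade:
    • Automated test: ccxray forwards chatgpt-account-id, openai-beta, and arbitrary custom headers intact, and rewrites the path for the ChatGPT base URL when the header triggers it.
    • Real codex end-to-end: the actual codex CLI does send chatgpt-account-id: 1165093a-ccf9-4ee8-9099-35aed5043775 on its main WS upgrade (matching ~/.codex/auth.json), ccxray routes it to wss://chatgpt.com:443/backend-api/codex/responses, and all of Codex's own headers (session-id, thread-id, originator, x-codex-*) are forwarded as well.
  • C. Claude regression: ran a real Claude conversation through ccxray, "請用 5 個字回答:你好嗎" ("Answer in 5 characters: how are you?") → "我很好,謝謝你" ("I'm fine, thank you"). Both turns (title-gen subagent + main conversation) were written to disk correctly; cost, maxContext: 1000000 (no regression from commit 14f20f8), 8 SSE events, and content-addressed dedup all behave normally.

Side finding (affects future development): real codex main traffic goes over POST /v1/responses + WS upgrade (with openai-beta: responses_websockets=2026-02-06), not /v1/realtime. This fact has been added to the "Agent Launching" section of CLAUDE.md so the next routing change does not trip over the same assumption.

This round also adds 3 integration tests under test/:

| File | Verifies |
| --- | --- |
| test/auth-token-strip.e2e.test.js | D |
| test/socket-error-survival.e2e.test.js | B |
| test/websocket-headers-forward.e2e.test.js | A (ccxray side) |

npm test went from 480 pass to 483 pass / 0 fail, with no regressions.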


Full evidence (English)

Branch codex-ws-merged @ 06e0f56. git status -sb: only the 4 deliberate changes
(CLAUDE.md note + 3 new test files). No leftover diag code in server/.

Test suite

  • Before this PR: 480 pass / 0 fail.
  • After integration tests added: 483 pass / 0 fail.

D — AUTH_TOKEN ?token= strip (commit a5d28f0)

Now covered by: test/auth-token-strip.e2e.test.js.

Original ad-hoc run output (still applicable):

[verify-D] upstream URL: /v1/messages?trace=keepme
[verify-D] SSE events received: 5 (entry-with-url seen: true)
[verify-D] scanned 3 files under TMP_HOME, clean: true

[verify-D] ✅ ALL ASSERTIONS PASSED
  • upstream URL: /v1/messages?trace=keepme
  • SSE events: 5 captured, no token
  • disk files: 3 scanned, clean
  • console: clean

Assertions covered (each enforced in the integration test):

  • Client ?token=<secret>&trace=keepme → upstream receives ?trace=keepme only.
  • SSE broadcasts of the entry never contain the secret.
  • No file under CCXRAY_HOME contains the secret.
  • ccxray's stdout (the 📤 [...] POST /v1/messages?trace=keepme log line) does not contain the secret.

B — Socket error survival (commit efd4a70)

Now covered by: test/socket-error-survival.e2e.test.js.

Original ad-hoc run output:

[verify-B] case-1 (client abort) outcome: aborted-after-receive
[verify-B] case-1 probe: {"statusCode":200,"len":7000}
[verify-B] case-2 (upstream destroy) outcome: ended after 608b
[verify-B] case-2 probe: {"statusCode":200,"len":7000}

[verify-B] ✅ ALL ASSERTIONS PASSED
  • client abort handled
  • follow-up request HTTP 200
  • no uncaughtException in stderr

Two scenarios, both in the integration test:

  • Case 1 — client destroys its socket mid-SSE (after 2 events). Proxy must
    not crash; a follow-up probe must succeed with HTTP 200.
  • Case 2 — upstream destroys its TCP socket after 3 SSE chunks (the EPIPE
    case the commit explicitly targets). ccxray translates that into a graceful
    client-side close (608 bytes delivered, then end). Probe still 200.
  • Process-level: child.exitCode === null after each case.
  • stderr is checked for absence of uncaughtException / stack trace patterns.
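
Both cases guard against the same Node.js default: an 'error' event emitted on an emitter with no 'error' listener is thrown, which surfaces as an uncaughtException. A minimal sketch of the defensive pattern follows, assuming a hypothetical `guardSocket` helper; the actual change in efd4a70 is not reproduced in this thread.

```javascript
const { EventEmitter } = require('node:events');

// Hypothetical helper: attach a listener so a late EPIPE/ECONNRESET is
// logged instead of being thrown as an unhandled 'error' event.
function guardSocket(socket, label, log = console.error) {
  socket.on('error', (err) => {
    log(`[${label}] late socket error: ${err.code || err.message}`);
  });
  return socket;
}

// Stand-in for a net.Socket (real sockets are EventEmitters as well).
const fakeUpstream = guardSocket(new EventEmitter(), 'upstream', () => {});
const epipe = Object.assign(new Error('write EPIPE'), { code: 'EPIPE' });

// Handled by the listener; without guardSocket this emit would throw.
fakeUpstream.emit('error', epipe);
```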

A — WebSocket chatgpt-account-id forwarding + ChatGPT routing (PR #29 + commit 0ff5507)

A.1 — ccxray-side automated

Now covered by: test/websocket-headers-forward.e2e.test.js.

Asserts:

  • chatgpt-account-id forwarded intact.
  • openai-beta forwarded.
  • Custom x-mark header forwarded (validates the general allowlist).
  • host rewritten to upstream hostname (buildWebSocketHeaders sets
    headers.host = upstream.host, no port).
  • Path transform: client /v1/realtime?... → upstream
    /backend-api/codex/realtime?... (ChatGPT routing fires because
    chatgpt-account-id is present; see config.js:167).
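
The routing and header assertions above can be pictured with the sketch below. It is illustrative only: `resolveUpstream` is a hypothetical name, the base URLs are hard-coded stand-ins for ccxray's configuration, and the rule "swap the leading /v1 for the ChatGPT base path" is an assumption inferred from the path transform asserted above, not a copy of config.js.

```javascript
// Assumed base URLs; the real values come from ccxray's config.
const OPENAI_BASE = new URL('https://api.openai.com');
const CHATGPT_BASE = new URL('https://chatgpt.com/backend-api/codex');

function resolveUpstream(clientUrl, headers) {
  // ChatGPT routing fires when chatgpt-account-id is present.
  const viaChatGPT = Boolean(headers['chatgpt-account-id']);
  const base = viaChatGPT ? CHATGPT_BASE : OPENAI_BASE;

  // Assumption: the leading /v1 segment is replaced by the base path.
  const path = viaChatGPT
    ? clientUrl.replace(/^\/v1(?=\/|\?|$)/, base.pathname)
    : clientUrl;

  // Forward client headers as-is, but rewrite host to the upstream
  // hostname (no port), matching the assertion above.
  const forwarded = { ...headers, host: base.hostname };

  return { url: `wss://${base.host}${path}`, headers: forwarded };
}

const target = resolveUpstream('/v1/realtime?model=x', {
  'chatgpt-account-id': 'acct-123', // placeholder value
  'openai-beta': 'realtime=v1',     // placeholder value
  'x-mark': 'hello',
  host: 'localhost:5588',
});
console.log(target.url);
```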

A.2 — real-codex end-to-end (manual, kept as evidence)

Ran node server/index.js --port 5588 codex in a separate terminal. Codex
returned a normal response to a trivial prompt. Diag instrumentation was
temporarily added to server/ws-proxy.js (then reverted) to capture the
actual upstream headers:

[2026-05-22T04:50:01.783Z] WS upgrade → wss://chatgpt.com:443/backend-api/codex/responses
  chatgpt-account-id: 1165093a-ccf9-4ee8-9099-35aed5043775
  openai-beta:        responses_websockets=2026-02-06
  authorization:      (present, Bearer eyJhbGciOiJSU...)
  client-url:         /v1/responses
  all-fwd-header-keys: authorization,chatgpt-account-id,host,openai-beta,originator,
                       session-id,session_id,thread-id,thread_id,user-agent,version,
                       x-client-request-id,x-codex-beta-features,x-codex-turn-metadata,
                       x-codex-window-id
  • chatgpt-account-id matches ~/.codex/auth.json.
  • Path: real codex uses /v1/responses (NOT /v1/realtime). This is a
    divergence from what A.1 originally assumed, and has been documented in
    CLAUDE.md to avoid future regressions when anyone touches the routing logic.
  • All Codex-specific headers propagated.
  • git diff server/ws-proxy.js is clean (diag fully reverted).

C — Claude regression (default path)

Ran node server/index.js --port 5588 claude with isolated CCXRAY_HOME.
Prompt: 請用 5 個字回答:你好嗎 ("Answer in 5 characters: how are you?").
Reply: 我很好,謝謝你 ("I'm fine, thank you"; 6 characters excluding the comma).

Two entries written to <TMP_HOME>/logs/index.ndjson (title-gen subagent +
main turn). Key fields:

| Entry | isSubagent | model | usage | maxContext | status |
| --- | --- | --- | --- | --- | --- |
| 13-49-28-130 | true | claude-opus-4-7 | in:617 out:22 | 200,000 | 200 |
| 13-49-28-175 | false | claude-opus-4-7 | in:6 out:14, cache_create:109,949 | 1,000,000 | 200 |

  • SSE: 8 events captured for the subagent turn, terminating in message_stop.
  • Cost computed: $0.6876 main + $0.003635 subagent.
  • maxContext: 1000000 on the main turn confirms commit 14f20f8 (1M-tier
    inference from [1m] marker in system prompt) still works.
  • Cache creation: 109,949 ephemeral_1h tokens written (system prompt + tools).
  • Shared content-addressed storage created 4 dedup files (sys_*, tools_*).
  • Session ID detected from request headers (sessionInferred: false).

Documentation update

CLAUDE.md, "### Agent Launching" section, one new bullet:

Codex's main session traffic upgrades to a WebSocket on POST /v1/responses
(with openai-beta: responses_websockets=*), not /v1/realtime.
/v1/realtime exists for the older Realtime API but is not what current
codex uses for normal /goal / chat turns. When ChatGPT auth is active,
codex also sends chatgpt-account-id, which getUpstreamForRequestAndHeaders
(see server/config.js) uses to route to CHATGPT_BASE_URL instead of
OPENAI_BASE_URL.

Files in this verification batch

test/auth-token-strip.e2e.test.js          (new, +194 lines)
test/socket-error-survival.e2e.test.js     (new, +252 lines)
test/websocket-headers-forward.e2e.test.js (new, +138 lines)
CLAUDE.md                                  (modified, +1 line)

@lis186 lis186 marked this pull request as ready for review May 22, 2026 06:43
lis186 added a commit that referenced this pull request May 22, 2026
The "restoreFromLogs — maxContext re-inference for legacy entries" suite
(added in 14f20f8) seeds index.ndjson entries with hardcoded ids like
"2026-05-14T13-27-40-199". restoreFromLogs filters by RESTORE_DAYS (default
3 days, Asia/Taipei date string comparison), so the entries silently get
dropped from store.entries once we are more than ~3 days past the test's
authoring date — turning the suite green on the day it was written and
red the following week.

Override config.RESTORE_DAYS = 0 in the suite's before() hook and restore
it in after(). Avoids brittleness without making the production cutoff
configurable from outside, and keeps the rest of the suite intact.

Surfaced while validating PR #33 (the same merge result re-runs these
tests and fails identically).
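
Why hardcoded ids rot can be seen with a small model of the filter. Everything here is an assumption made for illustration: `isRestorable` and `taipeiDate` are hypothetical names, the cutoff arithmetic is a guess at "RESTORE_DAYS with Asia/Taipei date string comparison", and treating 0 as "no cutoff" reflects the commit's use of RESTORE_DAYS = 0 as a bypass rather than a confirmed reading of restoreFromLogs.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Format a Date as YYYY-MM-DD in Asia/Taipei.
function taipeiDate(date) {
  const parts = new Intl.DateTimeFormat('en-US', {
    timeZone: 'Asia/Taipei',
    year: 'numeric',
    month: '2-digit',
    day: '2-digit',
  }).formatToParts(date);
  const get = (type) => parts.find((p) => p.type === type).value;
  return `${get('year')}-${get('month')}-${get('day')}`;
}

// Entry ids start with a date, e.g. "2026-05-14T13-27-40-199".
function isRestorable(entryId, now, restoreDays) {
  if (restoreDays === 0) return true; // assumed meaning: filter bypassed
  const cutoff = taipeiDate(new Date(now.getTime() - restoreDays * DAY_MS));
  return entryId.slice(0, 10) >= cutoff; // plain date-string comparison
}

const id = '2026-05-14T13-27-40-199';
const authoringDay = new Date('2026-05-15T00:00:00Z');
const weekLater = new Date('2026-05-22T06:00:00Z');

console.log(isRestorable(id, authoringDay, 3)); // inside the 3-day window
console.log(isRestorable(id, weekLater, 3));    // outside the window: dropped
console.log(isRestorable(id, weekLater, 0));    // bypassed, as in the fix
```

Under these assumptions the same seeded entry passes on the authoring day and is filtered out a week later, which is the failure the before()/after() override avoids.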
Picks up:
  14f20f8  fix(context): infer 1M tier from usage when [1m] marker is missing
  efd4a70  fix(forward): handle late socket errors so the proxy survives EPIPE/ECONNRESET
  99fda2b  docs: add 1.9.3 changelog entry
  db636eb  fix(test): bypass RESTORE_DAYS filter in maxContext re-inference tests

The last commit unblocks CI on this PR: the maxContext re-inference test
suite from 14f20f8 uses hardcoded 2026-05-14 ids that fall outside the
default 3-day RESTORE_DAYS window once enough wall-clock time passes,
which causes the synthetic entries to be filtered out of store.entries
during restoreFromLogs and the assertions to fail. db636eb sets
RESTORE_DAYS=0 in the suite's before() hook.

npm test on the merge result: 498 pass / 0 fail.
@lis186 lis186 merged commit 1df4297 into main May 22, 2026
2 checks passed