PostHog Code gateway — API integrator report: mid-stream socket drops & Anthropic beta passthrough
Date: 2026-07-02
Gateway: https://gateway.us.posthog.com/posthog_code (US region)
Client: Anthropic TypeScript SDK (v0.91.x) over Bun's fetch, streaming SSE from /v1/messages
Models: claude-fable-5 (primary), claude-opus-4-8
All findings below were reproduced today against the live gateway with an OAuth access token from the PostHog Code beta. Repro commands use $POSTHOG_TOKEN as a placeholder.
Finding 1 — Anthropic ping keepalives are not forwarded → mid-stream TCP kills
Symptom
During long silent stretches of a streaming response (typically extended thinking on claude-fable-5 / Opus, where Anthropic emits no content deltas for tens of seconds), the TCP connection is closed by an intermediary. The client surfaces:
Error: The socket connection was closed unexpectedly. For more information,
pass `verbose: true` in the second argument to fetch()
This happens mid-response — after message_start and often after partial thinking/text has streamed — so it can't be handled as a simple pre-request connection retry.
Corroboration: the official PostHog Code app shows the same stall
This is not client-specific. In the official PostHog Code desktop app, on the same account and models, we regularly see turns where nothing arrives for 2–3 minutes: the UI stays in its "working" state with no output and no error, then eventually recovers or produces a fresh answer. That is exactly the signature of this failure absorbed silently — the stream dies (or goes irrecoverably quiet) mid-turn, and the client retries/re-issues the request without surfacing anything to the user. The user experience is a silent multi-minute hang; the API-integrator experience is the raw socket error above. Same gateway path, same root cause, two presentations.
If you can correlate server-side: look for connections on /v1/messages closed upstream-idle after ~60–120s of zero payload bytes during active turns, paired with a same-conversation retry request arriving seconds later.
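The correlation described above could be sketched as a filter over connection logs. The record shape below (ConnLog, its field names, and the "upstream_idle" close reason) is entirely hypothetical, since we have no visibility into the gateway's logging; the 60 s idle threshold and the 30 s retry window are illustrative values, not measurements.

```typescript
// Hypothetical log record: every field name here is an assumption for illustration.
interface ConnLog {
  conversationId: string;
  path: string;
  startedAtMs: number;     // request arrival time
  closedAtMs: number;      // connection close time
  lastPayloadAtMs: number; // last time payload bytes were written downstream
  closeReason: string;     // e.g. "upstream_idle" (assumed label)
}

// Returns idle-killed /v1/messages connections that were followed by a request
// on the same conversation within retryWindowMs of the close.
function findIdleKillsWithRetry(
  logs: ConnLog[],
  minIdleMs: number = 60000,
  retryWindowMs: number = 30000,
): ConnLog[] {
  return logs.filter((conn) => {
    if (conn.path.indexOf("/v1/messages") === -1) return false;
    if (conn.closeReason !== "upstream_idle") return false;
    const idleMs = conn.closedAtMs - conn.lastPayloadAtMs;
    if (idleMs < minIdleMs) return false;
    // Look for a later request on the same conversation shortly after the close.
    return logs.some(
      (retry) =>
        retry !== conn &&
        retry.conversationId === conn.conversationId &&
        retry.startedAtMs >= conn.closedAtMs &&
        retry.startedAtMs - conn.closedAtMs <= retryWindowMs,
    );
  });
}
```

A connection that went quiet for 80 s and was followed 5 s later by a new request on the same conversation would be returned; an idle close with no follow-up request would not.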
What we believe is happening
api.anthropic.com emits SSE ping events during quiet periods precisely to keep intermediaries from idle-killing the connection.
The gateway does not forward these ping events to the client (we have never observed one in captured SSE traffic through the gateway, while they are routine against the official endpoint).
With zero bytes flowing, an LB/proxy on the gateway path enforces an idle timeout and resets the connection. The longer the model thinks, the higher the probability of a kill.
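One way to check the second point is to tally event types in a captured SSE transcript and compare a gateway capture with a direct capture. The sketch below is a minimal parser written for this report, not the SDK's; countSseEvents is a hypothetical helper name, and it only looks at the `event:` and `data:` fields.

```typescript
// Tally SSE events by their `event:` field. Comment lines (starting with ":")
// are skipped, and blocks without any `data:` line are not counted as events.
function countSseEvents(transcript: string): Record<string, number> {
  const counts: Record<string, number> = {};
  // Events are separated by a blank line; normalize CRLF first.
  const blocks = transcript.replace(/\r\n/g, "\n").split("\n\n");
  for (const block of blocks) {
    let eventType = "message"; // SSE default when no `event:` field is present
    let hasData = false;
    for (const line of block.split("\n")) {
      if (line === "" || line.charAt(0) === ":") continue; // blank or comment
      const colon = line.indexOf(":");
      const field = colon === -1 ? line : line.slice(0, colon);
      let value = colon === -1 ? "" : line.slice(colon + 1);
      if (value.charAt(0) === " ") value = value.slice(1);
      if (field === "event") eventType = value;
      if (field === "data") hasData = true;
    }
    if (hasData) counts[eventType] = (counts[eventType] || 0) + 1;
  }
  return counts;
}
```

Run over a capture from each endpoint, a `ping` key present in the direct tally and absent in the gateway tally would match what we observed.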
Impact
Long-thinking turns die non-deterministically. Anything that raises client-side stream-idle tolerances (which integrators must do anyway, because without pings a quiet-but-alive stream is indistinguishable from a dead one) makes the raw socket error more likely to surface instead of a clean client timeout.
The failure text is transport-level and vendor-specific (Bun/undici/node each produce different strings), so generic retry classifiers frequently treat it as fatal rather than transient. Integrators each have to discover and special-case it.
Where a client absorbs the failure with a silent retry (as the official app appears to), the user pays twice: minutes of dead air on the UI, and the partial turn's input tokens re-billed on the retry.
Suggested fix
Forward Anthropic's ping events verbatim, or synthesize an SSE comment (: keepalive\n\n) every ~15–30s of upstream silence. Either keeps the connection warm end-to-end and is invisible to SSE parsers.
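The second option could look roughly like the sketch below on the gateway side. It is only the scheduling core, with time passed in explicitly so it can be reasoned about without real timers. KeepaliveScheduler and its method names are hypothetical, the 20 s interval in the usage note is one value from the 15–30 s range above, and the wiring to the actual proxy loop is left out because we do not know the gateway's internals.

```typescript
const KEEPALIVE_COMMENT = ": keepalive\n\n"; // SSE comment line; parsers ignore it

class KeepaliveScheduler {
  private idleMs: number;
  private lastWriteMs: number;

  constructor(idleMs: number, startMs: number) {
    this.idleMs = idleMs;
    this.lastWriteMs = startMs;
  }

  // Call whenever upstream bytes are forwarded to the client.
  noteUpstreamWrite(nowMs: number): void {
    this.lastWriteMs = nowMs;
  }

  // Call on a periodic tick. Returns the comment to write if nothing has been
  // written downstream for at least idleMs, otherwise null.
  poll(nowMs: number): string | null {
    if (nowMs - this.lastWriteMs < this.idleMs) return null;
    this.lastWriteMs = nowMs; // the keepalive itself counts as a write
    return KEEPALIVE_COMMENT;
  }
}
```

With `new KeepaliveScheduler(20000, 0)` polled every 5 s during 70 s of upstream silence, `poll` returns the comment at 20 s, 40 s and 60 s and null on every other tick, so the connection never sits idle for longer than the interval.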
Client-side mitigation we applied (works, but shouldn't be needed)
We now wrap the response stream and reclassify mid-stream socket deaths as transient connection errors so our retry layer restarts the turn. That re-bills the partial turn's tokens on every retry — server-side keepalives would eliminate both the failure and the re-billing.
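A minimal sketch of that reclassification follows. Only the Bun message is taken from this report; the undici and Node patterns ("terminated", "other side closed", UND_ERR_SOCKET, ECONNRESET, EPIPE, "socket hang up") are our assumptions about those runtimes and should be checked against the versions in use. isTransientStreamError is a hypothetical helper, not part of the Anthropic SDK.

```typescript
const TRANSIENT_MESSAGE_PATTERNS: RegExp[] = [
  /socket connection was closed unexpectedly/i, // Bun fetch (from this report)
  /other side closed/i,                         // undici socket error (assumed)
  /^terminated$/i,                              // undici fetch body abort (assumed)
  /socket hang up/i,                            // Node http (assumed)
];
const TRANSIENT_CODES: string[] = ["ECONNRESET", "EPIPE", "UND_ERR_SOCKET"];

// True if the error, or anything in its `cause` chain, looks like a dropped socket.
function isTransientStreamError(err: unknown, depth: number = 0): boolean {
  if (typeof err !== "object" || err === null || depth > 4) return false;
  const e = err as { message?: unknown; code?: unknown; cause?: unknown };
  const code = e.code;
  if (typeof code === "string" && TRANSIENT_CODES.indexOf(code) !== -1) return true;
  const message = e.message;
  if (typeof message === "string") {
    for (const pattern of TRANSIENT_MESSAGE_PATTERNS) {
      if (pattern.test(message)) return true;
    }
  }
  // Runtimes often wrap the socket error in `cause`; walk the chain.
  return isTransientStreamError(e.cause, depth + 1);
}
```

A retry layer would call this in the catch around stream iteration and restart the turn when it returns true. Matching on message text is brittle by nature, which is part of why we would rather not need it.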
Finding 2 — Request body is schema-filtered: unknown params silently dropped → Anthropic betas unusable
Symptom
The gateway strips top-level request-body parameters it doesn't recognize before forwarding to Anthropic, instead of passing them through or rejecting them. This silently disables Anthropic beta features that are negotiated via new body params + anthropic-beta header.
Concrete case: server-side fallback (server-side-fallback-2026-06-01, refusals-and-fallback docs) — the feature that retries a Fable-5 classifier refusal on another model inside one API call.
Evidence (all reproduced 2026-07-02)
The key probe: a fallbacks chain that is guaranteed to 400 on api.anthropic.com (fallback model identical to the primary — the API requires distinct entries) returns 200 through the gateway. The only explanation is that fallbacks never reaches Anthropic.
| # | Request | Expected (direct Anthropic) | Observed via gateway |
|---|---------|-----------------------------|----------------------|
| 1 | `fallbacks: [{model: <same as primary>}]` + `anthropic-beta: server-side-fallback-2026-06-01` | 400 "must be distinct" | 200, normal completion |
| 2 | unknown top-level param `frobnicate: true`, no beta header | | |
| 3 | `fallbacks: "bogus"` (wrong type) + beta header | | |
| 4 | `?beta=true` query param | | |
Repro for probe 1:
curl -sS "https://gateway.us.posthog.com/posthog_code/v1/messages?beta=true" \
  -H "x-api-key: $POSTHOG_TOKEN" \
  -H "anthropic-version: 2023-06-01" \
  -H "anthropic-beta: server-side-fallback-2026-06-01" \
  -H "content-type: application/json" \
  -d '{
    "model": "claude-fable-5",
    "max_tokens": 16,
    "fallbacks": [{"model": "claude-fable-5"}],
    "messages": [{"role": "user", "content": "Say ok"}]
  }'
# → 200 with a normal assistant message. Direct to api.anthropic.com this is a hard 400.
Note that the anthropic-beta header appears to pass through fine (no error, no behavior change); it is specifically the body param that is filtered.
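The logic of probe 1 can be written down as a small builder and decision function, which may be handy for re-running the check after a gateway change. buildFallbackProbe and interpretProbeStatus are hypothetical helpers; the request shape mirrors the curl command above, and the interpretation rests on the premise stated earlier that the direct API rejects a fallback identical to the primary with a 400.

```typescript
interface ProbeRequest {
  url: string;
  headers: Record<string, string>;
  body: string;
}

// Builds probe 1: a fallback chain that names the primary model again.
function buildFallbackProbe(baseUrl: string, token: string, model: string): ProbeRequest {
  return {
    url: baseUrl + "/v1/messages?beta=true",
    headers: {
      "x-api-key": token,
      "anthropic-version": "2023-06-01",
      "anthropic-beta": "server-side-fallback-2026-06-01",
      "content-type": "application/json",
    },
    body: JSON.stringify({
      model: model,
      max_tokens: 16,
      fallbacks: [{ model: model }],
      messages: [{ role: "user", content: "Say ok" }],
    }),
  };
}

type ProbeVerdict = "param_forwarded" | "param_dropped" | "inconclusive";

function interpretProbeStatus(httpStatus: number): ProbeVerdict {
  if (httpStatus === 400) return "param_forwarded"; // presumably upstream validated `fallbacks`
  if (httpStatus === 200) return "param_dropped";   // upstream never saw the param
  return "inconclusive";                            // auth, rate limit, outage, ...
}
```

The returned `url`, `headers` and `body` can be passed to a POST request from any HTTP client; a 400 should still be read alongside the response body to confirm it is the "must be distinct" validation error and not something else.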
Impact
Server-side fallback cannot be used through the gateway at all. Given Fable-5's classifier refusals are exactly the problem your own PR PostHog/code#3078 addresses client-side (SDK fallbackModel), forwarding fallbacks would give every API integrator the same protection in one round trip, with Anthropic's fallback-credit billing (a pre-output refusal attempt costs nothing) instead of a full client-side re-send.
More generally: any future Anthropic beta that introduces body params will be silently broken through the gateway until each param is individually whitelisted. Silent dropping is the worst failure mode — a 400 would at least tell integrators the feature is unsupported.
Suggested fix
Whitelist fallbacks (and ideally adopt pass-through-with-denylist rather than parse-and-rebuild-with-allowlist for /v1/messages bodies). If filtering must stay, return a 4xx or a warning header when a param is dropped.
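To make the suggested policy concrete, here is a sketch of pass-through-with-denylist that also reports what it removed. Everything gateway-specific is hypothetical: the forwardBody helper, the denylist contents, and the x-gateway-dropped-params header name are placeholders, not anything the gateway is known to implement.

```typescript
interface ForwardResult {
  body: Record<string, unknown>;          // what would be sent upstream
  dropped: string[];                      // top-level params that were removed
  warningHeaders: Record<string, string>; // surfaced back to the caller
}

function forwardBody(
  incoming: Record<string, unknown>,
  denylist: string[],
): ForwardResult {
  const body: Record<string, unknown> = {};
  const dropped: string[] = [];
  for (const key of Object.keys(incoming)) {
    if (denylist.indexOf(key) !== -1) {
      dropped.push(key);
    } else {
      body[key] = incoming[key]; // unknown params pass through untouched
    }
  }
  const warningHeaders: Record<string, string> = {};
  if (dropped.length > 0) {
    // Hypothetical header name; the point is that the drop is visible.
    warningHeaders["x-gateway-dropped-params"] = dropped.join(",");
  }
  return { body, dropped, warningHeaders };
}
```

Under this policy a new body param such as `fallbacks` reaches Anthropic without a gateway change, and anything the gateway does remove is reported instead of vanishing.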
Finding 3 (related, previously observed) — context_management is half-parsed → upstream 400
When a request contains Anthropic's context_management body param (e.g. {"edits":[{"type":"clear_thinking_20251015"}]}) alongside thinking: {"type":"adaptive"}, the gateway forwards context_management but drops the paired thinking field. Anthropic then rejects the request:
400 ... `clear_thinking_20251015` strategy requires thinking to be enabled or adaptive
So the body filtering is not only dropping unknown params (Finding 2) — for at least one known param it forwards the param while dropping a field it depends on. We currently strip context_management client-side to avoid the 400. Consistent pass-through would fix this too.
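The client-side workaround amounts to removing one key before sending. A minimal sketch, where stripContextManagement is a hypothetical helper name and the request body is treated as a plain object:

```typescript
// Returns a shallow copy of the request body without `context_management`.
// The input object is not mutated.
function stripContextManagement(
  body: Record<string, unknown>,
): Record<string, unknown> {
  const copy: Record<string, unknown> = {};
  for (const key of Object.keys(body)) {
    if (key !== "context_management") copy[key] = body[key];
  }
  return copy;
}
```

This avoids the 400 at the cost of losing the context-editing behavior entirely, which is why we consider it a stopgap.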
Also worth noting: unsigned thinking blocks
Thinking blocks returned through the gateway carry an empty signature (signature: ""), while the official endpoint returns signed blocks. Replaying such a block to Anthropic on the next turn produces a 400, so integrators must strip or downgrade thinking blocks in multi-turn conversations. If the gateway is re-serializing responses, preserving the original signature bytes would restore standard multi-turn replay behavior.
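Stripping could look like the sketch below. The message and block shapes are simplified stand-ins for the SDK's types, and dropUnsignedThinking is a hypothetical helper; it removes thinking blocks whose signature is missing or empty and leaves everything else intact.

```typescript
interface ContentBlock {
  type: string;
  signature?: string;
  [key: string]: unknown;
}

interface ChatMessage {
  role: "user" | "assistant";
  content: string | ContentBlock[];
}

// Drops thinking blocks that cannot be replayed because they carry no signature.
function dropUnsignedThinking(messages: ChatMessage[]): ChatMessage[] {
  return messages.map((msg) => {
    if (typeof msg.content === "string") return msg;
    const kept = msg.content.filter(
      (block) => !(block.type === "thinking" && !block.signature),
    );
    return { role: msg.role, content: kept };
  });
}
```

If a message ends up with an empty content array after filtering, the caller needs to handle that case separately; the sketch does not.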
Summary of asks, in priority order
1. Keepalives: forward Anthropic ping SSE events (or inject SSE comments) so idle LB timeouts stop killing long thinking turns mid-stream.
2. fallbacks passthrough: whitelist the fallbacks body param + server-side-fallback-2026-06-01 beta so classifier refusals can fall back server-side (the API-integrator counterpart of feat(agent): add refusal handling and model fallback #3078).
3. Body filtering policy: prefer pass-through; if filtering stays, fail loudly instead of silently dropping params, and forward context_management together with its paired thinking field.
4. Thinking signatures: preserve signature on thinking blocks for standard multi-turn replay.