fix(proxy): derive pool cooldown from the 429 reset window (headers + body) by nnemirovsky · Pull Request #50 · nnemirovsky/sluice

nnemirovsky · 2026-05-23T03:55:44Z

Problem

v0.19.0's B1 derives a pool member's cooldown from the upstream 429, but only from headers (Retry-After, x-ratelimit-reset*). Verified live: the OpenAI Codex usage-limit 429 carries no such header. The reset window is in the JSON body:

{"type":"usage_limit_reached","plan_type":"team","resets_at":1779508759,"resets_in_seconds":1357}

With no header, B1 fell back to the 60s default. An exhausted member left cooldown after 60s, the recovery monitor fired "recovered", the agent re-probed, and the upstream returned 429 again. The pool cycled between exhausted and recovered every ~60 to 75s, and the operator notices kept coming (now as exhausted/recovered pairs instead of the old failover flap). The A1/A2 flap fix held, but the practical churn did not stop because the real window was never read.

What this does

Derive the cooldown from the 429 reset window across the conventions that AI providers and general rate limiters actually use, keeping the existing header behavior first.

Headers (tried in this order; Retry-After takes precedence, per the IETF RateLimit draft):

Retry-After (delta-seconds or HTTP-date)
RateLimit-Reset (IETF, delta)
X-RateLimit-Reset (GitHub/Twitter epoch, others delta, disambiguated by magnitude)
X-RateLimit-Reset-After (Discord, delta)
OpenAI x-ratelimit-reset-requests / x-ratelimit-reset-tokens (unit-suffixed)
Anthropic anthropic-ratelimit-requests-reset / anthropic-ratelimit-tokens-reset (ISO-8601 timestamp)
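
A minimal sketch of the header order above, not the code in this PR. The function names, the `get` callback, `demoNow`, the 1e9 epoch cutoff, and the 1e12 upper bound are all assumptions made for illustration; only the header names, their formats, and the order come from the list.

```go
package main

import (
	"fmt"
	"math"
	"net/http"
	"strconv"
	"strings"
	"time"
)

// epochThreshold is an assumed cutoff, not taken from the PR: numbers at or
// above it are read as Unix epoch seconds, smaller ones as delta-seconds.
const epochThreshold = 1e9

// demoNow pins "now" so the example is deterministic (hypothetical value).
var demoNow = time.Unix(1779507402, 0)

type parseFunc func(v string, now time.Time) (time.Duration, bool)

// positive reports d as usable only when it is greater than zero.
func positive(d time.Duration) (time.Duration, bool) { return d, d > 0 }

// number parses a numeric header value as a delta, or as an epoch if allowed.
func number(v string, now time.Time, allowEpoch bool) (time.Duration, bool) {
	f, err := strconv.ParseFloat(v, 64)
	// The 1e12 bound is an assumption; it also keeps the int64 conversion safe.
	if err != nil || math.IsNaN(f) || math.IsInf(f, 0) || f <= 0 || f > 1e12 {
		return 0, false
	}
	if f >= epochThreshold {
		if !allowEpoch {
			return 0, false
		}
		return positive(time.Unix(int64(f), 0).Sub(now))
	}
	return positive(time.Duration(f * float64(time.Second)))
}

func delta(v string, now time.Time) (time.Duration, bool)        { return number(v, now, false) }
func epochOrDelta(v string, now time.Time) (time.Duration, bool) { return number(v, now, true) }

// retryAfter accepts delta-seconds first, then an HTTP-date.
func retryAfter(v string, now time.Time) (time.Duration, bool) {
	if d, ok := delta(v, now); ok {
		return d, true
	}
	t, err := http.ParseTime(v)
	if err != nil {
		return 0, false
	}
	return positive(t.Sub(now))
}

// unitSuffixed handles values such as "1s" or "6m0s".
func unitSuffixed(v string, _ time.Time) (time.Duration, bool) {
	d, err := time.ParseDuration(v)
	if err != nil {
		return 0, false
	}
	return positive(d)
}

// iso8601 handles RFC 3339 timestamps such as "2026-05-23T03:46:42Z".
func iso8601(v string, now time.Time) (time.Duration, bool) {
	t, err := time.Parse(time.RFC3339, v)
	if err != nil {
		return 0, false
	}
	return positive(t.Sub(now))
}

// headerOrder mirrors the order of the list above.
var headerOrder = []struct {
	name  string
	parse parseFunc
}{
	{"Retry-After", retryAfter},
	{"RateLimit-Reset", delta},
	{"X-RateLimit-Reset", epochOrDelta},
	{"X-RateLimit-Reset-After", delta},
	{"x-ratelimit-reset-requests", unitSuffixed},
	{"x-ratelimit-reset-tokens", unitSuffixed},
	{"anthropic-ratelimit-requests-reset", iso8601},
	{"anthropic-ratelimit-tokens-reset", iso8601},
}

// cooldownFromHeaders returns the first usable window in headerOrder.
func cooldownFromHeaders(get func(string) string, now time.Time) (time.Duration, bool) {
	for _, h := range headerOrder {
		if v := strings.TrimSpace(get(h.name)); v != "" {
			if d, ok := h.parse(v, now); ok {
				return d, true
			}
		}
	}
	return 0, false
}

func main() {
	h := http.Header{}
	h.Set("X-RateLimit-Reset", "120")
	h.Set("Retry-After", "30")
	d, ok := cooldownFromHeaders(h.Get, demoNow)
	fmt.Println(d, ok) // 30s true (Retry-After wins)
}
```

An unparseable value does not stop the search here; the loop moves on to the next header, which is one reading of "no usable header".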

Body (used only when no header is usable), read at the top level and nested under an error object:

resets_in_seconds, resets_at (OpenAI Codex)
retry_after (Discord and others), reset_after
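
The body lookup could be sketched roughly as follows. `resetValue`, `resetFields`, and the field priority are hypothetical; the PR text only names the fields, the two places they are looked for, and the 64 KiB / non-JSON skip. The value is returned raw and still needs the epoch-vs-delta check and clamping.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// maxBodyBytes is the 64 KiB limit from the description.
const maxBodyBytes = 64 << 10

// resetFields lists the body fields to look for. The priority order is an
// assumption for this sketch.
var resetFields = []string{"resets_in_seconds", "resets_at", "retry_after", "reset_after"}

// resetValue returns the first numeric reset field found at the top level or,
// failing that, under a nested "error" object.
func resetValue(body []byte) (name string, value float64, ok bool) {
	if len(body) > maxBodyBytes {
		return "", 0, false // oversized bodies are skipped
	}
	var top map[string]any
	if err := json.Unmarshal(body, &top); err != nil {
		return "", 0, false // non-JSON (or non-object) bodies are skipped
	}
	objects := []map[string]any{top}
	if nested, isObj := top["error"].(map[string]any); isObj {
		objects = append(objects, nested)
	}
	for _, obj := range objects {
		for _, field := range resetFields {
			// encoding/json decodes every JSON number into float64 here.
			if v, isNum := obj[field].(float64); isNum {
				return field, v, true
			}
		}
	}
	return "", 0, false
}

func main() {
	body := []byte(`{"type":"usage_limit_reached","plan_type":"team","resets_at":1779508759,"resets_in_seconds":1357}`)
	fmt.Println(resetValue(body)) // resets_in_seconds 1357 true
}
```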

Each value is guarded against NaN, Inf, negative, and overflow, and disambiguated epoch vs delta by magnitude. Header windows clamp to MaxCooldown (6h); body windows clamp to MaxUsageLimitCooldown (24h, since a usage-limit reset can be hours or days). Bodies over 64 KiB or non-JSON are skipped.
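
As an illustration of the guard and clamp step, here is a sketch under stated assumptions. `MaxCooldown` and `MaxUsageLimitCooldown` carry the values given above; the `window` function, the 1e9 epoch cutoff, and `demoNow` are hypothetical. `demoNow` is the sample body's `resets_at` minus its `resets_in_seconds`.

```go
package main

import (
	"fmt"
	"math"
	"time"
)

const (
	MaxCooldown           = 6 * time.Hour  // header windows
	MaxUsageLimitCooldown = 24 * time.Hour // body windows

	// epochThreshold is an assumed cutoff: values at or above it are read as
	// Unix epoch seconds, smaller ones as delta-seconds.
	epochThreshold = 1e9
)

// demoNow is 1779508759 - 1357, taken from the sample 429 body.
var demoNow = time.Unix(1779507402, 0)

// window turns a raw reset value into a cooldown no longer than max.
func window(raw float64, now time.Time, max time.Duration) (time.Duration, bool) {
	if math.IsNaN(raw) || math.IsInf(raw, 0) || raw < 0 {
		return 0, false
	}
	secs := math.Floor(raw) // fractional seconds are floored
	if secs >= epochThreshold {
		secs -= float64(now.Unix()) // epoch: convert to a delta from now
	}
	if secs <= 0 {
		return 0, false // zero, or a reset time already in the past
	}
	// Clamp while still in float64 so the Duration conversion cannot overflow.
	if secs >= max.Seconds() {
		return max, true
	}
	return time.Duration(secs) * time.Second, true
}

func main() {
	fmt.Println(window(1357, demoNow, MaxUsageLimitCooldown))       // 22m37s true
	fmt.Println(window(1779508759, demoNow, MaxUsageLimitCooldown)) // 22m37s true
	fmt.Println(window(1e12, demoNow, MaxCooldown))                 // 6h0m0s true
}
```

Both fields of the sample body resolve to the same 22m37s window, which is the ~22 min reset quoted for the Codex team account.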

With this, the Codex team account (resets in ~22 min) governs pool recovery instead of a 60s re-probe, so the pool stays quietly exhausted until a member's real window lapses, then serves it. That also matches the earlier observation that the agent recovered on its own once an account's window reset.

Testing

go test ./... passes (2888), and -race passes on the proxy and vault packages. gofumpt, go vet, and golangci-lint are clean, and make generate shows no drift. New cases cover each header form (including epoch vs delta, ISO-8601, and Retry-After precedence) and each body field (top-level and nested, float values floored, oversized/non-JSON bodies ignored).


nnemirovsky added 2 commits May 23, 2026 11:47

fix(proxy): derive pool cooldown from the 429 body reset window when …

80173e9

…no header is present

fix(proxy): broaden rate-limit reset detection (more headers + body f…

9657cb6

…ields, IETF/Discord/Anthropic/OpenAI)

nnemirovsky merged commit 7b66bdf into main May 23, 2026
6 checks passed

nnemirovsky deleted the pool-cooldown-from-429-body branch May 23, 2026 04:06
