fix: stop background worker 401 storm when OAuth token expires #463

Merged
BYK merged 1 commit into main from fix/auth-401-storm on May 23, 2026

@BYK BYK commented May 23, 2026

Problem

Sentry Issue: LOREAI-GATEWAY-Z — 19 users, 2,349 events

When a single user's OAuth bearer token expires, background workers (distillation, curation, consolidation) keep retrying every 30 seconds with the stale token. Each attempt generates a Sentry.captureException call, flooding Sentry with thousands of events.

Root Cause

PR #454 added a retry-once mechanism: mark the session credential stale → fall back to global → retry if credential changed. But in single-session OAuth setups (the typical Claude Code user), the session and global credentials are the same expired token. So:

  1. markAuthStale(sessionID) marks the session stale
  2. resolveAuth(sessionID) skips the stale session, falls through to getLastSeenAuth()
  3. The global holds the same expired token → credentialChanged = false
  4. Retry-once path is never taken → Sentry.captureException() fires
  5. Next 30s idle tick repeats the cycle
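The cycle above can be reproduced with a small model. This is a minimal sketch, not the project's code: the function names markAuthStale, getLastSeenAuth, and credentialChanged come from the description, while the store shapes, resolveAuthBeforeFix, and handleUnauthorized are hypothetical stand-ins.

```typescript
// Hypothetical model of the pre-fix behavior. Only the names markAuthStale,
// getLastSeenAuth and credentialChanged come from the PR text; the rest is assumed.
type Credential = string;

const sessionAuth = new Map<string, Credential>(); // per-session credentials
const staleSessions = new Set<string>();           // sessions marked stale
let lastSeenAuth: Credential | null = null;        // global fallback credential

function markAuthStale(sessionID: string): void {
  staleSessions.add(sessionID);
}

function getLastSeenAuth(): Credential | null {
  return lastSeenAuth;
}

// Pre-fix: a stale session falls through to the global credential unconditionally.
function resolveAuthBeforeFix(sessionID: string): Credential | null {
  if (!staleSessions.has(sessionID)) {
    const own = sessionAuth.get(sessionID);
    if (own !== undefined) return own;
  }
  return getLastSeenAuth();
}

// Models one worker attempt that just received a 401 from upstream.
// "retry" is the retry-once path; "report" stands in for Sentry.captureException().
function handleUnauthorized(sessionID: string): "retry" | "report" {
  const failed = resolveAuthBeforeFix(sessionID);
  markAuthStale(sessionID);
  const fallback = resolveAuthBeforeFix(sessionID);
  const credentialChanged = fallback !== null && fallback !== failed;
  return credentialChanged ? "retry" : "report";
}

// Single-session OAuth setup: session and global hold the same expired token.
sessionAuth.set("s1", "expired-token");
lastSeenAuth = "expired-token";

// Three consecutive idle ticks: none of them ever reaches the retry path.
const outcomes = [1, 2, 3].map(() => handleUnauthorized("s1"));
console.log(outcomes); // [ 'report', 'report', 'report' ]
```

In this model the retry path is only reachable when the global credential differs from the one that just failed, which is exactly the case a single-session setup never hits.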

Fix (three layers)

  1. auth.ts: resolveAuth() detects same-token fallback. When the stale session credential and the global credential have the same value, return null instead of the expired global token. This tells callers that no usable credential is available.

  2. idle.ts: skip background work on stale auth. Before scheduling idle work (distillation, curation, consolidation), check isAuthStale(sessionID) && !resolveAuth(sessionID). If auth is stale and no fresh credential is available, skip the session entirely. Auth refreshes when the next client request arrives.

  3. instrument.ts: filter auth errors in beforeSend. Add /Worker upstream auth error/ to TRANSIENT_ERROR_PATTERNS as defense-in-depth, suppressing any residual auth error events that slip through.
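The three layers might look roughly like the sketch below. It is an illustration under stated assumptions rather than the merged code: resolveAuth, isAuthStale, TRANSIENT_ERROR_PATTERNS, and the regex come from the description, while the store shapes, shouldSkipIdleWork, isTransientError, and the simplified MinimalEvent type are hypothetical. A real Sentry event carries more fields than a single message.

```typescript
// Hypothetical stores; the real modules may keep this state differently.
type Credential = string;

const sessionAuth = new Map<string, Credential>();
const staleSessions = new Set<string>();
let lastSeenAuth: Credential | null = null;

function isAuthStale(sessionID: string): boolean {
  return staleSessions.has(sessionID);
}

// Layer 1: when the session is stale, only fall back to the global credential
// if it differs from the session's own (expired) one; otherwise return null.
function resolveAuth(sessionID: string): Credential | null {
  const own = sessionAuth.get(sessionID);
  if (!isAuthStale(sessionID)) return own ?? lastSeenAuth;
  if (lastSeenAuth === null || lastSeenAuth === own) return null;
  return lastSeenAuth;
}

// Layer 2: the guard an idle scheduler could run before queuing background work.
function shouldSkipIdleWork(sessionID: string): boolean {
  return isAuthStale(sessionID) && !resolveAuth(sessionID);
}

// Layer 3: pattern list consulted by a beforeSend-style hook.
const TRANSIENT_ERROR_PATTERNS: RegExp[] = [/Worker upstream auth error/];

function isTransientError(message: string): boolean {
  return TRANSIENT_ERROR_PATTERNS.some((pattern) => pattern.test(message));
}

// Simplified stand-in for an error event (assumption, not the Sentry type).
interface MinimalEvent {
  message?: string;
}

// Returning null drops the event; returning the event lets it through.
function beforeSend(event: MinimalEvent): MinimalEvent | null {
  const message = event.message;
  return message !== undefined && isTransientError(message) ? null : event;
}

// Single-session setup whose token has expired and been marked stale.
sessionAuth.set("s1", "expired-token");
lastSeenAuth = "expired-token";
staleSessions.add("s1");

console.log(resolveAuth("s1"));        // null
console.log(shouldSkipIdleWork("s1")); // true

// A later request from another client stores a different global credential.
lastSeenAuth = "fresh-token";
console.log(resolveAuth("s1"));        // fresh-token
console.log(shouldSkipIdleWork("s1")); // false
```

Returning null from the resolver keeps the decision in one place: the idle guard and any retry logic can treat "no credential" uniformly instead of each comparing tokens themselves.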

Tests

  • Added 2 new test cases for resolveAuth same-token detection
  • All existing auth tests pass (16 total)
  • Full suite: 1810 pass, 5 skip, 0 fail

@BYK BYK merged commit 543a53d into main May 23, 2026
7 checks passed
@BYK BYK deleted the fix/auth-401-storm branch May 23, 2026 22:38