
fix(relay): prevent Polymarket OOM crash via request deduplication #513

Merged
koala73 merged 1 commit into main from fix/polymarket-relay-oom
Feb 28, 2026

Conversation

@koala73 koala73 (Owner) commented Feb 28, 2026

Summary

Railway relay crashed with OOM (4.1GB heap) due to Polymarket request flooding:

  • Root cause: No request deduplication — each concurrent cache miss fired an independent https.get() to Gamma API. With 12 categories × multiple clients, 740 requests piled up in 10s, all buffering response bodies simultaneously.
  • Fix: In-flight promise map (polymarketInflight) coalesces concurrent requests for the same cache key into a single upstream fetch. 429/error responses are negative-cached for 30s to prevent retry storms.
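The coalescing pattern described above can be sketched as follows. This is a minimal illustration, not the PR's actual code: the names `inflight`, `fetchUpstream`, and `fetchDeduped` are assumptions standing in for `polymarketInflight` and `fetchPolymarketUpstream()`.

```javascript
// Hypothetical sketch of in-flight request deduplication: concurrent
// callers for the same cache key share a single upstream promise.
const inflight = new Map();

function fetchUpstream(cacheKey) {
  // Stand-in for the real https.get() to the Gamma API; counts calls
  // so the dedup effect is observable.
  fetchUpstream.calls = (fetchUpstream.calls || 0) + 1;
  return new Promise((resolve) =>
    setTimeout(() => resolve(`data:${cacheKey}`), 10)
  );
}

function fetchDeduped(cacheKey) {
  // Coalesce: if a fetch for this key is already in flight, reuse it.
  if (inflight.has(cacheKey)) return inflight.get(cacheKey);
  const p = fetchUpstream(cacheKey).finally(() => inflight.delete(cacheKey));
  inflight.set(cacheKey, p);
  return p;
}

// 100 concurrent requests for one key collapse into one upstream fetch.
Promise.all(
  Array.from({ length: 100 }, () => fetchDeduped('markets:politics'))
).then((results) => {
  console.log(`upstream calls: ${fetchUpstream.calls}, results: ${results.length}`);
});
```

The `finally` cleanup matters: without it, a settled promise would stay in the map and later cache misses would be served the old result forever.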

Changes

  • scripts/ais-relay.cjs — extracted fetchPolymarketUpstream() returning a Promise, added polymarketInflight Map for dedup, added 30s negative cache on failure

Test plan

  • Deploy to Railway → verify no OOM crash under load
  • Monitor logs: should see single MISS per cache key instead of hundreds
  • Verify Polymarket data still loads in the prediction markets panel
  • Check diagnostics endpoint shows polymarketInflight: 0 at rest

Concurrent Polymarket requests for the same cache key each fired
independent https.get() calls. With 12 categories × multiple clients,
740 requests piled up in 10s, all buffering response bodies → 4.1GB
heap → OOM crash on Railway.

Fix: in-flight promise map deduplicates concurrent requests to the
same cache key. 429/error responses are negative-cached for 30s to
prevent retry storms.

vercel bot commented Feb 28, 2026

The latest updates on your projects.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| worldmonitor | Building | Preview, Comment | Feb 28, 2026 8:57am |
| worldmonitor-finance | Building | Preview, Comment | Feb 28, 2026 8:57am |
| worldmonitor-happy | Building | Preview, Comment | Feb 28, 2026 8:57am |
| worldmonitor-startup | Building | Preview, Comment | Feb 28, 2026 8:57am |


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 884fe3d8d8


    if (response.statusCode !== 200) {
      console.error(`[Relay] Polymarket upstream ${response.statusCode}`);
      response.resume();
      polymarketCache.set(cacheKey, { data: '[]', timestamp: Date.now() - POLYMARKET_CACHE_TTL_MS + POLYMARKET_NEG_TTL_MS });


P1: Avoid replacing stale cache with empty payload

When the upstream call fails, this line unconditionally overwrites the polymarketCache entry with [], so any previously cached market data is lost for subsequent requests. In the failure path, only the request already holding the cached entry can return stale data; the next request hits the new entry and serves an empty list, even though stale data existed moments earlier. This creates user-visible data dropouts during transient 429/5xx periods.
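One way to address this, sketched with illustrative names (the `negativeUntil` field and `onUpstreamFailure` helper are assumptions, not the PR's code): keep any existing entry on failure and only fall back to an empty payload when nothing cached exists.

```javascript
// Sketch: on upstream failure, preserve stale cached data rather than
// overwriting it with an empty payload. Field names are illustrative.
const NEG_TTL_MS = 30_000;
const polymarketCache = new Map();

function onUpstreamFailure(cacheKey) {
  const existing = polymarketCache.get(cacheKey);
  if (existing) {
    // Keep serving the stale data; just suppress upstream retries
    // for the negative-cache window.
    polymarketCache.set(cacheKey, {
      ...existing,
      negativeUntil: Date.now() + NEG_TTL_MS,
    });
  } else {
    // Nothing cached at all: an empty list is the only option.
    polymarketCache.set(cacheKey, {
      data: '[]',
      negativeUntil: Date.now() + NEG_TTL_MS,
    });
  }
}
```

Serving stale data during a transient 429/5xx window is usually preferable to an empty panel, since the markets change slowly relative to the 30s suppression period.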


    if (response.statusCode !== 200) {
      console.error(`[Relay] Polymarket upstream ${response.statusCode}`);
      response.resume();
      polymarketCache.set(cacheKey, { data: '[]', timestamp: Date.now() - POLYMARKET_CACHE_TTL_MS + POLYMARKET_NEG_TTL_MS });


P2: Prevent public 120s caching of negative-cache responses

Encoding the negative cache by backdating the timestamp makes the entry look like a normal cache hit, so handlePolymarketRequest serves it with Cache-Control: public, max-age=120. Although the in-process entry is intended to last 30s, downstream CDN/browser caches can retain the empty [] response for 120s, extending outage impact beyond the intended retry-suppression window.
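A possible remedy, sketched under assumptions (the `negative` flag, `makeEntry`, and `cacheControlFor` helpers are illustrative, not the PR's code): mark negative-cache entries explicitly instead of backdating timestamps, so the handler can choose a different Cache-Control for them.

```javascript
// Sketch: tag negative-cache entries with an explicit flag so the
// response handler can keep them out of shared/downstream caches.
function makeEntry(data, { negative = false } = {}) {
  return { data, negative, timestamp: Date.now() };
}

function cacheControlFor(entry) {
  // Negative entries: never let CDNs or browsers hold the empty
  // payload; real data keeps the normal 120s public caching.
  if (entry.negative) return 'no-store';
  return 'public, max-age=120';
}
```

An explicit flag also lets the in-process expiry check use its own 30s TTL for negative entries without overloading the timestamp's meaning.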

