fix(scrapeguard): rate-limit Redis error logs to prevent log-storm #31

Merged: SoapyRED merged 1 commit into main from claude/fix-redis-quota-errors-iezMF on May 15, 2026

@SoapyRED
Owner

Summary

Hardens ScrapeGuard middleware against Redis quota / connection errors after the 2026-05-14 Upstash incident (~1K Sentry events in 30 min from [ScrapeGuard] Redis error: log spam once the free-tier 500K-command cap was hit).

  • New lib/scrapeguard-log-sampler.ts: module-level sampler. Classifies errors into a small allow-list (max_requests_limit, ECONNRESET, ETIMEDOUT, ECONNREFUSED, ENOTFOUND, EAI_AGAIN, timeout, unauthorized) with prefix-before-colon fallback. Caps console.warn at one line per class per 60s. Adds a category: 'scrapeguard.redis' Sentry breadcrumb (warning level) on every error — breadcrumbs don't burn event quota, so we keep volume-signal visibility for free.
  • middleware.ts: three Redis catch sites now route through logRedisErrorSampled: tryBulkRefScrape, handleScrapeProtection, and the anon 25/day per-IP counter in handleApiRateLimit (same Upstash backend, same blast radius). Fail-open behaviour preserved at every site; no new 429 paths.
  • scripts/test-scrapeguard-redis-sampling.mjs: regression test (run via node --experimental-strip-types --no-warnings, same pattern as sentry-redact-smoke.mjs). 19 assertions: classification, 100 sequential calls → 1 warn within 1s, suppression window expires at +60s, distinct classes log separately, breadcrumb fires 100/100 regardless of sampling, fail-open wrapper returns "allow" (never 429) under 100 Redis throws.
  • CHANGELOG.md + lib/changelog-data.ts: 2026-05-14 Reliability entry. Verified rendered on /changelog in the next build output.
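
As an illustration of the classification step described in the first bullet, here is a minimal sketch. It is not the code in lib/scrapeguard-log-sampler.ts: the function name `classifyRedisError`, the case-insensitive substring matching, the 64-character cap on the fallback, and the `"unknown"` class are all assumptions made for this example. Only the allow-list entries and the prefix-before-colon fallback come from the summary.

```typescript
// Allow-list entries as named in the PR summary; the matching strategy below is assumed.
const KNOWN_CLASSES = [
  "max_requests_limit",
  "ECONNRESET",
  "ETIMEDOUT",
  "ECONNREFUSED",
  "ENOTFOUND",
  "EAI_AGAIN",
  "timeout",
  "unauthorized",
] as const;

// Hypothetical helper: maps any thrown value to a short, stable class name.
function classifyRedisError(err: unknown): string {
  const message = err instanceof Error ? err.message : String(err);
  const lower = message.toLowerCase();

  // The first allow-list entry found anywhere in the message wins.
  for (const known of KNOWN_CLASSES) {
    if (lower.includes(known.toLowerCase())) return known;
  }

  // Fallback: use the text before the first colon, or the whole message
  // when there is no colon. The length cap (an assumption) keeps a long
  // message from becoming an unbounded map key.
  const colon = message.indexOf(":");
  const prefix = (colon > 0 ? message.slice(0, colon) : message).trim();
  return prefix.length > 0 ? prefix.slice(0, 64) : "unknown";
}
```

Keeping the set of classes small matters because each class gets its own suppression window, so an unbounded class space would defeat the cap on log lines.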

Incident note: 2026-05-14 Upstash quota hit.

Test plan

  • npx tsc --noEmit clean
  • npm run lint — same 49 problems / 14 errors as the pre-change baseline (verified by git stash + re-run); zero new findings in any touched file
  • node scripts/test-uld-integrity.mjs — pass
  • node scripts/test-prefix-lookup.mjs — pass
  • node --experimental-strip-types --no-warnings scripts/test-scrapeguard-redis-sampling.mjs — 19/19 pass
  • npx next build — succeeds; /changelog static HTML contains the new May 14 entry
  • scripts/smoke-test.mjs http://localhost:3030 — 30/35 (same 5 failures as pre-change branch tip: all stem from KV_REST_API_URL being unset locally, which makes handleApiRateLimit bail early before emitting rate-limit headers / before the empty-X-API-Key 401)
  • Verify in preview that a forced Redis throw still produces a 2xx + only one warn per minute (requires preview env with a stubbable Redis)
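
The fail-open property checked by the last item can be sketched as a small wrapper. This is an assumption-laden illustration, not the middleware's actual code: `failOpen`, `Decision`, and `alwaysThrows` are hypothetical names, and the real catch sites live inside tryBulkRefScrape, handleScrapeProtection, and handleApiRateLimit.

```typescript
type Decision = "allow" | "block";

// Hypothetical wrapper: runs a Redis-backed check and allows the request
// through if the check throws, reporting the error to a caller-supplied hook.
async function failOpen(
  check: () => Promise<Decision>,
  onError: (err: unknown) => void,
): Promise<Decision> {
  try {
    return await check();
  } catch (err) {
    onError(err); // e.g. a sampled logger; never rethrows
    return "allow"; // a Redis failure must not become a 429
  }
}

// Stand-in for a Redis call that fails once the command quota is exhausted.
const alwaysThrows = async (): Promise<Decision> => {
  throw new Error("ERR max_requests_limit exceeded");
};
```

With a stubbed check like `alwaysThrows`, every call resolves to "allow" while the error hook still fires once per failure, which is the behaviour the preview verification is meant to confirm.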

FAULT 5 checklist

  • siteStats.ts — N/A (no endpoint/tool count change)
  • sitemap.xml — N/A (no new URLs)
  • OpenAPI spec — N/A (no API contract change)
  • /api-docs — N/A
  • nav dropdown — N/A
  • homepage grid — N/A
  • CHANGELOG.md — YES (2026-05-14 entry added)
  • /changelog page render — YES (confirmed in .next/server/app/changelog.html from next build)
  • MCP registration — N/A
  • footer — N/A
  • GitHub README — N/A
  • npm bump — N/A (MCP package unaffected)
  • Postman — N/A
  • 200-word page minimum — N/A
  • Bing+Google sitemap ping — N/A

https://claude.ai/code/session_011vAcBgi6RcSQBMQafBCJ4H


Generated by Claude Code

@vercel

vercel Bot commented May 14, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| freighttools | Ready | Preview, Comment | May 14, 2026 6:28pm |

@cloudflare-workers-and-pages

Deploying with Cloudflare Workers

The latest updates on your project. Learn more about integrating Git with Workers.

| Status | Name | Latest Commit | Updated (UTC) |
| --- | --- | --- | --- |
| ❌ Deployment failed | freighttools | f38e038 | May 14 2026, 06:29 PM |

…ring quota/connection failures

Incident 2026-05-14 ~08:44 UTC: Upstash hit its 500K-command monthly cap
on the free tier. Every ScrapeGuard call thereafter logged a full error
line, producing ~1K Sentry events in 30 minutes before the plan was
upgraded. ScrapeGuard already fails open under @upstash/ratelimit's
default behaviour, so site availability was unaffected — only the log
spam was the problem.

Changes:
- New lib/scrapeguard-log-sampler.ts: classifies Redis errors into a
  small allow-list (max_requests_limit, ECONNRESET, ETIMEDOUT, timeout,
  …) plus a fallback prefix-before-colon. Caps console.warn output at
  one line per error class per 60s. Adds a Sentry breadcrumb
  (category=scrapeguard.redis, level=warning) on every error. Breadcrumbs
  don't burn event quota and still give us a volume signal.
- middleware.ts: every ScrapeGuard / KV catch block now routes through
  logRedisErrorSampled instead of console.error. Three call sites
  covered — handleScrapeProtection, tryBulkRefScrape, and the anon
  25/day per-IP counter in handleApiRateLimit (same Upstash backend,
  same blast radius). Fail-open behaviour preserved — Redis failures
  continue to allow the request through, no new 429 paths.
- scripts/test-scrapeguard-redis-sampling.mjs: regression test asserting
  exactly one console.warn across 100 sequential calls within 1s, that
  distinct error classes log separately, that the suppression window
  expires after 60s, that the Sentry breadcrumb fires on every error
  regardless of sampling, and that the fail-open wrapper returns
  "allow" (never 429) under Redis throw.
- CHANGELOG.md + lib/changelog-data.ts: 2026-05-14 reliability entry.
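
The per-class cap and the always-on breadcrumb described above can be sketched with an injectable clock, which is one way to make the 60s window testable without waiting. This is an assumed shape, not the PR's module: `createSampler`, `SamplerDeps`, and the log line format are hypothetical, and the PR describes a module-level sampler rather than a factory. The `breadcrumb` hook stands in for a call such as Sentry's `addBreadcrumb` with category `scrapeguard.redis` and level `warning`.

```typescript
const WINDOW_MS = 60_000; // one warn line per error class per 60s

// Hypothetical dependency bundle so tests can replace the clock and sinks.
interface SamplerDeps {
  now: () => number; // e.g. Date.now
  warn: (line: string) => void; // e.g. console.warn
  breadcrumb: (errorClass: string) => void; // e.g. a Sentry breadcrumb wrapper
}

function createSampler(deps: SamplerDeps) {
  // Timestamp of the last emitted warn line, keyed by error class.
  const lastWarnAt = new Map<string, number>();

  // Returns true when a warn line was emitted, false when it was suppressed.
  return function logSampled(errorClass: string, message: string): boolean {
    deps.breadcrumb(errorClass); // fires on every error, sampled or not

    const t = deps.now();
    const last = lastWarnAt.get(errorClass);
    if (last !== undefined && t - last < WINDOW_MS) return false;

    lastWarnAt.set(errorClass, t);
    deps.warn(`[ScrapeGuard] Redis error (${errorClass}): ${message}`);
    return true;
  };
}
```

Because the comparison is `t - last < WINDOW_MS`, a call arriving exactly 60s after the last emitted line is logged again, matching the "suppression window expires after 60s" assertion.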

https://claude.ai/code/session_011vAcBgi6RcSQBMQafBCJ4H
@SoapyRED SoapyRED force-pushed the claude/fix-redis-quota-errors-iezMF branch from f38e038 to 8015584 Compare May 15, 2026 03:32
@SoapyRED SoapyRED marked this pull request as ready for review May 15, 2026 03:40
@SoapyRED SoapyRED merged commit 6f3d1b0 into main May 15, 2026
SoapyRED added a commit that referenced this pull request May 16, 2026
Bumps Last-updated 9 May → 16 May. Captures the 17 PRs landed across
2026-05-13..2026-05-16 (PR #25 through PR #41) plus the 14 May infra
changes that didn't have their own PR (Cloudflare disconnect, Upstash
PAYG, IndexNow live).

Sections refreshed:
- Sprint cadence 13–16 May (new): full PR list with one-liner per PR.
- Platform: MCP v2.1.0 → v2.1.1; route count 36 → 38.
- Infrastructure changes (new): CF Workers disconnected 14 May, CF DNS-
  only / Vercel firewall is sole edge security, Upstash PAYG $20 cap,
  CLAUDE.md at root encodes FAULT 5 + FAULT 14, IndexNow workflow live.
- Data integrity status (new): table for ULD / Airlines / ADR / Containers
  / UN-LOCODE / HS / Vehicles / Customs-duty. ULD + Airlines + ADR
  verified: true; the other 5 verified: false pending allowlist
  extension (specific domains enumerated).
- Scraper defence status (new): PR #31 / #32 / #33 / #38 live, Phases
  3+4 deferred to runbook, Phase 2 skipped.
- Edge firewall: scoped to Vercel-only (CF inert now).
- Distribution surfaces: table with current download counts, Smithery
  score, MCP Registry STALE flag, Glama description STALE flag.
- Weekly digest CLI (new): six FAULT 14 invariants summarised; points
  at scripts/weekly-digest/README.md for the full spec.
- Vercel Analytics: 30-day baseline updated (3,311 visitors / 6,070
  PV / 69% bounce / SG 73%).
- First validated user signals: Tom (CEVA) preserved + Simon's team
  organic adoption added per 16 May report.
- What's blocked / What's next / Red flags: updated to reflect today's
  reality — vehicles+customs SHIPPED (#39 #40), weekly digest SHIPPED
  (#41), Make.com Town Hall 21 May 4PM BST queued, CEVA→WFS transition
  complete with week 2 of induction pending.
- Canonical references: added pointers to scripts/weekly-digest/ and
  the IndexNow workflow.

No CHANGELOG entry — internal doc, not user-visible. Per the prompt.

Co-authored-by: SoapyRED <soapyred@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>