fix(scrapeguard): rate-limit Redis error logs to prevent log-storm #31
Merged
| Status | Name | Latest Commit | Updated (UTC) |
|---|---|---|---|
| ❌ Deployment failed | freighttools | f38e038 | May 14 2026, 06:29 PM |
7 tasks
…ring quota/connection failures

Incident 2026-05-14 ~08:44 UTC: Upstash hit its 500K-command monthly cap on the free tier. Every ScrapeGuard call thereafter logged a full error line, producing ~1K Sentry events in 30 minutes before the plan was upgraded. ScrapeGuard already fails open under @upstash/ratelimit's default behaviour, so site availability was unaffected; only the log spam was the problem.

Changes:
- New lib/scrapeguard-log-sampler.ts: classifies Redis errors into a small allow-list (max_requests_limit, ECONNRESET, ETIMEDOUT, timeout, …) plus a fallback prefix-before-colon. Caps console.warn output at one line per error class per 60s. Adds a Sentry breadcrumb (category=scrapeguard.redis, level=warning) on every error; breadcrumbs don't burn event quota and give us a low-cost volume signal.
- middleware.ts: every ScrapeGuard / KV catch block now routes through logRedisErrorSampled instead of console.error. Three call sites covered: handleScrapeProtection, tryBulkRefScrape, and the anon 25/day per-IP counter in handleApiRateLimit (same Upstash backend, same blast radius). Fail-open behaviour preserved: Redis failures continue to allow the request through, no new 429 paths.
- scripts/test-scrapeguard-redis-sampling.mjs: regression test asserting exactly one console.warn across 100 sequential calls within 1s, that distinct error classes log separately, that the suppression window expires after 60s, that the Sentry breadcrumb fires on every error regardless of sampling, and that the fail-open wrapper returns "allow" (never 429) under Redis throw.
- CHANGELOG.md + lib/changelog-data.ts: 2026-05-14 reliability entry.

https://claude.ai/code/session_011vAcBgi6RcSQBMQafBCJ4H
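The fail-open behaviour described in the commit message can be illustrated with a minimal sketch. This is not the code in middleware.ts: `failOpen`, the `Decision` type, the `"block"` value, and the injected `limit` / `onError` callbacks are all hypothetical names chosen for illustration. The only details taken from the commit message are that a Redis throw must yield `"allow"` and that the error is handed to a logger rather than turned into a 429.

```typescript
// Hypothetical sketch of a fail-open rate-limit check (names are assumptions).
type Decision = "allow" | "block";

async function failOpen(
  // Assumed shape: a limiter call that resolves to { success } or throws on a Redis failure.
  limit: () => Promise<{ success: boolean }>,
  // Assumed hook: where a sampled logger such as logRedisErrorSampled would be plugged in.
  onError: (err: unknown) => void,
): Promise<Decision> {
  try {
    const { success } = await limit();
    return success ? "allow" : "block";
  } catch (err) {
    // The backend is unavailable or over quota: record it, then let the request through.
    onError(err);
    return "allow";
  }
}
```

Under these assumptions a caller only maps `"block"` to a 429, so a backend outage can never add a new 429 path.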
f38e038 to 8015584
5 tasks
SoapyRED added a commit that referenced this pull request May 16, 2026
Bumps Last-updated 9 May → 16 May. Captures the 17 PRs landed across 2026-05-13..2026-05-16 (PR #25 through PR #41) plus the 14 May infra changes that didn't have their own PR (Cloudflare disconnect, Upstash PAYG, IndexNow live).

Sections refreshed:
- Sprint cadence 13–16 May (new): full PR list with one-liner per PR.
- Platform: MCP v2.1.0 → v2.1.1; route count 36 → 38.
- Infrastructure changes (new): CF Workers disconnected 14 May, CF DNS-only / Vercel firewall is sole edge security, Upstash PAYG $20 cap, CLAUDE.md at root encodes FAULT 5 + FAULT 14, IndexNow workflow live.
- Data integrity status (new): table for ULD / Airlines / ADR / Containers / UN-LOCODE / HS / Vehicles / Customs-duty. ULD + Airlines + ADR verified: true; the other 5 verified: false pending allowlist extension (specific domains enumerated).
- Scraper defence status (new): PR #31 / #32 / #33 / #38 live, Phases 3+4 deferred to runbook, Phase 2 skipped.
- Edge firewall: scoped to Vercel-only (CF inert now).
- Distribution surfaces: table with current download counts, Smithery score, MCP Registry STALE flag, Glama description STALE flag.
- Weekly digest CLI (new): six FAULT 14 invariants summarised; points at scripts/weekly-digest/README.md for the full spec.
- Vercel Analytics: 30-day baseline updated (3,311 visitors / 6,070 PV / 69% bounce / SG 73%).
- First validated user signals: Tom (CEVA) preserved + Simon's team organic adoption added per 16 May report.
- What's blocked / What's next / Red flags: updated to reflect today's reality: vehicles+customs SHIPPED (#39 #40), weekly digest SHIPPED (#41), Make.com Town Hall 21 May 4PM BST queued, CEVA→WFS transition complete with week 2 of induction pending.
- Canonical references: added pointers to scripts/weekly-digest/ and the IndexNow workflow.

No CHANGELOG entry: internal doc, not user-visible. Per the prompt.

Co-authored-by: SoapyRED <soapyred@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Hardens ScrapeGuard middleware against Redis quota / connection errors after the 2026-05-14 Upstash incident (~1K Sentry events in 30 min from `[ScrapeGuard] Redis error:` log spam once the free-tier 500K-command cap was hit).

- `lib/scrapeguard-log-sampler.ts`: module-level sampler. Classifies errors into a small allow-list (`max_requests_limit`, `ECONNRESET`, `ETIMEDOUT`, `ECONNREFUSED`, `ENOTFOUND`, `EAI_AGAIN`, `timeout`, `unauthorized`) with prefix-before-colon fallback. Caps `console.warn` at one line per class per 60s. Adds a `category: 'scrapeguard.redis'` Sentry breadcrumb (warning level) on every error; breadcrumbs don't burn event quota, so we keep volume-signal visibility for free.
- `middleware.ts`: three Redis catch sites now route through `logRedisErrorSampled`: `tryBulkRefScrape`, `handleScrapeProtection`, and the anon 25/day per-IP counter in `handleApiRateLimit` (same Upstash backend, same blast radius). Fail-open behaviour preserved at every site; no new 429 paths.
- `scripts/test-scrapeguard-redis-sampling.mjs`: regression test (run via `node --experimental-strip-types --no-warnings`, same pattern as `sentry-redact-smoke.mjs`). 19 assertions: classification, 100 sequential calls → 1 warn within 1s, suppression window expires at +60s, distinct classes log separately, breadcrumb fires 100/100 regardless of sampling, fail-open wrapper returns `"allow"` (never 429) under 100 Redis throws.
- `CHANGELOG.md` + `lib/changelog-data.ts`: 2026-05-14 Reliability entry. Verified rendered on `/changelog` in the `next build` output.

Incident note: 2026-05-14 Upstash quota hit.
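The sampling scheme summarised above (classify, breadcrumb on every error, at most one warn per class per 60s) could look roughly like the sketch below. It is an illustration under stated assumptions, not the contents of `lib/scrapeguard-log-sampler.ts`: the `Sink` type, `createSampler`, `classifyRedisError`, the `(site, err)` signature, the case-insensitive substring matching, and the injected clock are all hypothetical. Only the allow-list entries, the 60s window, the `scrapeguard.redis` category, and the name `logRedisErrorSampled` come from the description.

```typescript
// Hypothetical sketch; everything except the allow-list, window, and category is assumed.
type Sink = {
  warn: (line: string) => void;
  breadcrumb: (b: { category: string; level: "warning"; message: string }) => void;
};

const KNOWN_CLASSES = [
  "max_requests_limit", "ECONNRESET", "ETIMEDOUT", "ECONNREFUSED",
  "ENOTFOUND", "EAI_AGAIN", "timeout", "unauthorized",
];
const WINDOW_MS = 60_000;

function classifyRedisError(err: unknown): string {
  const message = err instanceof Error ? err.message : String(err);
  const lower = message.toLowerCase();
  // Assumed matching rule: first allow-list entry found anywhere in the message.
  for (const known of KNOWN_CLASSES) {
    if (lower.includes(known.toLowerCase())) return known;
  }
  // Fallback: whatever precedes the first colon, or the whole message if there is none.
  const idx = message.indexOf(":");
  const prefix = (idx === -1 ? message : message.slice(0, idx)).trim();
  return prefix || "unknown";
}

function createSampler(sink: Sink, now: () => number = Date.now) {
  const lastWarnAt = new Map<string, number>(); // error class -> time of last emitted warn
  return function logRedisErrorSampled(site: string, err: unknown): void {
    const cls = classifyRedisError(err);
    // A breadcrumb is recorded for every error, independent of warn sampling.
    sink.breadcrumb({ category: "scrapeguard.redis", level: "warning", message: `${site}: ${cls}` });
    const t = now();
    const last = lastWarnAt.get(cls);
    if (last !== undefined && t - last < WINDOW_MS) return; // still inside the suppression window
    lastWarnAt.set(cls, t);
    sink.warn(`[ScrapeGuard] Redis error (${cls}) at ${site}`);
  };
}
```

In a real module the sink would presumably wrap `console.warn` and the Sentry SDK's breadcrumb call. Injecting the sink and the clock is a choice made here so that the 60s expiry can be exercised without waiting.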
Test plan
- `npx tsc --noEmit` clean
- `npm run lint`: same 49 problems / 14 errors as the pre-change baseline (verified by `git stash` + re-run); zero new findings in any touched file
- `node scripts/test-uld-integrity.mjs`: pass
- `node scripts/test-prefix-lookup.mjs`: pass
- `node --experimental-strip-types --no-warnings scripts/test-scrapeguard-redis-sampling.mjs`: 19/19 pass
- `npx next build`: succeeds; `/changelog` static HTML contains the new May 14 entry
- `scripts/smoke-test.mjs http://localhost:3030`: 30/35 (same 5 failures as pre-change branch tip: all stem from `KV_REST_API_URL` being unset locally, which makes `handleApiRateLimit` bail early before emitting rate-limit headers / before the empty-X-API-Key 401)

FAULT 5 checklist
`.next/server/app/changelog.html` from `next build`)

https://claude.ai/code/session_011vAcBgi6RcSQBMQafBCJ4H
Generated by Claude Code