Add per-IP rate limit and body size caps to relay endpoints #39
Merged
Before this change, /subscribe and /n/<id> had no per-IP throttling, no body cap, and a bare json.NewDecoder on the request body. Issue #6 lists three concrete attacks:

1. Subscriber-spam: POST random device tokens to /subscribe → unbounded Firestore writes → operator billing exposure.
2. Notify-URL pwning: anyone holding a leaked notify URL spams pushes, each one costing a Cloud Run invocation + Firestore write + APNs call.
3. Memory pressure: a multi-MiB JSON body could OOM the 256Mi Cloud Run instance before the decoder finishes parsing.

Mitigations land here:

- New rate.go owns an in-process per-IP token bucket (`golang.org/x/time/rate`, 5 rps with burst 10). Stale entries (>10min lastSeen) are swept every 5min. The /subscribe and /n/<id> routes register through `limiter.middleware`; over-cap returns 429.
- Because the limiter is in-process and Cloud Run runs with max-instances=3, the actual ceiling is 15 rps per IP across instances, not 5. That is fine for the abuse vectors, which need orders of magnitude more to be effective.
- `clientIP` reads the leftmost X-Forwarded-For entry (Cloud Run sets this) and falls back to r.RemoteAddr.
- http.MaxBytesReader wraps r.Body in both handlers: 1 KiB on /subscribe (deviceToken is ~64 hex chars), 4 KiB on /n/<id> (APNs payload max). Over-cap now returns 413 (RequestEntityTooLarge) rather than falling through to the generic 400 "bad body".

Tests (relay_test.go) cover:

- /subscribe and /n/<id> with 2 MiB valid-JSON bodies return 413 (junk-byte garbage trips a syntax error before MaxBytesReader gets a chance, so the test builds realistic oversized JSON instead).
- 10 rapid requests from the same source IP yield both 200s (burst capacity) and 429s (after exhaustion).
- Two different source IPs maintain independent buckets: one being throttled doesn't affect the other.
- clientIP parses X-Forwarded-For chains correctly and falls back to r.RemoteAddr when the header is absent.

Notify-URL rotation (the issue's third suggested fix) is deliberately deferred: it's the largest scope-per-payoff piece (Store interface change + iOS UI button + confirmation flow + agent re-setup prompt re-copy) for the smallest threat (a leaked URL can be spammed, but the rate limit + body cap already cap the damage). Recovery today remains uninstall + reinstall. Filed as a follow-up issue.

Closes #6.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
- `rate.go` implements an in-process token-bucket limiter via `golang.org/x/time/rate` (5 rps, burst 10). Stale entries (>10 min lastSeen) get swept every 5 min.
- `/subscribe` and `/n/<id>` register through the middleware; over-cap returns 429. Cloud Run's max-instances=3 means the effective ceiling is 15 rps total, which is fine for the abuse vectors (subscriber spam, leaked-URL spam): they require orders of magnitude more to be effective.
- `clientIP` reads the leftmost `X-Forwarded-For` entry (Cloud Run sets this) and falls back to `r.RemoteAddr` for local/direct requests.
- `http.MaxBytesReader` wraps `r.Body` in both handlers: 1 KiB on `/subscribe`, 4 KiB on `/n/<id>` (APNs payload max). Over-cap returns 413 instead of falling through to the generic 400.

Deliberately deferred (filed as #38)
Notify-URL rotation. Largest scope-per-payoff (Store interface change + iOS button + confirmation flow + agent re-setup prompt re-copy) for the smallest threat — rate limit + body cap already cap the damage from a leaked URL. Recovery today remains uninstall+reinstall. Tracked as a follow-up.
Test plan
- `go test ./...` from `server/sshido-relay/`: 7/7 pass:
  - `TestSubscribeBodyCap`, `TestNotifyBodyCap`: 2 MiB valid-JSON bodies return 413
  - `TestSubscribeHappyPath`: normal flow still returns 200 with the expected body shape
  - `TestRateLimitReturns429`: 10 rapid requests yield a mix of 200 (burst) and 429 (throttled)
  - `TestRateLimitIsPerIP`: independent buckets per source IP
  - `TestClientIPParsesXForwardedFor`, `TestClientIPFallsBackToRemoteAddr`: IP extraction
- `for i in $(seq 1 20); do curl -s -o /dev/null -w "%{http_code}\n" -X POST https://push.sshido.com/n/nonexistent; done | sort | uniq -c` should show some 429s; a 2 MiB body to `/subscribe` should return 413.

Implementation notes
- The body-cap tests send valid JSON (`{"deviceToken": "<2MiB of a>"}`) rather than junk bytes: a bare `aaaa...` body trips the JSON syntax error before MaxBytesReader gets a chance to fire.
- `newIPLimiter`: the sweep tick is 5 minutes and the TTL 10 minutes, so an idle IP's entry gets garbage-collected within ~15 min. Fine for memory; Cloud Run instances rarely live long enough to accumulate problem-scale state.

Closes #6.
🤖 Generated with Claude Code