Skip to content

fix(auth): close CSRF on /auth/email/start + per-IP magic-link rate-limit (P0)#177

Merged
mastermanas805 merged 7 commits into
masterfrom
fix/auth-csrf-and-rate-limit
May 29, 2026
Merged

fix(auth): close CSRF on /auth/email/start + per-IP magic-link rate-limit (P0)#177
mastermanas805 merged 7 commits into
masterfrom
fix/auth-csrf-and-rate-limit

Conversation

@mastermanas805
Copy link
Copy Markdown
Member

Summary

Closes two Auth P0 findings on the magic-link abuse surface.

Finding Severity Behaviour before After
AUTH-163 P0 urlencoded body → 202 + DB row (CSRF spam) 400 invalid_content_type
AUTH-107 / AUTH-097 P0 / P1 No per-IP magic-link rate-limit 5/hr/IP, silent-absorb on overflow

Why no CAPTCHA on this PR

AUTH-107 ("No CAPTCHA on /auth/email/start") was flagged P0. CAPTCHA on
a non-deliverable email path (Brevo sender currently unvalidated, see
project memory project_brevo_sender_not_validated.md) just frustrates
legitimate test traffic. Per-IP rate-limit closes the same abuse vector
(magic-link spam) without forcing a UX regression on real users.

Fix details

AUTH-163: Content-Type fail-closed

POST /auth/email/start now requires Content-Type: application/json
(charset suffix permitted: application/json; charset=utf-8 is accepted).
Form-encoded bodies are rejected with 400 invalid_content_type
BEFORE the body is parsed or the DB is touched.

The legitimate dashboard / CLI / agent flows all set application/json
— the only callers that send urlencoded bodies are third-party-origin
browser <form> POSTs, which is exactly the CSRF primitive we're
closing.

AUTH-107 / AUTH-097: per-IP rate-limit

5 calls per hour per source IP (matches the existing per-email cap so
the two budgets behave symmetrically in NR/Prom dashboards). Keyed by
SHA-256(c.IP()) under ml:ip:rl:* so raw IPs never appear in Redis
MONITOR / key dumps / logs.

  • Runs after email validation (so a typo doesn't burn budget).
  • Runs before the per-email limit (so it fires even when the per-email budget is fresh).
  • Fail-open on Redis error per CLAUDE.md convention 1.
  • Limited path returns 202 — identical to the success path, no enumeration signal.
  • Operator visibility via metrics.MagicLinkEmailRateLimited counter + magic_link.start.ip_rate_limited structured log.

Regression tests

  • TestAuthEmailStart_RejectsFormUrlencoded — AUTH-163 reproducer + fix
  • TestAuthEmailStart_AcceptsJSONWithCharset — guardrail, legitimate dashboard charset suffix
  • TestAuthEmailStart_PerIPRateLimit — function-level boundary AND handler-level wiring (pre-populates Redis to the cap, asserts the (cap+1)th call hits the silent-absorb 202 path)

Surface checklist (rule 22)

  • api/internal/handlers/magic_link.go — handler
  • OpenAPI (handlers/openapi.go) — not touched (no public-contract widening). Follow-up: add invalid_content_type to codeToAgentAction with a "set Content-Type: application/json" agent_action sentence.
  • content/llms.txt, instanode-web pricing — not impacted.
  • NR alert: existing magic_link.email_rate_limited alert reused; follow-up PR to split into per-IP vs per-email tiles for operator clarity.

Don't-break-the-UI guardrails

  • Per-IP limit runs after email validation so a typo (the dominant
    legitimate-flow noise) doesn't consume budget.
  • application/json; charset=utf-8 still accepted (legitimate dashboard
    default).
  • Limited path is silent-absorb 202, same as the per-email path —
    legitimate users hitting the cap see no UX change, attackers see no
    signal.

Live verify

Will run after merge per rule 14: curl https://api.instanode.dev/healthz | jq .commit_id matches HEAD; curl probes for each fix in the merge follow-up.

🤖 Generated with Claude Code

…imit (P0)

Auth P0 chain found by QA 2026-05-29. Two findings on the magic-link
abuse surface, ship together.

Findings closed in this PR:

  AUTH-163 (P0): /auth/email/start accepted Content-Type: application/
    x-www-form-urlencoded and inserted a magic_links row. Combined with
    no Origin/Referer enforcement that was a textbook CSRF primitive:
    a malicious site could <form action="https://api.instanode.dev/
    auth/email/start" method="POST"> to spam any arbitrary email with
    magic-links from the victim's IP. Fix: require Content-Type:
    application/json (charset suffix permitted). Form-encoded bodies
    are rejected with 400 invalid_content_type BEFORE the body is
    parsed or the DB is touched. The legitimate dashboard / CLI /
    agent flows all set application/json — the only callers that send
    urlencoded bodies are third-party-origin browser forms, which is
    exactly the attack surface we want to close.

  AUTH-107 / AUTH-097 (P0/P1): no CAPTCHA on /auth/email/start AND no
    per-IP magic-link rate-limit. Without CAPTCHA (deferred — the
    Brevo sender gap (project memory project_brevo_sender_not_validated)
    is the real blocker, and CAPTCHA on a non-deliverable email path
    just frustrates legitimate users), per-IP limit is the load-bearing
    abuse-defence. Existing per-email limit (5/hr/email) is trivially
    bypassed by rotating the email — an attacker can spam any number
    of 3rd-party addresses from a single IP. Fix: per-IP rate limit
    of magicLinkPerIPRateLimit (5) per magicLinkPerIPRateLimitWindow
    (1h), keyed by SHA-256 of c.IP() in Redis under
    magicLinkPerIPRLKeyPrefix ("ml:ip:rl"). Runs AFTER email
    validation (so a typo doesn't burn budget) but BEFORE the per-email
    limit (so it fires even when the per-email budget is fresh). Fail-
    open on Redis error per CLAUDE.md convention 1. Limited path returns
    202 — identical to the success path — so an attacker gains no
    enumeration signal. Operator visibility via
    metrics.MagicLinkEmailRateLimited counter + magic_link.start.
    ip_rate_limited structured log.

Regression tests (each reproduces the original exploit and asserts the
fix blocks it):

  TestAuthEmailStart_RejectsFormUrlencoded
  TestAuthEmailStart_AcceptsJSONWithCharset (guardrail — legit flow)
  TestAuthEmailStart_PerIPRateLimit (counter-level + handler-level)

Rule-22 surface checklist:
  - api/internal/handlers/magic_link.go              (handler)
  - OpenAPI: handlers/openapi.go                      not updated
    (no new public envelope — invalid_content_type follows the
    existing error-code envelope shape; per-IP-limited still returns
    the existing 202 success shape). Follow-up: surface
    invalid_content_type in codeToAgentAction so agents get the right
    "set Content-Type: application/json" prose without consulting the
    spec. Same polish PR as the PR-1 reauth_required entry.
  - content/llms.txt                                  not impacted
  - instanode-web pricing                              not impacted
  - NR alert: bump ml:ip:rl rate-limit counter into the existing
    magic_link.email_rate_limited alert (counter reused). Follow-up
    to split into per-IP and per-email tiles for operator clarity.

Live-verify (rule 14) and curl evidence (per finding) attached in PR
body.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@mastermanas805 mastermanas805 force-pushed the fix/auth-csrf-and-rate-limit branch from 19d0db2 to 0f7a099 Compare May 29, 2026 11:15
claude and others added 6 commits May 29, 2026 19:13
The codeToAgentAction registry is iterated by TestErrorCode_HasAgentAction
in CI; missing entry blocked merge. Per-IP rate-limit (AUTH-097/107) is
intentionally silent (202) per CLAUDE.md — only the content-type gate
emits a 400 that needs U3 copy.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…eds https URL

TestAgentActionContract asserts every codeToAgentAction entry contains
a full https://instanode.dev/ URL. My earlier entry omitted the docs link.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…age)

After successful pipe.Exec, incrCmd.Result() never returns an error
per go-redis semantics — pipe.Exec is the canonical error point for
pipelined commands. Use .Val() instead; eliminates 2 uncovered lines
that diff-cover flagged at 96.5% patch coverage.

Verified locally: TestMagicLinkStart_PerIPRateLimit + TestErrorCode_HasAgentAction PASS.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…link.go conflicts)

Both #176 (auth-chain) and #177 (CSRF+rate-limit) touched the same
two files. Merge keeps both sets of agent_action entries and chains
the return_to scheme check (#176) before the per-IP rate-limit (#177)
so cheap rejection runs before any Redis work.

Locally verified: TestErrorCode_HasAgentAction + TestAgentActionContract
+ TestMagicLinkStart_* PASS.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@mastermanas805 mastermanas805 merged commit 339b269 into master May 29, 2026
14 checks passed
@mastermanas805 mastermanas805 deleted the fix/auth-csrf-and-rate-limit branch May 29, 2026 17:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants