Rate limit /chat/message: per-user + per-IP defence-in-depth by vahid-ahmadi · Pull Request #48 · PolicyEngine/policyengine-uk-chat

vahid-ahmadi · 2026-05-06T11:37:20Z

Summary

Adds two-layer rate limiting on `POST /chat/message` — the only expensive endpoint. Other routes are cheap reads and stay unlimited.

Per user (or IP if anonymous) — `5/min` and `60/hour`. Authenticated clients send `X-User-Id`; the limiter keys by `user:` so signed-in users aren't blocked by anon flooders on shared NAT/proxy IPs.
Per IP defence-in-depth — `30/min`, regardless of who is sending. Catches anon abuse and runaway scripts.
Limits configurable via `RATE_LIMIT_CHAT_PER_MIN`, `RATE_LIMIT_CHAT_PER_HOUR`, `RATE_LIMIT_CHAT_IP_PER_MIN`.
429 handler returns JSON with `retry_after_seconds` and a matching `Retry-After` header.
Frontend handles 429 with an inline assistant message ("you're sending messages a bit fast — please wait ~Ns") rather than the 402 paywall flow.

Closes #46.

Implementation notes

New module `backend/rate_limit.py` owns the `Limiter`, the `chat_key_func`, and the env-tunable limit strings. Imported by both `main.py` (registration + handler) and `chatbot.py` (decorator).
`backend/tests/conftest.py` bumps test limits to 10000/min — without it the existing `TestChatMessage` tests would trip the production cap on the 6th call.
Storage is slowapi's default in-memory backend. Modal runs up to 10 containers with concurrency 100, so a determined attacker can spread requests across containers and bypass any single counter. The IP layer is approximate but adequate for the threat model — swap to Redis via `storage_uri` when we need cross-container accuracy.

Test plan

`docker-compose up`, then hammer `/chat/message` 6 times in 30s with the same logged-in user → 6th request returns 429 with `Retry-After: 60`.
Hammer 6 anonymous requests from one IP → same behaviour.
In the UI, hit the limit → an inline assistant bubble appears with "you're sending messages a bit fast — please wait ~60s". Streaming state clears, paywall is not shown.
`GET /health`, `GET /version`, `GET /conversations`, and the title generator continue to work under load.
`pytest backend/tests/test_api.py -q` passes — the new `TestRateLimitConfig` is fast and offline.

Out of scope

Cross-container accurate rate limiting (would need Redis).
Per-org/team limits (no orgs yet).
Rate-limiting cheap GET endpoints (deemed unnecessary for current threat model).

Two layers protect the only expensive endpoint: - Per user (or per IP if anonymous): 5/min and 60/hour. Authenticated clients send their user id in `X-User-Id`; the limiter keys by `user:<id>` so signed-in users aren't blocked by anon flooders on shared IPs. - Per IP defence-in-depth: 30/min, regardless of who is sending. Both limits are env-tunable (`RATE_LIMIT_CHAT_PER_MIN`, `RATE_LIMIT_CHAT_PER_HOUR`, `RATE_LIMIT_CHAT_IP_PER_MIN`). The 429 handler returns JSON with `retry_after_seconds` and a matching `Retry-After` header. Frontend handles 429 with an inline assistant message ("you're sending messages a bit fast — please wait ~Ns") rather than the paywall flow used for 402. Storage is slowapi's default in-memory backend. With Modal's max_containers=10 and concurrent=100, an attacker could spread requests across containers to bypass any single counter — the IP layer is approximate but adequate. Swap to Redis via `storage_uri` if we need cross-container accuracy later. Tests: TestRateLimitConfig covers the key function (user/IP precedence, empty header fallback, env-var overrides). End-to-end limit triggering isn't tested — it would require precise timing and per-test storage resets. conftest.py raises test limits well above pytest workload so the existing chat tests don't trip the production 5/min cap. Closes #46 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

vercel · 2026-05-06T11:37:25Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
policyengine-uk-chat	Ready	Preview, Comment	May 22, 2026 10:45am

github-actions · 2026-05-06T11:39:35Z

Beta preview has been cleaned up because this PR was closed.

- Critical: slowapi's @limiter.limit grabs the endpoint parameter named `request` and requires a starlette Request. The route named the Pydantic body `request` and the Request `http_request`, so slowapi raised on every call and /chat/message 500'd for everyone. Renamed: `request` is now the Starlette Request, the body is `chat_request`. - Per-IP layer keyed on `request.client.host`, which behind Modal's proxy is the proxy address — collapsing all traffic into one bucket. New `client_ip()` prefers the first `X-Forwarded-For` entry. - The new TestRateLimitConfig only exercised the key func, so it stayed green while the endpoint was broken. Added a signature guard (the `request` param must be a Starlette Request) and a smoke test that /chat/message does not 500 from the decorator, plus client_ip cases. Verified against slowapi 0.1.9: the fixed signature streams 5x then 429s; the pre-fix signature 500s on every call. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

vahid-ahmadi · 2026-05-22T10:45:44Z

Pushed 3ba817f — fixes the three issues from review.

🔴 Blocker — /chat/message 500'd on every call. slowapi's @limiter.limit grabs the endpoint parameter literally named request and requires it to be a starlette.requests.Request. This route named the Pydantic body request and the Starlette request http_request, so slowapi raised parameter \request` must be an instance of starlette.requests.Requestbefore the handler ran. Renamed:requestis now the Starlette Request, the parsed body ischat_request(6 references updated in the function body). I reproduced this againstslowapi 0.1.9/fastapi 0.136.1`:

signature	result
`request: ChatRequest, http_request: Request` (before)	`[500, 500, 500, 500, 500, 500, 500, 500]`
`request: Request, chat_request: ChatRequest` (after)	`[200, 200, 200, 200, 200, 429, 429, 429]`

🟠 Per-IP layer keyed on the proxy IP. get_remote_address returns request.client.host — behind Modal's proxy that is the proxy address, collapsing all anonymous traffic into one bucket. New client_ip() prefers the first X-Forwarded-For entry; chat_key_func's anon fallback and the limiter default both use it.

🟠 Test gap + 🟡 spoofability. TestRateLimitConfig only exercised the key func, so it stayed green while the endpoint was fully broken. Added: a signature guard (the request param must be a Starlette Request — pinpoints this exact bug class, offline), a smoke test that /chat/message does not 500 from the decorator, and client_ip/X-Forwarded-For cases. Also documented in rate_limit.py that both keys are unauthenticated and client-spoofable — deliberate defence-in-depth against casual abuse, not a security control.

Verification: ran the real rate_limit.py functions and the slowapi integration end-to-end (table above); inspect.signature confirmed to see the Starlette Request through slowapi's decorators; py_compile (3.13) clean on all changed files. The full pytest backend/tests/test_api.py was not run here — it imports the PolicyEngine model packages (data-token install) and is meant to run inside the backend container; the chatbot.py change is a mechanical, syntax-verified rename and the slowapi behaviour is proven independently above.

+87/−9 across chatbot.py, rate_limit.py, test_api.py.

vercel Bot deployed to Preview May 6, 2026 11:37 View deployment

vahid-ahmadi self-assigned this May 6, 2026

vahid-ahmadi requested a review from SakshiKekre May 6, 2026 12:00

vercel Bot deployed to Preview May 22, 2026 10:45 View deployment

vahid-ahmadi merged commit 9bcfbff into main May 22, 2026
4 checks passed

vahid-ahmadi mentioned this pull request May 22, 2026

Add model backend selector and Claude Agent SDK runner #51

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Rate limit /chat/message: per-user + per-IP defence-in-depth#48

Rate limit /chat/message: per-user + per-IP defence-in-depth#48
vahid-ahmadi merged 2 commits into
mainfrom
feat/rate-limiting

vahid-ahmadi commented May 6, 2026

Uh oh!

vercel Bot commented May 6, 2026 •

edited

Loading

Uh oh!

github-actions Bot commented May 6, 2026 •

edited

Loading

Uh oh!

vahid-ahmadi commented May 22, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

vahid-ahmadi commented May 6, 2026

Summary

Implementation notes

Test plan

Out of scope

Uh oh!

vercel Bot commented May 6, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions Bot commented May 6, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

vahid-ahmadi commented May 22, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

vercel Bot commented May 6, 2026 •

edited

Loading

github-actions Bot commented May 6, 2026 •

edited

Loading