Rate Limiting
Per-API-key fixed-window throttling. Default: off. Opt-in globally, override per key.
Agents loop. Agents retry. Agents have off-by-one bugs. Without a cap, a
misbehaving agent can OOM your Postgres or DDoS your own /contacts list.
Every API-key request runs a single atomic UPDATE:
UPDATE api_keys
SET usage_window_start = CASE
      WHEN usage_window_start IS NULL
        OR now() - usage_window_start >= interval '60 seconds'
      THEN now()
      ELSE usage_window_start END,
    usage_count = CASE
      WHEN usage_window_start IS NULL
        OR now() - usage_window_start >= interval '60 seconds'
      THEN 1
      ELSE usage_count + 1 END,
    last_used_at = now()
WHERE id = :key_id
RETURNING usage_count, usage_window_start;

If the returned usage_count exceeds the limit, the request gets a 429 with
a Retry-After header carrying the seconds remaining in the current window.
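
To make the window logic easier to follow, here is a minimal in-memory sketch of the same CASE rules in Python. It is not Nakatomi's implementation: `KeyUsage` and `record_request` are hypothetical names, and rounding Retry-After up to a whole second (with a floor of 1) is an assumption, since the source only says the header carries the seconds remaining.

```python
import math
from dataclasses import dataclass
from typing import Optional, Tuple

WINDOW_SECONDS = 60  # mirrors interval '60 seconds' in the UPDATE


@dataclass
class KeyUsage:
    """Hypothetical stand-in for the usage columns of one api_keys row."""
    usage_window_start: Optional[float] = None
    usage_count: int = 0


def record_request(row: KeyUsage, now: float, limit: int) -> Tuple[bool, int]:
    """Apply the CASE logic, then compare the new count with the limit.

    Returns (allowed, retry_after_seconds); retry_after is 0 when allowed.
    """
    expired = (
        row.usage_window_start is None
        or now - row.usage_window_start >= WINDOW_SECONDS
    )
    if expired:
        # Start a fresh window; this request is the first one in it.
        row.usage_window_start = now
        row.usage_count = 1
    else:
        row.usage_count += 1

    if row.usage_count > limit:
        remaining = WINDOW_SECONDS - (now - row.usage_window_start)
        # Assumption: round up and never report less than 1 second.
        return False, max(1, math.ceil(remaining))
    return True, 0
```

Note that rejected requests still increment the counter, just as the UPDATE runs on every request regardless of the outcome.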
Fixed-window means a burst right before the reset plus a burst right after can let through up to 2× the limit within about a second. If you need true sliding windows, that's v2. For almost every use case, fixed-window is fine.
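
The boundary effect can be reproduced with a small simulation. This is an illustrative sketch only: `count_allowed` is a hypothetical helper that replays timestamps through the same fixed-window rule, and the timestamps are made up for the example.

```python
WINDOW_SECONDS = 60


def count_allowed(timestamps, limit):
    """Replay request times (in seconds) through a fixed-window counter
    and return the timestamps of the requests that were allowed."""
    window_start = None
    count = 0
    allowed = []
    for now in timestamps:
        if window_start is None or now - window_start >= WINDOW_SECONDS:
            window_start, count = now, 1  # new window starts here
        else:
            count += 1
        if count <= limit:
            allowed.append(now)
    return allowed


limit = 60
opener = [0.0]                       # one request opens the window
burst_before = [59.5] * (limit - 1)  # uses up the rest of window 1
burst_after = [60.0] * limit         # window 2 starts fresh at t=60
allowed = count_allowed(opener + burst_before + burst_after, limit)

in_last_half_second = [t for t in allowed if t >= 59.5]
print(len(in_last_half_second))  # 119
```

Here 119 requests pass within half a second against a limit of 60 per minute, which is the near-2× spike described above.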
API_KEY_RATE_LIMIT_PER_MINUTE=120 # requests / minute / key
A value of 0 (the default) disables rate limiting globally.
Set a per-key limit at creation time:
curl -X POST http://localhost:8000/workspace/api-keys \
  -H "Authorization: Bearer nk_<admin>" \
  -H "Content-Type: application/json" \
  -d '{"name":"sdr-agent","role":"member","rate_limit_per_minute":60}'

Changing the limit later via PATCH is planned for v2. For now, revoke the key and re-create it.
Resolution order: per-key rate_limit_per_minute → global default → off.
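
The resolution order can be read as a small fallback chain. The sketch below is hypothetical (`effective_limit` is not a name from the codebase). It assumes an unset per-key value is represented as `None`, and it treats a per-key value of 0 the same as unset, which the source does not specify.

```python
from typing import Optional


def effective_limit(per_key: Optional[int], global_default: int) -> Optional[int]:
    """Return the requests-per-minute cap for a key, or None when limiting is off."""
    # Assumption: None or 0 on the key means "no per-key override".
    if per_key:
        return per_key
    # A global default of 0 disables rate limiting.
    if global_default > 0:
        return global_default
    return None
```

For example, `effective_limit(60, 120)` yields the per-key 60, `effective_limit(None, 120)` falls back to 120, and `effective_limit(None, 0)` returns `None` (no limiting).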
HTTP/1.1 429 Too Many Requests
Retry-After: 42
Content-Type: application/json
{"error":"rate limit exceeded (60/min); try again shortly"}

Agents should respect Retry-After. Claude's MCP client does this
automatically; other clients vary.
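
For clients that do not handle this on their own, a retry wrapper along the following lines may be enough. It is a transport-agnostic sketch, not part of Nakatomi: `call_with_retry_after` and the `send` callable (which returns a status code and a headers dict) are hypothetical, and falling back to 1 second when the header is missing or not an integer is an assumption.

```python
import time
from typing import Callable, Dict, Tuple

Response = Tuple[int, Dict[str, str]]  # (status code, headers)


def call_with_retry_after(
    send: Callable[[], Response],
    max_attempts: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call send(); on a 429, wait Retry-After seconds and try again."""
    status, headers = send()
    attempts = 1
    while status == 429 and attempts < max_attempts:
        try:
            delay = max(0, int(headers.get("Retry-After", "1")))
        except ValueError:
            delay = 1  # assumption: fall back to 1s on an unparsable value
        sleep(delay)
        status, headers = send()
        attempts += 1
    # Either a non-429 response, or the last 429 once attempts run out.
    return status, headers
```

Wrapping an HTTP call is a matter of passing a zero-argument function that performs the request and returns `(status, headers)`. The `sleep` parameter exists so the wait can be stubbed out in tests.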
The api_keys.usage_count + usage_window_start columns reflect the
current window. For historical observation, mirror requests into your
logging layer (Nakatomi logs to stdout) — dedicated rate-limit metrics
are on the v2 roadmap.
| Use case | Suggested limit |
|---|---|
| Interactive agent (Claude Desktop, Cursor) | 60-120 / min |
| Batch import / bulk upsert | 600 / min on a dedicated key |
| Webhook relay agent | 30 / min (most loads are low) |
| Read-only dashboard user | 30 / min |
| Service-account for integrations | 300-600 / min |
When in doubt: set a conservative number and bump it when you see 429s in logs. Cheaper than over-provisioning Postgres.
The following are not rate-limited:
- User JWT requests (no counter on the User row in v1). Protect these at the load balancer if it matters.
- Discovery endpoints (/health, /schema, /llms.txt, /.well-known/). They're cheap and unauthenticated.
Set a low limit on a test key, loop:
for i in $(seq 1 100); do
  curl -o /dev/null -s -w "%{http_code}\n" \
    http://localhost:8000/contacts \
    -H "Authorization: Bearer nk_..."
done | sort | uniq -c

Expect a mix of 200s and 429s.
Repository · Issues · MIT licensed · maintained by Matt Dula