Rate Limit
Per-IP token bucket guarding mutating + WebSocket + login endpoints against credential stuffing and runaway clients. Zero external deps — in-memory state, lock-free refill.
- LAN-only deployment: the default `enabled: true` is harmless; legitimate browser traffic stays well within the 120-token `api` bucket (≥1 token per request, 2 tok/s refill).
- External / public-internet deployment: leave it on. The `auth` bucket (cap 10, 0.2 tok/s) holds sustained `/login` attempts to ~12/min per IP once the initial burst of 10 is spent, making credential stuffing infeasible.
- Behind nginx / Cloudflare with their own rate limit: disable in `server.yml` (`security.rateLimit.enabled: false`) to avoid double-counting.
```yaml
# server.yml
security:
  rateLimit:
    enabled: true
    apiCapacity: 120          # /api/ + /ws/ buckets per IP
    apiRefillPerSecond: 2.0
    authCapacity: 10          # /login + /api/auth/login per IP
    authRefillPerSecond: 0.2
```

Env override:

- `VIBECODER_SECURITY_RATE_LIMIT_ENABLED=false` (etc.)

Restart the container to apply.
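
To see what the default capacities and refill rates allow in practice, the worst case for a single IP can be worked out by hand: a full bucket's burst plus whatever refills during the window. The sketch below assumes every request costs exactly 1 token; `max_requests` is a hypothetical helper for this arithmetic, not part of the server.

```python
import math


def max_requests(capacity: float, refill_per_second: float, window_seconds: float) -> int:
    """Upper bound on 1-token requests one IP can get through in a window.

    Starts from a full bucket: the initial burst plus the tokens that
    refill while the window is open.
    """
    return math.floor(capacity + refill_per_second * window_seconds)


# Defaults from server.yml
api_first_minute = max_requests(120, 2.0, 60)    # 120 burst + 120 refilled
auth_first_minute = max_requests(10, 0.2, 60)    # 10 burst + 12 refilled
auth_per_hour = max_requests(10, 0.2, 3600)      # 10 burst + 720 refilled
```

After the first burst is spent, only the refill term matters, which is where the ~12 login attempts per minute figure comes from.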
| Path prefix | Bucket | Notes |
|---|---|---|
| `/api/` | api | Every JSON endpoint |
| `/ws/` | api | WebSocket handshake. Closed connections re-open later; long-lived streams cost 1 token |
| `/login` (POST) | auth | SSR login form |
| `/api/auth/login` | auth | JSON login |

Everything else (/static/*, SSR pages, /setup, /health)
skips the limiter.
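
The routing in the table can be sketched as a small classifier. This is an illustration only: `bucket_for` is a hypothetical name, and exact matching on the two login paths is an assumption (the table only lists them as prefixes).

```python
from typing import Optional


def bucket_for(method: str, path: str) -> Optional[str]:
    """Return "auth", "api", or None (limiter skipped) for a request."""
    # Login endpoints are checked first: /api/auth/login also starts with
    # /api/, and it has to land in the stricter auth bucket.
    if path == "/api/auth/login":
        return "auth"
    if path == "/login" and method.upper() == "POST":
        return "auth"
    if path.startswith("/api/") or path.startswith("/ws/"):
        return "api"
    # /static/*, SSR pages, /setup, /health and anything else are not limited.
    return None
```

The ordering matters: testing the `/api/` prefix before the login paths would silently put JSON logins in the larger `api` bucket.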
Once installAuth recognizes a valid admin Bearer token / cookie, the
limiter short-circuits with null. The rate counter is not bumped for
that request. This keeps backup, install, sub-agent fanout, etc. from
accidentally throttling the operator.
Detection happens in the limiter phase, before installAuth runs
its full lookup. It uses the same device-row read, so the cost is one
SHA-256 hash + one PG row read per request, which is negligible in practice.
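
A minimal sketch of the bypass, under stated assumptions: the `DEVICES` dict stands in for the PG device table, tokens are stored as SHA-256 hex digests, and `limiter_phase` / `is_admin_token` are hypothetical names. The real schema and lookup are not shown in this page.

```python
import hashlib
from typing import Callable, Optional

# Hypothetical stand-in for the device table: SHA-256 hex digest -> role.
DEVICES = {hashlib.sha256(b"admin-secret").hexdigest(): "admin"}


def is_admin_token(token: Optional[str]) -> bool:
    """One hash plus one keyed lookup, mirroring the per-request cost above."""
    if not token:
        return False
    digest = hashlib.sha256(token.encode("utf-8")).hexdigest()
    return DEVICES.get(digest) == "admin"


def limiter_phase(token: Optional[str], acquire: Callable[[], bool]) -> Optional[int]:
    """Return None to let the request through, or 429 to reject it."""
    if is_admin_token(token):
        # Short-circuit: acquire() is never called, so no token is spent.
        return None
    return None if acquire() else 429
```

Because the admin check runs before `acquire`, an operator's bulk traffic leaves the bucket untouched rather than merely being allowed through it.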
```http
HTTP/1.1 429 Too Many Requests
Retry-After: 4
Content-Type: application/json

{"code":"rate_limited","message":"too many requests","retryAfter":4}
```

The Retry-After value is integer seconds, rounded up — the
limiter computes how long until 1 token becomes available given
the current bucket deficit + the refill rate.
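
The calculation described above can be sketched as follows. `retry_after_seconds` is a hypothetical helper, and a request cost of 1 token is assumed.

```python
import math


def retry_after_seconds(tokens: float, refill_per_second: float, cost: float = 1.0) -> int:
    """Whole seconds, rounded up, until `cost` tokens are available."""
    deficit = max(0.0, cost - tokens)
    return math.ceil(deficit / refill_per_second)


# auth bucket holding 0.25 tokens: 0.75 missing at 0.2 tok/s is 3.75 s, sent as 4
auth_wait = retry_after_seconds(tokens=0.25, refill_per_second=0.2)
# api bucket that is empty: 1 token at 2 tok/s is 0.5 s, sent as 1
api_wait = retry_after_seconds(tokens=0.0, refill_per_second=2.0)
```

Rounding up rather than to the nearest second means a client that honors `Retry-After` never retries while the bucket is still short.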
Visible via Metrics:
| Metric | Description |
|---|---|
| `vibe_rate_limit_buckets_active{bucket="api"\|"auth"}` | Active IPs in the bucket map |
| `vibe_rate_limit_429_total{path_bucket="api"\|"auth"}` | Cumulative 429 responses |

Useful PromQL:
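```promql
# 429 responses per minute, averaged over the last 5 minutes
# (rate() returns a per-second value, hence the * 60)
rate(vibe_rate_limit_429_total[5m]) * 60
# How many unique IPs are currently tracked (per bucket)?
vibe_rate_limit_buckets_active
```
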
`RateLimiter` keeps a `ConcurrentHashMap<ip, Bucket>` where each
bucket stores `tokens: Double` + `lastNanos: Long`. Refill is
lock-free in the hot path; only the `tryAcquire` block briefly
enters `synchronized(bucket)` to atomically subtract.
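
For readers who want the mechanics, here is a rough token-bucket sketch in Python. It is not the server's code: the class names are made up, time is tracked in float seconds rather than nanoseconds, and for simplicity both refill and subtraction happen under one per-bucket lock instead of the lock-free refill described above.

```python
import threading
import time
from typing import Callable, Dict


class Bucket:
    def __init__(self, capacity: float, now: float) -> None:
        self.tokens = capacity          # a new IP starts with a full bucket
        self.last = now                 # timestamp of the last refill
        self.lock = threading.Lock()


class RateLimiterSketch:
    def __init__(self, capacity: float, refill_per_second: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.clock = clock
        self.buckets: Dict[str, Bucket] = {}
        self.map_lock = threading.Lock()

    def try_acquire(self, ip: str, cost: float = 1.0) -> bool:
        now = self.clock()
        with self.map_lock:             # guards only the get-or-create step
            bucket = self.buckets.get(ip)
            if bucket is None:
                bucket = self.buckets[ip] = Bucket(self.capacity, now)
        with bucket.lock:
            # Refill for the elapsed time, never above capacity.
            elapsed = max(0.0, now - bucket.last)
            bucket.tokens = min(self.capacity,
                                bucket.tokens + elapsed * self.refill_per_second)
            bucket.last = now
            if bucket.tokens >= cost:
                bucket.tokens -= cost
                return True
            return False
```

Passing the clock in as a parameter is a testing convenience: a fake clock makes the refill behavior deterministic without sleeping.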
There's a hard `MAX_IPS = 10_000` safety cap: if it is exceeded (which
would require an attacker rotating IPs aggressively, at single-user dev
server scale), the oldest half of the buckets is evicted. In normal
single-user operation the map stays at ~1–5 IPs.
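
The eviction rule can be sketched like this. Two assumptions are baked in: "oldest" is read as least recently touched, and the check runs just before a new IP is inserted. `touch` and `evict_oldest_half` are hypothetical names, and the map here holds only timestamps rather than full buckets.

```python
from typing import Dict

MAX_IPS = 10_000


def evict_oldest_half(last_seen: Dict[str, float]) -> None:
    """Drop, in place, the half of the entries with the oldest timestamps."""
    victims = sorted(last_seen, key=last_seen.get)[: len(last_seen) // 2]
    for ip in victims:
        del last_seen[ip]


def touch(last_seen: Dict[str, float], ip: str, now: float,
          max_ips: int = MAX_IPS) -> None:
    """Record activity for an IP, evicting first if a new entry would overflow."""
    if ip not in last_seen and len(last_seen) >= max_ips:
        evict_oldest_half(last_seen)
    last_seen[ip] = now
```

Evicting half the map at once keeps the sort rare: after one pass, thousands of new IPs have to arrive before the cap is hit again.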
State is in-memory only — restarting the server resets all buckets. For a single-user dev server this is fine; the burst window is short anyway.
- No persistent ban: repeated abusers from the same IP just see 429 until they slow down. Hard IP block is the separate `AuthService.ipFailures` tracker, which catches multiple-account credential stuffing over 24 h.
- No cluster sync: a k8s pod restart loses state. In a multi-replica deployment you'd need an external store (Redis). A single-user, single-replica dev server doesn't.
- Security Model — full auth threat model
- Metrics — operational dashboards