Rate Limit

Rate Limit (per-IP)

Per-IP token bucket guarding mutating + WebSocket + login endpoints against credential stuffing and runaway clients. Zero external deps — in-memory state, lock-free refill.

When to use

LAN-only deployment: the default enabled: true is harmless; legitimate browser traffic stays well within the 120-token api bucket (≥1 token per request, 2 tok/s refill).
External / public-internet deployment: leave it on. The auth bucket (cap 10, 0.2 tok/s) allows a burst of 10 /login attempts and then about 12 per minute sustained, which makes credential stuffing impractical.
Behind nginx / Cloudflare with their own rate limit: disable in server.yml (security.rateLimit.enabled: false) to avoid double-counting.
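
To make the bucket numbers concrete, the following Python sketch computes an upper bound on how many requests one IP can get through in a given window. It assumes the bucket starts full and every request costs exactly 1 token; the function name `max_requests` is illustrative and not part of the server.

```python
def max_requests(capacity: float, refill_per_second: float, window_seconds: float) -> int:
    """Upper bound on admitted requests in a window.

    Assumes a full bucket at the start of the window and a cost of
    1 token per request (both are assumptions of this sketch).
    """
    # Initial burst (capacity) plus whatever refills during the window.
    return int(capacity + refill_per_second * window_seconds)


# auth bucket defaults: burst of 10, then 0.2 tok/s
auth_first_minute = max_requests(10, 0.2, 60)

# api bucket defaults: burst of 120, then 2 tok/s
api_first_minute = max_requests(120, 2.0, 60)
```

With the default settings, the first minute is the most permissive one because it includes the full burst; every later minute is bounded by the refill rate alone.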

Configuration

# server.yml
security:
  rateLimit:
    enabled: true
    apiCapacity: 120        # /api/ + /ws/ buckets per IP
    apiRefillPerSecond: 2.0
    authCapacity: 10        # /login + /api/auth/login per IP
    authRefillPerSecond: 0.2

Env override:

VIBECODER_SECURITY_RATE_LIMIT_ENABLED=false (etc.)

Restart the container to apply.

What's covered

| Path prefix | Bucket | Notes |
| --- | --- | --- |
| `/api/` | `api` | Every JSON endpoint |
| `/ws/` | `api` | WebSocket handshake only: a long-lived stream costs 1 token, and each reconnect after a close costs another |
| `/login` (POST) | `auth` | SSR login form |
| `/api/auth/login` | `auth` | JSON login |

Everything else (/static/*, SSR pages, /setup, /health) skips the limiter.
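
The routing above can be sketched as a small classifier. This is a Python illustration, not the server's code: the function name `bucket_for` is hypothetical, and it assumes exact matches for the two login paths and that `/api/auth/login` is limited regardless of HTTP method. The point it shows is ordering: the auth routes have to be checked before the generic `/api/` prefix, because `/api/auth/login` matches both.

```python
from typing import Optional


def bucket_for(method: str, path: str) -> Optional[str]:
    """Return "auth", "api", or None (None means the limiter is skipped)."""
    # Check auth routes first: /api/auth/login also starts with /api/.
    if path == "/api/auth/login":
        return "auth"
    if path == "/login":
        # Only the form submission is limited; GET renders the SSR page.
        return "auth" if method.upper() == "POST" else None
    if path.startswith("/api/") or path.startswith("/ws/"):
        return "api"
    # /static/*, SSR pages, /setup, /health, ...
    return None
```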

Admin bypass

When a request carries a valid admin Bearer token or cookie, the limiter short-circuits and returns null, and the rate counter is not bumped for that request. This keeps backup, install, sub-agent fanout, and similar operations from accidentally throttling the operator.

Detection happens in the limiter phase, before installAuth runs its full lookup. It uses the same device-row read, so the cost is one SHA-256 hash plus one PG row read per request, which is negligible in practice.
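
A rough Python sketch of that bypass check follows. Everything here is an assumption made for illustration: the in-memory `DEVICE_ROWS` dict stands in for the PG device table, and the names `is_admin_token` and `should_rate_limit` are hypothetical. It only shows the shape of the check: hash the presented token once, look up one row, and skip the limiter when the row belongs to an admin.

```python
import hashlib
from typing import Optional

# Hypothetical stand-in for the device table: SHA-256(token) hex -> is_admin.
DEVICE_ROWS = {
    hashlib.sha256(b"admin-secret").hexdigest(): True,
    hashlib.sha256(b"guest-secret").hexdigest(): False,
}


def is_admin_token(token: Optional[str]) -> bool:
    """One hash plus one row lookup, mirroring the cost described above."""
    if not token:
        return False
    digest = hashlib.sha256(token.encode("utf-8")).hexdigest()
    return DEVICE_ROWS.get(digest, False)


def should_rate_limit(token: Optional[str]) -> bool:
    """Admins bypass the limiter; everyone else goes through the bucket."""
    return not is_admin_token(token)
```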

429 response

HTTP/1.1 429 Too Many Requests
Retry-After: 4
Content-Type: application/json

{"code":"rate_limited","message":"too many requests","retryAfter":4}

The Retry-After value is integer seconds, rounded up — the limiter computes how long until 1 token becomes available given the current bucket deficit + the refill rate.
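
As a minimal sketch of that calculation in Python (the function name and the floor of 1 second are assumptions of this example, not confirmed server behavior):

```python
import math


def retry_after_seconds(tokens: float, refill_per_second: float, cost: float = 1.0) -> int:
    """Whole seconds until `cost` tokens are available, rounded up."""
    # How many tokens are still missing for the next request.
    deficit = max(0.0, cost - tokens)
    # Time to refill the deficit, rounded up to an integer number of seconds.
    # The minimum of 1 is an assumption so a rejected client never sees 0.
    return max(1, math.ceil(deficit / refill_per_second))


# Empty api bucket at 2 tok/s: half a second, rounded up.
api_wait = retry_after_seconds(0.0, 2.0)

# Empty auth bucket at 0.2 tok/s.
auth_wait = retry_after_seconds(0.0, 0.2)
```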

Metrics

Visible via Metrics:

| Metric | Description |
| --- | --- |
| `vibe_rate_limit_buckets_active{bucket="api"\|"auth"}` | Active IPs in the bucket map |
| `vibe_rate_limit_429_total{path_bucket="api"\|"auth"}` | Cumulative 429 responses |

Useful PromQL:

# Reject rate (429s per minute, averaged over the last 5 minutes)
rate(vibe_rate_limit_429_total[5m]) * 60

# How many unique IPs are currently tracked?
vibe_rate_limit_buckets_active

Implementation notes

RateLimiter keeps a ConcurrentHashMap<ip, Bucket> where each bucket stores tokens: Double + lastNanos: Long. Refill is lock-free in the hot path; only the tryAcquire block briefly enters synchronized(bucket) to subtract a token atomically.
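
The token-bucket idea can be sketched in Python as follows. This is not the server's implementation: it mirrors the `tokens` / `lastNanos` fields described above, but for simplicity it does both the refill and the subtraction under one per-bucket lock instead of a lock-free refill, and the injectable `clock_ns` parameter is an addition to make the behavior easy to test.

```python
import threading
import time
from typing import Callable, Dict


class Bucket:
    def __init__(self, capacity: float, now_ns: int) -> None:
        self.tokens = capacity        # a new bucket starts full
        self.last_nanos = now_ns      # time of the last refill
        self.lock = threading.Lock()


class RateLimiter:
    def __init__(self, capacity: float, refill_per_second: float,
                 clock_ns: Callable[[], int] = time.monotonic_ns) -> None:
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.clock_ns = clock_ns
        self.buckets: Dict[str, Bucket] = {}
        self.map_lock = threading.Lock()

    def try_acquire(self, ip: str) -> bool:
        """Take 1 token for this IP; False means the caller should send 429."""
        now = self.clock_ns()
        with self.map_lock:
            bucket = self.buckets.get(ip)
            if bucket is None:
                bucket = self.buckets[ip] = Bucket(self.capacity, now)
        with bucket.lock:
            # Refill for the elapsed time, capped at capacity.
            elapsed = max(0, now - bucket.last_nanos) / 1e9
            bucket.tokens = min(self.capacity,
                                bucket.tokens + elapsed * self.refill_per_second)
            bucket.last_nanos = max(bucket.last_nanos, now)
            if bucket.tokens >= 1.0:
                bucket.tokens -= 1.0
                return True
            return False
```

Because refill is computed lazily from the timestamp, an idle bucket costs nothing between requests; no background timer is needed.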

There's a hard MAX_IPS = 10_000 safety cap: if it is exceeded (which, at single-user dev server scale, would require an attacker rotating IPs aggressively), the oldest half of the buckets is evicted. In normal single-user operation the map stays at ~1–5 IPs.
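
One way such an eviction could look, sketched in Python. It assumes "oldest" means the smallest last-refill timestamp, which the text does not spell out; the function name and the plain dict of timestamps are illustrative only.

```python
from typing import Dict

MAX_IPS = 10_000


def evict_oldest_half(last_seen_ns: Dict[str, int], max_ips: int = MAX_IPS) -> None:
    """Drop the older half of the entries once the map exceeds max_ips.

    `last_seen_ns` maps IP -> last refill timestamp in nanoseconds
    (an assumed notion of bucket age for this sketch).
    """
    if len(last_seen_ns) <= max_ips:
        return
    # Sort IPs from least to most recently seen, then remove the first half.
    by_age = sorted(last_seen_ns, key=last_seen_ns.get)
    for ip in by_age[: len(by_age) // 2]:
        del last_seen_ns[ip]
```

Evicting half at once, rather than one entry per insert, keeps the sort off the hot path: it runs only when the cap is actually hit.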

State is in-memory only — restarting the server resets all buckets. For a single-user dev server this is fine; the burst window is short anyway.

Trade-offs

No persistent ban: repeated abusers from the same IP just keep seeing 429 until they slow down. Hard IP blocking is handled separately by the AuthService.ipFailures tracker, which catches multiple-account credential stuffing over 24 h.
No cluster sync: a k8s pod restart loses state. A multi-replica deployment would need an external store (Redis); a single-user, single-replica dev server doesn't.
