Operations

Ameya Borkar edited this page Jun 4, 2026 · 7 revisions

Headers, IP keys, PII safety, observability, and failure modes.

Standards-compliant headers

buildRateLimitHeaders(decision, opts) returns a plain Record<string, string>; the adapters call it for you. The emit option selects one of three header families:

  • draft (default) — the IETF RateLimit-Limit/-Remaining/-Reset triple.
  • structured — RFC 9651 RateLimit + RateLimit-Policy.
  • legacy — the X-RateLimit-* triple.

On a denial, a Retry-After header (delta-seconds, minimum 1) is always added. All time math derives from the injected now.
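
As a rough illustration of the draft family and the Retry-After rule, here is a minimal sketch. The HeaderSketchDecision shape (in particular resetAtMs) and the name draftHeadersSketch are assumptions made for this example, not throttlekit's actual types, and the real buildRateLimitHeaders may derive the reset value differently.

```typescript
// Hypothetical decision shape for this sketch only.
interface HeaderSketchDecision {
  allowed: boolean;
  limit: number;
  remaining: number;
  resetAtMs: number;    // assumed: absolute time (ms) at which the quota resets
  retryAfterMs: number; // time (ms) until a retry could succeed
}

// Builds the draft triple; `now` is injected so the output is deterministic.
function draftHeadersSketch(d: HeaderSketchDecision, now: number): Record<string, string> {
  const headers: Record<string, string> = {
    "RateLimit-Limit": String(d.limit),
    "RateLimit-Remaining": String(Math.max(0, d.remaining)),
    // Delta-seconds until reset, rounded up and never negative.
    "RateLimit-Reset": String(Math.max(0, Math.ceil((d.resetAtMs - now) / 1000))),
  };
  if (!d.allowed) {
    // Retry-After is delta-seconds with a floor of 1, so a sub-second wait
    // is never advertised as "retry in 0 seconds".
    headers["Retry-After"] = String(Math.max(1, Math.ceil(d.retryAfterMs / 1000)));
  }
  return headers;
}
```

Passing now explicitly, rather than reading the clock inside the function, keeps header values reproducible in tests.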

Trusted proxy & IPv6 aggregation

Trusting X-Forwarded-For blindly is the classic bypass. clientIp refuses to: the default is trustProxy: false (use the socket peer), trust is opt-in as a hop count or CIDR allowlist, and it aggregates IPv6 to a configurable prefix (/64 default) so one customer can't rotate through billions of addresses.

import { clientIp } from "throttlekit";

const key = clientIp(
  { remoteAddr: req.socket.remoteAddress ?? "", xForwardedFor: req.headers["x-forwarded-for"] },
  { trustProxy: ["10.0.0.0/8"], ipv6Prefix: 64 }, // or trustProxy: 1 for a single hop
);

The Express and fetch adapters accept trustProxy/ipv6Prefix directly and derive this key by default.
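
To make the two ideas above concrete, here is a minimal sketch of hop-count trust and IPv6 prefix aggregation. It is not throttlekit's implementation: pickClientIpByHops and aggregateIpv6 are hypothetical names, the hop-count semantics follow the common "walk the chain from the socket peer backwards" convention, and the IPv6 code assumes a well-formed address with no zone id and no embedded dotted IPv4.

```typescript
// Hop-count trust: the chain is [socket peer, ...X-Forwarded-For right to left].
// With n trusted hops, the first n entries are proxies and entry n is the client.
function pickClientIpByHops(remoteAddr: string, xForwardedFor: string | undefined, trustedHops: number): string {
  const forwarded = (xForwardedFor ?? "")
    .split(",")
    .map((s) => s.trim())
    .filter((s) => s.length > 0)
    .reverse();
  const chain = [remoteAddr, ...forwarded];
  const index = Math.min(Math.max(0, trustedHops), chain.length - 1);
  return chain[index] ?? remoteAddr;
}

// Expands "::" compression into eight 16-bit groups.
function expandIpv6(addr: string): number[] {
  const [head, tail] = addr.split("::");
  const headParts = head ? head.split(":") : [];
  const tailParts = tail ? tail.split(":") : [];
  const missing = addr.includes("::") ? 8 - headParts.length - tailParts.length : 0;
  const groups = [...headParts, ...new Array<string>(Math.max(0, missing)).fill("0"), ...tailParts];
  if (groups.length !== 8) throw new Error(`not a plain IPv6 address: ${addr}`);
  return groups.map((g) => parseInt(g, 16));
}

// Zeroes every bit past the prefix so all addresses in one block share a key.
function aggregateIpv6(addr: string, prefix = 64): string {
  const masked = expandIpv6(addr).map((group, i) => {
    const bitsKept = Math.min(16, Math.max(0, prefix - i * 16));
    const mask = (0xffff << (16 - bitsKept)) & 0xffff;
    return (group & mask).toString(16);
  });
  return `${masked.join(":")}/${prefix}`;
}
```

With zero trusted hops the forwarded header is ignored entirely, so a client-supplied X-Forwarded-For value cannot change the key.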

PII-safe keys (HMAC)

Hash raw identifiers with a server secret before they reach the store, so a shared Redis never holds the raw value:

import { hmacKeyer } from "throttlekit";
const keyer = hmacKeyer(process.env.RL_SECRET ?? "");
await limiter.check(keyer(rawUserId));
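
For readers who want to see what such a keyer amounts to, here is a minimal sketch using Node's built-in crypto module. The name hmacKeyerSketch, the choice of SHA-256 with hex output, and the rejection of an empty secret are assumptions of this example; the text above does not say how hmacKeyer handles any of these.

```typescript
import { createHmac } from "node:crypto";

// Returns a function that maps a raw identifier to a keyed hash.
// Assumption: SHA-256 and hex encoding are picked for illustration only.
function hmacKeyerSketch(secret: string): (raw: string) => string {
  if (secret.length === 0) {
    // An empty secret would make the hashes reproducible by anyone.
    throw new Error("HMAC secret must not be empty");
  }
  return (raw: string) => createHmac("sha256", secret).update(raw).digest("hex");
}
```

The same identifier always maps to the same key, so limiting still works per user, while the store only ever sees the digest.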

Observability

Every Decision is a plain, loggable object. For metrics, the optional OpenTelemetry layer (throttlekit/otel) wraps a limiter or guard with your own Meter:

import { instrumentLimiter, instrumentGuard } from "throttlekit/otel";
import { metrics } from "@opentelemetry/api";

const meter = metrics.getMeter("my-service");
const observed = instrumentLimiter(limiter, meter); // throttlekit.checks / .remaining / .store.latency
instrumentGuard(guard, meter);                       // concurrency.limit / .inflight / .rtt_noload

The metric names and span-attribute keys are a stable contract: they are exported as METRIC_NAMES / SPAN_ATTRIBUTES, pinned by a test, and changed only on a major version bump. For trace-level visibility, recordDecisionOnSpan(span, decision, strategy) stamps throttlekit.allowed / .strategy / .limit / .remaining / .retry_after_ms onto a span you already have. The full reference (instrument types, units, attributes, the Prometheus . → _ name mapping) is in docs/METRICS.md.

For zero-config insight without a metrics backend, wrap a limiter with withAnalytics — it tracks allow/deny counts and the top-K heavy hitters (keys driving the most traffic and denials) in bounded memory via Space-Saving (Metwally et al. 2005), so your worst offenders surface even under a flood of unique keys:

import { withAnalytics, rateLimit, gcra } from "throttlekit";

const limiter = withAnalytics(rateLimit({ strategy: gcra({ limit: 100, periodMs: 60_000 }) }));
await limiter.check(clientIp); // use exactly like any limiter
const a = limiter.analytics(); // { allowed, denied, total, denyRate, topRequested: [...], topDenied: [...] }
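
The bounded-memory claim rests on the Space-Saving algorithm. The following is a minimal sketch of that algorithm on its own, not of withAnalytics: the class name SpaceSavingSketch and its methods are hypothetical, and the minimum is found with a linear scan for brevity where a production version would typically use a faster structure.

```typescript
interface HeavyHitter { key: string; count: number; error: number }

// Space-Saving: tracks at most `capacity` keys, whatever the stream's cardinality.
class SpaceSavingSketch {
  private readonly capacity: number;
  private readonly counters = new Map<string, { count: number; error: number }>();

  constructor(capacity: number) {
    this.capacity = capacity;
  }

  offer(key: string): void {
    const tracked = this.counters.get(key);
    if (tracked) { tracked.count += 1; return; }
    if (this.counters.size < this.capacity) {
      this.counters.set(key, { count: 1, error: 0 });
      return;
    }
    // Table is full: evict the smallest counter and let the newcomer inherit it.
    let minKey = "";
    let minCount = Infinity;
    for (const [k, v] of this.counters) {
      if (v.count < minCount) { minCount = v.count; minKey = k; }
    }
    this.counters.delete(minKey);
    // `error` records how much of the count may belong to evicted keys.
    this.counters.set(key, { count: minCount + 1, error: minCount });
  }

  size(): number { return this.counters.size; }

  top(n: number): HeavyHitter[] {
    return [...this.counters.entries()]
      .map(([key, v]) => ({ key, count: v.count, error: v.error }))
      .sort((a, b) => b.count - a.count)
      .slice(0, n);
  }
}
```

Each reported count overestimates the true count by at most its error field, which is why a genuinely heavy key stays in the table while one-off keys keep evicting each other.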

The analytics tap (raw decision stream)

For full control — your own metrics, structured logs, an audit pipeline — tapDecisions fires a callback once per completed check with the decision and its latency, then returns the decision unchanged. Dependency-free, and a throwing tap can never break the limiter:

import { tapDecisions, rateLimit, gcra } from "throttlekit";

const limiter = tapDecisions(rateLimit({ strategy: gcra({ limit: 100, periodMs: 60_000 }) }), (e) => {
  // e: { key, cost, decision, strategy, durationMs, kind: "check" | "checkSync" | "checkMany" | "checkManySync" }
  if (!e.decision.allowed) log.warn({ key: e.key, retryAfterMs: e.decision.retryAfterMs }, "rate limited");
  myHistogram.observe(e.durationMs);
});

It's the primitive that withAnalytics (built-in counters) and instrumentLimiter (OpenTelemetry) build on — reach for it when you want the raw stream. All three wrappers forward peek/forecast/close, so introspection and disposal survive wrapping.
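
The contract described above (one callback per completed check, decision returned unchanged, a throwing callback swallowed) can be sketched in a few lines. This is an illustration under assumptions rather than the library's code: the TapSketch* types and tapSketch are hypothetical names, and only a synchronous check path is modelled to keep the example short.

```typescript
// Hypothetical minimal shapes for this sketch only.
interface TapSketchDecision { allowed: boolean; retryAfterMs: number }
interface TapSketchLimiter { checkSync(key: string): TapSketchDecision }
interface TapSketchEvent { key: string; decision: TapSketchDecision; durationMs: number }

function tapSketch(
  inner: TapSketchLimiter,
  onDecision: (event: TapSketchEvent) => void,
): TapSketchLimiter {
  return {
    checkSync(key: string): TapSketchDecision {
      const startedAt = Date.now();
      const decision = inner.checkSync(key);
      try {
        onDecision({ key, decision, durationMs: Date.now() - startedAt });
      } catch {
        // A faulty observer must not change the outcome of the check.
      }
      // The very same object is returned, so wrapping is transparent to callers.
      return decision;
    },
  };
}
```

Calling the observer after the inner check, and inside its own try/catch, is what keeps logging or metrics bugs out of the request path.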

A live dashboard — the Lens

Want to see this rather than wire it into a backend? throttlekit-lens is a built-in, zero-dependency, read-only dashboard built on exactly these primitives (tapDecisions + withAnalytics, plus their unified-admission siblings). It serves the full ops board for any limiter — and, for a unifiedAdmission, live binding-axis attribution: which of rate / concurrency / cost is throttling each key right now, with the exact per-axis Decision a click away. It's on by default on throttlekit-server, or mount it in your own app. → Monitoring & the Lens

Failure modes

The in-process MemoryStore never fails. A distributed store can: if Redis is unreachable, check() rejects (StoreUnavailableError). You decide what that means — every adapter takes a fail policy and fires onError before applying it:

| fail | On a store outage | Use when |
| --- | --- | --- |
| "open" (default) | Allow the request | Availability > the cap; most public APIs |
| "closed" | Reject with 503 | The cap is a hard guarantee; billing, abuse-critical paths |

expressRateLimit({
  strategy: gcra({ limit: 100, periodMs: 60_000 }),
  store: redisStore,
  fail: "closed",
  onError: (_req, _res, err) => log.warn({ err }, "rate limiter store down"),
});

Two extra hedges: twoTier leased keeps serving from the local lease while L2 is briefly unreachable, and the Redis path is a single atomic round trip (no read-then-write window to interrupt). Both fail modes are tested on every adapter.
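
The policy in the table can be pictured as a small wrapper around the check. The sketch below is an assumption-laden illustration, not an adapter's source: checkWithFailPolicy and the FailSketchAction values are hypothetical, and mapping an ordinary denial to a 429-style outcome is a conventional choice that the text above does not spell out.

```typescript
type FailSketchPolicy = "open" | "closed";
// "pass" = let the request through, "deny" = normal rate-limit rejection,
// "unavailable" = reject because the store is down (the 503 case).
type FailSketchAction = "pass" | "deny" | "unavailable";

async function checkWithFailPolicy(
  check: () => Promise<{ allowed: boolean }>,
  fail: FailSketchPolicy,
  onError: (err: unknown) => void,
): Promise<FailSketchAction> {
  try {
    const decision = await check();
    return decision.allowed ? "pass" : "deny";
  } catch (err) {
    // The hook fires first, then the policy decides the outcome.
    onError(err);
    return fail === "open" ? "pass" : "unavailable";
  }
}
```

Keeping the outage branch separate from the ordinary denial branch lets callers return a different status for "over the limit" and "limiter unavailable".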

What happens when each store goes down

A store's apply() rejects on an outage and never silently allows or denies; your fail policy decides the rest. No store ever writes partially: every read-modify-write (RMW) is atomic (Lua script, advisory-lock transaction, compare-and-set, or single-threaded Durable Object), so an outage mid-operation can't corrupt state.

| Store | apply() on outage | State after the outage | Recovery |
| --- | --- | --- | --- |
| MemoryStore | never rejects (in-process) | lost on process restart (RAM only) | starts empty; one window of over-admission, then exact |
| RedisStore | rejects (client error; OCC fallback → StoreUnavailableError) | preserved across reconnect; lost only if Redis is flushed/non-persistent | reconnect → resume; 1 atomic EVALSHA, no partial write |
| PostgresStore | rejects (pg error) | preserved (durable table, survives a PG restart) | reconnect → resume; advisory-lock txn is atomic |
| DynamoStore / D1Store / DenoKvStore | rejects (StoreUnavailableError after CAS retries) | preserved in the backing store | resume; compare-and-set ⇒ no partial write; native TTL reclaims |
| DurableObjectStore | RMW runs inside the DO (no network hop, no retry loop) | preserved in DO storage | relocation carries storage; nothing to recover |

twoTier during an L2 outage: strict and cached-deny route every (allow) request to the fail policy. leased keeps serving from each key's remaining local credits and falls back to the policy only once they are exhausted and L2 is still unreachable; this is the outage hedge. Worst-case over-admission stays within the proven leasing bound (= Limit under windowCoupled).
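
The leased behaviour can be pictured with a toy model of a single key. Everything here is hypothetical and simplified (the LeasedKeySketch class, a synchronous leaseFromL2 callback that throws when L2 is unreachable, no expiry of leases); it only illustrates the order of events described above, not the actual twoTier implementation.

```typescript
type LeaseSketchOutcome = "served-from-lease" | "denied" | "fail-policy";

class LeasedKeySketch {
  private credits = 0;
  private readonly leaseFromL2: () => number;

  // leaseFromL2 returns a batch of credits, or throws if L2 is unreachable.
  constructor(leaseFromL2: () => number) {
    this.leaseFromL2 = leaseFromL2;
  }

  admit(): LeaseSketchOutcome {
    if (this.credits === 0) {
      try {
        this.credits = this.leaseFromL2();
      } catch {
        // Local credits are gone and L2 is down: only now does the policy apply.
        return "fail-policy";
      }
    }
    if (this.credits === 0) return "denied"; // L2 reachable but granted nothing
    this.credits -= 1;
    return "served-from-lease";
  }
}
```

Because L2 is consulted only when the local lease is empty, a short outage is invisible to keys that still hold credits.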

The full matrix — including what "lost" costs in over-admission and how to choose the policy — is in docs/FAILURE-MODES.md.
