Operations
Headers, IP keys, PII safety, observability, and failure modes.
buildRateLimitHeaders(decision, opts) produces a plain Record<string, string> (the adapters call it for you), in one of three families selected via emit:
- draft (default): the IETF RateLimit-Limit / -Remaining / -Reset triple.
- structured: RFC 9651 RateLimit + RateLimit-Policy.
- legacy: the X-RateLimit-* triple.
On a denial a Retry-After (delta-seconds, min 1) is always added, and all time math derives from the injected now.
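To make the header math concrete, here is a minimal sketch of the draft family plus the Retry-After rule. It is not the library's buildRateLimitHeaders: the HeaderDecision shape, its field names, and sketchHeaders are assumptions made for illustration only.

```typescript
// Hypothetical decision shape for illustration; the real Decision type may differ.
interface HeaderDecision {
  allowed: boolean;
  limit: number;
  remaining: number;
  resetAtMs: number; // absolute reset time, epoch milliseconds
  retryAfterMs: number; // 0 when the request is allowed
}

// Build draft-style headers. All time math derives from the injected `nowMs`,
// so the function is deterministic and easy to test.
function sketchHeaders(d: HeaderDecision, nowMs: number): Record<string, string> {
  const resetSeconds = Math.max(0, Math.ceil((d.resetAtMs - nowMs) / 1000));
  const headers: Record<string, string> = {
    "RateLimit-Limit": String(d.limit),
    "RateLimit-Remaining": String(Math.max(0, d.remaining)),
    "RateLimit-Reset": String(resetSeconds),
  };
  if (!d.allowed) {
    // Retry-After is delta-seconds, rounded up, never below 1.
    headers["Retry-After"] = String(Math.max(1, Math.ceil(d.retryAfterMs / 1000)));
  }
  return headers;
}

// A denial with a 250 ms wait still advertises a whole second.
const deniedHeaders = sketchHeaders(
  { allowed: false, limit: 100, remaining: 0, resetAtMs: 60_000, retryAfterMs: 250 },
  30_000,
);
```

Rounding up and clamping to 1 avoids telling a client to retry after 0 seconds, which would invite an immediate retry into the same denial.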
Trusting X-Forwarded-For blindly is the classic bypass. clientIp refuses to: the default is trustProxy: false (use the socket peer), trust is opt-in as a hop count or CIDR allowlist, and it aggregates IPv6 to a configurable prefix (/64 default) so one customer can't rotate through billions of addresses.
import { clientIp } from "throttlekit";
const key = clientIp(
{ remoteAddr: req.socket.remoteAddress ?? "", xForwardedFor: req.headers["x-forwarded-for"] },
{ trustProxy: ["10.0.0.0/8"], ipv6Prefix: 64 }, // or trustProxy: 1 for a single hop
);
The Express and fetch adapters accept trustProxy/ipv6Prefix directly and derive this key by default.
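The two ideas behind that key, hop-count trust and IPv6 prefix aggregation, can be sketched as follows. This is not clientIp's implementation: pickClientAddress and ipv6NetworkKey are hypothetical helpers, the hop-count convention shown (count inward from the socket peer, as Express's "trust proxy" does) is an assumption about the semantics, and the parser deliberately ignores zone IDs and embedded IPv4 suffixes.

```typescript
// With `trustedHops` proxies trusted, walk that many steps back from the socket
// peer through X-Forwarded-For (right to left). Entries further left are
// client-controlled and never consulted.
function pickClientAddress(
  remoteAddr: string,
  xForwardedFor: string | undefined,
  trustedHops: number,
): string {
  const forwarded = (xForwardedFor ?? "")
    .split(",")
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
  const chain = [remoteAddr, ...forwarded.reverse()];
  const index = Math.min(Math.max(0, trustedHops), chain.length - 1);
  return chain[index] ?? remoteAddr;
}

// Parse an IPv6 address (with optional "::") into eight 16-bit groups.
function parseIpv6(addr: string): number[] {
  const halves = addr.split("::");
  if (halves.length > 2) throw new Error("invalid IPv6 address: " + addr);
  const toGroups = (s: string | undefined): number[] =>
    s
      ? s.split(":").map((g) => {
          if (!/^[0-9a-f]{1,4}$/i.test(g)) throw new Error("invalid group: " + g);
          return parseInt(g, 16);
        })
      : [];
  const head = toGroups(halves[0]);
  const tail = halves.length === 2 ? toGroups(halves[1]) : [];
  const missing = 8 - head.length - tail.length;
  if (halves.length === 1 ? missing !== 0 : missing < 1) {
    throw new Error("invalid IPv6 address: " + addr);
  }
  const zeros = new Array<number>(halves.length === 2 ? missing : 0).fill(0);
  return [...head, ...zeros, ...tail];
}

// Zero every bit past the prefix so all addresses in one /64 share one key.
function ipv6NetworkKey(addr: string, prefixLen = 64): string {
  const masked = parseIpv6(addr).map((group, i) => {
    const bitsKept = Math.min(16, Math.max(0, prefixLen - i * 16));
    const mask = (0xffff << (16 - bitsKept)) & 0xffff;
    return group & mask;
  });
  return masked.map((g) => g.toString(16)).join(":") + "/" + prefixLen;
}

// The spoofed left-most entry is ignored; one trusted hop yields the address
// the proxy itself appended.
const sketchClient = pickClientAddress("10.0.0.5", "6.6.6.6, 203.0.113.9", 1);
const sketchBucket = ipv6NetworkKey("2001:db8:1:2:aaaa:bbbb:cccc:dddd");
```

Keying on the masked network rather than the full address is what stops a single /64 holder from presenting a fresh address on every request.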
Hash raw identifiers with a server secret before they reach the store, so a shared Redis never holds the raw value:
import { hmacKeyer } from "throttlekit";
const keyer = hmacKeyer(process.env.RL_SECRET ?? "");
await limiter.check(keyer(rawUserId));
Every Decision is a plain, loggable object. For metrics, the optional OpenTelemetry layer (throttlekit/otel) wraps a limiter or guard with your own Meter:
import { instrumentLimiter, instrumentGuard } from "throttlekit/otel";
import { metrics } from "@opentelemetry/api";
const meter = metrics.getMeter("my-service");
const observed = instrumentLimiter(limiter, meter); // throttlekit.checks / .remaining / .store.latency
instrumentGuard(guard, meter); // concurrency.limit / .inflight / .rtt_noload
The metric names and span-attribute keys are a stable contract: exported as METRIC_NAMES / SPAN_ATTRIBUTES, pinned by a test, and changed only on a major bump. For trace-level visibility, recordDecisionOnSpan(span, decision, strategy) stamps throttlekit.allowed / .strategy / .limit / .remaining / .retry_after_ms onto a span you already have. The full reference (instrument types, units, attributes, the Prometheus .→_ mapping) is in docs/METRICS.md.
For zero-config insight without a metrics backend, wrap a limiter with withAnalytics — it tracks allow/deny counts and the top-K heavy hitters (keys driving the most traffic and denials) in bounded memory via Space-Saving (Metwally et al. 2005), so your worst offenders surface even under a flood of unique keys:
import { withAnalytics, rateLimit, gcra } from "throttlekit";
const limiter = withAnalytics(rateLimit({ strategy: gcra({ limit: 100, periodMs: 60_000 }) }));
await limiter.check(clientIp); // use exactly like any limiter
const a = limiter.analytics(); // { allowed, denied, total, denyRate, topRequested: [...], topDenied: [...] }
For full control (your own metrics, structured logs, an audit pipeline), tapDecisions fires a callback once per completed check with the decision and its latency, then returns the decision unchanged. It is dependency-free, and a throwing tap can never break the limiter:
import { tapDecisions, rateLimit, gcra } from "throttlekit";
const limiter = tapDecisions(rateLimit({ strategy: gcra({ limit: 100, periodMs: 60_000 }) }), (e) => {
// e: { key, cost, decision, strategy, durationMs, kind: "check" | "checkSync" | "checkMany" | "checkManySync" }
if (!e.decision.allowed) log.warn({ key: e.key, retryAfterMs: e.decision.retryAfterMs }, "rate limited");
myHistogram.observe(e.durationMs);
});
It's the primitive that withAnalytics (built-in counters) and instrumentLimiter (OpenTelemetry) build on; reach for it when you want the raw stream. All three wrappers forward peek/forecast/close, so introspection and disposal survive wrapping.
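The tap contract described above (observe every decision, return it unchanged, never let the observer break the check) can be sketched in a few lines. This is an illustration rather than tapDecisions itself: tapSketch, TapDecision, and TapEventSketch are hypothetical names, and the check is synchronous for brevity.

```typescript
// Hypothetical minimal shapes for illustration only.
interface TapDecision {
  allowed: boolean;
  retryAfterMs: number;
}
interface TapEventSketch {
  key: string;
  decision: TapDecision;
  durationMs: number;
}
type SyncCheck = (key: string) => TapDecision;

// Wrap a check so `tap` sees each completed decision and its latency.
function tapSketch(
  check: SyncCheck,
  tap: (e: TapEventSketch) => void,
  now: () => number = Date.now,
): SyncCheck {
  return (key) => {
    const start = now();
    const decision = check(key);
    try {
      tap({ key, decision, durationMs: now() - start });
    } catch {
      // An observer bug must never change the limiter's answer.
    }
    return decision; // returned unchanged
  };
}

// Usage: the tap throws on every denial, yet callers still get the decision.
const tapSeen: TapEventSketch[] = [];
const tappedCheck = tapSketch(
  (key) => ({ allowed: key !== "blocked", retryAfterMs: key === "blocked" ? 500 : 0 }),
  (e) => {
    tapSeen.push(e);
    if (!e.decision.allowed) throw new Error("simulated tap bug");
  },
);
const tapAllowed = tappedCheck("alice");
const tapDenied = tappedCheck("blocked");
```

Swallowing the observer's exception is a deliberate trade-off: a broken logging callback costs you telemetry, not availability.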
Want to see this rather than wire it into a backend? throttlekit-server --config x.yaml --tui opens a built-in, zero-dependency, read-only terminal dashboard built on exactly these primitives (tapDecisions + withAnalytics, plus their unified-admission siblings). It renders the full ops board for any policy — and, for a unifiedAdmission, live binding-axis attribution: which of rate / concurrency / cost is throttling each key right now, with the exact per-axis numbers in the live denial feed. Opt-in (--tui, an interactive TTY); for headless / production, stay on OpenTelemetry → Grafana. → Monitoring
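For readers unfamiliar with the Space-Saving algorithm that the heavy-hitter tracking is described as using, here is a minimal sketch of the textbook version. It is not the library's implementation: SpaceSavingSketch and its methods are hypothetical, and the linear scan for the minimum is chosen for clarity over speed.

```typescript
interface HeavyHitterSketch {
  key: string;
  count: number; // estimated count; never below the true count
  error: number; // maximum possible overestimate
}

class SpaceSavingSketch {
  private readonly capacity: number;
  private readonly counters = new Map<string, { count: number; error: number }>();

  constructor(capacity: number) {
    if (capacity < 1) throw new Error("capacity must be >= 1");
    this.capacity = capacity;
  }

  record(key: string): void {
    const existing = this.counters.get(key);
    if (existing) {
      existing.count += 1;
      return;
    }
    if (this.counters.size < this.capacity) {
      this.counters.set(key, { count: 1, error: 0 });
      return;
    }
    // Table full: evict the smallest counter and let the newcomer inherit its
    // count, recording that inherited amount as the possible error.
    let minKey = "";
    let minCount = Infinity;
    this.counters.forEach((c, k) => {
      if (c.count < minCount) {
        minKey = k;
        minCount = c.count;
      }
    });
    this.counters.delete(minKey);
    this.counters.set(key, { count: minCount + 1, error: minCount });
  }

  top(n: number): HeavyHitterSketch[] {
    const all: HeavyHitterSketch[] = [];
    this.counters.forEach((c, key) => all.push({ key, count: c.count, error: c.error }));
    return all.sort((a, b) => b.count - a.count).slice(0, n);
  }
}

// Usage: one noisy key among 20 one-off keys, tracked with only 3 counters.
const hitters = new SpaceSavingSketch(3);
for (let i = 0; i < 20; i++) {
  hitters.record("attacker");
  hitters.record("attacker");
  hitters.record(`u${i}`);
}
for (let i = 0; i < 10; i++) hitters.record("attacker");
const worstOffender = hitters.top(1)[0];
```

Memory stays fixed at the chosen capacity no matter how many distinct keys arrive, which is why a flood of unique keys cannot push a genuinely heavy key out of the table.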
The in-process MemoryStore never fails. A distributed store can: if Redis is unreachable, check() rejects (StoreUnavailableError). You decide what that means — every adapter takes a fail policy and fires onError before applying it:
| fail | On a store outage | Use when |
|---|---|---|
| "open" (default) | Allow the request | Availability > the cap: most public APIs |
| "closed" | Reject with 503 | The cap is a hard guarantee: billing, abuse-critical paths |
expressRateLimit({
strategy: gcra({ limit: 100, periodMs: 60_000 }),
store: redisStore,
fail: "closed",
onError: (_req, _res, err) => log.warn({ err }, "rate limiter store down"),
});
Two extra hedges: twoTier leased keeps serving from the local lease while L2 is briefly unreachable, and the Redis path is a single atomic round trip (no read-then-write window to interrupt). Both fail modes are tested on every adapter.
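The fail policy boils down to one catch block. The sketch below shows that shape under stated assumptions: checkWithFailPolicy and FailOutcome are hypothetical, the check is synchronous for brevity (the real check() is asynchronous and rejects), and the 429 used for an ordinary denial is the conventional status rather than something this page specifies.

```typescript
type FailPolicySketch = "open" | "closed";

interface FailOutcome {
  allowed: boolean;
  status: number | null; // null means "pass the request through"
}

function checkWithFailPolicy(
  check: () => { allowed: boolean },
  fail: FailPolicySketch,
  onError: (err: unknown) => void,
): FailOutcome {
  try {
    const decision = check();
    return decision.allowed
      ? { allowed: true, status: null }
      : { allowed: false, status: 429 }; // ordinary rate-limit denial
  } catch (err) {
    // The store is unreachable: report first, then apply the policy.
    onError(err);
    return fail === "open"
      ? { allowed: true, status: null } // availability over the cap
      : { allowed: false, status: 503 }; // the cap is a hard guarantee
  }
}

// Usage with a store that is always down.
const failErrors: unknown[] = [];
const storeDown = (): { allowed: boolean } => {
  throw new Error("store unavailable (simulated)");
};
const failOpenOutcome = checkWithFailPolicy(storeDown, "open", (e) => failErrors.push(e));
const failClosedOutcome = checkWithFailPolicy(storeDown, "closed", (e) => failErrors.push(e));
```

Note that a denial and an outage stay distinguishable to the client: only the outage path under "closed" produces a 503.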
apply rejects on an outage (never silently allows/denies); your fail policy decides the rest. No store ever writes partially — every RMW is atomic (Lua / advisory-lock txn / compare-and-set / single-threaded DO) — so an outage mid-operation can't corrupt state.
| Store | apply() on outage | State after the outage | Recovery |
|---|---|---|---|
| MemoryStore | never rejects (in-process) | lost on process restart (RAM only) | starts empty; one window of over-admission, then exact |
| RedisStore | rejects (client error; OCC fallback → StoreUnavailableError) | preserved across reconnect; lost only if Redis is flushed/non-persistent | reconnect → resume; 1 atomic EVALSHA, no partial write |
| PostgresStore | rejects (pg error) | preserved (durable table, survives a PG restart) | reconnect → resume; advisory-lock txn is atomic |
| DynamoStore / D1Store / DenoKvStore | rejects (StoreUnavailableError after CAS retries) | preserved in the backing store | resume; compare-and-set ⇒ no partial write; native TTL reclaims |
| DurableObjectStore | RMW runs inside the DO (no network hop, no retry loop) | preserved in DO storage | relocation carries storage; nothing to recover |
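The "rejects after CAS retries" behaviour in the table follows from how a compare-and-set read-modify-write loop works. A minimal in-memory sketch, assuming a versioned cell: CasCellSketch, casUpdateSketch, and CasExhaustedSketchError are hypothetical, and the beforeWrite hook exists only to simulate a concurrent writer.

```typescript
class CasCellSketch {
  private value: number;
  private version = 0;

  constructor(initial: number) {
    this.value = initial;
  }

  read(): { value: number; version: number } {
    return { value: this.value, version: this.version };
  }

  // Writes only if nobody else wrote since `expectedVersion` was read.
  compareAndSet(expectedVersion: number, next: number): boolean {
    if (this.version !== expectedVersion) return false;
    this.value = next;
    this.version += 1;
    return true;
  }
}

class CasExhaustedSketchError extends Error {}

function casUpdateSketch(
  cell: CasCellSketch,
  update: (v: number) => number,
  maxAttempts: number,
  beforeWrite?: (attempt: number) => void, // test hook: simulate a rival writer
): number {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const snapshot = cell.read();
    const next = update(snapshot.value);
    beforeWrite?.(attempt);
    // Either the whole write lands or none of it does: no partial state.
    if (cell.compareAndSet(snapshot.version, next)) return next;
  }
  throw new CasExhaustedSketchError("gave up after " + maxAttempts + " attempts");
}

// Usage: a rival write on the first attempt forces exactly one retry.
const casCell = new CasCellSketch(0);
const casResult = casUpdateSketch(casCell, (v) => v + 1, 3, (attempt) => {
  if (attempt === 0) casCell.compareAndSet(casCell.read().version, 100);
});
```

Because a failed attempt writes nothing, giving up mid-loop leaves the stored state exactly as the last successful writer left it.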
twoTier during an L2 outage: strict and cached-deny fail every (allow) request to the fail policy; leased keeps serving from each key's remaining local credits and only falls back to the policy once they're exhausted and L2 is still unreachable — the outage hedge. Worst-case over-admission stays within the proven leasing bound (= Limit under windowCoupled).
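The leased behaviour during an outage can be pictured with a toy model: spend local credits first, and only consult the fail policy once they are gone and L2 cannot be reached. This is a sketch under assumptions, not the twoTier protocol: LeasedKeySketch, the batch size, and the leaseFromL2 callback are all hypothetical.

```typescript
type LeaseResultSketch = "allowed" | "denied" | "fail-policy";

class LeasedKeySketch {
  private credits = 0;
  private readonly leaseFromL2: (want: number) => number; // throws when L2 is down
  private readonly batchSize: number;

  constructor(leaseFromL2: (want: number) => number, batchSize: number) {
    this.leaseFromL2 = leaseFromL2;
    this.batchSize = batchSize;
  }

  check(): LeaseResultSketch {
    if (this.credits === 0) {
      try {
        this.credits = this.leaseFromL2(this.batchSize);
      } catch {
        // Local credits exhausted and L2 unreachable: defer to the fail policy.
        return "fail-policy";
      }
    }
    if (this.credits === 0) return "denied"; // L2 reachable but granted nothing
    this.credits -= 1;
    return "allowed";
  }
}

// Usage: L2 goes down right after granting a lease of 3 credits.
let l2Reachable = true;
const leasedKey = new LeasedKeySketch((want) => {
  if (!l2Reachable) throw new Error("L2 unreachable (simulated)");
  return want;
}, 3);
const leaseTrace: LeaseResultSketch[] = [leasedKey.check()];
l2Reachable = false;
leaseTrace.push(leasedKey.check(), leasedKey.check(), leasedKey.check());
```

In this model the over-admission during an outage is capped by whatever credits were already leased, which is the intuition behind the bound mentioned above.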
The full matrix — including what "lost" costs in over-admission and how to choose the policy — is in docs/FAILURE-MODES.md.
ThrottleKit · MIT · 1.0 — API frozen under SemVer (Stability)