-
Notifications
You must be signed in to change notification settings - Fork 0
Distributed and Provable
The same strategy you ran in-memory runs across a fleet — just hand it a distributed store. Every backend produces the same decisions (the conformance suite proves it bit-identical).
import { rateLimit, gcra } from "throttlekit";
import { RedisStore } from "throttlekit/redis";
import Redis from "ioredis";
const store = new RedisStore({ client: new Redis(process.env.REDIS_URL) });
const limiter = rateLimit({
strategy: gcra({ limit: 1000, periodMs: 60_000, burst: 100 }),
store, // one EVALSHA per check, fully atomic — no read-then-write race
prefix: "api", // namespace, so one store can back many limiters
});
const d = await limiter.check(userId);Built-in strategies run their atomic Lua form in a single EVALSHA (with an EVAL fallback on NOSCRIPT); custom strategies fall back to optimistic concurrency (WATCH/MULTI/EXEC). RedisStore derives now from the Redis server clock, so node clock skew never corrupts shared state.
Any Redis client — including serverless / edge. RedisStore speaks the ioredis shape directly; for the official node-redis client or the Upstash REST client (where TCP isn't allowed), wrap it:
import { RedisStore, fromNodeRedis, fromUpstash } from "throttlekit/redis";
new RedisStore({ client: new Redis(url) }); // ioredis — straight through
new RedisStore({ client: fromNodeRedis(nodeRedisClient) }); // official `redis` client
new RedisStore({ client: fromUpstash(Upstash.fromEnv()) }); // Upstash REST — serverless/edgeAll three are proven bit-identical to the in-process path by the conformance suite (ioredis and node-redis tested against a live server). Upstash REST has no interactive WATCH/MULTI, so it supports the Lua-backed built-ins only. See examples/redis-distributed.ts.
Already running Postgres? You don't need to add Redis. PostgresStore is a fully distributed backend:
import { rateLimit, gcra } from "throttlekit";
import { PostgresStore } from "throttlekit/postgres";
import { Pool } from "pg";
const store = new PostgresStore({ pool: new Pool({ connectionString: process.env.DATABASE_URL }), prefix: "api" });
const limiter = rateLimit({ strategy: gcra({ limit: 1000, periodMs: 60_000, burst: 100 }), store });
const d = await limiter.check(userId);It runs the same pure JS transform the in-memory store runs — no Postgres-specific algorithm to keep in sync — inside a transaction serialized per key by a transaction-scoped advisory lock (pg_advisory_xact_lock, which serializes first-touch keys that SELECT … FOR UPDATE cannot). So concurrent checks are atomic (N simultaneous at limit K admit exactly K, proven on a live server) and decisions are bit-identical to the other backends. Pass a pg.Pool directly — no adapter. Each check is one transaction; for hot keys, use it as the L2 of twoTier({ mode: "leased" }). See examples/postgres.ts.
Front the distributed store (L2) with a local in-process tier (L1) and pick the consistency/throughput trade-off:
import { twoTier, gcra } from "throttlekit";
const limiter = twoTier({
strategy: gcra({ limit: 10_000, periodMs: 60_000, burst: 500 }),
l2: store, // a distributed store, e.g. RedisStore
mode: "leased", // "strict" | "cached-deny" | "leased"
lease: { batch: 50, windowCoupled: true }, // lease 50 at a time; expire at the L2 window
});| Mode | Network cost | Global accuracy | Best for |
|---|---|---|---|
strict |
1 round trip / request | Exact | Hard quotas, billing |
cached-deny |
1 round trip / allowed request | Exact for allows, local for denies | Public APIs under abuse |
leased |
~1 round trip / batch requests |
Bounded overshoot (below) | High-throughput internal APIs |
leased trades exactness for throughput, with a provably bounded worst-case overshoot you choose:
-
Default (carryover):
admitted ≤ Limit + N·(Batch−1)— tight, but grows with the fleet sizeN. -
windowCoupled: true: credits expire at the L2 window boundary, soadmitted ≤ Limit— independent ofN. Opt-in; default off preserves legacy behaviour.twoTier'scheckis async (checkSyncthrows — L2 is asynchronous).
These bounds are proven, not claimed. A TLA⁺ spec is model-checked with TLC — carryover overshoot is exactly Limit + N·(Batch−1) (a counterexample shows it's tight, not loose) — and window-coupling tightens it to exactly Limit, independent of N (second spec + a Java-free exhaustive checker reproducing both in CI). This is the shipped core of GALE, the research track on provable distributed leasing. Full write-up: docs/FORMAL-MODEL.md.
The batch is a throughput-vs-stranding dial: larger batches mean fewer L2 round trips but more credits checked out of the shared pool that other nodes can't use. The right value tracks a node's demand, which drifts — so rather than hard-code it, leaseSizer learns it online (AdaGrad on the EOQ coordination-vs-stranding cost, O(√T) regret), and predictiveLeaseSizer folds in a per-window demand prediction with consistency + robustness:
import { leaseSizer } from "throttlekit";
const sizer = leaseSizer({ orderCost: 20, strandPenalty: 1 });
const batch = sizer.size(); // use as twoTier lease.batch
// …serve the window…
sizer.observe(demandThisWindow); // learn for next windowThis can never loosen the bound: under windowCoupled the overshoot is exactly Limit regardless of batch, so the batch sets only coordination frequency, not safety. These are GALE Pillars 2–3 (see the research track); today they're standalone learners that advise lease.batch — wiring them into the lease loop as live in-flight adaptation is a planned follow-up.
A global limit across regions is the leased model with the regions as the leasing nodes and one shared L2. Each region serves the bulk of its traffic from a local lease — region-local latency, no per-request cross-region hop — and the same verified bound caps the worldwide overshoot:
global admitted per window ≤ Limit + regions × (batch − 1) (carryover)
≤ Limit (windowCoupled — any number of regions)
So 4 regions leasing batch: 50 against a global limit: 10_000 admit at most 10_196 worldwide under carryover (< 2% overshoot for ~1 cross-region hop per 50 requests) — or exactly 10_000 with windowCoupled, no matter how many regions. There's no separate multi-region engine to trust — it's twoTier leased at a shared store. See examples/multi-region.ts.
The in-process MemoryStore never fails. A distributed store can — see Operations → Failure modes for the fail: "open" | "closed" policy, plus the two extra hedges (twoTier leased keeps serving from the local lease during a brief L2 outage; the Redis path is a single atomic round trip with no read-then-write window to interrupt).
ThrottleKit · MIT · 1.0 — API frozen under SemVer (Stability)
- Getting Started
- Choosing a strategy
- Frameworks & the edge
- Distributed & provable
- Federation
- Scaling & the Fleet
- Unified admission
- Pillar 4 — Weighted Fair Escrow
- Middleware integration
- Distributed adaptive concurrency
- Advanced limiting
- Overload, fairness & DDoS
- Operations
- Monitoring — ThrottleKit Lens
- Policy Plans
- Replay
- Performance
- Migrating
- Polyglot & Python
- GALE & TALE