Distributed and Provable
The same strategy you ran in-memory runs across a fleet — just hand it a distributed store. Every backend produces the same decisions (the conformance suite proves it bit-identical).
```typescript
import { rateLimit, gcra } from "throttlekit";
import { RedisStore } from "throttlekit/redis";
import Redis from "ioredis";

const store = new RedisStore({ client: new Redis(process.env.REDIS_URL) });
const limiter = rateLimit({
  strategy: gcra({ limit: 1000, periodMs: 60_000, burst: 100 }),
  store, // one EVALSHA per check, fully atomic: no read-then-write race
  prefix: "api", // namespace, so one store can back many limiters
});
const d = await limiter.check(userId);
```

Built-in strategies run their atomic Lua form in a single EVALSHA (with an EVAL fallback on NOSCRIPT); custom strategies fall back to optimistic concurrency (WATCH/MULTI/EXEC). RedisStore derives `now` from the Redis server clock, so node clock skew never corrupts shared state.
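The EVALSHA-then-EVAL fallback mentioned above is a common Redis scripting pattern, and a minimal sketch may make it concrete. This is not RedisStore's actual code: `ScriptClient`, `runScript` and `FakeScriptServer` are hypothetical names, the fake server stands in for Redis, and only the ioredis-style call shape (`sha`, key count, then keys and arguments) is assumed.

```typescript
// Minimal structural type for the two script commands used below.
// It mirrors the ioredis call shape: sha or script, key count, then keys and args.
interface ScriptClient {
  evalsha(sha: string, numKeys: number, ...keysAndArgs: string[]): Promise<unknown>;
  eval(script: string, numKeys: number, ...keysAndArgs: string[]): Promise<unknown>;
}

// Hypothetical helper: try the cached script first, fall back to EVAL
// when the server reports that it does not know this SHA.
async function runScript(
  client: ScriptClient,
  sha: string,
  script: string,
  keys: string[],
  args: string[],
): Promise<unknown> {
  try {
    return await client.evalsha(sha, keys.length, ...keys, ...args);
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    if (!message.startsWith("NOSCRIPT")) throw err; // unrelated failure: surface it
    // EVAL runs the script and normally caches it, so later EVALSHA calls hit.
    return client.eval(script, keys.length, ...keys, ...args);
  }
}

// In-memory stand-in for a Redis server, used only to exercise the helper.
class FakeScriptServer implements ScriptClient {
  readonly calls: string[] = [];
  private readonly cached = new Set<string>();
  private readonly shaOf: (script: string) => string;

  constructor(shaOf: (script: string) => string) {
    this.shaOf = shaOf;
  }

  async evalsha(sha: string, _numKeys: number, ..._rest: string[]): Promise<unknown> {
    this.calls.push("evalsha");
    if (!this.cached.has(sha)) throw new Error("NOSCRIPT No matching script.");
    return "ok";
  }

  async eval(script: string, _numKeys: number, ..._rest: string[]): Promise<unknown> {
    this.calls.push("eval");
    this.cached.add(this.shaOf(script));
    return "ok";
  }
}
```

In the steady state this costs one round trip per check; the extra EVAL is only paid the first time a server sees the script (or after its script cache is flushed).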
Any Redis client — including serverless / edge. RedisStore speaks the ioredis shape directly; for the official node-redis client or the Upstash REST client (where TCP isn't allowed), wrap it:
```typescript
import { RedisStore, fromNodeRedis, fromUpstash } from "throttlekit/redis";

new RedisStore({ client: new Redis(url) }); // ioredis: straight through
new RedisStore({ client: fromNodeRedis(nodeRedisClient) }); // official `redis` client
new RedisStore({ client: fromUpstash(Upstash.fromEnv()) }); // Upstash REST: serverless/edge
```

All three are proven bit-identical to the in-process path by the conformance suite (ioredis and node-redis tested against a live server). Upstash REST has no interactive WATCH/MULTI, so it supports the Lua-backed built-ins only. See examples/redis-distributed.ts.
Already running Postgres? You don't need to add Redis. PostgresStore is a fully distributed backend:
```typescript
import { rateLimit, gcra } from "throttlekit";
import { PostgresStore } from "throttlekit/postgres";
import { Pool } from "pg";

const store = new PostgresStore({ pool: new Pool({ connectionString: process.env.DATABASE_URL }), prefix: "api" });
const limiter = rateLimit({ strategy: gcra({ limit: 1000, periodMs: 60_000, burst: 100 }), store });
const d = await limiter.check(userId);
```

It runs the same pure JS transform the in-memory store runs (no Postgres-specific algorithm to keep in sync) inside a transaction serialized per key by a transaction-scoped advisory lock (`pg_advisory_xact_lock`, which serializes first-touch keys that `SELECT … FOR UPDATE` cannot). So concurrent checks are atomic (N simultaneous at limit K admit exactly K, proven on a live server) and decisions are bit-identical to the other backends. Pass a `pg.Pool` directly; no adapter is needed. Each check is one transaction; for hot keys, use it as the L2 of `twoTier({ mode: "leased" })`. See examples/postgres.ts.
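To see why per-key serialization yields "N simultaneous at limit K admit exactly K", here is an in-process analogue. It is a sketch only: `KeyedLock` plays the role the advisory lock plays in Postgres, the limiter is a plain counter rather than GCRA, and every name in it is hypothetical rather than part of ThrottleKit.

```typescript
type CounterState = { count: number };

// Per-key mutex: each key maps to the tail of a promise chain, so
// read-modify-write steps for the same key run strictly one after another.
class KeyedLock {
  private readonly tails = new Map<string, Promise<void>>();

  async run<T>(key: string, fn: () => Promise<T>): Promise<T> {
    const previous = this.tails.get(key) ?? Promise.resolve();
    let release: () => void = () => {};
    const current = new Promise<void>((resolve) => {
      release = resolve;
    });
    const tail = previous.then(() => current);
    this.tails.set(key, tail);
    await previous; // wait for every earlier holder of this key
    try {
      return await fn();
    } finally {
      release();
      if (this.tails.get(key) === tail) this.tails.delete(key);
    }
  }
}

class SerializedCounterLimiter {
  private readonly rows = new Map<string, CounterState>();
  private readonly lock = new KeyedLock();
  private readonly limit: number;

  constructor(limit: number) {
    this.limit = limit;
  }

  check(key: string): Promise<boolean> {
    return this.lock.run(key, async () => {
      const row = this.rows.get(key) ?? { count: 0 }; // first touch: no row yet
      await Promise.resolve(); // simulated I/O gap between read and write
      if (row.count >= this.limit) return false;
      this.rows.set(key, { count: row.count + 1 });
      return true;
    });
  }
}

// Fires `attempts` checks at once against one key and counts the admissions.
async function admitConcurrently(limit: number, attempts: number): Promise<number> {
  const limiter = new SerializedCounterLimiter(limit);
  const results = await Promise.all(
    Array.from({ length: attempts }, () => limiter.check("user:1")),
  );
  return results.filter(Boolean).length;
}
```

Because the lock is taken before the read, the first-touch case (no row exists yet) is serialized too, which is the gap the paragraph above attributes to row-level locking.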
Four more exact stores, each built on its platform's native atomic primitive — so decisions stay bit-identical to Redis/Postgres with no lock held, proven by the same conformance suite (including a 200-way concurrent read-modify-write).
Cloudflare Durable Objects (`throttlekit/cloudflare`): a single-threaded actor with transactional storage. Construct it inside your Durable Object from `state`; the read-modify-write runs in `blockConcurrencyWhile`, so it is atomic with no retry loop:

```typescript
import { DurableObjectStore } from "throttlekit/cloudflare";
// inside your Durable Object's constructor(state):
this.limiter = rateLimit({ strategy: gcra({ limit: 100, periodMs: 60_000 }), store: new DurableObjectStore(state) });
```

- Cloudflare D1 (`throttlekit/cloudflare`, `D1Store`): edge SQLite for a plain Worker, with optimistic concurrency via a version compare-and-set, plus in-process per-key coalescing so a hot key costs ~1 write instead of a retry storm. `new D1Store({ db: env.DB })`; call `sweep()` from a Cron Trigger to reclaim space.
- DynamoDB (`throttlekit/dynamodb`, `DynamoStore`): a conditional-write CAS on a `version` attribute; `expires_at` is stored in epoch seconds so DynamoDB's native TTL reclaims rows. The structural client mirrors the AWS SDK v3 commands (an ~8-line pass-through adapter).
- Deno KV (`throttlekit/deno`, `DenoKvStore`): Deno KV's native atomic `versionstamp` CAS with native `expireIn` TTL.
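The version compare-and-set that D1, DynamoDB and Deno KV rely on can be sketched against an in-memory stand-in. This is an illustration of the general technique under assumed names (`FakeCasStore`, `casUpdate`), not the stores' real code; it omits the per-key coalescing and TTL handling described above.

```typescript
interface Versioned<T> {
  value: T;
  version: number;
}

// In-memory stand-in for a backend that offers a conditional write.
class FakeCasStore<T> {
  private readonly rows = new Map<string, Versioned<T>>();
  conflicts = 0;

  async read(key: string): Promise<Versioned<T> | undefined> {
    return this.rows.get(key);
  }

  // Writes only if the stored version still equals `expected` (0 means "row absent").
  async writeIfVersion(key: string, expected: number, value: T): Promise<boolean> {
    const current = this.rows.get(key)?.version ?? 0;
    if (current !== expected) {
      this.conflicts++;
      return false;
    }
    this.rows.set(key, { value, version: expected + 1 });
    return true;
  }
}

// Optimistic read-modify-write: re-read and retry whenever another writer won.
async function casUpdate<T>(
  store: FakeCasStore<T>,
  key: string,
  initial: T,
  transform: (state: T) => T,
  maxAttempts = 1000,
): Promise<T> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const row = await store.read(key);
    const next = transform(row ? row.value : initial);
    if (await store.writeIfVersion(key, row?.version ?? 0, next)) return next;
  }
  throw new Error(`CAS gave up after ${maxAttempts} attempts`);
}

// Runs `n` increments concurrently and reports the final value and conflict count.
async function concurrentIncrements(n: number): Promise<{ total: number; conflicts: number }> {
  const store = new FakeCasStore<number>();
  await Promise.all(Array.from({ length: n }, () => casUpdate(store, "k", 0, (c) => c + 1)));
  const row = await store.read("k");
  return { total: row?.value ?? 0, conflicts: store.conflicts };
}
```

No update is lost even though every writer initially reads the same version, but the conflict count shows the price: contended keys retry, which is the cost the D1 store's coalescing is described as avoiding.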
Workers KV (throttlekit/cloudflare, KVStore) is offered only as an explicitly best-effort store: it is eventually consistent with no atomic compare-and-set, so concurrent checks read-modify-write over each other and edge reads can lag writes — it can over-admit under load. It is therefore not run through the conformance suite and does not honor the exact Store guarantee. Reach for it only for coarse edge protection where occasional over-admission is acceptable; use Durable Objects or D1 when you need the exact bound. The four exact stores are all async-only (checkSync throws) and slot into twoTier as the L2 exactly like Redis/Postgres.
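The over-admission risk of a store without compare-and-set can be reproduced in a few lines. The sketch below uses a hypothetical `PlainKv` with only `get` and `put`; it models the lost-update race, not Workers KV's replication lag, and it is not how `KVStore` is implemented.

```typescript
// A store with plain get/put and no conditional write.
class PlainKv {
  private readonly data = new Map<string, number>();

  async get(key: string): Promise<number> {
    return this.data.get(key) ?? 0;
  }

  async put(key: string, value: number): Promise<void> {
    this.data.set(key, value);
  }
}

// Non-atomic check: nothing stops two callers from interleaving here.
async function bestEffortCheck(kv: PlainKv, key: string, limit: number): Promise<boolean> {
  const used = await kv.get(key); // both racers can read the same value
  if (used >= limit) return false;
  await kv.put(key, used + 1); // and then overwrite each other's increment
  return true;
}

// Two simultaneous checks against a limit of 1.
async function raceTwoChecks(): Promise<{ admitted: number; stored: number }> {
  const kv = new PlainKv();
  const results = await Promise.all([
    bestEffortCheck(kv, "ip:1", 1),
    bestEffortCheck(kv, "ip:1", 1),
  ]);
  return { admitted: results.filter(Boolean).length, stored: await kv.get("ip:1") };
}
```

Both checks are admitted while the stored count ends at 1, so the limiter has also forgotten one of the admissions.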
Front the distributed store (L2) with a local in-process tier (L1) and pick the consistency/throughput trade-off:
```typescript
import { twoTier, gcra } from "throttlekit";

const limiter = twoTier({
  strategy: gcra({ limit: 10_000, periodMs: 60_000, burst: 500 }),
  l2: store, // a distributed store, e.g. RedisStore
  mode: "leased", // "strict" | "cached-deny" | "leased"
  lease: { batch: 50, windowCoupled: true }, // lease 50 at a time; expire at the L2 window
});
```

| Mode | Network cost | Global accuracy | Best for |
|---|---|---|---|
| `strict` | 1 round trip / request | Exact | Hard quotas, billing |
| `cached-deny` | 1 round trip / allowed request | Exact for allows, local for denies | Public APIs under abuse |
| `leased` | ~1 round trip / `batch` requests | Bounded overshoot (below) | High-throughput internal APIs |
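One plausible reading of the `cached-deny` row is that an L2 deny is remembered locally until it expires, so repeated requests from a blocked key stop costing round trips. The sketch below illustrates that idea only; the `Decision` shape, `retryAfterMs`, `cachedDeny` and `fixedWindowL2` are all assumptions made for the example, not ThrottleKit's API.

```typescript
interface Decision {
  allowed: boolean;
  retryAfterMs: number;
}
type L2Check = (key: string, nowMs: number) => Promise<Decision>;

// Hypothetical L1 wrapper: allows always consult L2, denies are cached locally.
function cachedDeny(l2: L2Check) {
  const deniedUntil = new Map<string, number>();
  let roundTrips = 0;
  return {
    async check(key: string, nowMs: number): Promise<Decision> {
      const until = deniedUntil.get(key);
      if (until !== undefined && nowMs < until) {
        return { allowed: false, retryAfterMs: until - nowMs }; // answered locally
      }
      deniedUntil.delete(key);
      roundTrips++;
      const decision = await l2(key, nowMs);
      if (!decision.allowed) deniedUntil.set(key, nowMs + decision.retryAfterMs);
      return decision;
    },
    get roundTrips(): number {
      return roundTrips;
    },
  };
}

// Toy fixed-window L2 used to drive the wrapper.
function fixedWindowL2(limit: number, windowMs: number): L2Check {
  const counts = new Map<string, number>();
  return async (key, nowMs) => {
    const windowStart = Math.floor(nowMs / windowMs) * windowMs;
    const slot = `${key}:${windowStart}`;
    const used = counts.get(slot) ?? 0;
    if (used >= limit) {
      return { allowed: false, retryAfterMs: windowStart + windowMs - nowMs };
    }
    counts.set(slot, used + 1);
    return { allowed: true, retryAfterMs: 0 };
  };
}
```

In this model the first deny still costs one round trip; every later request inside the deny window is rejected without touching L2, which is why the mode suits traffic dominated by abusive callers.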
leased trades exactness for throughput, with a provably bounded worst-case overshoot you choose:
- Default (carryover): `admitted ≤ Limit + N·(Batch−1)`. Tight, but grows with the fleet size `N`.
- `windowCoupled: true`: credits expire at the L2 window boundary, so `admitted ≤ Limit`, independent of `N`. Opt-in; default off preserves legacy behaviour.

`twoTier`'s `check` is async (`checkSync` throws, because L2 is asynchronous).
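A toy simulation can show where the two bounds come from. It assumes an adversarial schedule in which every node ends one window holding `batch − 1` unspent credits and then faces unbounded demand in the next; the function name and the model are illustrative only and are not the TLA⁺ spec or ThrottleKit's lease loop.

```typescript
interface LeaseNode {
  credits: number;
}

// Returns how many requests the fleet admits in the second of two windows.
function admittedInSecondWindow(
  limit: number,
  nodes: number,
  batch: number,
  windowCoupled: boolean,
): number {
  const fleet: LeaseNode[] = Array.from({ length: nodes }, () => ({ credits: 0 }));

  // Window 1: every node leases one batch and serves a single request,
  // leaving batch - 1 unspent credits on each node.
  let remaining = limit;
  for (const node of fleet) {
    const grant = Math.min(batch, remaining);
    remaining -= grant;
    node.credits = Math.max(0, grant - 1);
  }

  // Window boundary: L2 resets its budget. Window-coupled credits expire here;
  // carryover credits survive into the new window.
  remaining = limit;
  if (windowCoupled) {
    for (const node of fleet) node.credits = 0;
  }

  // Window 2: unbounded demand. Nodes spend local credits, then lease until L2 is empty.
  let admitted = 0;
  let progress = true;
  while (progress) {
    progress = false;
    for (const node of fleet) {
      if (node.credits === 0 && remaining > 0) {
        const grant = Math.min(batch, remaining);
        remaining -= grant;
        node.credits = grant;
      }
      if (node.credits > 0) {
        admitted += node.credits;
        node.credits = 0;
        progress = true;
      }
    }
  }
  return admitted;
}
```

With `limit = 10_000`, four nodes and `batch = 50`, the carryover run admits the full limit plus the 4 × 49 credits carried across the boundary, while the window-coupled run admits exactly the limit.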
Tier-2: a client-held lease. When the leasing node is a remote client of a ThrottleKit server (rather than an in-process L1), the same window-coupled escrow is available as a client-held lease: `Fleet.Reserve` hands a batch to the caller and the core `LeaseSpender` spends it locally, so most requests never round-trip to the server. The lease window boundary is pinned to the store's authoritative clock (`GlobalCoordinator.leaseWindowed`), so node clock skew can't shift a window and over-admit. See Scaling & the Fleet.
These bounds are proven, not claimed. A TLA⁺ spec is model-checked with TLC: the worst-case admitted count under carryover is exactly Limit + N·(Batch−1) (a counterexample shows the bound is tight, not loose), and window-coupling tightens it to exactly Limit, independent of N (a second spec, plus a Java-free exhaustive checker that reproduces both results in CI). This is the shipped core of GALE, ThrottleKit's provable distributed-leasing work. Full write-up: docs/FORMAL-MODEL.md.
The batch is a throughput-vs-stranding dial: larger batches mean fewer L2 round trips but more credits checked out of the shared pool that other nodes can't use. The right value tracks a node's demand, which drifts — so rather than hard-code it, leaseSizer learns it online (AdaGrad on the EOQ coordination-vs-stranding cost, O(√T) regret), and predictiveLeaseSizer folds in a per-window demand prediction with consistency + robustness:
```typescript
import { leaseSizer } from "throttlekit";

const sizer = leaseSizer({ orderCost: 20, strandPenalty: 1 });
const batch = sizer.size(); // use as twoTier lease.batch
// …serve the window…
sizer.observe(demandThisWindow); // learn for next window
```

This can never loosen the bound: under `windowCoupled` the admitted count is capped at exactly Limit regardless of batch, so the batch sets only coordination frequency, not safety. These are GALE Pillars 2–3 (see GALE & TALE).
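For intuition about the trade-off the sizer learns, the textbook economic order quantity (EOQ) model has a closed form. The sketch below assumes the classic cost `orderCost · demand / batch + strandPenalty · batch / 2`; the exact cost function inside `leaseSizer` is not given here, so treat `eoqCost` and `eoqBatch` as hypothetical helpers that borrow the option names, not as the library's algorithm.

```typescript
// Textbook EOQ cost for one window: a fixed cost per lease round trip plus
// a penalty proportional to the average number of credits held idle.
function eoqCost(demand: number, batch: number, orderCost: number, strandPenalty: number): number {
  return orderCost * (demand / batch) + strandPenalty * (batch / 2);
}

// Closed-form minimizer of eoqCost over batch, rounded and clamped to at least 1.
function eoqBatch(demand: number, orderCost: number, strandPenalty: number): number {
  return Math.max(1, Math.round(Math.sqrt((2 * demand * orderCost) / strandPenalty)));
}
```

Under this model a node serving 1000 requests per window with `orderCost: 20` and `strandPenalty: 1` would settle at a batch of 200; halving or doubling that batch raises the modelled cost. An online learner is useful precisely because the demand term is not known in advance and drifts.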
Pillar 2 is wired into the lease loop. Rather than drive the sizer by hand, pass lease.adaptive and twoTier sizes each key's batch online for you — feeding the per-key learner the demand that key served each window and leasing at the size it reads back:
```typescript
import { twoTier, fixedWindow } from "throttlekit";

const limiter = twoTier({
  strategy: fixedWindow({ limit: 10_000, windowMs: 60_000 }),
  l2, // a distributed store, e.g. RedisStore
  mode: "leased",
  lease: { windowCoupled: true, adaptive: { orderCost: 20, strandPenalty: 1 } },
});
```

The manual snippet above is still how you drive `predictiveLeaseSizer` (Pillar 3), which exploits a per-window demand prediction the in-loop wiring can't supply. Walkthrough: examples/adaptive-lease-sizing.ts.
A global limit across regions is the leased model with the regions as the leasing nodes and one shared L2. Each region serves the bulk of its traffic from a local lease — region-local latency, no per-request cross-region hop — and the same verified bound caps the worldwide overshoot:
```text
global admitted per window ≤ Limit + regions × (batch − 1)   (carryover)
                           ≤ Limit                            (windowCoupled, any number of regions)
```
So 4 regions leasing batch: 50 against a global limit: 10_000 admit at most 10_196 worldwide under carryover (< 2% overshoot for ~1 cross-region hop per 50 requests) — or exactly 10_000 with windowCoupled, no matter how many regions. There's no separate multi-region engine to trust — it's twoTier leased at a shared store. See examples/multi-region.ts.
`twoTier` in leased mode with `windowCoupled` (above) gives the fleet-size-independent bound (admitted ≤ Limit) when all processes share one regional store. When your processes span multiple Redis clusters (one per region) and you want one global limit pooled across them, use `federate(...)`:
```typescript
import { fixedWindow, federate, RedisCoordinator } from "throttlekit";

const coordinator = new RedisCoordinator({ client, windowMs: 60_000, budgetPerWindow: 1000 });
const limiter = federate({
  strategy: fixedWindow({ limit: 1000, windowMs: 60_000 }),
  coordinator,
  region: "us-east",
  batch: 16,
});
```

Federation keeps the same bound regardless of the number of regions (admitted ≤ Limit, proven in spec/GaleFederatedLeasing.tla), pools fully under skew (the hot region can draw the entire budget), and fails closed across every outage shape. The full design and evaluation are in Federation. Compared with a static partition, federation admits 3× as many requests at max skew.
The in-process MemoryStore never fails. A distributed store can — see Operations → Failure modes for the fail: "open" | "closed" policy, plus the two extra hedges (twoTier leased keeps serving from the local lease during a brief L2 outage; the Redis path is a single atomic round trip with no read-then-write window to interrupt).
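As a rough illustration of what the two `fail` values conventionally mean, the sketch below wraps an arbitrary check. `checkWithFailPolicy` and `Outcome` are hypothetical names for this example; the library's actual behaviour is the one documented under Operations → Failure modes.

```typescript
type FailPolicy = "open" | "closed";

interface Outcome {
  allowed: boolean;
  degraded: boolean; // true when the store could not be reached
}

// Applies a fail-open or fail-closed policy around a check that may throw.
async function checkWithFailPolicy(
  check: () => Promise<boolean>,
  fail: FailPolicy,
): Promise<Outcome> {
  try {
    return { allowed: await check(), degraded: false };
  } catch {
    // Store unreachable: "open" admits the request, "closed" rejects it.
    return { allowed: fail === "open", degraded: true };
  }
}
```

Fail-open favours availability (traffic keeps flowing while the limiter is blind), while fail-closed favours protecting the quota; surfacing a flag such as `degraded` lets callers log or alert on the difference.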
ThrottleKit · MIT · 1.0 — API frozen under SemVer (Stability)