Distributed and Provable

Distributed, and provably bounded

The same strategy you ran in-memory runs across a fleet — just hand it a distributed store. Every backend produces the same decisions (the conformance suite proves it bit-identical).

Redis — atomic Lua, one round trip

import { rateLimit, gcra } from "throttlekit";
import { RedisStore } from "throttlekit/redis";
import Redis from "ioredis";

const store = new RedisStore({ client: new Redis(process.env.REDIS_URL) });

const limiter = rateLimit({
  strategy: gcra({ limit: 1000, periodMs: 60_000, burst: 100 }),
  store,         // one EVALSHA per check, fully atomic — no read-then-write race
  prefix: "api", // namespace, so one store can back many limiters
});

const d = await limiter.check(userId);

Built-in strategies run their atomic Lua form in a single EVALSHA (with an EVAL fallback on NOSCRIPT); custom strategies fall back to optimistic concurrency (WATCH/MULTI/EXEC). RedisStore derives the current time (now) from the Redis server clock, so clock skew between nodes never corrupts shared state.
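To make the "same decision on every backend" idea concrete, here is a minimal sketch of a GCRA step written as a pure function of the stored state, the clock, and the config. It is not ThrottleKit's source: `gcraStep`, `GcraDecision`, and the tolerance convention (up to `burst` back-to-back requests) are assumptions chosen for illustration. The point is only that a transform shaped like this can run unchanged in memory, in Lua, or inside a transaction.

```typescript
// Hypothetical sketch of a GCRA step as a pure transform (names are illustrative).
interface GcraConfig { limit: number; periodMs: number; burst: number }
interface GcraDecision { allowed: boolean; newTat: number; retryAfterMs: number }

// `tat` is the stored "theoretical arrival time"; undefined means the key is new.
function gcraStep(tat: number | undefined, nowMs: number, cfg: GcraConfig): GcraDecision {
  const interval = cfg.periodMs / cfg.limit;   // spacing between conforming requests
  const tolerance = interval * cfg.burst;      // assumed convention: `burst` requests may arrive at once
  const base = Math.max(tat ?? nowMs, nowMs);  // an idle key never accumulates credit beyond the burst
  const candidate = base + interval;
  if (candidate - nowMs > tolerance) {
    // Denied: state is left unchanged, and the caller learns when to retry.
    return { allowed: false, newTat: base, retryAfterMs: candidate - nowMs - tolerance };
  }
  return { allowed: true, newTat: candidate, retryAfterMs: 0 };
}
```

Because the function takes `nowMs` as an argument, the caller decides which clock is authoritative, which is what lets a store substitute the server's clock for the node's.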

Any Redis client — including serverless / edge. RedisStore speaks the ioredis shape directly; for the official node-redis client or the Upstash REST client (where TCP isn't allowed), wrap it:

import { RedisStore, fromNodeRedis, fromUpstash } from "throttlekit/redis";

new RedisStore({ client: new Redis(url) });                  // ioredis — straight through
new RedisStore({ client: fromNodeRedis(nodeRedisClient) });  // official `redis` client
new RedisStore({ client: fromUpstash(Upstash.fromEnv()) });  // Upstash REST — serverless/edge

All three are proven bit-identical to the in-process path by the conformance suite (ioredis and node-redis tested against a live server). Upstash REST has no interactive WATCH/MULTI, so it supports the Lua-backed built-ins only. See examples/redis-distributed.ts.

PostgreSQL — no Redis required

Already running Postgres? You don't need to add Redis. PostgresStore is a fully distributed backend:

import { rateLimit, gcra } from "throttlekit";
import { PostgresStore } from "throttlekit/postgres";
import { Pool } from "pg";

const store = new PostgresStore({ pool: new Pool({ connectionString: process.env.DATABASE_URL }), prefix: "api" });
const limiter = rateLimit({ strategy: gcra({ limit: 1000, periodMs: 60_000, burst: 100 }), store });
const d = await limiter.check(userId);

It runs the same pure JS transform the in-memory store runs, so there is no Postgres-specific algorithm to keep in sync. Each check executes inside a transaction that is serialized per key by a transaction-scoped advisory lock (pg_advisory_xact_lock). The advisory lock also serializes first-touch keys, which SELECT … FOR UPDATE cannot do because no row exists yet to lock. Concurrent checks are therefore atomic (N simultaneous checks at limit K admit exactly K, proven on a live server), and decisions are bit-identical to the other backends. Pass a pg.Pool directly; no adapter is needed. Each check is one transaction, so for hot keys use PostgresStore as the L2 of twoTier({ mode: "leased" }). See examples/postgres.ts.
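The effect of per-key serialization can be illustrated without a database. In the sketch below an in-process promise chain stands in for the advisory lock, and a `Map` stands in for the table; `KeyedSerializer` and `racyCheck` are hypothetical names, not part of ThrottleKit or pg. It shows why an unserialized read-modify-write over-admits and why serializing per key restores "exactly K".

```typescript
// Illustrative only: a per-key promise chain plays the role of the advisory lock.
class KeyedSerializer {
  private tails = new Map<string, Promise<unknown>>();

  // Runs `critical` only after every earlier call for the same key has finished.
  run<T>(key: string, critical: () => Promise<T>): Promise<T> {
    const prev = this.tails.get(key) ?? Promise.resolve();
    const result = prev.then(() => critical());
    // Store a non-rejecting tail so one failure does not poison the chain.
    this.tails.set(key, result.catch(() => undefined));
    return result;
  }
}

// A deliberately racy read-modify-write: the await between read and write
// models the gap in which another check can interleave.
async function racyCheck(rows: Map<string, number>, key: string, limit: number): Promise<boolean> {
  const used = rows.get(key) ?? 0; // read
  await Promise.resolve();         // other checks may run here
  if (used >= limit) return false;
  rows.set(key, used + 1);         // write
  return true;
}
```

Calling `racyCheck` 50 times concurrently against a limit of 10 admits all 50, because every call reads the same initial count. Wrapping each call in `serializer.run(key, ...)` admits exactly 10.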

Serverless & edge backends (Cloudflare, DynamoDB, Deno KV)

Four more exact stores, each built on its platform's native atomic primitive — so decisions stay bit-identical to Redis/Postgres with no lock held, proven by the same conformance suite (including a 200-way concurrent read-modify-write).

Cloudflare Durable Objects (throttlekit/cloudflare) — a single-threaded actor with transactional storage. Construct it inside your Durable Object from state; the read-modify-write runs in blockConcurrencyWhile, so it is atomic with no retry loop:

import { rateLimit, gcra } from "throttlekit";
import { DurableObjectStore } from "throttlekit/cloudflare";
// inside your Durable Object's constructor(state):
this.limiter = rateLimit({ strategy: gcra({ limit: 100, periodMs: 60_000 }), store: new DurableObjectStore(state) });

Cloudflare D1 (throttlekit/cloudflare, D1Store) — edge SQLite for a plain Worker: optimistic concurrency via a version compare-and-set, plus in-process per-key coalescing so a hot key costs ~1 write instead of a retry storm. new D1Store({ db: env.DB }); call sweep() from a Cron Trigger to reclaim space.
DynamoDB (throttlekit/dynamodb, DynamoStore) — a conditional-write CAS on a version attribute; expires_at is stored in epoch seconds so DynamoDB's native TTL reclaims rows. The structural client mirrors the AWS SDK v3 commands (an ~8-line pass-through adapter).
Deno KV (throttlekit/deno, DenoKvStore) — Deno KV's native atomic versionstamp CAS with native expireIn TTL.
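The version compare-and-set pattern that D1, DynamoDB, and Deno KV are described as using can be sketched against an in-memory table. Everything here (`CasTable`, `casUpdate`, the `maxAttempts` cap) is a hypothetical stand-in for illustration, not the stores' actual code: a write succeeds only if the version it read is still current, and a loser re-reads and retries.

```typescript
// Illustrative optimistic-concurrency loop over an in-memory versioned table.
interface Versioned<S> { value: S; version: number }

class CasTable<S> {
  private rows = new Map<string, Versioned<S>>();

  async read(key: string): Promise<Versioned<S> | undefined> {
    await Promise.resolve(); // models a network hop
    const row = this.rows.get(key);
    return row ? { ...row } : undefined;
  }

  // Conditional write: applies only if the stored version still equals `expected`
  // (version 0 means "row does not exist yet").
  async writeIfVersion(key: string, expected: number, value: S): Promise<boolean> {
    await Promise.resolve();
    const current = this.rows.get(key)?.version ?? 0;
    if (current !== expected) return false;
    this.rows.set(key, { value, version: expected + 1 });
    return true;
  }
}

async function casUpdate<S>(
  table: CasTable<S>,
  key: string,
  transform: (state: S | undefined) => S,
  maxAttempts = 1000,
): Promise<S> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const row = await table.read(key);
    const next = transform(row?.value);
    if (await table.writeIfVersion(key, row?.version ?? 0, next)) return next;
    // Lost the race: another writer bumped the version, so re-read and retry.
  }
  throw new Error(`CAS gave up on ${key} after ${maxAttempts} attempts`);
}
```

No lock is held at any point; correctness comes from the conditional write. The cost is retries under contention, which is the problem per-key coalescing is meant to reduce for hot keys.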

Workers KV (throttlekit/cloudflare, KVStore) is offered only as an explicitly best-effort store: it is eventually consistent with no atomic compare-and-set, so concurrent checks read-modify-write over each other and edge reads can lag writes. As a result it can over-admit under load. It is therefore not run through the conformance suite and does not honor the exact Store guarantee. Reach for it only for coarse edge protection where occasional over-admission is acceptable; use Durable Objects or D1 when you need the exact bound.

The four exact stores are all async-only (checkSync throws) and slot into twoTier as the L2 exactly like Redis/Postgres.

Two-tier — local cache in front of the network

Front the distributed store (L2) with a local in-process tier (L1) and pick the consistency/throughput trade-off:

import { twoTier, gcra } from "throttlekit";

const limiter = twoTier({
  strategy: gcra({ limit: 10_000, periodMs: 60_000, burst: 500 }),
  l2: store,                                   // a distributed store, e.g. RedisStore
  mode: "leased",                              // "strict" | "cached-deny" | "leased"
  lease: { batch: 50, windowCoupled: true },   // lease 50 at a time; expire at the L2 window
});

| Mode | Network cost | Global accuracy | Best for |
| --- | --- | --- | --- |
| `strict` | 1 round trip / request | Exact | Hard quotas, billing |
| `cached-deny` | 1 round trip / allowed request | Exact for allows, local for denies | Public APIs under abuse |
| `leased` | ~1 round trip / `batch` requests | Bounded overshoot (below) | High-throughput internal APIs |
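As a rough illustration of the cached-deny row, the sketch below remembers a deny locally until its retry time, so a client hammering an exhausted key stops generating round trips. This is an assumed reading of the mode, not ThrottleKit's implementation; `withDenyCache`, `RemoteCheck`, and the `retryAfterMs` field are hypothetical names.

```typescript
// Illustrative deny cache: allows always consult the remote store, denies are replayed locally.
interface Decision { allowed: boolean; retryAfterMs: number }
type RemoteCheck = (key: string, nowMs: number) => Promise<Decision>;

function withDenyCache(remote: RemoteCheck): RemoteCheck {
  const deniedUntil = new Map<string, number>();
  return async (key, nowMs) => {
    const until = deniedUntil.get(key);
    if (until !== undefined && nowMs < until) {
      // Still inside a known deny window: answer locally, no network.
      return { allowed: false, retryAfterMs: until - nowMs };
    }
    deniedUntil.delete(key);
    const decision = await remote(key, nowMs);
    if (!decision.allowed) deniedUntil.set(key, nowMs + decision.retryAfterMs);
    return decision;
  };
}
```

Under this reading, every allow is still decided by the shared store, which is why allows stay exact while repeated denies cost nothing on the network.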

leased trades exactness for throughput, with a provably bounded worst-case overshoot you choose:

Default (carryover): admitted ≤ Limit + N·(Batch−1) — tight, but grows with the fleet size N.
windowCoupled: true: credits expire at the L2 window boundary, so admitted ≤ Limit, independent of N. Opt-in; the default (off) preserves legacy behaviour.

twoTier's check is async (checkSync throws, because L2 is asynchronous).
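The two leased bounds above can be reproduced with a toy model. The sketch assumes a simple shared pool that refills to Limit at each window boundary and nodes that lease `batch` credits at a time; `LeasingNode` and `worstCaseWindow` are hypothetical names and this is not the library's lease loop. Window 1 strands Batch−1 credits on every node, then window 2 is saturated.

```typescript
// Toy model of leased mode, for illustration only.
interface SharedPool { remaining: number }

class LeasingNode {
  credits = 0;

  // Spend a local credit, leasing a fresh batch from the pool when empty.
  admit(pool: SharedPool, batch: number): boolean {
    if (this.credits === 0) {
      const granted = Math.min(batch, pool.remaining);
      pool.remaining -= granted;
      this.credits = granted;
    }
    if (this.credits === 0) return false;
    this.credits -= 1;
    return true;
  }
}

// Returns how many requests the fleet admits in window 2 under the adversarial schedule.
// Assumes limit >= nodes * batch so every node can lease a full batch in window 1.
function worstCaseWindow(limit: number, nodes: number, batch: number, windowCoupled: boolean): number {
  const fleet = Array.from({ length: nodes }, () => new LeasingNode());
  const pool: SharedPool = { remaining: limit };

  // Window 1: each node serves one request, leaving batch - 1 credits unspent.
  for (const node of fleet) node.admit(pool, batch);

  // Window boundary: the pool refills; window-coupled credits expire with the window.
  pool.remaining = limit;
  if (windowCoupled) for (const node of fleet) node.credits = 0;

  // Window 2: every node admits until nothing is left anywhere.
  let admitted = 0;
  let progressed = true;
  while (progressed) {
    progressed = false;
    for (const node of fleet) {
      if (node.admit(pool, batch)) { admitted += 1; progressed = true; }
    }
  }
  return admitted;
}
```

With `limit = 10_000`, `nodes = 4`, `batch = 50`, the carryover run admits 10_196 in window 2 (Limit + N·(Batch−1)), and the window-coupled run admits 10_000.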

Tier-2: a client-held lease. When the leasing node is a remote client of a ThrottleKit server (rather than an in-process L1), the same window-coupled escrow is available as a client-held lease — Fleet.Reserve hands a batch to the caller and the core LeaseSpender spends it locally, so most requests never round-trip to the server. The lease window boundary is pinned to the store's authoritative clock (GlobalCoordinator.leaseWindowed), so node clock skew can't shift a window and over-admit. See Scaling & the Fleet.

Formally verified — and independent of fleet size

These bounds are proven, not claimed. A TLA⁺ spec is model-checked with TLC: under carryover the worst-case admitted count is exactly Limit + N·(Batch−1), and a counterexample shows the bound is tight, not loose. A second spec shows that window-coupling tightens it to exactly Limit, independent of N, and a Java-free exhaustive checker reproduces both results in CI. This is the shipped core of GALE, ThrottleKit's provable distributed-leasing work. Full write-up: docs/FORMAL-MODEL.md.

Sizing the lease (`leaseSizer`)

The batch is a throughput-vs-stranding dial: larger batches mean fewer L2 round trips but more credits checked out of the shared pool that other nodes can't use. The right value tracks a node's demand, which drifts. So rather than hard-coding it, leaseSizer learns it online (AdaGrad on the EOQ coordination-vs-stranding cost, O(√T) regret), and predictiveLeaseSizer folds in a per-window demand prediction with consistency and robustness guarantees:

import { leaseSizer } from "throttlekit";
const sizer = leaseSizer({ orderCost: 20, strandPenalty: 1 });
const batch = sizer.size();      // use as twoTier lease.batch
// …serve the window…
sizer.observe(demandThisWindow); // learn for next window

This can never loosen the bound: under windowCoupled the worst-case admitted count is exactly Limit regardless of batch, so the batch sets only coordination frequency, not safety. These are GALE Pillars 2–3 (see GALE & TALE).
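For intuition about what "AdaGrad on the EOQ cost" could look like, here is a sketch under stated assumptions. It uses the textbook economic-order-quantity cost, orderCost·demand/batch + strandPenalty·batch/2, and a scalar AdaGrad step; the real `leaseSizer` may define its cost and update differently. `eoqCost`, `eoqOptimum`, `SketchLeaseSizer`, `stepSize`, and `maxBatch` are all hypothetical names.

```typescript
// Illustrative only: textbook EOQ cost with a scalar AdaGrad update.
interface EoqCosts { orderCost: number; strandPenalty: number }

// Per-window cost of leasing in batches of `batch` when `demand` requests arrive.
function eoqCost(batch: number, demand: number, c: EoqCosts): number {
  return (c.orderCost * demand) / batch + (c.strandPenalty * batch) / 2;
}

// Closed-form minimiser of eoqCost for a known, fixed demand.
function eoqOptimum(demand: number, c: EoqCosts): number {
  return Math.sqrt((2 * c.orderCost * demand) / c.strandPenalty);
}

class SketchLeaseSizer {
  private batch: number;
  private gradSqSum = 0;
  private readonly costs: EoqCosts;
  private readonly stepSize: number;
  private readonly maxBatch: number;

  constructor(costs: EoqCosts, initialBatch = 50, stepSize = 25, maxBatch = 10_000) {
    this.costs = costs;
    this.batch = initialBatch;
    this.stepSize = stepSize;
    this.maxBatch = maxBatch;
  }

  size(): number {
    return Math.max(1, Math.round(this.batch));
  }

  // One AdaGrad step on the gradient of eoqCost with respect to the batch.
  observe(demand: number): void {
    const grad = -(this.costs.orderCost * demand) / (this.batch * this.batch)
      + this.costs.strandPenalty / 2;
    this.gradSqSum += grad * grad;
    if (this.gradSqSum === 0) return; // already at a stationary point
    const next = this.batch - (this.stepSize * grad) / Math.sqrt(this.gradSqSum);
    this.batch = Math.min(this.maxBatch, Math.max(1, next));
  }
}
```

With `orderCost: 20`, `strandPenalty: 1`, and a steady demand of 1000 per window, the closed-form optimum is a batch of 200, and the learner moves from its starting batch toward that value as it observes each window.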

Pillar 2 is wired into the lease loop. Rather than drive the sizer by hand, pass lease.adaptive and twoTier sizes each key's batch online for you — feeding the per-key learner the demand that key served each window and leasing at the size it reads back:

import { twoTier, fixedWindow } from "throttlekit";

const limiter = twoTier({
  strategy: fixedWindow({ limit: 10_000, windowMs: 60_000 }),
  l2, mode: "leased", // l2: a distributed store, e.g. RedisStore
  lease: { windowCoupled: true, adaptive: { orderCost: 20, strandPenalty: 1 } },
});

The manual snippet above is still how you drive predictiveLeaseSizer (Pillar 3), which exploits a per-window demand prediction the in-loop wiring can't supply. Walkthrough: examples/adaptive-lease-sizing.ts.

Multi-region

A global limit across regions is the leased model with the regions as the leasing nodes and one shared L2. Each region serves the bulk of its traffic from a local lease — region-local latency, no per-request cross-region hop — and the same verified bound caps the worldwide overshoot:

global admitted per window  ≤  Limit + regions × (batch − 1)   (carryover)
                            ≤  Limit                            (windowCoupled — any number of regions)

So 4 regions leasing batch: 50 against a global limit: 10_000 admit at most 10_196 worldwide under carryover (< 2% overshoot for ~1 cross-region hop per 50 requests) — or exactly 10_000 with windowCoupled, no matter how many regions. There's no separate multi-region engine to trust — it's twoTier leased at a shared store. See examples/multi-region.ts.
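The arithmetic above is easy to rerun for other topologies. The helpers below simply evaluate the two stated bounds; the function names are illustrative and not part of the library.

```typescript
// Evaluates the documented bounds; helper names are hypothetical.
function carryoverBound(limit: number, regions: number, batch: number): number {
  return limit + regions * (batch - 1);
}

function windowCoupledBound(limit: number): number {
  return limit; // independent of the number of regions and of the batch size
}

function overshootPercent(limit: number, bound: number): number {
  return ((bound - limit) / limit) * 100;
}

// How the carryover bound grows with the number of regions at batch 50, limit 10_000.
for (const regions of [2, 4, 8, 16]) {
  const bound = carryoverBound(10_000, regions, 50);
  console.log(`${regions} regions: admitted <= ${bound} (carryover), <= ${windowCoupledBound(10_000)} (windowCoupled)`);
}
```

Doubling the regions doubles the carryover slack (98, 196, 392, 784 extra admits), while the window-coupled bound stays at the limit.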

Cross-cluster federation — when regions span clusters

twoTier(leased, windowCoupled) above gives the fleet-size-independent bound (admitted ≤ Limit) when all processes share one regional store. When your processes span multiple Redis clusters (one per region) and you want one global limit pooled across them, use federate(...):

import { fixedWindow, federate, RedisCoordinator } from "throttlekit";

const coordinator = new RedisCoordinator({ client, windowMs: 60_000, budgetPerWindow: 1000 });
const limiter = federate({
  strategy: fixedWindow({ limit: 1000, windowMs: 60_000 }),
  coordinator,
  region: "us-east",
  batch: 16,
});

Same fleet-size-independent bound (admitted ≤ Limit, proven in spec/GaleFederatedLeasing.tla), full pooling under skew (the hot region can draw the entire budget), and fail-closed behaviour across every outage shape. Full design and evaluation are in Federation. Compared with a static partition of the budget, federation admits 3× as many requests at maximum skew.

When the backend goes down

The in-process MemoryStore never fails. A distributed store can — see Operations → Failure modes for the fail: "open" | "closed" policy, plus the two extra hedges (twoTier leased keeps serving from the local lease during a brief L2 outage; the Redis path is a single atomic round trip with no read-then-write window to interrupt).
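The fail: "open" | "closed" choice comes down to what a check returns when the store throws. The wrapper below is a sketch of that decision only; `withFailPolicy` and the `degraded` flag are hypothetical and do not reflect the library's actual option handling, which is documented under Operations → Failure modes.

```typescript
// Illustrative fail-open / fail-closed wrapper around any async check.
type FailPolicy = "open" | "closed";
interface GuardedDecision { allowed: boolean; degraded: boolean }

function withFailPolicy(
  check: (key: string) => Promise<boolean>,
  fail: FailPolicy,
): (key: string) => Promise<GuardedDecision> {
  return async (key) => {
    try {
      return { allowed: await check(key), degraded: false };
    } catch {
      // Backend unreachable: "open" admits the request, "closed" rejects it.
      return { allowed: fail === "open", degraded: true };
    }
  };
}
```

Fail-open favours availability (traffic flows unmetered during an outage), while fail-closed favours the quota (nothing is admitted that the store did not approve). Surfacing a flag such as `degraded` lets callers alert on either path.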

ThrottleKit · MIT · 1.0 — API frozen under SemVer (Stability)
