Skip to content

Distributed and Provable

Ameya Borkar edited this page May 27, 2026 · 8 revisions

Distributed, and provably bounded

The same strategy you ran in-memory runs across a fleet — just hand it a distributed store. Every backend produces the same decisions (the conformance suite proves it bit-identical).

Redis — atomic Lua, one round trip

import { rateLimit, gcra } from "throttlekit";
import { RedisStore } from "throttlekit/redis";
import Redis from "ioredis";

const store = new RedisStore({ client: new Redis(process.env.REDIS_URL) });

const limiter = rateLimit({
  strategy: gcra({ limit: 1000, periodMs: 60_000, burst: 100 }),
  store,         // one EVALSHA per check, fully atomic — no read-then-write race
  prefix: "api", // namespace, so one store can back many limiters
});

const d = await limiter.check(userId);

Built-in strategies run their atomic Lua form in a single EVALSHA (with an EVAL fallback on NOSCRIPT); custom strategies fall back to optimistic concurrency (WATCH/MULTI/EXEC). RedisStore derives now from the Redis server clock, so node clock skew never corrupts shared state.

Any Redis client — including serverless / edge. RedisStore speaks the ioredis shape directly; for the official node-redis client or the Upstash REST client (where TCP isn't allowed), wrap it:

import { RedisStore, fromNodeRedis, fromUpstash } from "throttlekit/redis";

new RedisStore({ client: new Redis(url) });                  // ioredis — straight through
new RedisStore({ client: fromNodeRedis(nodeRedisClient) });  // official `redis` client
new RedisStore({ client: fromUpstash(Upstash.fromEnv()) });  // Upstash REST — serverless/edge

All three are proven bit-identical to the in-process path by the conformance suite (ioredis and node-redis tested against a live server). Upstash REST has no interactive WATCH/MULTI, so it supports the Lua-backed built-ins only. See examples/redis-distributed.ts.

PostgreSQL — no Redis required

Already running Postgres? You don't need to add Redis. PostgresStore is a fully distributed backend:

import { rateLimit, gcra } from "throttlekit";
import { PostgresStore } from "throttlekit/postgres";
import { Pool } from "pg";

const store = new PostgresStore({ pool: new Pool({ connectionString: process.env.DATABASE_URL }), prefix: "api" });
const limiter = rateLimit({ strategy: gcra({ limit: 1000, periodMs: 60_000, burst: 100 }), store });
const d = await limiter.check(userId);

It runs the same pure JS transform the in-memory store runs — no Postgres-specific algorithm to keep in sync — inside a transaction serialized per key by a transaction-scoped advisory lock (pg_advisory_xact_lock, which serializes first-touch keys that SELECT … FOR UPDATE cannot). So concurrent checks are atomic (N simultaneous at limit K admit exactly K, proven on a live server) and decisions are bit-identical to the other backends. Pass a pg.Pool directly — no adapter. Each check is one transaction; for hot keys, use it as the L2 of twoTier({ mode: "leased" }). See examples/postgres.ts.

Two-tier — local cache in front of the network

Front the distributed store (L2) with a local in-process tier (L1) and pick the consistency/throughput trade-off:

import { twoTier, gcra } from "throttlekit";

const limiter = twoTier({
  strategy: gcra({ limit: 10_000, periodMs: 60_000, burst: 500 }),
  l2: store,                                   // a distributed store, e.g. RedisStore
  mode: "leased",                              // "strict" | "cached-deny" | "leased"
  lease: { batch: 50, windowCoupled: true },   // lease 50 at a time; expire at the L2 window
});
Mode Network cost Global accuracy Best for
strict 1 round trip / request Exact Hard quotas, billing
cached-deny 1 round trip / allowed request Exact for allows, local for denies Public APIs under abuse
leased ~1 round trip / batch requests Bounded overshoot (below) High-throughput internal APIs

leased trades exactness for throughput, with a provably bounded worst-case overshoot you choose:

  • Default (carryover): admitted ≤ Limit + N·(Batch−1) — tight, but grows with the fleet size N.
  • windowCoupled: true: credits expire at the L2 window boundary, so admitted ≤ Limitindependent of N. Opt-in; default off preserves legacy behaviour. twoTier's check is async (checkSync throws — L2 is asynchronous).

Formally verified — and independent of fleet size

These bounds are proven, not claimed. A TLA⁺ spec is model-checked with TLC — carryover overshoot is exactly Limit + N·(Batch−1) (a counterexample shows it's tight, not loose) — and window-coupling tightens it to exactly Limit, independent of N (second spec + a Java-free exhaustive checker reproducing both in CI). This is the shipped core of GALE, the research track on provable distributed leasing. Full write-up: docs/FORMAL-MODEL.md.

Sizing the lease (leaseSizer)

The batch is a throughput-vs-stranding dial: larger batches mean fewer L2 round trips but more credits checked out of the shared pool that other nodes can't use. The right value tracks a node's demand, which drifts — so rather than hard-code it, leaseSizer learns it online (AdaGrad on the EOQ coordination-vs-stranding cost, O(√T) regret), and predictiveLeaseSizer folds in a per-window demand prediction with consistency + robustness:

import { leaseSizer } from "throttlekit";
const sizer = leaseSizer({ orderCost: 20, strandPenalty: 1 });
const batch = sizer.size();      // use as twoTier lease.batch
// …serve the window…
sizer.observe(demandThisWindow); // learn for next window

This can never loosen the bound: under windowCoupled the overshoot is exactly Limit regardless of batch, so the batch sets only coordination frequency, not safety. These are GALE Pillars 2–3 (see the research track); today they're standalone learners that advise lease.batch — wiring them into the lease loop as live in-flight adaptation is a planned follow-up.

Multi-region

A global limit across regions is the leased model with the regions as the leasing nodes and one shared L2. Each region serves the bulk of its traffic from a local lease — region-local latency, no per-request cross-region hop — and the same verified bound caps the worldwide overshoot:

global admitted per window  ≤  Limit + regions × (batch − 1)   (carryover)
                            ≤  Limit                            (windowCoupled — any number of regions)

So 4 regions leasing batch: 50 against a global limit: 10_000 admit at most 10_196 worldwide under carryover (< 2% overshoot for ~1 cross-region hop per 50 requests) — or exactly 10_000 with windowCoupled, no matter how many regions. There's no separate multi-region engine to trust — it's twoTier leased at a shared store. See examples/multi-region.ts.

When the backend goes down

The in-process MemoryStore never fails. A distributed store can — see Operations → Failure modes for the fail: "open" | "closed" policy, plus the two extra hedges (twoTier leased keeps serving from the local lease during a brief L2 outage; the Redis path is a single atomic round trip with no read-then-write window to interrupt).

Clone this wiki locally