
Strategies

Ameya Borkar edited this page May 27, 2026 · 2 revisions

Choosing a strategy

Pick one and pass it to rateLimit({ strategy }):

| Goal | Strategy |
| --- | --- |
| Good general default: tiny state, smooth pacing, controlled bursts | gcra({ limit, periodMs, burst? }) |
| Client-friendly "tokens remaining", controlled bursts | tokenBucket({ capacity, refillPerSec }) |
| Cheapest coarse cap (allows up to 2× across a boundary, by design) | fixedWindow({ limit, windowMs }) |
| Near-exact rolling window at any limit, bounded memory | slidingWindow({ limit, windowMs, buckets? }) |
| Exact "N in the last X" at low/moderate limits | slidingWindowLog({ limit, windowMs }) |
| Shape/queue outbound calls to a fixed rate (delays, doesn't reject) | leakyBucket({ ratePerSec, maxQueueMs }) |
| Protect a service from overload when the right rate is unknown | adaptiveConcurrency({ ... }) |
| Billing-period budget that resets on a real calendar boundary ("1M/month on the 1st") | quota({ limit, resetCadence }) |
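
The "up to 2× across a boundary" behaviour of a fixed window can be reproduced with a small in-memory sketch. The `FixedWindowCounter` class and `allowedAt` helper below are hypothetical names written for illustration; they are not throttlekit's implementation and only assume that a fixed window counts requests per aligned window and resets at each boundary.

```typescript
// Illustrative fixed-window counter (hypothetical; not throttlekit's code).
class FixedWindowCounter {
  private readonly limit: number;
  private readonly windowMs: number;
  private windowStart = 0;
  private count = 0;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true if a request arriving at `nowMs` is allowed.
  allow(nowMs: number): boolean {
    const start = Math.floor(nowMs / this.windowMs) * this.windowMs;
    if (start !== this.windowStart) {
      // A new aligned window began, so the counter resets to zero.
      this.windowStart = start;
      this.count = 0;
    }
    if (this.count >= this.limit) return false;
    this.count += 1;
    return true;
  }
}

// Sends `n` requests at the same instant and counts how many are allowed.
function allowedAt(counter: FixedWindowCounter, nowMs: number, n: number): number {
  let allowed = 0;
  for (let i = 0; i < n; i++) {
    if (counter.allow(nowMs)) allowed++;
  }
  return allowed;
}

const fixedCounter = new FixedWindowCounter(10, 1_000);
const beforeBoundary = allowedAt(fixedCounter, 999, 15);  // last millisecond of the first window
const afterBoundary = allowedAt(fixedCounter, 1_000, 15); // first millisecond of the next window
console.log(beforeBoundary + afterBoundary); // 20 allowed within 2 ms, twice the nominal limit
```

Each window on its own never exceeds the limit; the doubling only appears when traffic straddles the boundary, which is why the table describes it as a deliberate trade-off for the cheapest state.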

Notes

  • gcra stores a single number (the theoretical arrival time); burst defaults to limit. It is the recommended default: smooth pacing, controlled bursts, one timestamp of state per key.
  • For slidingWindow, buckets defaults to 10 (error ≈ 1/buckets of the window); buckets: 1 is the classic estimator. slidingWindowLog is exact but uses O(limit) memory per key.
  • leakyBucket and adaptiveConcurrency build a Shaper / concurrency guard, not a Limiter — see Advanced limiting → Backpressure & shaping.

First-class quotas (quota)

A quota is a budget that resets on a real calendar boundary — distinct from a sliding rate limit. The canonical case is "1,000,000 calls/month, resetting on the 1st": Decision.remaining is the quota left this period and Decision.resetAt is the true next boundary (the next civil 1st, leap-year-correct), not a rolling approximation.

import { rateLimit, quota } from "throttlekit";
import { RedisStore } from "throttlekit/redis";

// `client` is an already-connected Redis client and `accountId` is the key
// being metered; both are assumed to be defined elsewhere.
const limiter = rateLimit({
  strategy: quota({ limit: 1_000_000, resetCadence: "calendar-month" }),
  store: new RedisStore({ client }),
  prefix: "quota",
});

// d.remaining = calls left this month; d.resetAt = the next 1st
const d = await limiter.check(accountId);

| resetCadence | Resets | Extra options |
| --- | --- | --- |
| "calendar-month" | on the 1st of each civil month | offsetMinutes? |
| "calendar-week" | on weekStartsOn each week | offsetMinutes?, weekStartsOn? (0=Sun … 6=Sat, default Mon) |
| "calendar-day" | at local midnight | offsetMinutes? |
| "fixed" | every periodMs from anchor | periodMs (required), anchor? (default epoch-aligned) |
| "rolling" | trailing periodMs window | periodMs (required), buckets? |

Calendar boundaries come from a dependency-free civil-calendar algorithm (Howard Hinnant's days_from_civil / civil_from_days) that is mirrored byte-for-byte in the atomic Redis Lua form, so a monthly quota decides identically in-process and on Redis, Postgres, DynamoDB, D1, or Deno KV — proven bit-identical by the dual-path conformance suite, even when Redis substitutes its own server clock.
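
To make the civil-calendar step concrete, here is a minimal TypeScript sketch of Howard Hinnant's published `days_from_civil` / `civil_from_days` arithmetic, used to find the next UTC month boundary. The function names (`daysFromCivil`, `civilFromDays`, `nextMonthStartMs`) are hypothetical and this is not throttlekit's source; it assumes UTC (no offset) and epoch-millisecond inputs. The point is that the whole computation is integer arithmetic, which is what makes it portable to a Lua script.

```typescript
const DAY_MS = 86_400_000;

// Days since 1970-01-01 for a proleptic Gregorian date (month is 1..12).
function daysFromCivil(year: number, month: number, day: number): number {
  const y = month <= 2 ? year - 1 : year;        // Jan/Feb count as the end of the previous year
  const era = Math.floor(y / 400);               // 400-year cycle index
  const yoe = y - era * 400;                     // year of era, 0..399
  const mp = month > 2 ? month - 3 : month + 9;  // March = 0 ... February = 11
  const doy = Math.floor((153 * mp + 2) / 5) + day - 1;
  const doe = yoe * 365 + Math.floor(yoe / 4) - Math.floor(yoe / 100) + doy;
  return era * 146_097 + doe - 719_468;
}

// Inverse of daysFromCivil: returns [year, month (1..12), day (1..31)].
function civilFromDays(days: number): [number, number, number] {
  const z = days + 719_468;
  const era = Math.floor(z / 146_097);
  const doe = z - era * 146_097;                 // day of era, 0..146096
  const yoe = Math.floor(
    (doe - Math.floor(doe / 1_460) + Math.floor(doe / 36_524) - Math.floor(doe / 146_096)) / 365,
  );
  const doy = doe - (365 * yoe + Math.floor(yoe / 4) - Math.floor(yoe / 100));
  const mp = Math.floor((5 * doy + 2) / 153);
  const day = doy - Math.floor((153 * mp + 2) / 5) + 1;
  const month = mp < 10 ? mp + 3 : mp - 9;
  const year = yoe + era * 400 + (month <= 2 ? 1 : 0);
  return [year, month, day];
}

// Epoch-ms instant of the next 1st of the month (UTC) strictly after `nowMs`'s month.
function nextMonthStartMs(nowMs: number): number {
  const [year, month] = civilFromDays(Math.floor(nowMs / DAY_MS));
  const nextYear = month === 12 ? year + 1 : year;
  const nextMonth = month === 12 ? 1 : month + 1;
  return daysFromCivil(nextYear, nextMonth, 1) * DAY_MS;
}

// Mid-February in a leap year resets on 1 March, 29 days after 1 February.
console.log(new Date(nextMonthStartMs(Date.UTC(2024, 1, 15))).toISOString());
// 2024-03-01T00:00:00.000Z
```

`Date` appears only in the usage line as a cross-check; the boundary itself is computed without any platform calendar.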

Timezone caveat (honest scope). offsetMinutes is a fixed UTC offset, not a DST-aware zone. A daylight-saving transition cannot be reproduced in Redis Lua (no os.date, no tz database), and faking it would break the bit-identity guarantee — so it is intentionally out of scope. Pick the offset of your billing timezone (most billing runs in UTC or a fixed offset anyway). "rolling" simply delegates to slidingWindow.
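
The fixed-offset model amounts to shifting the clock, truncating to a day, and shifting back. The sketch below illustrates this for a "calendar-day" style boundary; `nextLocalMidnightMs` is a hypothetical helper, not part of throttlekit, and it assumes that a positive `offsetMinutes` means east of UTC, which the text above does not specify.

```typescript
const MS_PER_DAY = 86_400_000;

// Next local midnight for a fixed UTC offset, returned as a UTC epoch-ms instant.
// Assumption: positive offsetMinutes is east of UTC (e.g. 330 for UTC+05:30).
function nextLocalMidnightMs(nowMs: number, offsetMinutes: number): number {
  const offsetMs = offsetMinutes * 60_000;
  const localMs = nowMs + offsetMs; // shift into the fixed-offset "local" clock
  const nextLocalDayMs = (Math.floor(localMs / MS_PER_DAY) + 1) * MS_PER_DAY;
  return nextLocalDayMs - offsetMs; // shift back to a UTC instant
}

// 03:00 UTC on 10 June is 22:00 on 9 June at UTC-05:00,
// so the next local midnight is 05:00 UTC on 10 June.
const resetAtMs = nextLocalMidnightMs(Date.UTC(2024, 5, 10, 3, 0), -300);
console.log(new Date(resetAtMs).toISOString()); // 2024-06-10T05:00:00.000Z
```

The same arithmetic applies on every day of the year, so the boundary does not move when a DST-observing region changes its clocks.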

Why GCRA by default

GCRA (the Generic Cell Rate Algorithm) paces requests smoothly rather than resetting on a window boundary, so it has no "2× burst across the boundary" pathology that a fixed window has by design. It needs only a single timestamp of state per key, which keeps memory bounded and makes the Redis/Postgres encodings trivial (one number to store). For most APIs, gcra is the right call; reach for the others when you specifically need their semantics (e.g. a client-facing "tokens remaining" display → tokenBucket, or exact trailing-window counting → slidingWindowLog).
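
As a rough illustration of why one timestamp per key is enough, here is a minimal in-memory GCRA sketch. `GcraSketch` is a hypothetical class, not throttlekit's implementation; it assumes the common formulation in which the emission interval is `periodMs / limit`, the tolerance is `burst` emission intervals, and `burst` defaults to `limit` as described in the Notes.

```typescript
// Illustrative GCRA (hypothetical; not throttlekit's code).
class GcraSketch {
  private readonly emissionIntervalMs: number;
  private readonly toleranceMs: number;
  // The only state: one theoretical arrival time (TAT) per key.
  private readonly tatByKey = new Map<string, number>();

  constructor(limit: number, periodMs: number, burst: number = limit) {
    this.emissionIntervalMs = periodMs / limit;
    this.toleranceMs = this.emissionIntervalMs * burst;
  }

  // Returns true if a request for `key` arriving at `nowMs` is allowed.
  allow(key: string, nowMs: number): boolean {
    const tat = this.tatByKey.get(key) ?? nowMs;
    const newTat = Math.max(tat, nowMs) + this.emissionIntervalMs;
    // Reject if the request is earlier than the tolerance permits; state is left untouched.
    if (newTat - this.toleranceMs > nowMs) return false;
    this.tatByKey.set(key, newTat);
    return true;
  }
}

// 10 requests per second, paced at one per 100 ms, with bursts of up to 3.
const gcraDemo = new GcraSketch(10, 1_000, 3);
const burstResults = [0, 1, 2, 3, 4].map(() => gcraDemo.allow("user-1", 5_000));
const laterResults = [0, 1].map(() => gcraDemo.allow("user-1", 5_100));
console.log(burstResults); // [ true, true, true, false, false ]
console.log(laterResults); // [ true, false ]
```

After the burst of three is spent, capacity returns one request per emission interval rather than all at once at a window boundary, which is the smooth pacing described above.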

Composing strategies

For tiered burst + sustained limits (e.g. 10/sec and 1000/hour), compose two GCRA limiters and allow only if both pass — or use multi-dimensional limits to evaluate several axes in a single round trip.
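
One way to read "allow only if both pass" is a check-then-commit across all tiers, so that a request rejected by one tier does not consume capacity from another. The sketch below is self-contained and hypothetical: `PacedLimit` and `allowAll` are illustrative names, not throttlekit's API, and the limits are deliberately small (2 per second and 3 per 10 seconds) so the behaviour is visible in a few calls.

```typescript
// Minimal single-key GCRA with a two-step check (hypothetical; for illustration only).
class PacedLimit {
  private readonly intervalMs: number;
  private readonly toleranceMs: number;
  private tat = 0;

  constructor(limit: number, periodMs: number, burst: number = limit) {
    this.intervalMs = periodMs / limit;
    this.toleranceMs = this.intervalMs * burst;
  }

  // Returns the TAT that would be stored if the request were allowed, or null to reject.
  peek(nowMs: number): number | null {
    const next = Math.max(this.tat, nowMs) + this.intervalMs;
    return next - this.toleranceMs > nowMs ? null : next;
  }

  // Records a TAT previously returned by peek().
  commit(nextTat: number): void {
    this.tat = nextTat;
  }
}

// Allows the request only if every tier allows it; commits to all tiers or to none.
function allowAll(limits: PacedLimit[], nowMs: number): boolean {
  const pending = limits.map((limit) => limit.peek(nowMs));
  if (pending.some((p) => p === null)) return false;
  limits.forEach((limit, i) => limit.commit(pending[i] as number));
  return true;
}

const burstTier = new PacedLimit(2, 1_000);      // 2 per second
const sustainedTier = new PacedLimit(3, 10_000); // 3 per 10 seconds
const tiers = [burstTier, sustainedTier];

const atZero = [0, 1, 2].map(() => allowAll(tiers, 0));
const atOneSecond = [0, 1].map(() => allowAll(tiers, 1_000));
console.log(atZero);      // [ true, true, false ]  third call hits the per-second tier
console.log(atOneSecond); // [ true, false ]        second call hits the 10-second tier
```

The third call at time 0 is rejected by the per-second tier without touching the sustained tier, which is why one more request still fits at the 1-second mark. This in-process version is only atomic because it runs on a single thread; across a shared store the check and commit would need to happen in one atomic operation.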
