
Performance

Ameya Borkar edited this page Jun 10, 2026 · 4 revisions

All numbers are reproducible on your own hardware via npm run bench (and npm run bench:compare for the head-to-head). They were measured on the dev machine described below, are not vendor claims, and vary by roughly ±10% from run to run. The Redis and Postgres latencies were taken against Docker on Windows (WSL2, NAT-bound) and are dominated by that network path, so read the cross-library relative numbers and the round-trip counts as the signal, not the absolute microseconds. The canonical performance document is BENCH.md, which labels each algorithm and gives the methodology, machine spec, and per-tier p50/p99; every case where ThrottleKit loses is listed in SCOREBOARD.md.

In-process, single hot key

Measured 2026-05-31 (1.0.0) on an AMD Ryzen AI 9 HX 370 (24 threads), Node v24.13.1; reproducible via npm run bench:

  • checkSync (GCRA): ~5.9M ops/s, 169 ns/op, ~1 B/op (≈allocation-free).
  • check (async, GCRA): ~3.3M ops/s (~300 ns/op).
  • Token bucket checkSync: ~5.5M ops/s; fixed window checkSync: ~5.2M ops/s.
  • Redis: exactly one EVALSHA round trip per check.
  • Tier-2 Fleet lease: spending a held lease (LeaseSpender.spend) is ~10 ns/op — a local counter decrement, no network. The Fleet.Reserve lease amortizes the per-request server round trip across a whole batch (a twoTier(leased) Redis batch-100 sustains ~66.4k ops/s end-to-end over the Docker-on-Windows path). See Scaling & the Fleet.
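
The lease amortization described in the last bullet can be sketched as follows. This is a minimal illustration under stated assumptions, not ThrottleKit's implementation: LocalLease, the reserve callback, and the in-memory budget are hypothetical names, and the callback is synchronous here for brevity where a real store call would be asynchronous.

```typescript
// Hypothetical sketch: none of these names are ThrottleKit's API.
// `Reserve` stands in for one round trip to the shared store and
// returns how many tokens were granted (0..batch).
type Reserve = (batch: number) => number;

class LocalLease {
  private remaining = 0;
  private readonly reserve: Reserve;
  private readonly batch: number;
  roundTrips = 0;

  constructor(reserve: Reserve, batch: number) {
    this.reserve = reserve;
    this.batch = batch;
  }

  // Spend one token. The hot path is a local counter decrement; the
  // store is contacted only when the held lease is exhausted.
  spend(): boolean {
    if (this.remaining === 0) {
      this.roundTrips++;
      this.remaining = this.reserve(this.batch);
      if (this.remaining === 0) return false; // the store granted nothing
    }
    this.remaining--;
    return true;
  }
}

// Stand-in for the shared store: a fixed budget of 1,000 tokens.
let budget = 1000;
const lease = new LocalLease((n: number): number => {
  const granted = Math.min(n, budget);
  budget -= granted;
  return granted;
}, 100);

let allowed = 0;
for (let i = 0; i < 1000; i++) {
  if (lease.spend()) allowed++;
}

console.log(allowed, lease.roundTrips);
```

With a batch of 100, the 1,000 spends above cost 10 calls to the store instead of 1,000, which is the effect the lease is meant to have on the per-request round trip.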

Head-to-head, the honest version

npm run bench:compare — same machine, process, warmup, and iteration count; all on the allow path. The algorithm each library actually implements is labelled (a fixed-window counter and a GCRA cell are not the same guarantee even at equal ops/s).
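
To make "same warmup and iteration count" concrete, here is a rough sketch of a timing loop that reports both units used on this page. It is not the repo's harness: bench, toNsPerOp, and the injectable clock are assumptions made for illustration, and the default clock (Date.now) is coarse, so a real harness would use a higher-resolution timer.

```typescript
interface BenchResult {
  opsPerSec: number;
  nsPerOp: number;
}

// Hypothetical helper. `now` returns milliseconds and is injectable so the
// arithmetic can be checked deterministically.
function bench(
  fn: () => void,
  warmup: number,
  iterations: number,
  now: () => number = Date.now,
): BenchResult {
  for (let i = 0; i < warmup; i++) fn(); // untimed warmup so the JIT can settle
  const start = now();
  for (let i = 0; i < iterations; i++) fn();
  const elapsedMs = now() - start;
  return {
    opsPerSec: (iterations * 1000) / elapsedMs,
    nsPerOp: (elapsedMs * 1e6) / iterations,
  };
}

// Converts a throughput figure into the per-operation latency it implies.
function toNsPerOp(opsPerSec: number): number {
  return 1e9 / opsPerSec;
}

// Deterministic run with a fake clock that advances 2 ms per reading.
let fakeTime = 0;
const fakeNow = (): number => {
  const t = fakeTime;
  fakeTime += 2;
  return t;
};

let calls = 0;
const result = bench(() => { calls++; }, 100, 1000, fakeNow);
const impliedNs = Math.round(toNsPerOp(5.9e6));

console.log(result, calls, impliedNs);
```

Passing the same warmup and iterations to every contender keeps the comparison like-for-like. The conversion helper also shows why ~5.9M ops/s and 169 ns/op are the same measurement expressed two ways.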

  • Sync: ThrottleKit is one of the few JS limiters with a synchronous API at all, and that path is effectively allocation-free (~1 B/op).
  • Redis (loopback): level with rate-limiter-flexible on throughput and p50 (both one atomic Lua round trip), with a tighter tail (cached EVALSHA + a leaner script).
  • Async in-memory: ThrottleKit edges past rate-limiter-flexible (~3.3M vs ~3.0M ops/s); express-rate-limit's bare Map counter stays ~1.5× faster (~5.0M) — the cost of returning a full Decision over a bounded-memory GCRA / timing-wheel store. All contenders are far past real-world per-process need.
  • Postgres: a single bare check trails rate-limiter-flexible's one-statement upsert by ~2.9×. This is by design: PostgresStore runs one generic transaction per strategy so the same proven transform drives every backend. Under load, twoTier(leased) over Postgres spreads one transaction across a whole batch of requests, which turns into a ~35× throughput win (12.3k vs 348 ops/s).

Why "where it loses" is in the docs

A rate limiter's value is its correctness guarantee, not its microbenchmark. ThrottleKit publishes the cases where a leaner counter beats it precisely because the trade is deliberate: a bounded-memory GCRA cell with a smooth pacing guarantee and a proven distributed bound is worth a few hundred nanoseconds against a plain counter that offers neither. The benchmark harness is in the repo so you can confirm all of this on your own hardware.
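
The difference in guarantee can be seen in a small sketch that compares a textbook GCRA cell with a fixed-window counter at a window boundary. Both classes are illustrative assumptions written for this page, not ThrottleKit's code, and the GCRA cell assumes the allowed burst equals the limit.

```typescript
// Textbook GCRA: the only state is one number, the theoretical arrival time.
class GcraCell {
  private tat = 0;
  private readonly interval: number;  // ms between requests at the steady rate
  private readonly tolerance: number; // how far ahead of schedule a burst may run

  constructor(limit: number, periodMs: number) {
    this.interval = periodMs / limit;
    this.tolerance = this.interval * (limit - 1);
  }

  check(nowMs: number): boolean {
    if (nowMs < this.tat - this.tolerance) return false; // too far ahead of schedule
    this.tat = Math.max(nowMs, this.tat) + this.interval;
    return true;
  }
}

// Plain counter that resets at every window boundary.
class FixedWindow {
  private window = -1;
  private count = 0;
  private readonly limit: number;
  private readonly periodMs: number;

  constructor(limit: number, periodMs: number) {
    this.limit = limit;
    this.periodMs = periodMs;
  }

  check(nowMs: number): boolean {
    const w = Math.floor(nowMs / this.periodMs);
    if (w !== this.window) {
      this.window = w;
      this.count = 0;
    }
    if (this.count >= this.limit) return false;
    this.count++;
    return true;
  }
}

function countAllowed(limiter: { check(nowMs: number): boolean }, times: number[]): number {
  return times.filter((t) => limiter.check(t)).length;
}

// 10 requests just before a window boundary, then 10 just after it.
const times: number[] = [
  ...Array<number>(10).fill(999),
  ...Array<number>(10).fill(1000),
];

const gcraAllowed = countAllowed(new GcraCell(10, 1000), times);
const fixedAllowed = countAllowed(new FixedWindow(10, 1000), times);

console.log(gcraAllowed, fixedAllowed);
```

With a limit of 10 per second, the fixed window admits all 20 requests within 1 ms because its counter resets at the boundary, while the GCRA cell admits the first 10 and then paces further requests at one per 100 ms.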
