Advanced Limiting
Multi-dimensional limits and the two backpressure/shaping primitives.
Limit per IP, per user, and per route at once. all({...}) allows a request only if every dimension allows it, and consumes nothing unless all of them allow (no partial consume). any({...}) allows a request if any dimension permits it. Pass the result to multiRateLimit (not rateLimit):
import { all, gcra, fixedWindow, multiRateLimit } from "throttlekit";
interface Ctx { ip: string; userId: string; route: string }
const limiter = multiRateLimit<Ctx>({
  store, // on Redis, all dimensions fuse into one atomic Lua round trip
  strategy: all<Ctx>({
    ip: { key: (c) => c.ip, strategy: gcra({ limit: 100, periodMs: 60_000 }) },
    user: { key: (c) => c.userId, strategy: gcra({ limit: 1000, periodMs: 60_000 }) },
    route: { key: (c) => c.route, strategy: fixedWindow({ limit: 50, windowMs: 1_000 }) },
  }),
});
const d = await limiter.check({ ip, userId, route: "/search" });
The returned Decision reflects the binding constraint (the denying dimension, or the smallest remaining when allowed). Dimensions support per-dimension cost. On Redis, multi-dimensional checks support gcra, tokenBucket, and fixedWindow. See examples/multi-dimensional.ts.
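To make the no-partial-consume rule concrete, here is a minimal in-memory sketch of an all-style check. It is not ThrottleKit's implementation: `Counter`, `SketchDecision`, and `checkAll` are hypothetical names, and simple counters stand in for the real strategies. The idea is to peek at every dimension first and consume only once all of them have allowed.

```typescript
// Hypothetical sketch of "all dimensions or nothing"; not the library's code.
interface Counter { limit: number; used: number }

interface SketchDecision {
  allowed: boolean;
  dimension: string; // the binding dimension
  remaining: number; // capacity left on the binding dimension
}

function checkAll(dims: Record<string, Counter>, cost = 1): SketchDecision {
  let binding: SketchDecision | undefined;
  // Pass 1: inspect every dimension without consuming anything.
  for (const [name, c] of Object.entries(dims)) {
    const remainingAfter = c.limit - c.used - cost;
    if (remainingAfter < 0) {
      // A denying dimension binds; no counter has been touched yet.
      return { allowed: false, dimension: name, remaining: c.limit - c.used };
    }
    if (!binding || remainingAfter < binding.remaining) {
      binding = { allowed: true, dimension: name, remaining: remainingAfter };
    }
  }
  if (!binding) throw new Error("at least one dimension is required");
  // Pass 2: every dimension allowed, so consume from all of them.
  for (const c of Object.values(dims)) c.used += cost;
  return binding;
}

const ipCounter: Counter = { limit: 2, used: 0 };
const userCounter: Counter = { limit: 5, used: 0 };
const sketchDims = { ip: ipCounter, user: userCounter };

const first = checkAll(sketchDims);  // allowed; ip is the tightest dimension
const second = checkAll(sketchDims); // allowed; ip is now exhausted
const third = checkAll(sketchDims);  // denied by ip; user is not charged
```

After the denied third call, `userCounter.used` is still 2, which is the property the text calls no partial consume. A real store would need to do both passes atomically, which is what the single Lua round trip on Redis is described as providing.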
Adaptive concurrency is not a rate limit: it is a dynamically inferred ceiling on in-flight requests, derived from the latency gradient (RTT_noload / RTT_actual) and adjusted with a congestion-control sawtooth.
import { adaptiveConcurrency } from "throttlekit";
const guard = adaptiveConcurrency({ minLimit: 4, maxLimit: 512, algorithm: "gradient2" });
const lease = guard.acquire();
if (!lease.ok) return; // over the inferred ceiling — shed load (e.g. 503)
try {
  await handle(request);
} finally {
  lease.release(); // latency measured automatically; pass { dropped: true } on a failure/timeout
}
Introspect with guard.limit, guard.inflight, guard.stats(). Algorithms: "gradient2" (default) or "aimd". See examples/adaptive-concurrency.ts. For a single ceiling shared across a fleet (so N nodes don't each infer the whole backend's capacity), use distributedAdaptiveConcurrency.
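The latency-gradient idea can be illustrated with a simplified update rule. This is a sketch under stated assumptions, not the library's "gradient2" algorithm: `nextLimit`, the `headroom` term, and the 0.5 floor on the gradient are all hypothetical choices made for illustration. The limit creeps up additively while latency stays at the no-load baseline and is cut multiplicatively when latency rises, which produces the sawtooth shape.

```typescript
// Hypothetical gradient-style limit update; parameters are illustrative only.
interface GradientOptions {
  minLimit: number;
  maxLimit: number;
  headroom: number; // assumed additive probe term, lets the limit grow when latency is flat
}

function nextLimit(
  limit: number,
  rttNoLoadMs: number,
  rttActualMs: number,
  opts: GradientOptions,
): number {
  // gradient = RTT_noload / RTT_actual; below 1 means latency has risen.
  // Clamp to [0.5, 1] so one slow sample cannot collapse the limit.
  const gradient = Math.max(0.5, Math.min(1, rttNoLoadMs / rttActualMs));
  const proposed = limit * gradient + opts.headroom;
  // Keep the result inside the configured bounds.
  return Math.max(opts.minLimit, Math.min(opts.maxLimit, Math.floor(proposed)));
}

const gradientOpts: GradientOptions = { minLimit: 4, maxLimit: 512, headroom: 2 };

const grown = nextLimit(100, 10, 10, gradientOpts);  // latency flat: 100 * 1 + 2
const shrunk = nextLimit(100, 10, 40, gradientOpts); // latency 4x: gradient clamps to 0.5
const capped = nextLimit(512, 10, 10, gradientOpts); // cannot exceed maxLimit
```

Here `grown` is 102, `shrunk` is 52, and `capped` stays at 512. Production algorithms typically smooth the RTT samples before computing the gradient; this sketch uses raw values to keep the arithmetic visible.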
leakyBucket builds a Shaper that delays rather than rejects, smoothing bursty input to a steady output rate — handy for pacing outbound calls to a third-party budget.
import { leakyBucket, QueueFullError } from "throttlekit";
const shaper = leakyBucket({ ratePerSec: 5, maxQueueMs: 2_000 });
try {
  await shaper.schedule("upstream-api"); // resolves after the paced delay
  await callUpstream();
} catch (err) {
  if (err instanceof QueueFullError) console.warn("queue full, retry in", err.retryAfterMs, "ms");
  else throw err; // rethrow anything that is not a full queue
}
reserve(key, cost?) returns a Reservation ({ accepted, delayMs }) without sleeping; schedule waits or throws QueueFullError. reserveSync is available on a sync store. See examples/leaky-bucket.ts.
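The relationship between ratePerSec, maxQueueMs, and the delay in a reservation can be sketched with a small in-memory shaper. This is an illustration under assumptions, not ThrottleKit's implementation: `makeLeakyBucketSketch` and `SketchReservation` are hypothetical, the clock is passed in explicitly so the arithmetic is deterministic, and on rejection the sketch reports the delay that would have been required.

```typescript
// Hypothetical leaky-bucket shaper sketch; not the library's code.
interface SketchReservation { accepted: boolean; delayMs: number }

function makeLeakyBucketSketch(ratePerSec: number, maxQueueMs: number) {
  const intervalMs = 1000 / ratePerSec;         // spacing between paced slots
  const nextFreeAt = new Map<string, number>(); // per-key time of the next free slot

  return {
    // Computes the paced delay without sleeping.
    reserve(key: string, nowMs: number, cost = 1): SketchReservation {
      const slot = Math.max(nowMs, nextFreeAt.get(key) ?? nowMs);
      const delayMs = slot - nowMs;
      if (delayMs > maxQueueMs) {
        // Queue is too deep: reject and leave the schedule unchanged.
        return { accepted: false, delayMs };
      }
      nextFreeAt.set(key, slot + cost * intervalMs);
      return { accepted: true, delayMs };
    },
  };
}

// 5 per second gives 200 ms spacing; wait at most 500 ms in the queue.
const sketchShaper = makeLeakyBucketSketch(5, 500);
const r1 = sketchShaper.reserve("upstream-api", 0); // immediate
const r2 = sketchShaper.reserve("upstream-api", 0); // paced 200 ms later
const r3 = sketchShaper.reserve("upstream-api", 0); // paced 400 ms later
const r4 = sketchShaper.reserve("upstream-api", 0); // would need 600 ms: rejected
```

A burst of four calls at the same instant yields delays of 0, 200, and 400 ms, and the fourth is rejected because it would wait longer than the 500 ms queue bound. A schedule-style wrapper would sleep for `delayMs` on acceptance and throw on rejection.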
ThrottleKit · MIT · 1.0 — API frozen under SemVer (Stability)