Monitoring and the Lens
ThrottleKit Lens is a built-in, zero-dependency, read-only monitoring dashboard. It gives every limiter the full operational board — and one view no other rate-limiter dashboard can render: live binding-axis attribution.
throttlekit-lens is @experimental: it lives outside ThrottleKit's 1.x SemVer freeze (its surface and snapshot shapes may change in a minor release). It builds on the core's @experimental admissionTap / withAdmissionAnalytics primitives and needs throttlekit >= 1.1.0.
Already emitting OpenTelemetry? Keep doing that — the Lens doesn't replace your metrics backend (see Operations for the OTel layer). The Lens is the no-backend, see-it-now view, and it carries a signal Prometheus/Grafana structurally can't: which axis bound each denial.
Most rate-limit dashboards can tell you that requests were denied. Because ThrottleKit composes rate × concurrency × cost in a single unifiedAdmission decision, the Lens can tell you which constraint actually bound each denial — rate, concurrency, cost, or the joint-LP policy lane — as a live Sankey from policy → binding lane → top-denied keys. Click a denial and a drawer shows the exact per-axis Decision (remaining / limit / retryAfterMs / resetAt) that produced it: the literal "why was this throttled, with numbers."
This is structural, not a UI trick. bindingAxis is minted end-to-end inside unified admission. As of 1.2.0 you can export it to Grafana as an aggregate counter — throttlekit.denies_by_axis{lane}, via instrumentAdmitter (the deliberate escape hatch; see Operations and docs/METRICS.md). The Lens goes further than any counter can: a live, per-key view that needs no observability backend and carries the exact per-axis Decision (remaining / limit / retryAfter) a metric can't — per-key axis numbers would be unbounded label cardinality.
The axis lane is the premium layer for unified-admission users; the dashboard itself works for everyone. A plain rateLimit(), quota, twoTier, token-budget, or concurrency guard gets the whole board plus "why throttled" attribution by policy/limiter + hot key — sourced from the same tapDecisions stream and withAnalytics top-K that already ship in the core. A single-axis limiter simply has one axis, so there is nothing to decompose; the Lens says so rather than implying otherwise.
Register what you already use with a hub; it returns tapped wrappers to use in their place. The taps are synchronous, exception-swallowing, and O(1) — the dashboard can never perturb your control path or change a decision.
import { createLensHub, lensHandler } from "throttlekit-lens";
import { rateLimit, gcra, unifiedAdmission } from "throttlekit";
const hub = createLensHub();
const api = hub.trackLimiter("api", rateLimit({ strategy: gcra({ limit: 100, periodMs: 60_000 }) }));
// `rate`, `concurrency`, and `cost` are your existing per-axis configurations (elided here).
const checkout = hub.trackAdmitter("checkout", unifiedAdmission({ rate, concurrency, cost }));
// Express — mount the read-only handler at a private path, behind a token:
const handler = lensHandler(hub, { basePath: "/__throttlekit", token: process.env.LENS_TOKEN });
app.use("/__throttlekit", (req, res) => handler(req, res));
// ...then use `api` / `checkout` exactly like the originals.

import { createLensHub, serveLens } from "throttlekit-lens";
const hub = createLensHub();
// ...track your limiters / admitters / guards...
const lens = await serveLens(hub, { port: 9090 }); // loopback by default
console.log(`Lens at ${lens.url}`);

throttlekit-server serves the Lens by default, bound to loopback. It needs no code and works for Python, Go, or any-language clients too (it watches the server's decisions, whatever drives them):
throttlekit-server --config .throttlekit.yaml
# → gRPC on :50051 + Lens dashboard on http://127.0.0.1:9090

--lens off disables it; --lens-host / --lens-port move or expose it (a non-loopback host warns and wants a --lens-token); --lens-aggregator <url> pushes this node's snapshot to a fleet aggregator.
- Admission attribution (hero) — the policy → binding-lane → top-denied-keys Sankey + a stacked-area deny-rate-by-axis timeline + the click-to-snapshot drawer.
- Throughput & outcome — requests/s split allow/deny.
- Deny rate — overall, by policy/limiter for everyone, and the per-axis split for unified admitters.
- Top keys/tenants — Space-Saving top-K heavy hitters (requested + denied), each row carrying a policy (and binding-axis) chip.
- Latency — admit-/check-path latency over a small ring.
- Concurrency health — limit vs inflight, no-load RTT, share / lGlobal / nodes, the fenced flag, and a live self-fence event feed.
- Guarantee panel (below).
- Store / fleet health — backend, reachability, fail-mode (and, in server mode, lease-table size + reclaim count).
ThrottleKit's headline is a machine-checked, fleet-size-independent overshoot bound. The Guarantee panel makes it operational: per key, live admitted-this-window vs the design-time-computed ceiling Limit + N·(B−1) (collapsing to exactly Limit under windowCoupled), rendered as headroom to a proven line — plus live Σinflight ≤ L PASS/FAIL chips, each linking to the TLA⁺ spec.
It is headroom-to-a-known-line, not a "proof is holding" needle: the bound and invariants are design-time proofs (a static badge + chips that link to the spec), and approaching the line is real model-predicted overshoot or a misconfiguration — never a claim that the Lens is continuously model-checking.
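The ceiling arithmetic behind the panel can be sketched in a few lines. This is an illustration only, not Lens code: the helper names are hypothetical, and reading N as the node count and B as a per-node batch size is an assumption, since this page does not define either symbol.

```typescript
// Hypothetical helpers, not part of throttlekit-lens.
interface CeilingInput {
  limit: number;          // the configured Limit for the window
  n: number;              // N in the formula (assumed here: number of nodes)
  b: number;              // B in the formula (assumed here: per-node batch size)
  windowCoupled: boolean; // the page states the ceiling collapses to Limit when set
}

// Design-time ceiling: Limit + N * (B - 1), or exactly Limit under windowCoupled.
function overshootCeiling(input: CeilingInput): number {
  if (input.windowCoupled) return input.limit;
  return input.limit + input.n * (input.b - 1);
}

// Headroom is the distance from the live admitted count to the ceiling.
// A negative value would mean the count has crossed the computed line.
function headroom(admittedThisWindow: number, input: CeilingInput): number {
  return overshootCeiling(input) - admittedThisWindow;
}

const cfg: CeilingInput = { limit: 100, n: 4, b: 10, windowCoupled: false };
console.log(overshootCeiling(cfg));                             // 100 + 4 * 9 = 136
console.log(headroom(120, cfg));                                // 136 - 120 = 16
console.log(overshootCeiling({ ...cfg, windowCoupled: true })); // 100
```

The sketch only evaluates the formula against a count; like the panel, it checks nothing about the proof itself.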
Each instance serves its own per-process Lens. For a fleet-global view, point every node at one aggregator — it merges additive counters (allow/deny, per-lane denials) across nodes and re-tops the heavy hitters, into a single mode:"fleet" snapshot the same UI renders:
import { createLensAggregator, serveLensAggregator, pushSnapshots } from "throttlekit-lens";
// Aggregator host:
const agg = createLensAggregator();
await serveLensAggregator(agg, { port: 9091, token: process.env.LENS_TOKEN });
// Each node (or `throttlekit-server --lens-aggregator http://aggregator:9091`):
pushSnapshots(hub, { url: "http://aggregator:9091", token: process.env.LENS_TOKEN });

Best-effort and eventually-consistent: the fleet view reflects the last snapshot each live node pushed (stale nodes are evicted). The top-K merge is an honest additive merge of the per-node Space-Saving lists; it never drops a true fleet heavy hitter.
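To make the additive merge concrete, here is a minimal sketch of summing per-node counters and re-ranking the heavy hitters. The NodeSnapshot shape and the mergeSnapshots function are hypothetical stand-ins; the real snapshot format is experimental and is not specified on this page.

```typescript
// Hypothetical snapshot shape, for illustration only.
interface NodeSnapshot {
  allow: number;
  deny: number;
  deniedByLane: Record<string, number>;
  topDenied: Array<{ key: string; count: number }>;
}

function mergeSnapshots(nodes: NodeSnapshot[], k: number) {
  let allow = 0;
  let deny = 0;
  const deniedByLane: Record<string, number> = {};
  const perKey = new Map<string, number>();

  for (const snap of nodes) {
    // Additive counters simply sum across nodes.
    allow += snap.allow;
    deny += snap.deny;
    for (const [lane, count] of Object.entries(snap.deniedByLane)) {
      deniedByLane[lane] = (deniedByLane[lane] ?? 0) + count;
    }
    // Per-key counts from each node's top-K list are summed as well.
    for (const { key, count } of snap.topDenied) {
      perKey.set(key, (perKey.get(key) ?? 0) + count);
    }
  }

  // Re-rank the merged keys and keep the k largest (ties broken by key).
  const topDenied = [...perKey.entries()]
    .map(([key, count]) => ({ key, count }))
    .sort((a, b) => b.count - a.count || a.key.localeCompare(b.key))
    .slice(0, k);

  return { mode: "fleet" as const, allow, deny, deniedByLane, topDenied };
}

const fleet = mergeSnapshots(
  [
    { allow: 90, deny: 10, deniedByLane: { rate: 7, concurrency: 3 },
      topDenied: [{ key: "tenant-a", count: 6 }, { key: "tenant-b", count: 4 }] },
    { allow: 50, deny: 20, deniedByLane: { rate: 5, cost: 15 },
      topDenied: [{ key: "tenant-b", count: 12 }, { key: "tenant-c", count: 8 }] },
  ],
  2,
);
console.log(fleet.topDenied); // tenant-b (16), then tenant-c (8)
```

A key that is moderately hot on every node, such as tenant-b here, can rise to the top of the fleet view even if it leads on no single node, which is why the lists are summed before re-ranking.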
- On by default only on throttlekit-server, and only on loopback. The embedded handler is opt-in (you choose where to mount it; it's your process); the sidecar binds to 127.0.0.1 by default.
- Exposing beyond loopback wants auth. A non-loopback host without TLS or a token logs a loud warning (mirroring the insecure-gRPC warning). Pass { token } (constant-time-compared Authorization: Bearer …) and/or { tls } (HTTPS, or mTLS with a caPath).
- Strictly read-only. Only GET is served; there are no mutation endpoints. The board exposes keys/tenants, so on a shared host loopback isn't private, hence the off switch (--lens off) and the expose-requires-auth rule.
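For readers unfamiliar with constant-time token comparison, the idea can be sketched with Node's built-in crypto module. This is not the Lens implementation; bearerMatches is a hypothetical helper, and hashing both sides before comparing is one common way to satisfy timingSafeEqual's equal-length requirement.

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

// Hypothetical helper: checks an Authorization header against an expected token.
function bearerMatches(authorization: string | undefined, expectedToken: string): boolean {
  const prefix = "Bearer ";
  if (!authorization || !authorization.startsWith(prefix)) return false;
  const presented = authorization.slice(prefix.length);

  // timingSafeEqual throws if the buffers differ in length, so hash both
  // values to fixed-size digests first. The comparison time then does not
  // depend on where the first differing byte occurs.
  const a = createHash("sha256").update(presented).digest();
  const b = createHash("sha256").update(expectedToken).digest();
  return timingSafeEqual(a, b);
}

console.log(bearerMatches("Bearer s3cret", "s3cret")); // true
console.log(bearerMatches("Bearer wrong", "s3cret"));  // false
console.log(bearerMatches(undefined, "s3cret"));       // false
```

A plain `===` on strings can return early at the first mismatching character, which is the timing signal a constant-time comparison is meant to remove.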
- The binding-axis lane needs unifiedAdmission; a single-axis rateLimit() has nothing to decompose (the board still works; it just shows policy/key attribution).
- Numbers are eventually-consistent and per-window; top-K is Space-Saving (over-estimates, never misses a true heavy hitter).
- The Guarantee panel is live overshoot by accumulation vs a computed line, rendered as headroom; it is not a native admitted_this_window field, and not a live model-checker.
- This is not bit-exact replay; the Lens streams live decisions. Reproducing a past incident exactly is a future direction (deterministic replay against a candidate policy).
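The "over-estimates, never misses" behaviour of Space-Saving is easier to see in code. The class below is a textbook-style sketch of the algorithm, not ThrottleKit's implementation: the class name and API are hypothetical, and it finds the minimum with a linear scan for brevity where production implementations typically use a constant-time structure.

```typescript
// Minimal Space-Saving sketch: tracks at most `capacity` keys.
class SpaceSavingSketch {
  private readonly capacity: number;
  private readonly counts = new Map<string, number>();

  constructor(capacity: number) {
    this.capacity = capacity;
  }

  offer(key: string): void {
    const current = this.counts.get(key);
    if (current !== undefined) {
      this.counts.set(key, current + 1);
      return;
    }
    if (this.counts.size < this.capacity) {
      this.counts.set(key, 1);
      return;
    }
    // Table is full: evict the key with the smallest count. The newcomer
    // inherits that count plus one, so estimates can only over-count.
    let minKey = "";
    let minCount = Infinity;
    for (const [k, c] of this.counts) {
      if (c < minCount) {
        minKey = k;
        minCount = c;
      }
    }
    this.counts.delete(minKey);
    this.counts.set(key, minCount + 1);
  }

  top(k: number): Array<{ key: string; count: number }> {
    return [...this.counts.entries()]
      .map(([key, count]) => ({ key, count }))
      .sort((a, b) => b.count - a.count)
      .slice(0, k);
  }
}

const sketch = new SpaceSavingSketch(2);
for (const key of ["a", "a", "a", "b", "c"]) sketch.offer(key);
console.log(sketch.top(2)); // a with count 3, c with count 2 (c's true count is 1)
```

In the usage above, "c" replaces "b" and inherits its count, so its estimate (2) exceeds its true count (1), while the genuinely hot key "a" is never evicted.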
Two @experimental core primitives, both zero-dep and read-only:
- admissionTap(admitter, onAdmission) — the multi-axis sibling of tapDecisions; fires once per completed unified admission with the combined decision, the binding axis, and the per-axis snapshot.
- withAdmissionAnalytics(admitter, opts) — the lane-segmented fork of withAnalytics: allow/deny counters and Space-Saving top-K, partitioned by binding lane (with Σ deniedByLane === denied).
A plain rateLimit() feeds the universal board through the existing tapDecisions + withAnalytics; a unifiedAdmission additionally lights up the axis lane through these two.
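The lane-partitioning invariant (Σ deniedByLane === denied) follows from attributing each denial to exactly one binding lane. A minimal sketch, assuming hypothetical lane identifiers and a hypothetical LaneCounters class rather than the actual withAdmissionAnalytics internals:

```typescript
// Lane names are assumed for illustration; the page names the lanes as
// rate, concurrency, cost, and the joint-LP policy lane.
type Lane = "rate" | "concurrency" | "cost" | "policy";
type Outcome = { allowed: true } | { allowed: false; bindingAxis: Lane };

class LaneCounters {
  allowed = 0;
  denied = 0;
  deniedByLane: Record<Lane, number> = { rate: 0, concurrency: 0, cost: 0, policy: 0 };

  record(outcome: Outcome): void {
    if (outcome.allowed) {
      this.allowed += 1;
      return;
    }
    // Every denial increments the total and exactly one lane,
    // which is what keeps the per-lane sum equal to `denied`.
    this.denied += 1;
    this.deniedByLane[outcome.bindingAxis] += 1;
  }

  invariantHolds(): boolean {
    const sum = Object.values(this.deniedByLane).reduce((a, b) => a + b, 0);
    return sum === this.denied;
  }
}

const counters = new LaneCounters();
counters.record({ allowed: true });
counters.record({ allowed: false, bindingAxis: "rate" });
counters.record({ allowed: false, bindingAxis: "cost" });
counters.record({ allowed: false, bindingAxis: "rate" });
console.log(counters.deniedByLane.rate, counters.denied, counters.invariantHolds()); // 2 3 true
```

Encoding the binding axis in the denied variant of the type means a denial without a lane cannot be recorded, so the invariant holds by construction in this sketch.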
See also: Operations (OTel, the analytics tap, headers, failure modes) · Unified admission (the rate × concurrency × cost decision the axis lane attributes) · Distributed & provable (the bound the Guarantee panel tracks) · the package README.
ThrottleKit · MIT · 1.0 — API frozen under SemVer (Stability)