Monitoring and the Lens

ThrottleKit Lens is a built-in, zero-dependency, read-only monitoring dashboard. It gives every limiter the full operational board — and one view no other rate-limiter dashboard can render: live binding-axis attribution.

throttlekit-lens is @experimental: it lives outside ThrottleKit's 1.x SemVer freeze, so its surface and snapshot shapes may change in a minor release. It builds on the core's @experimental admissionTap / withAdmissionAnalytics primitives and requires throttlekit >= 1.1.0.

Already emitting OpenTelemetry? Keep doing that — the Lens doesn't replace your metrics backend (see Operations for the OTel layer). The Lens is the no-backend, see-it-now view, and it carries a signal Prometheus/Grafana structurally can't: which axis bound each denial.

The hero — which axis is throttling you, right now

Most rate-limit dashboards can tell you that requests were denied. Because ThrottleKit composes rate × concurrency × cost in a single unifiedAdmission decision, the Lens can tell you which constraint actually bound each denial — rate, concurrency, cost, or the joint-LP policy lane — as a live Sankey from policy → binding lane → top-denied keys. Click a denial and a drawer shows the exact per-axis Decision (remaining / limit / retryAfterMs / resetAt) that produced it: the literal "why was this throttled, with numbers."
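To make the drawer's "why was this throttled, with numbers" concrete, here is a minimal sketch of the data it would need. The field names remaining, limit, retryAfterMs, and resetAt come from the text above. The type names, the explainDenial helper, and the choice of epoch milliseconds for resetAt are assumptions for illustration, not the throttlekit-lens API. The joint-LP policy lane is left out to keep the sketch small.

```typescript
// Hypothetical shapes for illustration; not the library's exported types.
type Axis = "rate" | "concurrency" | "cost";

interface AxisDecision {
  remaining: number;
  limit: number;
  retryAfterMs: number;
  resetAt: number; // assumed to be epoch milliseconds
}

interface DenialSnapshot {
  key: string;
  bindingAxis: Axis; // the axis that actually bound this denial
  perAxis: Record<Axis, AxisDecision>;
}

// Turn a snapshot into a one-line explanation using the binding axis's numbers.
function explainDenial(s: DenialSnapshot): string {
  const d = s.perAxis[s.bindingAxis];
  return `${s.key} denied by ${s.bindingAxis}: ${d.remaining}/${d.limit} remaining, retry in ${d.retryAfterMs} ms`;
}

const snapshot: DenialSnapshot = {
  key: "tenant-42",
  bindingAxis: "concurrency",
  perAxis: {
    rate: { remaining: 37, limit: 100, retryAfterMs: 0, resetAt: 1_700_000_060_000 },
    concurrency: { remaining: 0, limit: 8, retryAfterMs: 250, resetAt: 1_700_000_000_250 },
    cost: { remaining: 900, limit: 1000, retryAfterMs: 0, resetAt: 1_700_000_060_000 },
  },
};

const message = explainDenial(snapshot);
console.log(message);
```

The point of the shape is that a denial carries every axis's numbers, not only the binding one, so the drawer can show how close the other axes were.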

This is structural, not a UI trick. bindingAxis is minted end-to-end inside unified admission but is exposed only as an OpenTelemetry span attribute, never as a metric label, so no Prometheus/Grafana board can break denials down by axis. The Lens is a purpose-built live view that needs no observability backend and carries the numeric per-axis decision the span omits. (The honest closest analog with your existing tools is faceting on the throttlekit.binding_axis span attribute in an APM trace view.)

Universal — the full board for every limiter

The axis lane is the premium layer for unified-admission users; the dashboard itself works for everyone. A plain rateLimit(), quota, twoTier, token-budget, or concurrency guard gets the whole board plus "why throttled" attribution by policy/limiter + hot key — sourced from the same tapDecisions stream and withAnalytics top-K that already ship in the core. A single-axis limiter simply has one axis, so there is nothing to decompose; the Lens says so rather than implying otherwise.

Three ways to run it

Register what you already use with a hub; it returns tapped wrappers to use in their place. The taps are synchronous, exception-swallowing, and O(1) — the dashboard can never perturb your control path or change a decision.
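The three tap properties (synchronous, exception-swallowing, O(1)) can be illustrated with a minimal sketch. This is not the throttlekit-lens implementation: Decision, Limiter, Observer, and tapLimiter are hypothetical names, and the synchronous check signature is an assumption made to keep the example short.

```typescript
// Hypothetical shapes for illustration only; not the throttlekit types.
interface Decision {
  allowed: boolean;
  remaining: number;
}

interface Limiter {
  check(key: string): Decision;
}

type Observer = (key: string, decision: Decision) => void;

// Wrap a limiter so every decision is reported to `observe`.
// The observer runs synchronously and any exception it throws is swallowed,
// so the wrapper returns exactly what the original limiter returned.
function tapLimiter(inner: Limiter, observe: Observer): Limiter {
  return {
    check(key: string): Decision {
      const decision = inner.check(key);
      try {
        observe(key, decision); // expected to be O(1), e.g. bumping a counter
      } catch {
        // Deliberately ignored: monitoring must never alter the control path.
      }
      return decision;
    },
  };
}

// Usage: a toy limiter that allows the first two calls per key.
const seen = new Map<string, number>();
const toy: Limiter = {
  check(key) {
    const n = (seen.get(key) ?? 0) + 1;
    seen.set(key, n);
    return { allowed: n <= 2, remaining: Math.max(0, 2 - n) };
  },
};

let denied = 0;
const tapped = tapLimiter(toy, (_key, d) => {
  if (!d.allowed) denied++;
  throw new Error("observer bug"); // swallowed by the tap
});

const results = [tapped.check("a"), tapped.check("a"), tapped.check("a")];
```

Even though the observer throws on every call, the three decisions come back unchanged and the denial counter still advances, which is the behavior the paragraph above describes.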

1. Mount it in your own app (no extra port)

import express from "express";
import { createLensHub, lensHandler } from "throttlekit-lens";
import { rateLimit, gcra, unifiedAdmission } from "throttlekit";

const app = express();

const hub = createLensHub();
const api = hub.trackLimiter("api", rateLimit({ strategy: gcra({ limit: 100, periodMs: 60_000 }) }));
// `rate`, `concurrency`, and `cost` are your existing axis configurations, defined elsewhere.
const checkout = hub.trackAdmitter("checkout", unifiedAdmission({ rate, concurrency, cost }));

// Express: mount the read-only handler at a private path, behind a token.
const handler = lensHandler(hub, { basePath: "/__throttlekit", token: process.env.LENS_TOKEN });
app.use("/__throttlekit", (req, res) => handler(req, res));

// ...then use `api` / `checkout` exactly like the originals.

2. Run a standalone sidecar

import { createLensHub, serveLens } from "throttlekit-lens";

const hub = createLensHub();
// ...track your limiters / admitters / guards...
const lens = await serveLens(hub, { port: 9090 }); // loopback by default
console.log(`Lens at ${lens.url}`);

3. Get it for free on the service door

throttlekit-server serves the Lens by default, bound to loopback. It needs no code and works for Python, Go, or any-language clients too, because it watches the server's decisions, whatever drives them:

throttlekit-server --config .throttlekit.yaml
#  → gRPC on :50051  +  Lens dashboard on http://127.0.0.1:9090

--lens off disables it; --lens-host / --lens-port move or expose it (a non-loopback host logs a warning and expects a --lens-token); --lens-aggregator <url> pushes this node's snapshot to a fleet aggregator.

What's on the board

Admission attribution (hero) — the policy → binding-lane → top-denied-keys Sankey + a stacked-area deny-rate-by-axis timeline + the click-to-snapshot drawer.
Throughput & outcome — requests/s split allow/deny.
Deny rate — overall, by policy/limiter for everyone, and the per-axis split for unified admitters.
Top keys/tenants — Space-Saving top-K heavy hitters (requested + denied), each row carrying a policy (and binding-axis) chip.
Latency — admit-/check-path latency over a small ring.
Concurrency health — limit vs inflight, no-load RTT, share / lGlobal / nodes, the fenced flag, and a live self-fence event feed.
Guarantee panel (below).
Store / fleet health — backend, reachability, fail-mode (and, in server mode, lease-table size + reclaim count).

The Guarantee panel

ThrottleKit's headline is a machine-checked, fleet-size-independent overshoot bound. The Guarantee panel makes it operational: per key, live admitted-this-window vs the design-time-computed ceiling Limit + N·(B−1) (collapsing to exactly Limit under windowCoupled), rendered as headroom to a proven line — plus live Σinflight ≤ L PASS/FAIL chips, each linking to the TLA⁺ spec.

It is headroom-to-a-known-line, not a "proof is holding" needle. The bound and invariants are design-time proofs (a static badge plus chips that link to the spec), and approaching the line indicates either real, model-predicted overshoot or a misconfiguration. It is never a claim that the Lens is continuously model-checking.
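As a worked illustration of the headroom arithmetic, the sketch below evaluates the ceiling formula Limit + N·(B−1) and subtracts a live admitted count. N and B are taken as opaque design-time inputs because this page does not define them. BoundParams, ceiling, and headroom are hypothetical names, and the numbers are made up.

```typescript
// Inputs named after the formula in the text. N and B are treated as opaque
// design-time parameters; this sketch does not assume what they represent.
interface BoundParams {
  limit: number;          // "Limit" in the formula
  n: number;              // "N" in the formula
  b: number;              // "B" in the formula
  windowCoupled: boolean; // when true, the ceiling collapses to exactly Limit
}

function ceiling(p: BoundParams): number {
  return p.windowCoupled ? p.limit : p.limit + p.n * (p.b - 1);
}

// Headroom is the distance from the live admitted count to the ceiling.
// A negative value means the observed count is past the computed line.
function headroom(admittedThisWindow: number, p: BoundParams): number {
  return ceiling(p) - admittedThisWindow;
}

const params: BoundParams = { limit: 100, n: 4, b: 8, windowCoupled: false };
console.log(ceiling(params));                             // 100 + 4 * (8 - 1) = 128
console.log(headroom(110, params));                       // 128 - 110 = 18
console.log(ceiling({ ...params, windowCoupled: true })); // 100
```

The computation is plain arithmetic against a precomputed line, which matches the non-claim above: nothing here checks a model at runtime.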

Fleet view

Each instance serves its own per-process Lens. For a fleet-global view, point every node at one aggregator — it merges additive counters (allow/deny, per-lane denials) across nodes and re-tops the heavy hitters, into a single mode:"fleet" snapshot the same UI renders:

import { createLensAggregator, serveLensAggregator, pushSnapshots } from "throttlekit-lens";

// Aggregator host:
const agg = createLensAggregator();
await serveLensAggregator(agg, { port: 9091, token: process.env.LENS_TOKEN });

// Each node (or `throttlekit-server --lens-aggregator http://aggregator:9091`):
pushSnapshots(hub, { url: "http://aggregator:9091", token: process.env.LENS_TOKEN });

Best-effort and eventually-consistent: the fleet view reflects the last snapshot each live node pushed (stale nodes are evicted). The top-K merge is an honest additive merge of the per-node Space-Saving lists — it never drops a true fleet heavy hitter.
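The additive merge can be sketched as follows. This is a minimal illustration under assumed shapes, not the aggregator's code: TopKEntry and mergeTopK are hypothetical names. A key that is absent from one node's list simply contributes nothing from that node.

```typescript
// Hypothetical entry shape: one row of a node's Space-Saving top-K list.
interface TopKEntry {
  key: string;
  count: number; // that node's Space-Saving estimate for the key
}

// Sum per-node estimates by key, then keep the k largest ("re-top").
function mergeTopK(perNode: TopKEntry[][], k: number): TopKEntry[] {
  const totals = new Map<string, number>();
  for (const list of perNode) {
    for (const { key, count } of list) {
      totals.set(key, (totals.get(key) ?? 0) + count);
    }
  }
  return Array.from(totals, ([key, count]) => ({ key, count }))
    // Largest first; ties broken by key so the output is deterministic.
    .sort((a, b) => b.count - a.count || a.key.localeCompare(b.key))
    .slice(0, k);
}

const nodeA: TopKEntry[] = [
  { key: "tenant-1", count: 40 },
  { key: "tenant-2", count: 10 },
];
const nodeB: TopKEntry[] = [
  { key: "tenant-2", count: 35 },
  { key: "tenant-3", count: 5 },
];

const merged = mergeTopK([nodeA, nodeB], 2);
console.log(merged); // tenant-2 (45) first, then tenant-1 (40)
```

Note that tenant-2 is not the top key on node A, yet it leads the fleet view once the two lists are summed.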

Security & defaults

On by default only on throttlekit-server, and only on loopback. The embedded handler is opt-in (you choose where to mount it — it's your process); the sidecar binds to 127.0.0.1 by default.
Exposing beyond loopback wants auth. A non-loopback host without TLS or a token logs a loud warning (mirroring the insecure-gRPC warning). Pass { token } (constant-time-compared Authorization: Bearer …) and/or { tls } (HTTPS, or mTLS with a caPath).
Strictly read-only. Only GET is served; there are no mutation endpoints. The board exposes keys/tenants, so on a shared host loopback isn't private — hence the off switch (--lens off) and the expose-requires-auth rule.
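For readers unfamiliar with constant-time comparison of an Authorization: Bearer token, here is a minimal sketch using Node's built-in crypto module. It shows the general technique only. isAuthorized and sha256 are hypothetical helpers, and hashing both sides before comparing is one common way to satisfy timingSafeEqual's equal-length requirement, not necessarily what the Lens does.

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

// Hash both sides so the buffers always have equal length, which
// timingSafeEqual requires, and so the token length is not revealed.
function sha256(value: string): Buffer {
  return createHash("sha256").update(value, "utf8").digest();
}

// Returns true only for a header of the exact form "Bearer <expected token>".
function isAuthorized(authorizationHeader: string | undefined, expectedToken: string): boolean {
  if (!authorizationHeader || !authorizationHeader.startsWith("Bearer ")) {
    return false;
  }
  const presented = authorizationHeader.slice("Bearer ".length);
  return timingSafeEqual(sha256(presented), sha256(expectedToken));
}

console.log(isAuthorized("Bearer s3cret", "s3cret")); // true
console.log(isAuthorized("Bearer wrong", "s3cret"));  // false
console.log(isAuthorized(undefined, "s3cret"));       // false
```

A plain === on the two strings can return early at the first differing character, which is the timing signal a constant-time comparison is meant to avoid.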

Honest scope (the non-claims)

The binding-axis lane needs unifiedAdmission; a single-axis rateLimit() has nothing to decompose (the board still works — it just shows policy/key attribution).
Numbers are eventually-consistent and per-window; top-K is Space-Saving (over-estimates, never misses a true heavy hitter).
The Guarantee panel accumulates live admissions and compares them against a computed line, rendered as headroom. It does not read a native admitted_this_window field, and it is not a live model-checker.
This is not bit-exact replay; the Lens streams live decisions. Reproducing a past incident exactly is a future direction (deterministic replay against a candidate policy).
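The "over-estimates, never misses a true heavy hitter" property of Space-Saving is easier to see in code. The sketch below is a textbook-style version of the algorithm, written for clarity rather than speed: it finds the minimum with a linear scan, whereas production implementations typically use a structure that makes each update constant-time. The SpaceSaving class name and its methods are hypothetical and are not the ThrottleKit implementation.

```typescript
// Minimal Space-Saving counter: tracks at most `capacity` keys.
class SpaceSaving {
  private counts = new Map<string, number>();

  constructor(private readonly capacity: number) {}

  record(key: string): void {
    const current = this.counts.get(key);
    if (current !== undefined) {
      this.counts.set(key, current + 1); // already monitored: just increment
      return;
    }
    if (this.counts.size < this.capacity) {
      this.counts.set(key, 1); // free slot: start monitoring
      return;
    }
    // Table full: evict the smallest counter and let the newcomer inherit
    // its count + 1. This inheritance is the source of the over-estimate.
    let minKey = "";
    let minCount = Infinity;
    this.counts.forEach((count, k) => {
      if (count < minCount) {
        minCount = count;
        minKey = k;
      }
    });
    this.counts.delete(minKey);
    this.counts.set(key, minCount + 1);
  }

  // Monitored keys, largest estimate first.
  ranked(): Array<{ key: string; count: number }> {
    return Array.from(this.counts, ([key, count]) => ({ key, count }))
      .sort((a, b) => b.count - a.count);
  }
}

const sketch = new SpaceSaving(2);
for (const key of ["a", "a", "a", "b", "c", "a"]) {
  sketch.record(key);
}
const ranking = sketch.ranked();
console.log(ranking); // a: 4 (exact), c: 2 (true count is 1, so over-estimated)
```

In this run the heavy key "a" keeps an exact count, while the late arrival "c" inherits the evicted counter and reads one higher than its true frequency.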

Built on

Two @experimental core primitives, both zero-dep and read-only:

admissionTap(admitter, onAdmission) — the multi-axis sibling of tapDecisions; fires once per completed unified admission with the combined decision, the binding axis, and the per-axis snapshot.
withAdmissionAnalytics(admitter, opts) — the lane-segmented fork of withAnalytics: allow/deny counters and Space-Saving top-K, partitioned by binding lane (with Σ deniedByLane === denied).

A plain rateLimit() feeds the universal board through the existing tapDecisions + withAnalytics; a unifiedAdmission additionally lights up the axis lane through these two.
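The invariant Σ deniedByLane === denied follows from every denial being counted in exactly one lane. A minimal sketch of such lane-partitioned counters is below. The lane names follow the four lanes listed earlier on this page, while AdmissionEvent, LaneCounters, and their fields are hypothetical and are not the shapes that admissionTap or withAdmissionAnalytics actually emit.

```typescript
type Lane = "rate" | "concurrency" | "cost" | "policy";

// Hypothetical event shape: a denial always names the lane that bound it.
type AdmissionEvent =
  | { allowed: true }
  | { allowed: false; bindingAxis: Lane };

class LaneCounters {
  allowed = 0;
  denied = 0;
  deniedByLane: Record<Lane, number> = { rate: 0, concurrency: 0, cost: 0, policy: 0 };

  record(e: AdmissionEvent): void {
    if (e.allowed) {
      this.allowed++;
      return;
    }
    // Each denial bumps the total and exactly one lane, so the lane
    // counters always sum to `denied`.
    this.denied++;
    this.deniedByLane[e.bindingAxis]++;
  }

  laneSum(): number {
    const l = this.deniedByLane;
    return l.rate + l.concurrency + l.cost + l.policy;
  }
}

const counters = new LaneCounters();
const events: AdmissionEvent[] = [
  { allowed: true },
  { allowed: false, bindingAxis: "rate" },
  { allowed: false, bindingAxis: "cost" },
  { allowed: false, bindingAxis: "rate" },
  { allowed: true },
];
for (const e of events) {
  counters.record(e);
}
console.log(counters.deniedByLane, counters.laneSum() === counters.denied);
```

Modeling the event as a discriminated union means a denial without a binding lane cannot be constructed, which is what keeps the sum exact.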

See also: Operations (OTel, the analytics tap, headers, failure modes) · Unified admission (the rate × concurrency × cost decision the axis lane attributes) · Distributed & provable (the bound the Guarantee panel tracks) · the package README.

ThrottleKit · MIT · 1.0 — API frozen under SemVer (Stability)
