Fleet and Monitor

ameyaborkar edited this page Jun 10, 2026 · 1 revision

Fleet & Monitor clients

Version 0.5.0 adds two Python clients for the server's additive Fleet and Monitor doors, and makes the four distributed features reachable from the existing ServiceBackend with no client change. The Node core remains the single oracle throughout: these clients transport and spend, but they never re-derive a decision.

Tier-1 — distributed decisions, no client change

The simplest way to go fleet-wide is to change nothing in Python. Point the same ServiceBackend (or AsyncServiceBackend) at a throttlekit-server whose policy is configured distributed, and every decision is coordinated across the whole fleet:

| Server policy | Reached by | Holds across the fleet |
| --- | --- | --- |
| federated: | check | one global per-window rate budget across regions |
| fleetBudget: | debit | one cost/token budget across every instance |
| distributedConcurrency: | admit | one in-flight ceiling across every instance |
| federatedFairEscrow: | check | one weighted-fair budget split across tenants, fleet total ≤ L |

from throttlekit import ServiceBackend

# the server runs `federated:` / `distributedConcurrency:` / … — the Python code is unchanged
with ServiceBackend("limiter.internal:50051") as rl:
    d = rl.check("global-api", tenant)     # now bound by ONE global budget across the fleet

This is the recommended path. Reach for the Fleet door (below) only when a per-request round trip is itself the bottleneck.
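As a rough illustration of how a caller might act on a Tier-1 decision, the sketch below retries a denied call after the delay the server suggests. It assumes the Decision returned by ServiceBackend.check exposes the same allowed and retry_after_ms fields that this page shows for the Fleet door. FakeDecision, FakeBackend and call_when_allowed are hypothetical stand-ins, not part of throttlekit, so the example runs without a server.

```python
import time
from dataclasses import dataclass


@dataclass
class FakeDecision:
    """Hypothetical stand-in for the client's Decision (fields are assumed)."""
    allowed: bool
    retry_after_ms: int = 0


class FakeBackend:
    """Hypothetical stand-in for ServiceBackend: denies the first N checks."""

    def __init__(self, deny_first: int):
        self.deny_first = deny_first
        self.calls = 0

    def check(self, policy: str, key: str) -> FakeDecision:
        self.calls += 1
        if self.calls <= self.deny_first:
            return FakeDecision(allowed=False, retry_after_ms=5)
        return FakeDecision(allowed=True)


def call_when_allowed(backend, policy, key, fn, max_attempts=3, sleep=time.sleep):
    """Run fn() once the limiter allows it; give up after max_attempts denials."""
    for _ in range(max_attempts):
        d = backend.check(policy, key)
        if d.allowed:
            return fn()
        sleep(d.retry_after_ms / 1000.0)  # wait as long as the server suggested
    raise RuntimeError(f"{policy!r} still denied after {max_attempts} attempts")


backend = FakeBackend(deny_first=2)
result = call_when_allowed(backend, "global-api", "tenant-42", lambda: "ok")
print(result, backend.calls)  # ok 3
```

Because the helper only relies on a check(policy, key) method, a real ServiceBackend could presumably be passed in place of FakeBackend, provided the field names assumed above hold.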

Tier-2 — FleetBackend: lease a chunk, spend it locally

When the server runs a federated: policy, a very high-throughput client can lease a slice of the global per-window budget and serve it locally, round-tripping only to refresh rather than once per request. The server sizes the grant (it remains the one oracle); LeasedLimiter spends it with a LeaseSpender that is byte-for-byte identical to the core's leased path (pinned by the golden lease vectors).

from throttlekit import FleetBackend

with FleetBackend("localhost:50051") as fleet:               # loopback needs no secret
    api = fleet.leased("global-api", domain="acme", batch=200)   # lease ~200 at a time, for this tenant
    for _ in workload:
        d = api.check()                                      # local credit; refreshes only when low
        if not d.allowed:
            backoff(d.retry_after_ms)                        # the global window is spent (server's verdict)
  • fleet.leased(policy, *, domain="", batch=1, window_coupled=True) → LeasedLimiter: batch (≥ 1) is the throughput lever: how many units to lease per refresh round trip. The default 1 round-trips once per request (like the direct service door); raise it (e.g. 200) to serve ~batch requests per Fleet.Reserve. domain selects which tenant's budget to lease (empty leases the policy as a whole). One LeasedLimiter tracks one (policy, domain) budget.
  • LeasedLimiter.check(cost=1) → Decision — serves from local credits, or refreshes when the chunk is spent. Returns a local allow, or the server's denial verbatim when the global budget is exhausted. It takes no key — the domain already selected the budget.
  • fleet.reserve(policy, *, domain="", wants=1, axis="rate") → Lease — the raw, one-shot lease if you want to manage spend yourself.

The grant is window-coupled and discarded at the server's window boundary, so the fleet never over-admits — at any client count. AsyncFleetBackend is the await twin (async with, await api.check()).

v1 leases the rate axis only; reserving an unsupported axis raises OperationNotSupportedError, and a non-leasable policy raises PolicyNotFoundError.
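To make the lease-and-spend idea concrete, here is a toy in-memory model of it. It is not the library's LeaseSpender and does not talk to a server: ToyServer and ToyLeasedLimiter are hypothetical names, and the client detects a window change by reading the server's counter directly, where a real client would rely on the lease's own expiry. The point is only that when the server debits the budget at grant time and clients drop stale credits, total admissions cannot exceed the window budget.

```python
class ToyServer:
    """Toy oracle: owns one per-window budget and sizes every grant."""

    def __init__(self, limit: int):
        self.limit = limit
        self.window = 0
        self.remaining = limit
        self.round_trips = 0

    def roll_window(self) -> None:
        self.window += 1
        self.remaining = self.limit

    def reserve(self, wants: int):
        """Grant up to `wants` units from what is left in the current window."""
        self.round_trips += 1
        granted = min(wants, self.remaining)
        self.remaining -= granted  # debited up front, so grants never overlap
        return granted, self.window


class ToyLeasedLimiter:
    """Toy client: spends a leased chunk locally, refreshes only when empty."""

    def __init__(self, server: ToyServer, batch: int):
        self.server = server
        self.batch = batch
        self.credits = 0
        self.window = -1

    def check(self) -> bool:
        if self.window != self.server.window:
            self.credits = 0  # window-coupled: drop credits from an old window
        if self.credits == 0:
            self.credits, self.window = self.server.reserve(self.batch)
        if self.credits == 0:
            return False  # the server granted nothing: the window is spent
        self.credits -= 1
        return True


server = ToyServer(limit=1000)
clients = [ToyLeasedLimiter(server, batch=200) for _ in range(3)]
# 1500 attempts, round-robin across three clients, against a 1000-unit budget
admitted = sum(c.check() for _ in range(500) for c in clients)
print(admitted)  # 1000
```

In this model a client that has been denied asks the server again on every subsequent check, so the round-trip saving applies to allowed traffic; whether the real client behaves the same way after a denial is not something this sketch establishes.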

MonitorBackend: read the server's live state

The read-only Monitor door exposes the same operational state ThrottleKit Lens renders in the terminal — from Python, remotely:

from throttlekit import MonitorBackend

with MonitorBackend("localhost:50051") as mon:               # loopback needs no secret
    snap = mon.get_snapshot()                                # a point-in-time operational snapshot
    for p in snap.policies:
        print(p.name, p.allowed, p.denied)                   # per-policy allow / deny, top keys, latency, …

get_snapshot() returns a MonitorSnapshot (its .policies carry per-policy allowed / denied / limit / latency + top keys). AsyncMonitorBackend is the await twin. The door is read-only — it never computes or affects a decision.
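One possible use of a snapshot is flagging policies that deny a large share of traffic. The sketch below assumes p.allowed and p.denied are numeric counts, which the page implies but does not state; deny_ratios and hot_policies are hypothetical helpers, and the SimpleNamespace objects stand in for a real MonitorSnapshot so the example runs offline.

```python
from types import SimpleNamespace


def deny_ratios(snapshot):
    """Map policy name -> fraction of decisions denied (0.0 when idle)."""
    ratios = {}
    for p in snapshot.policies:
        total = p.allowed + p.denied
        ratios[p.name] = p.denied / total if total else 0.0
    return ratios


def hot_policies(snapshot, threshold=0.10):
    """Names of policies whose deny ratio is at or above threshold."""
    return sorted(n for n, r in deny_ratios(snapshot).items() if r >= threshold)


# Hypothetical stand-in for the MonitorSnapshot returned by mon.get_snapshot().
snap = SimpleNamespace(policies=[
    SimpleNamespace(name="global-api", allowed=850, denied=150),
    SimpleNamespace(name="search", allowed=50, denied=0),
    SimpleNamespace(name="idle", allowed=0, denied=0),
])
print(hot_policies(snap))  # ['global-api']
```

Since the helpers only read .policies, .name, .allowed and .denied, they never touch the decision path, which matches the read-only nature of the door.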

Auth

Both doors are loopback-only by default: the Fleet door hands out budget, and the Monitor snapshot carries traffic keys, which can be PII. To reach them from another host, set the server's --fleet-secret / --monitor-secret and pass the same secret here:

fleet = FleetBackend("limiter.internal:50051", secret="…", credentials=tls_creds)
mon   = MonitorBackend("limiter.internal:50051", secret="…", credentials=tls_creds)
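A deployment would normally keep the secret out of source code. The sketch below is one way to do that, under stated assumptions: is_loopback and door_kwargs are hypothetical helpers, the environment variable name is invented for illustration, and the only client parameter relied on is the secret keyword shown above. It fails fast when a non-loopback target has no secret configured, instead of waiting for the server to reject the call.

```python
import ipaddress
import os


def is_loopback(target: str) -> bool:
    """True when a "host:port" target points at the local machine."""
    host = target.rsplit(":", 1)[0].strip("[]")
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        return False  # a DNS name: treat it as remote


def door_kwargs(target: str, env_var: str) -> dict:
    """Build the keyword arguments for a Fleet or Monitor client."""
    if is_loopback(target):
        return {}  # loopback needs no secret
    secret = os.environ.get(env_var)
    if not secret:
        raise RuntimeError(f"{target} is not loopback; set {env_var}")
    return {"secret": secret}


# THROTTLEKIT_FLEET_SECRET is a made-up variable name for this example.
print(door_kwargs("localhost:50051", "THROTTLEKIT_FLEET_SECRET"))  # {}
```

The result could then be splatted into the constructor, for example FleetBackend(target, credentials=tls_creds, **door_kwargs(target, "THROTTLEKIT_FLEET_SECRET")).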

Conformance

The Tier-2 spend is proven, not trusted: LeaseSpender replays the golden lease vectors byte-for-byte against the Node core's twoTier(leased, windowCoupled) L1 path — see Conformance & development.

Next: Conformance & development — how the client stays bit-for-bit with the core.