Home
Beyond rate limiting — from Python. Govern rate, concurrency & cost, provably: this is ThrottleKit's Python client, returning decisions from the one Node core and its two engines — GALE (provable distributed leasing) and TALE (LLM token-budget escrow) — bit-identical to the Node oracle, through either of two pluggable backends.
Installed as `throttlekit-py`, imported as `throttlekit` (PyPI's `throttlekit` is an unrelated project).
🌐 throttlekit.in · 📦 PyPI · 🧪 runnable examples (one script per axis)
The whole design rests on one rule: exactly one thing computes a Decision, namely the Node core, either directly or as Lua-in-Redis. Neither backend re-implements an algorithm, so there is no second rate limiter to keep in sync and no float-determinism risk. The client transports decisions; it never derives them.
The decisions you get back carry the core's guarantees — not a re-implemented approximation:
- a machine-checked (TLA⁺), fleet-size-independent overshoot bound: window-coupled leasing admits ≤ the limit at any fleet size; most rate limiters state no bound at all;
- GALE (provable distributed leasing) and TALE (LLM token-budget escrow), shipped as features and reachable from Python: leased two-tier `check`, the cost axis via `debit`, unified rate × concurrency × cost via `admit`;
- bit-identical results, replayed against the core's golden vectors through real Redis;
- fleet scale with no client change: point the same `ServiceBackend` at a distributed-configured server (`federated:` / `fleetBudget:` / `distributedConcurrency:`) for fleet-coordinated decisions, or lease a chunk of the global budget with the new `FleetBackend` (0.5.0).
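To make the "admits ≤ the limit at any fleet size" claim concrete, here is a toy single-window simulation. It is not GALE and does not use the ThrottleKit API; `GlobalBudget`, `Node`, `simulate`, and the chunk size are hypothetical names invented for illustration. It only shows why admitting strictly out of leased tokens cannot overshoot a shared limit, however many nodes take part.

```python
from dataclasses import dataclass


@dataclass
class GlobalBudget:
    """Toy stand-in for a shared per-window budget (hypothetical, not GALE)."""
    limit: int
    leased: int = 0

    def lease(self, want: int) -> int:
        # Grant at most what is left in this window, never more than the limit.
        granted = min(want, self.limit - self.leased)
        self.leased += granted
        return granted


@dataclass
class Node:
    """Toy fleet member that admits only out of tokens it has leased."""
    budget: GlobalBudget
    chunk: int
    tokens: int = 0
    admitted: int = 0

    def check(self) -> bool:
        if self.tokens == 0:
            self.tokens = self.budget.lease(self.chunk)
        if self.tokens == 0:
            return False  # window budget exhausted: deny
        self.tokens -= 1
        self.admitted += 1
        return True


def simulate(fleet_size: int, limit: int, requests_per_node: int, chunk: int = 5) -> int:
    """Return the total number of requests admitted fleet-wide in one window."""
    budget = GlobalBudget(limit)
    nodes = [Node(budget, chunk) for _ in range(fleet_size)]
    for _ in range(requests_per_node):
        for node in nodes:
            node.check()
    return sum(node.admitted for node in nodes)


if __name__ == "__main__":
    for n in (1, 10, 50):
        print(n, simulate(fleet_size=n, limit=100, requests_per_node=1000))
```

Because every admitted request consumes a token that was first subtracted from the shared budget, the fleet-wide total is bounded by `limit` regardless of `fleet_size`. The real engine's bound is the machine-checked one described above; this sketch only illustrates the shape of the argument.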
A Python service gets the same proven core a Node fleet does, not a second limiter to keep in sync. The guarantees — how they work — are what make ThrottleKit worth reaching for from any language.
| Backend | Path | Decision computed in | Use it when |
|---|---|---|---|
| `ServiceBackend` | gRPC → `throttlekit-server` | the service (= the core) | you want the full surface (rate · cost · concurrency · unified) and never to touch the raw wire |
| `RedisBackend` | vendored Lua → the same Redis a Node fleet uses | Lua-in-Redis (the core's script) | you already run Redis and want one hop, no extra service; `check` only |
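Since both backends expose `check`, application code can be written once against that call and handed either backend. The sketch below assumes only what this page states: `check(policy, key)` returns a decision with `allowed` and `retry_after_ms`. The `Checker` protocol, `handle`, and `FakeBackend` are hypothetical names for illustration and are not part of the client; `FakeBackend` is a test double, not a limiter.

```python
from dataclasses import dataclass
from typing import Protocol


class Decision(Protocol):
    """The two decision fields this page names."""
    allowed: bool
    retry_after_ms: int


class Checker(Protocol):
    """Anything with a check(policy, key) method, e.g. either backend."""
    def check(self, policy: str, key: str) -> Decision: ...


def handle(limiter: Checker, policy: str, key: str) -> str:
    """Backend-agnostic request gate: only reads the returned decision."""
    d = limiter.check(policy, key)
    return "ok" if d.allowed else f"retry in {d.retry_after_ms} ms"


@dataclass
class FakeDecision:
    allowed: bool
    retry_after_ms: int = 0


class FakeBackend:
    """Hypothetical in-memory test double: allows the first `limit` calls per key."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.seen: dict[tuple[str, str], int] = {}

    def check(self, policy: str, key: str) -> FakeDecision:
        n = self.seen.get((policy, key), 0) + 1
        self.seen[(policy, key)] = n
        if n <= self.limit:
            return FakeDecision(True)
        return FakeDecision(False, 1000)


if __name__ == "__main__":
    backend = FakeBackend(limit=2)
    for _ in range(3):
        print(handle(backend, "api", "user-1"))
```

In production the `limiter` argument would be a real backend instance; the test double exists only so unit tests do not need a server or Redis.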
```bash
pip install throttlekit-py            # the gRPC ServiceBackend
pip install "throttlekit-py[redis]"   # + a redis client for the direct RedisBackend
```

```python
from throttlekit import ServiceBackend

with ServiceBackend("localhost:50051") as rl:
    d = rl.check("api", api_key)
    if not d.allowed:
        ...  # 429: retry after d.retry_after_ms
```

A denial is a normal `Decision` (`allowed` is `False`), never an exception; gRPC faults map to `PolicyNotFoundError` / `OperationNotSupportedError` / `ServiceUnavailableError`.
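One way to act on that split between denials and faults is sketched below: a denial becomes an HTTP 429 with a `Retry-After` header, while an unavailable service becomes an explicit fail-open or fail-closed choice. The `Decision` and `ServiceUnavailableError` classes here are local stand-ins so the snippet runs without the package; `to_http`, `guarded_check`, and the `fail_open` flag are hypothetical helpers, not part of the client.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Decision:
    """Local stand-in carrying the two fields this page names."""
    allowed: bool
    retry_after_ms: int = 0


class ServiceUnavailableError(Exception):
    """Local stand-in for the client's error of the same name."""


def to_http(decision) -> tuple[int, dict[str, str]]:
    """Map a decision to (status, headers); Retry-After is in whole seconds."""
    if decision.allowed:
        return 200, {}
    seconds = max(1, math.ceil(decision.retry_after_ms / 1000))
    return 429, {"Retry-After": str(seconds)}


def guarded_check(limiter, policy: str, key: str, fail_open: bool = False):
    """Denials flow through to_http; only transport faults hit the except."""
    try:
        return to_http(limiter.check(policy, key))
    except ServiceUnavailableError:
        # Assumption: whether to admit or reject while the limiter is
        # unreachable is an application policy, not something the client decides.
        return (200, {}) if fail_open else (503, {})
```

Rounding `retry_after_ms` up to whole seconds keeps clients from retrying before the limiter would admit them.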
| Page | What's in it |
|---|---|
| Getting Started | Install, the two backends, the Decision object, errors |
| The axes | Every axis from Python: check (rate), debit (cost), leased two-tier, admit (concurrency / unified) |
| Fleet & Monitor clients | Tier-2 fleet leasing (FleetBackend / LeasedLimiter) + reading the server's live state (MonitorBackend) — new in 0.5.0 |
| Conformance & development | How it stays in lock-step with the core, and how to develop / contribute |
Experimental (alpha). The contract (throttlekit.proto, the golden vectors, the extracted Lua) is vendored and checksum-pinned from the throttlekit core's frozen public API; this client tracks it. The .proto evolves additively only — 0.5.0 adds the read-only Monitor and Fleet services alongside the unchanged decision RPCs — while the raw Lua wire ships frozen: false, so the RedisBackend is explicitly experimental and may change with the core's scripts.
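As an illustration of what "checksum-pinned" can mean in practice, the sketch below compares the SHA-256 of each vendored file with a recorded digest and reports mismatches. This is a generic pattern under stated assumptions, not the project's actual tooling: `sha256_of`, `verify_pins`, and the pin-mapping layout are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hex SHA-256 digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def verify_pins(root: Path, pins: dict[str, str]) -> list[str]:
    """Return the relative paths whose digest does not match its pin.

    `pins` maps a path relative to `root` to its expected hex digest
    (a hypothetical layout chosen for this example).
    """
    return [rel for rel, expected in pins.items() if sha256_of(root / rel) != expected]
```

A check like this would typically run in CI, so that a vendored contract file cannot drift from the pinned upstream version unnoticed.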
- throttlekit-py on PyPI · repository
- ThrottleKit (the Node core) · its Polyglot & Python wiki page
- `throttlekit-server`: the gRPC service the `ServiceBackend` talks to