Home

throttlekit-py

Beyond rate limiting, from Python. ThrottleKit's Python client governs rate, concurrency, and cost, provably. It returns decisions from the single Node core and its two engines, GALE (provable distributed leasing) and TALE (LLM token-budget escrow), through either of two pluggable backends, and those decisions are bit-identical to the Node oracle.

Installed as throttlekit-py, imported as throttlekit (PyPI's throttlekit is an unrelated project).

🌐 throttlekit.in · 📦 PyPI · 🧪 runnable examples (one script per axis)

The one invariant

The whole design rests on it: exactly one thing computes a Decision — the Node core, directly or as Lua-in-Redis. Neither backend re-implements an algorithm, so there is no second rate limiter to keep in sync and no float-determinism risk. The client transports decisions; it never derives them.

Why reach for it

The decisions you get back carry the core's guarantees — not a re-implemented approximation:

- a machine-checked (TLA⁺), fleet-size-independent overshoot bound: window-coupled leasing admits ≤ the limit at any fleet size, whereas most rate limiters state no bound at all;
- GALE (provable distributed leasing) and TALE (LLM token-budget escrow), shipped as features and reachable from Python: leased two-tier check, the cost axis via debit, unified rate × concurrency × cost via admit;
- bit-identical results, replayed against the core's golden vectors through real Redis;
- fleet scale with no client change: point the same ServiceBackend at a distributed-configured server (federated: / fleetBudget: / distributedConcurrency:) for fleet-coordinated decisions, or lease a chunk of the global budget with the new FleetBackend (0.5.0).
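The "bit-identical" claim above amounts to replaying recorded inputs and comparing every field of the returned decision with exact equality. The snippet below is a minimal sketch of that idea only. The `Decision` stand-in carries just the two fields this page mentions, and the vector shape, `VECTORS`, and `replay` are hypothetical; the vendored golden vectors define their own format.

```python
from dataclasses import asdict, dataclass
from typing import Callable


@dataclass(frozen=True)
class Decision:  # stand-in with only the two fields this page mentions
    allowed: bool
    retry_after_ms: int


# Hypothetical vector shape; the real golden vectors use their own format.
VECTORS = [
    {"policy": "api", "key": "user-1", "expect": {"allowed": True, "retry_after_ms": 0}},
    {"policy": "api", "key": "user-1", "expect": {"allowed": False, "retry_after_ms": 400}},
]


def replay(check: Callable[[str, str], Decision], vectors: list[dict]) -> list[int]:
    """Return the indices of vectors whose decision differs in any field."""
    mismatches = []
    for i, vec in enumerate(vectors):
        got = asdict(check(vec["policy"], vec["key"]))
        if got != vec["expect"]:  # exact equality, no tolerance
            mismatches.append(i)
    return mismatches
```

An empty result means every replayed decision matched its recorded expectation; any index in the list points at a vector where the backend drifted.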

A Python service gets the same proven core a Node fleet does, not a second limiter to keep in sync. These guarantees, and how they work, are what make ThrottleKit worth reaching for from any language.

Two backends

| Backend | Path | Decision computed in | Use it when |
| --- | --- | --- | --- |
| `ServiceBackend` | gRPC → `throttlekit-server` | the service (= the core) | you want the full surface (rate · cost · concurrency · unified) and never to touch the raw wire |
| `RedisBackend` | vendored Lua → the same Redis a Node fleet uses | Lua-in-Redis (the core's script) | you already run Redis and want one hop, no extra service; `check` only |
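Because `check` is the one call both backends share, application code that depends only on it can take either backend. The snippet below sketches that with a structural type. `Checker`, `ScriptedBackend`, `guard`, and the parameter names `policy` and `key` are hypothetical, and `Decision` is a local stand-in so the example runs without the library installed.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Decision:  # stand-in for the library's Decision
    allowed: bool
    retry_after_ms: int = 0


class Checker(Protocol):
    """Hypothetical structural type: anything exposing check(policy, key)."""

    def check(self, policy: str, key: str) -> Decision: ...


class ScriptedBackend:
    """Test double: replays canned decisions instead of computing any."""

    def __init__(self, decisions: list[Decision]) -> None:
        self._decisions = iter(decisions)

    def check(self, policy: str, key: str) -> Decision:
        return next(self._decisions)


def guard(rl: Checker, key: str) -> bool:
    # Depends only on `check`, so any object with that method fits here.
    return rl.check("api", key).allowed
```

The test double deliberately replays decisions rather than counting requests, in keeping with the rule that nothing on the client side derives a decision.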

Install

pip install throttlekit-py            # the gRPC ServiceBackend
pip install "throttlekit-py[redis]"   # + a redis client for the direct RedisBackend

30 seconds

from throttlekit import ServiceBackend

api_key = "demo-key"  # placeholder: whatever identifies the caller in your app

with ServiceBackend("localhost:50051") as rl:
    d = rl.check("api", api_key)
    if not d.allowed:
        ...  # respond 429; retry after d.retry_after_ms

A denial is a normal Decision (allowed is False), never an exception; gRPC faults map to PolicyNotFoundError / OperationNotSupportedError / ServiceUnavailableError.
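The split between denials (values) and faults (exceptions) can be sketched as a small HTTP mapping. The snippet below is self-contained. `Decision` and the three exception classes are local stand-ins that mirror the names above rather than imports from the library, `retry_after_ms` is assumed to be an integer, and `to_http_status` with its `fail_open` switch is a hypothetical helper, not part of the client.

```python
from dataclasses import dataclass
from typing import Callable


# Local stand-ins mirroring the names on this page; the real classes
# come from the client library, not from this sketch.
@dataclass(frozen=True)
class Decision:
    allowed: bool
    retry_after_ms: int = 0


class PolicyNotFoundError(Exception): ...
class OperationNotSupportedError(Exception): ...
class ServiceUnavailableError(Exception): ...


def to_http_status(
    check: Callable[[], Decision], fail_open: bool = False
) -> tuple[int, dict[str, str]]:
    """Map one check outcome to (status, headers). Hypothetical helper."""
    try:
        d = check()
    except ServiceUnavailableError:
        # Transport fault: the limiter could not be reached. Letting traffic
        # through or not is a deployment choice this page does not prescribe.
        return (200, {}) if fail_open else (503, {})
    except (PolicyNotFoundError, OperationNotSupportedError):
        # Configuration or usage error: surface it, do not disguise it as a 429.
        return (500, {})
    if d.allowed:
        return (200, {})
    # A denial is a normal value: convert milliseconds to whole seconds, rounding up.
    retry_s = -(-d.retry_after_ms // 1000)
    return (429, {"Retry-After": str(retry_s)})
```

Keeping the denial path free of exceptions means the common case (a 429) costs no stack unwinding, while the `except` branches handle only genuine faults.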

Guides

| Page | What's in it |
| --- | --- |
| Getting Started | Install, the two backends, the `Decision` object, errors |
| The axes | Every axis from Python: `check` (rate), `debit` (cost), leased two-tier, `admit` (concurrency / unified) |
| Fleet & Monitor clients | Tier-2 fleet leasing (`FleetBackend` / `LeasedLimiter`) + reading the server's live state (`MonitorBackend`), new in 0.5.0 |
| Conformance & development | How it stays in lock-step with the core, and how to develop / contribute |

Status

Experimental (alpha). The contract (throttlekit.proto, the golden vectors, the extracted Lua) is vendored and checksum-pinned from the throttlekit core's frozen public API; this client tracks it. The .proto evolves additively only — 0.5.0 adds the read-only Monitor and Fleet services alongside the unchanged decision RPCs — while the raw Lua wire ships frozen: false, so the RedisBackend is explicitly experimental and may change with the core's scripts.
