Polyglot and Python

Polyglot — one core, every language

ThrottleKit is a Node library, but you don't have to be on Node to use it. A layered-hybrid design lets a Python (or any-language) service reach the same limiter — the same decisions, bit-for-bit — without re-implementing a single algorithm.

The load-bearing rule: exactly one thing computes a Decision — the Node core, directly or as Lua-in-Redis. Every other surface is a thin pipe, conformance-checked against one set of language-neutral golden vectors. There is no second rate limiter to keep in sync and no float-determinism risk.

Two doors

| Door | Package | Decision computed in | Reach |
| --- | --- | --- | --- |
| Service | `throttlekit-server` (a gRPC service) | the service (= the core) | the full surface: rate, cost, concurrency, unified |
| Direct | the client runs the core's vendored Lua against the same Redis your Node fleet uses | Lua-in-Redis (the core's own script) | `check` only: one hop, no extra service |

Both prove themselves against the golden vectors; neither re-derives a decision.
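
To make the conformance idea concrete, here is a minimal sketch of a golden-vector replay loop. It is illustrative only: the vector layout (`name`, `request`, `expected`), the `replay` helper, and the counter-based stand-in surface are all hypothetical, not ThrottleKit's actual vector schema or test harness.

```python
import json

# Hypothetical golden vectors: each pairs a request with the oracle's reply.
# The field names are assumptions, not ThrottleKit's real vector schema.
GOLDEN_VECTORS = json.loads("""
[
  {"name": "first-hit", "request": {"key": "a", "now_ms": 0},
   "expected": {"allowed": true, "remaining": 1}},
  {"name": "second-hit", "request": {"key": "a", "now_ms": 1},
   "expected": {"allowed": true, "remaining": 0}},
  {"name": "third-hit-denied", "request": {"key": "a", "now_ms": 2},
   "expected": {"allowed": false, "remaining": 0}}
]
""")


def replay(vectors, surface):
    """Send each request through `surface` and collect field-level mismatches."""
    mismatches = []
    for vec in vectors:
        got = surface(vec["request"])
        for field, want in vec["expected"].items():
            if got.get(field) != want:
                mismatches.append((vec["name"], field, want, got.get(field)))
    return mismatches


def make_stand_in_surface(limit=2):
    """A stand-in for a real door (service or Redis): a fixed counter per key."""
    counts = {}

    def surface(request):
        used = counts.get(request["key"], 0)
        if used < limit:
            counts[request["key"]] = used + 1
            return {"allowed": True, "remaining": limit - used - 1}
        return {"allowed": False, "remaining": 0}

    return surface


mismatches = replay(GOLDEN_VECTORS, make_stand_in_surface())
print("conformant" if not mismatches else mismatches)
```

The point of the shape is that the vectors are plain data: any language can load them and compare replies field by field, without containing a limiter of its own.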

The service door — `throttlekit-server`

A small standalone service (depends on the published throttlekit + @grpc/grpc-js). It loads a .throttlekit.yaml, runs the core, and answers the throttlekit.proto contract (throttlekit.v1.RateLimiter). A denial is a normal decision (allowed: false), never an RPC error — operational faults map to gRPC status codes (NOT_FOUND / UNIMPLEMENTED / UNAVAILABLE).

npx throttlekit-server --config .throttlekit.yaml --port 50051
#   pick a shared store for a coordinated fleet (omit for a single-instance in-process memory store):
#     --redis redis://…                             Redis
#     --postgres-url postgres://user:pass@host/db   Postgres (no Redis needed)
#     --store dynamodb --dynamodb-table tk          DynamoDB (+ --dynamodb-create-table to provision)
#   --tls-cert/--tls-key/--tls-ca for mTLS

The server is store-agnostic behind a pluggable resolver — --store memory|redis|postgres|dynamodb (inferred from the URL flags when omitted). The client sends the same requests regardless of backend; the core computes every decision server-side, so they stay bit-identical. (Deno KV and Cloudflare D1 / Durable Objects / Workers KV are edge-runtime stores — reachable only inside those runtimes, not through the Node service door.)
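
The inference step can be pictured roughly as follows. This is a sketch under stated assumptions: the flag names mirror the command above, but `infer_store`, its dict keys, and the behavior when both URL flags are present are hypothetical (the page does not say how the real resolver handles that case, so this version simply refuses to guess).

```python
KNOWN_STORES = {"memory", "redis", "postgres", "dynamodb"}


def infer_store(flags):
    """Pick a store name from parsed CLI flags (a plain dict).

    Hypothetical logic: an explicit --store wins; otherwise the store is
    inferred from whichever URL flag is present; with none, use memory.
    """
    explicit = flags.get("store")
    if explicit is not None:
        if explicit not in KNOWN_STORES:
            raise ValueError(f"unknown store: {explicit}")
        return explicit

    has_redis = bool(flags.get("redis"))
    has_postgres = bool(flags.get("postgres_url"))
    if has_redis and has_postgres:
        # Assumption of this sketch only: ambiguity is an error.
        raise ValueError("both --redis and --postgres-url given; pass --store")
    if has_redis:
        return "redis"
    if has_postgres:
        return "postgres"
    return "memory"  # single-instance, in-process


print(infer_store({}))
print(infer_store({"redis": "redis://localhost:6379"}))
print(infer_store({"store": "dynamodb", "dynamodb_table": "tk"}))
```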

# .throttlekit.yaml — one file, every axis
version: 1
limiters:
  api:                      # rate (the base axis)
    { strategy: gcra, limit: 1000, period: 1m, burst: 100 }
  leased:                   # two-tier leased (cut the per-request L2 round trip)
    strategy: gcra
    limit: 1000
    period: 1m
    twoTier: { mode: leased, batch: 100, windowCoupled: true }
  completions:              # cost axis — a windowed token budget (LLM gateway)
    tokenBudget: { budget: 100000, windowMs: 60000 }
  checkout:                 # concurrency — at most N requests in flight
    concurrency: { maxLimit: 64 }
  unified:                  # rate AND concurrency, whichever binds first
    strategy: gcra
    limit: 1000
    period: 1m
    concurrency: { maxLimit: 64 }
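
For readers unfamiliar with `strategy: gcra`, the following is a textbook GCRA sketch over integer milliseconds, using the numbers from the `api` policy. It is not ThrottleKit's implementation. In particular, it assumes `burst` means the maximum number of back-to-back requests, which this page does not state, and the `GcraSketch` class and its reply fields are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class GcraSketch:
    """Textbook GCRA over integer milliseconds; not ThrottleKit's own code."""

    limit: int       # requests per period
    period_ms: int
    burst: int       # assumed meaning: max back-to-back requests
    _tat: dict = field(default_factory=dict)  # key -> theoretical arrival time

    def check(self, key, now_ms):
        interval = self.period_ms // self.limit          # emission interval
        tat = max(self._tat.get(key, now_ms), now_ms)
        new_tat = tat + interval
        if new_tat - now_ms > interval * self.burst:
            # Over the burst tolerance: deny and leave the state untouched.
            retry_after = new_tat - interval * self.burst - now_ms
            return {"allowed": False, "retry_after_ms": retry_after}
        self._tat[key] = new_tat
        return {"allowed": True, "retry_after_ms": 0}


# 1000 per minute gives a 60 ms emission interval; burst is 100.
api = GcraSketch(limit=1000, period_ms=60_000, burst=100)
results = [api.check("k", now_ms=0)["allowed"] for _ in range(101)]
print(results.count(True))                    # 100
denied = api.check("k", now_ms=0)
print(denied["retry_after_ms"])               # 60
print(api.check("k", now_ms=60)["allowed"])   # True
```

Working in whole milliseconds keeps the arithmetic exact, which is one simple way to sidestep the float-determinism concern mentioned earlier.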

The proto is additive-evolvable and is the stable polyglot contract; the raw Lua wire is behavior-locked but deliberately not frozen (it can change with the core's scripts). See research/polyglot/DESIGN.md.

The Python client — `throttlekit-py`

Installed as throttlekit-py, imported as throttlekit (PyPI's throttlekit is an unrelated project):

pip install throttlekit-py            # the gRPC ServiceBackend
pip install "throttlekit-py[redis]"   # + a redis client for the direct RedisBackend

Every axis is reachable. A denial is always a normal Decision/Admission, never an exception.

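from throttlekit import ServiceBackend


def handle(rl, api_key, tenant, user_id, n, do_work):
    """Run one request through each axis and return an HTTP status code."""
    # Rate: the base axis (also check_many / peek / forecast)
    if not rl.check("api", api_key).allowed:
        return 429

    # Cost: debit the actual tokens a stream produces (the LLM-gateway problem)
    rl.debit("completions", tenant, tokens=n)

    # Concurrency / unified: hold an in-flight slot for the duration of the work
    with rl.admit("checkout", user_id) as adm:
        if not adm.allowed:
            return 429                 # adm.binding_axis names the axis that bound it
        do_work()                      # released on exit (dropped=True if it raises)
    return 200


# Placeholder values; substitute your own identifiers and workload.
with ServiceBackend("localhost:50051") as rl:
    status = handle(rl, api_key="key-123", tenant="acme", user_id="u-1",
                    n=512, do_work=lambda: None)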

| Axis | Python | Notes |
| --- | --- | --- |
| Rate | `check` / `check_many` / `peek` / `forecast` | the base limiter; `peek`/`forecast` are service-door only |
| Two-tier leased | `check` (transparent) | the policy is configured as `twoTier` server-side; the client just calls `check` |
| Cost / token budget | `debit(policy, key, tokens)` | windowed budget; per-token debiting overshoots by 0 |
| Concurrency / unified | `admit(policy, key) → Admission` | holds a crash-safe lease; `heartbeat=True` for long holds |

Crash safety for admit: a granted admission holds a server lease; if the client crashes without releasing, the server reclaims the slot once the lease TTL lapses without a heartbeat — the same node↔coordinator contract the core uses for distributed concurrency, one layer out.
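
The reclaim rule can be illustrated with a small in-memory model. This is only a sketch of the general lease-with-TTL technique: `LeaseTable`, its method names, and the 1000 ms TTL are hypothetical and say nothing about how the server actually stores leases.

```python
class LeaseTable:
    """In-memory stand-in for server-side concurrency leases (hypothetical)."""

    def __init__(self, max_limit, ttl_ms):
        self.max_limit = max_limit
        self.ttl_ms = ttl_ms
        self._expires = {}   # lease_id -> expiry time in ms
        self._next_id = 0

    def _reclaim(self, now_ms):
        # Drop every lease whose TTL lapsed without a heartbeat.
        self._expires = {i: t for i, t in self._expires.items() if t > now_ms}

    def admit(self, now_ms):
        self._reclaim(now_ms)
        if len(self._expires) >= self.max_limit:
            return None                       # denied: no free slot
        self._next_id += 1
        self._expires[self._next_id] = now_ms + self.ttl_ms
        return self._next_id

    def heartbeat(self, lease_id, now_ms):
        self._reclaim(now_ms)
        if lease_id not in self._expires:
            return False                      # already reclaimed
        self._expires[lease_id] = now_ms + self.ttl_ms
        return True

    def release(self, lease_id):
        self._expires.pop(lease_id, None)


table = LeaseTable(max_limit=1, ttl_ms=1000)
crashed = table.admit(now_ms=0)        # client takes the only slot, then crashes
blocked = table.admit(now_ms=500)      # still within the TTL: slot stays held
reclaimed = table.admit(now_ms=1000)   # TTL lapsed: the slot is handed out again
print(crashed, blocked, reclaimed)     # 1 None 2
```

A long-running holder would call `heartbeat` before each expiry to push the deadline forward, which is the role `heartbeat=True` plays for long holds.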

The direct Redis door runs the core's vendored Lua against the same Redis a Node fleet shares. Its conformance suite replays the full rate-limit golden vectors through real Redis and reproduces the Node oracle bit-for-bit:

import redis
from throttlekit import RedisBackend, Gcra

api_key = "key-123"      # placeholder caller identity

api = RedisBackend(redis.Redis.from_url("redis://localhost:6379"),
                   Gcra(limit=100, period_ms=60_000, burst=20), prefix="prod")
d = api.check(api_key)   # decided server-side, in Lua; same key scheme as the Node core

How it stays in lock-step

throttlekit-py vendors the contract from the core with checksums (scripts/sync_contract.py): the throttlekit.proto, the golden vectors, and the runtime Lua (with the core's manifest.json). A drift-gate test fails if any vendored byte diverges from its pinned checksum, and the real proof is behavioral — the cross-language conformance suite replays every rate-limit vector through Python → vendored Lua → real Redis and asserts each reply field equals the Node oracle.
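
A drift gate of this kind can be sketched with nothing but `hashlib`. The version below is an assumption-laden illustration: `find_drift`, the pin layout (a dict of relative path to SHA-256 hex digest), and the demo file names are hypothetical, and the real `manifest.json` format is not described on this page.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path):
    """Hex SHA-256 of a file's raw bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def find_drift(vendor_dir, pins):
    """Return the vendored files whose bytes no longer match their pinned hash."""
    vendor_dir = Path(vendor_dir)
    drifted = []
    for rel_path, pinned in sorted(pins.items()):
        target = vendor_dir / rel_path
        if not target.is_file() or sha256_of(target) != pinned:
            drifted.append(rel_path)
    return drifted


# Demo on throwaway files; the names and contents are illustrative only.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "throttlekit.proto").write_text('syntax = "proto3";\n')
    (root / "check.lua").write_text("return 1\n")
    pins = {name: sha256_of(root / name)
            for name in ("throttlekit.proto", "check.lua")}

    clean = find_drift(root, pins)
    (root / "check.lua").write_text("return 2\n")   # simulate a local edit
    dirty = find_drift(root, pins)

print(clean, dirty)   # [] ['check.lua']
```

A test would simply assert that `find_drift` returns an empty list. The checksum only proves the bytes are unchanged; the behavioral replay is what proves they still produce the oracle's answers.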

Fleet-coordinated features — over the existing RPCs

The Tier-1 distributed features reach Python with no client change — they ride the server's existing decision RPCs, so a stock ServiceBackend gets them just by pointing at a fleet-configured policy: federated: / federatedFairEscrow: over check, fleetBudget: over debit, and distributedConcurrency: over admit. The coordination lives server-side; the client just makes the same call it always did. On top of that the server adds two new additive services with first-class Python clients (0.5.0): the Fleet lease door (Fleet.Reserve — a client-held window-coupled lease, the Tier-2 path) and the read-only Monitor door (GetSnapshot / Watch, the programmable observability surface — see Operations). Full picture in Scaling & the Fleet.
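
The idea behind a client-held lease can be sketched as a local permit pool that is refilled in batches. Everything here is hypothetical: `Coordinator`, `LeasedClient`, and the reading of "window-coupled" as "unused permits expire with the rate window" are assumptions made for illustration, not the Fleet.Reserve API or its semantics.

```python
class Coordinator:
    """Stand-in for the shared budget: a fixed window with a global limit."""

    def __init__(self, limit, window_ms):
        self.limit = limit
        self.window_ms = window_ms
        self._window = None
        self._granted = 0

    def reserve(self, want, now_ms):
        window = now_ms // self.window_ms
        if window != self._window:              # new window: budget resets
            self._window, self._granted = window, 0
        grant = min(want, self.limit - self._granted)
        self._granted += grant
        expires_ms = (window + 1) * self.window_ms
        return grant, expires_ms


class LeasedClient:
    """Spends permits locally and only calls the coordinator per batch."""

    def __init__(self, coordinator, batch):
        self.coordinator = coordinator
        self.batch = batch
        self._permits = 0
        self._expires_ms = 0
        self.round_trips = 0

    def check(self, now_ms):
        if now_ms >= self._expires_ms:
            self._permits = 0   # assumed coupling: permits die with the window
        if self._permits == 0:
            self._permits, self._expires_ms = self.coordinator.reserve(
                self.batch, now_ms)
            self.round_trips += 1
        if self._permits == 0:
            return False
        self._permits -= 1
        return True


client = LeasedClient(Coordinator(limit=1000, window_ms=60_000), batch=100)
allowed = sum(client.check(now_ms=0) for _ in range(250))
print(allowed, client.round_trips)   # 250 3
```

In this model 250 decisions cost three coordinator calls instead of 250, at the price of permits that sit unused in a client until the window ends.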

Status

The polyglot surface is experimental (alpha). The .proto is the stable, additive-only contract (Monitor + Fleet are new additive services; the decision RPCs are unchanged); the raw Lua wire ships frozen: false, so the direct RedisBackend may change with the core's scripts. The single-instance stateful axes (cost/concurrency) still work standalone; the fleet-coordinated variants above are reachable through the same client API.
