Research

GALE & TALE — the proven guarantees

ThrottleKit's distributed guarantees are not folklore: they come from two bodies of engineering developed alongside the library, GALE and TALE. Both are proven and measured. The proofs and measurements are gated under `research/` and `test/`, and the tables below mark the pieces that ship in the public API. To reproduce, run `npx vitest run test/gale` and `npx vitest run test/cost`.

GALE — Globally-Accounted Learned Escrow

The first distributed rate limiter with a hard, tight overshoot bound independent of fleet size. Four pillars and a capstone, each machine-checked or measured:

| Pillar | Result | Status |
| --- | --- | --- |
| 1: safety | Window-coupled overshoot `= L`, independent of N | Shipped as `lease.windowCoupled`; TLA⁺ + exhaustive BFS twin + discrete-event sim (Δ=0 for N→512 under latency/partitions) |
| 2: efficiency | Online-EOQ lease sizing, `O(√T)` regret | Shipped as `leaseSizer`; measured (avg regret/round 18.6 → 0.40) |
| 3: predictions | Learning-augmented sizing; consistency + robustness, safety unconditional | Shipped as `predictiveLeaseSizer`; measured |
| 4: fairness | Weighted Fair Escrow (work-conserving multi-tenant fairness) | Shipped as `weightedMaxMin` / `weightedFairShare`; 4 theorems machine-checked on 20k instances |
| Capstone | The rate-limiting trilemma `Δ + N·U ≥ (N−1)L`, tight, + a `0<C<N` partial-coordination interpolation | Proven + machine-checked (N ∈ {2,3,4}) |
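
To make the fairness row concrete, here is a minimal sketch of textbook weighted max-min allocation (progressive filling). It is not ThrottleKit's `weightedMaxMin` implementation or signature: the `Tenant` type and `weightedMaxMinSketch` function are hypothetical names, and the sketch assumes every tenant's demand and weight are known up front for a single window.

```typescript
// Hypothetical illustration only; not the library's weightedMaxMin API.
interface Tenant {
  id: string;
  weight: number; // relative share, assumed > 0
  demand: number; // credits the tenant wants in this window
}

function weightedMaxMinSketch(capacity: number, tenants: Tenant[]): Map<string, number> {
  const alloc = new Map<string, number>();
  for (const t of tenants) alloc.set(t.id, 0);

  let remaining = capacity;
  let active = tenants.filter((t) => t.demand > 0 && t.weight > 0);

  while (active.length > 0 && remaining > 1e-9) {
    const totalWeight = active.reduce((sum, t) => sum + t.weight, 0);
    const perWeight = remaining / totalWeight;

    // Tenants whose demand fits inside their weighted share get all of it.
    const satisfied = active.filter((t) => t.demand <= perWeight * t.weight);

    if (satisfied.length === 0) {
      // Everyone left is bottlenecked: split the remainder by weight and stop.
      for (const t of active) alloc.set(t.id, perWeight * t.weight);
      remaining = 0;
      break;
    }

    // Freed-up share from satisfied tenants is redistributed next round.
    for (const t of satisfied) {
      alloc.set(t.id, t.demand);
      remaining -= t.demand;
    }
    active = active.filter((t) => !satisfied.includes(t));
  }
  return alloc;
}

// Usage: 100 credits, three tenants, tenant "c" has double weight.
const shares = weightedMaxMinSketch(100, [
  { id: "a", weight: 1, demand: 10 },
  { id: "b", weight: 1, demand: 60 },
  { id: "c", weight: 2, demand: 100 },
]);
console.log(Object.fromEntries(shares));
```

The allocation is work-conserving in the sense that it hands out the smaller of the capacity and the total demand: a tenant that wants less than its share releases the difference to the others instead of stranding it.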

The crux insight: stranded capacity is overshoot debt. Both are held-but-unused credits that survive the L2 window boundary, so minimizing them tightens overshoot and raises utilization at the same time. The only real tension is between holding few credits and paying coordination cost, which the trilemma makes precise. Write-ups are under research/gale/.
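
The tension between holding few credits and paying for coordination has the same shape as the classic economic order quantity (EOQ) trade-off. The sketch below uses the static textbook EOQ formula, not the online variant behind `leaseSizer`. The function names and the three cost parameters are assumptions made for illustration: small leases mean frequent round trips to the coordinator, large leases mean more credits sitting idle on a node.

```typescript
// Hypothetical illustration of static EOQ sizing; not the library's leaseSizer.
// demandRate:       credits consumed per second on this node
// coordinationCost: cost of one lease round trip to the coordinator
// holdingCost:      cost per held-but-unused credit per second
function eoqLeaseSize(demandRate: number, coordinationCost: number, holdingCost: number): number {
  if (demandRate <= 0 || coordinationCost <= 0 || holdingCost <= 0) {
    throw new RangeError("all parameters must be positive");
  }
  // Classic EOQ: Q* = sqrt(2 * D * K / h)
  return Math.sqrt((2 * demandRate * coordinationCost) / holdingCost);
}

// Cost per second of using a fixed lease size: renewals plus average idle credits.
function leaseCostPerSecond(
  leaseSize: number,
  demandRate: number,
  coordinationCost: number,
  holdingCost: number,
): number {
  const renewalsPerSecond = demandRate / leaseSize;
  const averageHeldCredits = leaseSize / 2;
  return renewalsPerSecond * coordinationCost + averageHeldCredits * holdingCost;
}

// Usage: compare the EOQ size against half and double that size.
const best = eoqLeaseSize(200, 1, 0.01);
for (const size of [best / 2, best, best * 2]) {
  console.log(size, leaseCostPerSecond(size, 200, 1, 0.01));
}
```

At the EOQ size the renewal cost and the holding cost are equal, which is why both halving and doubling the lease raise the total.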

TALE — Temporally-Accounted Learned Escrow

The cost-axis sibling of GALE: token-budget rate limiting for LLMs, where a request's cost — its output tokens — is unknown at admission and revealed only as it streams. A reserve-then-reconcile escrow in three layers:

| Layer | Result | Status |
| --- | --- | --- |
| 1: streaming meter | Overshoot `≤ g−1` (0 at `g=1`), independent of `max_tokens` | Shipped as `tokenBudget`; measured (vs reserve-max util collapse 0.77→0) |
| 2: learned reservation | Online newsvendor critical-fractile quantile, `O(√T)` regret | Shipped as `learnedReservation`; measured (avg pinball regret 8.49 → 2.77) |
| 3: predictions-with-safety | Rank predictor + Hedge; safety unconditional | Shipped as `predictiveReservation`; measured (overshoot 0 under any predictor) |
| Distributed | Multi-gateway TPM = GALE leased budget (token unit) | Byte-identical to GALE `simulateWindowCoupled`, ∀ gateways C ∈ {1..32} |
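
One way to read the Layer 1 bound is as a meter that checks the budget only every `g` streamed tokens. The sketch below is a toy single-stream model under that assumption, with an integer token budget; `streamWithMeter` is a hypothetical name, not the `tokenBudget` API, and the real meter may define `g` differently.

```typescript
// Hypothetical toy model; not the library's tokenBudget implementation.
// budget:       integer tokens left in the window
// grain:        the budget is checked once every `grain` tokens (g in the table)
// wantedTokens: how many tokens the request would emit if never cut off
function streamWithMeter(
  budget: number,
  grain: number,
  wantedTokens: number,
): { emitted: number; overshoot: number } {
  if (!Number.isInteger(grain) || grain < 1) {
    throw new RangeError("grain must be an integer >= 1");
  }
  let emitted = 0;
  while (emitted < wantedTokens) {
    // Budget is consulted only at chunk boundaries; stop once it is used up.
    if (emitted % grain === 0 && emitted >= budget) break;
    emitted += 1;
  }
  return { emitted, overshoot: Math.max(0, emitted - budget) };
}

// Usage: the same budget metered at two grains, with a very long request.
console.log(streamWithMeter(9, 4, 1_000_000)); // coarse checks
console.log(streamWithMeter(9, 1, 1_000_000)); // per-token checks
```

In this model the stream can run past the budget only until the next check, so the excess never reaches a full chunk, and it does not depend on how long the request wanted to run. With `grain` set to 1 the stream is cut off exactly at the budget.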

The unification: GALE escrows across placement (which node spends a shared budget); TALE escrows across cost (how much a single request spends). They are the same reserve → meter-actuals → reconcile mechanism — TALE's learned layers are literally GALE's retargeted onto the cost axis, and the multi-gateway form reduces to GALE leasing token-for-token. Write-up under research/cost-uncertainty/.
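
The reserve, meter-actuals, reconcile loop on the cost axis can be sketched with a standard online quantile estimator: a subgradient step on the pinball loss, targeting the newsvendor critical fractile, with a step size that decays as 1/√t (the usual schedule for `O(√T)` regret in online convex optimization). All names here are hypothetical and the cost model is an assumption; this is not the `learnedReservation` API.

```typescript
// Hypothetical illustration; not the library's learnedReservation.

// Newsvendor critical fractile: the quantile that balances the cost of
// reserving too little (underCost) against reserving too much (overCost).
function criticalFractile(underCost: number, overCost: number): number {
  return underCost / (underCost + overCost);
}

class QuantileReservationSketch {
  private estimate: number;
  private readonly fractile: number;
  private readonly baseStep: number;
  private rounds = 0;

  constructor(initialEstimate: number, fractile: number, baseStep: number) {
    if (fractile <= 0 || fractile >= 1) throw new RangeError("fractile must be in (0, 1)");
    this.estimate = initialEstimate;
    this.fractile = fractile;
    this.baseStep = baseStep;
  }

  // Current quantile estimate, exposed for inspection.
  get current(): number {
    return this.estimate;
  }

  // Reserve: tokens to hold in escrow for the next request.
  reserve(): number {
    return Math.max(0, Math.ceil(this.estimate));
  }

  // Reconcile: once the true cost is known, take one pinball-loss
  // subgradient step with step size baseStep / sqrt(t).
  reconcile(actualTokens: number): void {
    this.rounds += 1;
    const step = this.baseStep / Math.sqrt(this.rounds);
    if (actualTokens > this.estimate) {
      this.estimate += step * this.fractile; // reserved too little
    } else {
      this.estimate -= step * (1 - this.fractile); // reserved too much
    }
  }
}

// Usage: under-reserving is assumed 9x as costly as over-reserving.
const reservation = new QuantileReservationSketch(0, criticalFractile(9, 1), 50);
for (const actual of [120, 80, 95, 300, 110, 90]) {
  const held = reservation.reserve(); // reserve
  // ... stream the response and meter the actual tokens ...
  reservation.reconcile(actual); // reconcile
  console.log({ held, actual, next: reservation.reserve() });
}
```

A higher fractile makes the estimate climb quickly after an under-reservation and drift down slowly after an over-reservation, which is the asymmetric behavior the newsvendor framing calls for.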

Relationship to the shipped library

You do not need to read any of this to use ThrottleKit. The research exists so that the library's distributed claims are provable, and most of it now ships as first-class API: `lease.windowCoupled` (GALE Pillar 1), the adaptive lease sizers `leaseSizer` / `predictiveLeaseSizer` (Pillars 2–3), `weightedFairShare` / `weightedMaxMin` (Pillar 4), and on the cost axis `tokenBudget` / `learnedReservation` / `predictiveReservation` (TALE Layers 1–3). What remains validation-only is the theory behind them, namely the trilemma lower bounds and the at-scale discrete-event simulator, which justify the design rather than providing code to call. See also docs/FORMAL-MODEL.md.

ThrottleKit · MIT · 1.0 — API frozen under SemVer (Stability)
