v0.2.0 — Foundation
Pre-release
throttle-net v0.2.0 — Composition
The foundation and the differentiators, in one release. v0.2.0 turns the scaffold into a working library: the waiting outbound acquire, the Limiter trait every algorithm and composite shares, and the four ways to compose limits that motivate the crate — hybrid, multi-dimensional, per-key, and layered. All of it shares one correctness rule (peek-then-commit) so a request never spends in one limiter when another would block it. No breaking changes from 0.1; that release was an empty scaffold.
What is throttle-net?
A general-purpose outbound throttling library. Where rate-net protects your service from being overwhelmed (inbound), throttle-net protects your service from overwhelming the downstreams it calls — and from being banned by them. Its defining operation is to wait for capacity rather than reject the caller: you pace your own outbound work. It consumes better-bucket for token-bucket accounting and clock-lib for time, and builds the waiting, cost-aware, composable surface on top.
What's new in 0.2.0
Throttle — the Tier-1 token bucket
The common case in two calls. Construct with per_second / per_duration, then pace your work with acquire().await, which returns as soon as a token is free rather than rejecting you.
use throttle_net::Throttle;
let throttle = Throttle::per_second(100);
throttle.acquire().await?; // waits if necessary
throttle.acquire_with_cost(10).await?; // a heavier request
try_acquire() / try_acquire_with_cost(n) are the non-blocking forms, peek(cost) is a non-consuming check, and with_clock injects a ManualClock for deterministic tests. Accounting is lock-free — one atomic compare-and-swap per acquire, ~27ns uncontended.
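The lock-free accounting can be illustrated with a compare-and-swap loop. This is a minimal sketch and not throttle-net's implementation: `AtomicBucket` and its methods are hypothetical names, and refill over time is left out so that only the CAS step is shown.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Hypothetical stand-in for a token bucket's accounting; not the crate's type.
/// Refill is omitted: tokens only go down.
pub struct AtomicBucket {
    tokens: AtomicU32,
    capacity: u32,
}

impl AtomicBucket {
    pub fn new(capacity: u32) -> Self {
        Self { tokens: AtomicU32::new(capacity), capacity }
    }

    /// Non-consuming check: would `cost` fit right now?
    pub fn peek(&self, cost: u32) -> bool {
        self.tokens.load(Ordering::Acquire) >= cost
    }

    /// Consuming, non-blocking acquire. Without contention the loop body
    /// typically runs once and performs a single compare-and-swap.
    pub fn try_acquire_with_cost(&self, cost: u32) -> bool {
        let mut current = self.tokens.load(Ordering::Acquire);
        loop {
            if current < cost {
                return false; // not enough tokens; nothing was spent
            }
            match self.tokens.compare_exchange_weak(
                current,
                current - cost,
                Ordering::AcqRel,
                Ordering::Acquire,
            ) {
                Ok(_) => return true,
                // Another thread changed the count (or the weak CAS failed
                // spuriously); retry with the value it reported.
                Err(observed) => current = observed,
            }
        }
    }

    pub fn available(&self) -> u32 {
        self.tokens.load(Ordering::Acquire)
    }

    pub fn capacity(&self) -> u32 {
        self.capacity
    }
}
```

A refusal leaves the counter untouched, which is the property the non-blocking forms rely on: a failed try never costs the caller anything.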
Limiter — the trait everything shares
pub trait Limiter: Send + Sync {
    fn peek(&self, cost: u32) -> Decision;         // non-consuming
    fn acquire_cost(&self, cost: u32) -> Decision; // consuming
    fn available(&self) -> u32;
    fn capacity(&self) -> u32;
}
acquire_cost is the synchronous, consuming core; the waiting acquire surfaces are thin layers on top. peek is what makes "must pass all" composition correct: a composite peeks every constituent before committing, so an early limiter never spends a token for a request a later one refuses.
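To make "thin layers on top" concrete, here is one way a waiting acquire could wrap a synchronous core. Everything in it is an assumption made for illustration: the shape of `Decision` (in particular the `retry_after` field), the reduced `SyncLimiter` trait, the `acquire_waiting` helper, and the `DenyFirst` toy limiter are not taken from the crate.

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::time::Duration;

/// Hypothetical decision shape; the crate's `Decision` may carry other data.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Decision {
    Allowed,
    Denied { retry_after: Duration },
}

/// Only the consuming core is needed for this sketch.
pub trait SyncLimiter {
    fn acquire_cost(&self, cost: u32) -> Decision;
}

/// Waiting layer: retry the synchronous core, pausing between attempts.
/// `pause` is injected so the caller decides how to wait (a thread sleep,
/// or a recording closure in tests). Returns the total pause requested.
pub fn acquire_waiting<L, P>(limiter: &L, cost: u32, mut pause: P) -> Duration
where
    L: SyncLimiter,
    P: FnMut(Duration),
{
    let mut waited = Duration::ZERO;
    loop {
        match limiter.acquire_cost(cost) {
            Decision::Allowed => return waited,
            Decision::Denied { retry_after } => {
                pause(retry_after);
                waited += retry_after;
            }
        }
    }
}

/// Toy limiter for demonstration: denies the first `denials` calls.
pub struct DenyFirst {
    denials: AtomicU32,
    retry_after: Duration,
}

impl DenyFirst {
    pub fn new(denials: u32, retry_after: Duration) -> Self {
        Self { denials: AtomicU32::new(denials), retry_after }
    }
}

impl SyncLimiter for DenyFirst {
    fn acquire_cost(&self, _cost: u32) -> Decision {
        // Count down the remaining denials; once they reach zero, admit.
        let denied = self
            .denials
            .fetch_update(Ordering::AcqRel, Ordering::Acquire, |n| n.checked_sub(1))
            .is_ok();
        if denied {
            Decision::Denied { retry_after: self.retry_after }
        } else {
            Decision::Allowed
        }
    }
}
```

Passing `std::thread::sleep` as `pause` gives a blocking acquire; an async version would await a timer at the same point in the loop.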
Hybrid — must pass all constituents
Combine limiters so a request must satisfy every one — "10 per second and 100 per minute" on a single resource, where either ceiling can bind. A Hybrid is itself a Limiter, so hybrids nest.
use std::time::Duration;
use throttle_net::{Hybrid, Throttle};
let hybrid = Hybrid::builder()
    .limiter(Throttle::per_second(10))
    .limiter(Throttle::per_duration(100, Duration::from_secs(60)))
    .build();
The peek-then-commit contract is tested directly: when a later constituent refuses, the earlier ones keep their tokens.
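The peek-then-commit rule can be sketched with a toy composite. This is an illustration under stated assumptions, not the crate's `Hybrid`: the trait below returns `bool` instead of `Decision`, `Fixed` and `AllOf` are hypothetical names, and the sketch ignores the window between peek and commit that a concurrent implementation has to handle.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Simplified, hypothetical trait: `bool` stands in for the crate's `Decision`.
pub trait Limiter {
    fn peek(&self, cost: u32) -> bool;
    fn acquire_cost(&self, cost: u32) -> bool;
    fn available(&self) -> u32;
}

/// Toy limiter with a fixed number of tokens and no refill.
pub struct Fixed {
    tokens: AtomicU32,
}

impl Fixed {
    pub fn new(tokens: u32) -> Self {
        Self { tokens: AtomicU32::new(tokens) }
    }
}

impl Limiter for Fixed {
    fn peek(&self, cost: u32) -> bool {
        self.tokens.load(Ordering::Acquire) >= cost
    }
    fn acquire_cost(&self, cost: u32) -> bool {
        self.tokens
            .fetch_update(Ordering::AcqRel, Ordering::Acquire, |t| t.checked_sub(cost))
            .is_ok()
    }
    fn available(&self) -> u32 {
        self.tokens.load(Ordering::Acquire)
    }
}

/// Composite that admits a request only if every part would admit it.
pub struct AllOf<'a> {
    parts: Vec<&'a dyn Limiter>,
}

impl<'a> AllOf<'a> {
    pub fn new(parts: Vec<&'a dyn Limiter>) -> Self {
        Self { parts }
    }
}

impl Limiter for AllOf<'_> {
    fn peek(&self, cost: u32) -> bool {
        self.parts.iter().all(|p| p.peek(cost))
    }
    fn acquire_cost(&self, cost: u32) -> bool {
        // Phase 1: peek everything. A single refusal spends nothing anywhere.
        if !self.peek(cost) {
            return false;
        }
        // Phase 2: commit. Single-threaded, every commit succeeds because the
        // peek just passed; concurrent callers could still race here.
        self.parts.iter().all(|p| p.acquire_cost(cost))
    }
    fn available(&self) -> u32 {
        // The tightest part is the binding one.
        self.parts.iter().map(|p| p.available()).min().unwrap_or(u32::MAX)
    }
}
```

Because `AllOf` implements the same trait as its parts, one composite can be passed as a part of another, which mirrors the nesting described above.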
MultiLimiter — multi-dimensional budgets
The killer feature for LLM APIs. One call spends against several named budgets at once — requests, input tokens, output tokens — and is admitted only when every dimension can afford its share.
use std::time::Duration;
use throttle_net::{MultiLimiter, Throttle};
let minute = Duration::from_secs(60);
let limiter = MultiLimiter::builder()
    .dimension("requests", Throttle::per_duration(60, minute))
    .dimension("input_tokens", Throttle::per_duration(100_000, minute))
    .dimension("output_tokens", Throttle::per_duration(20_000, minute))
    .build();
limiter
    .acquire_costs(&[("requests", 1), ("input_tokens", 1500), ("output_tokens", 200)])
    .await?;
The runnable examples/llm_budget.rs drives a batch of calls through this and shows the limiter pacing itself when a budget runs low.
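The all-dimensions-or-nothing admission can be sketched without the crate. The `Budgets` type, the `Refusal` enum, and `try_acquire_costs` below are hypothetical and non-waiting; they only illustrate checking every named dimension before debiting any of them, and they assume each dimension is listed at most once per call.

```rust
use std::collections::HashMap;

/// Why a multi-dimensional request was refused (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub enum Refusal {
    UnknownDimension(String),
    Insufficient(String),
}

/// Toy multi-dimensional budget: one remaining count per named dimension.
/// No refill; this only illustrates the admission rule.
pub struct Budgets {
    remaining: HashMap<String, u64>,
}

impl Budgets {
    pub fn new(limits: &[(&str, u64)]) -> Self {
        let remaining = limits
            .iter()
            .map(|(name, limit)| (name.to_string(), *limit))
            .collect();
        Self { remaining }
    }

    /// Admit only if every listed dimension can afford its share.
    /// Assumes each dimension appears at most once in `costs`.
    pub fn try_acquire_costs(&mut self, costs: &[(&str, u64)]) -> Result<(), Refusal> {
        // Pass 1: check every dimension without spending anything.
        for (name, cost) in costs {
            match self.remaining.get(*name) {
                None => return Err(Refusal::UnknownDimension(name.to_string())),
                Some(left) if left < cost => {
                    return Err(Refusal::Insufficient(name.to_string()));
                }
                Some(_) => {}
            }
        }
        // Pass 2: every check passed, so debit all dimensions.
        for (name, cost) in costs {
            if let Some(left) = self.remaining.get_mut(*name) {
                *left -= cost;
            }
        }
        Ok(())
    }

    pub fn remaining(&self, name: &str) -> Option<u64> {
        self.remaining.get(name).copied()
    }
}
```

In the refused case the request count is not debited even though that dimension had room, so a call blocked on tokens does not also burn a request slot.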
PerKey — independent state per key, bounded memory
Each key — a tenant, a user, a token — gets its own bucket, so one noisy key cannot spend another's budget. State lives in a sharded concurrent map: an existing key's acquire takes only a shard read lock plus the bucket's atomic accounting, so unrelated keys never contend. K is any hashable type.
use throttle_net::PerKey;
let limiter: PerKey<String> = PerKey::per_second(100);
limiter.acquire(&"tenant:42".to_string()).await?;
Memory is bounded by Eviction — an idle TTL and/or a hard key cap, enforced lazily and per-shard, so a flood of unique keys hits a ceiling instead of growing without limit. The default policy is bounded. A populated 10 000-key lookup measures ~70ns.
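The bounded-memory idea (idle TTL plus a hard key cap, enforced lazily) can be sketched on a single unsharded map. This is an assumption-laden illustration, not the crate's `PerKey` or `Eviction`: `KeyedBuckets` is a hypothetical name, time is a caller-supplied tick count, buckets never refill, and evicting the stalest key at the cap is one possible policy chosen for the sketch.

```rust
use std::collections::HashMap;
use std::hash::Hash;

struct Entry {
    tokens: u32,
    last_seen: u64,
}

/// Toy per-key limiter with lazy eviction (hypothetical, single-threaded).
pub struct KeyedBuckets<K> {
    per_key: u32,
    idle_ttl: u64,
    max_keys: usize,
    entries: HashMap<K, Entry>,
}

impl<K: Hash + Eq + Clone> KeyedBuckets<K> {
    pub fn new(per_key: u32, idle_ttl: u64, max_keys: usize) -> Self {
        assert!(max_keys > 0, "the key cap must allow at least one key");
        Self { per_key, idle_ttl, max_keys, entries: HashMap::new() }
    }

    /// Spend one token from `key`'s own bucket at time `now` (in ticks).
    pub fn try_acquire(&mut self, key: &K, now: u64) -> bool {
        if !self.entries.contains_key(key) {
            // Eviction runs lazily, only when a new key needs a slot.
            self.make_room(now);
            let fresh = Entry { tokens: self.per_key, last_seen: now };
            self.entries.insert(key.clone(), fresh);
        }
        let entry = self.entries.get_mut(key).expect("present or just inserted");
        entry.last_seen = now;
        if entry.tokens == 0 {
            return false;
        }
        entry.tokens -= 1;
        true
    }

    fn make_room(&mut self, now: u64) {
        // First drop keys that have been idle longer than the TTL.
        let ttl = self.idle_ttl;
        self.entries.retain(|_, e| now.saturating_sub(e.last_seen) <= ttl);
        // Then enforce the hard cap by evicting the stalest remaining keys.
        while self.entries.len() >= self.max_keys {
            let stalest = self
                .entries
                .iter()
                .min_by_key(|(_, e)| e.last_seen)
                .map(|(k, _)| k.clone());
            match stalest {
                Some(k) => {
                    self.entries.remove(&k);
                }
                None => break,
            }
        }
    }

    pub fn key_count(&self) -> usize {
        self.entries.len()
    }
}
```

A flood of unique keys keeps `key_count()` at or below `max_keys`, at the price of resetting whichever key is evicted; a sharded version would apply the same steps to each shard independently.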
Layered — ordered scopes
Stack a global ceiling, a per-key share, and a per-endpoint cap; a request must clear every configured scope. The key and endpoint types are independent (a numeric tenant id and a string route, say).
use throttle_net::{Layered, PerKey, Throttle};
let layered = Layered::<String>::builder()
    .global(Throttle::per_second(1000))
    .per_key(PerKey::per_second(100))
    .per_endpoint(PerKey::per_second(50))
    .build();
layered.acquire(&"tenant:42".to_string(), &"/v1/chat".to_string()).await?;
Property tests and benchmarks
tests/proptests.rs encodes the defining invariant — a limiter never admits more than its capacity, and a composite never admits more than its binding scope — as an equality checked over a wide input space, for Throttle, Hybrid, PerKey, Layered, and MultiLimiter, plus the per-key flood bound. benches/throttle_bench.rs tracks the single-throttle acquire and the 10k-key per-key lookup.
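The shape of such an invariant can be shown without a property-testing framework. The sketch below is not taken from tests/proptests.rs: `Bucket`, `Lcg`, and `check_admits_exactly_min` are hypothetical, a small linear congruential generator stands in for a real input generator, and the bucket has no refill, so the number of admitted unit-cost requests must equal min(requests, capacity).

```rust
/// Toy bucket without refill (hypothetical).
pub struct Bucket {
    tokens: u32,
}

impl Bucket {
    pub fn new(capacity: u32) -> Self {
        Self { tokens: capacity }
    }

    pub fn try_acquire(&mut self) -> bool {
        if self.tokens == 0 {
            false
        } else {
            self.tokens -= 1;
            true
        }
    }
}

/// Small linear congruential generator so the sketch needs no external crates.
pub struct Lcg(pub u64);

impl Lcg {
    /// Pseudo-random value in `0..bound`; `bound` must be non-zero.
    pub fn next_below(&mut self, bound: u32) -> u32 {
        self.0 = self
            .0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        ((self.0 >> 33) as u32) % bound
    }
}

/// Checks the invariant over `cases` generated inputs and returns how many
/// were checked. Panics on the first violation.
pub fn check_admits_exactly_min(cases: u32, seed: u64) -> u32 {
    let mut rng = Lcg(seed);
    for _ in 0..cases {
        let capacity = rng.next_below(1_000);
        let requests = rng.next_below(2_000);
        let mut bucket = Bucket::new(capacity);
        let admitted = (0..requests).filter(|_| bucket.try_acquire()).count() as u32;
        // Equality, not just an upper bound: the bucket must neither
        // over-admit nor refuse while tokens remain.
        assert_eq!(
            admitted,
            requests.min(capacity),
            "capacity={capacity} requests={requests}"
        );
    }
    cases
}
```

Stating the invariant as an equality catches both failure directions: admitting past capacity and refusing while capacity remains.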
Breaking changes
None. v0.1.0 was an empty scaffold; this is the first release with a public surface.
Verification
Run on Windows x86_64, Rust stable 1.93.1; the same commands run in the CI matrix on Linux, macOS, and Windows across stable and MSRV 1.85:
cargo fmt --all -- --check
cargo clippy --all-targets --all-features -- -D warnings
cargo clippy --no-default-features
cargo test --all-features
cargo test --no-default-features
RUSTDOCFLAGS="-D warnings" cargo doc --no-deps --all-features
cargo bench --bench throttle_bench
cargo deny check
cargo audit
All green. Counts at this tag (--all-features): 53 unit tests, 7 property tests, 45 doctests. Benchmarks: throttle_try_acquire ~27ns, perkey_lookup_10k_existing ~70ns (target: < 1µs).
What's next
- v0.3.0 — Retry and backoff. Standalone retry that also composes with limiters: constant / linear / exponential backoff with Full, Equal, and Decorrelated jitter; retry-on / give-up-on conditions; Retry-After header parsing honored over the computed backoff.
Installation
[dependencies]
throttle-net = "0.2"
MSRV: Rust 1.85.
Documentation
Changelog: CHANGELOG.md.