Releases: jamesgober/throttle-net
v1.0.0 — Stable API
throttle-net v1.0.0 — Stable
The outbound throttling and resilience library is stable. After the 0.x series built and hardened it, v1.0.0 freezes the public API until 2.0. One library replaces the four you assemble today — a token bucket, a backoff loop, a circuit breaker, and per-provider header parsing — and adds the parts nobody else ships: multi-dimensional cost-aware limits and adaptive throttling. No functional change from 0.9.0; this release is the stability commitment.
What is throttle-net?
Where rate-net protects your service from being overwhelmed (inbound), throttle-net protects your service from overwhelming the downstreams it calls — and from being banned by them. The defining operation is to wait for capacity, not to reject: you pace your own outbound work rather than dropping someone else's request.
The common case is one builder and one acquire().await?:
# async fn run() -> Result<(), throttle_net::ThrottleError> {
use throttle_net::Throttle;
let throttle = Throttle::per_second(100);
throttle.acquire().await?; // returns as soon as a token is free
// ... call the downstream ...
# Ok(())
# }
The hard cases (LLM token budgets, per-tenant quotas, adaptive backpressure) are first-class.
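To make "wait for capacity, not reject" concrete, here is a minimal std-only sketch of the arithmetic a waiting limiter needs: instead of returning an error when the bucket is empty, it answers how long the caller should sleep. The function name `time_until_available` and its parameters are illustrative assumptions, not throttle-net's API or internals.

```rust
use std::time::Duration;

/// Hypothetical helper: how long a caller must wait before `cost` tokens
/// are available, given `tokens_now` tokens and a steady refill rate.
/// Returns `None` when the rate is zero (capacity would never arrive).
pub fn time_until_available(tokens_now: u64, cost: u64, rate_per_sec: u64) -> Option<Duration> {
    if tokens_now >= cost {
        return Some(Duration::ZERO); // capacity is free: proceed at once
    }
    if rate_per_sec == 0 {
        return None;
    }
    let deficit = u128::from(cost - tokens_now);
    let rate = u128::from(rate_per_sec);
    // Round up so the caller never wakes before the tokens exist.
    let nanos = (deficit * 1_000_000_000 + rate - 1) / rate;
    Some(Duration::new(
        (nanos / 1_000_000_000) as u64,
        (nanos % 1_000_000_000) as u32,
    ))
}
```

A waiting acquire would sleep for the returned duration and then retry; a rejecting limiter would turn the same non-zero answer into an error.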
The stable surface
Everything below is frozen until 2.0.
- Limiters: the [Throttle] token bucket (lock-free, one atomic compare-and-swap per acquire) and the exact [SlidingWindowLog] (no boundary burst), each with a waiting, cost-aware acquire and non-blocking try_acquire / peek.
- Composition: [Hybrid] (must pass all), [MultiLimiter] (multi-dimensional cost-aware budgets), [PerKey] (independent per-key state with bounded memory), and [Layered] (global / per-key / per-endpoint scopes), all with peek-then-commit correctness, so a request never spends in one limiter when another would block it.
- Retry & backoff: [Backoff] (constant / linear / exponential, with full / equal / decorrelated jitter) and [Retry] with per-error classification and Retry-After honoring.
- Resilience: a [CircuitBreaker] that wraps any limiter and fails fast without consuming it, and a bounded, deadline-aware, priority [Queue].
- Adaptive concurrency: an [AdaptiveLimiter] that discovers the right in-flight limit from outcome feedback (AIMD or Vegas), never exceeding the hard ceiling.
- Provider integration: response-header parsers (OpenAI, Anthropic, GitHub, Stripe, AWS, the IETF draft) with limiter synchronization, and LLM tier presets.
- Observability: metrics and tracing events, feature-gated and zero-cost when off.
- The Limiter trait: the Tier-3 extension seam; every algorithm and composite is one Limiter.
The tiered design holds: Tier-1 is the whole common case in a couple of calls, Tier-2 the builders for tuning, Tier-3 the trait seam for custom backends.
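The peek-then-commit guarantee named under Composition can be illustrated with a small single-threaded sketch: check every dimension first, and spend only if all of them can afford the request. `Budget` and `try_acquire_all` are hypothetical names for this illustration; the crate's composites are concurrent and are not shown here.

```rust
/// Hypothetical single-dimension budget (for example requests or tokens).
pub struct Budget {
    pub remaining: u64,
}

impl Budget {
    /// Check affordability without spending anything.
    pub fn peek(&self, cost: u64) -> bool {
        self.remaining >= cost
    }

    /// Spend `cost`; callers must have peeked first.
    pub fn commit(&mut self, cost: u64) {
        self.remaining -= cost;
    }
}

/// Admit only if every budget can afford its cost; otherwise spend nothing.
pub fn try_acquire_all(budgets: &mut [Budget], costs: &[u64]) -> bool {
    if budgets.len() != costs.len() {
        return false;
    }
    // Phase 1: peek every dimension.
    if !budgets.iter().zip(costs).all(|(b, &c)| b.peek(c)) {
        return false;
    }
    // Phase 2: commit every dimension.
    for (b, &c) in budgets.iter_mut().zip(costs) {
        b.commit(c);
    }
    true
}
```

If the second budget cannot cover its cost, the first budget is left untouched, which is the property the list describes.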
Runtime and platform
- Runtimes: the waiting surface runs on tokio (default) or smol with identical call-site code; you pick the timer backend by feature.
- no_std: with std off, the pure algorithm core ([Backoff], [Jitter], [Decision]) compiles without the standard library.
- Platforms: Linux, macOS, and Windows are first-class, verified on stable and MSRV 1.85 across the CI matrix.
How it is held correct
The defining invariants are tested, not assumed:
- Property tests for every limiter invariant: burst never exceeds capacity, the exact window admits at most N per window, composites bind on the tightest scope, per-key floods stay within the eviction bound, backoff stays under its ceiling, and the adaptive limit stays within [floor, ceiling].
- A loom model check that exhaustively explores the adaptive limiter's lock-free slot accounting (no over-admission, no lost slots) under arbitrary thread interleavings.
- Fuzzed parsers: the Retry-After and provider-header parsers never panic on arbitrary input, checked both by cargo-fuzz targets and an always-on deterministic smoke suite. (This caught two real integer-overflow panics during 0.9, both fixed.)
- No unsafe: the crate is #![forbid(unsafe_code)].
- Benchmarks: uncontended try_acquire is ~27 ns (target < 50 ns), benchmarked head-to-head with governor, with a contention sweep at 1 / 4 / 16 / 64 acquirers.
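As an illustration of the "at most N per window" invariant, the following is a minimal sliding-window log written against std only. It is a sketch under stated assumptions, not the crate's SlidingWindowLog: the type name `SlidingLog`, the explicit `now` argument, and the single-threaded design are all choices made for this example.

```rust
use std::collections::VecDeque;
use std::time::Duration;

/// Hypothetical exact sliding-window limiter.
pub struct SlidingLog {
    limit: usize,
    window: Duration,
    stamps: VecDeque<Duration>, // admission times, oldest first
}

impl SlidingLog {
    pub fn new(limit: usize, window: Duration) -> Self {
        Self { limit, window, stamps: VecDeque::new() }
    }

    /// `now` is an offset from any fixed epoch, passed in so the sketch is
    /// deterministic. Assumes `now` never goes backwards.
    pub fn try_acquire(&mut self, now: Duration) -> bool {
        // Forget admissions that have slid out of the window.
        while let Some(&oldest) = self.stamps.front() {
            if now.saturating_sub(oldest) >= self.window {
                self.stamps.pop_front();
            } else {
                break;
            }
        }
        if self.stamps.len() < self.limit {
            self.stamps.push_back(now);
            true
        } else {
            false
        }
    }
}
```

Because every admission is timestamped, capacity returns exactly when the oldest entry leaves the window, so there is no burst at a window boundary. The cost is memory proportional to the limit.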
Changes since 0.9.0
- Stable. The public API is frozen until 2.0.
- Removed the unimplemented serde feature flag (it pulled the dependency but wired up no Serialize / Deserialize). Serializable configs may return post-1.0 as an additive feature with a designed, documented representation.
Breaking changes
None from 0.9.0. The only removal is the no-op serde feature flag, which had no functional effect.
After 1.0
Backward-compatible additions under consideration (not yet shipped):
- Serializable limiter configs (serde), with a designed on-disk representation.
- Companion middleware crates (throttle-net-tower, throttle-net-reqwest).
- Distributed limiter state is a 2.0 design topic; 1.x is in-process.
Verification
Run on Windows x86_64, Rust stable; the same commands run in CI on Linux, macOS, and Windows across stable and MSRV 1.85, with dedicated jobs for the smol + no_std matrix, the loom model check, and the fuzz smoke run:
cargo fmt --all -- --check
cargo clippy --all-targets --all-features -- -D warnings
cargo clippy --no-default-features --features smol --all-targets -- -D warnings
cargo clippy --no-default-features --all-targets -- -D warnings
cargo test --all-features
cargo test # default: tokio backend
cargo test --no-default-features # no_std core + its doctests
cargo test --no-default-features --features smol # smol backend
cargo build --no-default-features # no_std lib builds
cargo bench --no-run --all-features
RUSTFLAGS="--cfg throttle_loom" cargo test --test loom_throttle \
--no-default-features --features adaptive --release
cargo +nightly fuzz run retry_after --target x86_64-unknown-linux-gnu -- -max_total_time=30
cargo +nightly fuzz run provider_headers --target x86_64-unknown-linux-gnu -- -max_total_time=30
cargo doc --no-deps && RUSTDOCFLAGS="-D warnings" cargo doc --no-deps --all-features
cargo deny check
cargo audit
All green.
Installation
[dependencies]
throttle-net = "1"
# Optional features:
throttle-net = { version = "1", features = ["circuit-breaker", "adaptive", "provider-llm", "metrics", "tracing"] }
# On smol instead of tokio:
throttle-net = { version = "1", default-features = false, features = ["smol"] }
MSRV: Rust 1.85.
Documentation
Thanks
throttle-net builds on the first-party stack: better-bucket (token-bucket accounting), clock-lib (mockable time), error-forge (domain errors), and ahash (DoS-resistant shard hashing). It is the outbound companion to rate-net.
Changelog: CHANGELOG.md.
v0.9.0 — Polish and hardening
throttle-net v0.9.0 — Polish and hardening
Adversarial, correct, fast, documented. v0.9.0 is the hardening release: fuzzed parsers, a loom model check of the lock-free slot accounting, property tests for every limiter invariant, comparative benchmarks against governor, and two new guides. The public API was frozen at 0.8 and is unchanged here — this release is about confidence, not surface.
What is throttle-net?
A general-purpose outbound throttling and resilience library. Where rate-net protects your service from being overwhelmed (inbound), throttle-net protects your service from overwhelming the downstreams it calls — and from being banned by them. It paces outbound work, composes limits across dimensions and scopes, retries, fails fast, queues fairly, adapts its concurrency, syncs to provider headers, and reports what it is doing — on tokio or smol.
What's new in 0.9.0
Fuzzing
Two cargo-fuzz targets exercise the byte-facing parsers, where untrusted input
arrives from the network:
- retry_after: parse_retry_after against arbitrary bytes and reference instants.
- provider_headers: every provider header profile against arbitrary header sets.
The contract is "malformed input never panics, only ever returns None / drops the
value." A CI job runs each target as a timed smoke check, and
tests/fuzz_smoke.rs carries an always-on deterministic corpus plus a
pseudo-random sweep of the same property, so the guarantee is checked on every
cargo test on every platform.
This already paid for itself: the fuzz-smoke suite caught an integer-overflow panic
in provider header parsing (an extreme rate-limit reset timestamp combined with an
extreme reference time overflowed the reset - now subtraction). Fixed with
saturating arithmetic and a regression test — see Fixes.
A loom concurrency model check
tests/loom_throttle.rs uses loom to
exhaustively explore the legal thread interleavings of the adaptive limiter's
reserve/release path — the one piece of lock-free state the crate owns itself (the
token bucket lives in better-bucket, model-checked there). It proves two
invariants under arbitrary interleavings:
- No over-admission — in-flight permits never exceed the limit.
- No lost slots — once every permit is settled, the in-flight count is back to
zero; a permit can neither leak a slot nor release one twice (including the
drop-as-failure path).
The model check is wired through a crate-private --cfg throttle_loom so it
instruments only throttle-net's own atomics and does not switch on the cfg(loom)
paths inside transitive dependencies.
Property tests for every invariant
The proptest suite gained exact-bound coverage for the remaining algorithms:
- The sliding-window log admits at most limit per window, and reclaims capacity exactly as the window slides.
- Backoff delays stay within the configured ceiling under every jitter mode, and the unjittered curve never goes backwards.
- The adaptive limit always stays within [floor, ceiling] after any sequence of outcomes.
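The two backoff properties above (delays stay under the ceiling, and the unjittered curve never decreases) can be seen in a small std-only sketch. The functions `exponential_delay` and `full_jitter` are hypothetical and are not the crate's Backoff API; the random word is supplied by the caller so the example stays deterministic.

```rust
use std::time::Duration;

fn from_nanos_u128(nanos: u128) -> Duration {
    Duration::new(
        (nanos / 1_000_000_000) as u64,
        (nanos % 1_000_000_000) as u32,
    )
}

/// Unjittered exponential delay: `base * factor^attempt`, capped at
/// `ceiling`. Arithmetic overflow saturates to the ceiling instead of
/// wrapping. Non-decreasing in `attempt` when `factor >= 1`.
pub fn exponential_delay(base: Duration, factor: u32, attempt: u32, ceiling: Duration) -> Duration {
    let cap = ceiling.as_nanos();
    let nanos = u128::from(factor)
        .checked_pow(attempt)
        .and_then(|m| m.checked_mul(base.as_nanos()))
        .map_or(cap, |n| n.min(cap));
    from_nanos_u128(nanos)
}

/// Full jitter: maps the caller-supplied random word `r` into `[0, delay]`.
pub fn full_jitter(delay: Duration, r: u64) -> Duration {
    let span = delay.as_nanos() + 1;
    from_nanos_u128(u128::from(r) % span)
}
```

A property test would assert exactly these bounds over arbitrary attempts and random words.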
Comparative and contention benchmarks
- comparison_bench measures the uncontended try_acquire floor against governor on the same workload.
- contention_bench measures aggregate throughput at 1 / 4 / 16 / 64 concurrent acquirers on a shared throttle, confirming the lock-free path scales rather than collapsing under contention.
Guides
- docs/COOKBOOK.md: task-oriented recipes for the common problems.
- docs/MIGRATING_FROM_GOVERNOR.md: API mapping and before/after for moving an outbound setup off governor.
Breaking changes
None. The public API was frozen at 0.8. This release adds tests, benchmarks,
docs, and one internal bug fix.
Fixes
- Provider header parsing overflow. An extreme rate-limit reset timestamp combined with an extreme reference time could overflow the reset - now subtraction and panic in debug builds. The subtraction now saturates; a past or unreachable reset clamps the time-until-reset to zero. Found by the new fuzz-smoke suite; covered by a regression test.
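The class of bug described in this fix can be shown in a few lines. The sketch below assumes signed Unix-second timestamps and a hypothetical function name, `time_until_reset`; the crate's actual types and signature are not given in these notes.

```rust
use std::time::Duration;

/// Time until a rate-limit window resets, given Unix timestamps in seconds.
pub fn time_until_reset(reset_unix: i64, now_unix: i64) -> Duration {
    // A plain `reset_unix - now_unix` overflows for inputs such as
    // (i64::MAX, i64::MIN) and panics in debug builds.
    let delta = reset_unix.saturating_sub(now_unix);
    // A reset in the past clamps to zero.
    Duration::from_secs(delta.max(0) as u64)
}
```

Saturating arithmetic turns both extremes into a well-defined answer: the largest representable wait, or zero.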
Verification
Run on Windows x86_64, Rust stable; the same commands run in CI on Linux, macOS,
and Windows across stable and MSRV 1.85, with dedicated jobs for the smol +
no_std matrix, the loom model check, and the fuzz smoke run:
cargo fmt --all -- --check
cargo clippy --all-targets --all-features -- -D warnings
cargo test --all-features
cargo bench --no-run --all-features
RUSTFLAGS="--cfg throttle_loom" cargo test --test loom_throttle \
--no-default-features --features adaptive --release
cargo +nightly fuzz run retry_after -- -max_total_time=30
cargo +nightly fuzz run provider_headers -- -max_total_time=30
cargo deny check
cargo audit
All green. Counts at this tag (--all-features): 130 unit tests, 11 property
tests, 4 fuzz-smoke tests, 3 circuit-breaker integration tests, 2 retry
integration tests, 1 observability integration test, 67 doctests, plus 2 loom
model checks. The performance targets hold: uncontended try_acquire is ~27 ns
(target < 50 ns) and is benchmarked head-to-head with governor.
What's next
- v1.0.0 — Stable. First-consumer integration, final captured benchmarks, then
the stable release with the public API frozen until 2.0.
Installation
[dependencies]
throttle-net = "0.9"
# Optional features:
throttle-net = { version = "0.9", features = ["circuit-breaker", "adaptive", "provider-llm", "metrics", "tracing"] }
# On smol instead of tokio:
throttle-net = { version = "0.9", default-features = false, features = ["smol"] }
MSRV: Rust 1.85.
Documentation
Changelog: CHANGELOG.md.
Full Changelog: v0.8.0...v0.9.0
v0.8.0 — Runtime flexibility & feature freeze
throttle-net v0.8.0 — Runtime flexibility & feature freeze
Run it on the executor you already use, and freeze the surface for 1.0. v0.8.0 makes the waiting acquire surface runtime-agnostic — it runs on tokio or smol with identical call-site code — adds a no_std-capable algorithm core, fills out the example set, and declares the public API frozen for the run to 1.0. No breaking changes.
What is throttle-net?
A general-purpose outbound throttling and resilience library. Where rate-net protects your service from being overwhelmed (inbound), throttle-net protects your service from overwhelming the downstreams it calls — and from being banned by them. It paces outbound work, composes limits across dimensions and scopes, retries, fails fast, queues fairly, adapts its concurrency, syncs to provider headers, reports what it is doing — and, as of this release, does the waiting on whichever runtime you prefer.
What's new in 0.8.0
Pick your runtime: tokio or smol
The code that does the waiting is now runtime-agnostic. Internally it parks on an event-listener notification and races a wake-up against a timeout with futures-lite, instead of a tokio-specific Notify / select!. The only runtime-specific piece left is the timer, selected by feature. The result: the same async methods run unchanged on either backend.
# tokio (default — nothing to change):
throttle-net = "0.8"
# smol instead:
throttle-net = { version = "0.8", default-features = false, features = ["smol"] }
// Identical on both backends:
let throttle = throttle_net::Throttle::per_second(100);
throttle.acquire().await?;
Requesting the waiting surface without enabling a backend is a clear compile error rather than a confusing one. async-std is intentionally not supported (it is discontinued, RUSTSEC-2025-0052); tokio and smol cover the runtime-flexibility goal.
A no_std-capable algorithm core
With std off, the pure algorithm types — Backoff, BackoffIter, Jitter, and Decision — compile and run without the standard library, alongside VERSION. No clock, no allocator, no async runtime. The limiter surface still requires std.
# Algorithm core only:
throttle-net = { version = "0.8", default-features = false }
use core::time::Duration;
use throttle_net::Backoff;
// Deterministic backoff with no system clock:
let mut delays = Backoff::exponential(Duration::from_millis(50), 2.0).iter_seeded(1);
let _first = delays.next_delay();
Under no_std, Backoff::iter() seeds its jitter from a monotonic counter instead of system entropy; iter_seeded(seed) is fully deterministic on both.
The example set, filled out
The roadmap's worked examples are all present and runnable:
cargo run --example llm_budget # multi-dimensional LLM budgets
cargo run --example retry_backoff # retry with backoff + Retry-After
cargo run --example circuit_breaker --features circuit-breaker # trip, shed, recover
cargo run --example adaptive_concurrency --features adaptive # learn the limit from feedback
cargo run --example per_tenant_quotas # per-tenant budgets under a global cap
per_tenant_quotas is new: a Layered global ceiling over a PerKey per-tenant cap, so each tenant gets its own budget and no single noisy tenant can starve the others or the service.
Feature freeze
The public API is frozen as of this release. The remaining 0.x work — fuzzing, loom model checks, comparative benchmarks — is hardening, not API change. Nothing will change incompatibly before 1.0.
Breaking changes
None. smol and no_std are additive; tokio remains the default. The one dependency-footprint change is internal: the tokio feature now pulls only tokio's time feature.
Verification
Run on Windows x86_64, Rust stable; the same commands run in the CI matrix on Linux, macOS, and Windows across stable and MSRV 1.85, with a dedicated job for the smol + no_std runtime matrix:
cargo fmt --all -- --check
cargo clippy --all-targets --all-features -- -D warnings
cargo clippy --all-targets --no-default-features -- -D warnings
cargo clippy --no-default-features --features smol --all-targets -- -D warnings
cargo test --all-features
cargo test # default: tokio backend
cargo test --no-default-features # no_std core + its doctests
cargo test --no-default-features --features smol # smol backend
cargo build --no-default-features # no_std lib builds
cargo doc --no-deps && RUSTDOCFLAGS="-D warnings" cargo doc --no-deps --all-features
cargo deny check
cargo audit
All green. The runtime-flexibility exit criterion is covered directly: the full test suite passes on both the tokio and smol timer backends (timing tests that assert exact virtual-clock elapsed time stay tokio-only, since they depend on tokio's paused-time test clock), and the no_std core builds and runs its doctests with std off.
What's next
- v0.9.0 → 1.0 — Polish & 1.0. Fuzzing, loom concurrency model checks, comparative benchmarks against the alternatives, and the 1.0 release. The public API is already frozen as of this release.
Installation
[dependencies]
throttle-net = "0.8"
# Optional features:
throttle-net = { version = "0.8", features = ["circuit-breaker", "adaptive", "provider-llm", "metrics", "tracing"] }
# On smol instead of tokio:
throttle-net = { version = "0.8", default-features = false, features = ["smol"] }
MSRV: Rust 1.85.
Documentation
Changelog: CHANGELOG.md.
Full Changelog: v0.7.0...v0.8.0
v0.7.0 — Observability
throttle-net v0.7.0 — Observability
See what the limiters are doing in production. v0.7.0 instruments the whole stack — metrics and tracing events around every acquire and every state transition — and it is feature-gated and genuinely zero-cost when off. No breaking changes.
What is throttle-net?
A general-purpose outbound throttling and resilience library. Where rate-net protects your service from being overwhelmed (inbound), throttle-net protects your service from overwhelming the downstreams it calls — and from being banned by them. It paces outbound work, composes limits across dimensions and scopes, retries, fails fast, queues fairly, adapts its concurrency, syncs to provider headers, and — as of this release — reports what it is doing.
What's new in 0.7.0
Metrics (feature metrics)
Emitted through the metrics facade, so they land in whatever recorder your application installs (Prometheus, StatsD, OTLP, …):
| Metric | Type | Emitted when |
|---|---|---|
| throttle_acquired_total | counter (label limiter) | a waiting acquire is granted |
| throttle_wait_duration | histogram, seconds (label limiter) | a waiting acquire completes |
| throttle_queue_depth | gauge | the queue's waiter count changes |
| throttle_circuit_state | gauge (0/1/2 = closed/half-open/open) | a circuit-breaker transition |
| throttle_rate_current | gauge | an adaptive limit changes |
Tracing events (feature tracing)
Emitted under the throttle_net target: an acquire event carrying limiter, cost, granted, and wait_secs, plus structured events for every documented state transition — circuit-breaker transitions, adaptive limit changes, queue overflow, and queue deadline exhaustion.
Zero-cost when disabled
The instrumentation routes through a small internal layer of #[inline] hooks. With the features off, each hook compiles to an empty function that the optimizer removes, and — critically — its inputs are never evaluated, so the hot path pays nothing. The wait timer is literally zero-sized in that build, which a test asserts. Each hook is also gated to the feature of the limiter that calls it, so the crate has no dead code in any feature combination.
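The pattern described here can be sketched with cfg-gated types. This is an illustration only: the `metrics` feature name matches the release notes, but `WaitTimer`, `on_acquire`, and their shapes are assumptions, not the crate's internal hooks. Compiled without the feature, the timer is zero-sized and the closure passed to the hook is never called.

```rust
use std::time::Duration;

/// With the feature on, the timer holds a start instant.
#[cfg(feature = "metrics")]
pub struct WaitTimer(std::time::Instant);

/// With the feature off, the timer is a zero-sized type.
#[cfg(not(feature = "metrics"))]
pub struct WaitTimer;

impl WaitTimer {
    #[cfg(feature = "metrics")]
    #[inline]
    pub fn start() -> Self {
        WaitTimer(std::time::Instant::now())
    }

    #[cfg(not(feature = "metrics"))]
    #[inline]
    pub fn start() -> Self {
        WaitTimer
    }

    #[cfg(feature = "metrics")]
    #[inline]
    pub fn elapsed(&self) -> Option<Duration> {
        Some(self.0.elapsed())
    }

    #[cfg(not(feature = "metrics"))]
    #[inline]
    pub fn elapsed(&self) -> Option<Duration> {
        None
    }
}

/// Hypothetical hook taking a closure: with the feature off the closure is
/// never called, so the cost of building the label is never paid.
#[inline]
pub fn on_acquire<F: FnOnce() -> String>(_label: F) {
    #[cfg(feature = "metrics")]
    println!("acquired: {}", _label());
}
```

Passing inputs as a closure is one way to guarantee they are not evaluated on the disabled path; a macro that expands to nothing is another.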
// Wire up any recorder/subscriber in your app; the limiters emit automatically.
// e.g. with the `metrics` feature and a Prometheus recorder installed:
let throttle = throttle_net::Throttle::per_second(100);
// throttle.acquire().await? now increments `throttle_acquired_total`
// and records `throttle_wait_duration`.
Breaking changes
None. Observability is additive and behind the metrics / tracing features.
Verification
Run on Windows x86_64, Rust stable 1.93.1; the same commands run in the CI matrix on Linux, macOS, and Windows across stable and MSRV 1.85:
cargo fmt --all -- --check
cargo clippy --all-targets --all-features -- -D warnings
cargo clippy --all-targets --no-default-features -- -D warnings
cargo test --all-features
cargo test --no-default-features
cargo test # default: metrics/tracing off (zero-cost path)
cargo doc --no-deps && RUSTDOCFLAGS="-D warnings" cargo doc --no-deps --all-features
cargo deny check
cargo audit
All green. Counts at this tag (--all-features): 129 unit tests, 7 property tests, 3 circuit-breaker integration tests, 2 retry integration tests, 1 observability integration test, 67 doctests. The exit criteria are covered directly: a test asserts the wait timer is zero-sized with the features off (zero-cost), and the observability integration test installs a local recorder and asserts the throttle_circuit_state gauge fires on a breaker trip (events fire on transitions). Every transition site routes through a hook, and each is exercised by an existing state-transition test.
This release also confirmed the feature matrix is clean: the lib builds under std-only, circuit-breaker-only, adaptive-only, metrics+tracing-only, and provider-llm-only, with no dead code in any combination.
What's next
- v0.8.0: Runtime flexibility + feature freeze. async-std and smol support behind feature flags, a no_std-capable algorithm core, the example set filled out, and the feature surface declared frozen for the run to 1.0.
Installation
[dependencies]
throttle-net = "0.7"
# Observability is behind feature flags:
throttle-net = { version = "0.7", features = ["metrics", "tracing"] }
MSRV: Rust 1.85.
Documentation
Changelog: CHANGELOG.md.
Full Changelog: v0.6.0...v0.7.0
v0.6.0 — Provider integration
throttle-net v0.6.0 — Provider integration
Be a good API citizen out of the box. Downstreams already tell you your remaining budget — in their response headers — and the well-known APIs each spell it differently. v0.6.0 reads those headers, reconciles your limiter with the server's view, and gives you provider tier presets to start from. No breaking changes.
What is throttle-net?
A general-purpose outbound throttling and resilience library. Where rate-net protects your service from being overwhelmed (inbound), throttle-net protects your service from overwhelming the downstreams it calls — and from being banned by them. It paces outbound work, composes limits across dimensions and scopes, retries transient failures, fails fast when a dependency is sick, queues callers fairly, adapts its concurrency to what the downstream can take, and — as of this release — listens to what the downstream tells it about its own limits.
What's new in 0.6.0
Response-header parsing — the provider module
Every provider advertises rate-limit state differently. A HeaderProfile captures one convention; ready-made profiles cover the common ones, and parse turns a header set into a normalized RateLimitInfo with separate requests and tokens windows plus a retry_after.
use throttle_net::provider::HeaderProfile;
let headers = [
("x-ratelimit-limit-requests", "5000"),
("x-ratelimit-remaining-requests", "4999"),
("x-ratelimit-reset-requests", "6m0s"),
];
let info = HeaderProfile::OPENAI.parse(&headers);
assert_eq!(info.requests.unwrap().remaining, Some(4999));

| Profile | Reset encoding |
|---|---|
| OPENAI | duration string (6m0s, 100ms) |
| ANTHROPIC | RFC 3339 instant |
| GITHUB | absolute Unix timestamp |
| RFC (IETF RateLimit draft) | delta-seconds |
| STRIPE, AWS | Retry-After only |
All four reset encodings are parsed with no date-library dependency (the RFC 3339 and Unix-timestamp paths reuse the same civil-date math as the Retry-After parser, now shared internally). Parsing is defensive: header names match case-insensitively, and malformed values are dropped rather than panicking. parse_at(headers, now) takes an explicit current time for deterministic testing.
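To show what defensive parsing of the duration-string encoding might look like, here is a minimal std-only sketch that accepts values such as 6m0s or 100ms and returns None for anything malformed. The function name `parse_reset_duration` and the supported unit set (h, m, s, ms, integers only) are assumptions for this example, not the crate's parser.

```rust
use std::time::Duration;

/// Parse strings like "6m0s" or "100ms". Malformed or overflowing input
/// yields `None`; nothing here can panic.
pub fn parse_reset_duration(input: &str) -> Option<Duration> {
    let mut rest = input.trim();
    if rest.is_empty() {
        return None;
    }
    let mut total = Duration::ZERO;
    while !rest.is_empty() {
        // Leading run of ASCII digits (byte count equals char count here).
        let digits = rest.bytes().take_while(u8::is_ascii_digit).count();
        if digits == 0 {
            return None;
        }
        let value: u64 = rest[..digits].parse().ok()?;
        rest = &rest[digits..];
        // "ms" must be tested before "m".
        let (part, unit_len) = if rest.starts_with("ms") {
            (Duration::from_millis(value), 2)
        } else if rest.starts_with('s') {
            (Duration::from_secs(value), 1)
        } else if rest.starts_with('m') {
            (Duration::from_secs(value.checked_mul(60)?), 1)
        } else if rest.starts_with('h') {
            (Duration::from_secs(value.checked_mul(3600)?), 1)
        } else {
            return None;
        };
        total = total.checked_add(part)?;
        rest = &rest[unit_len..];
    }
    Some(total)
}
```

Every fallible step uses a checked operation or `?`, which is the shape a "drop malformed values rather than panic" contract calls for.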
State synchronization
RateLimitInfo::sync_requests (and sync_tokens) reconcile a Throttle with the server's reported remaining count:
let drained = info.sync_requests(&throttle);
It drains the local budget down to the server's number, and only ever reduces, never adds, so it corrects client/server drift without any chance of raising the throttle above its hard limit.
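The reduce-only rule amounts to taking the minimum of the two views. A minimal sketch, with a hypothetical `sync_down` function operating on plain counts rather than on a Throttle:

```rust
/// Reconcile a local remaining budget with the server's reported count.
/// Returns `(new_local, drained)`; the local budget is never raised.
pub fn sync_down(local_remaining: u64, server_remaining: u64) -> (u64, u64) {
    let new_local = local_remaining.min(server_remaining);
    (new_local, local_remaining - new_local)
}
```

When the server reports more than the client believes it has, nothing changes and the drained amount is zero.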
LLM tier presets — the presets module
Pre-wired MultiLimiter configurations for the common LLM tiers, so you start from a sensible shape instead of hand-building the three dimensions:
use throttle_net::presets;
let limiter = presets::anthropic::tier_2(); // requests + input/output token budgets
presets::anthropic::{tier_1, tier_2, tier_4} and presets::openai::{tier_1, tier_2} are provided. The numbers are illustrative starting points (providers change tier limits and meter per model), so confirm them against current docs.
Breaking changes
None. The new modules are additive and feature-gated (provider-headers, provider-llm).
Verification
Run on Windows x86_64, Rust stable 1.93.1; the same commands run in the CI matrix on Linux, macOS, and Windows across stable and MSRV 1.85:
cargo fmt --all -- --check
cargo clippy --all-targets --all-features -- -D warnings
cargo clippy --all-targets --no-default-features -- -D warnings
cargo test --all-features
cargo test --no-default-features
cargo doc --no-deps && RUSTDOCFLAGS="-D warnings" cargo doc --no-deps --all-features
cargo deny check
cargo audit
All green. Counts at this tag (--all-features): 129 unit tests, 7 property tests, 3 circuit-breaker integration tests, 2 retry integration tests, 67 doctests. Each provider profile is tested against a representative recorded header set (the exit criterion), and synchronization is tested to drain to the server's count while never raising the local budget above the hard limit.
This release also fixes a latent issue: the token-bucket module imported the error type unconditionally, so the std-without-tokio feature combination (now reachable via default-features = false, features = ["provider-llm"]) failed to build under -D warnings. The import is now correctly gated, and that combination builds clean.
What's next
- v0.7.0: Observability. Metrics (throttle_acquired_total, throttle_wait_duration, throttle_queue_depth, throttle_circuit_state, throttle_rate_current) and tracing spans around acquire(), feature-gated and zero-cost when disabled, plus structured events for circuit transitions, adaptive rate changes, queue overflow, and deadline exhaustion.
Installation
[dependencies]
throttle-net = "0.6"
# Header parsing and LLM presets are behind feature flags:
throttle-net = { version = "0.6", features = ["provider-llm"] }
MSRV: Rust 1.85.
Documentation
Changelog: CHANGELOG.md.
Full Changelog: v0.5.0...v0.6.0
v0.5.0 — Adaptive
throttle-net v0.5.0 — Adaptive
Find the right limit without being told it. Every limiter so far needed you to know the downstream's capacity up front. v0.5.0 adds the clever one: an adaptive concurrency limiter that learns the capacity from observed outcomes — opening up while requests succeed, pulling back when they fail or slow — and never exceeds the hard ceiling. No breaking changes.
What is throttle-net?
A general-purpose outbound throttling and resilience library. Where rate-net protects your service from being overwhelmed (inbound), throttle-net protects your service from overwhelming the downstreams it calls — and from being banned by them. It paces outbound work, composes limits across dimensions and scopes, retries transient failures, fails fast when a dependency is sick, queues callers fairly when it is busy, and — as of this release — adapts its concurrency to what the downstream can actually take.
What's new in 0.5.0
AdaptiveLimiter — concurrency that learns
A token bucket needs a number. The adaptive limiter discovers one: it caps in-flight requests at a limit it adjusts from feedback, bounded by a floor and a ceiling. When requests succeed and the limit is saturated, it grows; when they fail or slow, it shrinks. The limit never exceeds the ceiling, so the adaptation can only ever be more conservative than your hard cap.
use throttle_net::{Aimd, AdaptiveLimiter};
let limiter = AdaptiveLimiter::builder()
.floor(2)
.ceiling(50)
.initial(10)
.build(Aimd::default()); // +1 on a saturated success, halve on failure
if let Some(permit) = limiter.try_acquire() {
// ... call the downstream, then report the outcome ...
if downstream_ok { permit.success() } else { permit.failure() }
}
Outcomes are reported through an AdaptivePermit: success() measures the round-trip time from acquisition, failure() signals a back-off, and dropping the permit unsettled counts as a failure, so an early return or a panic is treated conservatively. The waiting acquire().await blocks on a slot freeing rather than on a timer.
Two strategies, plus your own
- Aimd: additive increase, multiplicative decrease. Probe upward gently on saturated successes, retreat sharply on failure. The classic congestion response.
- Vegas: latency-based, after TCP Vegas. It estimates the downstream's queue depth from the round-trip time against the best (no-load) latency it has seen, limit * (rtt - min_rtt) / rtt, and grows when the estimate is small, shrinks when it is large. Latency degradation is felt before failures even start.
- AdaptiveStrategy: implement adjust(current, in_flight, outcome) -> new_limit for a custom policy; the limiter clamps the result to [floor, ceiling].
The runnable examples/adaptive_concurrency.rs shows the limit climbing while a downstream is healthy, collapsing to the floor the moment it degrades, holding there through the outage, and climbing back on recovery — never crossing the ceiling.
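The two rules can be written out as plain functions. The sketch below follows the descriptions in these notes (+1 on a saturated success, halve on failure; queue estimate limit * (rtt - min_rtt) / rtt) but the names `aimd_adjust`, `vegas_queue_estimate`, and `Outcome` are hypothetical, and the crate's real strategies may differ in detail.

```rust
use std::time::Duration;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Outcome {
    Success,
    Failure,
}

/// AIMD: +1 on a success while saturated, halve on failure, then clamp.
/// Assumes `floor <= ceiling` (`clamp` panics otherwise).
pub fn aimd_adjust(current: u32, in_flight: u32, outcome: Outcome, floor: u32, ceiling: u32) -> u32 {
    let next = match outcome {
        Outcome::Success if in_flight >= current => current.saturating_add(1),
        Outcome::Success => current, // not saturated: no evidence to grow on
        Outcome::Failure => current / 2,
    };
    next.clamp(floor, ceiling)
}

/// Vegas-style estimate of requests queued downstream:
/// `limit * (rtt - min_rtt) / rtt`, computed in integer nanoseconds.
pub fn vegas_queue_estimate(limit: u32, rtt: Duration, min_rtt: Duration) -> u32 {
    let rtt_ns = rtt.as_nanos();
    if rtt_ns == 0 {
        return 0;
    }
    let excess = rtt_ns.saturating_sub(min_rtt.as_nanos());
    (u128::from(limit) * excess / rtt_ns) as u32
}
```

With a limit of 20, a best-seen latency of 50 ms and a current round trip of 100 ms, the estimate is 10 queued requests; at the best-seen latency it is zero.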
Breaking changes
None. Everything in this release is additive, behind the adaptive feature.
Verification
Run on Windows x86_64, Rust stable 1.93.1; the same commands run in the CI matrix on Linux, macOS, and Windows across stable and MSRV 1.85:
cargo fmt --all -- --check
cargo clippy --all-targets --all-features -- -D warnings
cargo clippy --all-targets --no-default-features -- -D warnings
cargo test --all-features
cargo test --no-default-features
cargo doc --no-deps # default features — now warning-free too
RUSTDOCFLAGS="-D warnings" cargo doc --no-deps --all-features
cargo deny check
cargo audit
All green. Counts at this tag (--all-features): 111 unit tests, 7 property tests, 3 circuit-breaker integration tests, 2 retry integration tests, 64 doctests. The adaptive limiter's exit criteria are covered: a run of failures drives the limit to the floor and a run of successes drives it to the ceiling (and no further); the AIMD and Vegas rules are unit-tested, and the in-flight count never exceeds the limit.
This release also fixes a documentation warning: the crate-level docs no longer link the feature-gated CircuitBreaker type, so cargo doc is clean on the default feature set, not only under --all-features.
What's next
- v0.6.0: Provider integration. Be a good API citizen out of the box: rate-limit header parsers (OpenAI/Anthropic/Cohere, AWS, GitHub, Stripe, generic RFC 6585), state synchronization from response headers to prevent client/server drift, and LLM provider presets behind provider-llm.
Installation
[dependencies]
throttle-net = "0.5"
# The adaptive and circuit-breaker limiters are behind feature flags:
throttle-net = { version = "0.5", features = ["adaptive", "circuit-breaker"] }
MSRV: Rust 1.85.
Documentation
Changelog: CHANGELOG.md.
Full Changelog: v0.4.0...v0.5.0
v0.4.0 — Resilience
throttle-net v0.4.0 — Resilience
The pieces that decide whether to call at all, and how callers wait. v0.4.0 adds the resilience layer on top of the limiters: a circuit breaker that sheds load when a downstream is unhealthy, a bounded deadline-aware queue for orderly waiting, and an exact sliding-window-log limiter for when a token bucket's boundary burst is unacceptable. No breaking changes.
What is throttle-net?
A general-purpose outbound throttling and resilience library. Where rate-net protects your service from being overwhelmed (inbound), throttle-net protects your service from overwhelming the downstreams it calls — and from being banned by them. It paces outbound work, composes limits across dimensions and scopes, retries transient failures, and — as of this release — fails fast when a dependency is sick and queues callers fairly when it is merely busy.
What's new in 0.4.0
CircuitBreaker — fail fast on failures
A limiter paces requests; a breaker stops them. Wrap any Limiter in a breaker and, once the downstream produces enough failures, it trips open and sheds requests immediately — without consuming the wrapped limiter's tokens — giving the dependency room to recover. After a cooldown it goes half-open, admits a trial or two, and closes again on success.
use std::time::Duration;
use throttle_net::{CircuitBreaker, Throttle, Trip};
let breaker = CircuitBreaker::builder()
.trip(Trip::Consecutive(5))
.cooldown(Duration::from_secs(10))
.build(Throttle::per_second(100));
match breaker.acquire().await {
Ok(permit) => {
// ... call the downstream ...
if downstream_ok { permit.success() } else { permit.failure() }
}
Err(_shed) => { /* breaker open: fail fast */ }
}
Three trip conditions cover the common policies: Consecutive(n) failures, a failure Ratio over a rolling window of calls, and Windowed failures within a rolling time window. Outcomes are reported through a Permit whose drop counts as a failure, so an early return or a panic is treated conservatively. Behind the circuit-breaker feature. State transitions are verified by a proptest against a reference model; half-open recovery and load-shedding by integration tests. The runnable examples/circuit_breaker.rs walks the full Closed → Open → HalfOpen → Closed lifecycle.
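For readers who want the lifecycle spelled out, the following is a rough, single-threaded sketch of a consecutive-failure breaker. Everything here (`Breaker`, `State`, `allow`, `record`, the millisecond clock passed in by hand) is hypothetical and much simpler than the real type: it models only the consecutive-failure policy, admits any number of half-open trials, and has no permit.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum State {
    Closed,
    Open { until_ms: u64 },
    HalfOpen,
}

/// Hypothetical breaker that trips after N consecutive failures.
pub struct Breaker {
    state: State,
    consecutive_failures: u32,
    trip_after: u32,
    cooldown_ms: u64,
}

impl Breaker {
    pub fn new(trip_after: u32, cooldown_ms: u64) -> Self {
        Self { state: State::Closed, consecutive_failures: 0, trip_after, cooldown_ms }
    }

    pub fn state(&self) -> State {
        self.state
    }

    /// May a call proceed at `now_ms`? An expired cooldown moves Open to HalfOpen.
    pub fn allow(&mut self, now_ms: u64) -> bool {
        match self.state {
            State::Closed | State::HalfOpen => true,
            State::Open { until_ms } if now_ms >= until_ms => {
                self.state = State::HalfOpen;
                true
            }
            State::Open { .. } => false,
        }
    }

    /// Report the outcome of a call that `allow` admitted.
    pub fn record(&mut self, now_ms: u64, ok: bool) {
        match (self.state, ok) {
            // A late report that arrives after the breaker tripped is ignored.
            (State::Open { .. }, _) => {}
            // Any success closes the breaker and clears the failure streak.
            (_, true) => {
                self.consecutive_failures = 0;
                self.state = State::Closed;
            }
            // A failed trial re-opens immediately.
            (State::HalfOpen, false) => self.trip(now_ms),
            (State::Closed, false) => {
                self.consecutive_failures += 1;
                if self.consecutive_failures >= self.trip_after {
                    self.trip(now_ms);
                }
            }
        }
    }

    fn trip(&mut self, now_ms: u64) {
        self.consecutive_failures = 0;
        self.state = State::Open { until_ms: now_ms + self.cooldown_ms };
    }
}
```

Passing the clock in as a plain number keeps the sketch deterministic, in the same spirit as the injected ManualClock the notes mention for tests.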
Queue — bounded, deadline-aware waiting
When a limiter is saturated, callers can be rejected or wait. A Queue lets them wait in an orderly way: bounded so it cannot grow without limit, served by priority (and fairly across keys at equal priority), and — the headline — it drops a waiter whose deadline has passed rather than serving it, so a dead request never consumes a token or blocks a live one.
use std::time::Duration;
use throttle_net::{Overflow, Queue, Throttle};
let queue: Queue<Throttle, &str> = Queue::builder()
.capacity(100)
.overflow(Overflow::DropOldest)
.build(Throttle::per_second(50));
// Wait for a slot, but give up after 2 seconds.
queue.acquire("tenant:1", 0, Some(Duration::from_secs(2))).await?;
Overflow policy is configurable — Reject, DropOldest, or DropLowestPriority. Behind the tokio feature (it needs an async runtime to wait).
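The deadline-drop rule can be sketched without any async machinery. The `DeadlineQueue` and `Waiter` types below are invented for illustration; they model only a bounded FIFO with a drop-oldest overflow and lazy expiry, and leave out priority, per-key fairness, and the actual waiting.

```rust
use std::collections::VecDeque;

#[derive(Debug, PartialEq, Eq)]
pub struct Waiter {
    pub id: u32,
    pub deadline_ms: u64,
}

/// Hypothetical bounded queue: drop-oldest on overflow, skip expired on pop.
pub struct DeadlineQueue {
    cap: usize,
    waiters: VecDeque<Waiter>,
}

impl DeadlineQueue {
    pub fn new(cap: usize) -> Self {
        assert!(cap > 0, "capacity must be positive");
        Self { cap, waiters: VecDeque::with_capacity(cap) }
    }

    /// Enqueue a waiter. When full, the oldest waiter is evicted and returned.
    pub fn push(&mut self, w: Waiter) -> Option<Waiter> {
        let evicted = if self.waiters.len() >= self.cap {
            self.waiters.pop_front()
        } else {
            None
        };
        self.waiters.push_back(w);
        evicted
    }

    /// Next waiter whose deadline is still in the future.
    /// Expired waiters are discarded here, so they are never served.
    pub fn pop_live(&mut self, now_ms: u64) -> Option<Waiter> {
        while let Some(w) = self.waiters.pop_front() {
            if w.deadline_ms > now_ms {
                return Some(w);
            }
        }
        None
    }
}
```

Because expiry is checked at the moment a slot becomes available, an expired waiter costs nothing beyond its place in the buffer.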
SlidingWindowLog — exact limiting, no boundary burst
Throttle is a token bucket: smooth and cheap, but it admits a full burst at any instant. SlidingWindowLog is the exact alternative — it records the timestamp of every grant and admits a request only if fewer than limit were granted in the trailing window. No boundary burst, at the cost of remembering recent grants. It implements Limiter, so it drops into a hybrid, a per-key store, a layer, or behind the circuit breaker like any other limiter.
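To make the mechanism concrete before the crate's own example, here is a small standalone sketch of a sliding-window log. `WindowLog` and its hand-fed millisecond clock are assumptions made for this illustration; the real SlidingWindowLog reads its own clock and is safe to share across threads.

```rust
use std::collections::VecDeque;

/// Hypothetical exact limiter: at most `limit` grants in any trailing window.
pub struct WindowLog {
    limit: usize,
    window_ms: u64,
    grants: VecDeque<u64>, // timestamps of recent grants, oldest first
}

impl WindowLog {
    pub fn new(limit: usize, window_ms: u64) -> Self {
        Self { limit, window_ms, grants: VecDeque::with_capacity(limit) }
    }

    pub fn try_acquire(&mut self, now_ms: u64) -> bool {
        // Forget grants that have aged out of the trailing window.
        while let Some(&oldest) = self.grants.front() {
            if now_ms.saturating_sub(oldest) >= self.window_ms {
                self.grants.pop_front();
            } else {
                break;
            }
        }
        // Admit only if fewer than `limit` grants remain inside the window.
        if self.grants.len() < self.limit {
            self.grants.push_back(now_ms);
            true
        } else {
            false
        }
    }
}
```

The memory cost the notes mention is visible here: the log holds one timestamp per grant still inside the window, so it is bounded by the limit.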
use std::time::Duration;
use throttle_net::SlidingWindowLog;
let limiter = SlidingWindowLog::new(5, Duration::from_secs(1));
for _ in 0..5 {
assert!(limiter.try_acquire());
}
assert!(!limiter.try_acquire()); // the 6th in this window is refused
Error additions
The #[non_exhaustive] ThrottleError gains CircuitOpen { retry_after } (retryable), QueueFull (retryable), and DeadlineExceeded (not retryable), each carrying the right error-forge retryability metadata.
Breaking changes
None. Everything in this release is additive; the new error variants slot into the existing #[non_exhaustive] enum.
Verification
Run on Windows x86_64, Rust stable 1.93.1; the same commands run in the CI matrix on Linux, macOS, and Windows across stable and MSRV 1.85:
cargo fmt --all -- --check
cargo clippy --all-targets --all-features -- -D warnings
cargo clippy --all-targets --no-default-features -- -D warnings
cargo test --all-features
cargo test --no-default-features
RUSTDOCFLAGS="-D warnings" cargo doc --no-deps --all-features
cargo bench --bench throttle_bench
cargo deny check
cargo audit
All green. Counts at this tag (--all-features): 103 unit tests, 7 property tests, 3 circuit-breaker integration tests, 2 retry integration tests, 61 doctests. The circuit-breaker state machine is proptested against a reference model; the queue's deadline-drop, overflow, and fair priority scheduling are covered by unit tests over the scheduler's pure selection logic plus async integration tests.
What's next
- v0.5.0 — Adaptive. Back off without an explicit rate-limit signal: an AIMD adaptive limiter (additive increase, multiplicative decrease) with floor and ceiling, a latency-based (Vegas-style) limiter keyed on observed p99, a custom adaptive-strategy trait, and an observe(outcome) feedback loop — never violating the underlying hard limit.
Installation
[dependencies]
throttle-net = "0.4"
# The circuit breaker is behind a feature flag:
throttle-net = { version = "0.4", features = ["circuit-breaker"] }
MSRV: Rust 1.85.
Documentation
Changelog: CHANGELOG.md.
Full Changelog: v0.3.0...v0.4.0
v0.3.0 — Retry and backoff
throttle-net v0.3.0 — Retry and backoff
Resilience that stands on its own. v0.3.0 adds the retry half of the library: a complete backoff taxonomy with jitter, a retry policy that classifies errors and honors Retry-After, and a dependency-free Retry-After parser. None of it requires a limiter — retry any fallible async call — but it composes cleanly with one. No breaking changes.
What is throttle-net?
A general-purpose outbound throttling and resilience library. Where rate-net protects your service from being overwhelmed (inbound), throttle-net protects your service from overwhelming the downstreams it calls — and from being banned by them. It paces your outbound work, composes limits across dimensions and scopes, and — as of this release — retries transient failures without stampeding.
What's new in 0.3.0
Backoff — the delay taxonomy
A backoff is a policy that turns an attempt number into a delay. Three base curves — constant, linear, exponential — each combine with a jitter mode.
use std::time::Duration;
use throttle_net::{Backoff, Jitter};
let backoff = Backoff::exponential(Duration::from_millis(100), 2.0)
.with_max(Duration::from_secs(30))
.with_jitter(Jitter::Decorrelated);
let mut delays = backoff.iter();
let first = delays.next_delay();
The jitter modes follow the AWS taxonomy:
- None — exactly the capped curve.
- Full — uniform in [0, delay]; maximum spread.
- Equal — delay/2 + rand(0, delay/2); keeps a floor while spreading.
- Decorrelated — min(max, rand(base, previous*3)); self-correlated, the strongest at breaking up a thundering herd, and the [Backoff::default].
Randomness comes from a small, no-dependency SplitMix64 generator — jitter needs spread, not cryptography. iter_seeded(seed) produces reproducible sequences for tests. A BackoffIter yields one delay per attempt and is an infinite Iterator; bounding attempts is the retry policy's job.
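The jitter formulas above are short enough to write out. The sketch below pairs the standard SplitMix64 step with the three randomized modes, working in whole milliseconds. All function names are made up for this example, the growth factor is fixed at 2, and the modulo-based range has a slight bias that is harmless for jitter; none of this is taken from the crate's source.

```rust
/// SplitMix64: a tiny, seedable, non-cryptographic generator.
pub struct SplitMix64(u64);

impl SplitMix64 {
    pub fn new(seed: u64) -> Self {
        Self(seed)
    }

    pub fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }

    /// Value in [lo, hi]; assumes lo <= hi and hi - lo < u64::MAX.
    pub fn range(&mut self, lo: u64, hi: u64) -> u64 {
        lo + self.next_u64() % (hi - lo + 1)
    }
}

/// Capped exponential curve: base * 2^attempt, saturating, at most max_ms.
pub fn capped_exponential_ms(base_ms: u64, attempt: u32, max_ms: u64) -> u64 {
    let factor = 1u64.checked_shl(attempt).unwrap_or(u64::MAX);
    base_ms.saturating_mul(factor).min(max_ms)
}

/// Full jitter: uniform in [0, delay].
pub fn full_jitter_ms(rng: &mut SplitMix64, delay_ms: u64) -> u64 {
    rng.range(0, delay_ms)
}

/// Equal jitter: delay/2 + rand(0, delay/2).
pub fn equal_jitter_ms(rng: &mut SplitMix64, delay_ms: u64) -> u64 {
    delay_ms / 2 + rng.range(0, delay_ms / 2)
}

/// Decorrelated jitter: min(max, rand(base, previous * 3)).
pub fn decorrelated_ms(rng: &mut SplitMix64, base_ms: u64, previous_ms: u64, max_ms: u64) -> u64 {
    let hi = previous_ms.saturating_mul(3).max(base_ms);
    rng.range(base_ms, hi).min(max_ms)
}
```

Two generators built from the same seed produce the same delays, which is the property that makes a seeded iterator useful in tests.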
Retry — the policy
Retry drives a fallible async operation with a backoff, an attempt ceiling, and per-error classification.
use throttle_net::{Backoff, Retry, RetryAction};
let retry = Retry::new(Backoff::default()).max_attempts(5);
let result = retry
.run(
|| async { call_downstream().await },
|err| if err.is_transient() { RetryAction::Retry } else { RetryAction::GiveUp },
)
.await;
The classifier returns a RetryAction per error — Retry (use the backoff), RetryAfter(duration) (honor a server hint), or GiveUp — so retry works with any error type. For error-forge errors, retry_if_retryable classifies by the error's own is_retryable().
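The control flow behind a classifier-driven retry is compact. What follows is a synchronous sketch under stated assumptions: `Action`, `run_with_retry`, and the injected `sleep` callback are hypothetical stand-ins, not the crate's Retry or RetryAction, and the real policy is async.

```rust
use std::time::Duration;

/// Hypothetical stand-in for a per-error retry decision.
pub enum Action {
    Retry,
    RetryAfter(Duration),
    GiveUp,
}

/// Run `op` up to `max_attempts` times, sleeping between attempts.
/// `sleep` is injected so callers (and tests) decide how to wait.
pub fn run_with_retry<T, E>(
    max_attempts: u32,
    backoff: impl Fn(u32) -> Duration,
    mut op: impl FnMut() -> Result<T, E>,
    classify: impl Fn(&E) -> Action,
    mut sleep: impl FnMut(Duration),
) -> Result<T, E> {
    let mut attempt = 0;
    loop {
        let err = match op() {
            Ok(value) => return Ok(value),
            Err(e) => e,
        };
        attempt += 1;
        // The attempt ceiling wins over the classifier.
        if attempt >= max_attempts {
            return Err(err);
        }
        match classify(&err) {
            Action::GiveUp => return Err(err),
            // No server hint: use the computed backoff for this attempt.
            Action::Retry => sleep(backoff(attempt)),
            // Server hint: it replaces the computed backoff.
            Action::RetryAfter(hint) => sleep(hint),
        }
    }
}
```

Keeping the classifier as a plain function of the error is what lets the same loop serve any error type.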
Retry-After — parsed and honored
parse_retry_after handles both header forms RFC 9110 allows: a delay in seconds and an HTTP date (all three date formats — IMF-fixdate, RFC 850, asctime). It needs no date-library dependency and never panics on malformed input — bad values return None. A classifier turns a parsed value into RetryAction::RetryAfter, which the policy honors over its computed backoff when respect_retry_after(true) is set.
use throttle_net::{parse_retry_after, RetryAction};
let action = match parse_retry_after("120") {
Some(after) => RetryAction::RetryAfter(after),
None => RetryAction::Retry,
};
The runnable examples/retry_backoff.rs shows a flaky downstream retried with jittered backoff, switching to the server's Retry-After on a 429.
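The never-panic, None-on-garbage contract is easiest to see on the seconds form. This is a deliberately partial sketch: `parse_delay_seconds` is a hypothetical helper that handles only the delay-seconds form and returns None for everything else, including the HTTP-date forms that the real parser accepts.

```rust
use std::time::Duration;

/// Hypothetical helper: parse the delay-seconds form of Retry-After.
/// HTTP-date values are intentionally not handled in this sketch.
pub fn parse_delay_seconds(value: &str) -> Option<Duration> {
    let v = value.trim();
    // Require digits only. `str::parse::<u64>` alone would also accept "+5".
    if v.is_empty() || !v.bytes().all(|b| b.is_ascii_digit()) {
        return None;
    }
    // Values too large for u64 fail to parse and become None, not a panic.
    v.parse::<u64>().ok().map(Duration::from_secs)
}
```

Malformed input falls through to None, so a caller can default to its computed backoff exactly as the match above does.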
Breaking changes
None. Everything in this release is additive.
Verification
Run on Windows x86_64, Rust stable 1.93.1; the same commands run in the CI matrix on Linux, macOS, and Windows across stable and MSRV 1.85:
cargo fmt --all -- --check
cargo clippy --all-targets --all-features -- -D warnings
cargo clippy --all-targets --no-default-features -- -D warnings
cargo test --all-features
cargo test --no-default-features
RUSTDOCFLAGS="-D warnings" cargo doc --no-deps --all-features
cargo bench --bench throttle_bench
cargo deny check
cargo audit
All green. Counts at this tag (--all-features): 75 unit tests, 7 property tests, 2 retry integration tests, 55 doctests. The decorrelated-jitter thundering-herd scatter and the Retry-After parse-and-honor path are covered by dedicated integration tests; retry timing is verified deterministically under tokio's paused clock.
What's next
- v0.4.0 — Resilience. Circuit breaker (closed / open / half-open, count-, ratio-, and rolling-window thresholds) that wraps any limiter and fails fast when open; an exact sliding-window log; and queue policies (bounded, deadline-aware, priority, fair-across-keys).
Installation
[dependencies]
throttle-net = "0.3"
MSRV: Rust 1.85.
Documentation
Changelog: CHANGELOG.md.
Full Changelog: v0.2.0...v0.3.0
v0.2.0 — Foundation
throttle-net v0.2.0 — Composition
The foundation and the differentiators, in one release. v0.2.0 turns the scaffold into a working library: the waiting outbound acquire, the Limiter trait every algorithm and composite shares, and the four ways to compose limits that motivate the crate — hybrid, multi-dimensional, per-key, and layered. All of it shares one correctness rule (peek-then-commit) so a request never spends in one limiter when another would block it. No breaking changes from 0.1; that release was an empty scaffold.
What is throttle-net?
A general-purpose outbound throttling library. Where rate-net protects your service from being overwhelmed (inbound), throttle-net protects your service from overwhelming the downstreams it calls — and from being banned by them. Its defining operation is to wait for capacity rather than reject the caller: you pace your own outbound work. It consumes better-bucket for token-bucket accounting and clock-lib for time, and builds the waiting, cost-aware, composable surface on top.
What's new in 0.2.0
Throttle — the Tier-1 token bucket
The common case in two calls. Construct with per_second / per_duration, then pace your work with acquire().await, which returns as soon as a token is free rather than rejecting you.
use throttle_net::Throttle;
let throttle = Throttle::per_second(100);
throttle.acquire().await?; // waits if necessary
throttle.acquire_with_cost(10).await?; // a heavier request
try_acquire() / try_acquire_with_cost(n) are the non-blocking forms, peek(cost) is a non-consuming check, and with_clock injects a ManualClock for deterministic tests. Accounting is lock-free — one atomic compare-and-swap per acquire, ~27ns uncontended.
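To show what compare-and-swap accounting can look like, here is a stripped-down sketch. The `Bucket` type is hypothetical, refill is a manual call rather than time-driven, and the real accounting lives in better-bucket, so treat this as the general shape of the technique rather than the crate's code.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Hypothetical lock-free token counter (no time-based refill).
pub struct Bucket {
    tokens: AtomicU32,
    capacity: u32,
}

impl Bucket {
    pub fn new(capacity: u32) -> Self {
        Self { tokens: AtomicU32::new(capacity), capacity }
    }

    pub fn available(&self) -> u32 {
        self.tokens.load(Ordering::Acquire)
    }

    /// Non-consuming check.
    pub fn peek(&self, cost: u32) -> bool {
        self.available() >= cost
    }

    /// Consume `cost` tokens if present. One CAS when uncontended;
    /// a lost race retries against the freshly observed value.
    pub fn try_acquire_with_cost(&self, cost: u32) -> bool {
        let mut current = self.tokens.load(Ordering::Acquire);
        loop {
            if current < cost {
                return false;
            }
            match self.tokens.compare_exchange_weak(
                current,
                current - cost,
                Ordering::AcqRel,
                Ordering::Acquire,
            ) {
                Ok(_) => return true,
                Err(actual) => current = actual,
            }
        }
    }

    /// Add tokens back, never exceeding capacity.
    pub fn refill(&self, n: u32) {
        let cap = self.capacity;
        let _ = self.tokens.fetch_update(Ordering::AcqRel, Ordering::Acquire, |t| {
            Some(t.saturating_add(n).min(cap))
        });
    }
}
```

A refused acquire returns before any write, so failure leaves the count untouched.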
Limiter — the trait everything shares
pub trait Limiter: Send + Sync {
fn peek(&self, cost: u32) -> Decision; // non-consuming
fn acquire_cost(&self, cost: u32) -> Decision; // consuming
fn available(&self) -> u32;
fn capacity(&self) -> u32;
}
acquire_cost is the synchronous, consuming core; the waiting acquire surfaces are thin layers on top. peek is what makes "must pass all" composition correct: a composite peeks every constituent before committing, so an early limiter never spends a token for a request a later one refuses.
Hybrid — must pass all constituents
Combine limiters so a request must satisfy every one — "10 per second and 100 per minute" on a single resource, where either ceiling can bind. A Hybrid is itself a Limiter, so hybrids nest.
use std::time::Duration;
use throttle_net::{Hybrid, Throttle};
let hybrid = Hybrid::builder()
.limiter(Throttle::per_second(10))
.limiter(Throttle::per_duration(100, Duration::from_secs(60)))
.build();
The peek-then-commit contract is tested directly: when a later constituent refuses, the earlier ones keep their tokens.
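Peek-then-commit reduces to two passes. The sketch below uses a made-up `Budget` type and a free function `acquire_all`, and is single-threaded; a concurrent implementation also has to deal with another caller spending between the peek and the commit, which this example does not attempt.

```rust
/// Hypothetical constituent with a plain token count.
pub struct Budget {
    pub name: &'static str,
    pub tokens: u32,
}

impl Budget {
    /// Non-consuming: could this budget afford `cost` right now?
    pub fn peek(&self, cost: u32) -> bool {
        self.tokens >= cost
    }

    /// Consuming: only called after every constituent has passed `peek`.
    pub fn commit(&mut self, cost: u32) {
        self.tokens -= cost;
    }
}

/// Admit only if every constituent can afford the request.
pub fn acquire_all(limiters: &mut [Budget], cost: u32) -> bool {
    // Pass 1: peek everything. Nothing has been spent if any one refuses.
    if !limiters.iter().all(|l| l.peek(cost)) {
        return false;
    }
    // Pass 2: commit everywhere.
    for l in limiters.iter_mut() {
        l.commit(cost);
    }
    true
}
```

With a per-second budget of 10 and a per-minute budget of 3, a second request of cost 2 is refused by the per-minute budget and the per-second budget keeps its remaining 8.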
MultiLimiter — multi-dimensional budgets
The killer feature for LLM APIs. One call spends against several named budgets at once — requests, input tokens, output tokens — and is admitted only when every dimension can afford its share.
use std::time::Duration;
use throttle_net::{MultiLimiter, Throttle};
let minute = Duration::from_secs(60);
let limiter = MultiLimiter::builder()
.dimension("requests", Throttle::per_duration(60, minute))
.dimension("input_tokens", Throttle::per_duration(100_000, minute))
.dimension("output_tokens", Throttle::per_duration(20_000, minute))
.build();
limiter
.acquire_costs(&[("requests", 1), ("input_tokens", 1500), ("output_tokens", 200)])
.await?;
The runnable examples/llm_budget.rs drives a batch of calls through this and shows the limiter pacing itself when a budget runs low.
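The all-or-nothing rule across named dimensions can be sketched in a few lines. `Budgets`, `Refusal`, and `try_acquire_costs` are hypothetical names for this illustration, the budgets never refill, and the sketch assumes each dimension appears at most once per call.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq, Eq)]
pub enum Refusal {
    UnknownDimension(String),
    Insufficient(String),
}

/// Hypothetical set of named budgets (static, no refill).
pub struct Budgets {
    remaining: HashMap<String, u64>,
}

impl Budgets {
    pub fn new(dims: &[(&str, u64)]) -> Self {
        Self { remaining: dims.iter().map(|(n, v)| (n.to_string(), *v)).collect() }
    }

    pub fn remaining(&self, name: &str) -> Option<u64> {
        self.remaining.get(name).copied()
    }

    /// Spend against several dimensions at once, or spend nothing.
    /// Assumes each dimension is listed at most once in `costs`.
    pub fn try_acquire_costs(&mut self, costs: &[(&str, u64)]) -> Result<(), Refusal> {
        // Pass 1: every dimension must exist and afford its share.
        for (name, cost) in costs {
            match self.remaining.get(*name) {
                None => return Err(Refusal::UnknownDimension(name.to_string())),
                Some(left) if left < cost => {
                    return Err(Refusal::Insufficient(name.to_string()));
                }
                Some(_) => {}
            }
        }
        // Pass 2: commit. Nothing was spent if pass 1 returned early.
        for (name, cost) in costs {
            if let Some(left) = self.remaining.get_mut(*name) {
                *left -= cost;
            }
        }
        Ok(())
    }
}
```

A call that would overdraw input tokens leaves the request count untouched, so a refused call does not eat into the dimensions that could have afforded it.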
PerKey — independent state per key, bounded memory
Each key — a tenant, a user, a token — gets its own bucket, so one noisy key cannot spend another's budget. State lives in a sharded concurrent map: an existing key's acquire takes only a shard read lock plus the bucket's atomic accounting, so unrelated keys never contend. K is any hashable type.
use throttle_net::PerKey;
let limiter: PerKey<String> = PerKey::per_second(100);
limiter.acquire(&"tenant:42".to_string()).await?;
Memory is bounded by Eviction — an idle TTL and/or a hard key cap, enforced lazily and per-shard, so a flood of unique keys hits a ceiling instead of growing without limit. The default policy is bounded. A populated 10 000-key lookup measures ~70ns.
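As a sketch of how lazy, bounded eviction might work, the example below keeps per-key counters in one unsharded map and makes room only when a new key arrives. The `PerKeyBuckets` type is hypothetical, the buckets do not refill, and evicting the least recently used key at the cap is an assumption of this example; the notes do not say which key the crate evicts.

```rust
use std::collections::HashMap;

struct Entry {
    tokens: u32,
    last_used_ms: u64,
}

/// Hypothetical per-key store with an idle TTL and a hard key cap.
pub struct PerKeyBuckets {
    per_key_tokens: u32,
    idle_ttl_ms: u64,
    max_keys: usize,
    entries: HashMap<String, Entry>,
}

impl PerKeyBuckets {
    pub fn new(per_key_tokens: u32, idle_ttl_ms: u64, max_keys: usize) -> Self {
        assert!(max_keys > 0, "max_keys must be positive");
        Self { per_key_tokens, idle_ttl_ms, max_keys, entries: HashMap::new() }
    }

    pub fn key_count(&self) -> usize {
        self.entries.len()
    }

    pub fn try_acquire(&mut self, key: &str, now_ms: u64) -> bool {
        if !self.entries.contains_key(key) {
            // Eviction is lazy: it runs only when a new key needs a slot.
            self.make_room(now_ms);
            self.entries.insert(
                key.to_string(),
                Entry { tokens: self.per_key_tokens, last_used_ms: now_ms },
            );
        }
        let entry = self.entries.get_mut(key).expect("inserted above");
        entry.last_used_ms = now_ms;
        if entry.tokens == 0 {
            return false;
        }
        entry.tokens -= 1;
        true
    }

    fn make_room(&mut self, now_ms: u64) {
        // Drop keys idle for at least the TTL.
        let ttl = self.idle_ttl_ms;
        self.entries.retain(|_, e| now_ms.saturating_sub(e.last_used_ms) < ttl);
        // Still at the cap: evict the least recently used key (assumed policy).
        if self.entries.len() >= self.max_keys {
            let oldest = self
                .entries
                .iter()
                .min_by_key(|(_, e)| e.last_used_ms)
                .map(|(k, _)| k.clone());
            if let Some(k) = oldest {
                self.entries.remove(&k);
            }
        }
    }
}
```

However many distinct keys arrive, the map never holds more than max_keys entries, which is the flood bound in miniature.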
Layered — ordered scopes
Stack a global ceiling, a per-key share, and a per-endpoint cap; a request must clear every configured scope. The key and endpoint types are independent (a numeric tenant id and a string route, say).
use throttle_net::{Layered, PerKey, Throttle};
let layered = Layered::<String>::builder()
.global(Throttle::per_second(1000))
.per_key(PerKey::per_second(100))
.per_endpoint(PerKey::per_second(50))
.build();
layered.acquire(&"tenant:42".to_string(), &"/v1/chat".to_string()).await?;
Property tests and benchmarks
tests/proptests.rs encodes the defining invariant — a limiter never admits more than its capacity, and a composite never admits more than its binding scope — as an equality checked over a wide input space, for Throttle, Hybrid, PerKey, Layered, and MultiLimiter, plus the per-key flood bound. benches/throttle_bench.rs tracks the single-throttle acquire and the 10k-key per-key lookup.
Breaking changes
None. v0.1.0 was an empty scaffold; this is the first release with a public surface.
Verification
Run on Windows x86_64, Rust stable 1.93.1; the same commands run in the CI matrix on Linux, macOS, and Windows across stable and MSRV 1.85:
cargo fmt --all -- --check
cargo clippy --all-targets --all-features -- -D warnings
cargo clippy --no-default-features
cargo test --all-features
cargo test --no-default-features
RUSTDOCFLAGS="-D warnings" cargo doc --no-deps --all-features
cargo bench --bench throttle_bench
cargo deny check
cargo audit
All green. Counts at this tag (--all-features): 53 unit tests, 7 property tests, 45 doctests. Benchmarks: throttle_try_acquire ~27ns, perkey_lookup_10k_existing ~70ns (target: < 1µs).
What's next
- v0.3.0 — Retry and backoff. Standalone retry that also composes with limiters: constant / linear / exponential backoff with Full, Equal, and Decorrelated jitter; retry-on / give-up-on conditions; Retry-After header parsing honored over the computed backoff.
Installation
[dependencies]
throttle-net = "0.2"
MSRV: Rust 1.85.
Documentation
Changelog: CHANGELOG.md.
v0.1.0 — Scaffold
throttle-net v0.1.0 — Scaffold
The repository bootstrap. v0.1.0 establishes the structure, tooling, and quality gates the implementation will be built on — crate metadata, the feature-flag layout, the test-stack dev-dependencies, the CI matrix, the security gates (cargo-deny + cargo-audit), and the docs/API.md reference skeleton. There is no throttling logic yet; the algorithm surface lands across the 0.x series per the roadmap. Zero public API beyond the crate root.
What is throttle-net?
A general-purpose outbound throttling and resilience library. Where rate-net protects your service from being overwhelmed (inbound), throttle-net protects your service from overwhelming downstream APIs and from being banned by them (outbound). It composes rate-limiting algorithms into hybrid and layered policies and adds the parts nobody ships: multi-dimensional cost-aware limits and adaptive throttling. The common case is one builder and one acquire().await?.
What's in 0.1.0
Crate metadata and release profile
Cargo.toml carries the full published-crate metadata: description, keywords, categories, repository / homepage / documentation links, dual Apache-2.0 OR MIT license, Rust 2024 edition, MSRV 1.85, and docs.rs configuration (all-features + --cfg docsrs). The release profile is tuned for the eventual hot paths: opt-level = 3, fat LTO, a single codegen unit, panic = "abort", and symbol stripping.
Feature-flag layout
The flag surface is declared up front so downstream wiring is stable as features land:
[features]
default = ["std", "tokio"]
std = []
tokio = ["dep:tokio"]
adaptive = []
circuit-breaker = []
provider-headers = []
provider-llm = ["provider-headers"]
metrics = ["dep:metrics"]
tracing = ["dep:tracing"]
serde = ["dep:serde"]
Runtime adapters beyond tokio (async-std, smol) are scheduled for v0.8.0 and are intentionally not wired until there is code that uses them — this keeps the dependency graph minimal and avoids carrying an unused, upstream-discontinued runtime through seven phases of development.
Test-stack dev-dependencies
The full verification stack is pinned and ready: criterion for hot-path benchmarks, proptest for invariant coverage, and loom (under cfg(loom)) for concurrency model checks. A harness = false benchmark target is registered so cargo bench resolves cleanly before the first benchmark lands.
Crate-root lint posture
src/lib.rs sets the non-negotiable posture from line one: #![forbid(unsafe_code)], #![deny(missing_docs)], no_std-ready behind the std feature, and doc_cfg annotations under docsrs. A single smoke test proves the crate compiles, links, and runs.
CI matrix and gates
.github/workflows/ci.yml runs three jobs:
- test — fmt --check, clippy --all-targets --all-features -D warnings, test --all-features, and doc with -D warnings, across Linux / macOS / Windows on both stable and the 1.85 MSRV.
- loom — the concurrency model-check slot, wired ahead of the first lock-free path.
- security — cargo audit against the RustSec advisory database and cargo deny check for advisories, bans, licenses, and sources.
.gitattributes normalises line endings to LF so the Windows runner does not fail fmt --check against the newline_style = "Unix" rule.
Breaking changes
None. This is the initial release.
Verification
Run on Windows x86_64, Rust stable 1.93.1; the same commands run in the CI matrix on Linux, macOS, and Windows across stable and MSRV 1.85:
cargo fmt --all -- --check
cargo clippy --all-targets --all-features -- -D warnings
cargo test --all-features
RUSTDOCFLAGS="-D warnings" cargo doc --no-deps --all-features
cargo deny check
cargo audit
All green. At this tag: 1 unit test (smoke), 0 doctests; cargo deny reports advisories ok, bans ok, licenses ok, sources ok; cargo audit scans the dependency tree with zero advisories.
What's next
- v0.2.0 — Composition. The differentiators, front-loaded: hybrid limiters (a request must pass every constituent), layered scopes (global / per-key / per-endpoint), per-key state with a sharded map and inactive-key cleanup, and cost-aware acquisition (acquire_with_cost plus multi-dimensional acquire_costs). Exit criteria include a 10k-key lookup benchmarked under 1µs and an end-to-end multi-dimensional LLM example.
Installation
[dependencies]
throttle-net = "0.1"
MSRV: Rust 1.85.
Documentation
Changelog: CHANGELOG.md.