feat(service): add CircuitBreaker middleware#2545

Open
Mattbusel wants to merge 3 commits into hyperium:master from Mattbusel:feat/circuit-breaker-layer

@Mattbusel

Motivation

gRPC channels can fail silently. A downstream that is overwhelmed or crashing continues to accept connections and return errors rather than refusing them. Without a circuit breaker, callers flood a failing service with retries, wasting resources and extending recovery time.

Tonic already exposes Tower-based helpers through tonic::service, including Interceptor, RecoverError, LayerExt, and Layered. A circuit breaker is a natural complement: it is a standard resilience pattern for RPC clients.

Change

Adds tonic::service::circuit_breaker::{CircuitBreaker, CircuitBreakerLayer} and re-exports both from tonic::service.

State machine

 ┌────────┐  consecutive_failures >= threshold  ┌──────┐
 │ Closed │ ─────────────────────────────────► │ Open │
 └────────┘                                     └──────┘
     ▲                                              │
     │  success_rate >= success_threshold           │ timeout elapsed
     │                                              ▼
     └────────────────────────────────── ┌──────────┐
                                         │ HalfOpen │
                                         └──────────┘
  • Closed: requests flow through normally.
  • Open: requests are immediately rejected with Status::unavailable("circuit breaker is open").
  • HalfOpen: once timeout has elapsed, one probe request is allowed through; a success rate of at least success_threshold closes the circuit, and a failure reopens it.
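
The transitions above can be sketched with the standard library alone. This is a minimal illustration, not the code in this PR: `Breaker`, `State`, `allow`, `on_success`, and `on_failure` are hypothetical names, and the sketch assumes a simplified HalfOpen rule in which the single probe decides the outcome (the success_threshold window is left out). The current time is passed in explicitly so the behaviour is deterministic.

```rust
use std::time::{Duration, Instant};

/// Hypothetical three-state model; not the type added by this PR.
#[derive(Debug, Clone, Copy, PartialEq)]
enum State {
    Closed { consecutive_failures: u32 },
    Open { since: Instant },
    /// A single probe is in flight; all other requests are rejected.
    HalfOpen,
}

struct Breaker {
    state: State,
    failure_threshold: u32,
    timeout: Duration,
}

impl Breaker {
    fn new(failure_threshold: u32, timeout: Duration) -> Self {
        Self {
            state: State::Closed { consecutive_failures: 0 },
            failure_threshold,
            timeout,
        }
    }

    /// Returns true if a request may proceed at `now`.
    fn allow(&mut self, now: Instant) -> bool {
        match self.state {
            State::Closed { .. } => true,
            // Timeout elapsed: let exactly one probe through.
            State::Open { since }
                if now.saturating_duration_since(since) >= self.timeout =>
            {
                self.state = State::HalfOpen;
                true
            }
            State::Open { .. } => false,
            State::HalfOpen => false,
        }
    }

    /// Records a successful response.
    fn on_success(&mut self) {
        self.state = match self.state {
            // Assumption: one successful probe is enough to close.
            State::Closed { .. } | State::HalfOpen => {
                State::Closed { consecutive_failures: 0 }
            }
            // Late responses while Open do not change the state.
            open @ State::Open { .. } => open,
        };
    }

    /// Records a failed response observed at `now`.
    fn on_failure(&mut self, now: Instant) {
        self.state = match self.state {
            State::Closed { consecutive_failures } => {
                let n = consecutive_failures + 1;
                if n >= self.failure_threshold {
                    State::Open { since: now }
                } else {
                    State::Closed { consecutive_failures: n }
                }
            }
            // A failed probe reopens the circuit and restarts the timer.
            State::HalfOpen => State::Open { since: now },
            open @ State::Open { .. } => open,
        };
    }
}
```

Taking `now` as a parameter instead of calling `Instant::now()` inside the methods keeps the transitions testable without sleeping; a real layer would likely read the clock itself.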

Usage

use tonic::service::CircuitBreakerLayer;
use tower::ServiceBuilder;
use std::time::Duration;

let channel = tonic::transport::Channel::from_static("http://[::1]:50051")
    .connect()
    .await?;

// Open after 5 consecutive failures; probe after 30 s;
// close when 60 % of the sliding window are successes.
let channel = ServiceBuilder::new()
    .layer(CircuitBreakerLayer::new(5, 0.6, Duration::from_secs(30)))
    .service(channel);

let mut client = MyServiceClient::new(channel);

Implementation notes

  • No async runtime dependency in poll_ready/call — state is guarded by std::sync::Mutex (not tokio::sync), so it works correctly in single-threaded test executors.
  • The circuit gate is checked in poll_ready (not call), respecting the Tower service contract.
  • Uses pin_project and tower_layer/tower_service already in [dependencies] — no new deps.
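
To illustrate the first two notes, here is a rough std-only sketch of a readiness gate shared between clones through `Arc<std::sync::Mutex<_>>`. `SharedGate`, `Gate`, `trip`, and `poll_gate` are hypothetical names, not items from this PR. `poll_gate` only mirrors the return shape of a Tower `poll_ready`; a real middleware would also delegate to the inner service's `poll_ready` and return a `tonic::Status` instead of a string. HalfOpen handling is omitted here.

```rust
use std::sync::{Arc, Mutex};
use std::task::Poll;
use std::time::{Duration, Instant};

/// Hypothetical gate state; HalfOpen is left out for brevity.
#[derive(Clone, Copy)]
enum Gate {
    Closed,
    Open { until: Instant },
}

/// Cloning shares the same state, as cloned service handles would.
#[derive(Clone)]
struct SharedGate(Arc<Mutex<Gate>>);

impl SharedGate {
    fn new() -> Self {
        Self(Arc::new(Mutex::new(Gate::Closed)))
    }

    /// Opens the gate until `now + timeout`.
    fn trip(&self, now: Instant, timeout: Duration) {
        // unwrap: a poisoned lock is treated as a bug in this sketch.
        *self.0.lock().unwrap() = Gate::Open { until: now + timeout };
    }

    /// Readiness check: rejects while the gate is open.
    fn poll_gate(&self, now: Instant) -> Poll<Result<(), &'static str>> {
        // Copy the state out so the lock is released at the end of this
        // statement; nothing is awaited while it is held.
        let gate = *self.0.lock().unwrap();
        match gate {
            Gate::Open { until } if now < until => {
                Poll::Ready(Err("circuit breaker is open"))
            }
            _ => Poll::Ready(Ok(())),
        }
    }
}
```

Because the lock is taken and released synchronously, the check needs no async runtime, which is the property the first note relies on.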

gRPC channels can fail silently — a downstream that is overwhelmed or
crashing will keep accepting connections and returning errors rather than
refusing them.  Without a circuit breaker, callers retry into a failing
service, wasting resources and extending outages.

Add tonic::service::circuit_breaker::{CircuitBreaker, CircuitBreakerLayer}:

  - Three-state machine: Closed → Open → HalfOpen
  - Closed:   requests flow through normally
  - Open:     requests are immediately rejected with Status::unavailable
              ("circuit breaker is open") until `timeout` elapses
  - HalfOpen: one probe request is allowed; success above
              `success_threshold` closes the circuit, any failure reopens it

The implementation is pure Tower middleware: no async runtime dependency
in poll_ready/call, with state guarded by std::sync::Mutex rather than an async lock.

Usage:
  let channel = ServiceBuilder::new()
      .layer(CircuitBreakerLayer::new(5, 0.6, Duration::from_secs(30)))
      .service(channel);
  let mut client = MyClient::new(channel);

Uses pin_project and tower_layer/tower_service already in [dependencies].