```{contents}
```
## Circuit Breakers

---

### 1. Motivation and Intuition

**Generative AI systems are probabilistic, expensive, and failure-prone.**
They depend on:

* Remote APIs (LLMs, vector DBs, tools)
* GPU-heavy inference
* Long multi-step pipelines (RAG, agents, tools)

A **circuit breaker** is a **resilience mechanism** that prevents cascading failures by **stopping calls to a failing component before the failure takes down the rest of the system**.

> **Analogy:**
> Like an electrical circuit breaker that cuts power during overload, an AI circuit breaker halts requests to unstable services.

---

### 2. Where Circuit Breakers Fit in GenAI Architecture

| Layer         | Failure Example                       |
| ------------- | ------------------------------------- |
| LLM API       | Rate limits, timeouts, hallucinations |
| Vector DB     | Latency spikes, index unavailable     |
| Tools / APIs  | HTTP 5xx, invalid responses           |
| Agents        | Infinite loops, runaway tool usage    |
| GPU Inference | OOM, overload                         |
| User Traffic  | Prompt flooding, abuse                |

**Circuit breakers protect each of these layers.**

---

### 3. Circuit Breaker States

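| State         | Behavior                                                          |
| ------------- | ----------------------------------------------------------------- |
| **Closed**    | Normal operation: requests pass through and failures are counted  |
| **Open**      | All requests are rejected immediately, without calling the component |
| **Half-Open** | A limited number of test requests are allowed through             |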

```
Closed ──(failures exceed threshold)──► Open
Open ──(cooldown timeout)────────────► Half-Open
Half-Open ──(success)───────────────► Closed
Half-Open ──(failure)───────────────► Open
```
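
The transitions above can also be written as a small lookup table. This is only an illustrative sketch: the `State` enum, the event names, and `next_state` are hypothetical labels for the arrows in the diagram, not part of any particular library.

```python
from enum import Enum


class State(Enum):
    CLOSED = "closed"
    OPEN = "open"
    HALF_OPEN = "half_open"


# Event names are illustrative labels for the arrows in the diagram.
TRANSITIONS = {
    (State.CLOSED, "threshold_exceeded"): State.OPEN,
    (State.OPEN, "cooldown_elapsed"): State.HALF_OPEN,
    (State.HALF_OPEN, "success"): State.CLOSED,
    (State.HALF_OPEN, "failure"): State.OPEN,
}


def next_state(state, event):
    """Return the state after `event`; unlisted pairs leave the state unchanged."""
    return TRANSITIONS.get((state, event), state)
```

Keeping the transitions in one table makes it easy to check that no arrow is missing, for example that a success while `CLOSED` does not change the state.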

---

### 4. Why GenAI Needs Circuit Breakers

| Risk                 | Consequence                 |
| -------------------- | --------------------------- |
| LLM outage           | Whole product freezes       |
| Rate limiting        | Expensive retries & latency |
| Hallucination bursts | Trust collapse              |
| Tool chain failures  | Agent deadlocks             |
| Prompt attacks       | Resource exhaustion         |

**Without circuit breakers:**

* Failures cascade
* Latency explodes
* Costs spiral
* User experience collapses

---

### 5. Typical GenAI Circuit Breaker Triggers

| Trigger        | Example                      |
| -------------- | ---------------------------- |
| Error rate     | >30% LLM API failures        |
| Latency        | >10s response time           |
| Cost spike     | Token budget exceeded        |
| Quality        | Hallucination score too high |
| Loop detection | Agent stuck in cycle         |
| Safety         | Repeated policy violations   |
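
As a rough sketch of how the error-rate and latency triggers might be evaluated, the class below tracks a rolling window of recent calls. The name `RollingTrigger`, its parameters, and the choice to count a slow call as a failure are assumptions made for illustration; the default thresholds simply mirror the examples in the table.

```python
from collections import deque


class RollingTrigger:
    """Decides whether to trip based on the most recent `window` calls."""

    def __init__(self, window=20, max_bad_rate=0.30, max_latency_s=10.0, min_calls=5):
        self.samples = deque(maxlen=window)  # each entry is True if the call was "bad"
        self.max_bad_rate = max_bad_rate
        self.max_latency_s = max_latency_s
        self.min_calls = min_calls

    def record(self, ok, latency_s):
        # A call is bad if it errored or took longer than the latency limit.
        self.samples.append((not ok) or latency_s > self.max_latency_s)

    def bad_rate(self):
        return sum(self.samples) / len(self.samples) if self.samples else 0.0

    def should_trip(self):
        # Require a minimum sample size so a single early failure does not trip.
        return len(self.samples) >= self.min_calls and self.bad_rate() > self.max_bad_rate
```

A breaker would call `record` after every request and open the circuit when `should_trip()` returns `True`. Cost, quality, and safety triggers could follow the same pattern with a different definition of a bad call.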

---

### 6. Workflow in a RAG System

```
User Prompt
   │
   ▼
[Request Router]
   │
   ├─► Circuit Breaker ──► LLM API
   │         │
   │         └─(open)─► Fallback model / cached answer
   │
   └─► Vector DB ──► Retriever ──► LLM
```
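
The fallback branch in the diagram could look roughly like the sketch below. It assumes a breaker object exposing a `call(func)` method (like the `CircuitBreaker` class in section 7), and the names `answer`, `primary_llm`, `fallback_llm`, and `cache` are hypothetical stand-ins for your own components.

```python
def answer(prompt, breaker, primary_llm, fallback_llm, cache):
    """Try the primary LLM through the breaker, then a fallback model, then the cache.

    Returns a (text, source) tuple so callers can see which path served the request.
    """
    try:
        result = breaker.call(lambda: primary_llm(prompt))
        cache[prompt] = result  # remember the last good answer for this prompt
        return result, "primary"
    except Exception:
        # Either the circuit is open or the primary call failed; degrade below.
        pass

    try:
        return fallback_llm(prompt), "fallback"
    except Exception:
        if prompt in cache:
            return cache[prompt], "cache"
        raise  # nothing left to degrade to
```

Returning the source alongside the text is a design choice that makes degraded responses visible to logging and to the UI.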

---

### 7. Minimal Implementation (Python)

```python
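import time


class CircuitOpenError(Exception):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_limit=5, cooldown=10):
        self.failures = 0                   # consecutive failures seen so far
        self.failure_limit = failure_limit  # failures needed to open the circuit
        self.cooldown = cooldown            # seconds to stay open before probing
        self.last_failure_time = 0.0
        self.state = "CLOSED"

    def call(self, func):
        if self.state == "OPEN":
            # time.monotonic() is unaffected by system clock adjustments.
            if time.monotonic() - self.last_failure_time > self.cooldown:
                self.state = "HALF_OPEN"    # allow a trial request through
            else:
                raise CircuitOpenError("Circuit open")

        try:
            result = func()
        except Exception:
            self.failures += 1
            self.last_failure_time = time.monotonic()
            # A failed trial request reopens the circuit immediately.
            if self.state == "HALF_OPEN" or self.failures >= self.failure_limit:
                self.state = "OPEN"
            raise

        # Success: reset the failure count and close the circuit.
        self.failures = 0
        self.state = "CLOSED"
        return result
```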

**Usage with an LLM call:**

```python
cb = CircuitBreaker()

def call_llm():
    # `llm_api` and `prompt` are placeholders for your LLM client and input.
    return llm_api.generate(prompt)

response = cb.call(call_llm)
```

---

### 8. GenAI-Specific Enhancements

| Enhancement           | Purpose                          |
| --------------------- | -------------------------------- |
| Fallback models       | Switch to smaller/cheaper LLM    |
| Prompt caching        | Serve last good response         |
| Cost-aware breaker    | Open if token burn too high      |
| Quality-aware breaker | Stop if hallucination detected   |
| Agent-step limiter    | Prevent infinite reasoning loops |
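
To make the cost-aware breaker concrete, here is a minimal sketch that blocks calls once token usage in a sliding time window reaches a budget. `TokenBudgetBreaker`, its parameters, and the injectable `clock` are assumptions for illustration; how token counts are obtained depends on your LLM client.

```python
import time
from collections import deque


class TokenBudgetBreaker:
    """Blocks calls while tokens used in the last `window_s` seconds reach `max_tokens`."""

    def __init__(self, max_tokens, window_s=60.0, clock=time.monotonic):
        self.max_tokens = max_tokens
        self.window_s = window_s
        self.clock = clock     # injectable so the window can be tested
        self.events = deque()  # (timestamp, tokens) in arrival order
        self.total = 0

    def _evict(self):
        # Drop usage records that have aged out of the window.
        cutoff = self.clock() - self.window_s
        while self.events and self.events[0][0] <= cutoff:
            _, tokens = self.events.popleft()
            self.total -= tokens

    def allow(self):
        self._evict()
        return self.total < self.max_tokens

    def record(self, tokens):
        self._evict()
        self.events.append((self.clock(), tokens))
        self.total += tokens
```

A caller would check `allow()` before each request and call `record()` with the token count reported in the response. When `allow()` returns `False`, the request can be routed to a cheaper fallback model or a cached answer.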

---

### 9. Circuit Breaker + Agent Safety

| Failure                  | Protection                   |
| ------------------------ | ---------------------------- |
| Agent infinite tool loop | Step counter breaker         |
| Prompt injection storm   | Rate-limit breaker           |
| Bad model output         | Quality breaker              |
| Vector DB down           | Retrieval breaker + fallback |
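
The step counter breaker in the first row might be sketched as follows. `AgentStepBreaker`, `AgentLoopError`, and the limits shown are hypothetical, and the sketch assumes tool arguments can be serialized with `json.dumps` (non-serializable values fall back to `str`).

```python
import json
from collections import Counter


class AgentLoopError(RuntimeError):
    """Raised when an agent exceeds its step budget or repeats the same call."""


class AgentStepBreaker:
    def __init__(self, max_steps=10, max_repeats=3):
        self.max_steps = max_steps
        self.max_repeats = max_repeats
        self.steps = 0
        self.seen = Counter()  # counts identical (tool, arguments) calls

    def check(self, tool_name, tool_args):
        """Call once before each tool invocation; raises if the agent should stop."""
        self.steps += 1
        if self.steps > self.max_steps:
            raise AgentLoopError(f"step limit of {self.max_steps} exceeded")

        # Serialize arguments with sorted keys so equal dicts produce equal keys.
        key = (tool_name, json.dumps(tool_args, sort_keys=True, default=str))
        self.seen[key] += 1
        if self.seen[key] > self.max_repeats:
            raise AgentLoopError(f"{tool_name} repeated with identical arguments")
```

Creating one breaker per agent run keeps the counters scoped to a single task. The agent loop can catch `AgentLoopError` and return a partial answer instead of continuing to spend tokens.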

---

### 10. Summary

**Circuit breakers are mandatory for production-grade GenAI systems.**

They ensure:

* Stability
* Cost control
* User trust
* Safe agent behavior
* Graceful degradation

