```{contents}
```
## **Exception & Circuit Breaker in LangGraph**

In production-grade LangGraph systems, **exception handling** and **circuit breakers** are critical reliability mechanisms that prevent cascading failures, protect external services, and maintain system stability under partial outages or faulty model behavior.

---

### **1. Motivation: Why Reliability Controls Matter**

LLM workflows interact with unreliable components:

* External APIs
* Tools and databases
* Unstable LLM responses
* Long-running stateful processes

Failures are inevitable.
LangGraph lets you model them as **first-class control-flow events**: a node records the failure in state, and the graph routes on it.

---

### **2. Exception Handling in LangGraph**

An **exception** is any runtime failure that interrupts node execution.

#### Sources of Exceptions

| Source  | Example                        |
| ------- | ------------------------------ |
| LLM     | Rate limit, malformed response |
| Tool    | API timeout, HTTP error        |
| Code    | Python exception               |
| State   | Invalid schema, missing fields |
| Network | Connection failure             |

#### Exception-Aware Node Design

```python
def tool_node(state):
    try:
        result = external_api(state["query"])
        return {"result": result}
    except Exception as e:
        return {"error": str(e)}
```

The node always returns a state update: failures are converted into **state signals** that downstream routing can inspect, instead of exceptions that abort the run.
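
A bare error string makes it hard for later nodes to tell transient failures from permanent ones. The sketch below records a small structured payload instead. It is an illustration under assumptions: `external_api` is a hypothetical stub, and the payload shape (`type`, `message`, `retryable`) and the `RETRYABLE` tuple are choices made for this example, not a LangGraph convention.

```python
from typing import Any, Dict


def external_api(query: str) -> str:
    """Hypothetical stand-in for a real client call."""
    if not query:
        raise ValueError("empty query")
    if query == "slow":
        raise TimeoutError("upstream timed out")
    return f"answer for {query}"


# Exception types treated as transient in this sketch (an assumption).
RETRYABLE = (TimeoutError, ConnectionError)


def tool_node(state: Dict[str, Any]) -> Dict[str, Any]:
    try:
        result = external_api(state["query"])
        # Clear any earlier error so routers do not act on stale data.
        return {"result": result, "error": None}
    except Exception as e:
        # Convert the failure into a state signal instead of raising.
        return {
            "error": {
                "type": type(e).__name__,
                "message": str(e),
                "retryable": isinstance(e, RETRYABLE),
            }
        }
```

A router can then branch on `state["error"]["retryable"]` to choose between a retry path and a fallback path.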

---

### **3. Failure Routing & Recovery**

LangGraph enables **failure-aware control flow**.

```python
def router(state):
    if "error" in state:
        return "handle_error"
    return "next_step"
```

```python
builder.add_conditional_edges("tool", router, {
    "handle_error": "recovery",
    "next_step": "process"
})
```

This creates **explicit error paths** in the graph.
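
One detail worth checking in such a router: once an `error` key has been written to state, a membership test keeps matching even after recovery. The sketch below uses a truthiness check and a recovery node that clears the error. The `recovery` node and the `apply_update` helper are hypothetical; `apply_update` only mimics a last-value-wins state merge so the example runs without LangGraph installed.

```python
from typing import Any, Dict


def router(state: Dict[str, Any]) -> str:
    # Truthiness check: a cleared error (None) routes to the normal path.
    return "handle_error" if state.get("error") else "next_step"


def recovery(state: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical recovery node: substitute a default and clear the error."""
    return {"result": "default answer", "error": None}


def apply_update(state: Dict[str, Any], update: Dict[str, Any]) -> Dict[str, Any]:
    """Mimic a last-value-wins state merge (no reducers), for illustration only."""
    return {**state, **update}


state = {"query": "q", "error": "API timeout"}
first = router(state)                          # error present
state = apply_update(state, recovery(state))   # recovery clears it
second = router(state)                         # key still exists, value is None
print(first, second)
```

After recovery, `"error" in state` is still true, so a router written with a membership test would keep sending the run down the error path.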

---

### **4. Circuit Breaker Concept**

A **circuit breaker** prevents repeated execution of failing components.

#### Circuit Breaker States

| State     | Meaning          |
| --------- | ---------------- |
| Closed    | Normal operation |
| Open      | Requests blocked |
| Half-Open | Trial executions |
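
The three states and their transitions can be sketched as a small state machine. This is a minimal, framework-independent illustration: the class name, the `threshold` and `cooldown` parameters, and the injectable `clock` are all assumptions made for the example.

```python
import time
from typing import Callable, Optional


class CircuitBreaker:
    def __init__(
        self,
        threshold: int = 3,
        cooldown: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.threshold = threshold    # consecutive failures before opening
        self.cooldown = cooldown      # seconds to stay open before a trial
        self.clock = clock            # injectable for testing
        self.failures = 0
        self.opened_at: Optional[float] = None  # None means not open

    @property
    def state(self) -> str:
        if self.opened_at is None:
            return "closed"
        if self.clock() - self.opened_at >= self.cooldown:
            return "half_open"        # cooldown elapsed: allow a trial call
        return "open"

    def allow(self) -> bool:
        return self.state != "open"

    def record_success(self) -> None:
        # Any success closes the circuit and resets the count.
        self.failures = 0
        self.opened_at = None

    def record_failure(self) -> None:
        if self.state == "half_open":
            self.opened_at = self.clock()  # trial failed: reopen
            return
        self.failures += 1
        if self.failures >= self.threshold:
            self.opened_at = self.clock()
```

Inside a graph, the same two fields (`failures`, `opened_at`) could live in the state schema rather than on an object, so that they are saved with each checkpoint.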

---

### **5. Implementing Circuit Breaker in LangGraph**

#### State Schema

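```python
from typing import Optional, TypedDict

class State(TypedDict):
    failures: int           # consecutive failures so far
    blocked: bool           # True once the circuit is open
    result: Optional[str]   # last successful output
    error: Optional[str]    # last error message, None after a success
```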

#### Protected Node

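```python
def protected_call(state):
    # Circuit open: skip the call entirely.
    if state["blocked"]:
        return {"error": "Circuit open"}

    try:
        result = unstable_service()
        # Success resets the failure count and clears any earlier error.
        return {"result": result, "failures": 0, "error": None}
    except Exception as e:
        failures = state["failures"] + 1
        blocked = failures >= 3  # open the circuit after three consecutive failures
        return {"error": str(e), "failures": failures, "blocked": blocked}
```

#### Routing Logic

```python
from langgraph.graph import END

def route(state):
    if state["blocked"]:
        return "fallback"
    if state.get("error"):
        return "protected"  # retry while the circuit is still closed
    return END
```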

---

### **6. Full Workflow**

```
Call Service → Check Failure → Retry → Break Circuit → Fallback → Recover
```

---

### **7. Production Enhancements**

| Feature             | Purpose                   |
| ------------------- | ------------------------- |
| Retry policies      | Handle transient failures |
| Exponential backoff | Reduce overload           |
| Timeouts            | Prevent hanging           |
| Dead-letter state   | Preserve failed jobs      |
| Checkpointing       | Resume safely             |
| Metrics             | Observe failure rates     |
| Human override      | Manual recovery           |
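
The first two rows can be combined in one small helper. The sketch below is a generic, standard-library illustration and not a LangGraph API: the function name, its parameters, the default `retry_on` tuple, and the 10% jitter are all assumptions chosen for the example.

```python
import random
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    fn: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
    retry_on: Tuple[Type[BaseException], ...] = (TimeoutError, ConnectionError),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    for attempt in range(attempts):
        try:
            return fn()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: let the caller handle it
            # Delay doubles each attempt, capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Small random jitter so many clients do not retry in lockstep.
            sleep(delay + random.uniform(0, delay / 10))
    raise ValueError("attempts must be >= 1")
```

Exceptions outside `retry_on` propagate immediately, so a node that wraps its service call in this helper still needs its own `try`/`except` to turn the final failure into a state signal.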

---

### **8. Why LangGraph Excels Here**

A pipeline with no error paths stops at the first unhandled exception.
A LangGraph workflow can **route failures** to recovery nodes instead.

| Traditional Code     | LangGraph            |
| -------------------- | -------------------- |
| Exceptions terminate | Exceptions redirect  |
| No recovery path     | Explicit recovery    |
| No memory            | Persistent state     |
| Hard to debug        | Full execution trace |

---

### **9. Mental Model**

LangGraph implements **resilient control systems**:

> **Detect → Isolate → Protect → Recover → Continue**

This makes it suitable for **enterprise LLM systems** where failure is expected but downtime is unacceptable.

---

### Demonstration

```python
# One-cell demonstration: Exception handling + Circuit Breaker in LangGraph

from typing import TypedDict
from langgraph.graph import StateGraph, END
import random

# ---------- State ----------
class State(TypedDict):
    failures: int
    blocked: bool
    result: str

# ---------- Unstable external service ----------
def unstable_service():
    if random.random() < 0.6:     # 60% failure rate
        raise RuntimeError("Service failure")
    return "Service success"

# ---------- Protected node with circuit breaker ----------
def protected_call(state: State):
    if state["blocked"]:
        return {"result": "Circuit open - skipping call"}

    try:
        out = unstable_service()
        return {"result": out, "failures": 0}
    except Exception as e:
        failures = state["failures"] + 1
        blocked = failures >= 3
        return {"result": str(e), "failures": failures, "blocked": blocked}

# ---------- Router ----------
def route(state: State):
    if state["blocked"]:
        return "fallback"
    if state["failures"] > 0:
        return "protected"
    return END

# ---------- Fallback ----------
def fallback(state: State):
    return {"result": "Fallback response (system protected)"}

# ---------- Build Graph ----------
builder = StateGraph(State)

builder.add_node("protected", protected_call)
builder.add_node("fallback", fallback)

builder.set_entry_point("protected")

builder.add_conditional_edges("protected", route, {
    "protected": "protected",
    "fallback": "fallback",
    END: END
})

builder.add_edge("fallback", END)

graph = builder.compile()

# ---------- Run ----------
state = {"failures": 0, "blocked": False, "result": ""}
final = graph.invoke(state, config={"recursion_limit": 10})
print(final)
```

Output (one possible run):

```
{'failures': 3, 'blocked': True, 'result': 'Fallback response (system protected)'}
```
