## 08 - Custom Middleware in LangChain

**Key Concept**: Custom middleware lets you implement hooks that run at specific points in agent execution. Build your own logging, validation, retries, caching, and control flow logic.

**What this covers:**
1. Hook types: Node-style vs Wrap-style
2. Decorator-based middleware (quick prototyping)
3. Class-based middleware (complex logic, multiple hooks)
4. Custom state schemas
5. Execution order and agent jumps
6. Practical examples: logging, retries, dynamic model selection, tool monitoring

**Two styles of hooks:**
- **Node-style**: Run sequentially at specific points (`before_model`, `after_model`, etc.)
- **Wrap-style**: Wrap around execution for control flow (`wrap_model_call`, `wrap_tool_call`)


In [1]:
from langchain_groq import ChatGroq
from langchain.agents import create_agent
from langchain.tools import tool
from langchain_core.messages import HumanMessage, AIMessage, SystemMessage
from pydantic import BaseModel, Field
from typing import Any, Callable, Literal
from typing_extensions import NotRequired
from langgraph.checkpoint.memory import MemorySaver




In [2]:
llm = ChatGroq(
    model="meta-llama/llama-4-maverick-17b-128e-instruct",
    temperature=0,
)

# Sample tools for demonstrations
@tool
def calculator(expression: str) -> str:
    """Evaluate a math expression."""
    try:
        # eval() executes arbitrary code: fine for this demo, unsafe for untrusted input
        result = eval(expression)
        return f"Result: {result}"
    except Exception as e:
        return f"Error: {e}"

@tool
def get_weather(city: str) -> str:
    """Get weather for a city."""
    weather_data = {"NYC": "Sunny, 72°F", "London": "Cloudy, 58°F", "Tokyo": "Rainy, 65°F"}
    return weather_data.get(city, f"Weather for {city}: Clear, 70°F")

@tool
def search_web(query: str) -> str:
    """Search the web for information."""
    return f"Search results for '{query}': Found 3 relevant articles."

print("Model and tools initialized")


Model and tools initialized


### Node-Style Hooks

Run sequentially at specific execution points. Use for logging, validation, and state updates.

**Available hooks:**
- `before_agent` - Before agent starts (once per invocation)
- `before_model` - Before each model call
- `after_model` - After each model response
- `after_agent` - After agent completes (once per invocation)

**Return values:**
- `None` - Continue normal execution
- `dict` - Update state with returned values
- `{"jump_to": "end"}` - Exit early (requires `can_jump_to` config)


In [3]:
# Decorator-based node-style hooks
from langchain.agents.middleware import before_model, after_model, AgentState
from langgraph.types import StreamWriter

# Simple logging middleware using decorators
@before_model
def log_before_model(state: AgentState, writer: StreamWriter) -> dict[str, Any] | None:
    """Log before each model call."""
    print(f"[BEFORE MODEL] Message count: {len(state['messages'])}")
    return None  # Continue normal execution

@after_model
def log_after_model(state: AgentState, writer: StreamWriter) -> dict[str, Any] | None:
    """Log after each model response."""
    last_msg = state["messages"][-1]
    content_preview = str(last_msg.content)[:50] if last_msg.content else "(no content)"
    print(f"[AFTER MODEL] Response: {content_preview}...")
    return None

# Create agent with logging middleware
logging_agent = create_agent(
    llm,
    tools=[calculator, get_weather],
    middleware=[log_before_model, log_after_model],
)

# Test it
result = logging_agent.invoke({
    "messages": [{"role": "user", "content": "What's 25 * 4?"}]
})
print(f"\nFinal answer: {result['messages'][-1].content}")


[BEFORE MODEL] Message count: 1
[AFTER MODEL] Response: (no content)...
[BEFORE MODEL] Message count: 3
[AFTER MODEL] Response: The result of 25 * 4 is 100....

Final answer: The result of 25 * 4 is 100.


### Wrap-Style Hooks

Wrap-style hooks intercept a call and receive a `handler` that performs the real work. The hook decides how often to invoke it:
- Call handler zero times (short-circuit)
- Call handler once (normal flow)
- Call handler multiple times (retry logic)

**Available hooks:**
- `wrap_model_call` - Around each model call
- `wrap_tool_call` - Around each tool call

Use for: retries, caching, transformation, fallbacks.
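The "zero times" case is the least obvious of the three, so here is a minimal, framework-free sketch of it: a caching wrapper that skips the handler on a repeat request. The string request type, `make_cached_wrapper`, and `fake_handler` are all illustrative stand-ins, not LangChain types.

```python
from typing import Callable


def make_cached_wrapper() -> Callable[[str, Callable[[str], str]], str]:
    """Build a wrap-style function that caches handler results by request."""
    cache: dict[str, str] = {}

    def cached_call(request: str, handler: Callable[[str], str]) -> str:
        if request in cache:
            return cache[request]  # zero handler calls: short-circuit
        response = handler(request)  # one handler call: normal flow
        cache[request] = response
        return response

    return cached_call


calls: list[str] = []


def fake_handler(request: str) -> str:
    """Stand-in for the real model call; records each invocation."""
    calls.append(request)
    return f"echo: {request}"


cached_call = make_cached_wrapper()
first = cached_call("What's 2 + 2?", fake_handler)
second = cached_call("What's 2 + 2?", fake_handler)
print(first == second, len(calls))
```

The second call is served from the cache, so `fake_handler` runs only once even though the wrapper was called twice.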


In [4]:
# Decorator-based wrap-style hooks
from langchain.agents.middleware import wrap_model_call, wrap_tool_call, ModelRequest, ModelResponse
from langchain.tools.tool_node import ToolCallRequest
from langchain_core.messages import ToolMessage
from langgraph.types import Command
import time

# Retry middleware - wraps model calls with retry logic
@wrap_model_call
def retry_on_error(
    request: ModelRequest,
    handler: Callable[[ModelRequest], ModelResponse],
) -> ModelResponse:
    """Retry model calls up to 3 times on failure."""
    max_retries = 3
    for attempt in range(max_retries):
        try:
            return handler(request)
        except Exception as e:
            if attempt == max_retries - 1:
                raise  # Re-raise on final attempt
            print(f"[RETRY] Attempt {attempt + 1}/{max_retries} failed: {e}")
            time.sleep(1)  # Brief delay before retry

# Tool monitoring middleware - wraps tool calls
@wrap_tool_call
def monitor_tools(
    request: ToolCallRequest,
    handler: Callable[[ToolCallRequest], ToolMessage | Command],
) -> ToolMessage | Command:
    """Log tool execution details."""
    tool_name = request.tool_call["name"]
    tool_args = request.tool_call["args"]
    print(f"[TOOL CALL] {tool_name} with args: {tool_args}")
    
    start_time = time.time()
    result = handler(request)
    elapsed = time.time() - start_time
    
    print(f"[TOOL DONE] {tool_name} completed in {elapsed:.2f}s")
    return result

# Create agent with wrap-style middleware
monitored_agent = create_agent(
    llm,
    tools=[calculator, get_weather],
    middleware=[retry_on_error, monitor_tools],
)

# Test it
result = monitored_agent.invoke({
    "messages": [{"role": "user", "content": "What's the weather in Tokyo?"}]
})
print(f"\nFinal answer: {result['messages'][-1].content}")


[TOOL CALL] get_weather with args: {'city': 'Tokyo'}
[TOOL DONE] get_weather completed in 0.00s

Final answer: It's rainy in Tokyo with a temperature of 65°F.


### Class-Based Middleware

Classes suit middleware that needs several hooks, shared instance state, or init-time configuration.

**When to use classes:**
- Multiple hooks needed in a single middleware
- Complex configuration (thresholds, models, etc.)
- Reuse across projects with init-time config
- Need both sync and async implementations


In [5]:
from langchain.agents.middleware import AgentMiddleware, hook_config

# Class-based middleware with multiple hooks and configuration
class LoggingMiddleware(AgentMiddleware):
    """Comprehensive logging middleware with configurable verbosity."""
    
    def __init__(self, verbose: bool = True):
        super().__init__()
        self.verbose = verbose
        self.call_count = 0
    
    def before_model(self, state: AgentState, writer: StreamWriter) -> dict[str, Any] | None:
        self.call_count += 1
        if self.verbose:
            print(f"[LOG] Model call #{self.call_count}, messages: {len(state['messages'])}")
        return None
    
    def after_model(self, state: AgentState, writer: StreamWriter) -> dict[str, Any] | None:
        if self.verbose:
            last_msg = state["messages"][-1]
            has_tool_calls = hasattr(last_msg, "tool_calls") and last_msg.tool_calls
            print(f"[LOG] Model responded, tool_calls: {has_tool_calls}")
        return None

# Class-based retry middleware with configurable attempts
class RetryMiddleware(AgentMiddleware):
    """Retry failed model calls with configurable attempts and delay."""
    
    def __init__(self, max_retries: int = 3, delay: float = 1.0):
        super().__init__()
        self.max_retries = max_retries
        self.delay = delay
    
    def wrap_model_call(
        self,
        request: ModelRequest,
        handler: Callable[[ModelRequest], ModelResponse],
    ) -> ModelResponse:
        for attempt in range(self.max_retries):
            try:
                return handler(request)
            except Exception as e:
                if attempt == self.max_retries - 1:
                    raise
                print(f"[RETRY] Attempt {attempt + 1}/{self.max_retries}: {e}")
                time.sleep(self.delay)

# Use class-based middleware
class_agent = create_agent(
    llm,
    tools=[calculator, get_weather],
    middleware=[
        LoggingMiddleware(verbose=True),
        RetryMiddleware(max_retries=2, delay=0.5),
    ],
)

result = class_agent.invoke({
    "messages": [{"role": "user", "content": "Calculate 100 / 5"}]
})
print(f"\nResult: {result['messages'][-1].content}")


[LOG] Model call #1, messages: 1
[LOG] Model responded, tool_calls: [{'name': 'calculator', 'args': {'expression': '100 / 5'}, 'id': 'nqjn5sx67', 'type': 'tool_call'}]
[LOG] Model call #2, messages: 3
[LOG] Model responded, tool_calls: []

Result: The result of 100 divided by 5 is 20.0.


### Custom State Schema

Middleware can extend the agent's state with custom properties. Useful for:
- Tracking metrics (call counts, token usage)
- Storing user context
- Passing data between hooks
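Custom fields are optional, so a hook cannot assume they were supplied at invocation. The sketch below illustrates why reading them with `.get` and a default is the safer habit. It uses plain `TypedDict` classes as stand-ins (`BaseSketchState` and `SketchState` are hypothetical names, and `total=False` plays the role of `NotRequired` here).

```python
from typing import TypedDict


class BaseSketchState(TypedDict):
    """Stand-in for the agent state: only the required messages key."""
    messages: list[str]


class SketchState(BaseSketchState, total=False):
    """Optional tracking fields, analogous to NotRequired fields."""
    model_call_count: int
    user_id: str


def next_call_count(state: SketchState) -> int:
    """Read an optional counter safely and return its incremented value."""
    return state.get("model_call_count", 0) + 1


fresh: SketchState = {"messages": ["hi"]}  # counter was never set
resumed: SketchState = {"messages": ["hi"], "model_call_count": 4}
print(next_call_count(fresh), next_call_count(resumed))
```

Indexing with `state["model_call_count"]` would raise `KeyError` for the `fresh` state, which is the situation a first invocation typically produces.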


In [6]:
# Custom state schema with additional properties
class CustomState(AgentState):
    """Extended state with custom tracking fields."""
    model_call_count: NotRequired[int]
    user_id: NotRequired[str]
    session_start: NotRequired[float]

# Decorator-based middleware with custom state
@before_model(state_schema=CustomState, can_jump_to=["end"])
def check_call_limit(state: CustomState, writer: StreamWriter) -> dict[str, Any] | None:
    """Limit model calls per session."""
    count = state.get("model_call_count", 0)
    print(f"[LIMIT CHECK] Call count: {count}")
    if count >= 5:  # Max 5 model calls
        return {
            "messages": [AIMessage(content="Call limit reached. Session ended.")],
            "jump_to": "end"
        }
    return None

@after_model(state_schema=CustomState)
def increment_counter(state: CustomState, writer: StreamWriter) -> dict[str, Any] | None:
    """Increment call counter after each model call."""
    new_count = state.get("model_call_count", 0) + 1
    print(f"[COUNTER] Incremented to {new_count}")
    return {"model_call_count": new_count}

# Create agent with custom state middleware
limited_agent = create_agent(
    llm,
    tools=[calculator],
    middleware=[check_call_limit, increment_counter],
)

# Invoke with initial custom state
result = limited_agent.invoke({
    "messages": [HumanMessage(content="What's 10 + 20?")],
    "model_call_count": 0,
    "user_id": "user-123",
    "session_start": time.time(),
})
print(f"\nResult: {result['messages'][-1].content}")
print(f"Final call count: {result.get('model_call_count', 'N/A')}")


[LIMIT CHECK] Call count: 0
[COUNTER] Incremented to 1
[LIMIT CHECK] Call count: 1
[COUNTER] Incremented to 2

Result: 30
Final call count: 2


### Class-Based Middleware with Custom State

Same pattern using a class for better encapsulation and reusability.


In [7]:
# Class-based middleware with custom state schema
class CallCounterMiddleware(AgentMiddleware[CustomState]):
    """Track and limit model calls with configurable threshold."""
    state_schema = CustomState
    
    def __init__(self, max_calls: int = 10):
        super().__init__()
        self.max_calls = max_calls
    
    @hook_config(can_jump_to=["end"])
    def before_model(self, state: CustomState, writer: StreamWriter) -> dict[str, Any] | None:
        count = state.get("model_call_count", 0)
        if count >= self.max_calls:
            return {
                "messages": [AIMessage(content=f"Maximum {self.max_calls} calls reached.")],
                "jump_to": "end"
            }
        return None
    
    def after_model(self, state: CustomState, writer: StreamWriter) -> dict[str, Any] | None:
        return {"model_call_count": state.get("model_call_count", 0) + 1}

# Use class-based middleware with custom state
counter_agent = create_agent(
    llm,
    tools=[calculator],
    middleware=[CallCounterMiddleware(max_calls=3)],
)

result = counter_agent.invoke({
    "messages": [HumanMessage(content="What's 5 * 5?")],
    "model_call_count": 0,
})
print(f"Result: {result['messages'][-1].content}")


Result: The result of 5 * 5 is 25.


### Execution Order

When using multiple middleware, execution order matters:

```
agent = create_agent(model, middleware=[mw1, mw2, mw3], tools=[...])
```

- **Before hooks**: First to last (mw1 -> mw2 -> mw3)
- **Wrap hooks**: Nested (mw1 wraps mw2 wraps mw3 wraps handler)
- **After hooks**: Last to first (mw3 -> mw2 -> mw1)
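The nesting of wrap hooks can be sketched without the framework. The `compose` helper below is a hypothetical illustration of the idea, not the way LangChain builds its chain internally: it folds the list from the end so that the first wrapper ends up outermost.

```python
from typing import Callable

Handler = Callable[[str], str]
Wrapper = Callable[[str, Handler], str]


def make_wrapper(name: str, log: list[str]) -> Wrapper:
    """Create a wrap-style function that records when it enters and exits."""
    def wrapper(request: str, handler: Handler) -> str:
        log.append(f"enter {name}")
        response = handler(request)
        log.append(f"exit {name}")
        return response
    return wrapper


def compose(wrappers: list[Wrapper], base_handler: Handler) -> Handler:
    """Nest wrappers so that the first one in the list is outermost."""
    handler = base_handler
    for wrapper in reversed(wrappers):
        # Default arguments bind the current wrapper and inner handler
        def wrapped(request: str, _w: Wrapper = wrapper, _h: Handler = handler) -> str:
            return _w(request, _h)
        handler = wrapped
    return handler


log: list[str] = []
chain = compose(
    [make_wrapper("mw1", log), make_wrapper("mw2", log), make_wrapper("mw3", log)],
    base_handler=lambda request: f"response to {request}",
)
chain("ping")
print(log)
```

The log reads enter mw1, enter mw2, enter mw3, then exit mw3, exit mw2, exit mw1: the first middleware sees the request first and the response last.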


In [12]:
# Demonstrate execution order with numbered middleware
class OrderedMiddleware(AgentMiddleware):
    """Middleware that logs its execution order."""
    
    def __init__(self, middleware_name: str):
        super().__init__()
        self._middleware_name = middleware_name
    
    @property
    def name(self) -> str:
        """Override the name property to return unique instance name."""
        return self._middleware_name
    
    def before_model(self, state: AgentState, writer: StreamWriter) -> dict[str, Any] | None:
        print(f"[BEFORE] {self._middleware_name}")
        return None
    
    def after_model(self, state: AgentState, writer: StreamWriter) -> dict[str, Any] | None:
        print(f"[AFTER] {self._middleware_name}")
        return None
    
    def wrap_model_call(
        self,
        request: ModelRequest,
        handler: Callable[[ModelRequest], ModelResponse],
    ) -> ModelResponse:
        print(f"[WRAP ENTER] {self._middleware_name}")
        result = handler(request)
        print(f"[WRAP EXIT] {self._middleware_name}")
        return result

# Create agent with three ordered middleware
order_agent = create_agent(
    llm,
    tools=[calculator],
    middleware=[
        OrderedMiddleware("MW1"),
        OrderedMiddleware("MW2"),
        OrderedMiddleware("MW3"),
    ],
)

print("Execution order demonstration:")
print("-" * 40)
result = order_agent.invoke({
    "messages": [{"role": "user", "content": "What's 2 + 2?"}]
})
print("-" * 40)
print("Notice: before hooks run first-to-last, after hooks run last-to-first")


Execution order demonstration:
----------------------------------------
[BEFORE] MW1
[BEFORE] MW2
[BEFORE] MW3
[WRAP ENTER] MW1
[WRAP ENTER] MW2
[WRAP ENTER] MW3
[WRAP EXIT] MW3
[WRAP EXIT] MW2
[WRAP EXIT] MW1
[AFTER] MW3
[AFTER] MW2
[AFTER] MW1
[BEFORE] MW1
[BEFORE] MW2
[BEFORE] MW3
[WRAP ENTER] MW1
[WRAP ENTER] MW2
[WRAP ENTER] MW3
[WRAP EXIT] MW3
[WRAP EXIT] MW2
[WRAP EXIT] MW1
[AFTER] MW3
[AFTER] MW2
[AFTER] MW1
----------------------------------------
Notice: before hooks run first-to-last, after hooks run last-to-first


### Agent Jumps

Redirect execution from a hook by returning `{"jump_to": "<target>"}`, for example to exit early.

**Available jump targets:**
- `"end"` - Jump to end of agent execution
- `"tools"` - Jump to tools node
- `"model"` - Jump to model node

Requires `can_jump_to` config on the hook.


In [13]:
# Content blocking middleware using agent jumps
class ContentBlockerMiddleware(AgentMiddleware):
    """Block responses containing specific keywords."""
    
    def __init__(self, blocked_words: list[str]):
        super().__init__()
        self.blocked_words = [w.lower() for w in blocked_words]
    
    @hook_config(can_jump_to=["end"])
    def after_model(self, state: AgentState, writer: StreamWriter) -> dict[str, Any] | None:
        last_msg = state["messages"][-1]
        content = str(last_msg.content).lower()
        
        for word in self.blocked_words:
            if word in content:
                print(f"[BLOCKED] Response contained '{word}'")
                return {
                    "messages": [AIMessage(content="I cannot provide that information.")],
                    "jump_to": "end"
                }
        return None

# Create agent with content blocker
blocked_agent = create_agent(
    llm,
    tools=[search_web],
    middleware=[ContentBlockerMiddleware(blocked_words=["secret", "confidential"])],
)

# Test with normal query
result = blocked_agent.invoke({
    "messages": [{"role": "user", "content": "Search for public weather data"}]
})
print(f"Normal query result: {result['messages'][-1].content[:100]}...")


Normal query result: You can find public weather data from various sources such as the National Weather Service, OpenWeat...


### Practical Examples

Real-world middleware patterns from the LangChain docs.


In [14]:
# Example 1: Dynamic Model Selection
# Use different models based on conversation complexity

complex_model = ChatGroq(model="llama-3.3-70b-versatile", temperature=0)
simple_model = ChatGroq(model="llama-3.1-8b-instant", temperature=0)

class DynamicModelMiddleware(AgentMiddleware):
    """Select model based on conversation length/complexity."""
    
    def __init__(self, simple_model, complex_model, threshold: int = 5):
        super().__init__()
        self.simple_model = simple_model
        self.complex_model = complex_model
        self.threshold = threshold
    
    def wrap_model_call(
        self,
        request: ModelRequest,
        handler: Callable[[ModelRequest], ModelResponse],
    ) -> ModelResponse:
        msg_count = len(request.messages)
        if msg_count > self.threshold:
            print(f"[MODEL] Using complex model (messages: {msg_count})")
            return handler(request.override(model=self.complex_model))
        else:
            print(f"[MODEL] Using simple model (messages: {msg_count})")
            return handler(request.override(model=self.simple_model))

dynamic_agent = create_agent(
    llm,  # Default model (overridden by middleware)
    tools=[calculator],
    middleware=[DynamicModelMiddleware(simple_model, complex_model, threshold=3)],
)

result = dynamic_agent.invoke({
    "messages": [{"role": "user", "content": "What's 7 * 8?"}]
})
print(f"Result: {result['messages'][-1].content}")


[MODEL] Using simple model (messages: 1)
[MODEL] Using simple model (messages: 3)
Result: The result of the expression 7 * 8 is 56.


In [15]:
# Example 2: Tool Call Monitoring with Metrics
# Track tool usage for analytics and debugging

class ToolMetricsMiddleware(AgentMiddleware):
    """Collect metrics on tool usage."""
    
    def __init__(self):
        super().__init__()
        self.tool_calls = []
        self.total_time = 0.0
    
    def wrap_tool_call(
        self,
        request: ToolCallRequest,
        handler: Callable[[ToolCallRequest], ToolMessage | Command],
    ) -> ToolMessage | Command:
        tool_name = request.tool_call["name"]
        tool_args = request.tool_call["args"]
        
        start = time.time()
        success = False  # Flipped to True only if the handler returns normally
        try:
            result = handler(request)
            success = True
        finally:
            elapsed = time.time() - start
            self.total_time += elapsed
            self.tool_calls.append({
                "tool": tool_name,
                "args": tool_args,
                "time_ms": round(elapsed * 1000, 2),
                "success": success,
            })
            print(f"[METRICS] {tool_name}: {elapsed*1000:.1f}ms")
        
        return result
    
    def get_summary(self) -> dict:
        return {
            "total_calls": len(self.tool_calls),
            "total_time_ms": round(self.total_time * 1000, 2),
            "calls": self.tool_calls,
        }

# Use metrics middleware
metrics_mw = ToolMetricsMiddleware()
metrics_agent = create_agent(
    llm,
    tools=[calculator, get_weather],
    middleware=[metrics_mw],
)

result = metrics_agent.invoke({
    "messages": [{"role": "user", "content": "What's 15 * 3 and what's the weather in London?"}]
})
print(f"\nResult: {result['messages'][-1].content}")
print(f"\nTool metrics: {metrics_mw.get_summary()}")


[METRICS] get_weather: 0.5ms
[METRICS] calculator: 0.4ms

Result: The result of 15 * 3 is 45. The weather in London is cloudy, 58°F.

Tool metrics: {'total_calls': 2, 'total_time_ms': 0.93, 'calls': [{'tool': 'get_weather', 'args': {'city': 'London'}, 'time_ms': 0.5, 'success': True}, {'tool': 'calculator', 'args': {'expression': '15 * 3'}, 'time_ms': 0.43, 'success': True}]}


In [16]:
# Example 3: Dynamic Tool Selection
# Filter tools based on context to improve accuracy

@tool
def admin_delete(resource_id: str) -> str:
    """Delete a resource (admin only)."""
    return f"Resource {resource_id} deleted."

@tool
def user_read(resource_id: str) -> str:
    """Read a resource."""
    return f"Resource {resource_id} contents: [data]"

# State with user role
class RoleState(AgentState):
    user_role: NotRequired[str]

class ToolFilterMiddleware(AgentMiddleware[RoleState]):
    """Filter available tools based on user role."""
    state_schema = RoleState
    
    def __init__(self, admin_tools: list[str]):
        super().__init__()
        self.admin_tools = admin_tools
    
    def wrap_model_call(
        self,
        request: ModelRequest,
        handler: Callable[[ModelRequest], ModelResponse],
    ) -> ModelResponse:
        user_role = request.state.get("user_role", "user")
        
        if user_role != "admin":
            # Filter out admin-only tools
            filtered_tools = [
                t for t in request.tools 
                if t.name not in self.admin_tools
            ]
            print(f"[FILTER] Role: {user_role}, Tools: {[t.name for t in filtered_tools]}")
            return handler(request.override(tools=filtered_tools))
        
        print("[FILTER] Role: admin, all tools available")
        return handler(request)

# Create agent with tool filtering
all_tools = [calculator, user_read, admin_delete]
filtered_agent = create_agent(
    llm,
    tools=all_tools,
    middleware=[ToolFilterMiddleware(admin_tools=["admin_delete"])],
)

# Test as regular user
result = filtered_agent.invoke({
    "messages": [{"role": "user", "content": "Read resource ABC123"}],
    "user_role": "user",
})
print(f"User result: {result['messages'][-1].content}")


[FILTER] Role: user, Tools: ['calculator', 'user_read']
[FILTER] Role: user, Tools: ['calculator', 'user_read']
User result: The contents of resource ABC123 are available.


In [17]:
# Example 4: Working with System Messages
# Dynamically modify system prompts based on context

class ContextInjectionMiddleware(AgentMiddleware):
    """Inject additional context into system message."""
    
    def __init__(self, context: str):
        super().__init__()
        self.context = context
    
    def wrap_model_call(
        self,
        request: ModelRequest,
        handler: Callable[[ModelRequest], ModelResponse],
    ) -> ModelResponse:
        # Access existing system message content (empty if no system prompt was set)
        existing_content = (
            list(request.system_message.content_blocks) if request.system_message else []
        )
        
        # Add additional context
        new_content = existing_content + [
            {"type": "text", "text": f"\n\nAdditional context: {self.context}"}
        ]
        
        new_system_message = SystemMessage(content=new_content)
        print(f"[CONTEXT] Injected: {self.context[:50]}...")
        
        return handler(request.override(system_message=new_system_message))

# Create agent with context injection
context_agent = create_agent(
    llm,
    tools=[calculator],
    middleware=[ContextInjectionMiddleware("Today is December 2025. User prefers metric units.")],
    system_prompt="You are a helpful assistant.",
)

result = context_agent.invoke({
    "messages": [{"role": "user", "content": "What's 100 fahrenheit in celsius?"}]
})
print(f"Result: {result['messages'][-1].content}")


[CONTEXT] Injected: Today is December 2025. User prefers metric units....
[CONTEXT] Injected: Today is December 2025. User prefers metric units....
Result: 100 Fahrenheit is approximately 37.78 Celsius.


### Combining Custom with Built-in Middleware

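Custom and built-in middleware can share one `middleware` list. They follow the same ordering rules described under Execution Order, so list position determines which hooks run first.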


In [18]:
# Combine custom middleware with built-in middleware
from langchain.agents.middleware import SummarizationMiddleware, PIIMiddleware

# Custom request logger
class RequestLoggerMiddleware(AgentMiddleware):
    """Log all requests for debugging."""
    
    def before_model(self, state: AgentState, writer: StreamWriter) -> dict[str, Any] | None:
        print(f"[REQUEST LOG] Messages: {len(state['messages'])}")
        return None

# Combine built-in and custom middleware
combined_agent = create_agent(
    llm,
    tools=[calculator, get_weather],
    middleware=[
        # Custom logging first
        RequestLoggerMiddleware(),
        # Then built-in summarization
        SummarizationMiddleware(
            model=llm,
            trigger=("messages", 20),
            keep=("messages", 10),
        ),
        # Then PII protection
        PIIMiddleware("email", strategy="redact", apply_to_output=True),
    ],
)

result = combined_agent.invoke({
    "messages": [{"role": "user", "content": "What's 50 + 50?"}]
})
print(f"Result: {result['messages'][-1].content}")


[REQUEST LOG] Messages: 1
[REQUEST LOG] Messages: 3
Result: The result of 50 + 50 is 100.


### Summary

**Custom middleware gives you fine-grained control over agent execution:**

| Hook Type | Use Case | Examples |
|-----------|----------|----------|
| `before_model` | Validation, logging, state prep | Rate limiting, input sanitization |
| `after_model` | Response processing, metrics | Content filtering, logging |
| `wrap_model_call` | Control flow | Retries, caching, model switching |
| `wrap_tool_call` | Tool interception | Monitoring, permissions, mocking |

**Best practices:**
1. Keep middleware focused - one concern per middleware
2. Handle errors gracefully - don't crash the agent
3. Use decorators for simple cases, classes for complex logic
4. Document custom state properties
5. Test middleware independently before integrating
6. Consider execution order when combining multiple middleware
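To illustrate practice 5, wrap-style logic can be exercised with a stub handler before any model is involved. This is a minimal sketch: `retry_call` mirrors the control flow of the retry middleware with plain strings instead of framework types and without the sleep, and `FlakyHandler` is a hypothetical test double.

```python
from typing import Callable


def retry_call(request: str, handler: Callable[[str], str], max_retries: int = 3) -> str:
    """Same control flow as the retry middleware, minus the framework types."""
    for attempt in range(max_retries):
        try:
            return handler(request)
        except Exception:
            if attempt == max_retries - 1:
                raise  # out of attempts: surface the last error
    raise ValueError("max_retries must be at least 1")


class FlakyHandler:
    """Stub handler that fails a fixed number of times before succeeding."""

    def __init__(self, failures: int):
        self.failures = failures
        self.calls = 0

    def __call__(self, request: str) -> str:
        self.calls += 1
        if self.calls <= self.failures:
            raise RuntimeError(f"simulated failure #{self.calls}")
        return f"ok: {request}"


# Recovers when failures stay below the retry budget
flaky = FlakyHandler(failures=2)
assert retry_call("hello", flaky) == "ok: hello"
assert flaky.calls == 3

# Re-raises once the budget is exhausted
always_failing = FlakyHandler(failures=10)
try:
    retry_call("hello", always_failing, max_retries=2)
except RuntimeError as err:
    print(f"gave up after {always_failing.calls} calls: {err}")
```

Because the handler is just a callable, the same stub approach should carry over to the real hooks once the request and response objects are swapped in.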
