# **Middleware**

Middleware provides a way to more tightly control what happens inside the agent. Middleware is useful for the following:
- Tracking agent behavior with logging, analytics, and debugging.
- Transforming prompts, tool selection, and output formatting.
- Adding retries, fallbacks, and early termination logic.
- Applying rate limits, guardrails, and PII detection.

Add middleware by passing it to `create_agent` through the `middleware` argument:
```python
from langchain.agents import create_agent
from langchain.agents.middleware import SummarizationMiddleware, HumanInTheLoopMiddleware

agent = create_agent(
    model="gpt-4o",
    tools=[...],
    middleware=[
        SummarizationMiddleware(...),
        HumanInTheLoopMiddleware(...)
    ],
)
```

### **Built-in Middleware (Provider Agnostic)**

LangChain provides prebuilt middleware for common use cases. Each middleware is production-ready and configurable for your specific needs. The following middleware work with any LLM provider:

| Middleware            | Description                                                                 |
|-----------------------|-----------------------------------------------------------------------------|
| Summarization         | Automatically summarize conversation history when approaching token limits. |
| Human-in-the-loop     | Pause execution for human approval of tool calls.                            |
| Model call limit      | Limit the number of model calls to prevent excessive costs.                  |
| Tool call limit       | Control tool execution by limiting call counts.                              |
| Model fallback        | Automatically fall back to alternative models when the primary model fails. |
| PII detection         | Detect and handle Personally Identifiable Information (PII).                |
| To-do list            | Equip agents with task planning and tracking capabilities.                  |
| LLM tool selector     | Use an LLM to select relevant tools before calling the main model.           |
| Tool retry            | Automatically retry failed tool calls with exponential backoff.             |
| Model retry           | Automatically retry failed model calls with exponential backoff.            |
| LLM tool emulator     | Emulate tool execution using an LLM for testing purposes.                   |
| Context editing       | Manage conversation context by trimming or clearing tool uses.              |
| Shell tool            | Expose a persistent shell session to agents for command execution.           |
| File search           | Provide Glob and Grep search tools over filesystem files.                    |

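The "Tool retry" and "Model retry" rows both rely on retrying with exponential backoff. The following is a minimal standalone sketch of that idea, not LangChain's implementation: the function name `retry_with_backoff` and its parameters (`max_retries`, `initial_delay`, `backoff_factor`, `sleep`) are assumptions chosen for illustration.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    call: Callable[[], T],
    max_retries: int = 3,
    initial_delay: float = 1.0,
    backoff_factor: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `call`, retrying on any exception with exponentially growing delays.

    Hypothetical helper for illustration; all names here are assumptions.
    """
    delay = initial_delay
    for attempt in range(max_retries + 1):
        try:
            return call()
        except Exception:
            if attempt == max_retries:
                raise  # Out of retries: surface the last error to the caller.
            sleep(delay)  # Wait before the next attempt.
            delay *= backoff_factor  # Grow the delay: 1s, 2s, 4s, ...
    raise AssertionError("unreachable")
```

Injecting `sleep` keeps the helper easy to test, since a test can record the delays instead of actually waiting.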

### **Middleware**

https://docs.langchain.com/oss/python/langchain/middleware/overview

https://docs.langchain.com/oss/python/langchain/middleware/built-in

https://docs.langchain.com/oss/python/langchain/middleware/custom

https://docs.langchain.com/oss/python/langchain/runtime

Test:
https://docs.langchain.com/oss/python/langchain/test

API Reference:  
https://reference.langchain.com/python/langchain/middleware/

### **Define Agent State with middleware**

Define custom state through middleware when that state needs to be accessed by specific middleware hooks and by the tools attached to that middleware.

https://docs.langchain.com/oss/python/langchain/agents#memory

### **Context Engineering (Model, Tool and Lifecycle)**

LangChain middleware is the mechanism under the hood that makes context engineering practical for developers using LangChain.

https://docs.langchain.com/oss/python/langchain/context-engineering


## **Dynamic Model Selection and Prompt Selection**

`@wrap_model_call`:
- https://docs.langchain.com/oss/python/langchain/agents#dynamic-model

`@dynamic_prompt`:
- https://docs.langchain.com/oss/python/langchain/agents#dynamic-system-prompt

## **Tool Error Handling**

`@wrap_tool_call`:
- https://docs.langchain.com/oss/python/langchain/agents#tool-error-handling

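The pattern behind tool error handling is to wrap the tool call, catch the failure, and hand the model a readable error instead of crashing the run. Below is a minimal sketch of that pattern using plain Python stand-ins so it runs without LangChain installed. `ToolRequest`, `ToolResult`, `handle_tool_errors`, and `divide` are all hypothetical names; the real `@wrap_tool_call` hook and its request type are described in the linked docs.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ToolRequest:
    """Hypothetical stand-in for the request a tool-call wrapper receives."""
    tool_name: str
    args: dict
    call_id: str


@dataclass
class ToolResult:
    """Hypothetical stand-in for the message returned to the model."""
    content: str
    call_id: str
    is_error: bool = False


def handle_tool_errors(
    request: ToolRequest,
    handler: Callable[[ToolRequest], ToolResult],
) -> ToolResult:
    """Run the tool; on failure, return a readable error instead of raising."""
    try:
        return handler(request)
    except Exception as exc:
        return ToolResult(
            content=f"Tool '{request.tool_name}' failed: {exc}. Check the input and try again.",
            call_id=request.call_id,
            is_error=True,
        )


def divide(request: ToolRequest) -> ToolResult:
    """Example tool that fails on division by zero."""
    value = request.args["a"] / request.args["b"]
    return ToolResult(content=str(value), call_id=request.call_id)


ok = handle_tool_errors(ToolRequest("divide", {"a": 6, "b": 3}, "call_1"), divide)
bad = handle_tool_errors(ToolRequest("divide", {"a": 1, "b": 0}, "call_2"), divide)
```

Returning the error as a result keeps the agent loop alive, so the model can read the message and retry with corrected arguments.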

## **Agent Memory**

Agents maintain conversation history automatically through the message state. You can also configure the agent to use a custom state schema to remember additional information during the conversation.

Information stored in the state can be thought of as the **short-term memory** of the agent.

Custom state schemas must extend `AgentState` as a `TypedDict`.

There are two ways to define custom state:
- Via middleware (preferred)
- Via `state_schema` on `create_agent`



## Step 3: Custom State Schema (Beyond Messages)

**Add fields like `user_preferences` to state for memory/tools.**

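```python
from langchain.agents import AgentState, create_agent


class CustomState(AgentState):
    # AgentState already provides `messages`; declare only the extra fields.
    user_preferences: dict[str, str]  # Kept in state alongside the messages


agent = create_agent(
    model=model,
    tools=[get_weather],
    state_schema=CustomState,  # Pass the class itself, not an instance
)
```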

## Step 4: Built-in Middleware (Prebuilt Superpowers)

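**Add them via the `middleware=[...]` list. Multiple middleware can be stacked.**

```python
from langchain.agents import create_agent
from langchain.agents.middleware import (
    HumanInTheLoopMiddleware,
    PIIMiddleware,
    SummarizationMiddleware,
)

agent = create_agent(
    model=model,
    tools=[get_weather],
    middleware=[
        # Auto-summarize history past ~500 tokens. The exact `trigger` form
        # varies by LangChain version; check the built-in middleware docs.
        SummarizationMiddleware(model="gpt-4o-mini", trigger=("tokens", 500)),
        # Human-in-the-loop: pause for approval before `get_weather` runs.
        HumanInTheLoopMiddleware(
            interrupt_on={"get_weather": {"allowed_decisions": ["approve", "reject"]}}
        ),
        # Scrub PII: each PIIMiddleware instance handles one PII type.
        # Phone numbers would need a second instance, likely with a custom detector.
        PIIMiddleware("email", strategy="redact"),
    ],
)
```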
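
To make the summarization trigger concrete, here is a rough standalone sketch of the logic: once the estimated token count passes a threshold, older messages are collapsed into one summary and the most recent ones are kept verbatim. This is not LangChain's implementation. `estimate_tokens`, `summarize_if_needed`, and the word-count token estimate are assumptions for illustration, and the `summarize` callback stands in for an LLM call.

```python
from typing import Callable


def estimate_tokens(text: str) -> int:
    """Very rough token estimate: one token per whitespace-separated word."""
    return len(text.split())


def summarize_if_needed(
    messages: list[str],
    max_tokens: int,
    keep_last: int,
    summarize: Callable[[list[str]], str],
) -> list[str]:
    """Replace older messages with a summary once the history is too long."""
    if keep_last < 1:
        raise ValueError("keep_last must be at least 1")
    total = sum(estimate_tokens(m) for m in messages)
    if total <= max_tokens or len(messages) <= keep_last:
        return list(messages)  # Under the threshold: leave history untouched.
    older, recent = messages[:-keep_last], messages[-keep_last:]
    return [summarize(older), *recent]


history = [
    "hello there",
    "what is the weather in Paris",
    "it is sunny",
    "thanks a lot",
]
# Stand-in for an LLM summary call.
compact = summarize_if_needed(
    history,
    max_tokens=10,
    keep_last=2,
    summarize=lambda older: f"[summary of {len(older)} messages]",
)
```

Keeping the last few messages verbatim preserves the immediate context the model needs for its next turn.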

## Step 5: Custom Middleware (Advanced Hooks)

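**Subclass `AgentMiddleware` and override hooks such as `wrap_model_call` or `wrap_tool_call`.**

```python
from langchain.agents import create_agent
from langchain.agents.middleware import AgentMiddleware
from langchain_openai import ChatOpenAI

class DynamicModelMiddleware(AgentMiddleware):
    def wrap_model_call(self, request, handler):
        if len(request.state["messages"]) > 10:  # Long conversation -> stronger model
            # Pass on a modified copy of the request instead of mutating it in place.
            request = request.override(model=ChatOpenAI(model="gpt-4o"))
        return handler(request)

agent = create_agent(model=model, middleware=[DynamicModelMiddleware()])
```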

Hooks: `before_agent`, `before_model`, `wrap_model_call`, `wrap_tool_call`, `after_model`, `after_agent`.
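
Wrap-style hooks nest like function calls: each one receives a `handler` and decides when (or whether) to call it. The sketch below shows that nesting in plain Python, assuming the first middleware in the list becomes the outermost layer. `compose`, `make_logger`, and `fake_model` are hypothetical names, and strings stand in for real request and response objects.

```python
from typing import Callable

Handler = Callable[[str], str]
Wrapper = Callable[[str, Handler], str]


def compose(wrappers: list[Wrapper], base: Handler) -> Handler:
    """Nest wrappers so that wrappers[0] is the outermost layer."""
    handler = base
    for wrapper in reversed(wrappers):
        # Bind the current wrapper and inner handler as defaults so each
        # layer keeps its own pair instead of sharing the loop variables.
        def layer(request: str, _w: Wrapper = wrapper, _inner: Handler = handler) -> str:
            return _w(request, _inner)

        handler = layer
    return handler


events: list[str] = []


def make_logger(name: str) -> Wrapper:
    """Build a wrapper that records when it runs around the inner call."""
    def wrapper(request: str, handler: Handler) -> str:
        events.append(f"{name}: before")
        response = handler(request)
        events.append(f"{name}: after")
        return response

    return wrapper


def fake_model(request: str) -> str:
    """Stand-in for the model call at the center of the stack."""
    events.append("model call")
    return request.upper()


call_model = compose([make_logger("first"), make_logger("second")], fake_model)
result = call_model("hello")
```

After the call, `events` shows the onion order: the first wrapper enters first and exits last, with the model call in the middle.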

## Step 7: Production Features (LangGraph Under the Hood)

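**Persistence, human-in-the-loop, and time travel work out of the box:**

```python
from langchain.agents import create_agent
from langgraph.checkpoint.memory import InMemorySaver  # Use a database-backed saver in production
from langgraph.store.memory import InMemoryStore  # Or PostgresStore

agent = create_agent(
    model=model,
    checkpointer=InMemorySaver(),  # Saves per-thread state; needed for interrupts and time travel
    store=InMemoryStore(),  # Long-term memory shared across threads and sessions
    # Interrupts via middleware (Step 4)
)
```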

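**Debug:** Traces are sent to LangSmith automatically once tracing is enabled (for example, by setting `LANGSMITH_TRACING=true` and `LANGSMITH_API_KEY`).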

## Next: LangGraph for Multi-Agent/Complex Flows

When you hit the limits of `create_agent` (for example, custom edges or subgraphs), migrate to raw LangGraph graphs.

**Relevant docs:**
- [Agents](https://docs.langchain.com/oss/python/langchain/agents)
- [Tools](https://docs.langchain.com/oss/python/langchain/tools)
- [Middleware](https://docs.langchain.com/oss/python/langchain/middleware/built-in)
- [Streaming](https://docs.langchain.com/oss/python/langchain/streaming)