## 07 - Built-in Middleware in LangChain

**Key Concept**: Middleware acts as an intermediary layer that intercepts and modifies agent execution flow. It lets you add cross-cutting concerns (logging, validation, safety checks) without touching core agent logic.

**What this covers:**
1. What middleware is and why it matters
2. How middleware integrates with agents
3. Summarization middleware (context window management)
4. Human-in-the-loop middleware (approval workflows)
5. PII detection middleware (privacy compliance)
6. To-do list middleware (task planning)

**Why middleware?**
- **Modularity**: Encapsulate specific behaviors in reusable components
- **Separation of concerns**: Keep agent logic clean, move auxiliary tasks to middleware
- **Composability**: Stack multiple middleware for complex behaviors


In [1]:
from langchain_groq import ChatGroq
from langchain.agents import create_agent
from langchain.tools import tool
from langchain_core.messages import HumanMessage
from pydantic import BaseModel, Field
from typing import Literal, Optional
import time




In [34]:
llm = ChatGroq(
    model="openai/gpt-oss-120b",
    temperature=0,
)

# Sample tools for demonstrations
@tool
def search_web(query: str) -> str:
    """Search the web for information."""
    return f"Search results for '{query}': Found 3 relevant articles about {query}."

@tool
def send_email(to: str, subject: str, body: str) -> str:
    """Send an email to a recipient."""
    return f"Email sent to {to} with subject '{subject}'."

@tool
def delete_file(filename: str) -> str:
    """Delete a file from the system."""
    return f"File '{filename}' deleted successfully."

@tool
def get_user_data(user_id: str) -> str:
    """Retrieve user data from database."""
    # Simulated user data with PII
    return f"User {user_id}: John Doe, email: john.doe@example.com, url: https://www.asdasdas.com, age: 24, city: New York"

print("Tools and model initialized")


Tools and model initialized


### How Middleware Works

Middleware wraps around agent execution at specific points:
- **Before model calls**: Modify inputs, add context, filter tools
- **After model calls**: Process outputs, validate responses
- **Before/after tool calls**: Intercept tool execution, require approval

```
User Input -> [Middleware Stack] -> Agent -> [Middleware Stack] -> Tool Execution -> [Middleware Stack] -> Response
```

Middleware is passed to `create_agent()` as a list. Entries run in list order, each wrapping the next.
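The wrapping behavior can be sketched in plain Python without LangChain. The names below (`compose`, `make_tracer`) are hypothetical and only illustrate the general pattern; the real middleware classes expose hooks rather than bare functions.

```python
from typing import Callable, List

Handler = Callable[[str], str]
Middleware = Callable[[Handler], Handler]


def compose(middleware: List[Middleware], handler: Handler) -> Handler:
    """Wrap handler so the first middleware in the list is the outermost layer."""
    for mw in reversed(middleware):
        handler = mw(handler)
    return handler


def make_tracer(name: str, trace: List[str]) -> Middleware:
    """Build a middleware that records when it runs before and after the inner call."""
    def middleware(next_handler: Handler) -> Handler:
        def wrapped(request: str) -> str:
            trace.append(f"{name}: before")
            response = next_handler(request)
            trace.append(f"{name}: after")
            return response
        return wrapped
    return middleware


trace: List[str] = []
agent = compose(
    [make_tracer("first", trace), make_tracer("second", trace)],
    lambda request: f"echo {request}",  # Stand-in for the core agent call
)
result = agent("hi")
print(trace)
```

In this sketch the first entry sees the request first and the response last, which is why list order matters when stacking.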


### 1. Summarization Middleware

**Problem**: Long conversations exceed model context windows, causing failures or lost context.

**Solution**: `SummarizationMiddleware` automatically compresses older messages when token limits are approached, preserving recent context while summarizing history.

**Use cases:**
- Multi-turn chatbots with extensive history
- Long-running agent sessions
- Applications where full context matters but token limits are tight

**Configuration:**
- `trigger`: When to summarize (token count, message count, or fraction of context)
- `keep`: How much recent context to preserve
- `model`: Which model to use for summarization (can be cheaper/faster than main model)
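The interplay between `trigger` and `keep` can be sketched with plain lists. This is a simplified illustration, not the library's implementation: `summarize_if_needed` and the stub summarizer are hypothetical, and the sketch assumes summarization fires once the message count reaches the threshold.

```python
from typing import Callable, List


def summarize_if_needed(
    messages: List[str],
    trigger_messages: int,
    keep_messages: int,
    summarize: Callable[[List[str]], str],
) -> List[str]:
    """Compress older messages once the history reaches the trigger size."""
    if len(messages) < trigger_messages:
        return messages  # Below the threshold: leave the history untouched
    split = max(len(messages) - keep_messages, 0)
    older, recent = messages[:split], messages[split:]
    if not older:
        return messages  # Nothing old enough to summarize
    return [summarize(older)] + recent


def stub_summarizer(older: List[str]) -> str:
    """Stand-in for a call to the summarization model."""
    return f"Summary of {len(older)} earlier messages"


history = [f"message {i}" for i in range(1, 8)]  # 7 messages
compressed = summarize_if_needed(history, 6, 3, stub_summarizer)
print(compressed)
```

The three most recent messages survive verbatim, and everything before them collapses into one summary entry.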


In [4]:
from langchain.agents.middleware import SummarizationMiddleware

summarization_model = ChatGroq(
    model="llama-3.3-70b-versatile",
    temperature=0,
)
# Basic summarization: trigger at 4000 tokens, keep last 20 messages
agent_with_summarization = create_agent(
    llm,
    tools=[search_web],
    middleware=[
        SummarizationMiddleware(
            model=summarization_model,  # Separate model for summarization (can be cheaper than the main model)
            trigger=("tokens", 4000),  # Trigger when conversation exceeds 4000 tokens
            keep=("messages", 20),     # Keep the 20 most recent messages intact
        ),
    ],
)

# Alternative configurations:
# Trigger on message count
summarization_by_messages = SummarizationMiddleware(
    model=llm,
    trigger=("messages", 50),  # Trigger when conversation has 50+ messages
    keep=("messages", 10),
)

# Trigger on fraction of context window (requires model profile data)
# summarization_by_fraction = SummarizationMiddleware(
#     model=llm,
#     trigger=("fraction", 0.8),  # Trigger at 80% of context window
#     keep=("fraction", 0.3),     # Keep 30% of context
# )

# Multiple trigger conditions (OR logic - any condition triggers)
summarization_multi_trigger = SummarizationMiddleware(
    model=llm,
    trigger=[("tokens", 3000), ("messages", 30)],  # Trigger on EITHER condition
    keep=("messages", 15),
)

print("Summarization middleware configured")
print("  - Triggers at 4000 tokens")
print("  - Keeps last 20 messages intact")
print("  - Summarizes older messages to preserve context")


Summarization middleware configured
  - Triggers at 4000 tokens
  - Keeps last 20 messages intact
  - Summarizes older messages to preserve context


In [6]:
# Demonstrate summarization with a lower threshold for testing
from langgraph.checkpoint.memory import MemorySaver

@tool
def calculator(expression: str) -> str:
    """Evaluate a math expression."""
    try:
        # eval() is for demo purposes only; never evaluate untrusted input this way
        result = eval(expression)
        return f"Result: {result}"
    except Exception:
        return "Error evaluating expression"

@tool
def get_weather(city: str) -> str:
    """Get weather for a city."""
    weather_data = {
        "NYC": "Sunny, 72°F",
        "London": "Cloudy, 58°F",
        "Tokyo": "Rainy, 65°F",
        "Paris": "Partly cloudy, 68°F"
    }
    return weather_data.get(city, f"Weather data for {city}: Clear, 70°F")

# Create agent with low message threshold to demonstrate summarization
summarization_memory = MemorySaver()
summarization_agent = create_agent(
    llm,
    tools=[calculator, get_weather],
    middleware=[
        SummarizationMiddleware(
            model=llm,
            trigger=("messages", 6),  # Low threshold for demo - triggers after 6 messages
            keep=("messages", 3),     # Keep only last 3 messages after summarization
        ),
    ],
    checkpointer=summarization_memory,
)

# Simulate a conversation that exceeds message threshold
config = {"configurable": {"thread_id": "summarization_test"}}

messages = [
    "What's 10 + 15?",
    "What's the weather in NYC?",
    "Calculate 50 * 2",
    "What's the weather in London?",
    "What's 100 / 4?",
    "What's the weather in Tokyo?",
    "What's 7 * 8?",
]
# The message count includes AI and tool messages, not just user turns,
# so summarization can trigger well before the last user message.

print("Demonstrating automatic summarization...")
print("(Summarization triggers after 6 messages, keeps last 3)\n")

for i, msg in enumerate(messages, 1):
    print(f"--- Message {i} ---")
    result = summarization_agent.invoke(
        {"messages": [HumanMessage(content=msg)]},
        config=config
    )
    print(f"User: {msg}")
    print(f"Agent:")
    for message in result['messages']:
        message.pretty_print()
    print("\n")

# After summarization, older messages are compressed into a summary
# while recent messages remain intact for context


Demonstrating automatic summarization...
(Summarization triggers after 6 messages, keeps last 3)

--- Message 1 ---
User: What's 10 + 15?
Agent:

What's 10 + 15?
Tool Calls:
  calculator (fc_a40749dd-fd6d-4d1e-937b-02736673f3d0)
 Call ID: fc_a40749dd-fd6d-4d1e-937b-02736673f3d0
  Args:
    expression: 10 + 15
Name: calculator

Result: 25

The sum of 10 and 15 is **25**.


--- Message 2 ---
User: What's the weather in NYC?
Agent:

Here is a summary of the conversation to date:

User: What's 10 + 15?
Assistant (using calculator tool): Result: 25
Assistant response: The sum of 10 and 15 is **25**.

What's the weather in NYC?
Tool Calls:
  get_weather (fc_aaf1859c-3ca7-410c-8f40-d7fe2cd1c310)
 Call ID: fc_aaf1859c-3ca7-410c-8f40-d7fe2cd1c310
  Args:
    city: New York
Name: get_weather

Weather data for New York: Clear, 70°F

The current weather in New York City is **clear** with a temperature of **70 °F**. Enjoy your day!


--- Message 3 ---
User: Calculate 50 * 2
Agent:

### 2. Human-in-the-Loop Middleware

**Problem**: Letting agents perform critical actions (sending emails, deleting files, financial transactions) without human oversight is risky.

**Solution**: `HumanInTheLoopMiddleware` pauses execution before specified tool calls, requiring human approval, editing, or rejection.

**Use cases:**
- High-stakes operations (database writes, payments, external communications)
- Compliance workflows requiring audit trails
- Training scenarios where human feedback improves agent behavior

**Key concepts:**
- `interrupt_on`: Which tools require approval
- `allowed_decisions`: What actions the human can take (approve, edit, reject)
- The agent pauses and resumes based on human input
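How decisions map onto pending tool calls can be sketched in plain Python. `apply_decisions` is a hypothetical helper, not part of LangChain; it assumes one decision per pending call, matched by position, and mirrors the `approve` / `edit` / `reject` shapes used in the resume examples in this section.

```python
from typing import Any, Dict, List, Optional

ToolCall = Dict[str, Any]


def apply_decisions(
    pending_calls: List[ToolCall],
    decisions: List[Dict[str, Any]],
    allowed: Dict[str, List[str]],
) -> List[Optional[ToolCall]]:
    """Resolve each pending tool call using the decision at the same index."""
    if len(pending_calls) != len(decisions):
        raise ValueError("Expected exactly one decision per pending tool call")
    resolved: List[Optional[ToolCall]] = []
    for call, decision in zip(pending_calls, decisions):
        kind = decision["type"]
        if kind not in allowed[call["name"]]:
            raise ValueError(f"Decision '{kind}' is not allowed for {call['name']}")
        if kind == "approve":
            resolved.append(call)  # Run the call exactly as proposed
        elif kind == "edit":
            resolved.append(decision["edited_action"])  # Run the reviewer's version
        elif kind == "reject":
            resolved.append(None)  # Skip the call entirely
        else:
            raise ValueError(f"Unknown decision type: {kind}")
    return resolved


allowed = {
    "send_email": ["approve", "edit", "reject"],
    "delete_file": ["approve", "reject"],
}
pending = [
    {"name": "send_email", "args": {"to": "john@test.com"}},
    {"name": "delete_file", "args": {"filename": "notes.txt"}},
]
decisions = [
    {"type": "edit", "edited_action": {"name": "send_email", "args": {"to": "john.smith@company.com"}}},
    {"type": "reject"},
]
resolved = apply_decisions(pending, decisions, allowed)
print(resolved)
```

Restricting `delete_file` to approve/reject means an attempted edit is refused outright rather than silently applied.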


In [7]:
from langchain.agents.middleware import HumanInTheLoopMiddleware
from langgraph.checkpoint.memory import MemorySaver

# Create checkpointer for state persistence (required for HITL)
memory = MemorySaver()

# Configure which tools require human approval
hitl_middleware = HumanInTheLoopMiddleware(
    interrupt_on={
        # Require approval for sending emails.
        # Shorthand: True allows every decision type; False means no approval is needed.
        "send_email": {
            "allowed_decisions": ["approve", "edit", "reject"]
        },
        # Require approval for file deletion (approve/reject only, no editing)
        "delete_file": {
            "allowed_decisions": ["approve", "reject"]
        }
    }
)

# Create agent with HITL middleware
agent_with_hitl = create_agent(
    llm,
    tools=[search_web, send_email, delete_file],
    middleware=[hitl_middleware],
    checkpointer=memory,  # Required for interrupt/resume functionality
)

print("Human-in-the-loop agent configured")
print("  - send_email: requires approval (can approve, edit, or reject)")
print("  - delete_file: requires approval (can approve or reject)")
print("  - search_web: no approval needed")


Human-in-the-loop agent configured
  - send_email: requires approval (can approve, edit, or reject)
  - delete_file: requires approval (can approve or reject)
  - search_web: no approval needed


In [14]:
# Demonstrating the interrupt/resume flow
from langgraph.types import Command

thread_id = "hitl_demo_thread_1"
config = {"configurable": {"thread_id": thread_id}}

# Step 1: Invoke agent with a request that triggers HITL
print("Step 1: Agent attempts to send email...")
result = agent_with_hitl.invoke(
    {"messages": [{"role": "user", "content": "Send an email to boss@company.com about the project update"}]},
    config=config
)
print(result['__interrupt__'])
# Check if agent is interrupted (waiting for approval)
# The agent will pause before executing send_email
print(f"\nAgent state: {'Interrupted - awaiting approval' if '__interrupt__' in result else 'Completed'}")

# Step 2: Resume with approval
print("\nStep 2: Human approves the action...")
result = agent_with_hitl.invoke(
    Command(
        resume={"decisions": [{"type": "approve"}]}  # Options: "approve", "edit", "reject"
    ),
    config=config
)

print("\nFinal result:")
print(result["messages"][-1].content if result.get("messages") else "Action completed")
for message in result['messages']:
    message.pretty_print()
print("\n")

Step 1: Agent attempts to send email...
[Interrupt(value={'action_requests': [{'name': 'send_email', 'args': {'body': 'Hi,\n\nI wanted to provide you with an update on the project. All tasks are progressing as scheduled, and we are on track to meet the upcoming milestones. Please let me know if you need any additional details or have any questions.\n\nBest regards,\n[Your Name]', 'subject': 'Project Update', 'to': 'boss@company.com'}, 'description': "Tool execution requires approval\n\nTool: send_email\nArgs: {'body': 'Hi,\\n\\nI wanted to provide you with an update on the project. All tasks are progressing as scheduled, and we are on track to meet the upcoming milestones. Please let me know if you need any additional details or have any questions.\\n\\nBest regards,\\n[Your Name]', 'subject': 'Project Update', 'to': 'boss@company.com'}"}], 'review_configs': [{'action_name': 'send_email', 'allowed_decisions': ['approve', 'edit', 'reject']}]}, id='60f378c7d7f4d72f0ffe81e5ca4bf6af')]

Ag

In [16]:
# Example: Editing tool arguments before approval
# Useful when the agent's parameters need adjustment

memory2 = MemorySaver()
agent_with_hitl2 = create_agent(
    llm,
    tools=[search_web, send_email, delete_file],
    middleware=[hitl_middleware],
    checkpointer=memory2,
)

thread_id2 = "hitl_edit_demo_2"
config2 = {"configurable": {"thread_id": thread_id2}}

# Agent tries to send email
result = agent_with_hitl2.invoke(
    {"messages": [{"role": "user", "content": "Email john@test.com saying hello"}]},
    config=config2
)
print(result['__interrupt__'])
# Human edits the tool call arguments
print("Human edits the email recipient and subject...")
result = agent_with_hitl2.invoke(
    Command(
        # Decisions are provided as a list, one per action under review.
        # The order of decisions must match the order of actions
        # listed in the `__interrupt__` request.
        resume={
            "decisions": [
                {
                    "type": "edit",
                    # Edited action with tool name and args
                    "edited_action": {
                        # Tool name to call.
                        # Will usually be the same as the original action.
                        "name": "send_email",
                        # Arguments to pass to the tool.
                        "args": {"to": "john.smith@company.com",  # Changed recipient
                                 "subject": "Greetings from the team",  # Changed subject
                                 "body": "Hello John, hope you're doing well!"},
                    }
                }
            ]
        }
    ),
    config=config2
)
for message in result['messages']:
    message.pretty_print()
print("Email sent with edited parameters")


[Interrupt(value={'action_requests': [{'name': 'send_email', 'args': {'body': 'hello', 'subject': 'Hello', 'to': 'john@test.com'}, 'description': "Tool execution requires approval\n\nTool: send_email\nArgs: {'body': 'hello', 'subject': 'Hello', 'to': 'john@test.com'}"}], 'review_configs': [{'action_name': 'send_email', 'allowed_decisions': ['approve', 'edit', 'reject']}]}, id='fb42251b1c91f1ede6e5394a652d3dab')]
Human edits the email recipient and subject...

Email john@test.com saying hello
Tool Calls:
  send_email (fc_c272424f-82aa-49a7-a7a2-83fb7fcefd5c)
 Call ID: fc_c272424f-82aa-49a7-a7a2-83fb7fcefd5c
  Args:
    to: john.smith@company.com
    subject: Greetings from the team
    body: Hello John, hope you're doing well!
Name: send_email

Email sent to john.smith@company.com with subject 'Greetings from the team'.
Tool Calls:
  send_email (fc_51f8b330-4e19-4bfb-b28d-a08033a847bd)
 Call ID: fc_51f8b330-4e19-4bfb-b28d-a08033a847bd
  Args:
    body: Hello
    subject: Hello
    to: j

### 3. PII Detection Middleware

**Problem**: Agents may inadvertently expose or process sensitive personal information (emails, SSNs, phone numbers, credit cards).

**Solution**: `PIIMiddleware` detects and handles PII in inputs/outputs using configurable strategies.

**Built-in PII types:**
- `email`, `credit_card`, `ip`, `mac_address`, `url`

**Strategies:**
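- `redact`: Replace PII with a placeholder (e.g., `[REDACTED_EMAIL]`)
- `mask`: Partially obscure the value (e.g., show only the last four digits of a card number)
- `hash`: Replace the value with a deterministic hash
- `block`: Raise an error when PII is detected, stopping the run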

**Apply to:**
- `apply_to_input`: Scan user messages before the model sees them
- `apply_to_output`: Scan the model's responses
- `apply_to_tool_results`: Scan tool outputs
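The difference between redacting and blocking can be sketched with the standard `re` module. This is an illustration only: `handle_email_pii`, `PIIDetectedError`, and the simplified email regex are assumptions, not the detector the library ships. The placeholder text follows the `[REDACTED_EMAIL]` form seen in this notebook's output.

```python
import re

# Simplified email pattern for illustration; real detectors are more thorough
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


class PIIDetectedError(Exception):
    """Raised by the 'block' strategy when PII is found."""


def handle_email_pii(text: str, strategy: str) -> str:
    """Apply a redact or block strategy to email addresses in text."""
    if strategy == "redact":
        return EMAIL_PATTERN.sub("[REDACTED_EMAIL]", text)
    if strategy == "block":
        if EMAIL_PATTERN.search(text):
            raise PIIDetectedError("Email address detected")
        return text
    raise ValueError(f"Unsupported strategy: {strategy}")


raw = "User 12345: John Doe, email: john.doe@example.com, age: 24"
redacted = handle_email_pii(raw, "redact")
print(redacted)
```

Redaction lets the run continue with sanitized text, while blocking stops it, so blocking suits cases where any exposure is unacceptable.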


In [38]:
from langchain.agents.middleware import PIIMiddleware

# Redact emails in both inputs and outputs
pii_email = PIIMiddleware(
    "email",
    strategy="redact",
    apply_to_input=True,
    apply_to_output=True,
)

# Redact credit card numbers - critical for compliance
pii_credit_card = PIIMiddleware(
    "credit_card",
    strategy="redact",
    apply_to_output=True,  # Mainly concerned about leaking in outputs
)

# Redact URLs
pii_url = PIIMiddleware(
    "url",
    strategy="redact",
    apply_to_output=True,
)

# Create agent with multiple PII middleware (they stack)
agent_with_pii = create_agent(
    llm,
    tools=[get_user_data, search_web],
    middleware=[pii_email, pii_credit_card, pii_url],
)

print("PII-protected agent configured")
print("  - Emails: redacted in input and output")
print("  - url: redacted in output")
print("  - credit_card: redacted in output")


PII-protected agent configured
  - Emails: redacted in input and output
  - url: redacted in output
  - credit_card: redacted in output


In [39]:
# Test PII redaction - the tool returns sensitive data, middleware redacts it
result = agent_with_pii.invoke({
    "messages": [{"role": "user", "content": "Get user data for user ID 12345"}]
})

print("Agent response (PII should be redacted):")
for message in result['messages']:
    message.pretty_print()


# Note: The get_user_data tool returns:
# "User 12345: John Doe, email: john.doe@example.com, url: https://www.asdasdas.com, age: 24, city: New York"
# In the output below, the tool message still contains the raw values;
# the email and URL are redacted in the model's final response.


Agent response (PII should be redacted):

Get user data for user ID 12345
Tool Calls:
  get_user_data (fc_76cae11e-98f0-4a3a-872d-bb98d37ec6c4)
 Call ID: fc_76cae11e-98f0-4a3a-872d-bb98d37ec6c4
  Args:
    user_id: 12345
Name: get_user_data

User 12345: John Doe, email: john.doe@example.com, url: https://www.asdasdas.com, age: 24, city: New York

Here’s the information we have for user **ID 12345**:

- **Name:** John Doe  
- **Email:** [REDACTED_EMAIL]  
- **Website:** [REDACTED_URL]  
- **Age:** 24  
- **City:** New York  

Let me know if you need anything else!


In [40]:
# Custom PII types - define your own patterns
# Useful for domain-specific sensitive data (employee IDs, account numbers, etc.)

custom_pii = PIIMiddleware(
    "employee_id",  # Custom PII type name
    strategy="redact",
    apply_to_output=True,
    detector=r"EMP-\d{6}",  # Regex pattern: EMP- followed by 6 digits
)

# Multiple custom patterns for different data types
api_key_pii = PIIMiddleware(
    "api_key",
    strategy="redact",
    apply_to_output=True,
    detector=r"sk-[a-zA-Z0-9]{32,}",  # OpenAI-style API keys
)

# Block strategy - completely prevents response if PII detected
# Use for extremely sensitive scenarios
blocking_pii = PIIMiddleware(
    "credit_card",
    strategy="block",  # Will raise an error if credit card detected
    apply_to_output=True,
)

print("Custom PII patterns configured:")
print("  - employee_id: EMP-XXXXXX pattern")
print("  - api_key: sk-... pattern")
print("  - credit_card: blocked entirely")


Custom PII patterns configured:
  - employee_id: EMP-XXXXXX pattern
  - api_key: sk-... pattern
  - credit_card: blocked entirely
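Before wiring custom patterns into middleware, it can help to check them against sample strings with the standard `re` module. The sketch below reuses the two regexes from the cell above; the `redact` helper, the sample values, and the `[REDACTED_<TYPE>]` placeholder format for custom types are assumptions for illustration.

```python
import re

EMPLOYEE_ID = re.compile(r"EMP-\d{6}")
API_KEY = re.compile(r"sk-[a-zA-Z0-9]{32,}")


def redact(text: str, pattern: "re.Pattern[str]", label: str) -> str:
    """Replace every match of pattern with a labeled placeholder."""
    return pattern.sub(f"[REDACTED_{label}]", text)


# Hypothetical sample: a 6-digit employee ID and a 32-character key body
sample = "Ticket opened by EMP-004217 using key sk-" + "a1B2" * 8
cleaned = redact(redact(sample, EMPLOYEE_ID, "EMPLOYEE_ID"), API_KEY, "API_KEY")
print(cleaned)

# IDs with too few digits are not matched by the pattern
print(EMPLOYEE_ID.search("EMP-12345"))
```

Note that `EMP-\d{6}` also matches the first six digits of a longer number; adding `\b` word boundaries is one way to tighten it if that matters for your data.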


### 4. To-Do List Middleware

**Problem**: Complex tasks require planning and tracking. Agents often lose track of multi-step goals or forget subtasks.

**Solution**: `TodoListMiddleware` equips agents with task planning and tracking capabilities. The agent can create, update, and check off tasks as it works.

**Use cases:**
- Multi-step workflows requiring organization
- Long-running tasks where progress tracking matters
- Scenarios where the agent needs to decompose complex goals

**How it works:**
- Adds a `write_todos` tool to the agent
- Agent can create tasks, mark them complete, and track progress
- Tasks persist across turns within a session
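The agent output later in this section shows a `write_todos` call that passes the whole task list, each item carrying `content` and `status`. A minimal sketch of that idea follows. This is a hypothetical stand-in, not the middleware's implementation, and it assumes a `completed` status alongside the `pending` and `in_progress` values visible in that output.

```python
from typing import Dict, List

Todo = Dict[str, str]
VALID_STATUSES = {"pending", "in_progress", "completed"}  # "completed" is assumed


def write_todos(state: Dict[str, List[Todo]], todos: List[Todo]) -> str:
    """Replace the stored plan with the list the agent just wrote."""
    for todo in todos:
        if todo["status"] not in VALID_STATUSES:
            raise ValueError(f"Invalid status: {todo['status']}")
    state["todos"] = [dict(todo) for todo in todos]  # Copy to avoid aliasing
    done = sum(todo["status"] == "completed" for todo in todos)
    return f"{done}/{len(todos)} tasks completed"


state: Dict[str, List[Todo]] = {}
first = write_todos(state, [
    {"content": "Define launch objectives", "status": "in_progress"},
    {"content": "Conduct market research", "status": "pending"},
])
second = write_todos(state, [
    {"content": "Define launch objectives", "status": "completed"},
    {"content": "Conduct market research", "status": "in_progress"},
])
print(first, "->", second)
```

Rewriting the full list on every call keeps the stored plan simple: the latest call is always the complete, current state.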


In [42]:
from langchain.agents.middleware import TodoListMiddleware

# Create agent with to-do list capability
todo_middleware = TodoListMiddleware()

agent_with_todo = create_agent(
    llm,
    tools=[search_web, send_email],
    middleware=[todo_middleware],
    system_prompt="""You are a helpful assistant that plans and tracks tasks.
When given a complex request, break it down into subtasks using your todo tool.
Mark tasks as complete as you finish them."""
)

print("To-do list agent configured")


To-do list agent configured


In [43]:
# Test the to-do list agent with a multi-step task
result = agent_with_todo.invoke({
    "messages": [{
        "role": "user", 
        "content": "I need to prepare for a product launch. Help me plan the tasks."
    }]
})

print("Agent response:")
for message in result['messages']:
    message.pretty_print()

# The agent should:
# 1. Create a to-do list with subtasks
# 2. Potentially use search_web to gather information
# 3. Track progress through the task list


Agent response:

I need to prepare for a product launch. Help me plan the tasks.
Tool Calls:
  write_todos (fc_959a1708-b560-477e-a815-7b75c6e09cb3)
 Call ID: fc_959a1708-b560-477e-a815-7b75c6e09cb3
  Args:
    todos: [{'content': 'Define launch objectives, success metrics, and key performance indicators (KPIs)', 'status': 'in_progress'}, {'content': 'Conduct market research and competitor analysis to validate positioning', 'status': 'pending'}, {'content': 'Finalize product features, packaging, pricing, and inventory readiness', 'status': 'pending'}, {'content': 'Develop comprehensive marketing strategy (channels, messaging, budget, timeline)', 'status': 'pending'}, {'content': 'Create marketing collateral (website landing page, demo videos, brochures, social assets)', 'status': 'pending'}, {'content': 'Set up sales and distribution channels (e-commerce, retail partners, logistics)', 'status': 'pending'}, {'content': 'Plan launch event (venue, agenda, speakers, invitations, virtual 

### Combining Multiple Middleware

Middleware can be stacked for comprehensive agent control. Order matters: entries run in the order they are listed.

**Common combinations:**
- Summarization + HITL: Long conversations with approval for critical actions
- PII + Summarization: Privacy-compliant long-running agents
- Todo + HITL: Planned workflows with human checkpoints


In [44]:
# Production-ready agent with multiple middleware layers
from langchain.agents.middleware import (
    SummarizationMiddleware,
    HumanInTheLoopMiddleware,
    PIIMiddleware,
    TodoListMiddleware,
)

production_memory = MemorySaver()

# Stack middleware for comprehensive protection
production_agent = create_agent(
    llm,
    tools=[search_web, send_email, delete_file, get_user_data],
    middleware=[
        # 1. Summarization - manage context window
        SummarizationMiddleware(
            model=llm,
            trigger=("tokens", 3000),
            keep=("messages", 15),
        ),
        # 2. PII protection - redact sensitive data
        PIIMiddleware("email", strategy="redact", apply_to_output=True),
        PIIMiddleware("url", strategy="redact", apply_to_output=True),
        PIIMiddleware("mac_address", strategy="redact", apply_to_output=True),
        # 3. Human approval for critical actions
        HumanInTheLoopMiddleware(
            interrupt_on={
                "send_email": {"allowed_decisions": ["approve", "edit", "reject"]},
                "delete_file": {"allowed_decisions": ["approve", "reject"]},
            }
        ),
        # 4. Task planning
        TodoListMiddleware(),
    ],
    checkpointer=production_memory,
    system_prompt="You are a secure, privacy-aware assistant. Plan complex tasks and seek approval for sensitive operations."
)

print("Production agent configured with:")
print("  1. Summarization (context management)")
print("  2. PII redaction (email, URL, MAC address)")
print("  3. Human-in-the-loop (email, file deletion)")
print("  4. To-do list (task planning)")


Production agent configured with:
  1. Summarization (context management)
  2. PII redaction (email, URL, MAC address)
  3. Human-in-the-loop (email, file deletion)
  4. To-do list (task planning)


### Summary

**Middleware transforms basic agents into production-ready systems:**

| Middleware | Purpose | Key Config |
|------------|---------|------------|
| `SummarizationMiddleware` | Context window management | `trigger`, `keep`, `model` |
| `HumanInTheLoopMiddleware` | Approval workflows | `interrupt_on`, `allowed_decisions` |
| `PIIMiddleware` | Privacy compliance | `strategy`, `apply_to_input/output`, `detector` |
| `TodoListMiddleware` | Task planning | Adds `write_todos` tool automatically |

**Best practices:**
- Order middleware intentionally (PII before HITL if you want redacted data in approval UI)
- Use cheaper models for summarization
- Test HITL flows thoroughly before production
- Define custom PII patterns for domain-specific data
