# Middleware in LangChain Agents

**Purpose:**  
Explain how middleware intercepts and augments agent execution, and demonstrate built-in middleware for summarization and human-in-the-loop control.

## Middleware

Middleware is a composable interception layer that runs inside the agent execution loop.  
It can observe, modify, pause, or terminate execution between planning, tool calls, and responses.

Common use cases:
- Logging, tracing, analytics, and debugging
- Prompt or message transformation
- Retries, fallbacks, and early termination
- Guardrails (rate limits, PII detection, policy checks)
- Human approval workflows
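To make the interception idea concrete, here is a minimal, library-free sketch. It is not LangChain's middleware API: the names `compose`, `logging_mw`, `guardrail_mw`, and `fake_model` are hypothetical and exist only for illustration. It shows two of the use cases above, observing a request and terminating early, as layers wrapped around a stand-in model call.

```python
from typing import Callable, Dict, List

Handler = Callable[[Dict], Dict]
Middleware = Callable[[Dict, Handler], Dict]


def _bind(mw: Middleware, nxt: Handler) -> Handler:
    """Fix `nxt` as the next step that `mw` may call."""
    def wrapped(request: Dict) -> Dict:
        return mw(request, nxt)
    return wrapped


def compose(middlewares: List[Middleware], core: Handler) -> Handler:
    """Wrap `core` so the first middleware in the list is the outermost layer."""
    handler = core
    for mw in reversed(middlewares):
        handler = _bind(mw, handler)
    return handler


def logging_mw(request: Dict, call_next: Handler) -> Dict:
    """Observe: record the request, then pass it on unchanged."""
    request.setdefault("log", []).append(f"saw: {request['text']}")
    return call_next(request)


def guardrail_mw(request: Dict, call_next: Handler) -> Dict:
    """Terminate early: block requests that mention a forbidden word."""
    if "password" in request["text"].lower():
        return {"text": "blocked by policy", "log": request.get("log", [])}
    return call_next(request)


def fake_model(request: Dict) -> Dict:
    """Stand-in for the model call at the centre of the loop (hypothetical)."""
    return {"text": request["text"].upper(), "log": request.get("log", [])}


pipeline = compose([logging_mw, guardrail_mw], fake_model)
print(pipeline({"text": "hello"})["text"])                # HELLO
print(pipeline({"text": "my password is 1234"})["text"])  # blocked by policy
```

Because each layer only sees a request and a "next" callable, layers can be added, removed, or reordered without touching the model call itself.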

In [2]:
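import os
from dotenv import load_dotenv

# Load variables from a local .env file into the process environment.
load_dotenv()

# Fail early with a clear message; assigning None to os.environ raises a TypeError.
if not os.getenv("GROQ_API_KEY"):
    raise RuntimeError("GROQ_API_KEY is not set; add it to .env or the environment.")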

## Summarization middleware

SummarizationMiddleware automatically compresses older conversation history when a size threshold is reached.
It preserves recent messages and replaces older ones with a summary.

The trigger (when to summarize) and keep (how much recent history to retain) policies can each be expressed as:
- a message count,
- a token count,
- a fraction of the model's context window.

In [3]:
from langchain.agents import create_agent
from langchain.agents.middleware import SummarizationMiddleware
from langgraph.checkpoint.memory import InMemorySaver
from langchain_core.messages import HumanMessage

In [4]:
# Message-count-based summarization
from langchain.agents import create_agent
from langchain.agents.middleware import SummarizationMiddleware
from langchain_groq import ChatGroq
from langgraph.checkpoint.memory import InMemorySaver

# Initialize Groq LLM
llm = ChatGroq(
    api_key=os.getenv("GROQ_API_KEY"),
    model="openai/gpt-oss-20b",
    temperature=0,
)

agent = create_agent(
    model=llm,
    checkpointer=InMemorySaver(),
    middleware=[
        SummarizationMiddleware(
            model=llm,
            trigger=("messages", 10),
            keep=("messages", 4),
        )
    ],
)

config = {"configurable": {"thread_id": "test-1"}}

In [5]:
questions = [
    "What is 2+2?",
    "What is 10*5?",
    "What is 100/4?",
    "What is 15-7?",
    "What is 3*3?",
    "What is 4*4?",
]

for q in questions:
    response = agent.invoke({"messages": [HumanMessage(content=q)]}, config)
    print(f"Messages: {len(response['messages'])}")

Messages: 2
Messages: 4
Messages: 6
Messages: 8
Messages: 10
Messages: 6
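The drop from 10 to 6 messages in the output above can be reproduced with a small simulation. This is a simplified sketch, not the library's implementation: `compact` is a hypothetical helper, messages are plain strings, and it assumes the check runs before each model call and fires once the count reaches the trigger.

```python
def compact(messages, trigger=10, keep=4):
    """Replace everything except the last `keep` messages with one summary."""
    if len(messages) < trigger:
        return messages
    older, recent = messages[:-keep], messages[-keep:]
    summary = f"[summary of {len(older)} earlier messages]"
    return [summary] + recent


history = []
counts = []
for i in range(6):
    history.append(f"user question {i + 1}")
    history = compact(history)                 # check runs before the model call
    history.append(f"assistant answer {i + 1}")
    counts.append(len(history))

print(counts)  # [2, 4, 6, 8, 10, 6]
```

On the sixth turn the history holds 11 messages before the model call, so it is compacted to one summary plus the 4 most recent messages, and the new answer brings the total to 6.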


## Token-based summarization

Token-based thresholds size the conversation history in the same unit that providers use for context limits and billing, which makes it easier to stay within both.

In [4]:
from langchain_core.tools import tool

@tool
def search_hotels(city: str) -> str:
    """Search for hotels in a city. The listing is deliberately verbose to increase token usage."""
    return f"""Hotels in {city}:
    1. Grand Hotel - 5 star, $350/night
    2. City Inn - 4 star, $180/night
    3. Budget Stay - 3 star, $75/night"""


In [6]:
# Initialize Groq LLM
from langchain_groq import ChatGroq

llm = ChatGroq(
    api_key=os.getenv("GROQ_API_KEY"),
    model="openai/gpt-oss-20b",
    temperature=0,
)

agent = create_agent(
    model=llm,
    tools=[search_hotels],
    checkpointer=InMemorySaver(),
    middleware=[
        SummarizationMiddleware(
            model=llm,
            trigger=("tokens", 550),
            keep=("tokens", 200),
        )
    ],
)

config = {"configurable": {"thread_id": "test-2"}}


In [7]:
def count_tokens(messages):
    # Rough estimate (about 4 characters per token); this is not the model's tokenizer.
    return sum(len(str(m.content)) for m in messages) // 4

cities = ["Paris", "London", "Tokyo", "New York", "Dubai", "Singapore"]

for city in cities:
    response = agent.invoke({"messages": [HumanMessage(content=f"Find hotels in {city}")]}, config)
    tokens = count_tokens(response["messages"])
    print(f"{city}: ~{tokens} tokens, {len(response['messages'])} messages")

Paris: ~133 tokens, 4 messages
London: ~277 tokens, 8 messages
Tokyo: ~417 tokens, 12 messages
New York: ~411 tokens, 6 messages
Dubai: ~258 tokens, 5 messages
Singapore: ~399 tokens, 9 messages
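A token-based keep policy can be pictured as walking backwards from the newest message until a budget is used up. The sketch below assumes the same rough 4-characters-per-token estimate as `count_tokens` and uses plain strings as messages; `keep_within_budget` is a hypothetical helper, and the real middleware may count tokens and choose cut points differently.

```python
def estimate_tokens(text: str) -> int:
    """Rough estimate: about 4 characters per token."""
    return len(text) // 4


def keep_within_budget(messages, keep_tokens):
    """Return the newest messages whose estimated size fits in `keep_tokens`.

    At least one message is always kept, even if it alone exceeds the budget.
    """
    kept = []
    used = 0
    for msg in reversed(messages):
        cost = estimate_tokens(msg)
        if kept and used + cost > keep_tokens:
            break
        kept.append(msg)
        used += cost
    kept.reverse()  # restore chronological order
    return kept


messages = ["a" * 400, "b" * 400, "c" * 400]   # about 100 tokens each
recent = keep_within_budget(messages, keep_tokens=200)
print(len(recent))  # 2
```

Everything that falls outside the returned slice is what a summarizer would compress into a single summary message.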


## Fraction-based summarization

Fraction thresholds are relative to the model's total context window, making them portable across models.

In [8]:
from langchain_groq import ChatGroq

llm = ChatGroq(
    api_key=os.getenv("GROQ_API_KEY"),
    model="openai/gpt-oss-20b",
    temperature=0,
)

agent = create_agent(
    model=llm,
    tools=[search_hotels],
    checkpointer=InMemorySaver(),
    middleware=[
        SummarizationMiddleware(
            model=llm,
            trigger=("fraction", 0.005),
            keep=("fraction", 0.002),
        )
    ],
)

config = {"configurable": {"thread_id": "test-3"}}

In [9]:
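# 16000 is an assumed context-window size, used only for the printed percentage.
# Substitute the model's actual limit to compare against the fraction thresholds.
CONTEXT_WINDOW = 16000

for city in cities:
    response = agent.invoke({"messages": [HumanMessage(content=f"Hotels in {city}")]}, config)
    tokens = count_tokens(response["messages"])
    fraction = tokens / CONTEXT_WINDOW
    print(f"{city}: ~{tokens} tokens ({fraction:.4%}), {len(response['messages'])} msgs")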


Paris: ~154 tokens (0.9625%), 4 msgs
London: ~371 tokens (2.3188%), 8 msgs
Tokyo: ~661 tokens (4.1313%), 12 msgs
New York: ~515 tokens (3.2188%), 5 msgs
Dubai: ~840 tokens (5.2500%), 7 msgs
Singapore: ~4548 tokens (28.4250%), 3 msgs
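A fraction threshold only becomes a concrete number once it is multiplied by a context-window size. The sketch below shows that arithmetic for the `0.005` trigger and `0.002` keep values used above. `fraction_to_tokens` is a hypothetical helper, and both window sizes are assumptions for illustration: 16,000 is the figure the notebook divides by, and 131,072 is an example of a larger window. Check the actual limit for the model you use.

```python
def fraction_to_tokens(fraction: float, context_window: int) -> int:
    """Convert a fraction of the context window into an absolute token count."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    return round(fraction * context_window)


thresholds = {}
for window in (16_000, 131_072):  # assumed window sizes, for illustration only
    trigger = fraction_to_tokens(0.005, window)
    keep = fraction_to_tokens(0.002, window)
    thresholds[window] = (trigger, keep)
    print(f"window={window}: trigger at ~{trigger} tokens, keep ~{keep} tokens")
```

The same pair of fractions yields very different absolute thresholds depending on the window, which is why the percentages printed by the loop are only meaningful if the divisor matches the model's real context size.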


## Human-in-the-loop middleware

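HumanInTheLoopMiddleware pauses execution before sensitive tool calls and waits for an explicit human decision: approve, edit, or reject.
This lets a person intervene before an irreversible action runs. Because the run is paused and resumed later, the agent needs a checkpointer and a stable thread_id, as in the cells below.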
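The decision logic can be sketched without any library code. The example below is a simplified model of the behaviour described here, not the middleware's internals: `POLICY` mirrors the shape of the `interrupt_on` mapping used in this notebook, while `needs_review` and `resolve` are hypothetical helpers.

```python
from typing import Dict, Optional

# Same shape as the interrupt_on mapping in this notebook.
POLICY = {
    "send_email_tool": {"allowed_decisions": ["approve", "edit", "reject"]},
    "read_email_tool": False,  # runs without review
}


def needs_review(tool_name: str, policy: Dict = POLICY) -> bool:
    """True if calls to this tool should pause for a human decision."""
    return bool(policy.get(tool_name, False))


def resolve(tool_call: Dict, decision: Dict, policy: Dict = POLICY) -> Optional[Dict]:
    """Return the tool call to execute, or None if the human rejected it."""
    if not needs_review(tool_call["name"], policy):
        return tool_call
    kind = decision["type"]
    if kind not in policy[tool_call["name"]]["allowed_decisions"]:
        raise ValueError(f"decision {kind!r} is not allowed for {tool_call['name']}")
    if kind == "approve":
        return tool_call
    if kind == "edit":
        return decision["edited_action"]  # human-supplied replacement call
    return None  # reject


call = {"name": "send_email_tool", "args": {"recipient": "john@test.com"}}
print(resolve(call, {"type": "approve"}) == call)  # True
print(resolve(call, {"type": "reject"}))           # None
```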

In [12]:
from langchain.agents import create_agent
from langchain.agents.middleware import HumanInTheLoopMiddleware
from langgraph.checkpoint.memory import InMemorySaver

def read_email_tool(email_id: str) -> str:
    """Mock function to read an email by its ID."""
    return f"Email content for ID: {email_id}"

def send_email_tool(recipient: str, subject: str, body: str) -> str:
    """Mock function to send an email."""
    return f"Email sent to {recipient} with subject '{subject}'"

In [13]:
from langchain_groq import ChatGroq

llm = ChatGroq(
    api_key=os.getenv("GROQ_API_KEY"),
    model="openai/gpt-oss-20b",
    temperature=0,
)

agent = create_agent(
    model=llm,
    tools=[read_email_tool, send_email_tool],
    checkpointer=InMemorySaver(),
    middleware=[
        HumanInTheLoopMiddleware(
            interrupt_on={
                "send_email_tool": {
                    "allowed_decisions": ["approve", "edit", "reject"],
                },
                "read_email_tool": False,
            }
        ),
    ],
)

In [14]:
config = {"configurable": {"thread_id": "test-approve"}}
# Step 1: Request
result = agent.invoke(
    {"messages": [HumanMessage(content="Send email to john@test.com with subject 'Hello' and body 'How are you?'")]},
    config=config
)

In [15]:
result

{'messages': [HumanMessage(content="Send email to john@test.com with subject 'Hello' and body 'How are you?'", additional_kwargs={}, response_metadata={}, id='19a7179b-7709-4e57-a706-d9692f91576a'),
  AIMessage(content='', additional_kwargs={'reasoning_content': 'We need to call send_email_tool.', 'tool_calls': [{'id': 'fc_179a65a9-671d-4f94-b5fe-a282c1b0b77b', 'function': {'arguments': '{"body":"How are you?","recipient":"john@test.com","subject":"Hello"}', 'name': 'send_email_tool'}, 'type': 'function'}]}, response_metadata={'token_usage': {'completion_tokens': 46, 'prompt_tokens': 174, 'total_tokens': 220, 'completion_time': 0.050250452, 'completion_tokens_details': {'reasoning_tokens': 9}, 'prompt_time': 0.010642247, 'prompt_tokens_details': None, 'queue_time': 0.003460262, 'total_time': 0.060892699}, 'model_name': 'openai/gpt-oss-20b', 'system_fingerprint': 'fp_2b688e7cc3', 'service_tier': 'on_demand', 'finish_reason': 'tool_calls', 'logprobs': None, 'model_provider': 'groq'}, id=

In [16]:
from langgraph.types import Command
# Step 2: Approve
if "__interrupt__" in result:
    print("⏸️ Paused! Approving...")
    
    result = agent.invoke(
        Command(
            resume={
                "decisions": [
                    {"type": "approve"}
                ]
            }
        ),
        config=config
    )
    
    print(f"✅ Result: {result['messages'][-1].content}")

⏸️ Paused! Approving...
✅ Result: ✅ Email sent to john@test.com with subject “Hello” and body “How are you?”


In [18]:
result

{'messages': [HumanMessage(content="Send email to john@test.com with subject 'Hello' and body 'How are you?'", additional_kwargs={}, response_metadata={}, id='19a7179b-7709-4e57-a706-d9692f91576a'),
  AIMessage(content='', additional_kwargs={'reasoning_content': 'We need to call send_email_tool.', 'tool_calls': [{'id': 'fc_179a65a9-671d-4f94-b5fe-a282c1b0b77b', 'function': {'arguments': '{"body":"How are you?","recipient":"john@test.com","subject":"Hello"}', 'name': 'send_email_tool'}, 'type': 'function'}]}, response_metadata={'token_usage': {'completion_tokens': 46, 'prompt_tokens': 174, 'total_tokens': 220, 'completion_time': 0.050250452, 'completion_tokens_details': {'reasoning_tokens': 9}, 'prompt_time': 0.010642247, 'prompt_tokens_details': None, 'queue_time': 0.003460262, 'total_time': 0.060892699}, 'model_name': 'openai/gpt-oss-20b', 'system_fingerprint': 'fp_2b688e7cc3', 'service_tier': 'on_demand', 'finish_reason': 'tool_calls', 'logprobs': None, 'model_provider': 'groq'}, id=

### To reject

In [21]:
from langchain.agents import create_agent
from langchain.agents.middleware import HumanInTheLoopMiddleware
from langgraph.checkpoint.memory import InMemorySaver


def read_email_tool(email_id: str) -> str:
    """Mock function to read an email by its ID."""
    return f"Email content for ID: {email_id}"

def send_email_tool(recipient: str, subject: str, body: str) -> str:
    """Mock function to send an email."""
    return f"Email sent to {recipient} with subject '{subject}'"

from langchain_groq import ChatGroq

llm = ChatGroq(
    api_key=os.getenv("GROQ_API_KEY"),
    model="openai/gpt-oss-20b",
    temperature=0,
)

agent = create_agent(
    model=llm,
    tools=[read_email_tool, send_email_tool],
    checkpointer=InMemorySaver(),
    middleware=[
        HumanInTheLoopMiddleware(
            interrupt_on={
                "send_email_tool": {
                    "allowed_decisions": ["approve", "edit", "reject"],
                },
                "read_email_tool": False,
            }
        ),
    ],
)

In [22]:
config = {"configurable": {"thread_id": "test-reject"}}
# Step 1: Request
result = agent.invoke(
    {"messages": [HumanMessage(content="Send email to john@test.com with subject 'Hello' and body 'How are you?'")]},
    config=config)

In [23]:
# Step 2: Reject
if "__interrupt__" in result:
    print("⏸️ Paused! Rejecting...")
    
    result = agent.invoke(
        Command(
            resume={
                "decisions": [
                    {"type": "reject"}
                ]
            }
        ),
        config=config
    )
    
    print(f"✅ Result: {result['messages'][-1].content}")

⏸️ Paused! Rejecting...
✅ Result: I’m sorry the email wasn’t sent. Would you like me to try again, or is there something else you’d like to do?


In [24]:
result

{'messages': [HumanMessage(content="Send email to john@test.com with subject 'Hello' and body 'How are you?'", additional_kwargs={}, response_metadata={}, id='d25857a8-522e-432b-801b-264a97af54cb'),
  AIMessage(content='', additional_kwargs={'reasoning_content': 'We need to call send_email_tool.', 'tool_calls': [{'id': 'fc_8ad0416f-53d5-47d3-a5af-80c38be3c126', 'function': {'arguments': '{"body":"How are you?","recipient":"john@test.com","subject":"Hello"}', 'name': 'send_email_tool'}, 'type': 'function'}]}, response_metadata={'token_usage': {'completion_tokens': 46, 'prompt_tokens': 174, 'total_tokens': 220, 'completion_time': 0.047249367, 'completion_tokens_details': {'reasoning_tokens': 9}, 'prompt_time': 0.009747466, 'prompt_tokens_details': None, 'queue_time': 0.033715854, 'total_time': 0.056996833}, 'model_name': 'openai/gpt-oss-20b', 'system_fingerprint': 'fp_d6de37e6be', 'service_tier': 'on_demand', 'finish_reason': 'tool_calls', 'logprobs': None, 'model_provider': 'groq'}, id=

### To edit

In [25]:
from langchain.agents import create_agent
from langchain.agents.middleware import HumanInTheLoopMiddleware
from langgraph.checkpoint.memory import InMemorySaver


def read_email_tool(email_id: str) -> str:
    """Mock function to read an email by its ID."""
    return f"Email content for ID: {email_id}"

def send_email_tool(recipient: str, subject: str, body: str) -> str:
    """Mock function to send an email."""
    return f"Email sent to {recipient} with subject '{subject}'"

from langchain_groq import ChatGroq

llm = ChatGroq(
    api_key=os.getenv("GROQ_API_KEY"),
    model="openai/gpt-oss-20b",
    temperature=0,
)

agent = create_agent(
    model=llm,
    tools=[read_email_tool, send_email_tool],
    checkpointer=InMemorySaver(),
    middleware=[
        HumanInTheLoopMiddleware(
            interrupt_on={
                "send_email_tool": {
                    "allowed_decisions": ["approve", "edit", "reject"],
                },
                "read_email_tool": False,
            }
        ),
    ],
)

In [26]:
config = {"configurable": {"thread_id": "test-edit"}}

# Step 1: Request (with wrong info)
result = agent.invoke(
    {"messages": [HumanMessage(content="Send email to wrong@email.com with subject 'Test' and body 'Hello'")]},
    config=config
)

In [28]:
result

{'messages': [HumanMessage(content="Send email to wrong@email.com with subject 'Test' and body 'Hello'", additional_kwargs={}, response_metadata={}, id='98121ad0-2018-467d-a28d-adbcc95e3f7b'),
  AIMessage(content='', additional_kwargs={'reasoning_content': "We need to use the send_email_tool. The user wants to send an email to wrong@email.com with subject 'Test' and body 'Hello'. We should call the function.", 'tool_calls': [{'id': 'fc_ca3034cf-30fc-4f42-8618-915bcc9fb64c', 'function': {'arguments': '{"body":"Hello","recipient":"wrong@email.com","subject":"Test"}', 'name': 'send_email_tool'}, 'type': 'function'}]}, response_metadata={'token_usage': {'completion_tokens': 71, 'prompt_tokens': 172, 'total_tokens': 243, 'completion_time': 0.075423743, 'completion_tokens_details': {'reasoning_tokens': 37}, 'prompt_time': 0.008458867, 'prompt_tokens_details': None, 'queue_time': 0.003326764, 'total_time': 0.08388261}, 'model_name': 'openai/gpt-oss-20b', 'system_fingerprint': 'fp_e2cb7a84ec',

In [29]:
# Step 2: Edit and approve
if "__interrupt__" in result:
    print("⏸️ Paused! Editing...")
    
    result = agent.invoke(
        Command(
            resume={
                "decisions": [
                    {
                        "type": "edit",
                        "edited_action": {
                            "name": "send_email_tool",      # Tool name
                            "args": {                   # New arguments
                                "recipient": "correct@email.com",
                                "subject": "Corrected Subject",
                                "body": "This was edited by human before sending"
                            }
                        }
                    }
                ]
            }
        ),
        config=config
    )
    
    print(f"✏️ Result: {result['messages'][-1].content}")

⏸️ Paused! Editing...
✏️ Result: 
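The three resume payloads used in this notebook share one shape: a `decisions` list with one entry per paused tool call. A few small helpers can keep that shape consistent. The helper names below (`approve`, `reject`, `edit`, `resume_payload`) are hypothetical conveniences, not part of LangChain; the dictionaries they build copy the ones passed to `Command(resume=...)` in the cells above.

```python
from typing import Any, Dict


def approve() -> Dict[str, Any]:
    """Run the tool call exactly as the model proposed it."""
    return {"type": "approve"}


def reject() -> Dict[str, Any]:
    """Skip the tool call."""
    return {"type": "reject"}


def edit(name: str, args: Dict[str, Any]) -> Dict[str, Any]:
    """Replace the proposed call with a human-corrected one."""
    return {"type": "edit", "edited_action": {"name": name, "args": dict(args)}}


def resume_payload(*decisions: Dict[str, Any]) -> Dict[str, Any]:
    """Wrap one decision per paused tool call, in order."""
    return {"decisions": list(decisions)}


payload = resume_payload(
    edit("send_email_tool", {
        "recipient": "correct@email.com",
        "subject": "Corrected Subject",
        "body": "This was edited by human before sending",
    })
)
print(payload["decisions"][0]["type"])  # edit
```

The resulting dictionary would be passed as `Command(resume=payload)` together with the same `config` used for the paused run.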


In [121]:
result

{'messages': [HumanMessage(content="Send email to wrong@email.com with subject 'Test' and body 'Hello'", additional_kwargs={}, response_metadata={}, id='95e2923a-4d36-4d6a-9e28-6582e1bce620'),
  AIMessage(content='', additional_kwargs={'reasoning_content': "We need to use the send_email_tool. The user wants to send an email to wrong@email.com with subject 'Test' and body 'Hello'. We should call the function.", 'tool_calls': [{'id': 'fc_aa362afb-9016-4bfa-88dc-3de3be04f382', 'function': {'arguments': '{"body":"Hello","recipient":"wrong@email.com","subject":"Test"}', 'name': 'send_email_tool'}, 'type': 'function'}]}, response_metadata={'token_usage': {'completion_tokens': 71, 'prompt_tokens': 172, 'total_tokens': 243, 'completion_time': 0.074343417, 'completion_tokens_details': {'reasoning_tokens': 37}, 'prompt_time': 0.009992015, 'prompt_tokens_details': None, 'queue_time': 0.003106396, 'total_time': 0.084335432}, 'model_name': 'openai/gpt-oss-20b', 'system_fingerprint': 'fp_e2cb7a84ec'