### Middleware

Middleware gives you finer control over what happens inside the agent loop. It is useful for the following:

- Tracking agent behavior with logging, analytics, and debugging.
- Transforming prompts, tool selection, and output formatting.
- Adding retries, fallbacks, and early termination logic.
- Applying rate limits, guardrails, and PII detection.
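
To make the idea concrete, here is a minimal, library-free sketch of middleware as functions that wrap a model call. The names `with_logging`, `with_retries`, `apply_middleware`, and `FlakyModel` are hypothetical and exist only for this illustration; this is not how LangChain implements middleware internally, only the general wrapping pattern.

```python
from typing import Callable

# Hypothetical types for illustration only; not the LangChain API.
ModelCall = Callable[[str], str]
Middleware = Callable[[ModelCall], ModelCall]


def with_logging(log: list) -> Middleware:
    """Record every prompt and response that passes through."""
    def middleware(call: ModelCall) -> ModelCall:
        def wrapped(prompt: str) -> str:
            log.append(f"prompt: {prompt}")
            result = call(prompt)
            log.append(f"response: {result}")
            return result
        return wrapped
    return middleware


def with_retries(max_attempts: int) -> Middleware:
    """Retry the wrapped call on RuntimeError, up to max_attempts times."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    def middleware(call: ModelCall) -> ModelCall:
        def wrapped(prompt: str) -> str:
            for attempt in range(1, max_attempts + 1):
                try:
                    return call(prompt)
                except RuntimeError:
                    if attempt == max_attempts:
                        raise  # out of attempts, surface the error
        return wrapped
    return middleware


def apply_middleware(call: ModelCall, middleware: list) -> ModelCall:
    """Wrap call so the first middleware in the list is the outermost layer."""
    for mw in reversed(middleware):
        call = mw(call)
    return call


class FlakyModel:
    """Stand-in for a model: fails on the first call, then upper-cases the prompt."""
    def __init__(self) -> None:
        self.calls = 0

    def __call__(self, prompt: str) -> str:
        self.calls += 1
        if self.calls == 1:
            raise RuntimeError("transient failure")
        return prompt.upper()
```

Because logging is listed first, it sits outside the retry layer and records one prompt and one response even when the inner call is retried.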


In [1]:
import os
from dotenv import load_dotenv
from langchain.chat_models import init_chat_model

# Load OPENROUTER_API_KEY from a local .env file
load_dotenv()

# Set up OpenRouter as an OpenAI-compatible provider
llm = init_chat_model(
    model="openai/gpt-oss-120b",  # any OpenRouter model string works (e.g., "google/gemini-2.0-flash-001")
    model_provider="openai",
    openai_api_key=os.getenv("OPENROUTER_API_KEY"),
    base_url="https://openrouter.ai/api/v1",
    # Ask OpenRouter to enable the model's reasoning feature
    extra_body={"reasoning": {"enabled": True}},
)

In [7]:
from langchain.agents import create_agent
from langchain.agents.middleware import SummarizationMiddleware
from langgraph.checkpoint.memory import InMemorySaver
from langchain_core.messages import HumanMessage, SystemMessage

In [8]:
### Message-based summarization: trigger on message count, keep the most recent messages

agent = create_agent(
    model=llm,
    checkpointer=InMemorySaver(),
    middleware=[
        SummarizationMiddleware(
            model=llm,
            trigger=("messages", 10),
            keep=("messages", 4),
        )
    ],
)

In [9]:
### Run with a thread ID
config = {
    "configurable": {"thread_id": "test-1"}
}  # this particular thread ID identifies our unique user

In [10]:
# Alternative test data
questions = [
    "What is 2+2?",
    "What is 10*5?",
    "What is 100/4?",
    "What is 15-7?",
    "What is 3*3?",
    "What is 4*4?",
]

for q in questions:
    response = agent.invoke({"messages": [HumanMessage(content=q)]}, config)
    print(f"Messages: {response}")
    # Single quotes inside the f-string keep this valid on Python versions before 3.12
    print(f"Message count: {len(response['messages'])}")

Messages: {'messages': [HumanMessage(content='What is 2+2?', additional_kwargs={}, response_metadata={}, id='8e3f7a29-18c3-475a-b0d9-91c46908853a'), AIMessage(content='2\u202f+\u202f2\u202f=\u202f4.', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 43, 'prompt_tokens': 85, 'total_tokens': 128, 'completion_tokens_details': {'accepted_prediction_tokens': None, 'audio_tokens': None, 'reasoning_tokens': 23, 'rejected_prediction_tokens': None, 'image_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0, 'video_tokens': 0}, 'cost': 1.5e-05, 'is_byok': False, 'cost_details': {'upstream_inference_cost': None, 'upstream_inference_prompt_cost': 4.25e-06, 'upstream_inference_completions_cost': 1.075e-05}}, 'model_provider': 'openai', 'model_name': 'openai/gpt-oss-120b', 'system_fingerprint': '', 'id': 'gen-1768539167-UQ3VDXdVoomirspMu3Zu', 'finish_reason': 'stop', 'logprobs': None}, id='lc_run--019bc526-1881-7f21-8fbb-c0940e217e

In [None]:
print(
    "Here is a summary of the conversation to date:\n\nHuman: What is 2+2?\nAI: 2\u202f+\u202f2\u202f=\u202f4.\nHuman: What is 10*5?\nAI: 10\u202f×\u202f5\u202f=\u202f50.\nHuman: What is 100/4?\nAI: 100\u202f÷\u202f4\u202f=\u202f25.\nHuman: What is 15-7?"
)

Here is a summary of the conversation to date:

Human: What is 2+2?
AI: 2 + 2 = 4.
Human: What is 10*5?
AI: 10 × 5 = 50.
Human: What is 100/4?
AI: 100 ÷ 4 = 25.
Human: What is 15-7?
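
The `trigger` and `keep` settings can be pictured with a small, library-free sketch. It assumes that summarization fires once the message count reaches the trigger and that the most recent `keep` messages are preserved verbatim; the exact thresholds and cut-off rules inside `SummarizationMiddleware` may differ. The functions `summarize_if_needed` and `naive_summary` are hypothetical, and messages are modeled as plain `(role, content)` tuples.

```python
from typing import Callable

Message = tuple  # (role, content); a simplification of real message objects


def naive_summary(messages: list) -> str:
    """Hypothetical summarizer: the real middleware asks an LLM instead."""
    lines = [f"{role}: {content}" for role, content in messages]
    return "Here is a summary of the conversation to date:\n\n" + "\n".join(lines)


def summarize_if_needed(
    messages: list,
    trigger: int,
    keep: int,
    summarize: Callable[[list], str],
) -> list:
    """Replace older messages with one summary once the history is long enough."""
    if len(messages) < trigger:
        return messages  # below the threshold: leave the history untouched

    # Split into the part to compress and the part to keep verbatim.
    split = max(len(messages) - keep, 0)
    older, recent = messages[:split], messages[split:]

    # The summary takes the place of all older messages.
    return [("system", summarize(older))] + recent
```

With `trigger=10` and `keep=4`, a ten-message history collapses to five entries: one summary plus the last four messages.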


#### **Human in the Loop**

Pause agent execution for human approval, editing, or rejection of tool calls before they execute. Human-in-the-loop is useful for the following:

- High-stakes operations requiring human approval (e.g. database writes, financial transactions).
- Compliance workflows where human oversight is mandatory.
- Long-running conversations where human feedback guides the agent.
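
The approve, edit, and reject decisions can be sketched without any library as a gate in front of a tool call. This is only an illustration of the control flow: `run_with_review` is a hypothetical helper, and the `"args"` and `"message"` keys on the decision dictionary are assumptions made for this sketch, not the middleware's documented payload.

```python
from typing import Any, Callable


def send_email(recipient: str, subject: str, body: str) -> str:
    """Mock tool, in the spirit of the notebook's send_email_tool."""
    return f"Email sent to {recipient} with subject '{subject}'"


def run_with_review(
    tool: Callable[..., str],
    args: dict,
    decision: dict,
) -> str:
    """Execute, modify, or block a pending tool call based on a human decision."""
    kind = decision["type"]
    if kind == "approve":
        # Run the tool exactly as the agent proposed it.
        return tool(**args)
    if kind == "edit":
        # Run the tool with the reviewer's replacement arguments (assumed key: "args").
        return tool(**decision["args"])
    if kind == "reject":
        # Skip the tool and report the reason back (assumed key: "message").
        reason = decision.get("message", "no reason given")
        return f"Tool call rejected: {reason}"
    raise ValueError(f"Unknown decision type: {kind!r}")
```

The key property is that the tool never runs until a decision arrives, which is why the agent needs a checkpointer to hold its state while it waits.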


In [15]:
from langchain.agents import create_agent
from langchain.agents.middleware import HumanInTheLoopMiddleware
from langgraph.checkpoint.memory import InMemorySaver


def read_email_tool(email_id: str) -> str:
    """Mock function to read an email by its ID."""
    return f"Email content for ID: {email_id}"


def send_email_tool(recipient: str, subject: str, body: str) -> str:
    """Mock function to send an email."""
    return f"Email sent to {recipient} with subject '{subject}'"

In [16]:
agent = create_agent(
    model=llm,
    tools=[read_email_tool, send_email_tool],
    checkpointer=InMemorySaver(),
    middleware=[
        HumanInTheLoopMiddleware(
            interrupt_on={
                "send_email_tool": {"allowed_decisions": ["approve", "edit", "reject"]},
                "read_email_tool": False,
            }
        )
    ],
)

In [17]:
config = {"configurable": {"thread_id": "test-approve"}}
# Step 1: Request
result = agent.invoke(
    {
        "messages": [
            HumanMessage(
                content="Send email to john@test.com with subject 'Hello' and body 'How are you?'"
            )
        ]
    },
    config=config,
)

In [18]:
result

{'messages': [HumanMessage(content="Send email to john@test.com with subject 'Hello' and body 'How are you?'", additional_kwargs={}, response_metadata={}, id='659e3c76-50cd-47bf-b0b2-7d1c6a0ad9e7'),
  AIMessage(content='The email has been sent successfully.', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 73, 'prompt_tokens': 172, 'total_tokens': 245, 'completion_tokens_details': {'accepted_prediction_tokens': None, 'audio_tokens': None, 'reasoning_tokens': 15, 'rejected_prediction_tokens': None, 'image_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0, 'video_tokens': 0}, 'cost': 5.297e-05, 'is_byok': False, 'cost_details': {'upstream_inference_cost': None, 'upstream_inference_prompt_cost': 1.72e-05, 'upstream_inference_completions_cost': 3.577e-05}}, 'model_provider': 'openai', 'model_name': 'openai/gpt-oss-120b', 'system_fingerprint': None, 'id': 'gen-1768540410-5lSx4g1imMa265ZJAsTy', 'finish_reason': 'tool_cal

In [21]:
from langgraph.types import Command

# Step 2: Approve
if "__interrupt__" in result:
    print("Paused! Approving...")

    result = agent.invoke(
        Command(resume={"decisions": [{"type": "approve"}]}), config=config
    )

    print(f"Result: {result['messages'][-1].content}")

In [22]:
result

{'messages': [HumanMessage(content="Send email to john@test.com with subject 'Hello' and body 'How are you?'", additional_kwargs={}, response_metadata={}, id='659e3c76-50cd-47bf-b0b2-7d1c6a0ad9e7'),
  AIMessage(content='The email has been sent successfully.', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 73, 'prompt_tokens': 172, 'total_tokens': 245, 'completion_tokens_details': {'accepted_prediction_tokens': None, 'audio_tokens': None, 'reasoning_tokens': 15, 'rejected_prediction_tokens': None, 'image_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0, 'video_tokens': 0}, 'cost': 5.297e-05, 'is_byok': False, 'cost_details': {'upstream_inference_cost': None, 'upstream_inference_prompt_cost': 1.72e-05, 'upstream_inference_completions_cost': 3.577e-05}}, 'model_provider': 'openai', 'model_name': 'openai/gpt-oss-120b', 'system_fingerprint': None, 'id': 'gen-1768540410-5lSx4g1imMa265ZJAsTy', 'finish_reason': 'tool_cal