### Middleware

Middleware gives you finer control over what happens inside the agent at each step. It is useful for the following:
- Tracking agent behavior with logging, analytics, and debugging.
- Transforming prompts, tool selection, and output formatting.
- Adding retries, fallbacks, and early termination logic.
- Applying rate limits, guardrails, and PII detection.

The LangChain documentation describes many more middleware types than the ones covered here.
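To make the idea concrete, here is a minimal sketch in plain Python of how middleware can wrap a model call to add logging and retries. Every name in it (`Handler`, `logging_middleware`, `retry_middleware`, `compose`, `flaky_model`) is hypothetical and chosen for illustration; this is not the LangChain middleware API, only the general wrapping pattern.

```python
from typing import Callable

# Hypothetical types for illustration; not part of LangChain.
Handler = Callable[[str], str]
Middleware = Callable[[Handler], Handler]


def logging_middleware(log: list) -> Middleware:
    """Record every prompt and response that passes through."""
    def wrap(next_handler: Handler) -> Handler:
        def handler(prompt: str) -> str:
            log.append(f"request: {prompt}")
            response = next_handler(prompt)
            log.append(f"response: {response}")
            return response
        return handler
    return wrap


def retry_middleware(max_attempts: int) -> Middleware:
    """Retry the wrapped handler when it raises RuntimeError."""
    def wrap(next_handler: Handler) -> Handler:
        def handler(prompt: str) -> str:
            for attempt in range(1, max_attempts + 1):
                try:
                    return next_handler(prompt)
                except RuntimeError:
                    if attempt == max_attempts:
                        raise
        return handler
    return wrap


def compose(model: Handler, middleware: list) -> Handler:
    """Wrap the model so the first middleware in the list is outermost."""
    handler = model
    for mw in reversed(middleware):
        handler = mw(handler)
    return handler


calls = {"count": 0}


def flaky_model(prompt: str) -> str:
    """Stand-in model that fails on its first call, then succeeds."""
    calls["count"] += 1
    if calls["count"] == 1:
        raise RuntimeError("transient failure")
    return prompt.upper()


log = []
pipeline = compose(flaky_model, [logging_middleware(log), retry_middleware(3)])
answer = pipeline("hello")
print(answer, calls["count"], log)
```

Because logging is outermost, it sees one request and one response even though the retry layer called the model twice.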

In [1]:
import os
from dotenv import load_dotenv
load_dotenv()

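# load_dotenv() already copies GROQ_API_KEY into os.environ; fail early if it is missing
if not os.getenv("GROQ_API_KEY"):
    raise RuntimeError("GROQ_API_KEY is not set; add it to your .env file")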

## Summarization Middleware
`SummarizationMiddleware` automatically summarizes the conversation history as it approaches a configured limit, keeping recent messages intact while compressing older context. Summarization is useful for the following:
- Long-running conversations that exceed context windows.
- Multi-turn dialogues with extensive history.
- Applications where preserving full conversation context matters.

In [4]:
from langchain.agents import create_agent
from langchain.agents.middleware import SummarizationMiddleware
from langgraph.checkpoint.memory import InMemorySaver
from langchain_core.messages import HumanMessage, SystemMessage

In [None]:
# Message-based summarization: trigger at 10 messages, keep the 4 most recent

agent = create_agent(
    model="groq:openai/gpt-oss-20b",
    checkpointer=InMemorySaver(),
    middleware=[
        SummarizationMiddleware(
            model="groq:openai/gpt-oss-20b",
            trigger=("messages",10),
            keep=("messages",4)
        )
    ]
)
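The effect of `trigger=("messages",10)` and `keep=("messages",4)` can be pictured with a small plain-Python sketch. It is only an illustration: the function name `summarize_if_needed`, the `>=` comparison, and the placeholder summary text are assumptions, and the real middleware calls a model to write the summary.

```python
def summarize_if_needed(messages, trigger=10, keep=4, summarize=None):
    """Compress older messages once the history reaches `trigger` entries.

    Assumption: the threshold comparison and the summary format are
    illustrative; the real middleware may behave differently.
    """
    if len(messages) < trigger:
        return list(messages)
    split = max(len(messages) - keep, 0)
    older, recent = messages[:split], messages[split:]
    # Placeholder summarizer; a real one would call a language model.
    summarize = summarize or (lambda msgs: f"Summary of {len(msgs)} earlier messages")
    return [summarize(older)] + recent


history = [f"message {i}" for i in range(1, 11)]  # exactly 10 messages
compacted = summarize_if_needed(history)
print(compacted)
```

With ten messages in the history, the six oldest collapse into one summary entry and the four newest are kept verbatim.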

In [12]:
# The thread_id identifies the conversation whose history the checkpointer stores
config = {"configurable": {"thread_id": "test-2"}}

In [13]:
# Alternative test data
questions = [
    "What is 2+2?",
    "What is 10*5?",
    "What is 100/4?",
    "What is 15-7?",
    "What is 3*3?",
    "What is 4*4?",
]

In [16]:
for q in questions:
    response = agent.invoke({"messages":[HumanMessage(content=q)]},config)
    print(f"Messages :{response}")
    print(f"Length : {len(response['messages'])}")

Messages :{'messages': [HumanMessage(content='Here is a summary of the conversation to date:\n\n## SESSION INTENT\nProvide correct answers to simple arithmetic queries.\n\n## SUMMARY\nThe user asked three basic math questions:\n1. “What is 2+2?” – Assistant responded with **4**.\n2. “What is 10*5?” – Assistant responded with **10\u202f×\u202f5\u202f=\u202f50**.\n3. “What is 100/4?” – Assistant responded with **100 ÷ 4 = 25**.\n\n## ARTIFACTS\nNone.\n\n## NEXT STEPS\nNone.', additional_kwargs={'lc_source': 'summarization'}, response_metadata={}, id='fc485ac8-1620-4b04-bb1a-9a97860196b8'), HumanMessage(content='What is 15-7?', additional_kwargs={}, response_metadata={}, id='ae861def-4cb9-40c7-860a-a875f660606c'), AIMessage(content='15\u202f−\u202f7\u202f=\u202f8', additional_kwargs={'reasoning_content': 'User is asking simple arithmetic. Provide answer: 15-7=8.'}, response_metadata={'token_usage': {'completion_tokens': 35, 'prompt_tokens': 159, 'total_tokens': 194, 'completion_time': 0.0
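In the output above, the summary arrives as a `HumanMessage` whose `additional_kwargs` contains `'lc_source': 'summarization'`. A small helper can use that marker to tell summaries apart from real user turns. This is a sketch: `FakeMessage` is a hypothetical stand-in for a LangChain message, and the marker is what this particular run printed, so it may differ in other versions.

```python
from dataclasses import dataclass, field


@dataclass
class FakeMessage:
    """Hypothetical stand-in for a LangChain message object."""
    content: str
    additional_kwargs: dict = field(default_factory=dict)


def is_summary(message) -> bool:
    """True when the message carries the marker seen in the output above."""
    return message.additional_kwargs.get("lc_source") == "summarization"


messages = [
    FakeMessage("Here is a summary of the conversation to date: ...",
                {"lc_source": "summarization"}),
    FakeMessage("What is 15-7?"),
]
summary_flags = [is_summary(m) for m in messages]
print(summary_flags)
```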

# Token-Based Summarization

In [17]:
from langchain.agents import create_agent
from langchain.agents.middleware import SummarizationMiddleware
from langchain_core.tools import tool
from langchain_core.messages import HumanMessage
from langgraph.checkpoint.memory import InMemorySaver

In [18]:
@tool
def search_hotels(city: str) -> str:
    """Search hotels - returns long response to use more tokens."""
    return f"""Hotels in {city}:
    1. Grand Hotel - 5 star, $350/night, spa, pool, gym
    2. City Inn - 4 star, $180/night, business center
    3. Budget Stay - 3 star, $75/night, free wifi"""

In [19]:
agent=create_agent(
    model="groq:openai/gpt-oss-20b",
    tools=[search_hotels],
    checkpointer=InMemorySaver(),
    middleware=[
        SummarizationMiddleware(
            model="groq:openai/gpt-oss-20b",
            trigger=("tokens",550),
            keep=("tokens",200),
        ),
    ]
)

config = {"configurable": {"thread_id": "test-3"}}

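# Rough token estimate for display only (assumes about 6 characters per token);
# the middleware does its own counting, so the two numbers will not match exactly
def count_tokens(messages):
    total_chars = sum(len(str(m.content)) for m in messages)
    return total_chars // 6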
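A token-based `keep` budget can be pictured as walking backwards from the newest message and keeping messages until the budget is used up. The sketch below reuses the same characters-per-token heuristic; `estimate_tokens` and `keep_recent_within_budget` are hypothetical helpers, not what the middleware actually runs.

```python
def estimate_tokens(text: str, chars_per_token: int = 6) -> int:
    """Rough token estimate from character count (heuristic, not a tokenizer)."""
    return len(text) // chars_per_token


def keep_recent_within_budget(texts, budget, chars_per_token=6):
    """Keep the newest messages whose estimated tokens fit within `budget`."""
    kept, used = [], 0
    for text in reversed(texts):
        cost = estimate_tokens(text, chars_per_token)
        if used + cost > budget:
            break  # the next older message would exceed the budget
        kept.append(text)
        used += cost
    return list(reversed(kept)), used


# Estimated sizes: 100, 50, 40, and 10 tokens
texts = ["a" * 600, "b" * 300, "c" * 240, "d" * 60]
kept, used = keep_recent_within_budget(texts, budget=100)
print(len(kept), used)
```

Here the three newest messages fit exactly into the 100-token budget, and the oldest one is left for summarization.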

In [20]:
# Run test
cities = ["Hyderabad", "London", "Bengaluru", "Mumbai", "Dubai", "Singapore"]

for city in cities:
    response = agent.invoke(
        {"messages": [HumanMessage(content=f"Find hotels in {city}")]},
        config=config
    )
    
    tokens = count_tokens(response["messages"])
    print(f"{city}: ~{tokens} tokens, {len(response['messages'])} messages")
    print(response["messages"])

Hyderabad: ~176 tokens, 4 messages
[HumanMessage(content='Find hotels in Hyderabad', additional_kwargs={}, response_metadata={}, id='b76ea2ef-0958-4977-aa05-a89a35c300f8'), AIMessage(content='', additional_kwargs={'reasoning_content': 'We need to use the function to search hotels. Provide city "Hyderabad".', 'tool_calls': [{'id': 'fc_ccf05fb4-83e7-4aaa-8976-3fccf1d75af6', 'function': {'arguments': '{"city":"Hyderabad"}', 'name': 'search_hotels'}, 'type': 'function'}]}, response_metadata={'token_usage': {'completion_tokens': 42, 'prompt_tokens': 129, 'total_tokens': 171, 'completion_time': 0.044780293, 'completion_tokens_details': {'reasoning_tokens': 17}, 'prompt_time': 0.006241713, 'prompt_tokens_details': None, 'queue_time': 0.056472126, 'total_time': 0.051022006}, 'model_name': 'openai/gpt-oss-20b', 'system_fingerprint': 'fp_80501ff3a1', 'service_tier': 'on_demand', 'finish_reason': 'tool_calls', 'logprobs': None, 'model_provider': 'groq'}, id='lc_run--019c18c3-d0cc-7af3-bef3-034b2d

# Fraction-Based Summarization

In [21]:

@tool
def search(city: str) -> str:
    """Search hotels."""
    return f"Hotels in {city}: Grand Hotel $350, City Inn $180, Budget Stay $75"

In [27]:
agent=create_agent(
    model="groq:openai/gpt-oss-20b",
    tools=[search],
    checkpointer=InMemorySaver(),
    middleware=[
        SummarizationMiddleware(
            model="groq:openai/gpt-oss-20b",
            trigger=("fraction",0.005),
            keep=("fraction",0.002),
        ),
    ]
)

config = {"configurable": {"thread_id": "test-5"}}
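To see what these fractions mean in tokens, multiply them by the model's context window. The sketch below assumes a 128,000-token window, the figure this notebook uses for gpt-oss-20b in its own calculation; `fraction_to_tokens` is a hypothetical helper, and the window the middleware actually uses may differ.

```python
CONTEXT_WINDOW = 128_000  # assumption: the context length used elsewhere in this notebook


def fraction_to_tokens(fraction: float, context_window: int = CONTEXT_WINDOW) -> int:
    """Convert a fraction of the context window into a token count."""
    return round(fraction * context_window)


trigger_tokens = fraction_to_tokens(0.005)  # when summarization would start
keep_tokens = fraction_to_tokens(0.002)     # how much recent context would be kept
print(trigger_tokens, keep_tokens)
```

Under that assumption, the trigger corresponds to 640 tokens and the keep budget to 256 tokens, which is why such small fractions are enough to exercise summarization in a short test.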

In [28]:
# Rough token estimate for display only (assumes about 4 characters per token)
def count_tokens(messages):
    return sum(len(str(m.content)) for m in messages) // 4

# Test
cities = ["Paris", "London", "Tokyo", "New York", "Dubai", "Singapore"]

for city in cities:
    response = agent.invoke(
        {"messages": [HumanMessage(content=f"Hotels in {city}")]},
        config=config
    )
    tokens = count_tokens(response["messages"])
    fraction = tokens / 128000  # share of an assumed 128,000-token context window for gpt-oss-20b
    print(f"{city}: ~{tokens} tokens ({fraction:.4%}), {len(response['messages'])} msgs")
    print(response['messages'])

Paris: ~128 tokens (0.1000%), 4 msgs
[HumanMessage(content='Hotels in Paris', additional_kwargs={}, response_metadata={}, id='6e1a7c48-e43e-4b88-8ac8-03c20a4e71ff'), AIMessage(content='', additional_kwargs={'reasoning_content': 'We need to use the search function. The user asked "Hotels in Paris". We should call search with city "Paris".', 'tool_calls': [{'id': 'fc_ad588521-4ef7-4edb-a0fa-1d3152f386cd', 'function': {'arguments': '{"city":"Paris"}', 'name': 'search'}, 'type': 'function'}]}, response_metadata={'token_usage': {'completion_tokens': 48, 'prompt_tokens': 118, 'total_tokens': 166, 'completion_time': 0.049287372, 'completion_tokens_details': {'reasoning_tokens': 26}, 'prompt_time': 0.006625642, 'prompt_tokens_details': None, 'queue_time': 0.047046548, 'total_time': 0.055913014}, 'model_name': 'openai/gpt-oss-20b', 'system_fingerprint': 'fp_c5a89987dc', 'service_tier': 'on_demand', 'finish_reason': 'tool_calls', 'logprobs': None, 'model_provider': 'groq'}, id='lc_run--019c18ce-

# Human in the Loop
Some actions need a human to sign off before they run, for example anything with financial risk (such as stock trades) or sending emails.

In [29]:
from langchain.agents import create_agent
from langchain.agents.middleware import HumanInTheLoopMiddleware
from langgraph.checkpoint.memory import InMemorySaver

def read_email_tool(email_id: str) -> str:
    """Mock function to read an email by its ID."""
    return f"Email content for ID: {email_id}"

def send_email_tool(recipient: str, subject: str, body: str) -> str:
    """Mock function to send an email."""
    return f"Email sent to {recipient} with subject '{subject}'"

In [30]:
agent=create_agent(
    model="groq:openai/gpt-oss-20b",
    tools=[read_email_tool,send_email_tool],
    checkpointer=InMemorySaver(),
    middleware=[
        HumanInTheLoopMiddleware(
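            interrupt_on={
                # Pause before sending so a human can approve, edit, or reject
                "send_email_tool":{
                    "allowed_decisions":["approve","edit","reject"]
                },
                # Reading an email runs without interruption
                "read_email_tool":False,
            }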
        )
    ]
)
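The `interrupt_on` mapping acts as a per-tool policy. The plain-Python sketch below shows one way to read such a policy; `needs_review` and `allowed_decisions` are hypothetical helpers, and it is an assumption here that tools missing from the mapping run without a pause. Only the two entry forms used in this notebook (a dict and `False`) are handled.

```python
# Same shape as the interrupt_on mapping passed to the middleware above.
interrupt_on = {
    "send_email_tool": {"allowed_decisions": ["approve", "edit", "reject"]},
    "read_email_tool": False,
}


def needs_review(tool_name: str, policy: dict) -> bool:
    """Assumption: tools that are absent or mapped to False run without a pause."""
    return bool(policy.get(tool_name, False))


def allowed_decisions(tool_name: str, policy: dict) -> list:
    """Return the decisions a reviewer may take for this tool, if any."""
    entry = policy.get(tool_name, False)
    if isinstance(entry, dict):
        return list(entry.get("allowed_decisions", []))
    return []


for name in ("send_email_tool", "read_email_tool"):
    print(name, needs_review(name, interrupt_on), allowed_decisions(name, interrupt_on))
```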

In [31]:
config = {"configurable": {"thread_id": "test-approve"}}
# Request
result = agent.invoke(
    {"messages": [HumanMessage(content="Send email to john@test.com with subject 'Hello' and body 'How are you?'")]},
    config=config
)
result

{'messages': [HumanMessage(content="Send email to john@test.com with subject 'Hello' and body 'How are you?'", additional_kwargs={}, response_metadata={}, id='e7c51a23-64f3-4be1-8cb4-46313c6b29b1'),
  AIMessage(content='', additional_kwargs={'reasoning_content': 'We need to use the send_email_tool function.', 'tool_calls': [{'id': 'fc_32422dd0-49e8-438e-9dcb-9e1fccda36e6', 'function': {'arguments': '{"body":"How are you?","recipient":"john@test.com","subject":"Hello"}', 'name': 'send_email_tool'}, 'type': 'function'}]}, response_metadata={'token_usage': {'completion_tokens': 48, 'prompt_tokens': 174, 'total_tokens': 222, 'completion_time': 0.058652374, 'completion_tokens_details': {'reasoning_tokens': 11}, 'prompt_time': 0.012373556, 'prompt_tokens_details': None, 'queue_time': 0.051695524, 'total_time': 0.07102593}, 'model_name': 'openai/gpt-oss-20b', 'system_fingerprint': 'fp_c5a89987dc', 'service_tier': 'on_demand', 'finish_reason': 'tool_calls', 'logprobs': None, 'model_provider': 

# Approving

In [33]:
from langgraph.types import Command

# Resume the paused run with a human decision: approve, edit, or reject

if "__interrupt__" in result:
    print(" Paused! Approving...")
    
    result = agent.invoke(
        Command(
            resume={
                "decisions": [
                    {"type": "approve"}
                ]
            }
        ),
        config=config
    )
    
    print(f"Result: {result['messages'][-1].content}")

 Paused! Approving...
Result: ✅ Email sent to **john@test.com** with subject **“Hello”** and body **“How are you?”**.


In [34]:
result

{'messages': [HumanMessage(content="Send email to john@test.com with subject 'Hello' and body 'How are you?'", additional_kwargs={}, response_metadata={}, id='e7c51a23-64f3-4be1-8cb4-46313c6b29b1'),
  AIMessage(content='', additional_kwargs={'reasoning_content': 'We need to use the send_email_tool function.', 'tool_calls': [{'id': 'fc_32422dd0-49e8-438e-9dcb-9e1fccda36e6', 'function': {'arguments': '{"body":"How are you?","recipient":"john@test.com","subject":"Hello"}', 'name': 'send_email_tool'}, 'type': 'function'}]}, response_metadata={'token_usage': {'completion_tokens': 48, 'prompt_tokens': 174, 'total_tokens': 222, 'completion_time': 0.058652374, 'completion_tokens_details': {'reasoning_tokens': 11}, 'prompt_time': 0.012373556, 'prompt_tokens_details': None, 'queue_time': 0.051695524, 'total_time': 0.07102593}, 'model_name': 'openai/gpt-oss-20b', 'system_fingerprint': 'fp_c5a89987dc', 'service_tier': 'on_demand', 'finish_reason': 'tool_calls', 'logprobs': None, 'model_provider': 

# Rejection
Next, reject the tool call instead of approving it.

In [38]:
config = {"configurable": {"thread_id": "test-reject2"}}
# Request
result = agent.invoke(
    {"messages": [HumanMessage(content="Send email to reject@test.com with subject 'Hello' and body 'Why are you not rejected yet?'")]},
    config=config
)
result

{'messages': [HumanMessage(content="Send email to reject@test.com with subject 'Hello' and body 'Why are you not rejected yet?'", additional_kwargs={}, response_metadata={}, id='faafcc6d-f5e3-4d38-aaba-91f752dc3394'),
  AIMessage(content='', additional_kwargs={'reasoning_content': 'The user wants to send an email. We should use the send_email_tool.', 'tool_calls': [{'id': 'fc_6dd449fb-1866-48e0-8225-b1becfa90744', 'function': {'arguments': '{"body":"Why are you not rejected yet?","recipient":"reject@test.com","subject":"Hello"}', 'name': 'send_email_tool'}, 'type': 'function'}]}, response_metadata={'token_usage': {'completion_tokens': 57, 'prompt_tokens': 177, 'total_tokens': 234, 'completion_time': 0.058365346, 'completion_tokens_details': {'reasoning_tokens': 17}, 'prompt_time': 0.008536415, 'prompt_tokens_details': None, 'queue_time': 0.047184945, 'total_time': 0.066901761}, 'model_name': 'openai/gpt-oss-20b', 'system_fingerprint': 'fp_c5a89987dc', 'service_tier': 'on_demand', 'fini

In [39]:


# Resume the paused run, this time rejecting the tool call

if "__interrupt__" in result:
    print(" Paused! Rejecting...")
    
    result = agent.invoke(
        Command(
            resume={
                "decisions": [
                    {"type": "reject"}
                ]
            }
        ),
        config=config
    )
    
    print(f"Result: {result['messages'][-1].content}")

 Paused! Rejecting...
Result: I tried to send the email, but the request was rejected. Could you let me know if you’d like me to try again, or if there’s anything you’d like to change (recipient, subject, body, etc.)?


# Editing
Finally, edit the tool call's arguments before the email is sent.

In [46]:
config = {"configurable": {"thread_id": "test-edit3"}}
# Request
result = agent.invoke(
    {"messages": [HumanMessage(content="Send email to Karthik@test.com with subject 'Hello' and body 'This bro?'")]},
    config=config
)
result

{'messages': [HumanMessage(content="Send email to Karthik@test.com with subject 'Hello' and body 'This bro?'", additional_kwargs={}, response_metadata={}, id='7a5899b5-501e-47fe-890b-e5df9955a3cc'),
  AIMessage(content='', additional_kwargs={'reasoning_content': "We need to send an email via the send_email_tool. We must provide body, recipient, subject. We'll call the function.", 'tool_calls': [{'id': 'fc_f5bbfdf8-33c6-4752-94c4-729715e41e47', 'function': {'arguments': '{"body":"This bro?","recipient":"Karthik@test.com","subject":"Hello"}', 'name': 'send_email_tool'}, 'type': 'function'}]}, response_metadata={'token_usage': {'completion_tokens': 65, 'prompt_tokens': 175, 'total_tokens': 240, 'completion_time': 0.071113756, 'completion_tokens_details': {'reasoning_tokens': 27}, 'prompt_time': 0.011838084, 'prompt_tokens_details': None, 'queue_time': 0.052563174, 'total_time': 0.08295184}, 'model_name': 'openai/gpt-oss-20b', 'system_fingerprint': 'fp_3d587a02fb', 'service_tier': 'on_dema

In [53]:


# Resume the paused run with an edited tool call

if "__interrupt__" in result:
    print(" Paused! Editing and sending...")

    result = agent.invoke(
        Command(
            resume={
                # Only "decisions" is needed; an extra "final" key was tried here and had no effect
                "decisions": [
                    {
                        "type": "edit",
                        "edited_action": {
                            "name": "send_email_tool",
                            "args": {
                                "recipient": "morty@gmail.com",
                                "subject": "Corrected Subject",
                                "body": "This was edited by god before sending",
                            },
                        },
                    }
                ]
            }
        ),
        config=config,
    )

    print(f"Result: {result['messages'][-1].content}")

 Paused! Editing and sending...
Result: 
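The three resume payloads used in this notebook share one shape: a `decisions` list with one entry per pending tool call. A small helper can build them consistently. This is a sketch: `build_resume` is a hypothetical function that only mirrors the dictionaries shown in the cells above, and its return value would be passed as `Command(resume=...)`.

```python
def build_resume(decision_type, name=None, args=None):
    """Build a resume payload shaped like the ones used in this notebook."""
    if decision_type in ("approve", "reject"):
        decision = {"type": decision_type}
    elif decision_type == "edit":
        if name is None or args is None:
            raise ValueError("an edit decision needs a tool name and arguments")
        decision = {
            "type": "edit",
            "edited_action": {"name": name, "args": dict(args)},
        }
    else:
        raise ValueError(f"unknown decision type: {decision_type!r}")
    return {"decisions": [decision]}


approve_payload = build_resume("approve")
edit_payload = build_resume(
    "edit",
    name="send_email_tool",
    args={"recipient": "morty@gmail.com", "subject": "Corrected Subject", "body": "Edited body"},
)
print(approve_payload)
print(edit_payload)
```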


In [54]:
result

{'messages': [HumanMessage(content="Send email to Karthik@test.com with subject 'Hello' and body 'This bro?'", additional_kwargs={}, response_metadata={}, id='7a5899b5-501e-47fe-890b-e5df9955a3cc'),
  AIMessage(content='', additional_kwargs={'reasoning_content': "We need to send an email via the send_email_tool. We must provide body, recipient, subject. We'll call the function.", 'tool_calls': [{'id': 'fc_f5bbfdf8-33c6-4752-94c4-729715e41e47', 'function': {'arguments': '{"body":"This bro?","recipient":"Karthik@test.com","subject":"Hello"}', 'name': 'send_email_tool'}, 'type': 'function'}]}, response_metadata={'token_usage': {'completion_tokens': 65, 'prompt_tokens': 175, 'total_tokens': 240, 'completion_time': 0.071113756, 'completion_tokens_details': {'reasoning_tokens': 27}, 'prompt_time': 0.011838084, 'prompt_tokens_details': None, 'queue_time': 0.052563174, 'total_time': 0.08295184}, 'model_name': 'openai/gpt-oss-20b', 'system_fingerprint': 'fp_3d587a02fb', 'service_tier': 'on_dema