## Middleware
Middleware provides a way to more tightly control what happens inside the agent.
Middleware is useful for the following:
- Tracking agent behaviour with logging, analytics, and debugging
- Transforming prompts, tool selection, and output formatting
- Adding retries, fallbacks, and early termination logic
- Applying rate limits, guardrails, and PII detection
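
To make the idea concrete, here is a minimal pure-Python sketch of middleware as wrappers around a model call. It is not the LangChain API: `with_logging`, `with_retries`, and `make_flaky_model` are hypothetical names invented for this illustration, and the "model" is a stand-in function that fails a fixed number of times before answering.

```python
from typing import Callable

# Hypothetical names throughout: this illustrates the concept, not LangChain's API.
ModelCall = Callable[[str], str]


def with_logging(call: ModelCall, log: list) -> ModelCall:
    """Record every prompt and response that passes through the wrapped call."""
    def wrapped(prompt: str) -> str:
        log.append(f"request: {prompt}")
        response = call(prompt)
        log.append(f"response: {response}")
        return response
    return wrapped


def with_retries(call: ModelCall, attempts: int = 3) -> ModelCall:
    """Retry the wrapped call when it raises RuntimeError, up to `attempts` times."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")

    def wrapped(prompt: str) -> str:
        last_error = None
        for _ in range(attempts):
            try:
                return call(prompt)
            except RuntimeError as err:
                last_error = err
        raise last_error
    return wrapped


def make_flaky_model(failures: int) -> ModelCall:
    """Stand-in model that fails `failures` times, then upper-cases the prompt."""
    state = {"remaining": failures}

    def model(prompt: str) -> str:
        if state["remaining"] > 0:
            state["remaining"] -= 1
            raise RuntimeError("temporary failure")
        return prompt.upper()
    return model


log = []
agent_call = with_logging(with_retries(make_flaky_model(failures=2)), log)
answer = agent_call("hello")
print(answer)
print(log)
```

The wrappers compose from the inside out: retries sit closest to the model, so the logging layer only sees the final successful response.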


Built-in middleware reference: https://docs.langchain.com/oss/python/langchain/middleware/built-in

Analogy: an agent and a client are chatting, and the conversation is kept in memory.
After a while the chat history becomes large, so we use the conversation summarization middleware to condense the older messages.
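
The analogy can be sketched in plain Python. This is an illustration under stated assumptions, not how the real middleware is implemented: `summarize_history` is a hypothetical helper, the summary is a placeholder string where a real summarizer would call an LLM, and the "summarize at 10, keep the last 4" numbers simply mirror the settings used later in this notebook.

```python
def summarize_history(messages, trigger=10, keep=4):
    """Collapse older messages once the history reaches `trigger` entries.

    Hypothetical helper: the summary below is a placeholder, whereas a real
    summarizer would ask an LLM to condense the older messages.
    """
    if keep < 1:
        raise ValueError("keep must be at least 1")
    if len(messages) < trigger:
        return list(messages)
    older, recent = messages[:-keep], messages[-keep:]
    summary = f"[summary of {len(older)} earlier messages]"
    return [summary] + recent


history = [f"message {i}" for i in range(1, 11)]
compact = summarize_history(history)
print(compact)
```

With ten messages in the history, the six oldest are replaced by one summary entry and the four newest are kept unchanged.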

In [None]:
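import os
from dotenv import load_dotenv

load_dotenv()  # read GROQ_API_KEY and GEMINI_API_KEY from a local .env file
for key in ("GROQ_API_KEY", "GEMINI_API_KEY"):
    value = os.getenv(key)
    if value:  # os.environ only accepts strings, so skip missing keys
        os.environ[key] = value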

### Summarization Middleware

In [None]:
from langchain.agents import create_agent
from langchain.agents.middleware import SummarizationMiddleware
from langchain_core.messages import HumanMessage
from langgraph.checkpoint.memory import InMemorySaver

agent = create_agent(
    model="groq:llama-3.1-8b-instant",
    checkpointer=InMemorySaver(),
    middleware=[
        # Once the history reaches 10 messages, summarize the older ones
        # and keep the 4 most recent messages as they are.
        SummarizationMiddleware(
            model="groq:llama-3.1-8b-instant",
            trigger=("messages", 10),
            keep=("messages", 4),
        ),
    ],
)
agent

In [None]:

### Run with thread id
config={"configurable":{"thread_id":"test-1"}}

In [None]:
# Alternative test data
questions = [
    "What is 2+2?",
    "What is 10*5?",
    "What is 100/4?",
    "What is 15-7?",
    "What is 3*3?",
    "What is 4*4?",
]

for q in questions:
    response = agent.invoke({"messages": [HumanMessage(content=q)]}, config)
    print(f"Response: {response}")
    print(f"Message count: {len(response['messages'])}")

### Triggering on Token Count (with a Tool)

In [None]:
from langchain.agents import create_agent
from langchain.tools import tool
@tool
def search_hotels(city: str) -> str:
    """Search hotels - returns long response to use more tokens."""
    return f"""Hotels in {city}:
    1. Grand Hotel - 5 star, $350/night, spa, pool, gym
    2. City Inn - 4 star, $180/night, business center
    3. Budget Stay - 3 star, $75/night, free wifi"""

agent=create_agent(
    model="groq:llama-3.1-8b-instant",
    tools=[search_hotels],
    checkpointer=InMemorySaver(),
    middleware=[
        SummarizationMiddleware(
            # model="groq:llama-3.1-8b-instant",
            model = "groq:llama-3.1-70b-versatile",
            trigger=("tokens",550),
            keep=("tokens",200),
        )
    ]
)

config={"configurable":{"thread_id":"test-1"}}
# Defining config here means every request below shares one thread (one memory),
# so tokens accumulate across cities and summarization eventually triggers.
# Defining it inside the loop in the next cell gives each city its own memory.

def count_tokens(messages):
    """Rough token estimate: about 4 characters per token."""
    total_chars = sum(len(str(m.content)) for m in messages)
    return total_chars // 4


In [None]:
# Run test
cities = ["Paris", "London", "Tokyo", "New York", "Dubai", "Singapore"]

for city in cities:
    # To give every city its own memory, use a per-city thread ID instead:
    # config = {"configurable": {"thread_id": f"test-{city}"}}
    response = agent.invoke(
        {"messages": [HumanMessage(content=f"Find hotels in {city}")]},
        config,
    )

    tokens = count_tokens(response["messages"])
    print(f"{city}: ~{tokens} tokens, {len(response['messages'])} messages")
    print(response["messages"])
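
A token-based `keep` limit can be pictured as walking backwards from the newest message and keeping as many as fit a budget. The sketch below is a simplified illustration, not the middleware's actual algorithm: `estimate_tokens` and `split_by_token_budget` are hypothetical helpers, messages are plain strings, and tokens are estimated with the same 4-characters-per-token heuristic used by `count_tokens` above.

```python
def estimate_tokens(text):
    """Rough estimate: about 4 characters per token (same heuristic as above)."""
    return len(text) // 4


def split_by_token_budget(messages, keep_tokens):
    """Split messages into (older, kept), where `kept` holds the newest
    messages whose combined estimate fits within `keep_tokens`."""
    kept = []
    used = 0
    for message in reversed(messages):
        cost = estimate_tokens(message)
        if used + cost > keep_tokens:
            break
        kept.append(message)
        used += cost
    kept.reverse()  # restore chronological order
    older = messages[: len(messages) - len(kept)]
    return older, kept


messages = ["a" * 400, "b" * 400, "c" * 400, "d" * 400]  # ~100 tokens each
older, kept = split_by_token_budget(messages, keep_tokens=200)
print(len(older), len(kept))
```

In this sketch the `older` part is what would be handed to the summarizer, while `kept` stays in the history verbatim.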

Example: fraction-based trigger and keep limits

In [None]:
from langchain.agents import create_agent
from langchain.agents.middleware import SummarizationMiddleware
from langchain_core.tools import tool
from langchain_core.messages import HumanMessage
from langgraph.checkpoint.memory import InMemorySaver

@tool
def search_hotels(city: str) -> str:
    """Search hotels."""
    return f"Hotels in {city}: Grand Hotel $350, City Inn $180, Budget Stay $75"

# Low thresholds so that summarization triggers quickly during testing
agent = create_agent(
    model="groq:llama-3.3-70b-versatile", 
    tools=[search_hotels],
    checkpointer=InMemorySaver(),
    system_prompt="You are a hotel assistant. Use ONLY the 'search_hotels' tool. Do not use external search.",
    middleware=[
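        # Fraction-based limits did not work here because no token limit is
        # available for this model, so fixed limits are used below instead.
        # SummarizationMiddleware(
        #     model="groq:llama-3.1-70b-versatile",
        #     trigger=("fraction", 0.005),  # 0.5% of 128,000 = 640 tokens
        #     keep=("fraction", 0.002),     # 0.2% of 128,000 = 256 tokens
        # ),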
        SummarizationMiddleware(
            model="groq:llama-3.1-8b-instant",
            # Trigger summarization at 640 tokens
            trigger=("tokens", 640), 
            # Keep at least 6 messages to preserve tool-response pairs
            keep=("messages", 6),    
        )
    ],
)

config = {"configurable": {"thread_id": "test-1"}}

# Token counter
def count_tokens(messages):
    return sum(len(str(m.content)) for m in messages) // 4

# Test
cities = ["Paris", "London", "Tokyo", "New York", "Dubai", "Singapore"]

for city in cities:
    response = agent.invoke(
        {"messages": [HumanMessage(content=f"Hotels in {city}")]},
        config
    )
    tokens = count_tokens(response["messages"])
    fraction = tokens / 128000  # share of an assumed 128k-token context window
    print(f"{city}: ~{tokens} tokens ({fraction:.4%}), {len(response['messages'])} msgs")
    print(response['messages'])
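
The relationship between fractions and token counts used in the cell above is simple arithmetic. The sketch below assumes the same 128,000-token context window as that cell; `fraction_to_tokens` is a hypothetical helper, not part of LangChain.

```python
CONTEXT_WINDOW = 128_000  # assumption: same context size as in the cell above


def fraction_to_tokens(fraction, context_window=CONTEXT_WINDOW):
    """Convert a fraction of the context window into a token count."""
    return round(fraction * context_window)


trigger_tokens = fraction_to_tokens(0.005)  # 0.5% of the window
keep_tokens = fraction_to_tokens(0.002)     # 0.2% of the window
print(trigger_tokens, keep_tokens)
```

This is why the fixed `("tokens", 640)` trigger in the active configuration corresponds to the commented-out `("fraction", 0.005)` setting.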

The cell above may fail with an error caused by a mismatch between the tool the model tries to call
and the tools registered with the agent. Specifically, the Groq model
may attempt to use a tool named brave_search (likely due to
a pre-trained bias toward searching), while create_agent only has search_hotels in its toolset.
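
One way to see this kind of mismatch is to compare the tool names the model asked for with the names the agent actually provides. The sketch below is a standalone illustration: `find_unknown_tool_calls` is a hypothetical helper, and the tool calls are hand-written dictionaries shaped like the `name`/`args` entries seen in the message output, not real model output.

```python
def find_unknown_tool_calls(tool_calls, registered_tools):
    """Return the tool names the model requested that the agent does not provide."""
    registered = set(registered_tools)
    return [call["name"] for call in tool_calls if call["name"] not in registered]


registered_tools = ["search_hotels"]
tool_calls = [
    {"name": "brave_search", "args": {"query": "hotels in Paris"}},  # not registered
    {"name": "search_hotels", "args": {"city": "Paris"}},
]
unknown = find_unknown_tool_calls(tool_calls, registered_tools)
print(unknown)
```

The system prompt in the agent above ("Use ONLY the 'search_hotels' tool") addresses the same problem from the prompt side.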

### Human-in-the-Loop Middleware

Pause agent execution for human approval, editing, or rejection of tool calls before they execute. 

Human-in-the-loop is useful for the following:

- High-stakes operations requiring human approval (e.g. database writes, financial transactions).
- Compliance workflows where human oversight is mandatory.
- Long-running conversations where human feedback guides the agent.
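
The approve, edit, and reject outcomes can be sketched without any agent framework. This is a conceptual illustration only: `apply_decision` and `send_email` are hypothetical, and the decision dictionaries merely imitate the `type` and `edited_action` shape used in the cells of this notebook.

```python
def apply_decision(tool_call, decision, tools):
    """Resolve a paused tool call according to a human decision."""
    kind = decision["type"]
    if kind == "approve":
        # Run the tool exactly as the model requested.
        return tools[tool_call["name"]](**tool_call["args"])
    if kind == "edit":
        # Run the tool with the human-corrected name and arguments.
        edited = decision["edited_action"]
        return tools[edited["name"]](**edited["args"])
    if kind == "reject":
        # Skip the tool and report the rejection instead.
        return "Tool call rejected by reviewer."
    raise ValueError(f"Unknown decision type: {kind}")


def send_email(recipient, subject, body):
    """Mock tool: pretends to send an email."""
    return f"Email sent to {recipient} with subject {subject}"


tools = {"send_email": send_email}
pending = {
    "name": "send_email",
    "args": {"recipient": "wrong@email.com", "subject": "Test", "body": "Hello"},
}

approved = apply_decision(pending, {"type": "approve"}, tools)
rejected = apply_decision(pending, {"type": "reject"}, tools)
edited = apply_decision(
    pending,
    {
        "type": "edit",
        "edited_action": {
            "name": "send_email",
            "args": {"recipient": "correct@email.com", "subject": "Test", "body": "Hello"},
        },
    },
    tools,
)
print(approved, rejected, edited, sep="\n")
```

The key point is that the tool never runs until a decision arrives, which is what makes the pattern suitable for high-stakes operations.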

In [227]:
from langchain.agents import create_agent
from langchain.agents.middleware import HumanInTheLoopMiddleware
from langgraph.checkpoint.memory import InMemorySaver


# In software testing, a mock function is a fake,
# controllable version of a real function.

def read_email_tool(email_id: str) -> str:
    """Mock function to read an email by its ID."""
    return f"Email content for ID: {email_id}"

def send_email_tool(recipient: str, subject: str, body: str) -> str:
    """Mock function to send an email to a recipient."""
    return f"Email sent to {recipient} with subject {subject} and body {body}"


In [None]:
from langchain.agents.middleware import HumanInTheLoopMiddleware

agent = create_agent(
    model="groq:llama-3.1-8b-instant",
    tools=[send_email_tool, read_email_tool],
    checkpointer=InMemorySaver(),
    middleware=[
        HumanInTheLoopMiddleware(
            interrupt_on={
                # Pause before sending so a human can approve, reject, or edit.
                "send_email_tool": {"allowed_decisions": ["approve", "reject", "edit"]},
                # Reading an email needs no interruption.
                "read_email_tool": False,
            }
        ),
    ],
)
agent

In [None]:
config = {"configurable": {"thread_id": "test-approve"}}
# Step 1: Request
result = agent.invoke(
    {"messages": [HumanMessage(content="Send email to john@test.com with subject 'Hello' and body 'How are you?'")]},
    config=config
)
# print(result)
print(result['__interrupt__'])

In [None]:
from langgraph.types import Command
# Step 2: Approve
if "__interrupt__" in result:
    print("⏸️ Paused! Approving...")
    
    result = agent.invoke(
        Command(
            resume={
                "decisions": [
                    {"type": "approve"}
                ]
            }
        ),
        config=config
    )
    
    print(f"✅ Result: {result['messages'][-1].content}")

In [None]:
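from langgraph.types import Command
# Step 2 (alternative): Reject
# Run this instead of the approve cell: once a decision has been sent,
# `result` typically no longer contains "__interrupt__".
if "__interrupt__" in result:
    print("⏸️ Paused! Rejecting...")

    result = agent.invoke(
        Command(
            resume={
                "decisions": [
                    {"type": "reject"}
                ]
            }
        ),
        config=config
    )

    print(f"❌ Result: {result['messages'][-1].content}")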

#### Editing

In [228]:
from langchain.agents.middleware import HumanInTheLoopMiddleware

agent = create_agent(
    model="groq:llama-3.1-8b-instant",
    tools=[send_email_tool, read_email_tool],
    checkpointer=InMemorySaver(),
    middleware=[
        HumanInTheLoopMiddleware(
            interrupt_on={
                "send_email_tool": {"allowed_decisions": ["approve", "reject", "edit"]},
                # No interruption
                "read_email_tool": False,
            }
        ),
    ],
)

In [229]:
config = {"configurable": {"thread_id": "test-edit"}}

# Step 1: Request (with wrong info)
result = agent.invoke(
    {"messages": [HumanMessage(content="Send email to wrong@email.com with subject 'Test' and body 'Hello'")]},
    config=config
)

In [230]:

print(result)
print(result['__interrupt__'])

{'messages': [HumanMessage(content="Send email to wrong@email.com with subject 'Test' and body 'Hello'", additional_kwargs={}, response_metadata={}, id='eceeb23f-50b3-4a08-8f99-2cbad895f4c0'), AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'xqea5nmrr', 'function': {'arguments': '{"body":"Hello","recipient":"wrong@email.com","subject":"Test"}', 'name': 'send_email_tool'}, 'type': 'function'}]}, response_metadata={'token_usage': {'completion_tokens': 30, 'prompt_tokens': 313, 'total_tokens': 343, 'completion_time': 0.030698407, 'completion_tokens_details': None, 'prompt_time': 0.473867629, 'prompt_tokens_details': None, 'queue_time': 0.069045209, 'total_time': 0.504566036}, 'model_name': 'llama-3.1-8b-instant', 'system_fingerprint': 'fp_1151d4f23c', 'service_tier': 'on_demand', 'finish_reason': 'tool_calls', 'logprobs': None, 'model_provider': 'groq'}, id='lc_run--019bd1e9-8757-7140-bbf1-8e74e28a3f6f-0', tool_calls=[{'name': 'send_email_tool', 'args': {'body': 'Hello', 'r

In [None]:
# TODO: this cell raised an error when run and still needs checking.

# # Step 2: Edit and approve
# if "__interrupt__" in result:
#     print("⏸️ Paused! Editing...")
    
#     result = agent.invoke(
#         Command(
#             resume={
#                 "decisions": [
#                     {
#                         "type": "edit",
#                         "edited_action": {
#                             "name": "send_email_tool",
#                             "args": {
#                                 "recipient": "correct@email.com",
#                                 "subject": "Corrected subject",
#                                 "body": "This was edited by human before sending"
#                             }
#                         }
#                     }
#                 ]
#             }
#         ),
#         config=config
#     )


#     print(result)
#     print(f"✏️ Result: {result['messages'][-1].content}")

⏸️ Paused! Editing...
✏️ Result: 
