# The Agent Loop: Building Production Agents with LangChain 1.0

In this notebook, we'll explore the foundational concepts of AI agents and learn how to build production-grade agents using LangChain's new `create_agent` abstraction with middleware support.

**Learning Objectives:**
- Understand what an "agent" is and how the agent loop works
- Learn the core constructs of LangChain (Runnables, LCEL)
- Master the `create_agent` function and middleware system
- Build an agentic RAG application using Qdrant

## Table of Contents:

- **Breakout Room #1:** Introduction to LangChain, LangSmith, and `create_agent`
  - Task 1: Dependencies
  - Task 2: Environment Variables
  - Task 3: LangChain Core Concepts (Runnables & LCEL)
  - Task 4: Understanding the Agent Loop
  - Task 5: Building Your First Agent with `create_agent()`
  - Question #1 & Question #2
  - Activity #1: Create a Custom Tool

- **Breakout Room #2:** Middleware - Agentic RAG with Qdrant
  - Task 6: Loading & Chunking Documents
  - Task 7: Setting up Qdrant Vector Database
  - Task 8: Creating a RAG Tool
  - Task 9: Introduction to Middleware
  - Task 10: Building Agentic RAG with Middleware
  - Question #3 & Question #4
  - Activity #2: Enhance the Agent

---
# 🤝 Breakout Room #1
## Introduction to LangChain, LangSmith, and `create_agent`

## Task 1: Dependencies

First, let's ensure we have all the required packages installed. We'll be using:

- **LangChain 1.0+**: The core framework with the new `create_agent` API
- **LangChain-OpenAI**: OpenAI model integrations
- **LangSmith**: Observability and tracing
- **Qdrant**: Vector database for RAG
- **tiktoken**: Token counting for text splitting

In [None]:
# Run this cell to install dependencies (if not using uv sync)
# !pip install "langchain>=1.0.0" langchain-openai langsmith langgraph qdrant-client langchain-qdrant tiktoken nest-asyncio

In [4]:
# Core imports we'll use throughout the notebook
import os
import getpass
from uuid import uuid4

import nest_asyncio
nest_asyncio.apply()  # Required for async operations in Jupyter

## Task 2: Environment Variables

We need to set up our API keys for:
1. **OpenAI** - For the `gpt-5-nano` model used throughout this notebook
2. **LangSmith** - For tracing and observability (optional but recommended)

In [5]:
# Set OpenAI API Key
os.environ["OPENAI_API_KEY"] = getpass.getpass("OpenAI API Key: ")

In [4]:
# Optional: Set up LangSmith for tracing
# This provides powerful debugging and observability for your agents

os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_PROJECT"] = f"AIE9 - The Agent Loop - {uuid4().hex[0:8]}"
os.environ["LANGCHAIN_API_KEY"] = getpass.getpass("LangSmith API Key (press Enter to skip): ") or ""

if not os.environ["LANGCHAIN_API_KEY"]:
    os.environ["LANGCHAIN_TRACING_V2"] = "false"
    print("LangSmith tracing disabled")
else:
    print(f"LangSmith tracing enabled. Project: {os.environ['LANGCHAIN_PROJECT']}")

LangSmith tracing enabled. Project: AIE9 - The Agent Loop - 897a3674


## Task 3: LangChain Core Concepts

Before diving into agents, let's understand the fundamental building blocks of LangChain.

### What is a Runnable?

A **Runnable** is the core abstraction in LangChain - think of it as a standardized component that:
- Takes an input
- Performs some operation
- Returns an output

Every component in LangChain (models, prompts, retrievers, parsers) is a Runnable, which means they all share the same interface:

```python
result = runnable.invoke(input)           # Single input
results = runnable.batch([input1, input2]) # Multiple inputs
for chunk in runnable.stream(input):       # Streaming
    print(chunk)
```

### What is LCEL (LangChain Expression Language)?

**LCEL** allows you to chain Runnables together using the `|` (pipe) operator:

```python
chain = prompt | model | output_parser
result = chain.invoke({"query": "Hello!"})
```

This is similar to Unix pipes - the output of one component becomes the input to the next.
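To make the pipe idea concrete, here is a minimal pure-Python sketch of how a `|` operator can compose steps. It is not LangChain's implementation: the `Step` class and the `fake_prompt`, `fake_model`, and `fake_parser` objects are hypothetical stand-ins that only mimic the shape of `prompt | model | output_parser`.

```python
class Step:
    """A toy stand-in for a Runnable: wraps a function and supports `|`."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        return self.func(value)

    def batch(self, values):
        return [self.invoke(v) for v in values]

    def __or__(self, other):
        # `a | b` builds a new Step that feeds a's output into b.
        return Step(lambda value: other.invoke(self.invoke(value)))


# Hypothetical stand-ins for prompt | model | output_parser
fake_prompt = Step(lambda inputs: f"Question: {inputs['question']}")
fake_model = Step(lambda text: {"content": text.upper()})
fake_parser = Step(lambda message: message["content"])

toy_chain = fake_prompt | fake_model | fake_parser
print(toy_chain.invoke({"question": "hello"}))
```

Because every step exposes the same `invoke` interface, the composed chain is itself a step and can be piped further or batched.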

In [33]:
# Let's see LCEL in action with a simple example
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

# Create our components (each is a Runnable)
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant that speaks like a pirate."),
    ("human", "{question}")
])

model = ChatOpenAI(model="gpt-5-nano", temperature=0.7)

output_parser = StrOutputParser()

# Chain them together with LCEL
pirate_chain = prompt | model | output_parser

In [26]:
# Invoke the chain
response = pirate_chain.invoke({"question": "What is the capital of France?"})
print(response)

Arrr, that be Paris!


## Task 4: Understanding the Agent Loop

### What is an Agent?

An **agent** is a system that uses an LLM to decide what actions to take. Unlike a simple chain that follows a fixed sequence, an agent can:

1. **Reason** about what to do next
2. **Take actions** by calling tools
3. **Observe** the results
4. **Iterate** until the task is complete

### The Agent Loop

The core of every agent is the **agent loop**:

```
                          AGENT LOOP                         
                                                             
      +----------+     +----------+     +----------+         
      |  Model   | --> |   Tool   | --> |  Model   | --> ... 
      |   Call   |     |   Call   |     |   Call   |         
      +----------+     +----------+     +----------+         
           |                                  |              
           v                                  v              
      "Use search"                   "Here's the answer"     
```

1. **Model Call**: The LLM receives the current state and decides whether to:
   - Call a tool (continue the loop)
   - Return a final answer (exit the loop)

2. **Tool Call**: If the model decides to use a tool, the tool is executed and its output is added to the conversation

3. **Repeat**: The loop continues until the model decides it has enough information to answer
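The three steps above can be sketched in plain Python. This is a toy illustration under stated assumptions, not how `create_agent` is implemented: `fake_model`, the `TOOLS` registry, and the dict-based message format are all hypothetical.

```python
def fake_model(messages):
    """Hypothetical model: requests a tool once, then answers."""
    last = messages[-1]
    if last["role"] == "tool":
        return {"role": "ai", "content": f"The answer is {last['content']}.", "tool_calls": []}
    return {
        "role": "ai",
        "content": "",
        "tool_calls": [{"name": "multiply", "args": {"a": 25, "b": 48}}],
    }


TOOLS = {"multiply": lambda a, b: a * b}


def run_agent_loop(user_input, max_steps=5):
    messages = [{"role": "user", "content": user_input}]
    for _ in range(max_steps):
        ai_message = fake_model(messages)        # 1. model call
        messages.append(ai_message)
        if not ai_message["tool_calls"]:         # no tool calls: exit the loop
            return messages
        for call in ai_message["tool_calls"]:    # 2. tool call(s)
            result = TOOLS[call["name"]](**call["args"])
            messages.append({"role": "tool", "content": str(result)})
    raise RuntimeError("Step limit reached without a final answer")


history = run_agent_loop("What is 25 * 48?")
print([m["role"] for m in history])
print(history[-1]["content"])
```

The `max_steps` guard is a design choice worth keeping in any real loop: without it, a model that keeps requesting tools would never terminate.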

### Why `create_agent`?

LangChain 1.0 introduced `create_agent` as the new standard way to build agents. It provides:

- **Simplified API**: One function to create production-ready agents
- **Middleware Support**: Hook into any point in the agent loop
- **Built on LangGraph**: Uses the battle-tested LangGraph runtime under the hood

## Task 5: Building Your First Agent with `create_agent()`

Let's build a simple agent that can perform calculations and tell the time.

### Step 1: Define Tools

Tools are functions that the agent can call. We use the `@tool` decorator to create them.

In [22]:
from langchain_core.tools import tool

@tool
def calculate(expression: str) -> str:
    """Evaluate a mathematical expression. Use this for any math calculations.
    
    Args:
        expression: A mathematical expression to evaluate (e.g., '2 + 2', '10 * 5')
    """
    try:
        # Empty builtins narrow what eval can reach, but this is NOT a real sandbox;
        # never pass untrusted input to eval in production.
        result = eval(expression, {"__builtins__": {}}, {})
        return f"The result of {expression} is {result}"
    except Exception as e:
        return f"Error evaluating expression: {e}"

@tool
def get_current_time() -> str:
    """Get the current date and time. Use this when the user asks about the current time or date."""
    from datetime import datetime
    return f"The current date and time is: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}"

# Create our tool belt
tools = [calculate, get_current_time]

print("Tools created:")
for t in tools:
    print(f"  - {t.name}: {t.description[:60]}...")

Tools created:
  - calculate: Evaluate a mathematical expression. Use this for any math ca...
  - get_current_time: Get the current date and time. Use this when the user asks a...
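A note on `eval`: emptying `__builtins__` does not make it safe, because attribute access on literals (for example `().__class__`) can still reach arbitrary objects. One possible alternative, sketched below, is to parse the expression with the standard-library `ast` module and allow only numeric literals and basic arithmetic. The `safe_eval` helper is a hypothetical name, not part of LangChain, and it deliberately supports only `+`, `-`, `*`, and `/`.

```python
import ast
import operator

_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def safe_eval(expression: str):
    """Evaluate +, -, *, / on numeric literals; reject everything else."""

    def _eval(node):
        # Only plain int/float literals are accepted (bools and strings are not).
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](_eval(node.operand))
        raise ValueError(f"Unsupported expression element: {type(node).__name__}")

    return _eval(ast.parse(expression, mode="eval").body)


print(safe_eval("25 * 48"))
```

Function calls, names, and attribute access all raise `ValueError`, so an input such as `__import__('os')` is rejected before anything runs.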


### Step 2: Create the Agent

Now we use `create_agent` to build our agent. The function takes:
- `model`: The LLM to use (can be a string like `"gpt-5"` or a model instance)
- `tools`: List of tools the agent can use
- `system_prompt`: Optional system prompt to customize behavior

In [34]:
from langchain.agents import create_agent

# Create our first agent
simple_agent = create_agent(
    model="gpt-5-nano",
    tools=tools,
    system_prompt="You are a helpful assistant that can perform calculations and tell the time. Always explain your reasoning."
)

print("Agent created successfully!")
print(f"Type: {type(simple_agent)}")

Agent created successfully!
Type: <class 'langgraph.graph.state.CompiledStateGraph'>


### Step 3: Run the Agent

The agent is a Runnable, so we can invoke it like any other LangChain component.

In [10]:
# Test the agent with a simple calculation
response = simple_agent.invoke(
    {"messages": [{"role": "user", "content": "What is 25 * 48?"}]}
)

# Print the final response
print("Agent Response:")
print(response["messages"][-1].content)

Agent Response:
25 × 48 = 1200.

Reasoning:
- 25 × 48 = 25 × (50 − 2) = 25×50 − 25×2 = 1250 − 50 = 1200.


In [11]:
# Test with a multi-step question that requires multiple tool calls
response = simple_agent.invoke(
    {"messages": [{"role": "user", "content": "What time is it, and what is 100 divided by the current hour?"}]}
)

print("Agent Response:")
print(response["messages"][-1].content)

Agent Response:
- Current time: 2026-01-19 15:13:29 (3:13:29 PM).
- Calculation: The current hour is 15, so 100 ÷ 15 = 6.6666666667 (repeating), i.e., 20/3.


In [12]:
# Let's see the full conversation to understand the agent loop
print("Full Agent Conversation:")
print("=" * 50)
for msg in response["messages"]:
    role = msg.type if hasattr(msg, 'type') else 'unknown'
    content = msg.content if hasattr(msg, 'content') else str(msg)
    print(f"\n[{role.upper()}]")
    print(content[:500] if len(str(content)) > 500 else content)

Full Agent Conversation:

[HUMAN]
What time is it, and what is 100 divided by the current hour?

[AI]


[TOOL]
The current date and time is: 2026-01-19 15:13:29

[AI]


[TOOL]
The result of 100 / 15 is 6.666666666666667

[AI]
- Current time: 2026-01-19 15:13:29 (3:13:29 PM).
- Calculation: The current hour is 15, so 100 ÷ 15 = 6.6666666667 (repeating), i.e., 20/3.


### Streaming Agent Responses

For better UX, we can stream the agent's responses as they're generated.

In [13]:
# Stream the agent's response
print("Streaming Agent Response:")
print("=" * 50)

for chunk in simple_agent.stream(
    {"messages": [{"role": "user", "content": "Calculate 15% of 250"}]},
    stream_mode="updates"
):
    for node, values in chunk.items():
        print(f"\n[Node: {node}]")
        if "messages" in values:
            for msg in values["messages"]:
                if hasattr(msg, 'content') and msg.content:
                    print(msg.content)

Streaming Agent Response:

[Node: model]

[Node: tools]
The result of 0.15 * 250 is 37.5

[Node: model]
To find 15% of 250, multiply 250 by 0.15:
250 × 0.15 = 37.5

Answer: 37.5


---
## ❓ Question #1:

In the agent loop, what determines whether the agent continues to call tools or returns a final answer to the user? How does `create_agent` handle this decision internally?

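##### ✅ Answer:
The model itself makes the decision on every turn. After each model call, `create_agent` inspects the returned AI message: if it contains tool calls, the loop routes to the tools node, executes them, appends the results to the message history, and calls the model again. If it contains no tool calls, the message is treated as the final answer and the loop exits. This follows the ReAct (Reason + Act) pattern. The agent has no separate notion of being "done"; it simply follows this protocol until the exit condition (a response without tool calls) is met or a limit stops it.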

## ❓ Question #2:

Looking at the `calculate` and `get_current_time` tools we created, why is the **docstring** so important for each tool? How does the agent use this information when deciding which tool to call?

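##### ✅ Answer:
The docstring becomes the tool's description, which is sent to the LLM along with the tool's name and argument schema. The model never sees the function body, so this description is its main guide to the tool's purpose and usage.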

It is used as:
- Discovery: The agent scans docstrings to match the user's intent to a specific tool.
- Syntax: It defines what arguments the model must generate (e.g., "units must be Celsius").
- Discrimination: It prevents "tool confusion" by clarifying when to use one tool over another.
- Reliability: Vague docstrings cause hallucinations; precise ones ensure the correct tool is triggered.
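To illustrate why the docstring matters, the sketch below derives a simplified tool description from a function's name, signature, and docstring using the standard-library `inspect` module. LangChain's `@tool` decorator does considerably more (it builds a full argument schema), so `describe_tool` and `count_words` are hypothetical names used only for illustration.

```python
import inspect


def describe_tool(func):
    """Build a simplified, hypothetical tool spec from a function's metadata."""
    signature = inspect.signature(func)
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "parameters": {
            name: getattr(param.annotation, "__name__", str(param.annotation))
            for name, param in signature.parameters.items()
        },
    }


def count_words(text: str) -> int:
    """Count the words in a piece of text. Use this for word-count questions."""
    return len(text.split())


spec = describe_tool(count_words)
print(spec)
```

Everything in `spec` comes from metadata the author wrote; a function with no docstring would leave the model with an empty description and only the name to go on.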

---
## 🏗️ Activity #1: Create a Custom Tool

Create your own custom tool and add it to the agent! 

Ideas:
- A tool that converts temperatures between Celsius and Fahrenheit
- A tool that generates a random number within a range
- A tool that counts words in a given text

Requirements:
1. Use the `@tool` decorator
2. Include a clear docstring (this is what the agent sees!)
3. Add it to the agent and test it

In [None]:
from langchain.tools import tool

@tool
def convert_currency(amount: float, from_currency: str, to_currency: str) -> str:
    """
    Converts currency between Euro (EUR) and Japanese Yen (JPY).
    Use this tool when the user wants to know how much money is worth in EUR or JPY.
    Arguments:
    - amount: The numerical value to convert.
    - from_currency: The currency code you are converting FROM (EUR or JPY).
    - to_currency: The currency code you are converting TO (EUR or JPY).
    """
    eur_to_jpy_rate = 183.08
    
    from_curr = from_currency.upper()
    to_curr = to_currency.upper()

    if from_curr == "EUR" and to_curr == "JPY":
        result = amount * eur_to_jpy_rate
        return f"{amount} EUR is approximately {result:.2f} JPY."
    
    elif from_curr == "JPY" and to_curr == "EUR":
        result = amount / eur_to_jpy_rate
        return f"{amount} JPY is approximately {result:.2f} EUR."
    
    else:
        return "Error: This tool only supports conversions between EUR and JPY."

tools = [convert_currency]

print("Agent updated with Currency Converter!")

Agent updated with Currency Converter!


In [2]:
from langchain.agents import create_agent

simple_agent = create_agent(
    model="gpt-5-nano",
    tools=tools,
    system_prompt="""You are a helpful financial assistant. 
    Use the convert_currency tool for any money-related questions involving Euros or Yen. 
    Always explain your reasoning and state the exchange rate used."""
)

print("Agent created successfully!")
print(f"Type: {type(simple_agent)}")

Agent created successfully!
Type: <class 'langgraph.graph.state.CompiledStateGraph'>


In [7]:
response = simple_agent.invoke(
    {"messages": [{"role": "user", "content": "How many Japanese Yen is 50 Euros?"}]}
)

print("Agent Response:")
print(response["messages"][-1].content)

Agent Response:
Approximately 9,154 Japanese Yen.

Reasoning:
- The conversion result indicates 50 EUR ≈ 9,154 JPY.
- Implied exchange rate: 9,154 JPY / 50 EUR = 183.08 JPY per EUR.
- Calculation: 50 EUR × 183.08 JPY/EUR ≈ 9,154 JPY.

Note: Exchange rates fluctuate and may vary at the exact time of your transaction; fees could also affect the final amount.


In [8]:
# Test your custom tool with the agent
# Let's see the full conversation to understand the agent loop
print("Full Agent Conversation:")
print("=" * 50)
for msg in response["messages"]:
    role = msg.type if hasattr(msg, 'type') else 'unknown'
    content = msg.content if hasattr(msg, 'content') else str(msg)
    print(f"\n[{role.upper()}]")
    print(content[:500] if len(str(content)) > 500 else content)

Full Agent Conversation:

[HUMAN]
How many Japanese Yen is 50 Euros?

[AI]


[TOOL]
50.0 EUR is approximately 9154.00 JPY.

[AI]
Approximately 9,154 Japanese Yen.

Reasoning:
- The conversion result indicates 50 EUR ≈ 9,154 JPY.
- Implied exchange rate: 9,154 JPY / 50 EUR = 183.08 JPY per EUR.
- Calculation: 50 EUR × 183.08 JPY/EUR ≈ 9,154 JPY.

Note: Exchange rates fluctuate and may vary at the exact time of your transaction; fees could also affect the final amount.


---
# 🤝 Breakout Room #2
## Middleware - Agentic RAG with Qdrant

Now that we understand the basics of agents, let's build something more powerful: an **Agentic RAG** system.

Traditional RAG follows a fixed pattern: retrieve → generate. But **Agentic RAG** gives the agent control over when and how to retrieve information, making it more flexible and intelligent.

We'll also introduce **middleware** - hooks that let us customize the agent's behavior at every step.

## Task 6: Loading & Chunking Documents

We'll use the same Health & Wellness Guide from Session 2 to maintain continuity.

In [9]:
# Load the document using our aimakerspace utilities
from aimakerspace.text_utils import TextFileLoader, CharacterTextSplitter

# Load the document
text_loader = TextFileLoader("data/HealthWellnessGuide.txt")
documents = text_loader.load_documents()

print(f"Loaded {len(documents)} document(s)")
print(f"Total characters: {sum(len(doc) for doc in documents):,}")

Loaded 1 document(s)
Total characters: 16,206


In [10]:
# Split the documents into chunks
text_splitter = CharacterTextSplitter(
    chunk_size=500,
    chunk_overlap=100
)

chunks = text_splitter.split_texts(documents)

print(f"Split into {len(chunks)} chunks")
print(f"\nSample chunk:")
print("-" * 50)
print(chunks[0][:300] + "...")

Split into 41 chunks

Sample chunk:
--------------------------------------------------
The Personal Wellness Guide
A Comprehensive Resource for Health and Well-being

PART 1: EXERCISE AND MOVEMENT

Chapter 1: Understanding Exercise Basics

Exercise is one of the most important things you can do for your health. Regular physical activity can improve your brain health, help manage weigh...
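The chunking step can be sketched as a fixed-size sliding window. This assumes the splitter simply steps through the text by `chunk_size - chunk_overlap` characters, which is an assumption about `CharacterTextSplitter` rather than a description of its code, although it is consistent with the 41 chunks reported above for 16,206 characters. `split_text` is a hypothetical name.

```python
def split_text(text: str, chunk_size: int = 500, chunk_overlap: int = 100) -> list[str]:
    """Slide a fixed-size window over the text, stepping by size minus overlap."""
    step = chunk_size - chunk_overlap
    if step <= 0:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    return [text[start:start + chunk_size] for start in range(0, len(text), step)]


# A placeholder string with the same length as the wellness guide.
sample_chunks = split_text("x" * 16_206)
print(len(sample_chunks))
```

The overlap means the last 100 characters of one chunk reappear at the start of the next, which reduces the chance that a sentence relevant to a query is cut in half at a chunk boundary.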


## Task 7: Setting up Qdrant Vector Database

Qdrant is a production-ready vector database. We'll use an in-memory instance for development, but the same code works with a hosted Qdrant instance.

Key concepts:
- **Collection**: A namespace for storing vectors (like a table in SQL)
- **Points**: Individual vectors with optional payloads (metadata)
- **Distance**: How similarity is measured (we'll use cosine similarity)
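As a quick illustration of the distance metric, cosine similarity compares the direction of two vectors and ignores their length. The pure-Python sketch below uses tiny two-dimensional vectors for readability; it is not how Qdrant computes similarity internally, and real embeddings here have 1,536 dimensions.

```python
import math


def cosine_similarity(a, b):
    """Dot product of a and b divided by the product of their lengths."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


query = [1.0, 0.0]
print(cosine_similarity(query, [2.0, 0.0]))   # same direction
print(cosine_similarity(query, [0.0, 3.0]))   # orthogonal
print(cosine_similarity(query, [-1.0, 0.0]))  # opposite direction
```

Vectors pointing the same way score 1, unrelated (orthogonal) vectors score 0, and opposite vectors score -1, regardless of magnitude.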

In [11]:
os.environ['NO_PROXY'] = 'openaipublic.blob.core.windows.net'

In [12]:
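# Machine-specific CA bundle paths: point these at your own certificate file,
# or skip this cell if you don't need a custom CA bundle.
os.environ['REQUESTS_CA_BUNDLE'] = 'C:/Users/dznidaric/Documents/certs/cacert.pem'
os.environ['SSL_CERT_FILE'] = 'C:/Users/dznidaric/Documents/certs/cacert.pem'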

In [13]:
from langchain_openai import OpenAIEmbeddings
from langchain_qdrant import QdrantVectorStore
from qdrant_client import QdrantClient
from qdrant_client.http.models import Distance, VectorParams

# Initialize the embedding model
embedding_model = OpenAIEmbeddings(model="text-embedding-3-small")

# Get embedding dimension
sample_embedding = embedding_model.embed_query("test")
embedding_dim = len(sample_embedding)
print(f"Embedding dimension: {embedding_dim}")

Embedding dimension: 1536


In [14]:
# Create Qdrant client (in-memory for development)
qdrant_client = QdrantClient(":memory:")

# Create a collection for our wellness documents
collection_name = "wellness_knowledge_base"

qdrant_client.create_collection(
    collection_name=collection_name,
    vectors_config=VectorParams(
        size=embedding_dim,
        distance=Distance.COSINE
    )
)

print(f"Created collection: {collection_name}")

Created collection: wellness_knowledge_base


In [15]:
# Create the vector store and add documents
from langchain_core.documents import Document

# Convert chunks to LangChain Document objects
langchain_docs = [Document(page_content=chunk) for chunk in chunks]

# Create vector store
vector_store = QdrantVectorStore(
    client=qdrant_client,
    collection_name=collection_name,
    embedding=embedding_model
)

# Add documents to the vector store
vector_store.add_documents(langchain_docs)

print(f"Added {len(langchain_docs)} documents to vector store")

Added 41 documents to vector store


In [16]:
# Test the retriever
retriever = vector_store.as_retriever(search_kwargs={"k": 3})

test_results = retriever.invoke("How can I improve my sleep?")

print("Retrieved documents:")
for i, doc in enumerate(test_results, 1):
    print(f"\n--- Document {i} ---")
    print(doc.page_content[:200] + "...")

Retrieved documents:

--- Document 1 ---
 memory and learning

Chapter 8: Improving Sleep Quality

Sleep hygiene refers to habits and practices that promote consistent, quality sleep.

Essential sleep hygiene practices:
- Maintain a consiste...

--- Document 2 ---
 Avoid caffeine after 2 PM
- Exercise regularly, but not too close to bedtime
- Limit alcohol and heavy meals before bed

Creating an optimal sleep environment:
- Temperature: 65-68 degrees Fahrenheit...

--- Document 3 ---
de for sunlight
4. Power pose for 2 minutes
5. Healthy snack (nuts, fruit)
6. Brief walk around the block
7. Upbeat music
8. Splash cold water on face

Sleep Checklist:
- Room temperature 65-68F
- Bla...


## Task 8: Creating a RAG Tool

Now we'll wrap our retriever as a tool that the agent can use. This is the key to **Agentic RAG** - the agent decides when to retrieve information.

In [20]:
from langchain_core.tools import tool

@tool
def search_wellness_knowledge(query: str) -> str:
    """Search the wellness knowledge base for information about health, fitness, nutrition, sleep, and mental wellness.
    
    Use this tool when the user asks questions about:
    - Physical health and fitness
    - Nutrition and diet
    - Sleep and rest
    - Mental health and stress management
    - General wellness tips
    
    Args:
        query: The search query to find relevant wellness information
    """
    results = retriever.invoke(query)
    
    if not results:
        return "No relevant information found in the wellness knowledge base."
    
    # Format the results
    formatted_results = []
    for i, doc in enumerate(results, 1):
        formatted_results.append(f"[Source {i}]:\n{doc.page_content}")
    
    return "\n\n".join(formatted_results)

print(f"Tool created: {search_wellness_knowledge.name}")
print(f"Description: {search_wellness_knowledge.description[:100]}...")

Tool created: search_wellness_knowledge
Description: Search the wellness knowledge base for information about health, fitness, nutrition, sleep, and ment...


## Task 9: Introduction to Middleware

**Middleware** in LangChain 1.0 allows you to hook into the agent loop at various points:

```
                       MIDDLEWARE HOOKS                 
                                                        
   +--------------+                    +--------------+ 
   | before_model | --> MODEL CALL --> | after_model  | 
   +--------------+                    +--------------+ 
                                                        
   +-------------------+                                
   | wrap_model_call   |  (intercept and modify calls)  
   +-------------------+                                
```

Common use cases:
- **Logging**: Track what the agent is doing
- **Guardrails**: Filter or modify inputs/outputs
- **Rate limiting**: Control API usage
- **Human-in-the-loop**: Pause for human approval

LangChain provides middleware through **decorator functions** that hook into specific points in the agent loop.

In [17]:
from langchain.agents.middleware import before_model, after_model

# Track how many model calls we've made
model_call_count = 0

@before_model
def log_before_model(state, runtime):
    """Called before each model invocation."""
    global model_call_count
    model_call_count += 1
    message_count = len(state.get("messages", []))
    print(f"[LOG] Model call #{model_call_count} - Messages in state: {message_count}")
    return None  # Return None to continue without modification

@after_model
def log_after_model(state, runtime):
    """Called after each model invocation."""
    last_message = state.get("messages", [])[-1] if state.get("messages") else None
    if last_message:
        has_tool_calls = hasattr(last_message, 'tool_calls') and last_message.tool_calls
        print(f"[LOG] After model - Tool calls requested: {has_tool_calls}")
    return None

print("Logging middleware created!")

Logging middleware created!
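To build intuition for how hooks like these wrap a model call, here is a toy pure-Python pipeline. It is a sketch under stated assumptions, not LangChain's middleware runtime: `run_with_hooks`, `echo_model`, and the list-of-strings message format are hypothetical, and hooks here take only `state` rather than `(state, runtime)`.

```python
def run_with_hooks(state, model, before_hooks=(), after_hooks=()):
    """Run one model call, letting each hook return a state update or None."""
    for hook in before_hooks:
        update = hook(state)
        if update:
            state = {**state, **update}
    state = {**state, "messages": state["messages"] + [model(state["messages"])]}
    for hook in after_hooks:
        update = hook(state)
        if update:
            state = {**state, **update}
    return state


events = []


def count_messages(state):
    events.append(f"before: {len(state['messages'])} message(s)")
    return None  # no state change


def redact_secret(state):
    # Example guardrail: replace the model's reply if it leaks a marker string.
    if "SECRET" in state["messages"][-1]:
        return {"messages": state["messages"][:-1] + ["[redacted]"]}
    return None


def echo_model(messages):
    # Hypothetical model that just echoes the latest message.
    return f"echo: {messages[-1]}"


final_state = run_with_hooks(
    {"messages": ["my SECRET is 42"]},
    echo_model,
    before_hooks=[count_messages],
    after_hooks=[redact_secret],
)
print(events)
print(final_state["messages"])
```

Returning `None` leaves the state untouched, while returning a dict merges an update into it, which mirrors the convention used by the logging hooks above.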


In [None]:
# You can also use the built-in ModelCallLimitMiddleware to prevent runaway agents
from langchain.agents.middleware import ModelCallLimitMiddleware

# This middleware will stop the agent after 10 model calls per thread
call_limiter = ModelCallLimitMiddleware(
    thread_limit=10,  # Max calls per conversation thread
    run_limit=5,      # Max calls per single run
    exit_behavior="end"  # What to do when limit is reached
)

print("Call limit middleware created!")
print(f"  - Thread limit: {call_limiter.thread_limit}")
print(f"  - Run limit: {call_limiter.run_limit}")

Call limit middleware created!
  - Thread limit: 10
  - Run limit: 5


## Task 10: Building Agentic RAG with Middleware

Now let's put it all together: an agentic RAG system with middleware support!

In [23]:
from langchain.agents import create_agent

# Reset the call counter
model_call_count = 0

# Define our tools - include the RAG tool and the calculator from earlier
rag_tools = [
    search_wellness_knowledge,
    calculate,
    get_current_time
]

# Create the agentic RAG system with middleware
wellness_agent = create_agent(
    model="gpt-5-nano",
    tools=rag_tools,
    system_prompt="""You are a helpful wellness assistant with access to a comprehensive health and wellness knowledge base.

Your role is to:
1. Answer questions about health, fitness, nutrition, sleep, and mental wellness
2. Always search the knowledge base when the user asks wellness-related questions
3. Provide accurate, helpful information based on the retrieved context
4. Be supportive and encouraging in your responses
5. If you cannot find relevant information, say so honestly

Remember: Always cite information from the knowledge base when applicable.""",
    middleware=[
        log_before_model,
        log_after_model,
        call_limiter
    ]
)

print("Wellness Agent created with middleware!")

Wellness Agent created with middleware!


In [24]:
# Test the wellness agent
print("Testing Wellness Agent")
print("=" * 50)

response = wellness_agent.invoke(
    {"messages": [{"role": "user", "content": "What are some tips for better sleep?"}]}
)

print("\n" + "=" * 50)
print("FINAL RESPONSE:")
print("=" * 50)
print(response["messages"][-1].content)

Testing Wellness Agent
[LOG] Model call #1 - Messages in state: 1
[LOG] After model - Tool calls requested: [{'name': 'search_wellness_knowledge', 'args': {'query': 'tips for better sleep'}, 'id': 'call_HshQeAzWHj6Twcd3wvNAzatg', 'type': 'tool_call'}]
[LOG] Model call #2 - Messages in state: 3
[LOG] After model - Tool calls requested: []

FINAL RESPONSE:
Here are practical tips for better sleep, based on reputable sleep-hygiene guidance:

- Keep a consistent schedule: go to bed and wake up at roughly the same times every day, even on weekends. This helps regulate your body clock. (Source 3)

- Create a relaxing pre-bed routine: activities like reading, gentle stretching, or a warm bath can signal your body it’s time to wind down. (Source 3)

- Make the sleep environment comfortable: keep the bedroom cool, dark, and quiet. A temperature around 65–68°F (18–20°C) is often recommended, and consider blackout curtains or a sleep mask, plus white noise if needed. (Sources 2, 3)

- Lim

In [25]:
# Test with a more complex query
print("Testing with complex query")
print("=" * 50)

response = wellness_agent.invoke(
    {"messages": [{"role": "user", "content": "I'm feeling stressed and having trouble sleeping. What should I do, and if I sleep 6 hours a night for a week, how many total hours is that?"}]}
)
print("\n" + "=" * 50)
print("FINAL RESPONSE:")
print("=" * 50)
print(response["messages"][-1].content)

Testing with complex query
[LOG] Model call #3 - Messages in state: 1
[LOG] After model - Tool calls requested: [{'name': 'search_wellness_knowledge', 'args': {'query': 'stress sleep tips sleep hygiene techniques for sleep difficulties adults'}, 'id': 'call_Dxtpocfr5VudKfS3aJdlJRLU', 'type': 'tool_call'}]
[LOG] Model call #4 - Messages in state: 3
[LOG] After model - Tool calls requested: []

FINAL RESPONSE:
I’m glad you asked. Here’s a practical plan based on our wellness guidelines, plus the quick math you asked for.

Ideas to help with stress and sleep tonight
- Set a consistent sleep schedule: try to go to bed and wake up at about the same times every day, including weekends. (Source: Sleep hygiene basics)
- Create a relaxing pre-sleep routine: reading, gentle stretching, or a warm bath can help signal your body it’s time to wind down. (Source: Sleep hygiene basics)
- Make your sleep environment comfortable: keep the room cool, dark, and quiet. Aim for about 65–68°F; use b

In [26]:
# Test the agent's ability to know when NOT to use RAG
print("Testing agent decision-making (should NOT use RAG)")
print("=" * 50)

response = wellness_agent.invoke(
    {"messages": [{"role": "user", "content": "What is 125 * 8?"}]}
)

print("\n" + "=" * 50)
print("FINAL RESPONSE:")
print("=" * 50)
print(response["messages"][-1].content)

Testing agent decision-making (should NOT use RAG)
[LOG] Model call #5 - Messages in state: 1
[LOG] After model - Tool calls requested: [{'name': 'calculate', 'args': {'expression': '125 * 8'}, 'id': 'call_sRxbS0tOovcVOgwt8Dl5VXUL', 'type': 'tool_call'}]
[LOG] Model call #6 - Messages in state: 3
[LOG] After model - Tool calls requested: []

FINAL RESPONSE:
125 × 8 = 1,000.


### Visualizing the Agent

The agent created by `create_agent` is built on LangGraph, so we can visualize its structure.

In [31]:
import os
import ssl

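# WARNING: this disables TLS certificate verification for the whole process.
# It tells the 'requests' library and others to ignore certificate errors, so use it
# only as a temporary local workaround, never in production.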
os.environ['CURL_CA_BUNDLE'] = ''
ssl._create_default_https_context = ssl._create_unverified_context

In [None]:
# Display the agent graph
try:
    from IPython.display import display, Image
    display(Image(wellness_agent.get_graph().draw_mermaid_png()))
except Exception as e:
    print(f"Could not display graph: {e}")
    print("\nAgent structure:")
    print(wellness_agent.get_graph().draw_ascii())

In [29]:
%pip install grandalf

Note: you may need to restart the kernel to use updated packages.


c:\Users\dznidaric\Documents\github\AIE9\03_The_Agent_Loop\.venv\Scripts\python.exe: No module named pip


In [33]:
import grandalf
print("\nAgent structure:")
print(wellness_agent.get_graph().draw_ascii())

ModuleNotFoundError: No module named 'grandalf'

---
## ❓ Question #3:

How does **Agentic RAG** differ from traditional RAG? What are the advantages and potential disadvantages of letting the agent decide when to retrieve information?

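##### ✅ Answer:
Traditional RAG follows a fixed, linear "retrieve-then-answer" script, whereas Agentic RAG lets the agent act like a researcher that can plan, self-correct, and call multiple tools iteratively. It chooses the best tool for the job, whether that's a knowledge base, a calculator, or a web search. For example, a request that combines a wellness question with a calculation can be handled by using the retrieval tool and the calculator in a single run.
The disadvantages of Agentic RAG are higher cost and longer response times, because each tool call adds another model call. The agent can also misjudge when retrieval is needed, skipping it when it would have helped or retrieving when it was unnecessary.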

## ❓ Question #4:

Looking at the middleware examples (`log_before_model`, `log_after_model`, and `ModelCallLimitMiddleware`), describe a real-world scenario where middleware would be essential for a production agent. What specific middleware hooks would you use and why?

##### ✅ Answer:
Budget protection and safety compliance - In this scenario, we use middleware to ensure the agent stays within financial limits, respects privacy, and doesn't get stuck in infinite reasoning loops.


#### Middleware hooks:

- **1. `before_model`**: Input sanitization and budget checks
    - calculate the total tokens in the current context
    - scan the message user sent for sensitive data (like phone number or private national ID number) and mask them before they ever reach the model provider's servers
- **2. `after_model`**: Response validation and loop prevention
    - essential for "sanity checking" what the AI just produced before the system acts on it
    - run a small, fast heuristic to see if the model's output contains restricted phrases or if it‚Äôs hallucinating tool names that don't exist
    - If log_after_model detects that the agent has called the same RAG tool 5 times with the exact same query, the middleware can force the agent to stop and apologize to the user rather than continuing to loop
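
The two checks described above can be sketched as plain helper functions that a `before_model` or `after_model` hook might call. This is only an illustration under stated assumptions: `mask_sensitive`, `is_looping`, and `PHONE_PATTERN` are hypothetical names, and the phone regex is deliberately simplistic.

```python
import re
from collections import Counter

# Illustrative pattern only: real phone formats vary widely by country.
PHONE_PATTERN = re.compile(r"\+?\d[\d\s\-]{7,}\d")


def mask_sensitive(text: str) -> str:
    """Replace anything that looks like a phone number with a placeholder."""
    return PHONE_PATTERN.sub("[REDACTED]", text)


def is_looping(tool_calls: list[tuple[str, str]], limit: int = 5) -> bool:
    """Return True if any (tool_name, query) pair was issued `limit` or more times."""
    counts = Counter(tool_calls)
    return any(n >= limit for n in counts.values())
```

A `before_model` hook could run `mask_sensitive` over the latest user message, and an `after_model` hook could collect the `(tool_name, query)` pairs from the message history and stop the agent once `is_looping` returns `True`.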


---
## 🏗️ Activity #2: Enhance the Agentic RAG System

Now it's your turn! Enhance the wellness agent by implementing ONE of the following:

### Option A: Add a New Tool
Create a new tool that the agent can use. Ideas:
- A tool that calculates BMI given height and weight
- A tool that estimates daily calorie needs
- A tool that creates a simple workout plan
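
As a possible starting point for the BMI idea, here is a minimal sketch of the calculation itself. The function name and the metric units are assumptions. The categories follow the standard WHO adult cut-offs. The function returns a string so that it could later be registered as a tool in the same way as the notebook's other tools.

```python
def calculate_bmi(weight_kg: float, height_m: float) -> str:
    """Compute body mass index (kg / m^2) and report the WHO adult category."""
    if weight_kg <= 0 or height_m <= 0:
        return "Error: weight and height must be positive numbers."
    bmi = weight_kg / (height_m ** 2)
    if bmi < 18.5:
        category = "underweight"
    elif bmi < 25:
        category = "normal weight"
    elif bmi < 30:
        category = "overweight"
    else:
        category = "obese"
    return f"BMI: {bmi:.1f} ({category})"
```

For example, `calculate_bmi(70, 1.75)` returns `"BMI: 22.9 (normal weight)"`.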

### Option B: Create Custom Middleware
Build middleware that adds new functionality:
- Middleware that tracks which tools are used most frequently
- Middleware that adds a friendly greeting to responses
- Middleware that enforces a response length limit

### Option C: Improve the RAG Tool
Enhance the retrieval tool:
- Add metadata filtering
- Implement reranking of results
- Add source citations with relevance scores

In [52]:
@before_model
def safety_guardrail(state, runtime):
    """Checks user input for crisis keywords before the LLM processes it."""
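    messages = state.get("messages", [])
    if not messages:
        return None
    # This hook runs before every model call, so the last message may be a tool
    # result rather than user text; str() guards against non-string content.
    user_input = str(messages[-1].content).lower()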
    
    crisis_keywords = ["suicide", "kill myself", "end it all", "self-harm", "want to die", "kms"]

    CRISIS_MESSAGE = """It sounds like you're going through a very difficult time. Please know that you are not alone. 
    In the EU, you can call 112 for emergency services anytime. 
    You can also call 116 123 (the European emotional support number) to talk to someone who can help. 
    These services are free, confidential, and available 24/7."""
    
    if any(word in user_input for word in crisis_keywords):
        send_alert_to_human_team(user_input)
        
        return {
            "messages": state["messages"] + [{"role": "assistant", "content": CRISIS_MESSAGE}],
            "status": "escalated_to_human",
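            # "status" and "interrupt" are custom keys, not built-in agent controls:
            # in the crisis test run below, the model is still called after this hook.
            # LangChain's middleware docs describe returning "jump_to": "end" (with the
            # hook declared via can_jump_to) as the way to end the run early.
            "interrupt": True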
        }
    return None

def send_alert_to_human_team(message):
    # API call to Slack, PagerDuty, or your custom dashboard
    print(f"ALARM: User needs human help. Message: {message}")
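
One caveat with the substring check in `safety_guardrail`: `"end it all"` also matches inside `"send it all"`. The sketch below shows one possible refinement using word boundaries. It is an assumption-laden illustration, not part of the original guardrail: `CRISIS_PATTERN` and `contains_crisis_keyword` are hypothetical names, and keyword matching in general remains a coarse heuristic.

```python
import re

CRISIS_KEYWORDS = ["suicide", "kill myself", "end it all", "self-harm", "want to die", "kms"]

# One compiled pattern; \b anchors each phrase at word boundaries,
# and re.escape keeps characters such as "-" literal.
CRISIS_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(k) for k in CRISIS_KEYWORDS) + r")\b",
    re.IGNORECASE,
)


def contains_crisis_keyword(text: str) -> bool:
    """Return True only when a crisis phrase appears as whole words."""
    return CRISIS_PATTERN.search(text) is not None
```

With this version, "Can you send it all as a list?" no longer triggers the alert, while "I feel like I want to end it all" still does.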

In [53]:
rag_tools = [
    search_wellness_knowledge,
    calculate,
    get_current_time
]

safety_agent = create_agent(
    model="gpt-5-nano",
    tools=rag_tools,
    middleware=[safety_guardrail, log_before_model, call_limiter],
    system_prompt="""You are a helpful wellness assistant with access to a comprehensive health and wellness knowledge base.

Your role is to:
1. Answer questions about health, fitness, nutrition, sleep, and mental wellness
2. Always search the knowledge base when the user asks wellness-related questions
3. Provide accurate, helpful information based on the retrieved context
4. Be supportive and encouraging in your responses
5. If you cannot find relevant information, say so honestly

Remember: Always cite information from the knowledge base when applicable."""
)

In [55]:
# Test your enhanced agent here
print("Testing Wellness Agent")
print("=" * 50)

response = safety_agent.invoke(
    {"messages": [{"role": "user", "content": "What are some tips for better sleep?"}]}
)

print("\n" + "=" * 50)
print("FINAL RESPONSE:")
print("=" * 50)
print(response["messages"][-1].content)

Testing Wellness Agent
[LOG] Model call #10 - Messages in state: 1
[LOG] Model call #11 - Messages in state: 3

FINAL RESPONSE:
Here are practical tips for better sleep, based on our sleep resources:

- Maintain a consistent sleep and wake time every day (including weekends) to reinforce your body's rhythm. (Source 1)
- Create a relaxing bedtime routine to signal to your body that it’s time to wind down. (Source 1)
- Optimize your sleep environment:
  - Keep the room at about 65–68°F (18–20°C). (Source 1)
  - Use blackout curtains or a sleep mask to keep things dark. (Source 1)
  - Keep the room quiet (consider white noise or earplugs). (Source 3)
  - Ensure a comfortable mattress and pillows. (Source 1)
- Limit exposure to screens for at least 1 hour before bed. (Source 1)
- Avoid caffeine after 2 PM. (Source 3)
- Exercise regularly, but don’t do vigorous workouts right before bed. (Source 3)
- Limit alcohol and heavy meals close to bedtime. (Source 3)
- Get some daylight ex

In [54]:
# Test your enhanced agent here
print("Testing Wellness Agent")
print("=" * 50)

response = safety_agent.invoke(
    {"messages": [{"role": "user", "content": "I feel like I want to end it all"}]}
)

print("\n" + "=" * 50)
print("FINAL RESPONSE:")
print("=" * 50)
print(response["messages"][-1].content)

Testing Wellness Agent
ALARM: User needs human help. Message: i feel like i want to end it all
[LOG] Model call #8 - Messages in state: 2
[LOG] Model call #9 - Messages in state: 4

FINAL RESPONSE:
I'm really glad you told me how you’re feeling. I’m here with you, and you don’t have to face this alone.

If you’re in the EU, you can:
- Call 112 for emergency services anytime.
- Call 116 123 for emotional support. These services are free, confidential, and available 24/7.

If you’re outside the EU, please contact your local emergency number or a crisis line in your country. If you’re in immediate danger, please seek local emergency help right away.

In the moment, you might find these quick relief techniques helpful:
- Box breathing: inhale for 4, hold for 4, exhale for 4, hold for 4.
- Grounding: Name 5 things you can see, 4 you can hear, 3 you can feel, 2 you can smell, 1 you can taste.
- Step outside for fresh air or take a short walk.
- Progressive muscle relaxation: tens