# Stage 1: The Baseline RAG Agent (Information Overload)

<a href="https://colab.research.google.com/github/redislabs-training/ce-redis-langchain/blob/main/section-1-context-engineering-foundations/04_stage_1_baseline_rag.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Introduction

Welcome to the first stage of the course. We begin by exploring context through the lens of a basic RAG agent. Specifically, we will:

1.  Explore the baseline architecture of the simple RAG agent. Our Stage 1 Agent takes the simplest possible approach: Find relevant courses and give the LLM *everything* about them.
2.  Witness "Information Overload": See firsthand what happens when you retrieve *too much* context.
3.  Analyze the Cost: Measure the token usage and latency of a naive approach.

Let's start with how the agent works.

## Agent Overview

The code for the baseline agent lives in `progressive_agents/stage1_baseline_rag/`. You can reference the code at any time throughout this lesson. The agent has three important components:

### 1. LangGraph Nodes (`agent/nodes.py`)
The logic is split into two functions (nodes):
*   `research_node`:
    *   Searches Redis for the top 5 courses matching the user's query.
    *   The Flaw: It retrieves the FULL hierarchical data for all 5 courses. This includes every single week of the syllabus, every homework assignment, and every reading list.
*   `synthesize_node`:
    *   Receives this massive block of text.
    *   Sends it all to the LLM with a prompt to answer the user's question.
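The division of labor between the two nodes can be sketched roughly as follows. This is a minimal illustration, not the code in `agent/nodes.py`: `search_courses`, `call_llm`, and the sample course fields are hypothetical stand-ins for the Redis search and the LLM call.

```python
import json

# Hypothetical catalog standing in for the course data stored in Redis.
CATALOG = [
    {
        "code": f"CS{i:03d}",
        "title": f"Course {i}",
        "syllabus": [f"Week {w} topics" for w in range(1, 15)],
        "assignments": [f"Homework {h}" for h in range(1, 6)],
    }
    for i in range(1, 11)
]


def search_courses(query: str, top_k: int = 5) -> list[dict]:
    """Stand-in for the search step: returns the first top_k full records."""
    return CATALOG[:top_k]


def call_llm(prompt: str) -> str:
    """Stand-in for the LLM call: reports only how large the prompt was."""
    return f"(answer generated from a {len(prompt):,}-character prompt)"


def research_node(state: dict) -> dict:
    # The flaw: every field of every matching course is serialized.
    courses = search_courses(state["query"])
    return {"raw_context": json.dumps(courses, indent=2)}


def synthesize_node(state: dict) -> dict:
    # The whole context is pasted into the prompt without any filtering.
    prompt = f"Context:\n{state['raw_context']}\n\nQuestion: {state['query']}"
    return {"final_answer": call_llm(prompt)}


state = {"query": "What machine learning courses are available?"}
state.update(research_node(state))
state.update(synthesize_node(state))
print(state["final_answer"])
```

Even with toy data, the prompt grows with every syllabus week and assignment, although the question only asks for a list of courses.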

### 2. The "Memory": State (`agent/state.py`)
We use a simple `TypedDict` to pass data between nodes.
```python
from typing import TypedDict


class AgentState(TypedDict):
    query: str              # The user's question
    raw_context: str        # The massive JSON blob of course data
    final_answer: str       # The LLM's response
    total_tokens: int       # Tracking our inefficiency
```

### 3. The "Blueprint": Workflow (`agent/workflow.py`)
We use LangGraph to orchestrate the flow. It's a simple linear graph:

```mermaid
graph LR
    START([Start]) --> Research[Research Node]
    Research --> Synthesize[Synthesize Node]
    Synthesize --> END([End])
    
    style Research fill:#ff9999,stroke:#333,stroke-width:2px
    style Synthesize fill:#99ccff,stroke:#333,stroke-width:2px
```
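Conceptually, a linear graph like this runs each node in order and merges the partial update it returns into the shared state. The sketch below imitates that behavior in plain Python so the data flow is easy to follow. It does not use LangGraph, and `run_linear` and the two placeholder nodes are hypothetical.

```python
import asyncio
from typing import Awaitable, Callable

Node = Callable[[dict], Awaitable[dict]]


async def research(state: dict) -> dict:
    # Placeholder: the real node would query Redis here.
    return {"raw_context": f"<context for: {state['query']}>"}


async def synthesize(state: dict) -> dict:
    # Placeholder: the real node would call the LLM here.
    return {"final_answer": f"Answer based on {state['raw_context']}"}


async def run_linear(nodes: list[Node], initial: dict) -> dict:
    """Run nodes in order, merging each partial update into the state."""
    state = dict(initial)
    for node in nodes:
        state.update(await node(state))
    return state


result = asyncio.run(run_linear([research, synthesize], {"query": "ML courses?"}))
print(result["final_answer"])
```

Inside a notebook an event loop is already running, so you would write `await run_linear(...)` instead of `asyncio.run(...)`.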

## Setup and Initialization

First, let's set up our environment. We need to import the agent code from the `progressive_agents` directory.

In [None]:
import sys
import os
import asyncio
from pathlib import Path
from dotenv import load_dotenv

# 1. Configure Paths
# We are currently in 'section-1...', so we look up two levels to find 'progressive_agents'
project_root = Path("../../").resolve()
stage1_path = project_root / "progressive_agents" / "stage1_baseline_rag"
sys.path.append(str(stage1_path))

# 2. Load Environment Variables (API Keys, Redis URL)
load_dotenv(project_root / ".env")

print(f"Project Root: {project_root}")
print(f"Agent Path Added: {stage1_path}")

### Initialize the Agent
We will use the `setup_agent` helper. This function does three things:
*   It connects to your Redis instance.
*   It checks if the course data exists.
*   If not, it generates 50 sample courses and loads them into Redis.

*Note: This might take a few seconds the first time you run it.*
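The "load only if missing" behavior described above is what makes the first run slower than later ones. A minimal sketch of the pattern, using a plain dictionary as a hypothetical stand-in for Redis (`ensure_courses` is not part of the agent code):

```python
def ensure_courses(store: dict, count: int = 50) -> int:
    """Load sample courses only when the store is empty; return how many were added."""
    if store:
        return 0
    for i in range(1, count + 1):
        store[f"course:{i:03d}"] = {"title": f"Sample Course {i}"}
    return count


store = {}
first = ensure_courses(store)   # first call generates and loads the data
second = ensure_courses(store)  # second call finds the data and does nothing
print(first, second)
```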

In [None]:
from agent import setup_agent

print("Initializing Stage 1 Agent...")
# auto_load_courses=True ensures we have data to query
workflow, course_manager = setup_agent(auto_load_courses=True)
print("Agent is ready!")

## The Experiment: "What ML courses are available?"

Imagine a student just wants a quick list of options. They ask:
> *"What machine learning courses are available?"*

A human advisor would say: *"We have CS001 (Intro to ML) and CS002 (Deep Learning)."*

Let's see what our Baseline Agent does.

In [None]:
# Define the user's query
query = "What machine learning courses are available?"

print(f"User asks: '{query}'")
print("Running workflow...")

# Run the graph!
# We use .ainvoke() because our agent is async
result = await workflow.ainvoke({"query": query})

print("Workflow complete!")

## Analysis: The Cost of "Naive" RAG

The agent answered the question. But at what cost?

Let's inspect the metrics returned in the state.

In [None]:
# Display the Answer
print("="*60)
print(f"Agent Answer:\n\n{result['final_answer']}")
print("="*60)

# Display the Metrics
courses_found = result.get('courses_found', 0)
total_tokens = result.get('total_tokens', 0)

print(f"\nStatistics:")
print(f"   Courses Retrieved: {courses_found}")
print(f"   Total Tokens Used: {total_tokens:,}")

### Stop and Look

Look at that Total Tokens number. It is likely over 6,000 tokens.

For a simple question like *"What machine learning courses are available?"*, we used enough tokens to write a short essay.

Why?
Because the `research_node` retrieved the FULL details for every course it found. Let's peek at the `raw_context` that was sent to the LLM.

In [None]:
# Let's look at the first 2000 characters of the context sent to the LLM
raw_context = result.get('raw_context', '')

print(f"Total Context Size: {len(raw_context):,} characters")
print("-" * 40)
print("PREVIEW OF CONTEXT SENT TO LLM")
print("-" * 40)
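# Report how many characters were actually cut off instead of a fixed number
remaining = max(len(raw_context) - 2000, 0)
print(raw_context[:2000] + f"\n\n... [TRUNCATED {remaining:,} CHARACTERS] ...")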

### The "Needle in the Haystack"

Notice what is in that context:
*   `"week_1_topics"`, `"week_2_topics"`... all the way to Week 14.
*   `"assignments"`: Detailed lists of every homework.
*   `"grading_policy"`: Breakdowns of percentages.

The LLM didn't need ANY of this. It just needed the course titles and descriptions.
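To get a feel for how much of the payload the question actually needs, the sketch below compares the serialized size of a full course record with the size of its title and description alone. The record is invented for illustration: the field names follow the ones listed above, plus assumed `title` and `description` fields.

```python
import json

# Hypothetical course record; real records in the notebook may differ.
course = {
    "title": "Intro to ML",
    "description": "Foundations of supervised and unsupervised learning.",
    "grading_policy": {"homework": 40, "midterm": 25, "final": 35},
    "assignments": [f"Homework {n}: problem set" for n in range(1, 9)],
}
# Add week_1_topics ... week_14_topics, mirroring the keys seen in the context.
for week in range(1, 15):
    course[f"week_{week}_topics"] = [f"Topic {week}.{k}" for k in range(1, 4)]

full = json.dumps(course)
summary = json.dumps({k: course[k] for k in ("title", "description")})

print(f"Full record:  {len(full):,} characters")
print(f"Summary only: {len(summary):,} characters")
print(f"Share needed: {len(summary) / len(full):.0%}")
```

You can run the same comparison on one of the real records in `raw_context` to see the actual ratio for your data.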

### The Consequences
1.  Financial Waste: You pay per token. If titles and descriptions alone would fit in about 500 tokens, roughly 90% of the 6,000 tokens were wasted.
2.  Latency: Processing 6,000 tokens takes significantly longer than processing 500.
3.  Distraction: When you flood the LLM with irrelevant data, it can "hallucinate" or get confused. It's like trying to find a phone number in a library by reading every book.
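The first two points can be made concrete with a little arithmetic. The sketch below uses the 6,000 and 500 token figures from above. The price per million tokens and the query volume are placeholder assumptions, so substitute your provider's actual rates.

```python
NAIVE_TOKENS = 6_000      # approximate figure observed above
LEAN_TOKENS = 500         # comparison figure from the latency point
PRICE_PER_MILLION = 1.00  # assumed input price in USD; not a real quote
QUERIES_PER_DAY = 10_000  # assumed traffic


def daily_cost(tokens_per_query: int) -> float:
    """Input-token cost per day at the assumed price and traffic."""
    return tokens_per_query * QUERIES_PER_DAY * PRICE_PER_MILLION / 1_000_000


wasted_share = (NAIVE_TOKENS - LEAN_TOKENS) / NAIVE_TOKENS
print(f"Wasted share of tokens: {wasted_share:.0%}")
print(f"Naive daily cost: ${daily_cost(NAIVE_TOKENS):.2f}")
print(f"Lean daily cost:  ${daily_cost(LEAN_TOKENS):.2f}")
```

Whatever the actual price, the ratio between the two costs stays the same, because cost scales linearly with the number of tokens sent.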

## Conclusion

We have set up and run a Baseline RAG Agent. It works, but it is inefficient and expensive.

Key Takeaways:
*   More is not always better. Retrieving full documents is rarely the right strategy.
*   Context matters. We need to curate what we send to the LLM.

### Next Lesson: Context Engineering
In Stage 2, we will fix this. We will introduce Context Engineering techniques to:
1.  Trim the data (remove nulls and empty fields).
2.  Filter the data (only send summaries for the initial search).
3.  Format the data (use clean markdown instead of raw JSON).

See you in the next notebook!

In [None]:
# Optional: Cleanup
# If you want to clear the database, uncomment the lines below.
# from agent import cleanup_courses
# await cleanup_courses(course_manager)
# print("Database cleaned.")