![Redis](https://redis.io/wp-content/uploads/2024/04/Logotype.svg?auto=webp&quality=85,75&width=120)

# Engineering Retrieved Context with RAG

## From Context Engineering to Retrieval-Augmented Generation

In Section 1, you learned about the four core context types:
1. **System Context** - The AI's role and domain knowledge
2. **User Context** - Personal profiles and preferences  
3. **Conversation Context** - Dialogue history and flow
4. **Retrieved Context** - Dynamic information from external sources

This notebook focuses on **Retrieved Context** - the most powerful and complex context type. You'll learn how to build a production-ready RAG (Retrieval-Augmented Generation) system that dynamically fetches relevant information to enhance AI responses.

## What You'll Learn

**RAG Fundamentals:**
- What RAG is and why it's essential for context engineering
- How vector embeddings enable semantic search
- Building a complete RAG pipeline with LangChain and Redis

**Practical Implementation:**
- Generate and ingest course data using existing utilities
- Set up Redis vector store for semantic search
- Implement retrieval and generation workflows
- Combine retrieved context with user and system context

**Foundation for Advanced Topics:**
- This RAG system becomes the base for Section 3 (Memory Systems for Context Engineering)
- You'll add LangGraph state management and tools in later sections
- Focus here is purely on retrieval → context assembly → generation

**Time to complete:** 45-50 minutes

---

## Why RAG Matters for Context Engineering

### The Challenge: Static vs. Dynamic Knowledge

In Section 1, we used **hardcoded** course information in the system context:

```python
system_context = """You are a Redis University course advisor.

Available Courses:
- RU101: Introduction to Redis (Beginner, 4-6 hours)
- RU201: Redis for Python (Intermediate, 6-8 hours)
...
"""
```

**Problems with this approach:**
- ❌ **Doesn't scale** - Can't hardcode thousands of courses
- ❌ **Wastes tokens** - Includes irrelevant courses in every request
- ❌ **Hard to update** - Requires code changes to add/modify courses
- ❌ **No personalization** - Same courses shown to everyone

### The Solution: Retrieval-Augmented Generation (RAG)

RAG solves these problems by **dynamically retrieving** only the most relevant information:

```
User Query: "I want to learn about vector search"
     ↓
Semantic Search: Find courses matching "vector search"
     ↓
Retrieved Context: RU301 - Vector Similarity Search with Redis
     ↓
LLM Generation: Personalized recommendation using retrieved context
```

**Benefits:**
- ✅ **Scales to large catalogs** - Store millions of documents without growing the prompt
- ✅ **Token efficient** - Only retrieve what's relevant
- ✅ **Easy to update** - Add/modify data without code changes
- ✅ **Personalized** - Different results for different queries

### RAG as "Retrieved Context" from Section 1

Remember the four context types? RAG is how we implement **Retrieved Context** in production:

| Context Type | Storage | Retrieval Method | Example |
|--------------|---------|------------------|---------|
| System Context | Hardcoded | Always included | AI role, instructions |
| User Context | Database | User ID lookup | Student profile |
| Conversation Context | Session store | Session ID lookup | Chat history |
| **Retrieved Context** | **Vector DB** | **Search** | **Relevant courses** |

---

## Setup and Environment

Let's prepare our environment with the necessary dependencies.

In [1]:
import json
import os
import sys

import tiktoken
from dotenv import load_dotenv

# Load environment variables
load_dotenv()

# Verify required environment variables
required_vars = ["OPENAI_API_KEY"]
missing_vars = [var for var in required_vars if not os.getenv(var)]

if missing_vars:
    print(
        f"""
⚠️  Missing required environment variables: {', '.join(missing_vars)}

Please create a .env file with:
OPENAI_API_KEY=your_openai_api_key
REDIS_URL=redis://localhost:6379

For Redis setup:
- Local: docker run -d -p 6379:6379 redis/redis-stack-server:latest
- Cloud: https://redis.com/try-free/
"""
    )
    sys.exit(1)

# Read the Redis URL from the environment, falling back to the local default
REDIS_URL = os.getenv("REDIS_URL", "redis://localhost:6379")
print("✅ Environment variables loaded")
print(f"   REDIS_URL: {REDIS_URL}")
print(f"   OPENAI_API_KEY: {'✓ Set' if os.getenv('OPENAI_API_KEY') else '✗ Not set'}")

✅ Environment variables loaded
   REDIS_URL: redis://localhost:6379
   OPENAI_API_KEY: ✓ Set


In [2]:
# Utility: Token counter
def count_tokens(text: str, model: str = "gpt-4o") -> int:
    """Count tokens in text using tiktoken."""
    encoding = tiktoken.encoding_for_model(model)
    return len(encoding.encode(text))

print("✅ Utility functions loaded")

✅ Utility functions loaded


### Install Dependencies

We'll use LangChain for RAG orchestration and Redis for vector storage.

In [3]:
# Install required packages (uncomment if needed)
# %pip install -q langchain langchain-openai langchain-redis redisvl redis python-dotenv

print("✅ Dependencies ready")

✅ Dependencies ready


---

## 📊 Step 1: Understanding Vector Embeddings

Before building our RAG system, let's understand the core concept: **vector embeddings**.

### What Are Embeddings?

Embeddings convert text into numerical vectors that capture semantic meaning:

```
Text: "Introduction to Redis"
  ↓ (embedding model)
Vector: [0.23, -0.45, 0.67, ..., 0.12]  # 1536 dimensions for OpenAI's text-embedding-3-small
```

**Key insight:** Similar texts have similar vectors (measured by cosine similarity).

### Why Embeddings Enable Semantic Search

Traditional keyword search:
- Query: "machine learning courses" 
- Matches: Only documents containing exact words "machine learning"
- Misses: "AI courses", "neural network classes", "deep learning programs"

Semantic search with embeddings:
- Query: "machine learning courses"
- Matches: All semantically similar content (AI, neural networks, deep learning, etc.)
- Works across synonyms, related concepts, and different phrasings
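
To make the keyword-search limitation concrete, here is a minimal sketch in plain Python. The `keyword_search` helper and the three sample titles are hypothetical and not part of the reference agent; the point is only that exact-word matching misses related phrasings.

```python
def keyword_search(query: str, documents: list[str]) -> list[str]:
    """Return documents that contain every word of the query (case-insensitive)."""
    query_words = set(query.lower().split())
    return [
        doc for doc in documents
        if query_words.issubset(set(doc.lower().split()))
    ]


# Hypothetical course titles used only for this illustration
documents = [
    "machine learning courses for beginners",
    "AI courses and neural network classes",
    "deep learning programs",
]

matches = keyword_search("machine learning courses", documents)
print(matches)
```

Only the first title matches, even though all three are relevant to the query. Closing that gap is what embeddings are for.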

Let's see this in action:

In [4]:
from langchain_openai import OpenAIEmbeddings

# Initialize embedding model
embeddings = OpenAIEmbeddings(model="text-embedding-3-small")

# Generate embeddings for similar and different texts
texts = [
    "Introduction to machine learning and neural networks",
    "Learn about AI and deep learning fundamentals",
    "Database administration and SQL queries",
]

# Get embeddings (this calls OpenAI API)
vectors = embeddings.embed_documents(texts)

print(f"✅ Generated embeddings for {len(texts)} texts")
print(f"   Vector dimensions: {len(vectors[0])}")
print(
    f"   First vector preview: [{vectors[0][0]:.3f}, {vectors[0][1]:.3f}, {vectors[0][2]:.3f}, ...]"
)

✅ Generated embeddings for 3 texts
   Vector dimensions: 1536
   First vector preview: [-0.030, -0.013, 0.001, ...]


### Measuring Semantic Similarity

Let's calculate cosine similarity to see which texts are semantically related:

In [5]:
import numpy as np


def cosine_similarity(vec1, vec2):
    """Calculate cosine similarity between two vectors."""
    return np.dot(vec1, vec2) / (np.linalg.norm(vec1) * np.linalg.norm(vec2))


# Compare similarities
sim_1_2 = cosine_similarity(vectors[0], vectors[1])  # ML vs AI (related)
sim_1_3 = cosine_similarity(vectors[0], vectors[2])  # ML vs Database (unrelated)
sim_2_3 = cosine_similarity(vectors[1], vectors[2])  # AI vs Database (unrelated)

print("Semantic Similarity Scores (0=unrelated, 1=identical):")
print(f"   ML vs AI:       {sim_1_2:.3f} ← High similarity (related topics)")
print(f"   ML vs Database: {sim_1_3:.3f} ← Low similarity (different topics)")
print(f"   AI vs Database: {sim_2_3:.3f} ← Low similarity (different topics)")

Semantic Similarity Scores (0=unrelated, 1=identical):
   ML vs AI:       0.623 ← High similarity (related topics)
   ML vs Database: 0.171 ← Low similarity (different topics)
   AI vs Database: 0.177 ← Low similarity (different topics)


**💡 Key Takeaway:** Embeddings capture semantic meaning, allowing us to find relevant information even when exact keywords don't match.
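
Retrieval is then a matter of ranking documents by their similarity to the query vector and keeping the top k. The sketch below shows that ranking step with the standard library only. The 3-dimensional vectors are made-up toy values, not real embeddings, and `toy_cosine` and `top_k` are hypothetical helper names.

```python
import math


def toy_cosine(vec1: list[float], vec2: list[float]) -> float:
    """Cosine similarity using only the standard library."""
    dot = sum(a * b for a, b in zip(vec1, vec2))
    norm1 = math.sqrt(sum(a * a for a in vec1))
    norm2 = math.sqrt(sum(b * b for b in vec2))
    return dot / (norm1 * norm2)


def top_k(query_vec: list[float], doc_vecs: dict[str, list[float]], k: int = 2):
    """Score every document against the query and return the k best (name, score) pairs."""
    scored = [(name, toy_cosine(query_vec, vec)) for name, vec in doc_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy vectors: the first two point in a similar direction, the third does not
toy_vectors = {
    "ML course": [0.9, 0.1, 0.0],
    "AI course": [0.8, 0.3, 0.1],
    "SQL course": [0.0, 0.2, 0.9],
}
toy_query = [1.0, 0.2, 0.0]

ranked = top_k(toy_query, toy_vectors, k=2)
for name, score in ranked:
    print(f"{name}: {score:.3f}")
```

This brute-force scan compares the query against every document. A vector index such as the one Redis provides exists to avoid that full scan on large collections.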

---

## 📚 Step 2: Generate Course Data

Now let's create realistic course data for our RAG system. We'll use the existing utilities from the reference agent.

### Understanding the Course Generation Script

The `generate_courses.py` script creates realistic course data with:
- Multiple majors (CS, Data Science, Math, Business, Psychology)
- Course templates with descriptions, prerequisites, schedules
- Realistic metadata (instructors, enrollment, difficulty levels)

Let's generate our course catalog:

In [6]:
# Note: reference-agent is installed with pip, so no sys.path changes are needed.
# If it is not installed, add it to the Python path instead:
# sys.path.insert(0, os.path.join(os.getcwd(), 'python-recipes/context-engineering/reference-agent'))

# Initialize generator with a seed for reproducibility
import random

from redis_context_course.scripts.generate_courses import CourseGenerator

random.seed(42)

# Create generator
generator = CourseGenerator()

print("📚 Generating course catalog...")
print()

# Generate majors
majors = generator.generate_majors()
print(f"✅ Generated {len(majors)} majors:")
for major in majors:
    print(f"   - {major.name} ({major.code})")

print()

# Generate courses (10 per major)
courses = generator.generate_courses(courses_per_major=10)
print(f"✅ Generated {len(courses)} courses")

# Show a sample course
sample_course = courses[0]
print(
    f"""
Sample Course:
   Code: {sample_course.course_code}
   Title: {sample_course.title}
   Department: {sample_course.department}
   Difficulty: {sample_course.difficulty_level.value}
   Credits: {sample_course.credits}
   Description: {sample_course.description[:100]}...
"""
)

📚 Generating course catalog...

✅ Generated 5 majors:
   - Computer Science (CS)
   - Data Science (DS)
   - Mathematics (MATH)
   - Business Administration (BUS)
   - Psychology (PSY)

✅ Generated 50 courses

Sample Course:
   Code: CS001
   Title: Introduction to Programming
   Department: Computer Science
   Difficulty: beginner
   Credits: 3
   Description: Fundamental programming concepts using Python. Variables, control structures, functions, and basic d...



### Save Course Catalog to JSON

Let's save this data so we can ingest it into Redis:

In [7]:
catalog_file = "course_catalog_section2.json"
generator.save_to_json(catalog_file)

print(f"✅ Course catalog saved to {catalog_file}")
print("   Ready for ingestion into Redis vector store")

Generated 5 majors and 50 courses
Data saved to course_catalog_section2.json
✅ Course catalog saved to course_catalog_section2.json
   Ready for ingestion into Redis vector store


---

## 🔧 Step 3: Set Up Redis Vector Store

Now we'll configure Redis to store our course embeddings and enable semantic search.

### Understanding Redis Vector Search

Redis Stack provides vector similarity search capabilities:
- **Storage:** Courses stored as Redis hashes with vector fields
- **Indexing:** Vector index for fast similarity search (HNSW algorithm)
- **Search:** Find top-k most similar courses to a query vector using cosine similarity

### Using the Reference Agent Utilities

Instead of configuring Redis from scratch, we'll use the **production-ready utilities** from the reference agent. These utilities are already configured and tested, allowing you to focus on context engineering concepts rather than Redis configuration details.

### Import Redis Configuration

Let's import the pre-configured Redis setup:

What we're importing:
 - redis_config: A global singleton that manages all Redis connections

What it provides (lazy-initialized properties):
 - redis_config.redis_client: Redis connection for data storage
 - redis_config.embeddings: OpenAI embeddings (text-embedding-3-small)
 - redis_config.vector_index: RedisVL SearchIndex with pre-configured schema
 - redis_config.checkpointer: RedisSaver for LangGraph (used in Section 3)

Why use this:
 - Production-ready configuration (same as reference agent)
 - Proper schema with all course metadata fields
 - Vector field: 1536 dims, cosine distance, HNSW algorithm
 - No boilerplate - just import and use
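
For orientation, a RedisVL index schema with those vector settings could look roughly like the dictionary below. This is an illustrative sketch, not the reference agent's actual schema: the index name, key prefix, and field names are assumptions borrowed from values that appear elsewhere in this notebook, and the real schema contains more metadata fields.

```python
# Illustrative schema only; names are assumptions, not the reference agent's config.
course_index_schema = {
    "index": {
        "name": "course_catalog",
        "prefix": "course_idx",
        "storage_type": "hash",
    },
    "fields": [
        {"name": "course_code", "type": "tag"},
        {"name": "title", "type": "text"},
        {"name": "difficulty_level", "type": "tag"},
        {"name": "credits", "type": "numeric"},
        {
            "name": "content_vector",
            "type": "vector",
            "attrs": {
                "dims": 1536,  # must match the embedding model's output size
                "distance_metric": "cosine",
                "algorithm": "hnsw",
                "datatype": "float32",
            },
        },
    ],
}

# Pull out the vector field to confirm its settings
vector_field = next(
    field for field in course_index_schema["fields"] if field["type"] == "vector"
)
print(vector_field["attrs"])
```

A dictionary in this shape is what RedisVL's `SearchIndex.from_dict` accepts. With `redis_config` you never build it by hand, but seeing it makes the "1536 dims, cosine distance, HNSW" line above less abstract.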

In [8]:
from redis_context_course.redis_config import redis_config

print("✅ Redis configuration imported")
print(f"   Redis URL: {redis_config.redis_url}")
print(f"   Vector index name: {redis_config.vector_index_name}")

✅ Redis configuration imported
   Redis URL: redis://localhost:6379
   Vector index name: course_catalog


### Test Redis Connection

Let's verify Redis is running and accessible:

In [9]:
# Test connection using built-in health check
if redis_config.health_check():
    print("✅ Connected to Redis")
    print("   Redis is healthy and ready")
else:
    print("❌ Redis connection failed")
    print("   Make sure Redis is running:")
    print("   - Local: docker run -d -p 6379:6379 redis/redis-stack-server:latest")
    print("   - Cloud: https://redis.com/try-free/")
    sys.exit(1)

✅ Connected to Redis
   Redis is healthy and ready


### Initialize Course Manager

Now let's import the `CourseManager` - this handles all course operations, such as storage, retrieval, and search:

What it provides:
 - store_course(): Store a course with vector embedding
 - search_courses(): Semantic search with filters
 - get_course(): Retrieve course by ID
 - get_course_by_code(): Retrieve course by course code
 - recommend_courses(): Generate personalized recommendations

How it works:
 - Uses redis_config for connections (redis_client, vector_index, embeddings)
 - Automatically generates embeddings from course content
 - Uses RedisVL's VectorQuery for semantic search
 - Supports metadata filters (department, difficulty, format, etc.)

Why use this:
 - Encapsulates all Redis/RedisVL complexity
 - Same code used in reference agent (Sections 3 & 4)
 - Focus on RAG concepts, not Redis implementation details

In [10]:
from redis_context_course.course_manager import CourseManager

# Initialize course manager
course_manager = CourseManager()

print("✅ Course manager initialized")
print("   Ready for course storage and search")
print("   Using RedisVL for vector operations")

21:54:55 redisvl.index.index INFO   Index already exists, not overwriting.


✅ Course manager initialized
   Ready for course storage and search
   Using RedisVL for vector operations


---

## 📥 Step 4: Ingest Courses into Redis

Now we'll load our course catalog into Redis with vector embeddings for semantic search.

### Understanding the Ingestion Process

The ingestion pipeline:
1. **Load** course data from JSON
2. **Generate embeddings** for each course (title + description + department + tags)
3. **Store** in Redis with metadata for filtering
4. **Index** vectors for fast similarity search

Let's use the existing ingestion utilities:

In [11]:
import asyncio

from redis_context_course.scripts.ingest_courses import CourseIngestionPipeline

# What we're importing:
# - CourseIngestionPipeline: Handles bulk ingestion of course data
#
# What it does:
# - Loads course catalog from JSON file
# - For each course: generates embedding + stores in Redis
# - Uses CourseManager internally for storage
# - Provides progress tracking and verification
#
# Why use this:
# - Handles batch ingestion efficiently
# - Same utility used to populate reference agent
# - Includes error handling and progress reporting

# Initialize ingestion pipeline
pipeline = CourseIngestionPipeline()

print("🚀 Starting course ingestion...")
print()

# Run ingestion (clear existing data first)
success = await pipeline.run_ingestion(catalog_file=catalog_file, clear_existing=True)

if success:
    print()
    print("✅ Course ingestion complete!")

    # Verify what was ingested
    verification = pipeline.verify_ingestion()
    print(f"   Courses in Redis: {verification['courses']}")
    print(f"   Majors in Redis: {verification['majors']}")
else:
    print("❌ Ingestion failed")

🚀 Starting course ingestion...




21:54:55 httpx INFO   HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"





✅ Course ingestion complete!
   Courses in Redis: 50
   Majors in Redis: 5


### What Just Happened?

For each course, the ingestion pipeline:

1. **Created searchable content:**
   ```python
   content = f"{course.title} {course.description} {course.department} {' '.join(course.tags)}"
   ```

2. **Generated embedding vector:**
   ```python
   embedding = await embeddings.aembed_query(content)  # 1536-dim vector
   ```

3. **Stored in Redis:**
   ```python
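   redis_client.hset(f"course_idx:{course.id}", mapping={
       "course_code": "CS001",
       "title": "Introduction to Programming",
       "description": "...",
       # aembed_query returns a list of floats, so convert to float32 bytes first
       "content_vector": np.array(embedding, dtype=np.float32).tobytes()  # Binary vector
   })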
   ```

4. **Indexed for search:**
   - Redis automatically indexes the vector field
   - Enables fast k-NN (k-nearest neighbors) search
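
To illustrate the binary vector format from step 3, the sketch below packs a list of floats into float32 bytes and back using only the standard library. The helper names are hypothetical. In practice NumPy's `astype(np.float32).tobytes()` is the more common route, and `array` uses the machine's native byte order (little-endian on most hardware).

```python
from array import array


def vector_to_bytes(embedding: list[float]) -> bytes:
    """Pack a list of floats into a float32 byte string."""
    return array("f", embedding).tobytes()


def bytes_to_vector(raw: bytes) -> list[float]:
    """Unpack a float32 byte string back into a list of floats."""
    values = array("f")
    values.frombytes(raw)
    return list(values)


# Toy 3-dimensional vector; these values are exactly representable in float32
toy_embedding = [0.25, -0.5, 0.75]

raw = vector_to_bytes(toy_embedding)
restored = bytes_to_vector(raw)

print(f"{len(toy_embedding)} dimensions -> {len(raw)} bytes")
print(restored)
```

Each dimension takes 4 bytes, so a 1536-dimensional embedding occupies 6144 bytes in the hash field.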

---

## 🔍 Step 5: Semantic Search - Finding Relevant Courses

Now comes the magic: semantic search. Let's query our vector store to find relevant courses.

### Basic Semantic Search

Let's search for courses related to "machine learning".

When this is called:
```python
await course_manager.search_courses(
    query=query,
    limit=3  # top_k parameter
)
```
It is performing semantic search under the hood:
1. Generates embedding for the query using OpenAI
2. Performs vector similarity search in Redis (cosine distance)
3. Returns top-k most similar courses
4. Uses RedisVL's VectorQuery under the hood

In [12]:
# We already initialized course_manager in Step 3
# It's ready to use for semantic search

# Search for machine learning courses
query = "machine learning and artificial intelligence"
print(f"🔍 Searching for: '{query}'\n")

# Perform semantic search (returns top 3 most similar courses)
results = await course_manager.search_courses(query=query, limit=3)  # top_k parameter

print(f"✅ Found {len(results)} relevant courses:\n")

for i, course in enumerate(results, 1):
    print(f"{i}. {course.course_code}: {course.title}")
    print(f"   Department: {course.department}")
    print(f"   Difficulty: {course.difficulty_level.value}")
    print(f"   Description: {course.description[:100]}...")
    print()

🔍 Searching for: 'machine learning and artificial intelligence'



21:55:09 httpx INFO   HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"


✅ Found 3 relevant courses:

1. CS007: Machine Learning
   Department: Computer Science
   Difficulty: advanced
   Description: Introduction to machine learning algorithms and applications. Supervised and unsupervised learning, ...

2. DS012: Statistics for Data Science
   Department: Data Science
   Difficulty: intermediate
   Description: Statistical methods and probability theory for data analysis. Hypothesis testing, regression, and st...

3. DS014: Statistics for Data Science
   Department: Data Science
   Difficulty: intermediate
   Description: Statistical methods and probability theory for data analysis. Hypothesis testing, regression, and st...



### Search with Filters

We can combine semantic search with metadata filters for more precise results:

How filters work:

```python
results = await course_manager.search_courses(
    query=query,
    limit=3,
    filters=filters
)
```
 - CourseManager._build_filters() converts dict to RedisVL filter expressions
 - Uses Tag filters for categorical fields (difficulty_level, format, department)
 - Uses Num filters for numeric fields (credits, year)
 - Combines filters with AND logic
 - Applied to vector search results
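
To show what such a translation can look like, here is a small sketch that turns a filter dict into a Redis query filter string. It is a simplified, hypothetical stand-in for illustration and not the code inside `CourseManager._build_filters()`: it assumes string values map to tag clauses (`@field:{value}`), numeric values map to exact-match range clauses (`@field:[n n]`), and clauses separated by spaces are combined with AND.

```python
def build_filter_string(filters: dict) -> str:
    """Translate a flat filter dict into a Redis query filter string (AND semantics)."""
    clauses = []
    for field, value in filters.items():
        if isinstance(value, (int, float)):
            # Numeric field: an inclusive range with equal bounds is an exact match
            clauses.append(f"@{field}:[{value} {value}]")
        else:
            # Categorical field: tag clause
            clauses.append(f"@{field}:{{{value}}}")
    # "*" matches every document when no filters are given
    return " ".join(clauses) if clauses else "*"


example_filters = {"difficulty_level": "beginner", "format": "online"}
filter_string = build_filter_string(example_filters)
print(filter_string)
```

The sketch skips escaping, so tag values containing spaces or punctuation would need extra handling. RedisVL's `Tag` and `Num` filter classes take care of that for you.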


In [13]:
# Search for beginner-level machine learning courses
query = "machine learning"
filters = {"difficulty_level": "beginner", "format": "online"}

print(f"🔍 Searching for: '{query}'\n   Filters: {filters}\n")
# How filters work:
# - CourseManager._build_filters() converts dict to RedisVL filter expressions
# - Uses Tag filters for categorical fields (difficulty_level, format, department)
# - Uses Num filters for numeric fields (credits, year)
# - Combines filters with AND logic
# - Applied to vector search results
results = await course_manager.search_courses(query=query, limit=3, filters=filters)

print(f"✅ Found {len(results)} matching courses:")
for i, course in enumerate(results, 1):
    print(f"{i}. {course.course_code}: {course.title}")
    print(
        f"   Format: {course.format.value}, Difficulty: {course.difficulty_level.value}"
    )
    print()

🔍 Searching for: 'machine learning'
   Filters: {'difficulty_level': 'beginner', 'format': 'online'}



21:55:09 httpx INFO   HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"


✅ Found 3 matching courses:
1. DS020: Data Visualization
   Format: online, Difficulty: beginner

2. PSY043: Introduction to Psychology
   Format: online, Difficulty: beginner

3. PSY049: Introduction to Psychology
   Format: online, Difficulty: beginner



**💡 Key Insight:** We can combine:
- **Semantic search** (find courses about "machine learning")
- **Metadata filters** (only beginner, online courses)

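Filters narrow the candidate set, and semantic search ranks whatever remains. Note that the search returns the nearest courses among those that pass the filters even when none is a close match, which is why the beginner, online results above are only loosely related to machine learning. Used with that caveat in mind, this combination is a useful tool for building context in our RAG pipeline.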
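
One way to guard against weak matches, such as the psychology courses returned for "machine learning" above, is a minimum-similarity cutoff applied after retrieval. The sketch below is a hedged illustration only: `filter_by_similarity`, the `(course_code, score)` pairs, and the 0.3 threshold are all hypothetical, and it assumes the search step can return a cosine similarity score per result, which this notebook does not show.

```python
def filter_by_similarity(scored_results, min_similarity: float = 0.3):
    """Keep only (course_code, score) pairs whose score meets the threshold."""
    return [
        (code, score) for code, score in scored_results if score >= min_similarity
    ]


# Made-up scores for illustration; they are not taken from the run above
scored_results = [("COURSE_A", 0.62), ("COURSE_B", 0.28), ("COURSE_C", 0.17)]

kept = filter_by_similarity(scored_results)
print(kept)
```

A sensible threshold depends on the embedding model and the data, so it is worth tuning against real queries rather than fixing it up front.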

---

## 🔗 Step 6: Building the RAG Pipeline

Now let's combine everything into a complete RAG pipeline: Retrieval → Context Assembly → Generation.

### The RAG Flow

```
User Query
    ↓
1. Semantic Search (retrieve relevant courses)
    ↓
2. Context Assembly (combine system + user + retrieved context)
    ↓
3. LLM Generation (create personalized response)
```

Let's implement each step:

In [14]:
from langchain_core.messages import HumanMessage, SystemMessage
from langchain_openai import ChatOpenAI

# Initialize LLM
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0.7)

print("✅ LLM initialized (gpt-4o-mini)")

✅ LLM initialized (gpt-4o-mini)


### Step 6.1: Retrieval Function

First, let's create a function to retrieve relevant courses:

In [15]:
async def retrieve_courses(query: str, limit: int = 3, filters: dict = None):
    """
    Retrieve relevant courses using semantic search.

    Args:
        query: User's search query
        limit: Number of courses to retrieve
        filters: Optional metadata filters

    Returns:
        List of relevant courses
    """
    results = await course_manager.search_courses(
        query=query, limit=limit, filters=filters
    )
    return results


# Test retrieval
test_query = "I want to learn about data structures"
retrieved_courses = await retrieve_courses(test_query, limit=3)

print(f"🔍 Retrieved {len(retrieved_courses)} courses for: '{test_query}'")
for course in retrieved_courses:
    print(f"   - {course.course_code}: {course.title}")

21:55:09 httpx INFO   HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"


🔍 Retrieved 3 courses for: 'I want to learn about data structures'
   - CS009: Data Structures and Algorithms
   - CS001: Introduction to Programming
   - CS005: Introduction to Programming


### Step 6.2: Context Assembly Function

Now let's assemble context from multiple sources (system + user + retrieved):

In [16]:
def assemble_context(
    user_query: str, retrieved_courses: list, user_profile: dict = None
):
    """
    Assemble context from multiple sources for the LLM.

    This implements the context engineering principles from Section 1:
    - System Context: AI role and instructions
    - User Context: Student profile and preferences
    - Retrieved Context: Relevant courses from vector search
    """

    # System Context: Define the AI's role
    system_context = """You are a Redis University course advisor.

Your role:
- Help students find courses that match their interests and goals
- Provide personalized recommendations based on student profiles
- Explain course prerequisites and learning paths
- Be encouraging and supportive

Guidelines:
- Only recommend courses from the provided course list
- Consider student's difficulty level preferences
- Explain your reasoning for recommendations
- Be concise but informative
"""

    # User Context: Student profile (if provided)
    user_context = ""
    if user_profile:
        user_context = f"""
Student Profile:
- Name: {user_profile.get('name', 'Student')}
- Major: {user_profile.get('major', 'Undeclared')}
- Year: {user_profile.get('year', 'N/A')}
- Interests: {', '.join(user_profile.get('interests', []))}
- Preferred Difficulty: {user_profile.get('preferred_difficulty', 'any')}
- Preferred Format: {user_profile.get('preferred_format', 'any')}
"""

    # Retrieved Context: Relevant courses from semantic search
    retrieved_context = "\nRelevant Courses:\n"
    for i, course in enumerate(retrieved_courses, 1):
        retrieved_context += f"""
{i}. {course.course_code}: {course.title}
   Department: {course.department}
   Difficulty: {course.difficulty_level.value}
   Format: {course.format.value}
   Credits: {course.credits}
   Description: {course.description}
   Prerequisites: {len(course.prerequisites)} required
"""

    # Combine all context
    full_context = system_context
    if user_context:
        full_context += user_context
    full_context += retrieved_context

    return full_context


# Test context assembly
test_profile = {
    "name": "Sarah Chen",
    "major": "Computer Science",
    "year": "Junior",
    "interests": ["machine learning", "data science"],
    "preferred_difficulty": "intermediate",
    "preferred_format": "online",
}

assembled_context = assemble_context(
    user_query=test_query,
    retrieved_courses=retrieved_courses,
    user_profile=test_profile,
)

print("✅ Context assembled")
print(f"   Total length: {len(assembled_context)} characters")
print("   Includes: System + User + Retrieved context")

✅ Context assembled
   Total length: 1537 characters
   Includes: System + User + Retrieved context


In [17]:
print(f"Observe the assembled context: \n\n{assembled_context}")

Observe the assembled context: 

You are a Redis University course advisor.

Your role:
- Help students find courses that match their interests and goals
- Provide personalized recommendations based on student profiles
- Explain course prerequisites and learning paths
- Be encouraging and supportive

Guidelines:
- Only recommend courses from the provided course list
- Consider student's difficulty level preferences
- Explain your reasoning for recommendations
- Be concise but informative

Student Profile:
- Name: Sarah Chen
- Major: Computer Science
- Year: Junior
- Interests: machine learning, data science
- Preferred Difficulty: intermediate
- Preferred Format: online

Relevant Courses:

1. CS009: Data Structures and Algorithms
   Department: Computer Science
   Difficulty: intermediate
   Format: in_person
   Credits: 4
   Description: Study of fundamental data structures and algorithms. Arrays, linked lists, trees, graphs, sorting, and searching.
   Prerequisites: 2 required

2. CS

**🎁 Bonus:** Can you identify the different context types from Section 1 in the assembled context above?

**✅ Answer:** Yes! Looking at the assembled context above, we can identify three of the four context types from Section 1 (Conversation Context is not included yet; memory is added in Section 3):

1. **System Context** (Static)
   - The first section: "You are a Redis University course advisor..."
   - Defines the AI's role, responsibilities, and guidelines
   - Remains the same for all queries
   - Sets behavioral instructions and constraints

2. **User Context** (Dynamic, User-Specific)
   - The "Student Profile" section
   - Contains Sarah Chen's personal information: major, year, interests, preferences
   - Changes based on who is asking the question
   - Enables personalized recommendations

3. **Retrieved Context** (Dynamic, Query-Specific)
   - The "Relevant Courses" section
   - Lists the 3 courses found via semantic search for "data structures"
   - Changes based on the specific query
   - Provides the factual information the LLM needs to answer

Notice how all three work together: System Context tells the AI **how to behave**, User Context tells it **who it's helping**, and Retrieved Context provides **what information is relevant**. This is RAG in action!
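
One way to check this decomposition programmatically is to split the assembled string on its section headings. The sketch below assumes the headings `Student Profile:` and `Relevant Courses:` appear exactly as in the output above; `split_context_sections` is a hypothetical helper for illustration, not part of the notebook's utilities.

```python
def split_context_sections(context: str) -> dict:
    """Split an assembled context string into system, user, and retrieved parts.

    Assumes 'Student Profile:' and 'Relevant Courses:' mark where the user
    and retrieved sections start (as in the printed context above).
    """
    user_marker = "Student Profile:"
    retrieved_marker = "Relevant Courses:"

    retrieved_start = context.find(retrieved_marker)
    if retrieved_start == -1:
        retrieved_start = len(context)  # nothing was retrieved

    user_start = context.find(user_marker, 0, retrieved_start)
    if user_start == -1:
        user_start = retrieved_start  # no profile was supplied

    return {
        "system": context[:user_start].strip(),
        "user": context[user_start:retrieved_start].strip(),
        "retrieved": context[retrieved_start:].strip(),
    }


# Shortened stand-in for the assembled context printed above
example_context = (
    "You are a Redis University course advisor.\n\n"
    "Student Profile:\n- Name: Sarah Chen\n\n"
    "Relevant Courses:\n\n1. CS009: Data Structures and Algorithms\n"
)
sections = split_context_sections(example_context)
for name, text in sections.items():
    print(f"{name}: {len(text)} characters")
```

Measuring each part separately makes it easier to see which context type dominates the prompt as the number of retrieved courses grows.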

### Step 6.3: Generation Function

Finally, let's generate a response using the assembled context:

In [18]:
async def generate_response(user_query: str, context: str):
    """
    Generate LLM response using assembled context.

    Args:
        user_query: User's question
        context: Assembled context (system + user + retrieved)

    Returns:
        LLM response string
    """
    messages = [SystemMessage(content=context), HumanMessage(content=user_query)]

    response = await llm.ainvoke(messages)
    return response.content


# Test generation
response = await generate_response(test_query, assembled_context)

print("\n🤖 Generated Response:\n")
print(response)

21:55:14 httpx INFO   HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"



🤖 Generated Response:

Hi Sarah! It's great to hear that you're interested in learning about data structures, especially since you are a Computer Science major.

The course that fits your interest is **CS009: Data Structures and Algorithms**. Here's a bit more about it:

- **Difficulty**: Intermediate
- **Format**: In-person (please note that it may not match your preference for online)
- **Credits**: 4
- **Description**: This course covers fundamental data structures and algorithms, including arrays, linked lists, trees, graphs, sorting, and searching. 
- **Prerequisites**: There are 2 required prerequisites, which means you should check if you meet those before enrolling.

Since you're a junior and interested in machine learning and data science, understanding data structures will be crucial for optimizing algorithms and managing data efficiently.

Unfortunately, since this course is in-person, it may not fit your preferred format. If you're open to considering other online cours

### 🎯 Understanding the Generated Response

Notice how the LLM's response demonstrates effective context engineering:

**👤 Personalization from User Context:**
- Addresses Sarah by name
- References her intermediate difficulty preference
- Acknowledges her online format preference (even though the course is in-person)
- Connects to her interests (machine learning and data science)

**📚 Accuracy from Retrieved Context:**
- Recommends CS009 (which was in the retrieved courses)
- Provides correct course details (difficulty, format, credits, description)
- Mentions prerequisites accurately (2 required)

**🤖 Guidance from System Context:**
- Acts as a supportive advisor ("I'm here to help you succeed!")
- Explains reasoning for the recommendation
- Acknowledges the format mismatch honestly
- Stays within the provided course list

This is the power of RAG: the LLM generates a response that is **personalized** (User Context), **accurate** (Retrieved Context), and **helpful** (System Context). Without RAG, the LLM would either hallucinate course details or provide generic advice.
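
The claim that the response "stays within the provided course list" can be spot-checked automatically. The following is a minimal sketch, assuming course codes look like two to four capital letters followed by three digits (as in `CS009` or `MATH027`); the pattern and the helper name `ungrounded_course_codes` are illustrative assumptions, not part of the notebook.

```python
import re

# Assumption: codes are 2-4 uppercase letters plus 3 digits (CS009, MATH027, BUS032)
COURSE_CODE_PATTERN = re.compile(r"\b[A-Z]{2,4}\d{3}\b")


def ungrounded_course_codes(response_text: str, retrieved_codes: set) -> set:
    """Return course codes mentioned in the response but absent from retrieval."""
    mentioned = set(COURSE_CODE_PATTERN.findall(response_text))
    return mentioned - set(retrieved_codes)


sample_response = "I recommend **CS009: Data Structures and Algorithms**."
print(ungrounded_course_codes(sample_response, {"CS009", "CS004"}))  # set()
print(ungrounded_course_codes("Try CS999 instead.", {"CS009"}))  # {'CS999'}
```

An empty set means every course the model named was actually retrieved; a non-empty set is a signal worth logging, since it may indicate a hallucinated course.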

---

## ✨ Step 7: Complete RAG Function

Let's combine all three steps into a single, reusable RAG function:

In [19]:
async def rag_query(
    user_query: str, user_profile: dict = None, limit: int = 3, filters: dict = None
):
    """
    Complete RAG pipeline: Retrieve → Assemble → Generate

    Args:
        user_query: User's question
        user_profile: Optional student profile
        limit: Number of courses to retrieve
        filters: Optional metadata filters

    Returns:
        Tuple of (LLM response string, list of retrieved courses)
    """
    # Step 1: Retrieve relevant courses
    retrieved_courses = await retrieve_courses(user_query, limit, filters)

    # Step 2: Assemble context
    context = assemble_context(user_query, retrieved_courses, user_profile)

    # Step 3: Generate response
    response = await generate_response(user_query, context)

    return response, retrieved_courses


# Test the complete RAG pipeline
print("=" * 60)
print("COMPLETE RAG PIPELINE TEST")
print("=" * 60)
print()

query = "I'm interested in learning about databases and data management"
profile = {
    "name": "Alex Johnson",
    "major": "Data Science",
    "year": "Sophomore",
    "interests": ["databases", "data analysis", "SQL"],
    "preferred_difficulty": "intermediate",
    "preferred_format": "hybrid",
}

print(f"Query: {query}")
print()
print(f"Student: {profile['name']} ({profile['major']}, {profile['year']})")
print()

response, courses = await rag_query(query, profile, limit=3)

print("Retrieved Courses:")
for i, course in enumerate(courses, 1):
    print(f"   {i}. {course.course_code}: {course.title}")
print()

print("AI Response:")
print(response)

COMPLETE RAG PIPELINE TEST

Query: I'm interested in learning about databases and data management

Student: Alex Johnson (Data Science, Sophomore)



21:55:14 httpx INFO   HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"


21:55:19 httpx INFO   HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


Retrieved Courses:
   1. CS004: Database Systems
   2. CS009: Data Structures and Algorithms
   3. CS007: Machine Learning

AI Response:
Hi Alex! It's great to hear that you're interested in databases and data management, especially considering your major in Data Science. Based on your profile, I recommend the following course:

**CS004: Database Systems**
- **Difficulty**: Intermediate
- **Format**: Online
- **Credits**: 3
- **Description**: This course covers the design and implementation of database systems, including SQL, normalization, transactions, and database administration.
- **Prerequisites**: None

This course aligns well with your interests in databases and SQL, and being at an intermediate level, it should match your current skill set nicely. Although it's fully online, it will provide you with a solid foundation in database management which is crucial for your field.

If you're open to considering more advanced topics in the future, you might also look into **CS007: Machi

### 🎯 Why This Complete RAG Function Matters

The `rag_query()` function encapsulates the entire RAG pipeline in a single, reusable interface. This is important because:

**1. Simplicity:** One function call handles retrieval → assembly → generation
- No need to manually orchestrate the three steps
- Clean API for building applications

**2. Consistency:** Every query follows the same pattern
- Ensures all three context types are always included
- Reduces errors from missing context

**3. Flexibility:** Easy to customize behavior
- Adjust `limit` for more/fewer retrieved courses
- Add/remove user profile information
- Modify filters for specific use cases

**4. Production-Ready:** This pattern scales to real applications
- In Section 3, we'll add memory (conversation history)
- In Section 4, we'll add tools (course enrollment, prerequisites checking)
- The core RAG pattern remains the same

This is the foundation you'll build on throughout the rest of the course.

---

## 🧪 Step 8: Try Different Queries

Let's test our RAG system with various queries to see how it handles different scenarios:

In [20]:
# Test 1: Beginner looking for programming courses
print("=" * 60)
print("TEST 1: Beginner Programming")
print("=" * 60)
print()

query1 = "I'm new to programming and want to start learning"
profile1 = {
    "name": "Maria Garcia",
    "major": "Undeclared",
    "year": "Freshman",
    "interests": ["programming", "technology"],
    "preferred_difficulty": "beginner",
    "preferred_format": "online",
}

response1, courses1 = await rag_query(query1, profile1, limit=2)
print(f"\nQuery: {query1}\n")
print("\nAI Response:\n")
print(response1)

TEST 1: Beginner Programming



21:55:20 httpx INFO   HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"


21:55:27 httpx INFO   HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"



Query: I'm new to programming and want to start learning


AI Response:

Hi Maria! It's great to hear that you're interested in starting your programming journey. Since you're a beginner and looking for online courses, I recommend considering the following:

1. **CS001: Introduction to Programming**
   - **Department:** Computer Science
   - **Difficulty:** Beginner
   - **Format:** Hybrid (note: while it's not fully online, it may still offer some online components)
   - **Credits:** 3
   - **Description:** This course covers fundamental programming concepts using Python, including variables, control structures, functions, and basic data structures.
   - **Prerequisites:** None required.

2. **CS005: Introduction to Programming**
   - **Department:** Computer Science
   - **Difficulty:** Beginner
   - **Format:** Hybrid (similar note on format)
   - **Credits:** 3
   - **Description:** This course also focuses on fundamental programming concepts using Python, covering the same materi

In [21]:
# Test 2: Advanced student looking for specialized courses
print("=" * 60)
print("TEST 2: Advanced Machine Learning")
print("=" * 60)
print()

query2 = "I want advanced courses in machine learning and AI"
profile2 = {
    "name": "David Kim",
    "major": "Computer Science",
    "year": "Senior",
    "interests": ["machine learning", "AI", "research"],
    "preferred_difficulty": "advanced",
    "preferred_format": "in-person",
}

response2, courses2 = await rag_query(query2, profile2, limit=2)
print(f"\nQuery: {query2}\n")
print("\nAI Response:\n")
print(response2)

TEST 2: Advanced Machine Learning



21:55:27 httpx INFO   HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"


21:55:31 httpx INFO   HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"



Query: I want advanced courses in machine learning and AI


AI Response:

Hi David! Based on your interests in machine learning and AI, I recommend taking CS007: Machine Learning. 

Here's why it aligns with your profile:

- **Difficulty Level**: CS007 is classified as advanced, matching your preference for more challenging coursework.
- **Content**: The course covers essential topics such as supervised and unsupervised learning and neural networks, which are crucial in both machine learning and AI research.
- **Format**: Although it is offered in a hybrid format, it still provides a strong foundation for advanced concepts in the field.

Unfortunately, the available courses do not specifically cover AI or advanced machine learning beyond CS007. However, CS007 is a solid choice to deepen your knowledge in machine learning, which is integral to AI.

If you have any other questions or need further assistance, feel free to ask!


In [22]:
# Test 3: Business student looking for relevant courses
print("=" * 60)
print("TEST 3: Business Analytics")
print("=" * 60)
print()

query3 = "What courses can help me with business analytics and decision making?"
profile3 = {
    "name": "Jennifer Lee",
    "major": "Business Administration",
    "year": "Junior",
    "interests": ["analytics", "management", "strategy"],
    "preferred_difficulty": "intermediate",
    "preferred_format": "hybrid",
}

response3, courses3 = await rag_query(query3, profile3, limit=2)
print(f"\nQuery: {query3}\n")
print()
print("\nAI Response:\n")
print(response3)

TEST 3: Business Analytics



21:55:31 httpx INFO   HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"


21:55:35 httpx INFO   HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"



Query: What courses can help me with business analytics and decision making?



AI Response:

Hi Jennifer! It's great to see your interest in business analytics and decision making. However, based on the course list provided, I don't see any specific courses that directly focus on business analytics or decision-making strategies.

The courses listed, such as BUS032 and BUS034, focus on marketing strategy, which may touch upon aspects of analytics in terms of market analysis and consumer behavior, but they do not specifically cater to business analytics or decision-making.

Given your major in Business Administration and your interests, I recommend considering courses in data analytics or management strategy if they are available in the future. For now, if you are open to exploring marketing strategy, either BUS032 or BUS034 could still provide valuable insights into strategic decision-making processes.

If you have any other specific interests or questions, feel free to ask! I'm here 

---

## 🎓 Key Takeaways

### What You've Learned

**1. RAG Fundamentals**
- RAG dynamically retrieves relevant information instead of hardcoding knowledge
- Vector embeddings enable semantic search (meaning-based, not keyword-based)
- RAG solves the scalability and token efficiency problems of static context

**2. The RAG Pipeline**
```
User Query → Semantic Search → Context Assembly → LLM Generation
```
- **Retrieval:** Find relevant documents using vector similarity
- **Assembly:** Combine system + user + retrieved context
- **Generation:** LLM creates personalized response with full context

**3. Context Engineering in Practice**
- **System Context:** AI role and instructions (static)
- **User Context:** Student profile and preferences (dynamic, user-specific)
- **Retrieved Context:** Relevant courses from vector search (dynamic, query-specific)
- **Integration:** All three context types work together

**4. Technical Implementation with Reference Agent Utilities**
- **redis_config**: Production-ready Redis configuration (RedisVL + LangChain)
  - Manages connections, embeddings, vector index, checkpointer
  - Same configuration used in reference agent
- **CourseManager**: Handles all course operations
  - Uses RedisVL's VectorQuery for semantic search
  - Supports metadata filters with Tag and Num classes
  - Automatically generates embeddings and stores courses
- **CourseIngestionPipeline**: Bulk data ingestion
  - Loads JSON, generates embeddings, stores in Redis
  - Progress tracking and verification
- **Benefits**: Focus on RAG concepts, not Redis implementation details

### Best Practices

**Retrieval:**
- Retrieve only what's needed (top-k results)
- Use metadata filters to narrow results
- Balance between too few (missing info) and too many (wasting tokens) results
- **💡 Research Insight:** Context Rot research shows that distractors (similar-but-wrong information) have amplified negative impact in long contexts. Precision in retrieval matters more than recall. ([Context Rot paper](https://research.trychroma.com/context-rot))
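
To make the precision point concrete, here is a minimal sketch of trimming retrieval results before they reach the prompt. It assumes retrieval yields `(item, distance)` pairs where a smaller distance means a closer match; the distances, the `0.4` cutoff, and the helper `select_precise_results` are all hypothetical, and the notebook's `CourseManager` may expose scores differently.

```python
def select_precise_results(scored_results, limit=3, max_distance=0.5):
    """Keep at most `limit` results whose distance is within `max_distance`.

    scored_results: iterable of (item, distance) pairs, smaller = more similar.
    """
    ranked = sorted(scored_results, key=lambda pair: pair[1])
    return [item for item, distance in ranked if distance <= max_distance][:limit]


# Hypothetical distances, for illustration only
candidates = [("CS007", 0.18), ("MATH026", 0.46), ("MATH022", 0.47), ("BUS032", 0.81)]
print(select_precise_results(candidates, limit=3, max_distance=0.4))  # ['CS007']
```

The design choice is to let the cutoff return fewer than `limit` results rather than pad the context with near-misses that could act as distractors. A sensible threshold depends on the embedding model and distance metric, so it would need to be tuned on real queries.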

**Context Assembly:**
- Structure context clearly (system → user → retrieved)
- Include only relevant metadata
- Keep descriptions concise but informative

**Generation:**
- Use appropriate temperature (0.7 for creative, 0.0 for factual)
- Provide clear instructions in system context
- Let the LLM explain its reasoning

---

## Part 4: Context Quality Matters

### Why Quality Engineering is Essential

You've built a working RAG system - congratulations! But there's a critical question: **What makes context "good"?**

In the next notebook, you'll learn that context engineering is real engineering - it requires the same rigor, analysis, and deliberate decision-making as any other engineering discipline. Let's preview why this matters with a concrete example.

---

### Example: The Impact of Poor vs. Well-Engineered Context

Let's see what happens when we don't engineer our context properly.

**Scenario:** A student asks about machine learning courses.

In [23]:
# Poor context: Raw JSON dump (what we might do naively)
# Get first 10 courses using a broad search
poor_context_courses = await course_manager.search_courses("course", limit=10)

poor_context = json.dumps(
    [
        {
            "id": c.id,
            "course_code": c.course_code,
            "title": c.title,
            "description": c.description,
            "department": c.department,
            "credits": c.credits,
            "difficulty_level": c.difficulty_level.value,
            "format": c.format.value,
            "instructor": c.instructor,
            "prerequisites": (
                [p.course_code for p in c.prerequisites] if c.prerequisites else []
            ),
        }
        for c in poor_context_courses
    ],
    indent=2,
)

poor_tokens = count_tokens(poor_context)

print(
    f"""‚ùå POOR CONTEXT (Naive Approach):
‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ
Courses: {len(poor_context_courses)} (unfiltered - may not be relevant)
Tokens: {poor_tokens:,}
Format: Raw JSON with all fields (including internal IDs)

Sample:
{poor_context[:300]}...
"""
)

21:55:35 httpx INFO   HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"


‚ùå POOR CONTEXT (Naive Approach):
‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ
Courses: 10 (unfiltered - may not be relevant)
Tokens: 1,257
Format: Raw JSON with all fields (including internal IDs)

Sample:
[
  {
    "id": "course_catalog:01K98Z0MEGD61VQMAV074C6NHD",
    "course_code": "MATH027",
    "title": "Calculus I",
    "description": "Differential calculus including limits, derivatives, and applications. Foundation for advanced mathematics.",
    "department": "Mathematics",
    "credits": 4,
 ...



Now let's compare with well-engineered context using our RAG system:

In [24]:
# Well-engineered context: Filtered + Optimized
query = "What machine learning courses are available?"

# Use our RAG system to get relevant courses
relevant_courses = await course_manager.search_courses(query, limit=3)

# Transform to LLM-friendly format (not raw JSON)
well_engineered_context = "\n\n".join(
    [
        f"""{course.course_code}: {course.title} ({course.difficulty_level.value})
Description: {course.description}
Department: {course.department} | Credits: {course.credits} | Format: {course.format.value}
Prerequisites: {', '.join([p.course_code for p in course.prerequisites]) if course.prerequisites else 'None'}"""
        for course in relevant_courses
    ]
)

good_tokens = count_tokens(well_engineered_context)

print(
    f"""‚úÖ WELL-ENGINEERED CONTEXT (RAG + Optimization):
‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ
Courses: {len(relevant_courses)} (filtered by semantic relevance)
Tokens: {good_tokens:,}
Format: LLM-optimized text (no internal fields, clean formatting)

Context:
{well_engineered_context}

‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ
Token Reduction: {poor_tokens - good_tokens:,} tokens ({((poor_tokens - good_tokens) / poor_tokens * 100):.1f}% reduction)
Cost Savings: ${((poor_tokens - good_tokens) / 1_000_000) * 2.50:.4f} per request
"""
)

21:55:36 httpx INFO   HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"


‚úÖ WELL-ENGINEERED CONTEXT (RAG + Optimization):
‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ
Courses: 3 (filtered by semantic relevance)
Tokens: 147
Format: LLM-optimized text (no internal fields, clean formatting)

Context:
CS007: Machine Learning (advanced)
Description: Introduction to machine learning algorithms and applications. Supervised and unsupervised learning, neural networks.
Department: Computer Science | Credits: 4 | Format: hybrid
Prerequisites: None

MATH026: Linear Algebra (intermediate)
Description: Vector spaces, matrices, eigenvalues, and linear transformations. Essential for data science and engineering.
Department: Mathematics | Credits: 3 | Format: in_person
Prerequisites: None

MATH022: Linear Algebra (intermediate)
Description: Vector spaces, matrices, eigenvalues, and linear tra

### The Difference in LLM Responses

Let's see how context quality affects the actual responses:

In [25]:
# Test with poor context
messages_poor = [
    SystemMessage(
        content=f"""You are a Redis University course advisor.

Available Courses:
{poor_context}

Help students find relevant courses."""
    ),
    HumanMessage(content=query),
]

response_poor = llm.invoke(messages_poor)

print(
    f"""‚ùå RESPONSE WITH POOR CONTEXT:
‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ
{response_poor.content}
"""
)

21:55:42 httpx INFO   HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


‚ùå RESPONSE WITH POOR CONTEXT:
‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ
Currently, there are no specific machine learning courses listed in the available course catalog. If you're interested in related topics, you might want to consider courses such as "Statistics for Data Science," which covers statistical methods and probability theory that are foundational for understanding machine learning concepts. 

Here are the relevant options:

1. **Statistics for Data Science (Hybrid)**
   - Instructor: Chris Flores
   - Credits: 4
   - Description: Statistical methods and probability theory for data analysis. Hypothesis testing, regression, and statistical inference.

2. **Statistics for Data Science (In-person)**
   - Instructor: Rhonda Rodriguez
   - Credits: 4
   - Description: Statistical methods and p

In [26]:
# Test with well-engineered context
messages_good = [
    SystemMessage(
        content=f"""You are a Redis University course advisor.

Relevant Courses:
{well_engineered_context}

Help students find the best course for their needs."""
    ),
    HumanMessage(content=query),
]

response_good = llm.invoke(messages_good)

print(
    f"""‚úÖ RESPONSE WITH WELL-ENGINEERED CONTEXT:
‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ
{response_good.content}
"""
)

21:55:46 httpx INFO   HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


‚úÖ RESPONSE WITH WELL-ENGINEERED CONTEXT:
‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ
The available machine learning course is:

**CS007: Machine Learning (advanced)**
- **Description:** Introduction to machine learning algorithms and applications. Covers supervised and unsupervised learning, and neural networks.
- **Department:** Computer Science
- **Credits:** 4
- **Format:** Hybrid
- **Prerequisites:** None

This course is suitable for students interested in gaining a deeper understanding of machine learning concepts and applications.



### Key Takeaways: Why Context Engineering Matters

From this example, you can see that well-engineered context:

1. **Reduces Token Usage** - About 88% fewer tokens in this example (1,257 vs. 147) through filtering and formatting
2. **Improves Relevance** - Semantic search finds the right courses
3. **Enhances Response Quality** - LLM can focus on relevant information
4. **Saves Money** - Fewer tokens = lower API costs
5. **Scales Better** - Works with thousands of courses, not just 10
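
The token and cost figures follow from simple arithmetic on the two counts printed above (1,257 and 147 tokens). The sketch below reproduces that calculation; the `$2.50` per million input tokens is the rate used in the comparison cell and should be treated as an assumed price, and `token_savings` is a hypothetical helper.

```python
def token_savings(before: int, after: int, usd_per_million_tokens: float = 2.50):
    """Return (tokens saved, percent reduction, USD saved per request)."""
    if before <= 0:
        raise ValueError("before must be positive")
    saved = before - after
    percent = saved / before * 100
    usd = saved / 1_000_000 * usd_per_million_tokens
    return saved, percent, usd


saved, percent, usd = token_savings(before=1257, after=147)
print(f"{saved:,} tokens saved ({percent:.1f}% reduction), ${usd:.4f} per request")
# 1,110 tokens saved (88.3% reduction), $0.0028 per request
```

The per-request saving looks small, but at one million requests the same difference would amount to roughly $2,775 under this assumed price.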

**The Engineering Mindset:**
- Context is data that requires engineering discipline
- Raw data ≠ Good context
- Systematic transformation: Extract → Clean → Transform → Optimize → Store
- Quality metrics: Relevance, Completeness, Efficiency, Accuracy
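
As a rough illustration of the Clean and Transform steps, the sketch below turns one raw record into the compact text format used in the well-engineered example. It is a simplified stand-in under stated assumptions: the record is a plain dict, `id` and `instructor` are treated as fields the LLM does not need, the `id` and `instructor` values are placeholders, and the helper names are hypothetical.

```python
# Assumption: these fields are omitted from the LLM-facing format
INTERNAL_FIELDS = {"id", "instructor"}


def clean_record(record: dict) -> dict:
    """Clean: drop internal fields and empty values."""
    return {
        key: value
        for key, value in record.items()
        if key not in INTERNAL_FIELDS and value not in (None, "", [])
    }


def to_context_text(record: dict) -> str:
    """Transform: render a cleaned record as compact, LLM-friendly text."""
    prerequisites = ", ".join(record.get("prerequisites", [])) or "None"
    return (
        f"{record['course_code']}: {record['title']} ({record['difficulty_level']})\n"
        f"Description: {record['description']}\n"
        f"Prerequisites: {prerequisites}"
    )


raw = {
    "id": "course_catalog:EXAMPLE",  # placeholder internal key
    "course_code": "CS007",
    "title": "Machine Learning",
    "description": "Introduction to machine learning algorithms and applications.",
    "difficulty_level": "advanced",
    "instructor": "Example Instructor",  # placeholder value
    "prerequisites": [],
}
print(to_context_text(clean_record(raw)))
```

Keeping the two steps as separate functions makes each one easy to test and to swap out, for example when a different use case does need the instructor field.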

---

### What You'll Learn in the Next Notebook

In **Notebook 2: Engineering Context for Production**, you'll dive deep into:

**Data Engineering for Context:**
- Systematic transformation pipeline (Extract → Clean → Transform → Optimize → Store)
- Three engineering approaches: RAG, Structured Views, Hybrid
- When to use each approach based on your requirements

**Chunking Strategies:**
- When does your data need chunking? (Critical first question)
- Four different chunking strategies with LangChain integration
- How to choose based on your data characteristics

**Production Pipelines:**
- Three pipeline architectures (Request-Time, Batch, Event-Driven)
- Building production-ready context preparation workflows
- Quality optimization and testing

**You'll learn to engineer context with the same rigor as any other data engineering problem.**

---

## 🚀 What's Next?

### 📊 Section 2, Notebook 2: Engineering Context for Production

Now that you understand RAG fundamentals and why context quality matters, the next notebook teaches you to engineer context with production-level rigor:
- Master data engineering workflows for context preparation
- Learn chunking strategies and when to use them
- Build production-ready context pipelines
- Optimize context quality with systematic approaches

### 🧠 Section 3: Memory Systems for Context Engineering

In this section, you built a RAG system that retrieves relevant information for each query. But there's a problem: **it doesn't remember previous conversations**.

In Section 3, you'll add memory to your RAG system:
- **Working Memory:** Track conversation history within a session
- **Long-term Memory:** Remember user preferences across sessions
- **LangGraph Integration:** Manage stateful workflows with checkpointing
- **Redis Agent Memory Server:** Automatic memory extraction and retrieval

### 🤖 Section 4: Tool Use and Agents

After adding memory, you'll transform your RAG system into a full agent:
- **Tool Calling:** Let the AI use functions (search, enroll, check prerequisites)
- **LangGraph State Management:** Orchestrate complex multi-step workflows
- **Agent Reasoning:** Plan and execute multi-step tasks
- **Production Patterns:** Error handling, retries, and monitoring

### The Journey

```
Section 1: Context Engineering Fundamentals
    ↓
Section 2, NB1: RAG Fundamentals ← You are here
    ↓
Section 2, NB2: Engineering Context for Production ← Next
    ↓
Section 3: Memory Systems for Context Engineering
    ↓
Section 4: Tool Use and Agents (Complete System)
```

---

## 💪 Practice Exercises

Try these exercises to deepen your understanding:

**Exercise 1: Custom Filters**
- Modify the RAG query to filter by specific departments
- Try combining multiple filters (difficulty + format + department)

**Exercise 2: Adjust Retrieval**
- Experiment with different `limit` values (1, 3, 5, 10)
- Observe how response quality changes with more/fewer retrieved courses

**Exercise 3: Context Optimization**
- Modify the `assemble_context` function to include more/less detail
- Measure token usage and response quality trade-offs

**Exercise 4: Different Domains**
- Generate courses for a different domain (e.g., healthcare, finance)
- Ingest and test RAG with your custom data

**Exercise 5: Evaluation**
- Create test queries with expected results
- Measure retrieval accuracy (are the right courses retrieved?)
- Measure generation quality (are responses helpful and accurate?)
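
As a starting point for Exercise 5, retrieval accuracy can be scored with recall@k: the fraction of expected courses that appear among the top k retrieved ones. The sketch below is one possible setup, not a prescribed method. The retrieved lists mirror outputs shown earlier in this notebook, while the expected sets (including `BUS040`) are hypothetical labels you would replace with your own. In practice, the retrieved codes could come from `[c.course_code for c in courses]` on the second value returned by `rag_query`.

```python
def recall_at_k(retrieved: list, expected: set, k: int) -> float:
    """Fraction of expected items that appear in the top-k retrieved items."""
    if not expected:
        return 1.0  # nothing was expected, so nothing was missed
    top_k = set(retrieved[:k])
    return len(top_k & expected) / len(expected)


def evaluate_retrieval(test_cases: list, k: int = 3) -> float:
    """Average recall@k over (retrieved_codes, expected_codes) test cases."""
    scores = [recall_at_k(retrieved, expected, k) for retrieved, expected in test_cases]
    return sum(scores) / len(scores)


test_cases = [
    (["CS004", "CS009", "CS007"], {"CS004"}),      # databases query
    (["CS007", "MATH026", "MATH022"], {"CS007"}),  # machine learning query
    (["BUS032", "BUS034"], {"BUS040"}),            # hypothetical expected code: a deliberate miss
]
print(f"Mean recall@3: {evaluate_retrieval(test_cases):.2f}")  # Mean recall@3: 0.67
```

Generation quality is harder to score automatically; a small set of hand-reviewed responses is a reasonable first step before trying anything more elaborate.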

---

## 📝 Summary

You've built a complete RAG system that:
- ✅ Generates and ingests course data with vector embeddings
- ✅ Performs semantic search to find relevant courses
- ✅ Assembles context from multiple sources (system + user + retrieved)
- ✅ Generates personalized responses using LLMs
- ✅ Handles different query types and user profiles

This RAG system is the foundation for the advanced topics in Sections 3 and 4. You'll build on this exact code to add memory, tools, and full agent capabilities.

**Great work!** You've mastered Retrieved Context and built a production-ready RAG pipeline. 🎉

---

## üìö Additional Resources

### **RAG and Vector Search**
- [Retrieval-Augmented Generation Paper](https://arxiv.org/abs/2005.11401) - Original RAG paper by Facebook AI
- [Redis Vector Similarity Search](https://redis.io/docs/stack/search/reference/vectors/) - Official Redis VSS documentation
- [RedisVL Documentation](https://redisvl.com/) - Redis Vector Library for Python
- [LangChain RAG Tutorial](https://python.langchain.com/docs/tutorials/rag/) - Building RAG applications

### **Embeddings and Semantic Search**
- [OpenAI Embeddings Guide](https://platform.openai.com/docs/guides/embeddings) - Understanding text embeddings
- [Sentence Transformers](https://www.sbert.net/) - Open-source embedding models
- [HNSW Algorithm](https://arxiv.org/abs/1603.09320) - Hierarchical Navigable Small World graphs

### **LangChain and Redis Integration**
- [LangChain Documentation](https://python.langchain.com/docs/get_started/introduction) - Framework overview
- [LangChain Redis Integration](https://python.langchain.com/docs/integrations/vectorstores/redis/) - Using Redis with LangChain
- [Redis Python Client](https://redis-py.readthedocs.io/) - redis-py documentation

### **Advanced RAG Techniques**
- [Advanced RAG Patterns](https://blog.langchain.dev/deconstructing-rag/) - LangChain blog on RAG optimization
- [Advanced Search with RedisVL](https://docs.redisvl.com/en/latest/user_guide/11_advanced_queries.html) - Vector, Hybrid, Text, and Keyword Search
- [RAG Evaluation](https://arxiv.org/abs/2309.15217) - Measuring RAG system performance
