![Redis](https://redis.io/wp-content/uploads/2024/04/Logotype.svg?auto=webp&quality=85,75&width=120)

# Module 3: RAG Essentials

**⏱️ Time:** 55 minutes

## 🎯 Learning Objectives

By the end of this module, you will:

1. **Understand** how vector embeddings enable semantic search
2. **Build** a complete RAG pipeline with Redis Vector Search
3. **Apply** context transformation techniques
4. **Use** progressive disclosure (summaries first, details on-demand)

---

## 📚 Part 1: Vector Embeddings (15 min)

### What Are Embeddings?

**Embeddings** convert text into numerical vectors that capture semantic meaning.

```
"machine learning" → [0.12, -0.34, 0.56, ..., 0.89]  (1536 dimensions)
"AI algorithms"    → [0.11, -0.32, 0.58, ..., 0.87]  (similar vector!)
"cooking recipes"  → [-0.45, 0.67, -0.12, ..., 0.23] (different vector)
```

### Why Embeddings Matter for RAG

**Keyword search fails:**
- Query: "AI courses" → Misses "Machine Learning 101" (no "AI" in title)

**Semantic search succeeds:**
- Query: "AI courses" → Finds "Machine Learning 101" (semantically similar)
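
The keyword failure is easy to reproduce. The following is a minimal sketch, assuming a hypothetical `keyword_search` helper and a made-up list of titles (neither is part of the course code); it matches a title when any query word appears in it as a whole word:

```python
# Hypothetical catalog titles, for illustration only.
titles = ["Machine Learning 101", "Intro to AI Ethics", "Cooking Basics"]

def keyword_search(query: str, titles: list[str]) -> list[str]:
    """Return titles that contain at least one query word (case-insensitive)."""
    terms = query.lower().split()
    matches = []
    for title in titles:
        words = title.lower().split()
        if any(term in words for term in terms):
            matches.append(title)
    return matches

results = keyword_search("AI courses", titles)
print(results)
# "Machine Learning 101" is relevant to the query but shares no word with it,
# so a pure keyword match cannot return it.
print("Machine Learning 101" in results)
```

Embedding-based search avoids this miss because it compares meaning vectors rather than literal words.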

In [1]:
# Setup
import os
import sys
import json
from pathlib import Path

repo_root = Path.cwd().parent
src_path = repo_root / "src"
if str(src_path) not in sys.path:
    sys.path.insert(0, str(src_path))

# Load environment variables from .env file
from dotenv import load_dotenv
load_dotenv()  # Try current dir first
load_dotenv(repo_root / ".env")  # Then try parent

# Check for required variables
REDIS_URL = os.getenv("REDIS_URL", "redis://localhost:6379")
if not os.getenv("OPENAI_API_KEY"):
    print("⚠️  Warning: OPENAI_API_KEY not set. Set it in your environment or .env file.")
else:
    print("✅ Setup complete!")
    print(f"   REDIS_URL: {REDIS_URL}")
    print("   OPENAI_API_KEY: ✓ Set")



In [2]:
# Generate embeddings with OpenAI
import numpy as np

# Check if OpenAI API key is available
DEMO_MODE = not os.getenv("OPENAI_API_KEY")

if DEMO_MODE:
    print("📘 DEMO MODE: Using pre-computed embeddings (set OPENAI_API_KEY for live mode)")
    # Pre-computed example embeddings: 10 leading values each, zero-padded to 1536 dimensions
    embeddings = [
        [0.012, -0.034, 0.056, 0.078, -0.023, 0.045, -0.067, 0.089, 0.012, -0.045] + [0.0] * 1526,
        [0.011, -0.032, 0.058, 0.075, -0.021, 0.043, -0.065, 0.087, 0.010, -0.043] + [0.0] * 1526,
        [-0.045, 0.067, -0.012, 0.034, 0.056, -0.078, 0.023, -0.089, 0.045, 0.067] + [0.0] * 1526
    ]
else:
    from openai import OpenAI
    client = OpenAI()
    
    def get_embedding(text: str) -> list[float]:
        """Generate embedding for text using OpenAI ada-002."""
        response = client.embeddings.create(
            model="text-embedding-ada-002",
            input=text
        )
        return response.data[0].embedding
    
    texts = [
        "machine learning algorithms",
        "artificial intelligence courses",
        "cooking recipes for beginners"
    ]
    embeddings = [get_embedding(t) for t in texts]

print(f"Embedding dimensions: {len(embeddings[0])}")
print(f"First 5 values of 'machine learning': {[round(x, 3) for x in embeddings[0][:5]]}")

📘 DEMO MODE: Using pre-computed embeddings (set OPENAI_API_KEY for live mode)
Embedding dimensions: 1536
First 5 values of 'machine learning': [0.012, -0.034, 0.056, 0.078, -0.023]


In [3]:
# Cosine similarity - how similar are two vectors?
def cosine_similarity(a: list, b: list) -> float:
    """Calculate cosine similarity between two vectors."""
    a, b = np.array(a), np.array(b)
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

print("Similarity Scores:")
print(f"  'machine learning' ↔ 'AI courses': {cosine_similarity(embeddings[0], embeddings[1]):.4f}")
print(f"  'machine learning' ↔ 'cooking':    {cosine_similarity(embeddings[0], embeddings[2]):.4f}")
print(f"  'AI courses' ↔ 'cooking':          {cosine_similarity(embeddings[1], embeddings[2]):.4f}")

Similarity Scores:
  'machine learning' ↔ 'AI courses': 0.9996
  'machine learning' ↔ 'cooking':    -0.5908
  'AI courses' ↔ 'cooking':          -0.5870
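
Pairwise scores like these become a search once documents are ranked against a query vector. The sketch below is an illustration only: it uses toy 3-dimensional vectors in place of real 1536-dimensional embeddings, and the helper names `cosine` and `top_k` are hypothetical, chosen so they do not shadow the notebook's `cosine_similarity`.

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity using only the standard library."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec: list[float], doc_vecs: list[list[float]], k: int = 2) -> list[tuple[int, float]]:
    """Return (index, score) pairs for the k documents most similar to the query."""
    scored = [(i, cosine(query_vec, vec)) for i, vec in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Toy vectors (assumed values, not real embeddings).
doc_vecs = [
    [0.9, 0.1, 0.0],  # stands in for "machine learning algorithms"
    [0.8, 0.2, 0.1],  # stands in for "artificial intelligence courses"
    [0.0, 0.1, 0.9],  # stands in for "cooking recipes for beginners"
]
query_vec = [1.0, 0.0, 0.0]  # stands in for the embedded query

for idx, score in top_k(query_vec, doc_vecs):
    print(idx, round(score, 3))
```

A vector database performs the same ranking, typically with an index so that it does not have to score every document.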


### 💡 Recall: Data Engineering Decisions (from Module 1)

Remember our data pipeline: `Raw Data → Extract → Clean → Transform → Optimize → Store`

For our course catalog:
- ✅ Already small (~60-200 tokens per course)
- ✅ Natural boundaries (each course is a unit)
- ✅ No chunking needed!

We focus on **transformation** (JSON → text) and **progressive disclosure** (summaries + details).

---

## 📚 Part 2: RAG Pipeline with Redis (20 min)

### The RAG Pipeline

```
User Query → Embed → Search Redis → Retrieve Docs → Assemble Context → Generate Response
```
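
To make the stages concrete, here is a minimal in-memory sketch of the flow up to context assembly. Everything in it is an assumption for illustration: `embed` is a toy word-count stand-in for a real embedding model (so it is not actually semantic), `INDEX` is a plain list standing in for the Redis index, and the course strings are made up. The final generation step is left out.

```python
import math

# Toy vocabulary for the stand-in embedding (assumption, not from the course code).
VOCAB = ["machine", "learning", "vision", "image", "algebra", "matrix"]

def embed(text: str) -> list[float]:
    """Embed stage: count vocabulary words. A real pipeline calls an embedding model here."""
    words = text.lower().split()
    return [float(words.count(term)) for term in VOCAB]

def similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity, returning 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# In-memory stand-in for the vector index: (text, vector) pairs.
DOCS = [
    "CS002: Machine Learning Fundamentals",
    "CS004: Computer Vision and image processing",
    "MATH001: Linear Algebra and matrix operations",
]
INDEX = [(doc, embed(doc)) for doc in DOCS]

def search(query: str, k: int = 2) -> list[str]:
    """Search and retrieve stages: rank stored documents against the query vector."""
    query_vec = embed(query)
    ranked = sorted(INDEX, key=lambda item: similarity(query_vec, item[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]

def assemble(query: str, docs: list[str]) -> str:
    """Assemble stage: combine retrieved documents and the question into one prompt."""
    context = "\n".join(f"- {doc}" for doc in docs)
    return f"Available Courses:\n{context}\n\nStudent Question: {query}"

question = "machine learning courses"
prompt = assemble(question, search(question, k=1))
print(prompt)
```

Swapping `embed` for a model call and `INDEX` for a vector store changes the quality of retrieval but not the shape of the pipeline.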

### Using HierarchicalCourseManager

Our course manager implements **progressive disclosure**:
- **Summaries**: Lightweight overview (~60 tokens each)
- **Details**: Full syllabus and assignments (~200+ tokens each)

This keeps the context small: the LLM sees every match as a summary, and the token cost of full details is paid only for the best matches.

In [4]:
# Load sample course data for demonstration
# In production, this data comes from Redis Vector Search

# Load hierarchical course data from JSON
data_path = repo_root / "src" / "redis_context_course" / "data" / "hierarchical" / "hierarchical_courses.json"

if data_path.exists():
    with open(data_path) as f:
        course_data = json.load(f)
    courses = course_data.get("courses", [])
    print(f"✅ Loaded {len(courses)} courses from {data_path.name}")
else:
    # Sample data if file not found
    courses = [
        {"summary": {"course_code": "CS002", "title": "Machine Learning Fundamentals", 
                     "difficulty_level": "beginner", "credits": 3,
                     "short_description": "Introduction to ML algorithms and applications"}},
        {"summary": {"course_code": "CS006", "title": "Deep Learning", 
                     "difficulty_level": "advanced", "credits": 4,
                     "short_description": "Neural networks, CNNs, RNNs, transformers"}},
    ]
    print("📘 Using sample course data")

✅ Loaded 50 courses from hierarchical_courses.json


In [5]:
# Extract summaries and details from hierarchical data
summaries = [c["summary"] for c in courses[:5]]

print("Course Summaries (Tier 1 - Lightweight):\n")
for i, s in enumerate(summaries, 1):
    print(f"{i}. {s['course_code']}: {s['title']}")
    print(f"   Level: {s.get('difficulty_level', 'N/A')} | Credits: {s.get('credits', 3)}")
    desc = s.get('short_description', s.get('description', 'No description'))[:80]
    print(f"   {desc}...\n")

Course Summaries (Tier 1 - Lightweight):

1. MATH001: Linear Algebra for Machine Learning
   Level: intermediate | Credits: 4
   Matrix operations, eigenvalues, and applications to ML....

2. CS002: Machine Learning Fundamentals
   Level: advanced | Credits: 4
   Introduction to machine learning algorithms and applications....

3. MATH003: Linear Algebra for Machine Learning
   Level: intermediate | Credits: 4
   Matrix operations, eigenvalues, and applications to ML....

4. CS004: Computer Vision
   Level: advanced | Credits: 4
   Image processing, object detection, and visual recognition systems....

5. CS005: Machine Learning Fundamentals
   Level: advanced | Credits: 4
   Introduction to machine learning algorithms and applications....



In [6]:
# Get full details for the first course (Tier 2 - On-demand)
first_course = courses[0]
details = first_course.get("details", {})

print(f"Full Details for {first_course['summary']['course_code']} (On-Demand):\n")
print(f"Title: {first_course['summary']['title']}")
print(f"Instructor: {details.get('instructor', 'TBD')}")
print(f"Credits: {first_course['summary'].get('credits', 3)}")

objectives = details.get('learning_objectives', ['Learn key concepts', 'Apply techniques', 'Build projects'])
print(f"\nLearning Objectives:")
for obj in objectives[:3]:
    print(f"  • {obj}")

# Syllabus can be a dict with 'weeks' key or a list
syllabus_data = details.get('syllabus', {})
if isinstance(syllabus_data, dict):
    weeks = syllabus_data.get('weeks', [])
else:
    weeks = syllabus_data if isinstance(syllabus_data, list) else []

print(f"\nSyllabus Preview (first 3 weeks):")
for week in weeks[:3]:
    week_num = week.get('week_number', week.get('week', '?'))
    topic = week.get('topic', 'TBD')
    print(f"  Week {week_num}: {topic}")

Full Details for MATH001 (On-Demand):

Title: Linear Algebra for Machine Learning
Instructor: Rachel Yates
Credits: 4

Learning Objectives:
  • Understand core concepts in linear algebra for machine learning
  • Implement linear algebra for machine learning algorithms and techniques
  • Apply linear algebra for machine learning to real-world problems

Syllabus Preview (first 3 weeks):
  Week 1: Vectors and Vector Spaces
  Week 2: Matrix Operations
  Week 3: Linear Transformations


### Progressive Disclosure in Action

| Approach | What's Retrieved | Tokens (5 courses) |
|----------|------------------|--------------------|
| **All Details** | Full syllabus for all | ~1,000+ tokens |
| **Summaries Only** | Overview for all | ~300 tokens |
| **Progressive** | Summaries + 1 detail | ~500 tokens |

**Key Insight:** Give the LLM summaries for ALL matches, full details for TOP N.
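
The estimates in the table follow from simple arithmetic. A small sketch, assuming the round figures used in this module (~60 tokens per summary, ~200 per detail record) and a hypothetical helper named `context_cost`:

```python
SUMMARY_TOKENS = 60   # approximate tokens per course summary (module estimate)
DETAIL_TOKENS = 200   # approximate tokens per full detail record (module estimate)

def context_cost(n_summaries: int, n_details: int) -> int:
    """Rough token cost of a context holding the given number of summaries and details."""
    return n_summaries * SUMMARY_TOKENS + n_details * DETAIL_TOKENS

print("All details:   ", context_cost(0, 5))  # five full detail records
print("Summaries only:", context_cost(5, 0))  # five summaries
print("Progressive:   ", context_cost(5, 1))  # five summaries plus one detail record
```

Raising the number of detailed courses adds about 200 tokens each, so N is the main knob for trading depth against context size.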

In [7]:
# Progressive Disclosure Pattern
# In production, this search happens via Redis Vector Search
# Here we simulate the pattern

# Summaries for ALL matches (lightweight)
all_summaries = [c["summary"] for c in courses[:5]]

# Full details for TOP N matches only (on-demand)
top_details = [c.get("details", {}) for c in courses[:2]]

print("Progressive Disclosure Results:")
print(f"  Summaries: {len(all_summaries)} courses (lightweight, ~60 tokens each)")
print(f"  Details: {len(top_details)} courses (full info, ~200 tokens each)")
print("\nThis approach gives the LLM:")
print("  • Overview of ALL relevant courses")
print("  • Deep information for TOP matches")
print("  • Optimal token usage!")

Progressive Disclosure Results:
  Summaries: 5 courses (lightweight, ~60 tokens each)
  Details: 2 courses (full info, ~200 tokens each)

This approach gives the LLM:
  • Overview of ALL relevant courses
  • Deep information for TOP matches
  • Optimal token usage!


---

## 📚 Part 3: Context Transformation (15 min)

### Why Transform Context?

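Raw JSON is **token-inefficient**: braces, quotes, and repeated key names all consume tokens without adding information, and the extra markup is noise the LLM has to read past: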

```json
{"course_code": "CS301", "title": "Machine Learning", "credits": 4}
```

Natural text is **cleaner** and **more efficient**:

```
CS301: Machine Learning (4 credits)
```

In [8]:
# Context Transformation - Convert JSON to LLM-friendly format

def format_summary(s: dict) -> str:
    """Transform a course summary into clean text."""
    return f"{s['course_code']}: {s['title']} ({s.get('difficulty_level', 'N/A')}, {s.get('credits', 3)} credits)"

def format_details(d: dict, code: str) -> str:
    """Transform course details into clean text."""
    objectives = d.get('learning_objectives', ['Learn fundamentals'])[:3]
    # Handle syllabus as dict with 'weeks' key or as list
    syllabus_data = d.get('syllabus', {})
    if isinstance(syllabus_data, dict):
        weeks = syllabus_data.get('weeks', [])[:3]
    else:
        weeks = syllabus_data[:3] if isinstance(syllabus_data, list) else []
    topics = [w.get('topic', '') for w in weeks]
    return f"""\n--- {code} Full Details ---
Instructor: {d.get('instructor', 'TBD')}
Objectives: {', '.join(objectives)}
Topics: {', '.join(topics)}"""

# Build assembled context
context_parts = ["AVAILABLE COURSES (Summaries):"]
for s in all_summaries:
    context_parts.append(f"  • {format_summary(s)}")

context_parts.append("\nTOP MATCHES (Full Details):")
for i, d in enumerate(top_details):
    code = courses[i]["summary"]["course_code"]
    context_parts.append(format_details(d, code))

assembled_context = "\n".join(context_parts)

print("Assembled Context:")
print("="*60)
print(assembled_context)

Assembled Context:
AVAILABLE COURSES (Summaries):
  • MATH001: Linear Algebra for Machine Learning (intermediate, 4 credits)
  • CS002: Machine Learning Fundamentals (advanced, 4 credits)
  • MATH003: Linear Algebra for Machine Learning (intermediate, 4 credits)
  • CS004: Computer Vision (advanced, 4 credits)
  • CS005: Machine Learning Fundamentals (advanced, 4 credits)

TOP MATCHES (Full Details):

--- MATH001 Full Details ---
Instructor: Rachel Yates
Objectives: Understand core concepts in linear algebra for machine learning, Implement linear algebra for machine learning algorithms and techniques, Apply linear algebra for machine learning to real-world problems
Topics: Vectors and Vector Spaces, Matrix Operations, Linear Transformations

--- CS002 Full Details ---
Instructor: Elizabeth Cline
Objectives: Understand core concepts in machine learning fundamentals, Implement machine learning fundamentals algorithms and techniques, Apply machine learning fundamentals to real-w

In [9]:
# Compare token counts
import tiktoken

def count_tokens(text: str) -> int:
    encoding = tiktoken.encoding_for_model("gpt-4o")
    return len(encoding.encode(text))

# Raw JSON approach
raw_json = json.dumps(all_summaries, indent=2)
raw_tokens = count_tokens(raw_json)

# Transformed context approach (summaries for all five courses plus details for the top two)
transformed_tokens = count_tokens(assembled_context)

print(f"Raw JSON: {raw_tokens} tokens")
print(f"Transformed: {transformed_tokens} tokens")
print(f"Savings: {(1 - transformed_tokens/raw_tokens)*100:.0f}%")

Raw JSON: 805 tokens
Transformed: 216 tokens
Savings: 73%
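
Part of that saving comes from JSON formatting alone. The sketch below compares three renderings of the single course record shown earlier in this part, using character counts as a rough proxy for tokens (an assumption; real token counts depend on the tokenizer):

```python
import json

course = {"course_code": "CS301", "title": "Machine Learning", "credits": 4}

pretty = json.dumps(course, indent=2)                  # indented, as in the comparison above
compact = json.dumps(course, separators=(",", ":"))    # no whitespace between tokens
text = f"{course['course_code']}: {course['title']} ({course['credits']} credits)"

for label, rendered in [("indented JSON", pretty), ("compact JSON", compact), ("plain text", text)]:
    print(f"{label:14} {len(rendered):3d} chars")
```

Compact JSON removes the indentation cost but still repeats every key name, which is why the plain-text form stays the shortest of the three.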


---

## 📚 Part 4: Complete RAG Query (10 min)

Let's put it all together into a complete RAG query.

In [10]:
# Complete RAG prompt assembly (the pattern used in production)
from typing import Optional

def build_rag_prompt(user_query: str, course_context: str, student_profile: Optional[dict] = None) -> str:
    """Assemble a complete RAG prompt with all context types."""
    
    # System context
    system_prompt = """You are a university course advisor. Help students find courses.
Use the provided course information to give accurate recommendations.
Be concise and helpful."""
    
    # User context (if provided)
    user_context = ""
    if student_profile:
        user_context = f"\n\nStudent Profile:\n{json.dumps(student_profile, indent=2)}"
    
    # Assemble full prompt
    return f"""{system_prompt}

Available Courses:
{course_context}
{user_context}

Student Question: {user_query}"""

# Build the prompt
student = {
    "name": "Sarah",
    "major": "Computer Science",
    "completed_courses": ["CS101", "CS201"],
    "interests": ["machine learning", "AI"]
}

full_prompt = build_rag_prompt(
    user_query="What machine learning courses would you recommend?",
    course_context=assembled_context,
    student_profile=student
)

print("Complete RAG Prompt:")
print("="*60)
print(full_prompt[:1500])
print("...")
print(f"\nTotal tokens: {count_tokens(full_prompt)}")

Complete RAG Prompt:
You are a university course advisor. Help students find courses.
Use the provided course information to give accurate recommendations.
Be concise and helpful.

Available Courses:
AVAILABLE COURSES (Summaries):
  • MATH001: Linear Algebra for Machine Learning (intermediate, 4 credits)
  • CS002: Machine Learning Fundamentals (advanced, 4 credits)
  • MATH003: Linear Algebra for Machine Learning (intermediate, 4 credits)
  • CS004: Computer Vision (advanced, 4 credits)
  • CS005: Machine Learning Fundamentals (advanced, 4 credits)

TOP MATCHES (Full Details):

--- MATH001 Full Details ---
Instructor: Rachel Yates
Objectives: Understand core concepts in linear algebra for machine learning, Implement linear algebra for machine learning algorithms and techniques, Apply linear algebra for machine learning to real-world problems
Topics: Vectors and Vector Spaces, Matrix Operations, Linear Transformations

--- CS002 Full Details ---
Instructor: Elizabeth Cline
Ob

In [11]:
# In production, you would call the LLM here:
# response = client.chat.completions.create(
#     model="gpt-4o-mini",
#     messages=[{"role": "user", "content": full_prompt}],
#     max_tokens=500
# )

print("📘 RAG Pipeline Complete!")
print("\nIn production, this prompt would be sent to an LLM to generate")
print("a personalized course recommendation based on:")
print("  • Retrieved course context (semantic search)")
print("  • Student profile (user context)")
print("  • System instructions (advisor persona)")

📘 RAG Pipeline Complete!

In production, this prompt would be sent to an LLM to generate
a personalized course recommendation based on:
  • Retrieved course context (semantic search)
  • Student profile (user context)
  • System instructions (advisor persona)
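
The commented-out call above sends the whole prompt as a single user message. One possible variation, sketched here with a hypothetical `build_messages` helper that is not part of the course code, is to place the advisor instructions in a separate `system` message and keep the retrieved context and question in the `user` message. No API call is made; the code only builds the message list.

```python
def build_messages(system_prompt: str, course_context: str, user_query: str) -> list[dict]:
    """Split the prompt into chat messages: instructions in 'system', context and question in 'user'."""
    user_content = f"Available Courses:\n{course_context}\n\nStudent Question: {user_query}"
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_content},
    ]

messages = build_messages(
    system_prompt="You are a university course advisor. Help students find courses.",
    course_context="CS002: Machine Learning Fundamentals (advanced, 4 credits)",
    user_query="What machine learning courses would you recommend?",
)

for message in messages:
    print(message["role"], "->", message["content"][:60])
```

The resulting list has the shape expected by the `messages` parameter of a chat completion request, so it could replace the single-message list in the commented call.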


---

## 🎯 Key Takeaways

1. **Vector embeddings** capture semantic meaning for better search
2. **Progressive disclosure** provides summaries first, details on-demand
3. **Context transformation** reduces tokens while preserving information
4. **The RAG pipeline**: Query → Embed → Search → Retrieve → Assemble → Generate

---

## ➡️ Next Module

In **Module 4: Memory Systems**, you'll learn:
- Working memory for conversation continuity
- Long-term memory for persistent knowledge
- How Agent Memory Server handles compression automatically