# Module 1: Introduction & Problem Statement

## 🎯 Learning Objectives
By the end of this module, you will:
- Understand the fundamental limitations of standalone LLMs
- Recognize when RAG is the right solution
- Visualize the complete RAG workflow
- Experience hands-on examples of LLM limitations and RAG solutions

## 📚 Key Concepts

### What is RAG?
**Retrieval-Augmented Generation (RAG)** is a technique that enhances Large Language Models (LLMs) by providing them with relevant external information before generating responses.

Think of it like an open-book exam:
- **Without RAG**: LLM answers from memory only (closed-book exam)
- **With RAG**: LLM gets relevant documents first, then answers (open-book exam)
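
The analogy can be made concrete by comparing the two prompts a model would receive. The following is a minimal sketch, not tied to any particular library; `build_prompt` and its prompt wording are hypothetical names chosen for illustration.

```python
def build_prompt(question, documents=None):
    """Return a closed-book prompt, or an open-book one if documents are given."""
    if not documents:
        # Closed-book: the model must answer from its training data alone
        return f"Question: {question}\nAnswer:"
    # Open-book: retrieved text is placed ahead of the question
    context = "\n".join(f"- {doc}" for doc in documents)
    return (
        "Use only the context below to answer.\n"
        f"Context:\n{context}\n"
        f"Question: {question}\nAnswer:"
    )


closed_book = build_prompt("What is the return policy?")
open_book = build_prompt(
    "What is the return policy?",
    documents=["Return Policy: 30-day money-back guarantee on all products"],
)

print(closed_book)
print("---")
print(open_book)
```

The only difference between the two cases is the text placed in front of the question; the model itself is unchanged.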

### Why Do We Need RAG?

#### 🚫 LLM Limitations
1. **Knowledge Cutoff Dates**: LLMs are trained on data up to a certain date
2. **Hallucinations**: LLMs can generate confident-sounding but incorrect information
3. **No Domain-Specific Knowledge**: Limited knowledge about your company/domain
4. **No Real-time Information**: Cannot access current events or dynamic data
5. **Context Length Limits**: Cannot process entire large documents

#### ✅ How RAG Helps
1. **Fresh Information**: Access to up-to-date external data
2. **Grounded Responses**: Answers based on provided evidence
3. **Domain Expertise**: Include your specific documents and data
4. **Source Attribution**: Know where information comes from
5. **Cost Effective**: No need to retrain models with new data
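
Source attribution usually comes down to labeling each piece of context before it reaches the model. Here is one possible sketch; the file names and the `format_context_with_sources` helper are made up for illustration and are not part of any library.

```python
def format_context_with_sources(documents):
    """Number each snippet and tag it with its source so answers can cite it."""
    lines = []
    for number, doc in enumerate(documents, start=1):
        lines.append(f"[{number}] ({doc['source']}) {doc['text']}")
    return "\n".join(lines)


# Hypothetical snippets; in a real system these would come from a retriever
documents = [
    {"source": "policies.md", "text": "Return Policy: 30-day money-back guarantee"},
    {"source": "news.md", "text": "2024-03-05: Opened new office in London"},
]

context = format_context_with_sources(documents)
prompt = (
    "Answer using the numbered context and cite the number you used.\n"
    f"{context}\n"
    "Question: When did the London office open?\nAnswer:"
)
print(prompt)
```

Because every snippet carries a number and a source label, an answer such as "March 5, 2024 [2]" can be traced back to the document it came from.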

### 2025 Research Insights 🔬
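- **Theory**: An OpenAI paper (Sept 2025) argued that hallucinations follow naturally from how LLMs are trained and evaluated, since standard benchmarks reward guessing over admitting uncertainty
- **Model Comparisons**: Reported hallucination rates vary widely by benchmark; one figure cited for Anthropic Claude 3.7 is 17%
- **RAG Impact**: Properly implemented RAG is reported to reduce hallucinations by 49-67%, depending on the task and evaluation method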

## 🛠️ Setup
Let's install the required packages and set up our environment.

In [None]:
# Install required packages
!pip install -q openai langchain langchain-openai python-dotenv

In [None]:
import os
from datetime import datetime
from dotenv import load_dotenv
from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage

# Load environment variables from a local .env file, if present
load_dotenv()

# Initialize the LangChain chat model; the key is read from OPENAI_API_KEY
llm = ChatOpenAI(
    model="gpt-3.5-turbo",
    temperature=0.1,  # Low temperature for more consistent results
    api_key=os.getenv("OPENAI_API_KEY")
)

print("✅ Setup complete!")
print(f"📅 Today's date: {datetime.now().strftime('%Y-%m-%d')}")
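
A missing API key is the most common reason the setup cell fails later on, so it can help to check for it up front. The helper below is a small sketch; `require_env` and the `DEMO_API_KEY` variable are hypothetical names used only for this example.

```python
import os


def require_env(name):
    """Return the value of an environment variable, or raise a clear error."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(
            f"{name} is not set. Add it to a .env file or export it in your shell."
        )
    return value


# Demo with a throwaway variable so the example runs without a real key
os.environ["DEMO_API_KEY"] = "sk-demo-123"
key = require_env("DEMO_API_KEY")
print(f"Loaded key starting with {key[:3]}...")
```

In the notebook itself, calling `require_env("OPENAI_API_KEY")` right after `load_dotenv()` would surface a missing key before any request is made.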

## 🧪 Exercise 1: Demonstrating LLM Limitations

Let's see these limitations in action with real examples.

### Problem 1: Knowledge Cutoff Dates

In [None]:
# Test knowledge cutoff with recent events
recent_questions = [
    "What happened in the 2024 US Presidential Election?",
    "Who won the 2024 Nobel Prize in Physics?",
    "What are the latest features in iPhone 16?",
    "What is the current stock price of NVIDIA?"
]

print("🔍 Testing Knowledge Cutoff Issues:")
print("=" * 50)

for question in recent_questions:
    print(f"\n❓ Question: {question}")
    try:
        response = llm.invoke([HumanMessage(content=question)])
        print(f"🤖 LLM Response: {response.content[:200]}...")
    except Exception as e:
        print(f"❌ Error: {e}")
    print("-" * 30)
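
One rough way to anticipate cutoff problems is to flag questions that name a year later than the model's training cutoff. The sketch below is only a heuristic, and `ASSUMED_CUTOFF_YEAR` is an assumption made for illustration; check your model's documentation for its actual cutoff.

```python
import re

ASSUMED_CUTOFF_YEAR = 2021  # Assumption for illustration only


def mentions_year_after(question, cutoff_year):
    """Return True if the question names a four-digit year later than cutoff_year."""
    years = [int(y) for y in re.findall(r"\b(?:19|20)\d{2}\b", question)]
    return any(year > cutoff_year for year in years)


questions = [
    "What happened in the 2024 US Presidential Election?",
    "Who won the 2024 Nobel Prize in Physics?",
    "What are the latest features in iPhone 16?",
    "What is the current stock price of NVIDIA?",
]

for question in questions:
    flagged = mentions_year_after(question, ASSUMED_CUTOFF_YEAR)
    print(f"{'needs fresh data' if flagged else 'not flagged':>16} | {question}")
```

Only the first two questions are flagged. The iPhone and stock price questions also depend on recent data but name no year, which is why a production system retrieves context for every query rather than relying on checks like this one.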

### Problem 2: Hallucinations (Confident but Wrong Answers)

In [None]:
# Test with questions that might trigger hallucinations
tricky_questions = [
    "What is the exact population of Atlantis?",
    "Who is the CEO of DataScience Corp (a fictional company)?",
    "What are the side effects of Imaginex (a made-up drug)?",
    "What year was the Treaty of Fabrication signed?"
]

print("🎭 Testing Hallucination Tendencies:")
print("=" * 50)

for question in tricky_questions:
    print(f"\n❓ Question: {question}")
    try:
        response = llm.invoke([HumanMessage(content=question)])
        print(f"🤖 LLM Response: {response.content}")
        print("⚠️  Note: This response might be hallucinated since the question involves fictional entities!")
    except Exception as e:
        print(f"❌ Error: {e}")
    print("-" * 30)

### Problem 3: No Domain-Specific Knowledge

In [None]:
# Test with company-specific questions
company_questions = [
    "What is our company's return policy?",
    "Who is the head of our marketing department?",
    "What are the specs of our Model-X product?",
    "What was discussed in last week's board meeting?"
]

print("🏢 Testing Domain-Specific Knowledge:")
print("=" * 50)

for question in company_questions:
    print(f"\n❓ Question: {question}")
    try:
        response = llm.invoke([HumanMessage(content=question)])
        print(f"🤖 LLM Response: {response.content}")
        print("📝 Note: LLM cannot access company-specific information!")
    except Exception as e:
        print(f"❌ Error: {e}")
    print("-" * 30)

## 🔧 Exercise 2: Simple RAG Preview

Now let's see how RAG can solve these problems with a basic example.

In [None]:
# Create some sample company knowledge
company_knowledge = """
COMPANY INFORMATION DATABASE
============================

COMPANY: TechCorp Solutions
FOUNDED: 2018
HEADQUARTERS: San Francisco, CA

LEADERSHIP:
- CEO: Sarah Johnson
- CTO: Michael Chen  
- Head of Marketing: Lisa Rodriguez
- Head of Sales: David Kim

PRODUCTS:
- Model-X: AI-powered analytics platform
  Specs: 99.9% uptime, supports 1M+ queries/sec, cloud-native
- Model-Y: Customer service automation tool
  Specs: 24/7 support, multi-language, integrates with 50+ platforms

POLICIES:
- Return Policy: 30-day money-back guarantee on all products
- Support: 24/7 technical support for enterprise customers
- Privacy: SOC2 Type II certified, GDPR compliant

RECENT NEWS:
- 2024-01-15: Launched Model-Y 2.0 with enhanced AI capabilities
- 2024-02-10: Secured $50M Series B funding
- 2024-03-05: Opened new office in London
"""

print("📚 Company Knowledge Base Created!")
print(f"Knowledge base contains {len(company_knowledge.split())} words of information.")

In [None]:
def simple_rag_query(question, knowledge_base):
    """
    A simplified RAG-style query (no retrieval step yet):
    1. Place the entire knowledge base in the prompt as context
    2. Ask the LLM to answer based only on that context
    """
    
    # Create a prompt that includes our knowledge base
    rag_prompt = f"""
    You are a helpful assistant that answers questions based on the provided company information.
    
    COMPANY INFORMATION:
    {knowledge_base}
    
    QUESTION: {question}
    
    Please answer the question based ONLY on the information provided above. 
    If the information is not available, say "I don't have that information in the company database."
    
    ANSWER:
    """
    
    # Get response from LLM
    response = llm.invoke([HumanMessage(content=rag_prompt)])
    return response.content

print("✅ Simple RAG function created!")
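
Because the prompt is an f-string written inside an indented function, every prompt line reaches the model with leading spaces. One way to avoid that, sketched here under the assumption that the same wording is wanted, is to dedent a template once and fill it in afterwards; `PROMPT_TEMPLATE` and `build_rag_prompt` are hypothetical names.

```python
import textwrap

# Dedent the template before substitution, so a multi-line knowledge base
# cannot interfere with the indentation that dedent removes
PROMPT_TEMPLATE = textwrap.dedent("""\
    You are a helpful assistant that answers questions based on the provided company information.

    COMPANY INFORMATION:
    {knowledge_base}

    QUESTION: {question}

    Please answer the question based ONLY on the information provided above.
    If the information is not available, say "I don't have that information in the company database."

    ANSWER:""")


def build_rag_prompt(question, knowledge_base):
    """Fill the dedented template with the question and the context."""
    return PROMPT_TEMPLATE.format(
        question=question, knowledge_base=knowledge_base.strip()
    )


prompt = build_rag_prompt("Who is the CEO?", "LEADERSHIP:\n- CEO: Sarah Johnson")
print(prompt)
```

Keeping prompt construction separate from the model call also makes the prompt easy to print and inspect before spending any API credits.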

In [None]:
# Test our simple RAG with the same company questions
print("🚀 Testing Simple RAG vs Standard LLM:")
print("=" * 60)

test_questions = [
    "What is our company's return policy?",
    "Who is the head of our marketing department?",
    "What are the specs of our Model-X product?",
    "When did we open our London office?"
]

for question in test_questions:
    print(f"\n❓ Question: {question}")
    
    # Standard LLM response
    print("\n🤖 Standard LLM (no context):")
    standard_response = llm.invoke([HumanMessage(content=question)])
    print(f"   {standard_response.content}")
    
    # RAG response
    print("\n🔍 RAG LLM (with company context):")
    rag_response = simple_rag_query(question, company_knowledge)
    print(f"   {rag_response}")
    
    print("-" * 60)

## 📊 Exercise 3: RAG Workflow Visualization

Let's understand the complete RAG workflow step by step.

In [None]:
def visualize_rag_workflow(user_question):
    """
    Demonstrate the RAG workflow step by step
    """
    print("🔄 RAG WORKFLOW DEMONSTRATION")
    print("=" * 50)
    
    # Step 1: User asks a question
    print(f"📝 STEP 1: User Question")
    print(f"   '{user_question}'")
    print()
    
    # Step 2: Retrieve relevant documents (simplified)
    print(f"🔍 STEP 2: Document Retrieval")
    print(f"   Searching knowledge base for relevant information...")
    
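    # Simple keyword matching for demonstration: strip punctuation so that
    # "CTO?" matches "CTO", and skip common question words
    stop_words = {"who", "what", "when", "where", "how", "is", "are", "was",
                  "the", "our", "of", "did", "we", "a", "an", "in", "for", "to"}
    question_words = [word.strip("?.,!'\"") for word in user_question.lower().split()]
    keywords = [word for word in question_words if word and word not in stop_words]
    relevant_lines = []
    
    for line in company_knowledge.split('\n'):
        if any(word in line.lower() for word in keywords):
            relevant_lines.append(line.strip())
    
    print(f"   Found {len(relevant_lines)} relevant lines:")
    for line in relevant_lines[:3]:  # Show first 3 relevant lines
        print(f"   - {line}")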
    print()
    
    # Step 3: Augment the prompt
    print(f"📝 STEP 3: Prompt Augmentation")
    print(f"   Combining retrieved context with user question...")
    print(f"   Context + Question → Enhanced Prompt")
    print()
    
    # Step 4: Generate response
    print(f"🤖 STEP 4: Generation")
    print(f"   LLM generates response based on provided context...")
    
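    # Use the retrieved lines as context; fall back to the full knowledge base if nothing matched
    context = "\n".join(relevant_lines) if relevant_lines else company_knowledge
    response = simple_rag_query(user_question, context)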
    print(f"   Response: '{response}'")
    print()
    
    # Step 5: Return grounded answer
    print(f"✅ STEP 5: Grounded Answer")
    print(f"   Answer is based on actual company data, not LLM's training!")
    
    return response

# Test the workflow
sample_question = "Who is our CTO?"
result = visualize_rag_workflow(sample_question)
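
The retrieval step above treats every matching line equally. A slightly stronger sketch ranks lines by how many keywords they share with the question and keeps only the top few. The `retrieve_top_k` function, the stop-word list, and the sample text are illustrative assumptions, not a standard algorithm from any library.

```python
import re

STOP_WORDS = {"who", "what", "when", "where", "how", "is", "are", "was",
              "the", "our", "of", "a", "an", "did", "we", "in", "for", "to"}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def retrieve_top_k(question, knowledge_base, top_k=3):
    """Rank lines by how many question keywords they share; return the best top_k."""
    keywords = set(tokenize(question)) - STOP_WORDS
    scored = []
    for line in knowledge_base.splitlines():
        line = line.strip()
        overlap = len(keywords & set(tokenize(line)))
        if overlap:
            scored.append((overlap, line))
    # The sort is stable, so lines with equal scores keep their document order
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [line for _, line in scored[:top_k]]


sample_kb = """
LEADERSHIP:
- CEO: Sarah Johnson
- CTO: Michael Chen
- Head of Marketing: Lisa Rodriguez
"""

print(retrieve_top_k("Who is our CTO?", sample_kb))
print(retrieve_top_k("Who is the head of marketing?", sample_kb))
```

Matching whole tokens instead of substrings avoids accidental hits, but it still misses synonyms and paraphrases, which is the gap that embedding-based search is meant to close.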

## 🎯 Exercise 4: Comparing Accuracy

Let's quantify the difference between standard LLM and RAG responses.

In [None]:
# Define test questions with known correct answers
qa_pairs = [
    {
        "question": "Who is the CEO of TechCorp Solutions?",
        "correct_answer": "Sarah Johnson"
    },
    {
        "question": "What is the uptime guarantee for Model-X?",
        "correct_answer": "99.9%"
    },
    {
        "question": "How many days is the return policy?",
        "correct_answer": "30"  # Matches both "30 days" and "30-day"
    },
    {
        "question": "When was the Series B funding secured?",
        "correct_answer": "2024-02-10"
    }
]

def check_answer_accuracy(response, correct_answer):
    """Case-insensitive substring check; differently formatted answers (e.g. dates) count as misses."""
    return correct_answer.lower() in response.lower()

print("📊 ACCURACY COMPARISON")
print("=" * 50)

standard_correct = 0
rag_correct = 0

for qa in qa_pairs:
    question = qa["question"]
    correct = qa["correct_answer"]
    
    print(f"\n❓ {question}")
    print(f"✅ Correct answer: {correct}")
    
    # Standard LLM
    standard_response = llm.invoke([HumanMessage(content=question)]).content
    standard_accurate = check_answer_accuracy(standard_response, correct)
    standard_correct += standard_accurate
    
    print(f"🤖 Standard LLM: {standard_response[:100]}...")
    print(f"   Accurate: {'✅' if standard_accurate else '❌'}")
    
    # RAG
    rag_response = simple_rag_query(question, company_knowledge)
    rag_accurate = check_answer_accuracy(rag_response, correct)
    rag_correct += rag_accurate
    
    print(f"🔍 RAG LLM: {rag_response[:100]}...")
    print(f"   Accurate: {'✅' if rag_accurate else '❌'}")

print(f"\n📊 FINAL RESULTS:")
print(f"Standard LLM Accuracy: {standard_correct}/{len(qa_pairs)} ({standard_correct/len(qa_pairs)*100:.1f}%)")
print(f"RAG LLM Accuracy: {rag_correct}/{len(qa_pairs)} ({rag_correct/len(qa_pairs)*100:.1f}%)")
print(f"\n🎯 Improvement: {(rag_correct-standard_correct)/len(qa_pairs)*100:.1f} percentage points!")
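
A plain substring check can undercount correct answers, for example when the model writes "February 10, 2024" instead of "2024-02-10". The sketch below shows one possible refinement that normalizes punctuation and accepts several variants per question; `normalize` and `matches_any` are hypothetical helpers, and the accepted variants are assumptions about how a model might phrase its answer.

```python
import re


def normalize(text):
    """Lowercase, turn most punctuation into spaces, and collapse whitespace."""
    return " ".join(re.sub(r"[^a-z0-9%.]+", " ", text.lower()).split())


def matches_any(response, accepted_answers):
    """Return True if any accepted variant appears in the normalized response."""
    normalized_response = normalize(response)
    return any(normalize(answer) in normalized_response for answer in accepted_answers)


checks = [
    ("We offer a 30-day money-back guarantee.", ["30 days", "30 day"]),
    ("The funding was secured on February 10, 2024.", ["2024-02-10", "February 10, 2024"]),
    ("Model-X has 99.9% uptime.", ["99.9%"]),
    ("I don't have that information.", ["Sarah Johnson"]),
]

for response, accepted in checks:
    print(f"{matches_any(response, accepted)!s:>5} | {response}")
```

This is still string matching, so it cannot judge paraphrased answers; it only reduces false negatives caused by formatting differences.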

## 🧠 Key Takeaways

From this module, you should now understand:

### ❌ LLM Limitations We Observed:
1. **Knowledge Cutoff**: Cannot answer questions about recent events
2. **Hallucinations**: May provide confident but incorrect answers
3. **No Domain Knowledge**: Cannot access company-specific information
4. **Generic Responses**: Provides general answers without specific context

### ✅ RAG Benefits We Demonstrated:
1. **Accurate Information**: Answers based on provided context
2. **Domain-Specific**: Can access and use company knowledge
3. **Grounded Responses**: Explicitly tells you when information isn't available
4. **Improved Accuracy**: Significantly better performance on domain-specific questions

### 🔄 RAG Workflow:
1. **User Question** → 2. **Retrieve Relevant Docs** → 3. **Augment Prompt** → 4. **Generate Response** → 5. **Grounded Answer**
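
The five steps map naturally onto three small functions. This is a minimal sketch with a stand-in generator so it runs without an API key; `fake_generate`, the sample documents, and the function names are all hypothetical.

```python
import re


def words(text):
    """Return the set of lowercase alphanumeric tokens in the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, documents):
    """Step 2: keep documents sharing at least one word with the question."""
    return [doc for doc in documents if words(question) & words(doc)]


def augment(question, context_docs):
    """Step 3: combine retrieved context and the question into one prompt."""
    context = "\n".join(context_docs)
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


def rag_answer(question, documents, generate):
    """Steps 1-5: take a question, return an answer grounded in retrieved context."""
    context_docs = retrieve(question, documents)
    prompt = augment(question, context_docs)
    return generate(prompt)  # Step 4: any callable mapping a prompt to text


def fake_generate(prompt):
    """Hypothetical stand-in for an LLM: echoes the first context line."""
    return prompt.splitlines()[1]


documents = [
    "CTO: Michael Chen",
    "CEO: Sarah Johnson",
    "Return Policy: 30-day money-back guarantee",
]

answer = rag_answer("Who is our CTO?", documents, fake_generate)
print(answer)
```

Passing the generator in as a callable keeps the pipeline testable: a real model can be swapped in by supplying a function that sends the prompt to the LLM and returns its text.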

## 🎯 Next Steps

In the next modules, we'll dive deeper into each component of the RAG system:
- **Module 2**: How to load and process different types of documents
- **Module 3**: Strategies for breaking documents into chunks
- **Module 4**: Understanding embedding models for semantic search
- And much more!

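The simple RAG we built here is just the beginning: it relies on one small knowledge base and keyword matching, while real-world RAG systems retrieve from large document collections using the techniques covered in the following modules.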

## 🤔 Discussion Questions

1. In which scenarios would you prefer a standard LLM over RAG?
2. What types of company data would be most valuable to include in a RAG system?
3. How might RAG help with compliance and audit requirements?
4. What are potential challenges with implementing RAG in a large organization?

## 📝 Optional Exercise

Try creating your own knowledge base for a fictional company or organization and test the RAG system with domain-specific questions!