# Week 2: Prompt Engineering & RAG

This notebook covers:
- Advanced Prompt Engineering Techniques
- Retrieval Augmented Generation (RAG)
- Vector Databases and Embeddings
- Building a Complete RAG System

**Prerequisites**: Install required packages:
```bash
pip install openai anthropic langchain langchain-openai langchain-anthropic langchain-community chromadb faiss-cpu sentence-transformers pypdf
```

## 1. Advanced Prompt Engineering

### 1.1 Few-Shot Learning

In [None]:
import os
from openai import OpenAI

client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

def zero_shot_classification(text):
    """Zero-shot sentiment classification."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "Classify sentiment as positive or negative."},
            {"role": "user", "content": text}
        ],
        temperature=0
    )
    return response.choices[0].message.content

def few_shot_classification(text):
    """Few-shot sentiment classification with examples."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "Classify sentiment as positive or negative."},
            {"role": "user", "content": "I love this product!"},
            {"role": "assistant", "content": "positive"},
            {"role": "user", "content": "This is terrible."},
            {"role": "assistant", "content": "negative"},
            {"role": "user", "content": "Best purchase ever!"},
            {"role": "assistant", "content": "positive"},
            {"role": "user", "content": text}
        ],
        temperature=0
    )
    return response.choices[0].message.content

test_text = "Not impressed with the quality."
print(f"Text: {test_text}")
print(f"Zero-shot: {zero_shot_classification(test_text)}")
print(f"Few-shot: {few_shot_classification(test_text)}")
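
The few-shot message list above is written out by hand. As a minimal sketch, the same list can be generated from labelled pairs, which makes it easier to swap examples in and out. The helper name `build_few_shot_messages` is hypothetical and not part of the OpenAI SDK; the code only builds the list and makes no API call.

```python
def build_few_shot_messages(instruction, examples, text):
    """Build a chat message list from (input, label) example pairs.

    Hypothetical helper for illustration; not part of any SDK.
    """
    messages = [{"role": "system", "content": instruction}]
    for example_text, label in examples:
        # Each example becomes a user turn followed by the desired assistant reply.
        messages.append({"role": "user", "content": example_text})
        messages.append({"role": "assistant", "content": label})
    # The text to classify goes last, so the model continues the pattern.
    messages.append({"role": "user", "content": text})
    return messages


examples = [
    ("I love this product!", "positive"),
    ("This is terrible.", "negative"),
    ("Best purchase ever!", "positive"),
]
messages = build_few_shot_messages(
    "Classify sentiment as positive or negative.",
    examples,
    "Not impressed with the quality.",
)
print(len(messages))  # 8: one system message, three example pairs, one query
```

The resulting list can be passed as the `messages` argument of `client.chat.completions.create`.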

### 1.2 Chain-of-Thought Prompting

In [None]:
def solve_with_cot(problem):
    """Use Chain-of-Thought prompting for reasoning."""
    prompt = f"""
Solve this problem step by step:

{problem}

Show your reasoning:
1. Break down the problem
2. Show intermediate steps
3. Provide the final answer
"""
    
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
        temperature=0
    )
    return response.choices[0].message.content

problem = """A store has 100 apples. They sell 40% on Monday and 30% of the 
remaining apples on Tuesday. How many apples are left?"""

print(solve_with_cot(problem))
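
To check the model's chain of thought, it helps to know the expected answer. The arithmetic for the apple problem can be verified directly:

```python
apples = 100

# Monday: 40% of 100 apples are sold, leaving 60.
after_monday = apples - apples * 40 // 100

# Tuesday: 30% of the remaining 60 apples (18) are sold, leaving 42.
after_tuesday = after_monday - after_monday * 30 // 100

print(after_tuesday)  # 42
```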

### 1.3 Structured Output with Prompting

In [None]:
import json

def extract_structured_info(text):
    """Extract structured information from text."""
    prompt = f"""
Extract the following information from the text and return as JSON:
- product_name
- price
- rating (out of 5)
- pros (list)
- cons (list)

Text: {text}

Return only valid JSON, no additional text.
"""
    
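    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
        # JSON mode asks the API to return a single JSON object,
        # so json.loads does not fail on surrounding text or code fences
        response_format={"type": "json_object"},
        temperature=0
    )
    
    return json.loads(response.choices[0].message.content)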

review = """
The XPhone Pro costs $999 and it's incredible! The camera quality is outstanding 
and battery life lasts all day. However, it's quite expensive and a bit heavy.
I'd rate it 4 out of 5 stars.
"""

result = extract_structured_info(review)
print(json.dumps(result, indent=2))
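
A reply to a prompt like this is not guaranteed to contain every requested field, and a model called without a JSON mode sometimes wraps the object in a Markdown code fence. The following is a minimal sketch of a more defensive parser. The name `parse_json_reply` and the sample reply are hypothetical, and the required keys are taken from the prompt above.

```python
import json

FENCE = "`" * 3  # three backticks
REQUIRED_KEYS = {"product_name", "price", "rating", "pros", "cons"}


def parse_json_reply(reply):
    """Parse a model reply as JSON, tolerating a surrounding code fence.

    Hypothetical helper for illustration.
    """
    cleaned = reply.strip()
    if cleaned.startswith(FENCE):
        # Drop the opening fence line, then everything from the closing fence on.
        cleaned = cleaned.split("\n", 1)[1] if "\n" in cleaned else ""
        cleaned = cleaned.rsplit(FENCE, 1)[0]
    data = json.loads(cleaned)
    if not isinstance(data, dict):
        raise ValueError("Expected a JSON object")
    missing = REQUIRED_KEYS - set(data)
    if missing:
        raise ValueError(f"Missing keys: {sorted(missing)}")
    return data


# Made-up reply, wrapped in a fence the way some models answer.
sample_reply = (
    FENCE + "json\n"
    '{"product_name": "XPhone Pro", "price": 999, "rating": 4, '
    '"pros": ["camera"], "cons": ["heavy"]}\n' + FENCE
)
parsed = parse_json_reply(sample_reply)
print(parsed["product_name"])
```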

### 1.4 Role-Based Prompting

In [None]:
def expert_advice(question, expert_role):
    """Get advice from an expert persona."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": f"You are an expert {expert_role}. Provide detailed, professional advice."},
            {"role": "user", "content": question}
        ],
        temperature=0.7
    )
    return response.choices[0].message.content

question = "How can I improve the performance of my NLP model?"

print("Machine Learning Engineer perspective:")
print(expert_advice(question, "Machine Learning Engineer"))
print("\n" + "="*80 + "\n")

print("Data Scientist perspective:")
print(expert_advice(question, "Data Scientist"))

## 2. Introduction to RAG (Retrieval Augmented Generation)

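RAG combines a retrieval system with a language model: at query time it retrieves the passages most relevant to the question from a document collection and passes them to the model as context, so the answer is grounded in those documents rather than only in the model's training data.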
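
The retrieve-then-generate flow can be sketched without any API calls. In this toy version, word overlap stands in for embedding similarity and the final prompt is printed instead of being sent to a model. The function names and the prompt wording are assumptions made for illustration.

```python
import re


def tokenize(text):
    """Lowercase a string and return its set of words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query, documents, k=1):
    """Rank documents by word overlap with the query (a stand-in for embeddings)."""
    query_words = tokenize(query)
    ranked = sorted(
        documents,
        key=lambda doc: len(query_words & tokenize(doc)),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query, context_docs):
    """Place the retrieved text in front of the question (hypothetical wording)."""
    context = "\n\n".join(context_docs)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


documents = [
    "Python was created by Guido van Rossum and first released in 1991.",
    "Deep learning is a subset of machine learning based on neural networks.",
]
query = "Who created Python?"
top_docs = retrieve(query, documents, k=1)
prompt = build_prompt(query, top_docs)
print(prompt)
```

A real system replaces `retrieve` with a vector search over embeddings and sends `prompt` to a language model.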

### 2.1 Understanding Embeddings

In [None]:
from openai import OpenAI
import numpy as np

client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

def get_embedding(text, model="text-embedding-3-small"):
    """Get embedding for text using OpenAI."""
    text = text.replace("\n", " ")
    response = client.embeddings.create(input=[text], model=model)
    return response.data[0].embedding

def cosine_similarity(vec1, vec2):
    """Calculate cosine similarity between two vectors."""
    return np.dot(vec1, vec2) / (np.linalg.norm(vec1) * np.linalg.norm(vec2))

# Example texts
texts = [
    "The cat sat on the mat.",
    "A feline rested on the rug.",
    "The stock market crashed today."
]

# Get embeddings (named text_embeddings so the name does not clash with
# the `embeddings` model object created in section 2.2)
text_embeddings = [get_embedding(text) for text in texts]

print(f"Embedding dimension: {len(text_embeddings[0])}\n")

# Compare similarities
print("Cosine Similarities:")
print(f"Text 1 vs Text 2 (similar meaning): {cosine_similarity(text_embeddings[0], text_embeddings[1]):.4f}")
print(f"Text 1 vs Text 3 (different meaning): {cosine_similarity(text_embeddings[0], text_embeddings[2]):.4f}")
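
Cosine similarity is the dot product of two vectors divided by the product of their lengths, so it depends only on the angle between them. A small worked example with hand-picked 2-D vectors (not real embeddings) shows the range of values. The function is named `cosine_similarity_plain` here so that it does not replace the NumPy version above.

```python
import math


def cosine_similarity_plain(vec1, vec2):
    """Cosine similarity using only the standard library."""
    dot = sum(x * y for x, y in zip(vec1, vec2))
    norm1 = math.sqrt(sum(x * x for x in vec1))
    norm2 = math.sqrt(sum(y * y for y in vec2))
    return dot / (norm1 * norm2)


same_direction = cosine_similarity_plain([1, 0], [2, 0])  # parallel vectors
diagonal = cosine_similarity_plain([1, 0], [1, 1])        # 45 degrees apart
orthogonal = cosine_similarity_plain([1, 0], [0, 1])      # 90 degrees apart

print(f"{same_direction:.4f}")  # 1.0000
print(f"{diagonal:.4f}")        # 0.7071
print(f"{orthogonal:.4f}")      # 0.0000
```

Scaling a vector does not change the result, which is why `[1, 0]` and `[2, 0]` score 1.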

### 2.2 Building a Simple RAG System with LangChain

In [None]:
from langchain_openai import OpenAIEmbeddings, ChatOpenAI
from langchain_community.vectorstores import Chroma
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain.chains import RetrievalQA
from langchain_core.documents import Document

# Sample documents
documents = [
    """Python is a high-level programming language known for its simplicity and readability. 
    It was created by Guido van Rossum and first released in 1991. Python supports multiple 
    programming paradigms including procedural, object-oriented, and functional programming.""",
    
    """Machine learning is a subset of artificial intelligence that enables systems to learn 
    and improve from experience without being explicitly programmed. It focuses on developing 
    algorithms that can access data and use it to learn for themselves.""",
    
    """Natural Language Processing (NLP) is a branch of AI that helps computers understand, 
    interpret, and manipulate human language. NLP combines computational linguistics with 
    statistical machine learning and deep learning models.""",
    
    """Deep learning is a subset of machine learning based on artificial neural networks. 
    It uses multiple layers to progressively extract higher-level features from raw input. 
    Common applications include computer vision, speech recognition, and NLP."""
]

# Convert to Document objects
docs = [Document(page_content=doc) for doc in documents]

# Split documents into chunks
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=200,
    chunk_overlap=20
)
splits = text_splitter.split_documents(docs)

print(f"Number of document chunks: {len(splits)}")
print(f"\nFirst chunk:\n{splits[0].page_content}")
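
To see what `chunk_size` and `chunk_overlap` control, here is a simplified sketch that cuts text into fixed-size character windows. This is not how `RecursiveCharacterTextSplitter` works internally: that class tries to split on separators such as paragraph breaks, newlines, and spaces before falling back to characters. The function `chunk_text` is a hypothetical illustration of the two parameters only.

```python
def chunk_text(text, chunk_size, chunk_overlap):
    """Split text into fixed-size character windows that overlap.

    Simplified illustration; real splitters prefer natural boundaries.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


chunks = chunk_text("abcdefghijklmnopqrstuvwxyz", chunk_size=10, chunk_overlap=3)
for chunk in chunks:
    print(chunk)
# abcdefghij
# hijklmnopq
# opqrstuvwx
# vwxyz
```

Each chunk starts with the last three characters of the previous one, so a sentence cut at a boundary still appears whole in at least one chunk when the overlap is large enough.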

In [None]:
# Create embeddings and vector store
embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
vectorstore = Chroma.from_documents(documents=splits, embedding=embeddings)

# Create retriever
retriever = vectorstore.as_retriever(search_kwargs={"k": 2})

# Test retrieval
query = "What is NLP?"
retrieved_docs = retriever.invoke(query)

print(f"Query: {query}\n")
print("Retrieved documents:")
for i, doc in enumerate(retrieved_docs, 1):
    print(f"\n{i}. {doc.page_content}")

In [None]:
# Create RAG chain
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=retriever,
    return_source_documents=True
)

# Ask questions
questions = [
    "What is NLP and what does it combine?",
    "Who created Python?",
    "What are applications of deep learning?"
]

for question in questions:
    result = qa_chain.invoke({"query": question})
    print(f"\nQuestion: {question}")
    print(f"Answer: {result['result']}")
    print("-" * 80)

### 2.3 RAG with Anthropic Claude

In [None]:
from langchain_anthropic import ChatAnthropic

# Create RAG chain with Claude
claude_llm = ChatAnthropic(model="claude-3-5-sonnet-20241022", temperature=0)

qa_chain_claude = RetrievalQA.from_chain_type(
    llm=claude_llm,
    chain_type="stuff",
    retriever=retriever,
    return_source_documents=True
)

question = "Compare machine learning and deep learning."
result = qa_chain_claude.invoke({"query": question})

print(f"Question: {question}")
print(f"\nAnswer: {result['result']}")

### 2.4 RAG with Different Vector Stores

In [None]:
from langchain_community.vectorstores import FAISS

# Using FAISS (Facebook AI Similarity Search)
faiss_vectorstore = FAISS.from_documents(splits, embeddings)
faiss_retriever = faiss_vectorstore.as_retriever(search_kwargs={"k": 2})

# Create QA chain with FAISS
qa_chain_faiss = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=faiss_retriever
)

question = "What programming paradigms does Python support?"
result = qa_chain_faiss.invoke({"query": question})

print(f"Question: {question}")
print(f"Answer: {result['result']}")
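
Conceptually, a vector store keeps (vector, text) pairs and returns the texts whose vectors are closest to a query vector. The brute-force sketch below shows that idea. Chroma and FAISS provide the same kind of lookup with optimized indexes and persistence, so this is an illustration of the concept, not of their implementation. The class `TinyVectorStore` and the 2-D vectors are made up for the example.

```python
import math


class TinyVectorStore:
    """Brute-force in-memory vector store (hypothetical, for illustration)."""

    def __init__(self):
        self._items = []  # list of (vector, text) pairs

    def add(self, vector, text):
        self._items.append((vector, text))

    def search(self, query_vector, k=2):
        """Return the k (score, text) pairs with the highest cosine similarity."""

        def cosine(a, b):
            dot = sum(x * y for x, y in zip(a, b))
            return dot / (math.hypot(*a) * math.hypot(*b))

        scored = [(cosine(query_vector, vec), text) for vec, text in self._items]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:k]


store = TinyVectorStore()
store.add([1.0, 0.1], "cats")
store.add([0.9, 0.2], "felines")
store.add([0.0, 1.0], "stocks")

results = store.search([1.0, 0.0], k=2)
print([text for _, text in results])  # ['cats', 'felines']
```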

## 3. Advanced RAG: Document Loading and Processing

In [None]:
# Create sample documents
sample_docs = {
    "ai_basics.txt": """
Artificial Intelligence (AI) Overview

Artificial Intelligence refers to the simulation of human intelligence in machines. 
It encompasses various subfields including:

1. Machine Learning: Algorithms that improve through experience
2. Deep Learning: Neural networks with multiple layers
3. Natural Language Processing: Understanding and generating human language
4. Computer Vision: Interpreting and analyzing visual information
5. Robotics: Intelligent physical systems

Applications of AI include autonomous vehicles, medical diagnosis, 
recommendation systems, and virtual assistants.
""",
    "ml_techniques.txt": """
Machine Learning Techniques

Supervised Learning:
- Classification: Categorizing data into classes
- Regression: Predicting continuous values

Unsupervised Learning:
- Clustering: Grouping similar data points
- Dimensionality Reduction: Reducing feature space

Reinforcement Learning:
- Agent learns through interaction with environment
- Receives rewards or penalties for actions

Common algorithms include decision trees, random forests, 
support vector machines, and neural networks.
"""
}

# Write sample files
import os
os.makedirs("sample_docs", exist_ok=True)

for filename, content in sample_docs.items():
    with open(f"sample_docs/{filename}", "w") as f:
        f.write(content)

print("Sample documents created!")

In [None]:
from langchain_community.document_loaders import DirectoryLoader, TextLoader

# Load documents from directory
loader = DirectoryLoader("sample_docs", glob="*.txt", loader_cls=TextLoader)
loaded_docs = loader.load()

print(f"Loaded {len(loaded_docs)} documents")

# Split and create vector store
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=300,
    chunk_overlap=50
)
doc_splits = text_splitter.split_documents(loaded_docs)

# Create vector store
vectorstore_docs = FAISS.from_documents(doc_splits, embeddings)
retriever_docs = vectorstore_docs.as_retriever(search_kwargs={"k": 3})

# Create QA chain
qa_chain_docs = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=retriever_docs,
    return_source_documents=True
)

In [None]:
# Query the document-based RAG system
questions = [
    "What are the main subfields of AI?",
    "Explain reinforcement learning.",
    "What is the difference between supervised and unsupervised learning?"
]

for question in questions:
    result = qa_chain_docs.invoke({"query": question})
    print(f"\nQuestion: {question}")
    print(f"Answer: {result['result']}")
    print(f"\nSources: {len(result['source_documents'])} documents")
    print("-" * 80)

## 4. RAG with Conversation Memory

In [None]:
from langchain.chains import ConversationalRetrievalChain
from langchain.memory import ConversationBufferMemory

# Create memory
memory = ConversationBufferMemory(
    memory_key="chat_history",
    return_messages=True,
    output_key="answer"
)

# Create conversational RAG chain
conversational_chain = ConversationalRetrievalChain.from_llm(
    llm=llm,
    retriever=retriever_docs,
    memory=memory,
    return_source_documents=True
)

# Have a conversation
print("Conversational RAG Demo:\n")

q1 = "What is machine learning?"
result1 = conversational_chain.invoke({"question": q1})
print(f"Q: {q1}")
print(f"A: {result1['answer']}\n")

q2 = "What are its main types?"
result2 = conversational_chain.invoke({"question": q2})
print(f"Q: {q2}")
print(f"A: {result2['answer']}\n")

q3 = "Give me an example of the first type."
result3 = conversational_chain.invoke({"question": q3})
print(f"Q: {q3}")
print(f"A: {result3['answer']}")
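
A follow-up such as "What are its main types?" cannot be retrieved against on its own, because "its" has no referent. `ConversationalRetrievalChain` handles this by first asking the LLM to rewrite the follow-up as a standalone question using the chat history, and then retrieving with the rewritten question. The sketch below builds such a rewrite prompt. The wording is an assumption for illustration and is not LangChain's actual template.

```python
def build_condense_prompt(chat_history, follow_up):
    """Format (question, answer) turns plus a follow-up into a rewrite prompt.

    Hypothetical helper; the prompt wording is illustrative only.
    """
    history_lines = []
    for question, answer in chat_history:
        history_lines.append(f"Human: {question}")
        history_lines.append(f"Assistant: {answer}")
    history_text = "\n".join(history_lines)
    return (
        "Rewrite the follow-up question as a standalone question, "
        "using the conversation for context.\n\n"
        f"Conversation:\n{history_text}\n\n"
        f"Follow-up question: {follow_up}\n"
        "Standalone question:"
    )


chat_history = [
    ("What is machine learning?",
     "Machine learning covers algorithms that improve through experience."),
]
condense_prompt = build_condense_prompt(chat_history, "What are its main types?")
print(condense_prompt)
```

A model given this prompt would be expected to return something like "What are the main types of machine learning?", which is then used as the retrieval query.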

## 5. Evaluating RAG Systems

In [None]:
def evaluate_rag_response(question, answer, source_docs):
    """Evaluate RAG response quality."""
    # Number each source and truncate it to 200 characters to keep the prompt short
    sources_text = "\n".join(
        f"{i}. {doc.page_content[:200]}..." for i, doc in enumerate(source_docs, 1)
    )
    evaluation_prompt = f"""
Evaluate this RAG system response:

Question: {question}
Answer: {answer}

Source Documents:
{sources_text}

Evaluate on:
1. Relevance: Is the answer relevant to the question? (1-5)
2. Accuracy: Is the answer factually correct based on sources? (1-5)
3. Completeness: Does it fully address the question? (1-5)
4. Groundedness: Is it based on the provided sources? (1-5)

Provide scores and brief explanation.
"""
    
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": evaluation_prompt}],
        temperature=0
    )
    return response.choices[0].message.content

# Evaluate a response
question = "What are applications of AI?"
result = qa_chain_docs.invoke({"query": question})

print("RAG Evaluation:\n")
print(evaluate_rag_response(question, result['result'], result['source_documents']))
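
The evaluator returns free text, which is hard to compare across many test questions. As a minimal sketch, the scores can be pulled out with a regular expression and averaged. This assumes the evaluator writes each score as `Criterion: N`; the sample text below is made up, and real output may need a stricter prompt or a different pattern.

```python
import re

CRITERIA = ["Relevance", "Accuracy", "Completeness", "Groundedness"]


def parse_scores(evaluation_text):
    """Extract 1-5 scores written as 'Criterion: N' (assumed format)."""
    scores = {}
    for name in CRITERIA:
        # Allow punctuation such as ** around the name or the number.
        pattern = rf"{name}\W*?[:=]\W*?([1-5])\b"
        match = re.search(pattern, evaluation_text, flags=re.IGNORECASE)
        if match:
            scores[name] = int(match.group(1))
    return scores


# Made-up evaluator output for illustration.
sample_evaluation = """1. Relevance: 5 - directly answers the question
2. Accuracy: 4/5
3. Completeness: 3
4. **Groundedness**: 5"""

scores = parse_scores(sample_evaluation)
average = sum(scores.values()) / len(scores)
print(scores)
print(average)  # 4.25
```

Criteria that cannot be parsed are left out of the dictionary, so a missing key signals that the output format should be checked.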

## 6. Assignment 1: Build Your Own RAG System

### Task:
Build a RAG system for a specific domain (e.g., university policies, product manuals, research papers).

### Requirements:
1. Collect or create at least 5 documents
2. Implement proper text splitting and chunking
3. Use embeddings and vector store
4. Create a question-answering interface
5. Add conversation memory
6. Evaluate your system with test questions

### Deliverables:
- Working code
- Sample documents
- Test questions and responses
- Evaluation metrics

In [None]:
# Your assignment code here

# 1. Document Collection
# TODO: Load your documents

# 2. Text Splitting
# TODO: Split documents into chunks

# 3. Vector Store
# TODO: Create embeddings and vector store

# 4. QA Interface
# TODO: Create RAG chain

# 5. Conversation Memory
# TODO: Add memory to your chain

# 6. Evaluation
# TODO: Test and evaluate your system

## Summary

In this notebook, you learned:
1. Advanced prompt engineering techniques (few-shot, chain-of-thought, structured output, role-based)
2. Embeddings and semantic similarity
3. Building RAG systems with LangChain
4. Different vector stores (Chroma, FAISS)
5. Document loading and processing
6. Conversational RAG with memory
7. Evaluating RAG system quality

**Next Steps**: Complete Assignment 1 and prepare for the live session on adapter tuning!