## 🎯 Practice Exercises
## Exercise 1: Build Your Own Agentic RAG System

### Task
Build an agentic RAG system on a topic of your choice (NOT biochemistry - use your own domain).

### Domain Suggestions
- Technology tutorials (Python, JavaScript, etc.)
- Agriculture
- Finance
- Historical documents
- Study notes from a course
- Recipe collection
- Legal documents (simplified)

### Requirements

**1. Document Collection**
- Gather 5-10 documents (PDF or TXT) in your chosen domain
- Documents should be substantial (5000+ words each)
- Topics should be related but distinct

**2. Vector Store Setup**
- Load documents using appropriate loader
- Split into chunks (experiment with chunk size)
- Create Chroma vector store with embeddings
- Test retrieval with sample queries
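
One way to make the chunk-size experiment concrete is to tabulate how many chunks each setting produces before building the vector store. The sketch below is a minimal, dependency-free illustration: `fixed_width_split` is a hypothetical stand-in splitter, and `chunk_size_report`, the 10% overlap ratio, and the sample text are all assumptions chosen for the example, not part of the exercise.

```python
from statistics import mean
from typing import Callable


def fixed_width_split(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Hypothetical stand-in splitter: fixed-width character windows with overlap."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    if not text:
        return []
    step = chunk_size - chunk_overlap
    # Stop early so the final window is never made up of overlap alone.
    last_start = max(len(text) - chunk_overlap, 1)
    return [text[i:i + chunk_size] for i in range(0, last_start, step)]


def chunk_size_report(
    text: str,
    sizes: list[int],
    overlap_ratio: float = 0.1,  # assumption: overlap is 10% of the chunk size
    splitter: Callable[[str, int, int], list[str]] = fixed_width_split,
) -> list[dict]:
    """Return one summary row per candidate chunk size."""
    rows = []
    for size in sizes:
        overlap = int(size * overlap_ratio)
        chunks = splitter(text, size, overlap)
        rows.append({
            "chunk_size": size,
            "chunk_overlap": overlap,
            "n_chunks": len(chunks),
            "mean_len": round(mean(len(c) for c in chunks), 1) if chunks else 0.0,
        })
    return rows


# Placeholder text standing in for one loaded document (10,000 characters).
sample_text = "word " * 2000
report = chunk_size_report(sample_text, sizes=[500, 1000, 2000])
for row in report:
    print(row)
```

In the notebook, the `splitter` argument could instead wrap `RecursiveCharacterTextSplitter(chunk_size=..., chunk_overlap=...).split_text`, so the same report reflects the splitter you actually use.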

**3. Retrieval Tool**
- Create a `@tool`-decorated function for retrieval
- Use MMR or similarity search
- Return formatted context
- Include metadata in responses
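
The "formatted context" and "include metadata" requirements can be handled by a small helper that the retrieval tool calls on its results. This is a sketch under assumptions: `format_context` and `FakeDocument` are hypothetical names, the `max_chars` limit is arbitrary, and the helper relies only on the `page_content` and `metadata` attributes that LangChain documents expose.

```python
from dataclasses import dataclass, field
from typing import Any, Iterable


@dataclass
class FakeDocument:
    """Hypothetical stand-in with the two attributes a LangChain Document exposes."""
    page_content: str
    metadata: dict[str, Any] = field(default_factory=dict)


def format_context(docs: Iterable[Any], max_chars: int = 800) -> str:
    """Render retrieved documents as numbered blocks labelled with their metadata."""
    docs = list(docs)
    if not docs:
        return "No relevant documents found."
    blocks = []
    for i, doc in enumerate(docs, start=1):
        source = doc.metadata.get("source", "unknown")
        page = doc.metadata.get("page")  # typically present for PDF loaders only
        label = f"[{i}] source={source}"
        if page is not None:
            label += f", page={page}"
        # Truncate long chunks so the tool output stays within a predictable size.
        blocks.append(f"{label}\n{doc.page_content.strip()[:max_chars]}")
    return "\n\n".join(blocks)


# Hand-written example documents, used only to show the output shape.
context = format_context([
    FakeDocument("Crop rotation improves soil health.", {"source": "soil.pdf", "page": 3}),
    FakeDocument("Drip irrigation saves water.", {"source": "water.txt"}),
])
print(context)
```

Inside the `@tool` function, the documents would come from your retriever, for example one created with `vectorstore.as_retriever(search_type="mmr")`, where `vectorstore` is whatever name you gave your Chroma store.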

**4. Agentic RAG System**
- Build LangGraph with agent and tool nodes
- Implement conditional edges (agent decides when to retrieve)
- Add conversation memory
- Create helpful system prompt
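
The conditional edge usually comes down to a small routing function that inspects the agent's last message. The sketch below shows that logic in isolation, without importing LangGraph: `route_after_agent`, `FakeAIMessage`, and the `"tools"` / `"end"` labels are hypothetical names, and the tool call shown is hand-written for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Literal


@dataclass
class FakeAIMessage:
    """Hypothetical stand-in for an AI message; only the fields the router reads."""
    content: str = ""
    tool_calls: list[dict[str, Any]] = field(default_factory=list)


def route_after_agent(state: dict[str, Any]) -> Literal["tools", "end"]:
    """Go to the tool node if the agent requested a tool call, otherwise finish."""
    last_message = state["messages"][-1]
    if getattr(last_message, "tool_calls", None):
        return "tools"
    return "end"


# The agent answered directly: no tool call, so the graph should finish.
direct = route_after_agent(
    {"messages": [FakeAIMessage(content="Python is a high-level programming language...")]}
)

# The agent asked for retrieval: the graph should run the tool node next.
retrieve = route_after_agent(
    {"messages": [FakeAIMessage(tool_calls=[
        {"name": "retrieve_documents", "args": {"query": "authentication endpoint"}, "id": "call_1"}
    ])]}
)

print(direct, retrieve)
```

When wiring the graph, a function like this can be passed to `add_conditional_edges` together with a mapping such as `{"tools": "tools", "end": END}`, assuming your nodes are named `"agent"` and `"tools"`. An ordinary edge from the tool node back to the agent then lets the model read the retrieved context before answering.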

**5. Testing & Evaluation**
- Test with 10 diverse queries:
  - 5 that require retrieval
  - 5 that don't require retrieval
- Document which queries trigger retrieval
- Evaluate answer quality
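
To document which queries trigger retrieval, it helps to record the expected and actual decision for every test query in one table. The following is a minimal sketch: `used_retrieval` and `summarize_decisions` are hypothetical helpers, and the message lists are hand-written stand-ins rather than real agent output.

```python
from types import SimpleNamespace
from typing import Any, Sequence


def used_retrieval(messages: Sequence[Any]) -> bool:
    """True if any message in the run carries at least one tool call."""
    return any(getattr(m, "tool_calls", None) for m in messages)


def summarize_decisions(
    runs: Sequence[tuple[str, bool, Sequence[Any]]],
) -> tuple[list[dict], float]:
    """Compare expected vs. actual retrieval for (query, expected, messages) runs."""
    rows = []
    for query, expected, messages in runs:
        actual = used_retrieval(messages)
        rows.append({
            "query": query,
            "expected": expected,
            "actual": actual,
            "match": expected == actual,
        })
    agreement = sum(r["match"] for r in rows) / len(rows) if rows else 0.0
    return rows, agreement


# Stand-in messages: one plain answer and one that requests a (hypothetical) tool.
plain = SimpleNamespace(content="Direct answer", tool_calls=[])
with_tool = SimpleNamespace(content="", tool_calls=[{"name": "retrieve_documents"}])

rows, agreement = summarize_decisions([
    ("What is Python?", False, [plain]),
    ("How do I use the new API endpoint for user authentication?", True, [with_tool, plain]),
    ("Which section of my notes covers token expiry?", True, [plain]),  # missed retrieval
])

for row in rows:
    print(row)
print(f"Agreement with expectations: {agreement:.0%}")
```

In the notebook, each `messages` sequence would be the `"messages"` list returned by invoking your compiled graph for that query. Answer quality still needs a separate manual review, since this table only captures the retrieval decision.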

### Deliverables
1. Jupyter notebook with complete implementation
2. Test results showing agent decisions
3. Brief report (300-500 words):
   - What domain did you choose and why?
   - How did you tune chunk size?
   - Did the agent make good retrieval decisions?
   - What worked well? What needs improvement?

### Example System Behavior
```
Query: "What is Python?"
Agent Decision: NO retrieval needed (general knowledge)
Response: "Python is a high-level programming language..."

Query: "How do I use the new API endpoint for user authentication?"
Agent Decision: YES, retrieve from documentation
Response: [Retrieved docs] "Based on the documentation, the new authentication endpoint..."
```


In [None]:
# Imports
from langgraph.graph import START, END, StateGraph, MessagesState
from langgraph.checkpoint.memory import MemorySaver
from langgraph.prebuilt import ToolNode
from langchain_core.messages import HumanMessage, AIMessage, SystemMessage
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_chroma import Chroma
from langchain_community.document_loaders import PyPDFLoader, TextLoader  # PDF and TXT sources
from langchain_text_splitters import RecursiveCharacterTextSplitter
from dotenv import load_dotenv
from IPython.display import Image, display
from typing import Literal
import os

print("✅ All imports successful")

In [None]:
# Load API key
load_dotenv()
openai_api_key = os.getenv("OPENAI_API_KEY")

if not openai_api_key:
    raise ValueError("OPENAI_API_KEY not found! Please set it in your .env file.")

print("✅ API key loaded")

In [None]:
# Initialize LLM
llm = ChatOpenAI(
    model="gpt-4o-mini",
    temperature=0.5,
    api_key=openai_api_key
)

print(f"✅ LLM initialized: {llm.model_name}")