# Creating a RAG (Retrieval-Augmented Generation) System with LangChain and Mistral

Welcome! This notebook demonstrates how to build a RAG system using [LangChain](https://python.langchain.com/) and the [Mistral API](https://docs.mistral.ai/). 

**What is RAG?**
- RAG (Retrieval-Augmented Generation) is a technique that combines information retrieval with language generation.
- It allows LLMs to access and utilize external knowledge sources by retrieving relevant documents and using them to inform responses.
- This enables more accurate, up-to-date, and contextually relevant answers beyond the model's training data.

**Notebook Goals:**
- Show how to set up a RAG system powered by Mistral.
- Demonstrate document loading, embedding, and retrieval.
- Build a question-answering system that can answer questions based on your documents.
- Align with the agentic workflow patterns discussed in the tutorial: *Agents and Agentic Workflows with LLMs - Complete Tutorial* (see README.md).

Let's get started!

## 1. Setup & Dependencies

Let's start by installing the required libraries for our RAG system. We'll need document loaders for various file formats, embedding models, vector databases, and the Mistral API integration.

In [None]:
# Install required libraries (uncomment if running in a new environment)
%pip install langchain-mistralai
%pip install python-dotenv
%pip install getpass
%pip install langchain
%pip install langchain-community
%pip install pypdf
%pip install python-docx
%pip install faiss-cpu
%pip install sentence-transformers
%pip install tiktoken
%pip install jq

# Note: If running in a managed environment (like VS Code or JupyterHub),
# you may need to restart the kernel after installation.

Note: you may need to restart the kernel to use updated packages.
Collecting langchain-core<1.0.0,>=0.3.68 (from langchain-mistralai)
  Using cached langchain_core-0.3.68-py3-none-any.whl.metadata (5.8 kB)
Collecting langsmith>=0.3.45 (from langchain-core<1.0.0,>=0.3.68->langchain-mistralai)
  Using cached langsmith-0.4.5-py3-none-any.whl.metadata (15 kB)
Using cached langchain_core-0.3.68-py3-none-any.whl (441 kB)
Using cached langsmith-0.4.5-py3-none-any.whl (367 kB)
Installing collected packages: langsmith, langchain-core
  Attempting uninstall: langsmith
    Found existing installation: langsmith 0.1.147
    Uninstalling langsmith-0.1.147:
      Successfully uninstalled langsmith-0.1.147
  Attempting uninstall: langchain-core
    Found existing installation: langchain-core 0.3.63
    Uninstalling langchain-core-0.3.63:
      Successfully uninstalled langchain-core-0.3.63
Successfully installed langchain-core-0.3.68 langsmith-0.4.5


ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
langchain 0.3.0 requires langsmith<0.2.0,>=0.1.17, but you have langsmith 0.4.5 which is incompatible.
langchain-community 0.3.0 requires langsmith<0.2.0,>=0.1.112, but you have langsmith 0.4.5 which is incompatible.

[notice] A new release of pip is available: 24.2 -> 25.1.1
[notice] To update, run: python.exe -m pip install --upgrade pip


Note: you may need to restart the kernel to use updated packages.



[notice] A new release of pip is available: 24.2 -> 25.1.1
[notice] To update, run: python.exe -m pip install --upgrade pip


Note: you may need to restart the kernel to use updated packages.


ERROR: Could not find a version that satisfies the requirement getpass (from versions: none)

[notice] A new release of pip is available: 24.2 -> 25.1.1
[notice] To update, run: python.exe -m pip install --upgrade pip
ERROR: No matching distribution found for getpass


Collecting langsmith<0.2.0,>=0.1.17 (from langchain)
  Using cached langsmith-0.1.147-py3-none-any.whl.metadata (14 kB)
INFO: pip is looking at multiple versions of langchain-core to determine which version is compatible with other requirements. This could take a while.
Collecting langchain-core<0.4.0,>=0.3.0 (from langchain)
  Using cached langchain_core-0.3.67-py3-none-any.whl.metadata (5.8 kB)
  Using cached langchain_core-0.3.66-py3-none-any.whl.metadata (5.8 kB)
  Using cached langchain_core-0.3.65-py3-none-any.whl.metadata (5.8 kB)
  Using cached langchain_core-0.3.64-py3-none-any.whl.metadata (5.8 kB)
  Using cached langchain_core-0.3.63-py3-none-any.whl.metadata (5.8 kB)
Using cached langchain_core-0.3.63-py3-none-any.whl (438 kB)
Using cached langsmith-0.1.147-py3-none-any.whl (311 kB)
Installing collected packages: langsmith, langchain-core
  Attempting uninstall: langsmith
    Found existing installation: langsmith 0.4.5
    Uninstalling langsmith-0.4.5:
      Successfully u

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
langchain-mistralai 0.2.11 requires langchain-core<1.0.0,>=0.3.68, but you have langchain-core 0.3.63 which is incompatible.

[notice] A new release of pip is available: 24.2 -> 25.1.1
[notice] To update, run: python.exe -m pip install --upgrade pip


Note: you may need to restart the kernel to use updated packages.



[notice] A new release of pip is available: 24.2 -> 25.1.1
[notice] To update, run: python.exe -m pip install --upgrade pip


Note: you may need to restart the kernel to use updated packages.



[notice] A new release of pip is available: 24.2 -> 25.1.1
[notice] To update, run: python.exe -m pip install --upgrade pip


Note: you may need to restart the kernel to use updated packages.



[notice] A new release of pip is available: 24.2 -> 25.1.1
[notice] To update, run: python.exe -m pip install --upgrade pip


Note: you may need to restart the kernel to use updated packages.



[notice] A new release of pip is available: 24.2 -> 25.1.1
[notice] To update, run: python.exe -m pip install --upgrade pip


Note: you may need to restart the kernel to use updated packages.
Note: you may need to restart the kernel to use updated packages.



[notice] A new release of pip is available: 24.2 -> 25.1.1
[notice] To update, run: python.exe -m pip install --upgrade pip


Note: you may need to restart the kernel to use updated packages.



[notice] A new release of pip is available: 24.2 -> 25.1.1
[notice] To update, run: python.exe -m pip install --upgrade pip


In [3]:
# Load API keys securely
import os
from dotenv import load_dotenv
from getpass import getpass

# Load environment variables from a .env file if present
load_dotenv()

MISTRAL_API_KEY = os.getenv('MISTRAL_API_KEY')

# If not found, prompt the user securely
if not MISTRAL_API_KEY:
    MISTRAL_API_KEY = getpass('Enter your Mistral API key: ')

# Confirm that the key is loaded (do not print the key!)
if MISTRAL_API_KEY:
    print('✅ Mistral API key loaded.')
else:
    raise ValueError('❌ Mistral API key not found. Please set it in your .env file or enter it when prompted.')

✅ Mistral API key loaded.


## 2. Configure Mistral LLM in LangChain

Next, we'll set up the Mistral LLM using LangChain's integration. We'll use the `mistral-medium` model, but you can choose others as needed. This step connects LangChain to the Mistral API using your key.

In [4]:
# Configure the Mistral LLM in LangChain
from langchain_mistralai.chat_models import ChatMistralAI

# Choose the model (see Mistral docs for available options)
MISTRAL_MODEL = "mistral-medium"  # You can change this to another available model

# Set up the LLM wrapper
llm = ChatMistralAI(
    api_key=MISTRAL_API_KEY,
    model=MISTRAL_MODEL,
)

print(f"✅ Mistral LLM configured with model: {MISTRAL_MODEL}")

✅ Mistral LLM configured with model: mistral-medium


## 3. Load Documents from Knowledgebase

For RAG to work, we need to load documents from various sources. We'll create a flexible document loading system that can handle multiple file formats including text, PDF, Word documents, JSON, and CSV files.

In [None]:
# Import document loaders
from langchain_community.document_loaders import (
    TextLoader,
    PyPDFLoader,
    Docx2txtLoader,
    JSONLoader,
    CSVLoader,
    DirectoryLoader
)
import os
from pathlib import Path

# Function to load documents from various formats
def load_documents_from_directory(directory_path="knowledgebase"):
    """
    Load documents from a directory containing various file formats.
    Supports: .txt, .pdf, .docx, .json, .csv files
    """
    documents = []
    
    # Check if directory exists
    if not os.path.exists(directory_path):
        print(f"Directory '{directory_path}' not found. Creating sample documents...")
        create_sample_documents(directory_path)
    
    # Load different file types
    file_loaders = {
        '.txt': TextLoader,
        '.pdf': PyPDFLoader,
        '.docx': Docx2txtLoader,
        '.json': JSONLoader,
        '.csv': CSVLoader
    }
    
    for file_path in Path(directory_path).glob('*'):
        if file_path.suffix.lower() in file_loaders:
            try:
                loader_class = file_loaders[file_path.suffix.lower()]
                if file_path.suffix.lower() == '.json':
                    # Try different JSON loading approaches
                    try:
                        # Method 1: Use jq_schema (requires jq package)
                        loader = loader_class(str(file_path), jq_schema='.', text_content=False)
                    except ImportError:
                        # Method 2: Fallback to simple JSON loading
                        print(f"⚠️  jq package not found, using simple JSON loading for {file_path.name}")
                        loader = loader_class(str(file_path))
                else:
                    loader = loader_class(str(file_path))
                
                docs = loader.load()
                documents.extend(docs)
                print(f"✅ Loaded {len(docs)} documents from {file_path.name}")
            except Exception as e:
                print(f"❌ Error loading {file_path.name}: {e}")
    
    return documents

def create_sample_documents(directory_path):
    """Create sample documents for demonstration if they don't exist"""
    os.makedirs(directory_path, exist_ok=True)
    
    # Sample text document
    with open(f"{directory_path}/sample.txt", "w", encoding="utf-8") as f:
        f.write("""
        Welcome to the AI Agents Tutorial!
        
        This document contains information about artificial intelligence agents.
        
        What are AI Agents?
        AI agents are autonomous systems that can perceive their environment,
        make decisions, and take actions to achieve specific goals.
        
        Types of AI Agents:
        1. Reactive Agents: Respond to current perceptions
        2. Model-based Agents: Maintain internal state
        3. Goal-based Agents: Act to achieve specific objectives
        4. Utility-based Agents: Optimize for maximum utility
        
        Key Components:
        - Perception: Sensing the environment
        - Decision Making: Choosing appropriate actions
        - Action: Executing chosen behaviors
        - Learning: Improving performance over time
        """)
    
    # Sample JSON document
    sample_json = {
        "agents": [
            {
                "name": "ChatGPT",
                "type": "Language Model",
                "capabilities": ["text generation", "question answering", "summarization"],
                "developer": "OpenAI"
            },
            {
                "name": "Mistral",
                "type": "Language Model", 
                "capabilities": ["text generation", "reasoning", "multilingual"],
                "developer": "Mistral AI"
            }
        ],
        "rag_components": [
            "Document Loading",
            "Text Splitting",
            "Embeddings",
            "Vector Database",
            "Retrieval",
            "Generation"
        ]
    }
    
    import json
    with open(f"{directory_path}/sample.json", "w", encoding="utf-8") as f:
        json.dump(sample_json, f, indent=2)
    
    print(f"✅ Created sample documents in {directory_path}/")

# Load documents
documents = load_documents_from_directory()
print(f"\n📚 Total documents loaded: {len(documents)}")

# Display first document as example
if documents:
    print(f"\n📄 First document preview:")
    print(f"Source: {documents[0].metadata.get('source', 'Unknown')}")
    print(f"Content (first 300 chars): {documents[0].page_content[:300]}...")

Directory 'knowledgebase' not found. Creating sample documents...
✅ Created sample documents in knowledgebase/
❌ Error loading sample.json: jq package not found, please install it with `pip install jq`
✅ Loaded 1 documents from sample.txt

📚 Total documents loaded: 1

📄 First document preview:
Source: knowledgebase\sample.txt
Content (first 300 chars): 
        Welcome to the AI Agents Tutorial!
        
        This document contains information about artificial intelligence agents.
        
        What are AI Agents?
        AI agents are autonomous systems that can perceive their environment,
        make decisions, and take actions to achieve...


## 4. Split Documents into Chunks

Large documents need to be split into smaller chunks for effective retrieval. We'll use LangChain's text splitters to create manageable pieces while preserving context.

In [6]:
# Import text splitters
from langchain.text_splitter import RecursiveCharacterTextSplitter

# Create text splitter
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,  # Maximum characters per chunk
    chunk_overlap=200,  # Overlap between chunks to preserve context
    length_function=len,
    separators=["\n\n", "\n", " ", ""]  # Split on paragraphs, then lines, then words
)

# Split documents into chunks
document_chunks = text_splitter.split_documents(documents)

print(f"📄 Original documents: {len(documents)}")
print(f"🔪 Document chunks after splitting: {len(document_chunks)}")

# Display example chunk
if document_chunks:
    print(f"\n📋 Example chunk:")
    print(f"Source: {document_chunks[0].metadata.get('source', 'Unknown')}")
    print(f"Content: {document_chunks[0].page_content[:200]}...")

📄 Original documents: 1
🔪 Document chunks after splitting: 1

📋 Example chunk:
Source: knowledgebase\sample.txt
Content: Welcome to the AI Agents Tutorial!
        
        This document contains information about artificial intelligence agents.
        
        What are AI Agents?
        AI agents are autonomous syste...


## 5. Create Embeddings and Vector Store

To enable semantic search, we'll convert our text chunks into vector embeddings and store them in a vector database. We'll use HuggingFace embeddings and FAISS for fast similarity search.

In [7]:
# Import embedding and vector store components
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS

# Initialize embeddings model
embeddings = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-MiniLM-L6-v2",  # Fast and effective model
    model_kwargs={'device': 'cpu'}  # Use CPU (change to 'cuda' if you have GPU)
)

print("🔢 Creating embeddings for document chunks...")

# Create vector store from documents
vector_store = FAISS.from_documents(
    documents=document_chunks,
    embedding=embeddings
)

print(f"✅ Vector store created with {len(document_chunks)} embedded chunks")

# Test similarity search
test_query = "What are AI agents?"
similar_docs = vector_store.similarity_search(test_query, k=3)

print(f"\n🔍 Testing similarity search for: '{test_query}'")
print(f"Found {len(similar_docs)} similar documents:")
for i, doc in enumerate(similar_docs, 1):
    print(f"\n{i}. Source: {doc.metadata.get('source', 'Unknown')}")
    print(f"   Content: {doc.page_content[:150]}...")


  embeddings = HuggingFaceEmbeddings(
  from .autonotebook import tqdm as notebook_tqdm
  from .autonotebook import tqdm as notebook_tqdm
To support symlinks on Windows, you either need to activate Developer Mode or to run Python as an administrator. In order to activate developer mode, see this article: https://docs.microsoft.com/en-us/windows/apps/get-started/enable-your-device-for-development
To support symlinks on Windows, you either need to activate Developer Mode or to run Python as an administrator. In order to activate developer mode, see this article: https://docs.microsoft.com/en-us/windows/apps/get-started/enable-your-device-for-development
Xet Storage is enabled for this repo, but the 'hf_xet' package is not installed. Falling back to regular HTTP download. For better performance, install the package with: `pip install huggingface_hub[hf_xet]` or `pip install hf_xet`
Xet Storage is enabled for this repo, but the 'hf_xet' package is not installed. Falling back to regular HT

🔢 Creating embeddings for document chunks...
✅ Vector store created with 1 embedded chunks

🔍 Testing similarity search for: 'What are AI agents?'
Found 1 similar documents:

1. Source: knowledgebase\sample.txt
   Content: Welcome to the AI Agents Tutorial!
        
        This document contains information about artificial intelligence agents.
        
        What are...
✅ Vector store created with 1 embedded chunks

🔍 Testing similarity search for: 'What are AI agents?'
Found 1 similar documents:

1. Source: knowledgebase\sample.txt
   Content: Welcome to the AI Agents Tutorial!
        
        This document contains information about artificial intelligence agents.
        
        What are...


## 6. Create RAG Chain

Now we'll combine retrieval and generation by creating a RAG chain that can answer questions based on our documents.

In [8]:
# Import RAG components
from langchain.chains import RetrievalQA
from langchain.prompts import PromptTemplate

# Create retriever from vector store
retriever = vector_store.as_retriever(
    search_type="similarity",
    search_kwargs={"k": 4}  # Retrieve top 4 most relevant chunks
)

# Create custom prompt template for RAG
rag_prompt_template = """
You are a helpful assistant that answers questions based on the provided context.
Use the following pieces of context to answer the question at the end.
If you don't know the answer based on the context, just say that you don't know.
Don't try to make up an answer.

Context: {context}

Question: {question}

Answer:"""

RAG_PROMPT = PromptTemplate(
    template=rag_prompt_template,
    input_variables=["context", "question"]
)

# Create RAG chain
rag_chain = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=retriever,
    return_source_documents=True,
    chain_type_kwargs={"prompt": RAG_PROMPT}
)

print("✅ RAG chain created successfully!")

✅ RAG chain created successfully!


## 7. Test RAG System

Let's test our RAG system with various questions to see how it retrieves relevant information and generates answers based on our documents.

In [9]:
# Test the RAG system with various questions

def test_rag_system(question):
    """Test the RAG system with a question and display results"""
    print(f"🤔 Question: {question}")
    print("-" * 50)
    
    # Get answer from RAG chain
    result = rag_chain.invoke({"query": question})
    
    print(f"🤖 Answer: {result['result']}")
    print(f"\n📚 Sources used ({len(result['source_documents'])} documents):")
    
    for i, doc in enumerate(result['source_documents'], 1):
        print(f"{i}. {doc.metadata.get('source', 'Unknown')} - {doc.page_content[:100]}...")
    
    print("\n" + "="*70 + "\n")
    return result

# Test with different types of questions
test_questions = [
    "What are AI agents?",
    "What are the different types of AI agents?",
    "What capabilities does Mistral have?",
    "What are the components of RAG?",
    "How do AI agents perceive their environment?"
]

for question in test_questions:
    test_rag_system(question)

🤔 Question: What are AI agents?
--------------------------------------------------
🤖 Answer: AI agents are autonomous systems that can perceive their environment, make decisions, and take actions to achieve specific goals.

📚 Sources used (1 documents):
1. knowledgebase\sample.txt - Welcome to the AI Agents Tutorial!
        
        This document contains information about artific...


🤔 Question: What are the different types of AI agents?
--------------------------------------------------
🤖 Answer: AI agents are autonomous systems that can perceive their environment, make decisions, and take actions to achieve specific goals.

📚 Sources used (1 documents):
1. knowledgebase\sample.txt - Welcome to the AI Agents Tutorial!
        
        This document contains information about artific...


🤔 Question: What are the different types of AI agents?
--------------------------------------------------
🤖 Answer: Based on the provided context, the different types of AI agents are:

1. Reactive

## 8. Interactive RAG Demo

Let's create an interactive function where you can ask questions about your documents and get real-time answers with source attribution.

In [10]:
# Interactive RAG function
def ask_question(question, show_sources=True):
    """
    Ask a question to the RAG system and get an answer with optional source display
    """
    try:
        result = rag_chain.invoke({"query": question})
        
        print(f"🤔 Question: {question}")
        print(f"🤖 Answer: {result['result']}")
        
        if show_sources:
            print(f"\n📚 Sources:")
            for i, doc in enumerate(result['source_documents'], 1):
                source_file = doc.metadata.get('source', 'Unknown')
                content_preview = doc.page_content[:150].replace('\n', ' ')
                print(f"  {i}. {source_file}: {content_preview}...")
        
        return result
        
    except Exception as e:
        print(f"❌ Error: {e}")
        return None

# Try some example questions
print("🚀 Interactive RAG Demo\n")

# Example 1: General question about AI agents
ask_question("What are the key components of AI agents?")

print("\n" + "="*60 + "\n")

# Example 2: Specific question about types
ask_question("Can you explain the different types of AI agents?")

print("\n" + "="*60 + "\n")

# Example 3: Question about specific technology
ask_question("What information do you have about Mistral AI?")

print("\n" + "="*60 + "\n")

# You can now use ask_question("Your question here") to test with your own questions!

🚀 Interactive RAG Demo

🤔 Question: What are the key components of AI agents?
🤖 Answer: The key components of AI agents are:

- Perception: Sensing the environment
- Decision Making: Choosing appropriate actions
- Action: Executing chosen behaviors
- Learning: Improving performance over time

📚 Sources:
  1. knowledgebase\sample.txt: Welcome to the AI Agents Tutorial!                  This document contains information about artificial intelligence agents.                  What are...


🤔 Question: What are the key components of AI agents?
🤖 Answer: The key components of AI agents are:

- Perception: Sensing the environment
- Decision Making: Choosing appropriate actions
- Action: Executing chosen behaviors
- Learning: Improving performance over time

📚 Sources:
  1. knowledgebase\sample.txt: Welcome to the AI Agents Tutorial!                  This document contains information about artificial intelligence agents.                  What are...


🤔 Question: Can you explain the differen

## 9. Advanced RAG Features

Let's explore some advanced features like conversation memory and hybrid search to make our RAG system more sophisticated.

In [11]:
# Advanced RAG with conversation memory
from langchain.memory import ConversationBufferMemory
from langchain.chains import ConversationalRetrievalChain

# Set up conversation memory
memory = ConversationBufferMemory(
    memory_key="chat_history",
    return_messages=True,
    output_key="answer"
)

# Create conversational RAG chain
conversational_rag = ConversationalRetrievalChain.from_llm(
    llm=llm,
    retriever=retriever,
    memory=memory,
    return_source_documents=True
)

print("✅ Conversational RAG system created!")

# Test conversational capabilities
def chat_with_rag(message):
    """Chat with the RAG system maintaining conversation history"""
    try:
        result = conversational_rag.invoke({"question": message})
        print(f"🤔 You: {message}")
        print(f"🤖 RAG: {result['answer']}")
        
        # Show sources for transparency
        if result.get('source_documents'):
            print(f"📚 Sources: {len(result['source_documents'])} documents referenced")
        
        return result
    except Exception as e:
        print(f"❌ Error: {e}")
        return None

# Demonstrate conversation flow
print("\n🗣️ Conversational RAG Demo:")
print("="*50)

# First question
chat_with_rag("What are AI agents?")

print("\n" + "-"*30 + "\n")

# Follow-up question (should use context from previous)
chat_with_rag("Can you give me more details about the reactive type?")

print("\n" + "-"*30 + "\n")

# Another follow-up
chat_with_rag("How do they differ from goal-based agents?")

print("\n🎯 Notice how the system maintains context across questions!")

✅ Conversational RAG system created!

🗣️ Conversational RAG Demo:
🤔 You: What are AI agents?
🤖 RAG: AI agents are autonomous systems that can perceive their environment, make decisions, and take actions to achieve specific goals.
📚 Sources: 1 documents referenced

------------------------------

🤔 You: What are AI agents?
🤖 RAG: AI agents are autonomous systems that can perceive their environment, make decisions, and take actions to achieve specific goals.
📚 Sources: 1 documents referenced

------------------------------

🤔 You: Can you give me more details about the reactive type?
🤖 RAG: Reactive AI agents operate based solely on the current input they receive from their environment without relying on any internal memory or historical data. These agents follow the principle of the "condition-action" rule, meaning they respond to immediate perceptions with predetermined actions. Reactive agents are typically simpler in design and are effective in environments where quick responses are 

## 10. Wrap-up: RAG System Architecture

In this notebook, we've built a comprehensive **RAG (Retrieval-Augmented Generation) System** with LangChain and Mistral:

**Key Components Implemented:**
- **Document Loading**: Multi-format document ingestion (PDF, DOCX, TXT, JSON, CSV)
- **Text Splitting**: Intelligent chunking with overlap for context preservation
- **Embeddings**: Semantic vector representations using HuggingFace models
- **Vector Store**: FAISS for fast similarity search and retrieval
- **RAG Chain**: Integration of retrieval and generation with Mistral LLM
- **Conversational Memory**: Context-aware multi-turn conversations

**RAG Workflow:**
1. **Ingestion**: Load and split documents into chunks
2. **Embedding**: Convert chunks to vector representations
3. **Storage**: Store embeddings in vector database
4. **Retrieval**: Find relevant chunks based on query similarity
5. **Generation**: Generate answers using retrieved context and LLM
6. **Response**: Return answer with source attribution

**Next Steps:**
- **Enhanced Retrieval**: Implement hybrid search (semantic + keyword)
- **Advanced Chunking**: Use semantic chunking strategies
- **Multi-modal RAG**: Add support for images and other media
- **Evaluation**: Implement RAG evaluation metrics (RAGAS, etc.)
- **Production**: Deploy with FastAPI and add caching

**Benefits of RAG:**
- Access to current, domain-specific information
- Reduced hallucination through grounded responses
- Source attribution for transparency
- Scalable knowledge base updates

This RAG system provides a solid foundation for building knowledge-aware AI applications!