# RAG System Demo

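This notebook walks through the retrieval-augmented generation (RAG) pipeline behind the Aether chat features: loading chat data, indexing it in a vector store, running semantic search, and answering questions over the retrieved documents.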

## Contents
1. Setup & Imports
2. Loading Chat Data
3. Document Processing
4. Semantic Search
5. Question Answering

## Setup & Imports

In [6]:
import os
from dotenv import load_dotenv
from langchain_openai import OpenAIEmbeddings, ChatOpenAI
from langchain.chains import RetrievalQA
from langchain_community.vectorstores import FAISS

# Load environment variables
load_dotenv()

# Initialize OpenAI components
embeddings = OpenAIEmbeddings()
llm = ChatOpenAI(temperature=0)
print("OpenAI components initialized!")

OpenAI components initialized!
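Both `OpenAIEmbeddings` and `ChatOpenAI` read the `OPENAI_API_KEY` environment variable, which `load_dotenv()` is expected to supply from a local `.env` file. The following is a minimal sketch of a pre-flight check; the helper name `missing_env_vars` is hypothetical and not part of the project.

```python
import os

def missing_env_vars(required, env=None):
    """Return the names in `required` that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]

# Hypothetical pre-flight check before creating the OpenAI clients
missing = missing_env_vars(["OPENAI_API_KEY"])
if missing:
    print(f"Missing environment variables: {', '.join(missing)}")
```

Running this right after `load_dotenv()` surfaces a missing key before any API call is attempted.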


## Loading Chat Data

First, we will load our chat data from the mock conversations we created.

In [7]:
from tools.rag.data_prep import create_langchain_documents

# Load documents
documents = create_langchain_documents()
print(f"Loaded {len(documents)} documents")

# Display first document as example
print("\nExample document:")
print(f"Content: {documents[0].page_content}")
print(f"Metadata: {documents[0].metadata}")

ModuleNotFoundError: No module named 'tools'
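A `ModuleNotFoundError` like this one typically means the `tools` package is not on the interpreter's import path, for example when the notebook is started from a subdirectory. The sketch below shows one possible workaround, assuming the project root is the nearest parent directory that contains a `tools/` folder; the helper `find_project_root` is hypothetical and the actual project layout may differ.

```python
import sys
from pathlib import Path

def find_project_root(start, marker="tools"):
    """Walk upward from `start` until a directory containing `marker` is found."""
    start = Path(start).resolve()
    for candidate in [start, *start.parents]:
        if (candidate / marker).is_dir():
            return candidate
    return None  # no parent directory contains the marker

# Assumption: the notebook lives somewhere below the project root
root = find_project_root(Path.cwd())
if root is not None and str(root) not in sys.path:
    sys.path.insert(0, str(root))  # make `from tools.rag ...` importable
```

Installing the project as a package (for instance in editable mode) is a sturdier alternative to editing `sys.path` inside a notebook.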

## Document Processing

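Now we will embed the documents and index them in a FAISS vector store, then wrap the store in a retriever that returns the three most similar documents for each query.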

In [None]:
# Create vector store
vectorstore = FAISS.from_documents(documents, embeddings)
print("Vector store created!")

# Create retriever
retriever = vectorstore.as_retriever(
    search_type="similarity",
    search_kwargs={"k": 3}
)
print("Retriever configured!")

## Semantic Search

Let us test semantic search by retrieving the documents most similar to a sample query.

In [None]:
# Run a test query through the retriever
query = "What are some examples of high energy conversations?"
docs = retriever.invoke(query)

print(f"Found {len(docs)} relevant documents\n")
for i, doc in enumerate(docs):
    print(f"Document {i+1}:")
    print(f"Content: {doc.page_content}")
    print(f"Metadata: {doc.metadata}\n")

## Question Answering

Finally, let us test the question answering chain, which passes the retrieved documents to the LLM together with the question.

In [None]:
# Create QA chain
qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=retriever,
    return_source_documents=True
)

# Ask a test question; the result also carries the source documents
question = "How does the system handle conversations with varying energy levels?"
result = qa_chain.invoke({"query": question})

print("Question:", question)
print("\nAnswer:", result["result"])
print("\nSource Documents:")
for i, doc in enumerate(result["source_documents"]):
    print(f"\nDocument {i+1}:")
    print(f"Content: {doc.page_content}")
    print(f"Metadata: {doc.metadata}")