# Exploring Retrieval-Augmented Generation (RAG) vs Plain Chat Completions with HR Data

*This notebook demonstrates how a simple LLM chat compares to a RAG-powered approach when answering company-specific HR questions.*

**Get your API key:**
[Google AI Studio](https://makersuite.google.com/app/apikey) or
[OpenAI](https://platform.openai.com/settings/organization/api-keys)
 → Create API Key → save it as a Colab secret named `GOOGLE_API_KEY` or `OPENAI_API_KEY` (the config cell below reads it from there)

## 📦 Install Packages

**Run this cell to install dependencies:**
<p>Ignore any package install errors in Google Colab; they are specific to the Colab environment.</p>

In [None]:
# Shell commands (!pip) do not raise Python exceptions, so a try/except
# would never fire here. Check the pip output itself for real failures.
!pip install -q langchain langchain-google-genai langchain-community langchain-openai langchain-chroma
!pip install -q python-dotenv matplotlib plotly scikit-learn requests gradio

print("✅ Install commands finished!")

## 📚 Imports

**Loading libraries:**

In [None]:
from langchain_google_genai import ChatGoogleGenerativeAI, GoogleGenerativeAIEmbeddings
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_core.messages import HumanMessage, SystemMessage
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.document_loaders import TextLoader  # langchain.document_loaders is deprecated
from langchain_chroma import Chroma
from langchain.chains import RetrievalQA, ConversationalRetrievalChain
from langchain.memory import ConversationBufferMemory
import warnings
import os
# Environment / secrets (available only inside Google Colab)
from google.colab import userdata
# Data visualization
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE
import numpy as np
import plotly.graph_objects as go

import gradio

warnings.filterwarnings("ignore")
print("✅ Libraries loaded!")

## 🔧 Configuration

**Set your API key and preferences:**

In [None]:
# API Configuration
# Uncomment the preferred provider
# GOOGLE_API_KEY = userdata.get('GOOGLE_API_KEY')
OPENAI_API_KEY = userdata.get('OPENAI_API_KEY')

# Model Configuration
# MODEL_NAME = "gemini-2.0-flash"
MODEL_NAME = "gpt-4o-mini"  # for available models, see https://platform.openai.com/docs/models

# Chat Settings
SYSTEM_PROMPT = """You are an AI assistant trained to help with workplace and HR-related questions.
 Provide professional, well-structured answers using your general understanding of HR practices, company culture, and employee roles."""
# GOOGLE_CONFIG = {
#     "temperature": 0.7,       # creativity vs determinism
#     "max_output_tokens": 200, # length of response
# }
# Google LLM
# llm = ChatGoogleGenerativeAI(
#     google_api_key=GOOGLE_API_KEY,
#     model=MODEL_NAME,
#     **GOOGLE_CONFIG
# )

OPENAI_CONFIG = {
    "temperature": 0.7,  # creativity vs determinism
    "max_tokens": 200, # length of response
}
# OpenAI LLM
llm = ChatOpenAI(
    api_key=OPENAI_API_KEY,
    model_name=MODEL_NAME,
    **OPENAI_CONFIG
)

# RAG Settings
CHUNK_SIZE = 200
CHUNK_OVERLAP = 20
TOP_K_RESULTS = 2

# Vector Store (ChromaDB)
COLLECTION_NAME = "hr_handbook"

print(f"✅ Config ready | Model: {MODEL_NAME} | Vector Store: ChromaDB")
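
The config cell above reads keys through `google.colab.userdata`, which exists only inside Colab. A minimal sketch of a fallback is shown below, assuming you export the same key names as environment variables when running elsewhere. The helper name `get_api_key` is hypothetical and not part of the original notebook.

```python
import os
from typing import Optional


def get_api_key(name: str) -> Optional[str]:
    """Return a secret from Colab if available, otherwise from the environment.

    Hypothetical helper: tries Colab's secret store first and falls back to
    os.environ when the import or the lookup fails.
    """
    try:
        from google.colab import userdata  # importable only inside Colab
        return userdata.get(name)
    except Exception:
        # Not running in Colab, or the secret is not defined there
        return os.environ.get(name)


api_key = get_api_key("OPENAI_API_KEY")
print("Key found:", api_key is not None)
```

With this in place, `OPENAI_API_KEY = get_api_key("OPENAI_API_KEY")` could replace the direct `userdata.get(...)` call; a `None` result means the key still needs to be set.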



## 💬 Basic Chat




In [None]:
# Chat helper for the LLM configured above (OpenAI by default, Gemini if enabled)

def simple_chat(user_prompt: str) -> str:
    """Send a single question to the configured LLM with no document context."""
    try:
        messages = [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": user_prompt},
        ]

        response = llm.invoke(messages)
        return response.content
    except Exception as e:
        return f"❌ Error: {str(e)}"

print("✅ Chat model ready!")

## 🧪 Test Chat

**Try the basic chat:**

In [None]:
# Test basic chat
response = simple_chat("What is the leave policy of our company?")
print("🤖 Response:")
print(response)

## 🗄️ Initialize Embeddings

**Set up the embeddings model (OpenAI by default, Google optional):**

In [None]:
# Initialize Google embeddings
# embeddings = GoogleGenerativeAIEmbeddings(
#     model = 'models/gemini-embedding-001',
#     google_api_key=GOOGLE_API_KEY
# )

# Initialize OpenAI embeddings
embeddings = OpenAIEmbeddings(
    model="text-embedding-3-small",
    api_key=OPENAI_API_KEY
)

print("✅ Embeddings model ready!")

## 📄 Process Documents

**Load and chunk the sample documents:**

In [None]:
# Setup text splitter
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=CHUNK_SIZE,
    chunk_overlap=CHUNK_OVERLAP
)

# Load and process documents from the 'hr_data' folder
documents = []
hr_handbook_path = "hr_data/hr_handbook.txt" # Assuming it's a text file
if os.path.exists(hr_handbook_path):
    loader = TextLoader(hr_handbook_path)
    docs = loader.load()
    documents.extend(text_splitter.split_documents(docs))
else:
    print(f"❌ Error: File not found at {hr_handbook_path}")


print(f"📄 Processed {len(documents)} document chunks from {hr_handbook_path}")
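
To make the effect of `CHUNK_SIZE` and `CHUNK_OVERLAP` concrete, here is a simplified sketch of overlapping chunking. It uses a plain fixed-size character window, which is an assumption for illustration only: `RecursiveCharacterTextSplitter` additionally tries to break on separators such as paragraphs and sentences, so its chunks will differ. The function name `sliding_window_chunks` is hypothetical.

```python
def sliding_window_chunks(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into fixed-size character windows that overlap by chunk_overlap."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


# 50 characters of filler text, small sizes so the overlap is easy to see
sample = "".join(str(i % 10) for i in range(50))
chunks = sliding_window_chunks(sample, chunk_size=20, chunk_overlap=5)
for i, chunk in enumerate(chunks):
    print(i, len(chunk), chunk)
```

Each chunk repeats the last 5 characters of the previous one, which is what lets a sentence that straddles a boundary still appear whole in at least one chunk.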

## 🗄️ Create ChromaDB

**Build the vector database:**

In [None]:
# Create ChromaDB vector store
vectorstore = Chroma.from_documents(
    documents=documents,
    embedding=embeddings,
    collection_name=COLLECTION_NAME,
    persist_directory="./chroma_db"
)

print(f"✅ ChromaDB created with {len(documents)} chunks!")

## 🕹️ Visualizing the Vector Store

In [None]:
# _collection is a private attribute of the Chroma wrapper; it may change between versions
collection = vectorstore._collection
result = collection.get(include=['embeddings', 'documents'])
vectors = np.array(result['embeddings'])
# Use a separate name so the chunked `documents` list from earlier is not overwritten
doc_texts = result['documents']

# perplexity must be smaller than the number of chunks
tsne = TSNE(n_components=3, random_state=42, perplexity=5)
reduced_vectors = tsne.fit_transform(vectors)

# Create the 3D scatter plot
fig = go.Figure(data=[go.Scatter3d(
    x=reduced_vectors[:, 0],
    y=reduced_vectors[:, 1],
    z=reduced_vectors[:, 2],
    mode='markers',
    marker=dict(size=5, color='blue', opacity=0.8),
    text=[f"Text: {d[:100]}..." for d in doc_texts],
    hoverinfo='text'
)])

fig.update_layout(
    title='3D Chroma Vector Store Visualization',
    scene=dict(xaxis_title='x', yaxis_title='y', zaxis_title='z'),
    width=900,
    height=700,
    margin=dict(r=20, b=10, l=10, t=40)
)

fig.show()

## 🔍 Setup Retriever

**Configure document retrieval:**

In [None]:
# Setup retriever from vectorstore
retriever = vectorstore.as_retriever(search_kwargs={"k": TOP_K_RESULTS})

# Test retrieval (invoke replaces the deprecated get_relevant_documents)
test_query = "Who is Grace Williams?"
retrieved_docs = retriever.invoke(test_query)

print(f"✅ Retriever ready! Test retrieved {len(retrieved_docs)} documents for: '{test_query}'")
for i, doc in enumerate(retrieved_docs, 1):
    filename = doc.metadata.get('source', 'Unknown').split('/')[-1]
    print(f"  {i}. {filename}")
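
Conceptually, the retriever embeds the query and returns the `k` stored chunks whose vectors are closest to it. The sketch below illustrates that idea with hand-made 3-dimensional vectors and cosine similarity. Both are assumptions for illustration: real embeddings have far more dimensions, and the distance metric Chroma actually uses depends on how the collection is configured. The names `cosine_similarity` and `top_k` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, chunk_vecs, k):
    """Return the indices of the k chunk vectors most similar to the query."""
    scored = [(cosine_similarity(query_vec, v), i) for i, v in enumerate(chunk_vecs)]
    scored.sort(reverse=True)  # highest similarity first
    return [i for _, i in scored[:k]]


# Toy data: labels and vectors are made up, not taken from the handbook
toy_chunks = ["chunk about leave", "chunk about remote work", "chunk about both"]
toy_vectors = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.7, 0.7, 0.0]]
query_vector = [0.9, 0.1, 0.0]  # a query that is mostly about leave

best = top_k(query_vector, toy_vectors, k=2)
for i in best:
    print(i, toy_chunks[i])
```

Here `k` plays the role of `TOP_K_RESULTS`: raising it passes more context to the model at the cost of a longer prompt.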

## 🔗 Setup Memory

**Configure LangChain conversation memory and build the RAG chain:**

In [None]:
# Setup memory for conversational RAG
memory = ConversationBufferMemory(
    memory_key="chat_history",
    return_messages=True,
    output_key="answer"
)

# Create RAG chain WITH memory (conversational)
rag_chain = ConversationalRetrievalChain.from_llm(
    llm=llm,
    retriever=retriever,
    memory=memory,
    #verbose=True #print context sent to llm
)


def rag_chat(question: str, chat_history=None) -> str:
    """RAG with conversation memory; chat_history is optional."""
    try:
        result = rag_chain.invoke({"question": question, "chat_history": chat_history or []})
        return result["answer"]
    except Exception as e:
        # Return a plain string so the type matches the success path
        return f"❌ Error: {str(e)}"

print("✅ RAG setup ready!")
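
The key difference from plain chat is that retrieved chunks are placed into the prompt before the model is called. The sketch below shows that "stuffing" step in isolation. The template wording and the function name `build_rag_prompt` are hypothetical; the chain above uses its own internal prompt, which is not reproduced here.

```python
def build_rag_prompt(question: str, retrieved_chunks: list[str]) -> str:
    """Combine retrieved chunks and the user question into a single prompt string."""
    # Number the chunks so the model (and the reader) can tell them apart
    context = "\n\n".join(
        f"[{i}] {chunk}" for i, chunk in enumerate(retrieved_chunks, 1)
    )
    return (
        "Answer the question using only the context below. "
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


# Placeholder chunks; in the notebook these would come from the retriever
example_chunks = ["placeholder chunk A", "placeholder chunk B"]
prompt = build_rag_prompt("Can I work from home?", example_chunks)
print(prompt)
```

Uncommenting `verbose=True` in the chain definition is a way to see the context the chain actually sends to the model.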

## 🧪 Test RAG

In [None]:
question = "What is the leave policy of our company?"
rag_response = rag_chat(question, chat_history=[])
print("🤖 RAG RESPONSE:")
print(rag_response)
print("\n" + "="*60)

## ⚖️ Compare: Chat vs RAG

**Test the same question with both approaches:**

In [None]:
# Test question about specific information in our documents
test_question = "Can I work from home?"

print("🔍 TEST QUESTION:")
print(test_question)
print("\n" + "="*60)

## 💬 Step 1: Regular Chat

**Ask without document context:**

In [None]:
# Regular chat (no RAG)
regular_response = simple_chat(test_question)

print("💬 REGULAR CHAT RESPONSE:")
print(regular_response)
print("\n" + "="*60)

## 🔍 Step 2: RAG Chat

**Ask with document context:**

In [None]:
# RAG chat (answers from retrieved handbook chunks)
rag_response = rag_chat(test_question, chat_history=[])

print("🔍 RAG CHAT RESPONSE:")
print(rag_response)
print("\n" + "="*60)
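
For repeated comparisons, the two steps above can be wrapped in one helper. This is a minimal sketch: `compare_answers` is a hypothetical name, and stub functions stand in for the real model calls so the example runs without an API key.

```python
from typing import Callable, Dict


def compare_answers(question: str,
                    plain_fn: Callable[[str], str],
                    rag_fn: Callable[[str], str]) -> Dict[str, str]:
    """Ask the same question of two answer functions and collect the results."""
    return {
        "question": question,
        "plain": plain_fn(question),
        "rag": rag_fn(question),
    }


# Stubs that mimic the shape of simple_chat and rag_chat (no network calls)
def fake_plain(q: str) -> str:
    return f"(general answer to: {q})"


def fake_rag(q: str) -> str:
    return f"(handbook-grounded answer to: {q})"


result = compare_answers("Can I work from home?", fake_plain, fake_rag)
for label in ("plain", "rag"):
    print(f"{label:>5}: {result[label]}")
```

In the notebook, the stubs would be swapped for the real functions, for example `compare_answers(test_question, simple_chat, lambda q: rag_chat(q, chat_history=[]))`.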