# Retrieval-Augmented Generation (RAG) QA System using LangChain, ChromaDB, and GPT-4o-mini

This notebook demonstrates a RAG pipeline that:
- Loads `.txt` documents from a folder
- Splits the documents into chunks and embeds them with the `all-MiniLM-L6-v2` model from Hugging Face
- Stores the embeddings in a Chroma vector database
- Retrieves relevant documents for a query
- Uses GPT-4o-mini via the GitHub Models inference endpoint to generate an answer


## Setup and Imports
Load all required libraries and environment variables.


In [1]:
import os
from dotenv import load_dotenv

from langchain_community.vectorstores import Chroma
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.document_loaders import TextLoader
from langchain.text_splitter import CharacterTextSplitter

from langchain_openai import ChatOpenAI
from langchain_core.messages import SystemMessage, HumanMessage

# Load environment variables from .env
load_dotenv()


True
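
`load_dotenv()` returns `True` when a `.env` file was found, but that does not guarantee the key used later in the notebook (`GITHUB_TOKEN2`) is actually defined. A minimal sketch of a fail-fast check follows; the helper name `require_env` is hypothetical and not part of the original notebook.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise if it is unset or empty."""
    value = os.getenv(name)
    if not value:
        # Failing here gives a clearer error than an authentication failure later on.
        raise RuntimeError(
            f"Environment variable {name!r} is not set; add it to your .env file."
        )
    return value
```

Calling `require_env("GITHUB_TOKEN2")` right after `load_dotenv()` would surface a missing token before any documents are embedded.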

## Define File Paths
Set the document directory and vector store path.


In [2]:
current_dir = os.getcwd()
books_dir = os.path.join(current_dir, "documents")
persistent_directory = os.path.join(current_dir, "db", "chroma_db_with_metadata")

print("Documents directory:", books_dir)
print("Vector DB directory:", persistent_directory)


Documents directory: c:\Users\Aditya Jain\OneDrive\finance summer\langChain\rag_project\documents
Vector DB directory: c:\Users\Aditya Jain\OneDrive\finance summer\langChain\rag_project\db\chroma_db_with_metadata


## Build Vector Store (if not already created)
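Load the `.txt` documents, split them into chunks, embed the chunks, and store them in Chroma. This step is skipped when the persistence directory already exists.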


In [3]:
if not os.path.exists(persistent_directory):
    print("Creating vector store...")

    if not os.path.exists(books_dir):
        raise FileNotFoundError(f"{books_dir} not found.")

    documents = []
    for fname in os.listdir(books_dir):
        if fname.endswith(".txt"):
            path = os.path.join(books_dir, fname)
            loader = TextLoader(path, encoding="utf-8")
            book_docs = loader.load()
            for doc in book_docs:
                doc.metadata = {"source": fname}
                documents.append(doc)

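    # Split documents into chunks. CharacterTextSplitter cuts only at its separator
    # (blank lines by default), so paragraphs longer than chunk_size are kept whole.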
    text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
    docs = text_splitter.split_documents(documents)

    print("Number of chunks:", len(docs))

    # Create embeddings
    embeddings = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")

    # Create and persist the vector store
    db = Chroma.from_documents(docs, embeddings, persist_directory=persistent_directory)
    print("Vector store successfully created.")
else:
    print("Vector store already exists.")


Created a chunk of size 1184, which is longer than the specified 1000
Created a chunk of size 1045, which is longer than the specified 1000
Created a chunk of size 1132, which is longer than the specified 1000
Created a chunk of size 1674, which is longer than the specified 1000
Created a chunk of size 1610, which is longer than the specified 1000
Created a chunk of size 1562, which is longer than the specified 1000
Created a chunk of size 1063, which is longer than the specified 1000
Created a chunk of size 1543, which is longer than the specified 1000
Created a chunk of size 2597, which is longer than the specified 1000
Created a chunk of size 2613, which is longer than the specified 1000
Created a chunk of size 1079, which is longer than the specified 1000
Created a chunk of size 1251, which is longer than the specified 1000
Created a chunk of size 1534, which is longer than the specified 1000
Created a chunk of size 1323, which is longer than the specified 1000

Created a chunk of size 1314, which is longer than the specified 1000
Created a chunk of size 1485, which is longer than the specified 1000
Created a chunk of size 1177, which is longer than the specified 1000
Created a chunk of size 1781, which is longer than the specified 1000
Created a chunk of size 3129, which is longer than the specified 1000
Created a chunk of size 1153, which is longer than the specified 1000
Created a chunk of size 1286, which is longer than the specified 1000
Created a chunk of size 1430, which is longer than the specified 1000
Created a chunk of size 1027, which is longer than the specified 1000
Created a chunk of size 3687, which is longer than the specified 1000
Created a chunk of size 1039, which is longer than the specified 1000
Created a chunk of size 1714, which is longer than the specified 1000
Created a chunk of size 1067, which is longer than the specified 1000
Created a chunk of size 1121, which is longer than the specified 1000

Creating vector store...


Created a chunk of size 1012, which is longer than the specified 1000
Created a chunk of size 1096, which is longer than the specified 1000
Created a chunk of size 2214, which is longer than the specified 1000
Created a chunk of size 2222, which is longer than the specified 1000
Created a chunk of size 1673, which is longer than the specified 1000
Created a chunk of size 1465, which is longer than the specified 1000
Created a chunk of size 1069, which is longer than the specified 1000
Created a chunk of size 1233, which is longer than the specified 1000
Created a chunk of size 1518, which is longer than the specified 1000
Created a chunk of size 1467, which is longer than the specified 1000
Created a chunk of size 1397, which is longer than the specified 1000
Created a chunk of size 1001, which is longer than the specified 1000
Created a chunk of size 1927, which is longer than the specified 1000
Created a chunk of size 1375, which is longer than the specified 1000

Number of chunks: 1730

Vector store successfully created.
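
The "longer than the specified 1000" warnings above appear because a separator-based splitter can only cut where the separator occurs. The sketch below is a simplified pure-Python model of that behaviour, not LangChain's actual implementation; the function name `split_on_separator` and the sample text are illustrative assumptions.

```python
def split_on_separator(text: str, chunk_size: int, separator: str = "\n\n") -> list[str]:
    """Simplified model of separator-based splitting (not LangChain's real code)."""
    chunks: list[str] = []
    current = ""
    for piece in text.split(separator):
        # Try to merge the next piece into the chunk being built.
        candidate = piece if not current else current + separator + piece
        if len(candidate) <= chunk_size:
            current = candidate
        else:
            if current:
                chunks.append(current)
            # A single piece longer than chunk_size cannot be split further,
            # because cuts are only allowed at the separator.
            current = piece
    if current:
        chunks.append(current)
    return chunks


sample = "short paragraph\n\n" + "x" * 1500 + "\n\nanother short one"
sizes = [len(chunk) for chunk in split_on_separator(sample, chunk_size=1000)]
print(sizes)  # [15, 1500, 17]
```

The 1500-character paragraph survives as one oversized chunk. If strict size limits matter, a splitter that falls back to finer separators (LangChain's `RecursiveCharacterTextSplitter` is one option) may be a better fit.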


## Load Vector Store
Reload the persisted Chroma store, using the same Hugging Face embedding model that was used to build it.


In [4]:
embeddings = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")
db = Chroma(persist_directory=persistent_directory, embedding_function=embeddings)




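## Retrieve Relevant Chunks
Embed the query and fetch the three most similar chunks from the vector store.


In [5]: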
query = "What does Dracula fear the most?"

retriever = db.as_retriever(search_type="similarity", search_kwargs={"k": 3})
relevant_docs = retriever.invoke(query)

# Display relevant chunks
for i, doc in enumerate(relevant_docs, 1):
    print(f"Document {i} (from {doc.metadata['source']}):\n{doc.page_content}\n")


Document 1 (from Dracula.txt):
“Thus when we find the habitation of this man-that-was, we can confine
him to his coffin and destroy him, if we obey what we know. But he is
clever. I have asked my friend Arminius, of Buda-Pesth University, to
make his record; and, from all the means that are, he tell me of what he
has been. He must, indeed, have been that Voivode Dracula who won his
name against the Turk, over the great river on the very frontier of
Turkey-land. If it be so, then was he no common man; for in that time,
and for centuries after, he was spoken of as the cleverest and the most
cunning, as well as the bravest of the sons of the ‘land beyond the
forest.’ That mighty brain and that iron resolution went with him to his
grave, and are even now arrayed against us. The Draculas were, says
Arminius, a great and noble race, though now and again were scions who
were held by their coevals to have had dealings with the Evil One. They
learned his secrets in the Scholomance, amongst the 
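
Conceptually, similarity retrieval embeds the query and ranks stored chunk vectors by how close they are to it. The following is a brute-force sketch of that idea with toy 2-D vectors; Chroma's actual index and distance metric may differ, and the names `cosine_similarity` and `top_k` are illustrative assumptions.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec: list[float], doc_vecs: list[list[float]], k: int = 3) -> list[int]:
    """Return the indices of the k vectors most similar to the query, best first."""
    scored = [(cosine_similarity(query_vec, vec), idx) for idx, vec in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [idx for _, idx in scored[:k]]


# Toy "embeddings": vector 0 matches the query exactly, vector 2 is close to it.
doc_vecs = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [-1.0, 0.0]]
best = top_k([1.0, 0.0], doc_vecs, k=2)
print(best)  # [0, 2]
```

A real vector database replaces the linear scan with an approximate nearest-neighbour index, but the ranking idea is the same.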

## Generate Answer Using GPT-4o-mini
Pass the retrieved context and the question to GPT-4o-mini, served through the GitHub Models endpoint, to get an answer.


In [6]:
combined_input = (
    "Use only the following documents to answer the question.\n\n"
    + "\n\n".join([doc.page_content for doc in relevant_docs])
    + f"\n\nQuestion: {query}\n"
    + "Answer briefly. If the answer is not found, respond with 'I'm not sure'."
)

model = ChatOpenAI(
    model="openai/gpt-4o-mini",
    openai_api_key=os.getenv("GITHUB_TOKEN2"),
    openai_api_base="https://models.github.ai/inference"
)

messages = [
    SystemMessage(content="You are a helpful assistant."),
    HumanMessage(content=combined_input),
]

result = model.invoke(messages)
print("Generated Answer:\n", result.content)


Generated Answer:
 Dracula fears being confined to his coffin and destroyed, as indicated by the reference to banishing him from his tomb and the notion of the "Un-Dead" being defeated.
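
To ask several questions against the same store, the prompt assembly from the cell above can be factored into a function. This is a minimal sketch that reuses the notebook's prompt wording; the function name `build_prompt` is a hypothetical helper, not part of the original code.

```python
def build_prompt(chunks: list[str], question: str) -> str:
    """Assemble a grounded-answer prompt from retrieved chunk texts and a question."""
    context = "\n\n".join(chunks)
    return (
        "Use only the following documents to answer the question.\n\n"
        f"{context}"
        f"\n\nQuestion: {question}\n"
        "Answer briefly. If the answer is not found, respond with 'I'm not sure'."
    )


prompt = build_prompt(["First chunk.", "Second chunk."], "What is in the chunks?")
print(prompt)
```

In the notebook this would be called as `build_prompt([doc.page_content for doc in relevant_docs], query)` and the result wrapped in a `HumanMessage`.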


## Summary

- The notebook demonstrates a working RAG pipeline.
- Text data is chunked, embedded, and stored in a vector database.
- A query retrieves the top-k most similar chunks (k = 3 here), which are passed to the LLM as context for answering.
- This approach can be extended to support PDFs, UI apps (Streamlit/Gradio), or multiple LLM backends.
