## ✅ Step 1: Install Required Packages
Before running the code, install the necessary dependencies.

We will install the following:
- **`langchain`** → A framework for building LLM-based applications.
- **`langchain-community`** → Community integrations, including the PDF loader and the Chroma vector store wrapper.
- **`langchain-openai`** → The integration package for OpenAI embeddings (imported as `langchain_openai`).
- **`chromadb`** → A lightweight vector database for storing embeddings.
- **`openai`** → Required for OpenAI API calls.
- **`pypdf`** → PDF parser to extract text from PDF files.

In [None]:

!pip install --upgrade pip
!pip install langchain langchain-community langchain-openai chromadb openai pypdf

## ✅ Step 2: Load and Split Text from PDF

Now that we have installed all required dependencies, the next step is to **load the PDF file and extract text**.  
Since LLMs have **token limits**, we must split large documents into **smaller chunks** for better retrieval.  

### 🔹 Why Split the Text?
- **LLMs have a limited context window** → We break the text into **smaller, meaningful chunks** that fit into a prompt.
- **Improves retrieval accuracy** → Smaller sections make searches more relevant.
- **Chunk overlap preserves context** → Neighboring chunks share a few characters, so text cut at a chunk boundary still appears whole in one of them.
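
To make the idea of chunk overlap concrete, here is a minimal sketch of fixed-size chunking with a sliding window. It is not how `RecursiveCharacterTextSplitter` works internally; the helper name `chunk_with_overlap` and the sample string are assumptions made for illustration only.

```python
def chunk_with_overlap(text, chunk_size, chunk_overlap):
    """Cut text into fixed-size windows; each window repeats the tail of the previous one."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    if not text:
        return []
    step = chunk_size - chunk_overlap  # how far the window advances each time
    # Stop once the remaining text is already covered by the previous window's overlap.
    last_start = max(len(text) - chunk_overlap, 1)
    return [text[start:start + chunk_size] for start in range(0, last_start, step)]


sample_text = "abcdefghijklmnopqrstuvwxyz"  # illustrative input
demo_chunks = chunk_with_overlap(sample_text, chunk_size=10, chunk_overlap=3)
for piece in demo_chunks:
    print(piece)
```

Each chunk starts with the last three characters of the one before it, which is the role `chunk_overlap` plays in the splitter configured in this step.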

### 🔹 What We Will Do:
1. **Load the PDF using `PyPDFLoader`**.
2. **Extract text from all pages**.
3. **Use `RecursiveCharacterTextSplitter`** to break text into chunks.
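
The "recursive" part of the splitter can be sketched in plain Python. This is a simplified illustration, not LangChain's implementation: the function `recursive_split` is hypothetical, it ignores overlap, and it drops the separator at chunk boundaries. It only shows the core idea of trying separators in order and falling back to finer ones when a piece is still too long.

```python
def recursive_split(text, separators, chunk_size):
    """Split on the first separator; recurse with the remaining ones for oversized pieces."""
    if len(text) <= chunk_size:
        return [text] if text.strip() else []
    if not separators:
        # No separators left: fall back to a hard cut every chunk_size characters.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

    sep, finer = separators[0], separators[1:]
    chunks, current = [], ""
    for piece in text.split(sep):
        candidate = piece if not current else current + sep + piece
        if len(candidate) <= chunk_size:
            current = candidate  # keep merging small pieces into one chunk
            continue
        if current:
            chunks.append(current)
            current = ""
        if len(piece) <= chunk_size:
            current = piece
        else:
            # The piece alone is too long: retry with the finer separators.
            chunks.extend(recursive_split(piece, finer, chunk_size))
    if current:
        chunks.append(current)
    return chunks


sample = "Alpha beta. Gamma delta epsilon. Zeta\n\nEta theta iota kappa lambda mu"
split_demo = recursive_split(sample, ["\n\n", ". ", " "], chunk_size=20)
for piece in split_demo:
    print(repr(piece))
```

The order of the separator list matters: the sketch above tries paragraphs first, whereas the cell below lists the sentence separator first.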

---

In [9]:
from langchain_community.document_loaders import PyPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter

# 📌 Set the path to your PDF file
pdf_path = "/Users/hfakhrav/Downloads/2005.11401v4.pdf"  # Update with your actual file path

# 🔍 Load the PDF using PyPDFLoader
loader = PyPDFLoader(pdf_path)
documents = loader.load()

# 📌 Define the RecursiveCharacterTextSplitter
text_splitter = RecursiveCharacterTextSplitter(
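    separators=[". ", "\n\n", " "],  # Tried in order: sentence → paragraph → word
    chunk_size=85,  # Maximum number of characters per chunk
    chunk_overlap=20,  # Characters shared between neighboring chunks to maintain context
    length_function=len,  # Measure chunk length in characters
    strip_whitespace=True,  # Trim leading/trailing whitespace from each chunk
    keep_separator=True  # Keep the separator text in the chunks instead of discarding it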
)

# 🔍 Split the document into chunks
chunks = text_splitter.split_documents(documents)

print(f"✅ Extracted {len(chunks)} chunks from the PDF.")

✅ Extracted 1149 chunks from the PDF.


## ✅ Step 3: Embed and Store in ChromaDB

Now that we have **extracted 1149 chunks from the PDF**, the next step is to **convert these text chunks into embeddings** and **store them in ChromaDB**.

### 🔹 Why Use Embeddings?
- **Embeddings convert text into numerical vectors** that capture **semantic meaning**.
- **Similar vectors indicate similar meanings**, allowing efficient document retrieval.
- **OpenAI's `text-embedding-3-small` model** will be used to generate these embeddings.
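
As a rough illustration of "similar vectors indicate similar meanings", the sketch below computes cosine similarity between small hand-written vectors. The three-dimensional vectors are made up for the example; real embedding models return much longer vectors, and the values here do not come from any model.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_a * norm_b)


# Made-up toy "embeddings" (assumption: purely illustrative values).
toy_vectors = {
    "cat": [0.9, 0.1, 0.0],
    "kitten": [0.8, 0.2, 0.1],
    "car": [0.1, 0.0, 0.9],
}

sim_cat_kitten = cosine_similarity(toy_vectors["cat"], toy_vectors["kitten"])
sim_cat_car = cosine_similarity(toy_vectors["cat"], toy_vectors["car"])
print(f"cat vs kitten: {sim_cat_kitten:.3f}")
print(f"cat vs car:    {sim_cat_car:.3f}")
```

A value near 1 means the vectors point in almost the same direction; a value near 0 means they are unrelated.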

### 🔹 Why Store in ChromaDB?
- **ChromaDB is a fast, lightweight vector database** designed for storing and searching embeddings.
- **Allows quick retrieval of relevant text** using similarity searches.

### 🔹 What We Will Do:
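1. **Re-wrap each chunk's text in a LangChain `Document` object** (the chunks are already `Document`s; this keeps only `page_content` and drops the page metadata).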
2. **Use OpenAI’s `text-embedding-3-small` model** to generate embeddings.
3. **Store the embeddings in a ChromaDB vector store** for retrieval.

---

In [12]:
import getpass
import os

from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import Chroma
from langchain_core.documents import Document

# ✅ Set the OpenAI API key (prompt for it instead of hardcoding it in the notebook)
if not os.environ.get("OPENAI_API_KEY"):
    os.environ["OPENAI_API_KEY"] = getpass.getpass("OpenAI API key: ")

# ✅ Initialize OpenAI embedding model
embedding_model = OpenAIEmbeddings(model="text-embedding-3-small")

# ✅ Convert text chunks into LangChain `Document` format
documents = [Document(page_content=chunk.page_content) for chunk in chunks]

# ✅ Create a Chroma vector store and embed the chunks
vector_store = Chroma.from_documents(documents, embedding_model)

print("✅ Chunks embedded and stored in Chroma successfully!")

✅ Chunks embedded and stored in Chroma successfully!


## ✅ Step 4: Querying ChromaDB for Retrieval

Now that our **chunks are embedded and stored in ChromaDB**, we can **retrieve relevant text** using a similarity search.

### 🔹 How Does Retrieval Work?
1. **Convert the search query into an embedding** using OpenAI's `text-embedding-3-small`.
2. **Compare the query embedding with stored embeddings** in ChromaDB.
3. **Return the most relevant text chunks** based on similarity.
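
The three steps above can be sketched without any external service. This is a toy stand-in, not what ChromaDB or OpenAI do internally: `toy_embed` is a hypothetical word-count "embedding", the store is a plain list, and the sample texts are invented for the example.

```python
import math
from collections import Counter


def toy_embed(text):
    """Hypothetical embedding: a sparse word-count vector (assumption for illustration)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def top_k(query, store, k):
    """Step 1: embed the query. Step 2: score every stored vector. Step 3: return the best k texts."""
    query_vector = toy_embed(query)
    ranked = sorted(store, key=lambda item: cosine(query_vector, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


toy_chunks = [
    "retrieval augmented generation combines retrieval with generation",
    "the weather today looks sunny",
    "dense retrieval uses vector similarity",
]
toy_store = [(text, toy_embed(text)) for text in toy_chunks]

results = top_k("what is retrieval augmented generation", toy_store, k=2)
for rank, text in enumerate(results, start=1):
    print(f"{rank}: {text}")
```

A real vector database replaces the linear scan in `top_k` with an index so that lookups stay fast as the number of stored chunks grows.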

### 🔹 Why is This Useful?
- **Retrieve relevant context** for answering questions.
- **Find specific sections of a document** instantly.
- **Improve AI-generated responses** by retrieving factual information.

---

In [14]:
# 📌 Define a user query
query = "What is Retrieval-Augmented Generation (RAG)?"

# 🔍 Retrieve the top 3 most relevant chunks
retrieved_docs = vector_store.similarity_search(query, k=3)

# 📌 Print retrieved documents
print("🔍 Retrieved Chunks:")
for i, doc in enumerate(retrieved_docs):
    print(f"{i+1}: {doc.page_content}\n")

🔍 Retrieved Chunks:
1: we refer to as retrieval-augmented generation (RAG).
We build RAG models where the

2: retrieval-augmented generation
(RAG) — models which combine pre-trained parametric

3: like RAG, generate answers, but which do not exploit retrieval, instead
relying



## ✅ Step 5: Integrating Retrieval into a Chatbot

Now that we can **retrieve relevant text from ChromaDB**, we will:
1. **Accept user input (a question)**
2. **Retrieve the most relevant chunks from ChromaDB**
3. **Generate an AI-powered response** using OpenAI’s GPT model.
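
The step between retrieval and generation is plain string handling, so it can be tried without an API key. The sketch below is one possible way to assemble the prompt; the helper `build_prompt`, the numbering of chunks, and the duplicate filtering are assumptions for illustration rather than part of the notebook's code.

```python
def build_prompt(question, retrieved_chunks):
    """Assemble a prompt from retrieved chunk texts, dropping exact duplicates in order."""
    unique_chunks = list(dict.fromkeys(chunk.strip() for chunk in retrieved_chunks))
    context = "\n\n".join(
        f"[{index}] {chunk}" for index, chunk in enumerate(unique_chunks, start=1)
    )
    return (
        "Answer the question using the retrieved context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n\n"
        "Answer:"
    )


# Illustrative inputs; in the notebook these would come from the similarity search.
example_chunks = [
    "we refer to as retrieval-augmented generation (RAG).",
    "we refer to as retrieval-augmented generation (RAG).",
    "models which combine pre-trained parametric",
]
example_prompt = build_prompt("What is RAG?", example_chunks)
print(example_prompt)
```

Keeping prompt construction in its own function makes it easy to inspect exactly what text is sent to the model.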

### 🔹 Why Integrate a Chatbot?
- **Interactive Q&A system** for document-based queries.
- **Retrieve accurate responses** based on stored knowledge.
- **Can be integrated into a website, Slack bot, or internal tool.**

---

In [28]:
import openai

# ✅ Reuse `vector_store` from Step 3. Calling `Chroma.from_documents` again here
# would add the same chunks to the in-memory collection a second time, which shows
# up as duplicate entries in the search results.

# ✅ Create the OpenAI client once; it reads OPENAI_API_KEY from the environment
client = openai.OpenAI()

def ask_question(query):
    """
    Function to retrieve relevant text from ChromaDB and generate an AI response.
    """
    # ✅ Retrieve the top 3 most relevant chunks
    retrieved_docs = vector_store.similarity_search(query, k=3)

    # ✅ Combine the retrieved chunks into a single context
    context = "\n\n".join(doc.page_content for doc in retrieved_docs)

    # ✅ Construct the prompt for GPT
    prompt = (
        "You are an AI assistant. Answer the question using the retrieved context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\n\n"
        "Answer:"
    )

    # ✅ Generate the response with the Chat Completions API
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": "You are an AI that answers questions based on retrieved documents."},
            {"role": "user", "content": prompt},
        ],
    )

    # ✅ Extract and return the response
    return response.choices[0].message.content

# ✅ Run the chatbot interactively
while True:
    user_query = input("Ask a question (or type 'exit' to quit): ")
    if user_query.strip().lower() == "exit":
        print("Goodbye!")
        break
    response = ask_question(user_query)
    print("\n🤖 Chatbot Response:\n", response, "\n")


Ask a question (or type 'exit' to quit):  What is Retrieval-Augmented Generation (RAG)?



🤖 Chatbot Response:
 Retrieval-Augmented Generation (RAG) refers to models that combine pre-trained parametric. These models are built in a specific way using RAG principles. 



Ask a question (or type 'exit' to quit):  how is the weather?



🤖 Chatbot Response:
 The document does not provide information on the current weather. 



Ask a question (or type 'exit' to quit):  quit



🤖 Chatbot Response:
 I'm sorry, but I don't understand your request as there's no pertinent information given in the context. Please provide a relevant context. 



Ask a question (or type 'exit' to quit):  exit


Goodbye!


In [34]:
!pip install --upgrade langchain-chroma

In [36]:
from langchain_chroma import Chroma  # ✅ Maintained replacement for the deprecated community `Chroma` wrapper
