### Install required libraries

In [None]:
%pip install llama_index
%pip install llama-index-vector-stores-chroma
%pip install llama_index-embeddings-gemini
%pip install llama-index-llms-gemini
%pip install PyPDF2
%pip install chromadb
%pip install google-generativeai

# AI-Based PDF Chatbot Architecture Overview

This notebook outlines the architecture of an AI-based PDF chatbot that combines multiple components to create an interactive QA bot capable of retrieving information from documents and generating responses using Google’s Gemini generative AI model. Below is a detailed breakdown of the system structure and functionality:

---

## 1. Document Handling

### PDF or TXT Input:
- The bot can load text data from either **.txt** or **.pdf** files.
- PDF loading is handled using the **PyPDF2** library, which extracts raw text from PDFs and converts it into a single document object.

### Text Splitting and Chunking:
- The document is split into manageable chunks using the **SentenceSplitter**.
- Each chunk has a defined size (`chunk_size=512`) and overlap (`chunk_overlap=20`) to ensure smoother transitions between sentences across document splits.
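
To make the effect of `chunk_size` and `chunk_overlap` concrete, here is a minimal sketch of overlapping chunking. It is an illustration only: it splits on whitespace instead of using a real tokenizer, and it ignores sentence boundaries, which `SentenceSplitter` tries to respect. The helper name `chunk_tokens` is hypothetical and not part of LlamaIndex.

```python
def chunk_tokens(tokens, chunk_size=512, chunk_overlap=20):
    """Split a token list into windows of chunk_size that share chunk_overlap tokens."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # this window already reaches the end of the text
    return chunks


# Toy example with small numbers so the overlap is easy to see.
words = "one two three four five six seven eight nine ten".split()
for chunk in chunk_tokens(words, chunk_size=4, chunk_overlap=1):
    print(chunk)
```

Each printed chunk starts with the last word of the previous one, which is the same continuity that the 20-token overlap is meant to provide.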

---

## 2. Vector Store & Embeddings

### Vector Store with Chroma:
- Document chunks are stored in a Chroma collection through **ChromaVectorStore**, the LlamaIndex wrapper (installed with `llama-index-vector-stores-chroma`) around `chromadb`.
- **Chroma** handles vector-based searches by comparing the user query embeddings to the document chunk embeddings, allowing the chatbot to retrieve relevant sections from the document.

### Embeddings:
- The bot uses **GeminiEmbedding** for embedding the document text into vectors that enable efficient vector-based retrieval.

---

## 3. Generative Model

### LLM (Large Language Model):
- The core LLM is Google’s **Gemini** model, which powers the generative response capability of the chatbot.
- The model used is **gemini-pro**, accessed through Google's Generative AI API.

### LLM Settings:
- Configurations such as **API keys**, **model name**, and output limits (`num_output=512`) are defined to control the generation of responses from the model.

---

## 4. Retrieval-Augmented Generation (RAG)

### Index Creation:
- After the document is split and embedded, the vectorized chunks are indexed using **VectorStoreIndex**.
- This index is used to retrieve the most relevant document segments based on a user’s query.

### Retrieval:
- When a query is sent to the bot, the vector store is queried to retrieve relevant document chunks that match the query's semantics.

### Response Generation:
- The **Gemini model** generates responses by considering both the retrieved document segments and the ongoing conversation context.

---

## 5. Chat Engine

### Chat Engine:
- The `_create_chat_engine()` function sets up the chat engine using `as_chat_engine`, where retrieved document segments are fed into the generative model to create a final response.

---

## Workflow of Interaction:

1. **Document Upload**: The user uploads a **.txt** or **.pdf** file.
2. **Vector Index Creation**: 
   - The document is chunked, embedded using **GeminiEmbedding**, and indexed in **Chroma’s vector store**.
3. **Query Input**: 
   - The user submits a query.
4. **Document Retrieval**: 
   - The vector index retrieves relevant document sections based on the semantic similarity of the query and the document chunks.
5. **Response Generation**: 
   - The **generative model** creates a final answer by combining the retrieved segments and any prior conversation context.
6. **User Interaction**: 
   - The response is presented to the user, along with relevant document sections (if any were retrieved).

---

## Generative Responses

The chatbot generates answers based on a combination of the following:

### Document-based Retrieval:
- The bot primarily refers to the uploaded document to generate accurate and document-specific answers.


### Generative Completion:
- If the document and memory do not provide sufficient information, the bot can generate an answer based on its general knowledge or indicate that the information is unavailable.

---



# Challenges Faced and Solutions

## Key Decisions and Design

### 1. Choosing Google Gemini API for LLM and Embeddings
- **Decision**: Use `Google Generative AI` and `Gemini` models for both the language model (LLM) and embeddings.
- **Reason**: `Gemini` offers state-of-the-art language understanding and embeddings, ensuring performance and consistency.
- **Challenge**: API key integration and request management.
- **Solution**: Set up credentials using environment variables and configure API keys during app initialization.
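
A minimal sketch of reading the key from the environment instead of writing it into the notebook. The variable name `GOOGLE_API_KEY` and the helper `load_api_key` are assumptions chosen for illustration; any variable name works as long as it is set before the notebook starts.

```python
import os


def load_api_key(var_name="GOOGLE_API_KEY"):
    """Return the API key stored in an environment variable, or fail loudly."""
    key = os.environ.get(var_name, "").strip()
    if not key:
        raise RuntimeError(
            f"Environment variable {var_name} is not set; "
            "export it before starting the notebook."
        )
    return key


# Demonstration only: a real key would be exported outside the notebook.
os.environ.setdefault("GOOGLE_API_KEY", "placeholder-key-for-demo")
api_key = load_api_key()
print(f"Loaded a key of length {len(api_key)}")
```

The returned value could then be passed to `genai.configure(api_key=api_key)` so that no key appears in the notebook source.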

---

### 2. Using ChromaDB for Vector Storage
- **Decision**: Utilize `ChromaDB` for vector storage and retrieval during QA.
- **Reason**: Efficiently stores document embeddings and scales easily for retrieval tasks.
- **Challenge**: Managing the vector store without creating conflicts.
- **Solution**: Check whether the collection already exists and, if it does, delete it before creating a new one, so the knowledge base always reflects the latest upload.
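
The delete-then-recreate pattern can be sketched independently of Chroma. The `FakeClient` below is a hypothetical in-memory stand-in that only mimics the three client methods the notebook calls (`list_collections`, `delete_collection`, `create_collection`); it is not the `chromadb` API. The sketch also assumes that, depending on the installed `chromadb` release, `list_collections()` may return either collection objects or plain names, so it handles both.

```python
def reset_collection(client, name):
    """Delete the collection called `name` if present, then create it fresh."""
    # Assumption: entries are either name strings or objects with a .name attribute.
    existing = {
        col if isinstance(col, str) else col.name
        for col in client.list_collections()
    }
    if name in existing:
        client.delete_collection(name)
    return client.create_collection(name)


class FakeClient:
    """Tiny in-memory stand-in used only to exercise reset_collection."""

    def __init__(self):
        self._collections = {}

    def list_collections(self):
        return list(self._collections)

    def delete_collection(self, name):
        del self._collections[name]

    def create_collection(self, name):
        if name in self._collections:
            raise ValueError(f"Collection {name} already exists")
        self._collections[name] = []
        return self._collections[name]


client = FakeClient()
first = reset_collection(client, "collection")
first.append("stale embedding")
second = reset_collection(client, "collection")
print(second)  # the recreated collection starts empty
```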

---

### 3. Supporting PDF and TXT Files
- **Decision**: Allow uploads of both `.pdf` and `.txt` files.
- **Reason**: Both are common formats, so supporting them makes the bot accessible to more users.
- **Challenge**: Handling PDF extraction from complex layouts.
- **Solution**: Use `PyPDF2` for extracting PDF text and wrap it in `Document` objects for uniformity between formats.

---

### 4. Chunking Documents for Efficient Embedding
- **Decision**: Split documents into smaller chunks using `SentenceSplitter` with a 512-token chunk size and 20-token overlap.
- **Reason**: Avoid overloading the LLM’s context window, improving performance and accuracy.
- **Challenge**: Retaining document context while splitting.
- **Solution**: Include a 20-token overlap between chunks to maintain continuity in responses.

---

### 5. Creating the Chat Engine
- **Decision**: Implement a conversational context using `ChatMemoryBuffer` and an LLM-powered chat engine.
- **Reason**: Handle follow-up questions with context for better user experience.
- **Challenge**: Managing memory size to handle long conversations without exceeding token limits.
- **Solution**: Set a large token limit of 150,000 tokens to balance context retention with performance.
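
To illustrate what a token limit on chat memory does, here is a simplified sketch of a buffer that keeps only the most recent messages fitting within the limit. It is not the implementation of `ChatMemoryBuffer`; the class `TokenLimitedMemory` and the whitespace-based `count_tokens` are hypothetical stand-ins.

```python
from collections import deque


def count_tokens(text):
    """Crude token estimate: whitespace-separated words (an assumption)."""
    return len(text.split())


class TokenLimitedMemory:
    """Keeps the most recent messages whose combined size fits token_limit."""

    def __init__(self, token_limit):
        self.token_limit = token_limit
        self._messages = deque()

    def put(self, role, content):
        self._messages.append((role, content))

    def get(self):
        """Return the newest messages, oldest first, that fit within the limit."""
        kept, used = [], 0
        for role, content in reversed(self._messages):
            cost = count_tokens(content)
            if used + cost > self.token_limit:
                break  # everything older than this is left out of the prompt
            kept.append((role, content))
            used += cost
        return list(reversed(kept))


memory = TokenLimitedMemory(token_limit=8)
memory.put("user", "What is the main purpose of the document?")
memory.put("assistant", "It discusses selfless service.")
memory.put("user", "Give me ten lessons.")
print(memory.get())
```

With a limit of 8 the oldest message is dropped. With a limit as large as 150,000, a short session is unlikely to trigger any trimming at all.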

---

## Challenges and Solutions

### 1. File Handling for Different Formats
- **Challenge**: Efficient handling of both TXT and PDF file formats.
- **Solution**: Use `PyPDF2` for PDF extraction and `SimpleDirectoryReader` for TXT files, ensuring robustness in loading documents.
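
The format check can be sketched on its own, without loading any file. The helper `detect_format` and the mapping `SUPPORTED_SUFFIXES` are hypothetical names. Unlike a plain `endswith('.pdf')` test, this version lowercases the extension, so files such as `REPORT.PDF` are also accepted.

```python
from pathlib import Path

SUPPORTED_SUFFIXES = {".txt": "text", ".pdf": "pdf"}


def detect_format(document_path):
    """Map a file path to 'text' or 'pdf' based on its extension."""
    suffix = Path(document_path).suffix.lower()  # so "REPORT.PDF" also matches
    try:
        return SUPPORTED_SUFFIXES[suffix]
    except KeyError:
        raise ValueError(
            f"Unsupported file format {suffix!r}. Please provide a .txt or .pdf file."
        ) from None


for name in ["notes.txt", "REPORT.PDF"]:
    print(name, "->", detect_format(name))
```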

---

### 2. Ensuring Embedding Consistency and Efficiency
- **Challenge**: Embedding large documents efficiently without overloading the system.
- **Solution**: Split documents into manageable chunks and embed them using Google's `GeminiEmbedding`, ensuring the vector store remains scalable.

---

### 3. Managing the Knowledge Base
- **Challenge**: Efficiently updating the ChromaDB vector store for new document uploads.
- **Solution**: Delete outdated collections and recreate them during every new upload, ensuring fresh embeddings and avoiding bottlenecks.

---

### 4. Graceful Error Handling
- **Challenge**: Handling errors during file loading or API interaction without crashing the app.
- **Solution**: Implement `try-except` blocks with meaningful error messages to inform users about issues like failed PDF extraction.

---

### 5. User-Friendly Interaction with the LLM
- **Challenge**: Ensuring the bot provides accurate answers based only on the uploaded document.
- **Solution**: Use system prompts to guide the bot to answer within the document’s scope and avoid hallucinations.

---

## Key Components

1. **QAChatbot Class**: Centralized logic for document processing, knowledge base creation, and user interaction.
2. **Embedding and Indexing**: Managed by the `_create_kb` method, responsible for embedding and storing documents in ChromaDB.
3. **Chat Memory and Chat Engine**: Uses `ChatMemoryBuffer` to retain context during conversations for smooth follow-up queries.
4. **Error Handling and Debugging**: Includes debugging print statements to track app flow and catch issues during development.

---


### Library Imports

In this section, we import the necessary libraries for our project. Each library serves a specific purpose in our workflow.

1. **`from llama_index.embeddings.gemini import GeminiEmbedding`**
   - **Purpose:** Provides embeddings functionality using the Gemini model. Embeddings are used to convert text into vector representations for processing.

2. **`from llama_index.vector_stores.chroma import ChromaVectorStore`**
   - **Purpose:** Interfaces with Chroma, a vector database for storing and retrieving vector embeddings. Chroma is used to manage and query large sets of embeddings efficiently.

3. **`from llama_index.core import SimpleDirectoryReader, VectorStoreIndex`**
   - **Purpose:**
     - `SimpleDirectoryReader`: Reads documents from a directory or from an explicit list of files (`input_files`). Here it loads `.txt` uploads.
     - `VectorStoreIndex`: Creates an index from documents using vector embeddings. This index is used to perform similarity searches.

4. **`from llama_index.core.memory import ChatMemoryBuffer`**
   - **Purpose:** Manages memory for chat interactions, allowing the model to remember previous interactions within a session.

5. **`from llama_index.core.storage.storage_context import StorageContext`**
   - **Purpose:** Bundles the storage backends that an index writes to. Here it points `VectorStoreIndex` at the Chroma-backed vector store.

6. **`from llama_index.core import Settings`**
   - **Purpose:** Provides configuration settings for the model, including parameters for embeddings, chunking, and output.

7. **`from llama_index.llms.gemini import Gemini`**
   - **Purpose:** Interfaces with the Gemini language model for generating responses based on the provided context and queries.

8. **`import PyPDF2`**
   - **Purpose:** A library for reading and extracting text from PDF files. It is used to handle PDF document content.

9. **`import chromadb`**
   - **Purpose:** A client library for interacting with the Chroma vector database. It is used for managing collections and querying vectors.

10. **`import google.generativeai as genai`**
    - **Purpose:** Interfaces with Google's Generative AI API for generating text and other AI-driven functionalities.

11. **`import warnings`**
    - **Purpose:** Provides a way to issue and control warnings. It's used to suppress or display warnings during execution.

12. **`import os`**
    - **Purpose:** Provides a way to interact with the operating system, such as setting environment variables and managing file paths.


In [2]:
from llama_index.embeddings.gemini import GeminiEmbedding
from llama_index.vector_stores.chroma import ChromaVectorStore
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex
from llama_index.core.memory import ChatMemoryBuffer
from llama_index.core.storage.storage_context import StorageContext
from llama_index.core import Settings
from llama_index.llms.gemini import Gemini
import PyPDF2
import chromadb
import google.generativeai as genai
import warnings
import os

### **Model Architecture**

The system integrates various components to perform question-answering using a hybrid Retrieval-Augmented Generation (RAG) approach. The core components of the architecture are:

1. **Document Ingestion and Preprocessing**:
   - Documents are loaded from local files: `.txt` files through `SimpleDirectoryReader` and `.pdf` files through `PyPDF2`.
   - Preprocessing includes tokenizing the text and preparing it for embedding by chunking it into manageable parts.

2. **Embeddings Generation**:
   - A pre-trained embedding model like **GeminiEmbedding** is used to convert the text into high-dimensional vector representations. These embeddings capture semantic meaning, allowing for similarity-based retrieval later.
   - Embeddings are created for both documents and queries to facilitate comparison.

3. **Vector Storage**:
   - The system stores the generated embeddings in a vector database such as **ChromaVectorStore**. This allows for efficient querying based on vector similarity. Each document chunk’s embedding is indexed to enable retrieval during question-answering.

4. **Question-Answering Pipeline**:
   - When a user submits a question, it is first converted into an embedding using the same **GeminiEmbedding** model.
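   - The vector store is queried to find the most similar document chunks by comparing the question's embedding with the document embeddings. Cosine similarity is a common metric for this; the metric Chroma actually applies depends on how the collection is configured.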
   
5. **Generative Response**:
   - The retrieved document chunks are passed to the generative language model, **Gemini**, accessed through Google's Generative AI API. The model generates a coherent and contextually appropriate response by synthesizing information from the retrieved documents.
   - The final response is generated based on a combination of the retrieved content and the model’s generative capabilities.


In [3]:

# Set up your Google API credentials
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "/content/your Google API credentials.json"
warnings.filterwarnings("ignore")

# Set up Google Generative AI API key
genai.configure(api_key='Google api key')

### Configuration of Settings for Language Model and Embeddings

In this section, we configure various settings for the language model (LLM) and the embedding model.

1. **`from llama_index.core.node_parser import SentenceSplitter`**
   - **Purpose:** Imports the `SentenceSplitter` class used to split text into manageable chunks for processing.

2. **`Settings.llm = Gemini(models='gemini-pro', api_key='Google api key')`**
   - **Purpose:** Configures the language model to use Gemini, specifically the 'gemini-pro' model. The `api_key` is required for accessing the Gemini API.
   - **Why:** Setting up the LLM allows the system to generate responses based on the provided context and queries.

3. **`Settings.embed_model = GeminiEmbedding(model_name="models/embedding-001", api_key='Google api key')`**
   - **Purpose:** Configures the embedding model to use `GeminiEmbedding`, specifying the model name and `api_key` for creating vector embeddings.
   - **Why:** Embeddings are used to convert text into numerical vectors, which are necessary for indexing and similarity search.

4. **`Settings.node_parser = SentenceSplitter(chunk_size=512, chunk_overlap=20)`**
   - **Purpose:** Sets the `SentenceSplitter` to split text into chunks of 512 tokens, with a 20-token overlap between chunks.
   - **Why:** This ensures that large texts are divided into smaller parts, making it easier to process and analyze, while maintaining context between chunks.

5. **`Settings.num_output = 512`**
   - **Purpose:** Specifies the number of tokens reserved for the model's response.
   - **Why:** Limits the response length to 512 tokens, balancing detail and conciseness.

6. **`Settings.context_window = 3900`**
   - **Purpose:** Defines the context window size, which is the number of tokens the model can consider when generating a response.
   - **Why:** A larger context window allows the model to take more context into account, potentially improving the relevance and accuracy of the responses.
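
The interaction between `context_window`, `num_output`, and `chunk_size` comes down to simple arithmetic, sketched below. This is a simplified accounting under stated assumptions: the helper names and the `overhead` parameter (tokens used by the system prompt, the query, and chat history) are hypothetical, and the bookkeeping LlamaIndex performs internally may differ.

```python
def prompt_budget(context_window, num_output):
    """Tokens left for the prompt once room for the answer is reserved."""
    if num_output >= context_window:
        raise ValueError("num_output must be smaller than context_window")
    return context_window - num_output


def max_full_chunks(context_window, num_output, chunk_size, overhead=0):
    """How many chunk_size-token chunks fit in the remaining budget."""
    available = prompt_budget(context_window, num_output) - overhead
    return max(available // chunk_size, 0)


budget = prompt_budget(context_window=3900, num_output=512)
print(budget)  # 3900 - 512 = 3388
print(max_full_chunks(3900, 512, chunk_size=512))  # 3388 // 512 = 6
```

With the values configured in this notebook, at most six full 512-token chunks would fit alongside the reserved output, and fewer once the system prompt and conversation history are counted.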


In [4]:
from llama_index.core.node_parser import SentenceSplitter

# Configuring the settings for the LLM and embedding model
Settings.llm = Gemini(models='gemini-pro', api_key='Google api key')
# Sets the language model to Gemini with a specific API key for generating responses.

Settings.embed_model = GeminiEmbedding(model_name="models/embedding-001", api_key='Google api key')
# Configures the embedding model to use GeminiEmbedding with a specific model name and API key for creating vector embeddings.

Settings.node_parser = SentenceSplitter(chunk_size=512, chunk_overlap=20)
# Sets up the sentence splitter to process text into chunks of size 512 tokens with an overlap of 20 tokens between chunks.

Settings.num_output = 512
# Specifies the number of tokens to be generated in the model’s response.

Settings.context_window = 3900
# Sets the context window size to 3900 tokens, which determines how much text the model considers when generating a response.

### **Retrieval Approach**

The system employs a vector-based retrieval approach. Here's how the retrieval mechanism works:

1. **Document Chunking**:
   - Large documents are divided into smaller chunks to ensure that each part is manageable for embedding and retrieval.
   - Each chunk is transformed into an embedding, which is a vector representation of the chunk’s semantic content.

2. **Similarity Search**:
   - When a query is received, its embedding is compared to the document embeddings stored in **ChromaVectorStore**.
   - A common measure for this comparison is cosine similarity, which is based on the angle between the query vector and a document vector: the smaller the angle, the higher the similarity. The metric Chroma actually applies depends on how the collection is configured.
   
3. **Retrieval of Top Matches**:
   - The vector store returns the most relevant document chunks based on similarity. These chunks serve as the context for generating an answer to the query.
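
A minimal sketch of cosine-similarity ranking, written in plain Python to show the idea. The vector store performs this search internally, and its actual distance metric depends on the collection's configuration. The vectors below are made-up three-dimensional values, not real Gemini embeddings, and the helper names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def top_k(query_vec, chunk_vecs, k=2):
    """Return (index, score) pairs for the k chunks most similar to the query."""
    scored = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(chunk_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Made-up 3-dimensional "embeddings"; real ones have far more dimensions.
chunks = [[1.0, 0.0, 0.0], [0.7, 0.7, 0.0], [0.0, 0.0, 1.0]]
query = [1.0, 0.1, 0.0]
for index, score in top_k(query, chunks):
    print(index, round(score, 3))
```

The chunk pointing in nearly the same direction as the query ranks first, and the orthogonal chunk is excluded from the top two.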


### QAChatbot Class Overview

The `QAChatbot` class is designed to create a chatbot that can answer questions based on the content of a document. Here’s a breakdown of the main components:

#### Initialization

- **`__init__(self, document_path)`**
  - **Purpose:** Initializes the chatbot with the path to the document and sets up the ChromaDB client and language model.
  - **Attributes:**
    - `self._chroma_client`: ChromaDB client for vector storage.
    - `self._llm`: Language model (LLM) used for generating responses.
    - `self._document_path`: Path to the document.
    - `self._index`: Placeholder for the vector store index.
  - **Methods Called:**
    - `_create_kb()`: Creates the knowledge base from the document.
    - `_create_chat_engine()`: Initializes the chat engine using the knowledge base.

#### Creating the Knowledge Base

- **`_create_kb(self)`**
  - **Purpose:** Loads the document, creates a vector store, and generates a knowledge base index.
  - **Steps:**
    - Checks the document type (TXT or PDF) and loads the content.
    - Deletes any existing collection in ChromaDB.
    - Creates a new collection and sets up the vector store and index.
  - **Why:** This method prepares the data for querying by converting it into a format that can be used by the chatbot.

#### Loading PDF Documents

- **`_load_pdf(self, pdf_path)`**
  - **Purpose:** Extracts text from a PDF file and wraps it into `Document` objects.
  - **Steps:**
    - Opens the PDF file and reads its content.
    - Extracts text from each page and compiles it into a `Document` object.
  - **Why:** This method enables the chatbot to handle PDF files as input and convert them into a usable format.

#### Creating the Chat Engine

- **`_create_chat_engine(self)`**
  - **Purpose:** Initializes the chat engine with context mode and memory.
  - **Steps:**
    - Sets up a memory buffer with a token limit.
    - Configures the chat engine with context mode, memory, and the language model.
  - **Why:** This method enables the chatbot to interact with users and generate responses based on the knowledge base.

#### Interacting with the Language Model

- **`interact_with_llm(self, user_query)`**
  - **Purpose:** Sends a query to the chat engine and returns the generated response.
  - **Steps:**
    - Checks if the chat engine is initialized.
    - Sends the user query to the chat engine and retrieves the response.
  - **Why:** This method allows users to ask questions and receive answers from the chatbot.

#### Prompt Definition

- **`_prompt(self)`**
  - **Purpose:** Provides a prompt that defines the behavior of the AI assistant.
  - **Content:** Guides the assistant to answer questions based on the provided document and states if information is not available.
  - **Why:** This prompt ensures that the chatbot answers questions accurately and consistently based on the document.


In [5]:
from llama_index.core import Document

In [8]:
class QAChatbot:
    def __init__(self, document_path):
        self._chroma_client = chromadb.EphemeralClient()
        self._llm = Settings.llm
        self._document_path = document_path
        self._index = None
        self._create_kb()
        self._create_chat_engine()

    def _create_kb(self):
        try:
            # Check if the document is a PDF or TXT
            if self._document_path.endswith('.txt'):
                print("Reading TXT file")
                reader = SimpleDirectoryReader(input_files=[self._document_path])
                documents = reader.load_data()
                print("Documents loaded successfully from TXT file")
            elif self._document_path.endswith('.pdf'):
                documents = self._load_pdf(self._document_path)
                print("Documents loaded successfully from PDF file")
            else:
                raise ValueError("Unsupported file format. Please provide a .txt or .pdf file.")

            # Check if the collection already exists and delete it
            collection_name = "collection"
            existing_collections = [col.name for col in self._chroma_client.list_collections()]

            if collection_name in existing_collections:
                self._chroma_client.delete_collection(collection_name)  # Delete the whole collection
                print(f"Deleted existing collection: {collection_name}")

            # Create a new collection
            chroma_collection = self._chroma_client.create_collection(collection_name)
            print(f"Created new collection: {collection_name}")

            vector_store = ChromaVectorStore(chroma_collection=chroma_collection)
            storage_context = StorageContext.from_defaults(vector_store=vector_store)

            # Create the vector index with documents and embeddings
            self._index = VectorStoreIndex.from_documents(documents, storage_context=storage_context, embed_model=Settings.embed_model)
            print("Knowledgebase created successfully!")
        except Exception as e:
            print(f"Error while creating knowledgebase: {e}")
            self._index = None


    def _load_pdf(self, pdf_path):
        # Extract text from PDF
        try:
            with open(pdf_path, 'rb') as file:
                pdf_reader = PyPDF2.PdfReader(file)
                text = ""
                for page in pdf_reader.pages:
                    # Guard against pages that yield no extractable text
                    text += page.extract_text() or ""

            # Wrap the extracted text into Document objects
            documents = [Document(text=text)]  # Create a list of Document objects
            return documents
        except Exception as e:
            print(f"Error loading PDF: {e}")
            return []

    def _create_chat_engine(self):
        # Always define the attribute so interact_with_llm can test it safely
        self._chat_engine = None
        if self._index is None:
            print("Knowledgebase is not created. Cannot create chat engine.")
            return
        memory = ChatMemoryBuffer.from_defaults(token_limit=150000)
        self._chat_engine = self._index.as_chat_engine(
            chat_mode="context",
            memory=memory,
            system_prompt=self._prompt,
            llm=self._llm,
        )

    def interact_with_llm(self, user_query):
        try:
            # getattr guards against the attribute never having been set
            if getattr(self, "_chat_engine", None) is None:
                print("Chat engine is not initialized due to knowledgebase creation failure.")
                # Return a tuple so callers can always unpack (answer, segments)
                return "Sorry, the chat engine is not available.", []

            print("Debug: Sending query to chat engine")
            chat_response = self._chat_engine.chat(user_query)
            print("Debug: Received response from chat engine")

            # Extract the generated answer
            answer = chat_response.response

            # Extract retrieved document segments (accessing node and then its content)
            retrieved_segments = [node.node.text for node in chat_response.source_nodes]

            # Return both the answer and the retrieved document segments
            return answer, retrieved_segments

        except KeyError as e:
            print(f"KeyError: {e}")
            return "An error occurred while processing your query.", []

    @property
    def _prompt(self):
        return """
            You are a professional AI assistant that answers questions based on the provided document.
            Use relevant information from the document to answer any questions accurately and concisely.

            If the answer is not available in the document, try to answer based on the previous conversation, but only if you are confident in the answer.

            If the information is not available in document and in previous conversation, simply state that the answer is not in the document.
        """

### **Generative Response Creation**

Once the relevant document chunks are retrieved, the system moves to the response generation phase:

1. **Input to Generative Model**:
   - The retrieved chunks are concatenated to form the context, which is passed as input to the **Gemini** model.
   - The model uses this context to generate a response that addresses the user’s query.

2. **Contextual Answer Generation**:
   - The generative model leverages the information in the context while applying its internal knowledge to generate a fluent, coherent, and contextually relevant answer.
   - Depending on the complexity of the query, the response may combine information from multiple chunks.

3. **Final Output**:
   - The generated response is returned to the user, providing an answer that is both factual (based on retrieved content) and linguistically natural (thanks to the generative capabilities of the language model).
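
The concatenation step can be sketched as a small prompt-building function. The template below is a hypothetical layout for illustration; it is not the prompt that LlamaIndex's context chat mode actually sends, and `build_context_prompt` is not a library function.

```python
def build_context_prompt(system_prompt, retrieved_chunks, user_query):
    """Join retrieved chunks into one context block ahead of the question."""
    context = "\n\n".join(
        f"[Chunk {i}]\n{chunk.strip()}" for i, chunk in enumerate(retrieved_chunks, 1)
    )
    if not context:
        context = "(no relevant passages were retrieved)"
    return (
        f"{system_prompt.strip()}\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {user_query.strip()}\n"
        "Answer:"
    )


prompt = build_context_prompt(
    system_prompt="Answer using only the provided document.",
    retrieved_chunks=["Perform your obligatory duty.", "Working is better than sitting idle."],
    user_query="Why should one work?",
)
print(prompt)
```

Numbering the chunks makes it easier to trace which retrieved passage an answer draws on when the retrieved segments are shown next to the response.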


In [9]:
document_path = "bhagavad-gita-in-english-source-file.pdf"
chatbot = QAChatbot(document_path)
response = chatbot.interact_with_llm("What is the main purpose of the document?")
print("Chat Bot")
print(response[0])
print("RAG")
print(response[1])

Documents loaded successfully from PDF file
Deleted existing collection: collection
Created new collection: collection
Knowledgebase created successfully!
Debug: Sending query to chat engine
Debug: Received response from chat engine
Chat Bot
The document is a selection of verses from the Bhagavad Gita, a Hindu scripture, focusing on the concepts of selfless service, enlightenment, and achieving union with the Supreme Being. 

RAG
['(3.05) The deluded \nones, who restrain their organs of action but mentally dwell upon \nthe sense enjoyment, are called hypocrites. (3.06)  \nWhy one should serve others?  \nOne who controls the senses by a trained and purified mind \nand intellect, and engages the organs of action to selfless service, \nis superior, O Arjuna. (3.07) Perform your obligatory duty because \nworking is indeed better than sitting idle. Even the maintena nce of \nyour body would be impossible without work. (3.08) Human beings  \nare bound by work that is not performed as a selfl

In [10]:
response = chatbot.interact_with_llm("give top 10 life lesson from this pdf")
print("Chat Bot")
print(response[0])
print("RAG")
print(response[1])

Debug: Sending query to chat engine
Debug: Received response from chat engine
Chat Bot
Here are 10 life lessons from the provided excerpt of the Bhagavad Gita, focusing on the themes of enlightenment, self-realization, and spiritual liberation:

1. **True Liberation:**  The path to liberation lies in merging with the Source, dedicating oneself to it, and letting go of impurities through knowledge. This leads to freedom from the cycle of rebirth. (5.16)
2. **Equality and Compassion:**  An enlightened person sees everyone and everything with equal regard, recognizing the divine within all beings, regardless of their social status or perceived worth. (5.17)
3. **Inner Peace through Equanimity:**  Achieving inner peace and realizing God comes from cultivating a mind that remains undisturbed by external circumstances, both pleasant and unpleasant. (5.18, 5.19)
4. **The Joy of Self-Realization:**  True happiness comes from connecting with the Self through contemplation and experiencing the j

In [11]:
response = chatbot.interact_with_llm("what is the weather today")
print("Chat Bot")
print(response[0])
print("RAG")
print(response[1])

Debug: Sending query to chat engine
Debug: Received response from chat engine
Chat Bot
This document does not contain information about the weather. 

RAG
['I am the thunderbolt among weap-\nons, and I am Cupid for procreation. (10.27 -28) I am the water god \nand the manes. I am the controller of death. I am death among the \nhealers, lion among the beasts, and the king of birds among birds. \n(10.29-30) I am the wind among the purifiers and Lord R ama \namong the warriors. I am the crocodile among the fishes and the \nholy Gang a river among the rivers. (10.31)  \nI am the beginning, the middle, and the end of all creation, O \nArjuna. Among knowledge I am  knowledge of the supreme Self. I \nam logic of the logician. (10.32) I am the letter ‘A’ among the al-\nphabets. I am the dual compound among the compound words. I \nam the endless time. I am the sustainer of all, and have faces on \nall sides (or I am omniscient) . (10.33) I am the all -devouring death \nand also the origin of fu

In [12]:
response = chatbot.interact_with_llm("Do u know who am i?")
print("Chat Bot")
print(response[0])
print("RAG")
print(response[1])

Debug: Sending query to chat engine
Debug: Received response from chat engine
Chat Bot
As an AI, I don't have memory of past conversations or personal information about you.  If you'd like to tell me who you are, I'd be happy to get to know you! 😊 

RAG
['I am the thunderbolt among weap-\nons, and I am Cupid for procreation. (10.27 -28) I am the water god \nand the manes. I am the controller of death. I am death among the \nhealers, lion among the beasts, and the king of birds among birds. \n(10.29-30) I am the wind among the purifiers and Lord R ama \namong the warriors. I am the crocodile among the fishes and the \nholy Gang a river among the rivers. (10.31)  \nI am the beginning, the middle, and the end of all creation, O \nArjuna. Among knowledge I am  knowledge of the supreme Self. I \nam logic of the logician. (10.32) I am the letter ‘A’ among the al-\nphabets. I am the dual compound among the compound words. I \nam the endless time. I am the sustainer of all, and have faces on \

### **Flow Overview**

1. **User Query**: User submits a query.
2. **Query Embedding**: The query is converted into an embedding using **GeminiEmbedding**.
3. **Vector Search**: The query embedding is compared to the stored document embeddings in **ChromaVectorStore**.
4. **Top-k Retrieval**: The most similar document chunks are retrieved.
5. **Response Generation**: The retrieved chunks are passed to a generative model, which synthesizes a response.
6. **Answer Delivery**: The system delivers the generated answer to the user.
