# Problem statement

Need to create a chatbot using GenAI.

Steps of the assignment:
1. Create a RAG-based chatbot using any open-source model (Llama 2, Llama 3, Mistral, etc.).
2. Query across multiple documents.
3. Improve the performance of retrieval (control what goes to the LLM).
4. Use the documents that are shared with you in this email.

Note: A single query should draw on multiple sources.
For example, if there are 5 documents, the query can be ‘who prepared all the documents’.

# Approach to the problem

Retrieval-Augmented Generation (RAG) is a popular technique for chatbots: it fetches the information relevant to the user's query and passes it to the LLM. The retrieved information is often valid, but not always. This is a drawback of RAG systems, and the limitations are more pronounced when retrieving or querying across multiple documents.

This notebook includes a simple RAG system and an advanced (agentic) RAG system, which I used to overcome the issues I faced while creating a RAG chatbot over multiple documents. There are other ways to improve the retrieval of relevant chunks, but here I tried only Agentic RAG.

### Approach 1 - Simple RAG system

A simple RAG system of the kind that works for a single document, applied here to multiple documents to see how it behaves and what improvements it needs.

### Approach 2 - Agentic RAG system

This approach addresses the problem of retrieving the relevant information from the database for a user query, so that users can get information from all documents in a single query. For example, if there are 5 documents and the query is ‘who prepared all the documents’, it provides the names of the authors who prepared each document.
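
The idea can be illustrated without any LLM: give every document its own query function (a "tool") and send the question to each one, so that every document contributes an answer. The sketch below is only a toy. The in-memory documents, the names `make_doc_tool` and `spec_a`/`spec_b`, and the naive keyword match are all hypothetical stand-ins for real query engines, and in the actual agentic setup the LLM decides which tools to call.

```python
from typing import Callable, Dict


def make_doc_tool(text: str) -> Callable[[str], str]:
    """Return a toy 'tool' that answers from one document only."""

    def tool(question: str) -> str:
        # Naive stand-in for a per-document query engine:
        # return the first line that shares a word with the question.
        question_words = {w.lower().strip("?") for w in question.split()}
        for line in text.splitlines():
            line_words = {w.lower().strip(":") for w in line.split()}
            if question_words & line_words:
                return line
        return "not found"

    return tool


# Hypothetical documents, one tool per document.
docs = {
    "spec_a": "Title: Spec A\nPrepared by: Author A",
    "spec_b": "Title: Spec B\nPrepared by: Author B",
}
tools: Dict[str, Callable[[str], str]] = {
    name: make_doc_tool(text) for name, text in docs.items()
}

# Fan the same question out to every document's tool.
question = "Who prepared this document?"
answers = {name: tool(question) for name, tool in tools.items()}
print(answers)
```

Because each tool only ever sees its own document, no document can be crowded out of the answer by chunks from another one.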


## Approach 1 - Simple RAG system
Tech stack used - LangChain, FAISS (vector store), open-source LLMs (Mistral, Llama 2, Llama 3.1)

In [1]:
# Install requirements
# !pip install sentence-transformers
!pip install langchain
!pip install langchain_community
!pip install pypdf
!pip install langchain_ollama
!pip install -qU langchain-openai
# !pip install langchain-huggingface
!pip install faiss-cpu



In [None]:
# !pip install langchain-community==0.2.4 langchain==0.2.3 faiss-cpu==1.8.0 transformers==4.41.2 sentence-transformers==3.0.1

In [2]:
import os
from langchain_community.llms import Ollama
from langchain_community.document_loaders import PyPDFLoader
from langchain_community.vectorstores import FAISS
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.chains import RetrievalQA
from langchain_openai import ChatOpenAI
from langchain_openai import OpenAIEmbeddings
from langchain_ollama import OllamaEmbeddings

#### Link to install ollama and download the model locally
https://ollama.com/

In [3]:
!ollama list

NAME                   	ID          	SIZE  	MODIFIED     
mistral:latest         	f974a74358d6	4.1 GB	7 hours ago 	
nomic-embed-text:latest	0a109f422b47	274 MB	8 hours ago 	
llama2:latest          	78e26419b446	3.8 GB	19 hours ago	
llama3.1:8b            	62757c860e01	4.7 GB	23 hours ago	


#### If you are using OpenAI, provide your OpenAI API key here

In [38]:
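# Prefer the environment variable; the fallback string is only a placeholder
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY", "openai_api_key_here")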

In [56]:
# Load the Llama 3.1 LLM through Ollama
llm = Ollama(
    model="llama3.1:8b",
    temperature=0
)

# OpenAI LLM (alternative to the Ollama model above)
# llm = ChatOpenAI(
#     model="gpt-4o-mini",
#     temperature=0,
#     max_tokens=None,
#     timeout=None,
#     max_retries=2,
#     openai_api_key = OPENAI_API_KEY )

### We can use any PDF loader, or an unstructured document loader, for the PDFs and other documents

In [3]:

# pdf_path = "cubet_pdfs"
# def load_documents():
#     document_loader = PyPDFDirectoryLoader(pdf_path)
#     return document_loader.load()
# documents = load_documents()


In [4]:
# len(documents)

In [5]:
# documents[:3]

In [57]:
# Use PyPDFLoader as the document loader and RecursiveCharacterTextSplitter to split the documents into chunks of the chunk_size set below.
def load_and_process_pdfs(pdf_folder_path):
    documents = []
    for file in os.listdir(pdf_folder_path):
        if file.endswith('.pdf'):
            pdf_path = os.path.join(pdf_folder_path, file)
            loader = PyPDFLoader(pdf_path)
            documents.extend(loader.load())
    text_splitter = RecursiveCharacterTextSplitter(chunk_size=1024, chunk_overlap=200)
    splits = text_splitter.split_documents(documents)
    return splits
pdf_path = "cubet_pdfs"
splitted_docs = load_and_process_pdfs(pdf_path)
splitted_docs[:3]

[Document(metadata={'source': 'cubet_pdfs/SRS_ Tekdoc.pdf', 'page': 0}, page_content='Software\nRequirement\nSpeciﬁcation\nFor\nTekdoc\nSoftware\nVersion:\n1.0 \nAnnexure\nA\nPrepared\nby:\nKrishnapriya\nCubet\nTechno\nLabs\nPVT\nLTD\n03-June-2020'),
 Document(metadata={'source': 'cubet_pdfs/SRS_ Tekdoc.pdf', 'page': 1}, page_content='Table\nof\nContents\n1.\nIntroduction\n1.1\nPurpose\n1.2\nScope\n1.3\nIntended\nAudience\nand\nReading\nSuggestions\n1.4\nTerms\nand\nAbbreviations\n1.5\nBeneﬁciaries\n2.\nOverAll\nDescription\n2.1\nProduct\nPerspective\n2.2\nUser\nRoles\n2.3\nDataFlow\nDiagram\n2.4\nUseCase\nDiagram\n2.5\nUserFlow\nof\nMobile\nApp\n2.6\nUser\nStories\n2.7\nFlowchart\n3.\nSystem\nRequirements\n3.1\nFunctional\nRequirements\n3.2\nUX/UI\nBreakdown\n3.3\nNon\nFunctional\nRequirements\n4.\nUpdated\nfeatures\nof\nTekdoc-Phase2\n5.\nAssumptions'),
 Document(metadata={'source': 'cubet_pdfs/SRS_ Tekdoc.pdf', 'page': 2}, page_content='Revision\nHistory\nV er sionAuthorDa t eCommen
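
To make the roles of `chunk_size` and `chunk_overlap` concrete, here is a minimal sketch of fixed-window splitting with overlap. It is a simplification: `RecursiveCharacterTextSplitter` also tries to cut at separators such as paragraph and line breaks, which this sketch does not attempt, and the function name `split_with_overlap` is hypothetical.

```python
from typing import List


def split_with_overlap(text: str, chunk_size: int = 1024, chunk_overlap: int = 200) -> List[str]:
    """Cut text into windows of chunk_size characters.

    Consecutive windows share chunk_overlap characters, so a sentence that
    straddles a boundary still appears whole in at least one chunk
    (as long as it is shorter than the overlap).
    """
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
chunks = split_with_overlap(sample, chunk_size=10, chunk_overlap=4)
print(chunks)  # ['abcdefghij', 'ghijklmnop', 'mnopqrstuv', 'stuvwxyz']
```

A larger overlap reduces the chance of cutting a fact in half but increases the number of chunks to embed and store.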

In [59]:

# To use OpenAI embeddings, uncomment the second embeddings line and comment out the first.
embeddings = OllamaEmbeddings(model="nomic-embed-text:latest")
# embeddings = OpenAIEmbeddings(model="text-embedding-ada-002", api_key=OPENAI_API_KEY)

In [61]:
# Example: embed a single query string
embeddings.embed_query("Hi, demo for cubet")

[-0.00096889003,
 0.022380479,
 -0.15100619,
 -0.01760835,
 0.038781606,
 0.022319948,
 0.010763753,
 -0.021621116,
 -0.06810876,
 -0.0106204515,
 -0.022100618,
 0.08592628,
 0.068449125,
 0.046536922,
 -0.05160825,
 -0.0066468976,
 -0.0002869543,
 -0.048650928,
 0.02700468,
 0.102889605,
 0.052824352,
 -0.061270118,
 -0.049036395,
 -0.057935316,
 0.08107898,
 0.03811832,
 0.04749662,
 0.057555817,
 -0.021150468,
 0.035806693,
 0.08614739,
 -0.07682749,
 -0.0066961786,
 -0.00030020074,
 -0.0076031927,
 -0.01549189,
 0.054195885,
 -0.004978864,
 -0.038154505,
 -0.025034217,
 0.0070408992,
 -0.009442246,
 -0.019315613,
 -0.042764932,
 0.009057923,
 0.03627235,
 0.06544575,
 0.04575138,
 -0.04231415,
 0.0051195077,
 0.018508025,
 -0.009357308,
 0.041531675,
 -0.034146283,
 0.010708779,
 -0.03030304,
 -0.01206765,
 -0.003058584,
 -0.018136319,
 -0.04229225,
 0.024378259,
 0.057572804,
 -0.019227948,
 0.06876179,
 0.06895448,
 -0.0090079345,
 0.023904905,
 0.030311266,
 -0.00094508805,
 -0.
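
A vector store ranks chunks by how close their embedding is to the query embedding. Cosine similarity is one common closeness measure; which measure a given store actually uses depends on its configuration. The sketch below uses made-up 3-dimensional vectors and hypothetical chunk names purely for illustration.

```python
import math
from typing import Dict, List


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy vectors; real embeddings have hundreds of dimensions.
query_vec = [1.0, 0.0, 1.0]
chunk_vecs: Dict[str, List[float]] = {
    "chunk_a": [1.0, 0.0, 1.0],  # same direction as the query
    "chunk_b": [0.0, 1.0, 0.0],  # orthogonal to the query
    "chunk_c": [1.0, 1.0, 0.0],  # partially aligned
}

# Rank chunks from most to least similar to the query.
ranked = sorted(
    chunk_vecs,
    key=lambda name: cosine_similarity(query_vec, chunk_vecs[name]),
    reverse=True,
)
print(ranked)  # ['chunk_a', 'chunk_c', 'chunk_b']
```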

In [62]:
knowledge_base = FAISS.from_documents(splitted_docs, embeddings)


KeyboardInterrupt



In [44]:
# retrieval QA chain
qa_chain = RetrievalQA.from_chain_type(
    llm,
    retriever=knowledge_base.as_retriever())

In [53]:
question = "Who all prepared the documents?"
retrieved_documents = knowledge_base.similarity_search(question,k=5)
retrieved_documents

[Document(metadata={'source': 'cubet_pdfs/SRS_ MyLeaseAudit.pdf', 'page': 32}, page_content='T h e\nd e l e t e\nb u t t o n\np r e s e n t\ni n\nt h e\nL o c a t i o n\nl i s t i n g\np a g e\ne n a b l e s\nu s e r s\nt o\nd e l e t e\nl o c a t i o n\na n d\ni t s\na s s o c i a t e d\nA u d i t s ,\nL e a s e s ,\nS t a t e m e n t s .\nF.\nASSIGN AUDITORS\n33'),
 Document(metadata={'source': 'cubet_pdfs/OPD Booster-SRS V3.0.pdf', 'page': 88}, page_content='The\nfrontdesk\nuser\non\nclicking\nthe\nDocuments\ntab\nwill\nland\nat\nthe\nabove\npage.\nIt\nwill\nlist\nall\nthe\ndocuments\npertaining\nto\nthe\npatient.\nThe\ndocuments\nare\ncategorized\ninto\nDischarge\nSummary,\nBioChemical\nReport\nand\nGeneral.\nBiochemical\nreports\nwill\nhave\nthe\nreports\nfrom\nthe\nlaboratory\nuploaded\nby\nthe\ndoctors/patients.\nOPD\nBooster\n|\nConfidential\nCopyright\n©\n2024,\nCubet.\nAll\nrights\nreserved.\nPage\n89\nof\n133'),
 Document(metadata={'source': 'cubet_pdfs/SRS_ MyLeaseAudit.pdf

#### As we can see above, the retrieved documents themselves are not relevant to the given query
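
One quick way to check coverage is to count which source files the retrieved chunks come from. This is a minimal sketch, assuming each retrieved chunk carries a `source` entry in its metadata as in the output above. The plain dictionaries, the file names, and the helper names are hypothetical stand-ins for the LangChain `Document` objects.

```python
from collections import Counter
from typing import Dict, List


def count_chunks_per_source(retrieved: List[Dict]) -> Counter:
    """Count how many retrieved chunks came from each source file."""
    return Counter(doc["metadata"]["source"] for doc in retrieved)


def missing_sources(retrieved: List[Dict], all_sources: List[str]) -> List[str]:
    """List the source files that contributed no chunk at all."""
    covered = {doc["metadata"]["source"] for doc in retrieved}
    return sorted(set(all_sources) - covered)


# Hypothetical retrieval result: three chunks, but only two distinct files.
all_sources = ["doc_a.pdf", "doc_b.pdf", "doc_c.pdf"]
retrieved = [
    {"metadata": {"source": "doc_a.pdf", "page": 32}},
    {"metadata": {"source": "doc_b.pdf", "page": 88}},
    {"metadata": {"source": "doc_a.pdf", "page": 5}},
]

counts = count_chunks_per_source(retrieved)
missing = missing_sources(retrieved, all_sources)
print(counts)
print(missing)  # ['doc_c.pdf']
```

If a question is about every document and some file is missing from the retrieved set, the LLM cannot answer for that file no matter how good the model is.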

In [54]:
question = "Who all prepared the documents?"
response = qa_chain.invoke({"query": question})
print(response["result"])

I don't know.


### The simple RAG system above does not return information relevant to the query, so we use an Agentic RAG approach to retrieve more relevant documents and pass them to the LLM

## Approach 2 - Agentic RAG
Tech stack used - LlamaIndex, open-source LLMs (Mistral, Llama 3.1 8B, Llama 2)

In [8]:
#Install requirements
!pip install python-dotenv
!pip install llama_index
!pip install llama-index-llms-ollama
!pip install llama-index-embeddings-ollama

Collecting python-dotenv
  Using cached python_dotenv-1.0.1-py3-none-any.whl (19 kB)
Installing collected packages: python-dotenv
Successfully installed python-dotenv-1.0.1
Collecting llama_index
  Using cached llama_index-0.10.62-py3-none-any.whl (6.8 kB)
Collecting llama-index-program-openai<0.2.0,>=0.1.3
  Using cached llama_index_program_openai-0.1.7-py3-none-any.whl (5.3 kB)
Collecting llama-index-question-gen-openai<0.2.0,>=0.1.2
  Using cached llama_index_question_gen_openai-0.1.3-py3-none-any.whl (2.9 kB)
Collecting llama-index-core==0.10.62
  Using cached llama_index_core-0.10.62-py3-none-any.whl (15.5 MB)
Collecting llama-index-indices-managed-llama-cloud>=0.2.0
  Using cached llama_index_indices_managed_llama_cloud-0.2.7-py3-none-any.whl (9.5 kB)
Collecting llama-index-llms-openai<0.2.0,>=0.1.27
  Using cached llama_index_llms_openai-0.1.28-py3-none-any.whl (11 kB)
Collecting llama-index-multi-modal-llms-openai<0.2.0,>=0.1.3
  Using cached llama_index_multi_modal_llms_openai

#### Link to install ollama and download the model locally

https://ollama.com/


In [6]:
# Check the installed Ollama models. Install Ollama from the link given above.
# Once installed, download a model with `ollama run <model name>` (replace <model name> with the model you need).
# Refer to the link above.
!ollama list

NAME                   	ID          	SIZE  	MODIFIED     
mistral:latest         	f974a74358d6	4.1 GB	3 hours ago 	
nomic-embed-text:latest	0a109f422b47	274 MB	4 hours ago 	
llama2:latest          	78e26419b446	3.8 GB	15 hours ago	
llama3.1:8b            	62757c860e01	4.7 GB	18 hours ago	


#### We can use either open-source models or proprietary models such as OpenAI's. If you have an OpenAI API key, provide it below.

In [50]:
OPENAI_API_KEY = os.getenv('OPENAI_API_KEY', "api_key_here")

In [10]:
# from llama_index.core.query_engine.router_query_engine import RouterQueryEngine
# from llama_index.core.selectors import LLMSingleSelector
# from llama_index.core.tools import QueryEngineTool
from llama_index.core import SummaryIndex, VectorStoreIndex
from llama_index.core import Settings
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.llms.ollama import Ollama
from llama_index.embeddings.ollama import OllamaEmbedding
from llama_index.core.node_parser import SentenceSplitter
from llama_index.core import SimpleDirectoryReader
from llama_index.core.tools import QueryEngineTool
from llama_index.core.agent import FunctionCallingAgentWorker
from llama_index.core.agent import AgentRunner
from pathlib import Path
import os
import re
from typing import Tuple

### Let's look at how Agentic RAG works on a single document


In [11]:
# Load a single PDF from the cubet_pdfs directory
documents = SimpleDirectoryReader(input_files=["cubet_pdfs/AnaBot(Omang_plai)_SRS_V1.docx.pdf"]).load_data()
splitter = SentenceSplitter(chunk_size=1024)
nodes = splitter.get_nodes_from_documents(documents)
nodes[0]

TextNode(id_='c06009b0-d110-4552-91e6-64004d55343b', embedding=None, metadata={'page_label': '1', 'file_name': 'AnaBot(Omang_plai)_SRS_V1.docx.pdf', 'file_path': 'cubet_pdfs/AnaBot(Omang_plai)_SRS_V1.docx.pdf', 'file_type': 'application/pdf', 'file_size': 184818, 'creation_date': '2024-08-06', 'last_modified_date': '2024-08-06'}, excluded_embed_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], excluded_llm_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], relationships={<NodeRelationship.SOURCE: '1'>: RelatedNodeInfo(node_id='cd5a56ba-f2d2-4bcc-b99e-ac93bb44968c', node_type=<ObjectType.DOCUMENT: '4'>, metadata={'page_label': '1', 'file_name': 'AnaBot(Omang_plai)_SRS_V1.docx.pdf', 'file_path': 'cubet_pdfs/AnaBot(Omang_plai)_SRS_V1.docx.pdf', 'file_type': 'application/pdf', 'file_size': 184818, 'creation_date': '2024-08-06', 'last_modified_date': '2024-08-06

#### To use models such as "nomic-embed-text:latest" and "llama3.1:8b", install Ollama and then run the models listed at the link below from the terminal/CLI



https://ollama.com/library

Note: You can also use Hugging Face embeddings.

In [30]:
Settings.llm = Ollama(model="llama3.1:8b",request_timeout=120.0)
Settings.embed_model = OllamaEmbedding(model_name="nomic-embed-text:latest")
# Settings.llm = OpenAI(model="gpt-4o-mini", api_key=OPENAI_API_KEY)
# Settings.embed_model = OpenAIEmbedding(model="text-embedding-ada-002", api_key=OPENAI_API_KEY)

# summary index
summary_index = SummaryIndex(nodes)
# vector store index
vector_index = VectorStoreIndex(nodes)

# summary query engine
summary_query_engine = summary_index.as_query_engine(
    response_mode="tree_summarize",
    use_async=True,
)

# vector query engine
vector_query_engine = vector_index.as_query_engine()

In [31]:
summary_tool = QueryEngineTool.from_defaults(
    query_engine=summary_query_engine,
    description=(
        "Useful for summarization questions related to the AnaBot(Omang_plai)_SRS_V1 paper."
    ),
)

vector_tool = QueryEngineTool.from_defaults(
    query_engine=vector_query_engine,
    description=(
        "Useful for retrieving specific context from the AnaBot(Omang_plai)_SRS_V1 paper."
    ),
)

In [32]:
llm = Ollama(model="llama3.1:8b",request_timeout=3000.0)
# llm = OpenAI(model="gpt-4o-mini", temperature=0, api_key=OPENAI_API_KEY )

In [33]:
agent_worker = FunctionCallingAgentWorker.from_tools(
    tools=[summary_tool, vector_tool],
    llm=llm,
    verbose=True,
)

agent = AgentRunner(agent_worker)

In [34]:
response = agent.query(
    "who prepared the document"
)
print(str(response))

Added user message to memory: who prepared the document
=== Calling Function ===
Calling function: query_engine_tool with args: {"input": "Who prepared the AnaBot(Omang_plai)_SRS_V1 document?"}
=== Function Output ===
Aswathy A prepared the AnaBot(Omang_plai)_SRS_V1 document.
=== LLM Response ===
The AnaBot(Omang_plai)_SRS_V1 document was prepared by Aswathy A.
The AnaBot(Omang_plai)_SRS_V1 document was prepared by Aswathy A.


## Multiple document query

In [36]:
def sanitize_tool_name(name):
    return re.sub(r'[^a-zA-Z0-9_-]', '_', name)
    
def create_doc_tools(
    document_fp: str,
    doc_name: str,
    verbose: bool = True,
) -> Tuple[QueryEngineTool, QueryEngineTool]:
    documents = SimpleDirectoryReader(input_files=[document_fp]).load_data()
    splitter = SentenceSplitter(chunk_size=1024)
    nodes = splitter.get_nodes_from_documents(documents)

    
    Settings.llm = Ollama(model="llama3.1:8b",request_timeout=120.0)
    Settings.embed_model = OllamaEmbedding(model_name="nomic-embed-text:latest")
    # Settings.llm = OpenAI(model="gpt-4o-mini", api_key=OPENAI_API_KEY)
    # Settings.embed_model = OpenAIEmbedding(model="text-embedding-ada-002", api_key=OPENAI_API_KEY)

    # summary index
    summary_index = SummaryIndex(nodes)
    # vector store index
    vector_index = VectorStoreIndex(nodes)

    # summary query engine
    summary_query_engine = summary_index.as_query_engine(
        response_mode="tree_summarize",
        use_async=True,
    )

    # vector query engine
    vector_query_engine = vector_index.as_query_engine()

    summary_tool = QueryEngineTool.from_defaults(
        name=sanitize_tool_name(f"{doc_name}_summary_query_engine_tool"),
        query_engine=summary_query_engine,
        description=(
            f"Useful for summarization questions related to the {doc_name}."
        ),
    )

    vector_tool = QueryEngineTool.from_defaults(
        name=sanitize_tool_name(f"{doc_name}_vector_query_engine_tool"),
        query_engine=vector_query_engine,
        description=(
            f"Useful for retrieving specific context from the {doc_name}."
        ),
    )

    return vector_tool, summary_tool
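
The file names in this collection contain spaces, parentheses, and dots, while function-calling APIs commonly restrict tool names (OpenAI's, for instance, accepts only letters, digits, underscores, and dashes). The sketch below repeats `sanitize_tool_name` on the five file names so the resulting tool names can be inspected, and adds a uniqueness check. The check is an addition for illustration, since two different file names could collapse to the same sanitized name.

```python
import re
from pathlib import Path


def sanitize_tool_name(name: str) -> str:
    # Replace every character outside letters, digits, "_" and "-" with "_".
    return re.sub(r"[^a-zA-Z0-9_-]", "_", name)


papers = [
    "./cubet_pdfs/SRS_ Tekdoc.pdf",
    "./cubet_pdfs/AnaBot(Omang_plai)_SRS_V1.docx.pdf",
    "./cubet_pdfs/OPD Booster-SRS V3.0.pdf",
    "./cubet_pdfs/SRS_ MyLeaseAudit.pdf",
    "./cubet_pdfs/Bloom app_v1.1_SRS.docx.pdf",
]

# Path(...).stem drops only the final ".pdf", so ".docx" stays in the name.
tool_names = [
    sanitize_tool_name(f"{Path(paper).stem}_summary_query_engine_tool")
    for paper in papers
]
for tool_name in tool_names:
    print(tool_name)

# Sanitizing can map different file names to the same tool name; fail early if so.
if len(set(tool_names)) != len(tool_names):
    raise ValueError("Two documents produced the same tool name")
```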

In [37]:
directory = "cubet_pdfs"
papers = [f"./{directory}/{filename}" for filename in os.listdir(directory) if filename.endswith('.pdf')]
papers

['./cubet_pdfs/SRS_ Tekdoc.pdf',
 './cubet_pdfs/AnaBot(Omang_plai)_SRS_V1.docx.pdf',
 './cubet_pdfs/OPD Booster-SRS V3.0.pdf',
 './cubet_pdfs/SRS_ MyLeaseAudit.pdf',
 './cubet_pdfs/Bloom app_v1.1_SRS.docx.pdf']

In [41]:
paper_to_tools_dict = {}
for paper in papers:
    print(f"Creating {paper} tool")
    path = Path(paper)
    vector_tool, summary_tool = create_doc_tools(doc_name=path.stem, document_fp=path)
    paper_to_tools_dict[path.stem] = [vector_tool, summary_tool]

Creating ./cubet_pdfs/SRS_ Tekdoc.pdf tool
Creating ./cubet_pdfs/AnaBot(Omang_plai)_SRS_V1.docx.pdf tool
Creating ./cubet_pdfs/OPD Booster-SRS V3.0.pdf tool
Creating ./cubet_pdfs/SRS_ MyLeaseAudit.pdf tool
Creating ./cubet_pdfs/Bloom app_v1.1_SRS.docx.pdf tool


In [42]:
paper_to_tools_dict

{'SRS_ Tekdoc': [<llama_index.core.tools.query_engine.QueryEngineTool at 0x7f8d00e0e5e0>,
  <llama_index.core.tools.query_engine.QueryEngineTool at 0x7f8cc382efa0>],
 'AnaBot(Omang_plai)_SRS_V1.docx': [<llama_index.core.tools.query_engine.QueryEngineTool at 0x7f8d00e99d90>,
  <llama_index.core.tools.query_engine.QueryEngineTool at 0x7f8d00e99e20>],
 'OPD Booster-SRS V3.0': [<llama_index.core.tools.query_engine.QueryEngineTool at 0x7f8d00bea370>,
  <llama_index.core.tools.query_engine.QueryEngineTool at 0x7f8d00bea160>],
 'SRS_ MyLeaseAudit': [<llama_index.core.tools.query_engine.QueryEngineTool at 0x7f8d00e09d60>,
  <llama_index.core.tools.query_engine.QueryEngineTool at 0x7f8d00e09af0>],
 'Bloom app_v1.1_SRS.docx': [<llama_index.core.tools.query_engine.QueryEngineTool at 0x7f8d00bd7c40>,
  <llama_index.core.tools.query_engine.QueryEngineTool at 0x7f8d00bd7640>]}

In [43]:
initial_tools = [t for paper in papers for t in paper_to_tools_dict[Path(paper).stem]]
print(str(initial_tools))

[<llama_index.core.tools.query_engine.QueryEngineTool object at 0x7f8d00e0e5e0>, <llama_index.core.tools.query_engine.QueryEngineTool object at 0x7f8cc382efa0>, <llama_index.core.tools.query_engine.QueryEngineTool object at 0x7f8d00e99d90>, <llama_index.core.tools.query_engine.QueryEngineTool object at 0x7f8d00e99e20>, <llama_index.core.tools.query_engine.QueryEngineTool object at 0x7f8d00bea370>, <llama_index.core.tools.query_engine.QueryEngineTool object at 0x7f8d00bea160>, <llama_index.core.tools.query_engine.QueryEngineTool object at 0x7f8d00e09d60>, <llama_index.core.tools.query_engine.QueryEngineTool object at 0x7f8d00e09af0>, <llama_index.core.tools.query_engine.QueryEngineTool object at 0x7f8d00bd7c40>, <llama_index.core.tools.query_engine.QueryEngineTool object at 0x7f8d00bd7640>]


In [44]:
len(initial_tools)

10

In [45]:
# from llama_index.llms.openai import OpenAI
from llama_index.llms.ollama import Ollama
llm = Ollama(model="llama3.1:8b",request_timeout=3000.0)
# llm = OpenAI(model="gpt-4o-mini", temperature=0, api_key=OPENAI_API_KEY )


In [47]:
agent_worker = FunctionCallingAgentWorker.from_tools(
    initial_tools,
    llm=llm,
    verbose=True
)

agent = AgentRunner(agent_worker)
response = agent.query(
    "Who prepared these documents"
    
)

print(str(response))

Added user message to memory: Who prepared these documents
=== Calling Function ===
Calling function: SRS__Tekdoc_summary_query_engine_tool with args: {"input": "Who prepared this document?"}
=== Function Output ===
The document was prepared by Krishnapriya from Cubet Techno Labs PVT LTD.
=== Calling Function ===
Calling function: AnaBot_Omang_plai__SRS_V1_docx_summary_query_engine_tool with args: {"input": "Who prepared this document?"}
=== Function Output ===
The document was prepared by Aswathy A.
=== Calling Function ===
Calling function: OPD_Booster-SRS_V3_0_summary_query_engine_tool with args: {"input": "Who prepared this document?"}
=== Function Output ===
Sophy George prepared this document.
=== Calling Function ===
Calling function: SRS__MyLeaseAudit_summary_query_engine_tool with args: {"input": "Who prepared this document?"}
=== Function Output ===
The document was prepared by Krishnapriya.
=== Calling Function ===
Calling function: Bloom_app_v1_1_SRS_docx_summary_query_engi

In [48]:
def q_a_chatbot():
    print("Welcome to the document Q&A Chatbot!")
    print("Type 'exit' to quit.")
    
    while True:
        user_input = input("You: ")
        if user_input.lower() == 'exit':
            print("Chatbot: Goodbye!")
            break
        
        try:
            # Sending user's question to the agent
            response = agent.query(user_input)
            # Output the response from the agent
            print("Chatbot:", str(response))
        except Exception as e:
            print("Chatbot: Sorry, I couldn't process your request.")
            print(f"Error: {e}")

# Run the chatbot
q_a_chatbot()

Welcome to the Document Query Chatbot!
Type 'exit' to quit.


You:  Provide the names of the documents


Added user message to memory: Provide the names of the documents
=== LLM Response ===
The names of the documents are:

1. SRS_ Tekdoc
2. AnaBot(Omang_plai)_SRS_V1.docx
3. OPD Booster-SRS V3.0
4. SRS_ MyLeaseAudit
5. Bloom app_v1.1_SRS.docx
Chatbot: The names of the documents are:

1. SRS_ Tekdoc
2. AnaBot(Omang_plai)_SRS_V1.docx
3. OPD Booster-SRS V3.0
4. SRS_ MyLeaseAudit
5. Bloom app_v1.1_SRS.docx


KeyboardInterrupt: Interrupted by user

## There are many other approaches to retrieving the documents relevant to a query before passing them to the LLM. I tried this one; we could also use hybrid search, reranking of the retrieved documents, multi-query retrieval, etc.
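
As one illustration of the hybrid search idea, the rankings produced by a keyword search and a vector search can be merged with reciprocal rank fusion (RRF): each chunk scores the sum of 1 / (k + rank) over the rankings it appears in. This was not tried in this notebook. The chunk IDs and both input rankings below are hypothetical, and `k=60` is simply a commonly used default.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Merge several ranked lists of chunk IDs into one list, best first."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            # A chunk near the top of any list gets a larger contribution.
            scores[chunk_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by ID for a deterministic order.
    return sorted(scores, key=lambda chunk_id: (-scores[chunk_id], chunk_id))


# Hypothetical results from two retrievers for the same query.
keyword_ranking = ["chunk_3", "chunk_1", "chunk_7"]
vector_ranking = ["chunk_1", "chunk_5", "chunk_3"]

fused = reciprocal_rank_fusion([keyword_ranking, vector_ranking])
print(fused)  # ['chunk_1', 'chunk_3', 'chunk_5', 'chunk_7']
```

Chunks that both retrievers rank highly rise to the top, which is useful when a keyword match such as "Prepared by" is a stronger signal than embedding similarity.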
