## End-to-end implementation of agent-based RAG

## Let's install the libraries needed for our project

In [1]:
!pip install langchain langchain_community # popular framework for generative AI
%pip install --upgrade --quiet huggingface_hub
!pip install faiss-cpu # vector store
!pip install langchain_huggingface
!pip install chromadb # vector store
!pip install langchain_core
!pip install wikipedia # backs the Wikipedia tool the agent will use to search for answers

Collecting langchain
  Downloading langchain-0.3.14-py3-none-any.whl.metadata (7.1 kB)
Collecting langchain_community
  Downloading langchain_community-0.3.14-py3-none-any.whl.metadata (2.9 kB)
Collecting langchain-core<0.4.0,>=0.3.29 (from langchain)
  Downloading langchain_core-0.3.29-py3-none-any.whl.metadata (6.3 kB)
Collecting langchain-text-splitters<0.4.0,>=0.3.3 (from langchain)
  Downloading langchain_text_splitters-0.3.5-py3-none-any.whl.metadata (2.3 kB)
Collecting langsmith<0.3,>=0.1.17 (from langchain)
  Downloading langsmith-0.2.10-py3-none-any.whl.metadata (14 kB)
Collecting httpx-sse<0.5.0,>=0.4.0 (from langchain_community)
  Downloading httpx_sse-0.4.0-py3-none-any.whl.metadata (9.0 kB)
Collecting pydantic-settings<3.0.0,>=2.4.0 (from langchain_community)
  Downloading pydantic_settings-2.7.1-py3-none-any.whl.metadata (3.5 kB)
Collecting jsonpatch<2.0,>=1.33 (from langchain-core<0.4.0,>=0.3.29->langchain)
  Downloading jsonpatch-1.33-py2.py3-none-any.whl.metadata (3.0 

In [2]:
from kaggle_secrets import UserSecretsClient
user_secrets = UserSecretsClient()
hf_tkn = user_secrets.get_secret("HUGGINGFACEHUB_API_TOKEN") # for accessing my secret HuggingFace token

from langchain_community.document_loaders import PyPDFLoader # loads our document of interest
from langchain.llms import HuggingFaceHub # legacy Hugging Face LLM wrapper (not used below)
from langchain_huggingface import HuggingFaceEndpoint # for accessing models hosted on Hugging Face
from langchain_huggingface import HuggingFaceEmbeddings # embeds the document chunks for the vector store
from langchain_huggingface import ChatHuggingFace # chat model wrapper
from langchain.prompts import ChatPromptTemplate
from langchain.text_splitter import CharacterTextSplitter, RecursiveCharacterTextSplitter # for splitting the documents into chunks
from langchain_community.vectorstores import FAISS, Chroma # for storing the embeddings of our chunks
from langchain_core.output_parsers import StrOutputParser # parses the output of an LLM into a plain string
from langchain_community.tools import WikipediaQueryRun # Wikipedia search tool
from langchain_community.utilities import WikipediaAPIWrapper # wrapper around the Wikipedia API used by the tool
from langchain.tools.retriever import create_retriever_tool # wraps a retriever as a tool
from langchain.agents import AgentExecutor, create_react_agent # for building the reasoning and acting (ReAct) agent

In [3]:
# load the document
loader = PyPDFLoader("/kaggle/input/machinelearning/machine_learning_with_python_tutorial.pdf")
docs = loader.load()

# split the document into overlapping chunks
documents = RecursiveCharacterTextSplitter(
    chunk_size=1000, chunk_overlap=200
).split_documents(docs)

# embed the chunks and index them in FAISS
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-mpnet-base-v2")
vector = FAISS.from_documents(documents, embedding=embeddings)
retriever = vector.as_retriever() # returns the chunks most similar to the user's query
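To get a feel for what `chunk_size=1000` and `chunk_overlap=200` mean, here is a minimal sketch of fixed-size chunking with overlap. It is only a simplified illustration: `RecursiveCharacterTextSplitter` splits on separators such as paragraphs and lines before merging pieces, so its real chunk boundaries will differ. The function name `sliding_window_chunks` is hypothetical and not part of LangChain.

```python
def sliding_window_chunks(text: str, chunk_size: int = 1000, chunk_overlap: int = 200) -> list[str]:
    """Cut text into fixed-size windows; neighbours share chunk_overlap characters."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window moves each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


# toy input: 2500 characters of repeating digits
sample = "".join(str(i % 10) for i in range(2500))
chunks = sliding_window_chunks(sample)
print([len(c) for c in chunks])
print(chunks[0][-200:] == chunks[1][:200])  # consecutive chunks share 200 characters
```

The overlap means a sentence cut at a chunk boundary still appears whole in one of the two neighbouring chunks, which is the usual reason for choosing a non-zero `chunk_overlap`.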


## Let's create our chat model, loading it from the Hugging Face API

In [4]:
llm = HuggingFaceEndpoint(
    repo_id="HuggingFaceH4/zephyr-7b-beta",
    task="text-generation",
    max_new_tokens=512,
    do_sample=False,
    repetition_penalty=1.03,
    huggingfacehub_api_token=hf_tkn
)

chat_model = ChatHuggingFace(llm=llm)


## Tool creation
*WikipediaAPIWrapper - Wrapper around WikipediaAPI.*
*WikipediaQueryRun implements the standard Runnable Interface.*

In [5]:
wrap = WikipediaAPIWrapper(top_k_results = 1)
wikipedia = WikipediaQueryRun(api_wrapper=wrap)

### The retriever tool below wraps the retriever so that the agent can call it as a tool

In [6]:
retriever_tool = create_retriever_tool(retriever,"doc_search","search for contents in a document")
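To make the idea of "a retriever as a tool" concrete, here is a toy sketch with no LangChain dependency. A tool is essentially a name, a description the model reads, and a function from a query string to a text observation. Everything here (`ToyTool`, `toy_retriever`, `make_retriever_tool`, the three-sentence corpus) is hypothetical, and joining the retrieved chunks with a blank line is our assumption about how the real tool formats its result.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ToyTool:
    name: str                   # what the agent writes after "Action:"
    description: str            # shown to the model so it knows when to use the tool
    func: Callable[[str], str]  # maps the "Action Input" to an observation


def toy_retriever(query: str) -> list[str]:
    """Stand-in for a vector-store retriever: rank chunks by shared words."""
    corpus = [
        "Bagging combines models trained on bootstrap samples; random forests are an example.",
        "Boosting trains models sequentially so each corrects its predecessor.",
        "Voting combines predictions from several different models.",
    ]
    words = set(query.lower().split())
    ranked = sorted(corpus, key=lambda chunk: -len(words & set(chunk.lower().split())))
    return ranked[:2]  # keep the two best matches


def make_retriever_tool(retrieve: Callable[[str], list[str]], name: str, description: str) -> ToyTool:
    """Wrap a retriever so its results come back as one block of text."""
    def run(query: str) -> str:
        return "\n\n".join(retrieve(query))
    return ToyTool(name, description, run)


doc_search = make_retriever_tool(toy_retriever, "doc_search", "search for contents in a document")
observation = doc_search.func("bagging random forests")
print(observation)
```

The description string matters more than it looks: it is the only hint the model gets about when `doc_search` is the right tool, so a more specific description than "search for contents in a document" may help the agent choose well.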

### Our full tool list 👇 combines the Wikipedia tool and the retriever tool

In [7]:
tools = [wikipedia,retriever_tool]

## Now let's create the prompt template for our RAG system

In [18]:
from langchain.prompts import PromptTemplate
prompt = PromptTemplate(
    input_variables=["input", "agent_scratchpad", "tools", "tool_names"],
    template="""
You are an intelligent assistant that can use tools to answer questions accurately. You must strictly adhere to the following format for your response:

Question: The question you must answer
Thought: Think step by step about what to do
Action: The action to take, should be one of [{tool_names}]
Action Input: The input to the action
Observation: The result of the action
Thought: I now know the final answer
Final Answer: The final answer to the original input question

IMPORTANT: If you cannot answer, explicitly state so, or take an action using the available tools.

The tools you can use are:
{tools}

Now, begin!

Question: {input}
{agent_scratchpad}
"""
)
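The template has four placeholders, but we only ever pass `input` ourselves. As far as we understand, `create_react_agent` fills `{tools}` and `{tool_names}` from the tool list, and `{agent_scratchpad}` is rebuilt from the previous steps on every turn. The sketch below imitates that with plain `str.format` on a shortened template; the exact rendering format and the Wikipedia description are assumptions for illustration, not LangChain's actual output.

```python
# shortened version of the template above, with the same four placeholders
template = """Action: The action to take, should be one of [{tool_names}]

The tools you can use are:
{tools}

Question: {input}
{agent_scratchpad}"""

# (name, description) pairs; the wikipedia description is a made-up placeholder
tools_info = [
    ("wikipedia", "look up a topic on Wikipedia"),
    ("doc_search", "search for contents in a document"),
]

# assumed rendering: one "name - description" line per tool
rendered_tools = "\n".join(f"{name} - {desc}" for name, desc in tools_info)
tool_names = ", ".join(name for name, _ in tools_info)

filled = template.format(
    tools=rendered_tools,
    tool_names=tool_names,
    input="what are the most popular ensemble learning methods?",
    agent_scratchpad="",  # empty on the first step; grows as the agent acts
)
print(filled)
```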


### Creating the Reasoning and Acting (ReAct) agent
The agent thinks step by step: at each step it picks one of the available tools (the retriever tool or the Wikipedia tool), reads the observation that comes back as a clue, and repeats until it can finally give an answer using the clues it has collected. The model decides the order of the tool calls; it is not fixed in advance.

In [19]:
agent = create_react_agent(llm, tools, prompt)

In [20]:
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True,handle_parsing_errors=True) # implements the standard runnable interface
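The loop that the executor runs can be sketched in a few lines of plain Python. This is a simplified stand-in, not LangChain's implementation: `parse_step`, `run_react`, the scripted `fake_llm`, and the toy tool are all hypothetical names. One difference worth noting is that this sketch raises on unparseable output, whereas with `handle_parsing_errors=True` the real executor reports the problem back to the model and lets it try again.

```python
import re


def parse_step(text: str):
    """Return ('final', answer) or ('action', tool_name, tool_input)."""
    if "Final Answer:" in text:
        return ("final", text.split("Final Answer:", 1)[1].strip())
    match = re.search(r"Action\s*:\s*(.*?)\s*\n\s*Action Input\s*:\s*(.*)", text, re.DOTALL)
    if match is None:
        raise ValueError(f"Could not parse LLM output: {text!r}")
    return ("action", match.group(1).strip(), match.group(2).strip().strip('"'))


def run_react(question, llm, tools, max_steps=5):
    """Ask the model, run the tool it names, feed the observation back, repeat."""
    scratchpad = ""
    for _ in range(max_steps):
        output = llm(question, scratchpad)
        step = parse_step(output)
        if step[0] == "final":
            return step[1]
        _, tool_name, tool_input = step
        observation = tools[tool_name](tool_input)
        # the scratchpad carries every earlier step into the next prompt
        scratchpad += f"{output}\nObservation: {observation}\nThought: "
    return "Agent stopped: step limit reached"


# scripted replies standing in for a real model
scripted = iter([
    'Thought: I should look this up.\nAction: doc_search\nAction Input: "ensemble methods"',
    "Thought: I now know the final answer\nFinal Answer: Bagging, boosting, and voting.",
])


def fake_llm(question, scratchpad):
    return next(scripted)


toy_tools = {"doc_search": lambda query: f"(toy result for {query})"}
answer = run_react("what are the most popular ensemble learning methods?", fake_llm, toy_tools)
print(answer)
```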

## Let's try a sample user input

In [21]:
agent_executor.invoke({"input": "what are the most popular ensemble learning methods?"})



> Entering new AgentExecutor chain...

Thought: In order to answer this question, we need to understand what ensemble learning is and which methods are most commonly used in practice.

Action: wikipedia
Action Input: "ensemble learning"

Observation: Page: Ensemble learning
Summary: In statistics and machine learning, ensemble methods use multiple learning algorithms to obtain better predictive performance than could be obtained from any of the constituent learning algorithms alone.
Unlike a statistical ensemble in statistical mechanics, which is usually infinite, a machine learning ensemble consists of only a concrete finite set of alternative models, but typically allows for much more flexible structure to exist among those alternatives.

Thought: Based on the information provided by the Wikipedia page, we can conclude that ensemble learning involves using multiple learning algorithms to improve predictive performance. We now need to determine which methods are most popular.

Action: doc_search
Action Input: "most popular ensemble learning methods"

Observation: learners.
Ensemble Learning Methods
The f ollowing are the most popular ensemble learning methods i.e. the methods for
combining the predictions from different models:
Bagging
The term bagging is also known as bootstrap aggregation. In bagging methods, ensemble
model tries to improve prediction accuracy and d ecrease model variance by combining
predictions of individual models trained over randomly generated training samples. The
final prediction of ensemble model will be given by calculating the average of all predictions
from the individual estimators. One of the best examples of bagging methods are random
forests.
Boost

Based on the information provided in the text material, we can conclude that bagging, boosting, and voting are the most popular ensemble learning methods. AdaBoost and Gradient Boosting are two common boosting ensemble algorithms.

Final Answer: Bagging, boosting, and voting are the most popular ensemble learning methods. AdaBoost and Gradient Boosting are two common boosting ensemble algorithms.</s>

> Finished chain.


{'input': 'what are the most popular ensemble learning methods?',
 'output': 'Bagging, boosting, and voting are the most popular ensemble learning methods. AdaBoost and Gradient Boosting are two common boosting ensemble algorithms.</s>'}
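The answer above ends with `</s>`, which looks like the model's end-of-sequence marker leaking into the text. A small post-processing helper can remove it before the answer is shown to a user. This is just one possible cleanup, and `strip_eos` is a hypothetical helper, not something LangChain provides.

```python
def strip_eos(text: str, eos_token: str = "</s>") -> str:
    """Remove a trailing end-of-sequence marker and surrounding whitespace."""
    text = text.strip()
    if text.endswith(eos_token):
        text = text[: -len(eos_token)].rstrip()
    return text


# same shape as the dictionary returned by agent_executor.invoke
result = {
    "input": "what are the most popular ensemble learning methods?",
    "output": "Bagging, boosting, and voting are the most popular ensemble learning methods.</s>",
}
cleaned = strip_eos(result["output"])
print(cleaned)
```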