# Research Tool using RAG

We are going to build an advanced RAG pipeline with multiple data sources: arXiv, Wikipedia, and our own custom web source. Along the way we will use agents, tools, toolkits, and an agent executor.

## Tools in LangChain
LangChain tools are interfaces that an agent, chain, or LLM can use to interact with the world.

For this RAG application we will create three tools, using arXiv, Wikipedia, and our own custom web source. Combining these tools, we will create a RAG pipeline and use a conversational retrieval agent to ask questions.
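To make the idea concrete, here is a minimal sketch of what a tool boils down to: a name, a description the model can read, and a callable. `SimpleTool` and `word_count` are hypothetical names used only for illustration; they are not part of LangChain.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class SimpleTool:
    """Hypothetical stand-in for a tool: metadata plus a function to run."""
    name: str
    description: str
    func: Callable[[str], str]

    def run(self, query: str) -> str:
        # Delegate to the wrapped function
        return self.func(query)


def word_count(text: str) -> str:
    # Tools usually return strings so the result can be fed back to the model
    return str(len(text.split()))


counter = SimpleTool(
    name="word_count",
    description="Count the words in a piece of text.",
    func=word_count,
)
print(counter.name, "->", counter.run("retrieval augmented generation"))
```

The name and description matter as much as the function: they are what the model sees when it decides whether a tool is relevant.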

Let's take a look at how to work with tools. To do this, we'll start with a built-in tool.

## Wikipedia

We will create a wrapper on top of Wikipedia. The WikipediaQueryRun tool connects your agents and chains to Wikipedia.

In [2]:
from langchain_community.tools import WikipediaQueryRun
from langchain_community.utilities import WikipediaAPIWrapper

In [3]:
api_wrapper=WikipediaAPIWrapper(top_k_results=1,doc_content_chars_max=200)
wiki=WikipediaQueryRun(api_wrapper=api_wrapper)

In [4]:
wiki.name

'wikipedia'
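The two wrapper arguments control how much text comes back. The sketch below approximates their effect, assuming the wrapper keeps the first `top_k_results` results and cuts the joined text at `doc_content_chars_max` characters. `trim_results` is a hypothetical helper, not the wrapper's actual code.

```python
def trim_results(results: list[str], top_k_results: int, doc_content_chars_max: int) -> str:
    """Hypothetical approximation of how the wrapper limits its output."""
    # Keep only the first k results
    kept = results[:top_k_results]
    # Join them and cut the combined text at the character limit
    return "\n\n".join(kept)[:doc_content_chars_max]


fake_pages = ["a" * 300, "b" * 300]  # made-up page summaries
trimmed = trim_results(fake_pages, top_k_results=1, doc_content_chars_max=200)
print(len(trimmed))
```

Small limits keep the agent's context short, at the cost of giving the model less material to answer from.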

## Arxiv Tool

arXiv is an open-access archive of 2 million scholarly articles in the fields of physics, mathematics, computer science, quantitative biology, quantitative finance, statistics, electrical engineering and systems science, and economics. LangChain provides a built-in tool for querying it.

In [5]:
## Arxiv Tool
!pip install arxiv

from langchain_community.utilities import ArxivAPIWrapper
from langchain_community.tools import ArxivQueryRun

arxiv_wrapper=ArxivAPIWrapper(top_k_results=1, doc_content_chars_max=200)
arxiv=ArxivQueryRun(api_wrapper=arxiv_wrapper)
arxiv.name



'arxiv'

## Custom Web-Based Tool

We will add a web page from the LangSmith documentation to our pipeline. We will use WebBaseLoader to ingest the data from the web, then apply chunking and embeddings to build our app.

In [6]:
from langchain_community.document_loaders import WebBaseLoader
loader=WebBaseLoader("https://docs.smith.langchain.com/")
docs=loader.load()

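### Chunking

We use RecursiveCharacterTextSplitter to split our text into chunks. With chunk_size=1000 and chunk_overlap=200, each chunk holds at most 1,000 characters and consecutive chunks can share up to 200 characters.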


In [7]:
from langchain_text_splitters import RecursiveCharacterTextSplitter
documents=RecursiveCharacterTextSplitter(chunk_size=1000,chunk_overlap=200).split_documents(docs)
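To see what the size and overlap settings mean, here is a simplified sliding-window chunker. It is only a sketch: RecursiveCharacterTextSplitter also tries to break on separators such as paragraphs and sentences, which this hypothetical `chunk_text` function does not attempt.

```python
def chunk_text(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into fixed-size windows that overlap by chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
pieces = chunk_text(sample, chunk_size=10, chunk_overlap=4)
print(pieces)
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, which tends to help retrieval.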

### Embeddings

We will use OllamaEmbeddings to convert the chunks into vector embeddings, then store those vectors in a FAISS vector store.

In [8]:
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.vectorstores import FAISS

vectordb=FAISS.from_documents(documents,OllamaEmbeddings())
retriever=vectordb.as_retriever()
retriever

VectorStoreRetriever(tags=['FAISS', 'OllamaEmbeddings'], vectorstore=<langchain_community.vectorstores.faiss.FAISS object at 0x000001F5EE790640>)
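Conceptually, the retriever embeds the query and returns the chunks whose vectors are closest to it. The toy sketch below shows that idea with word-count vectors and cosine similarity. The `embed`, `cosine`, and `retrieve` functions and the sample chunks are all made up for illustration; real embeddings come from the model, and FAISS uses its own index structures.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    # Toy "embedding": a bag-of-words count vector
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, chunks: list[str], k: int = 1) -> list[str]:
    # Rank every chunk by similarity to the query and keep the top k
    query_vec = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, embed(c)), reverse=True)
    return ranked[:k]


sample_chunks = [
    "LangSmith helps you trace and evaluate LLM applications",
    "FAISS stores vectors for similarity search",
    "Wikipedia is a free online encyclopedia",
]
top = retrieve("how do I trace my LLM application", sample_chunks, k=1)
print(top)
```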

### Using agents

The conversational retrieval agent is optimized for doing retrieval when necessary while also holding a conversation.

To start, we set up the retriever we want to use and turn it into a retriever tool. Next, we use the high-level constructor for this type of agent.

### Retriever Tool

Now we need to create a tool for our retriever. The main things we need to pass in are a name for the retriever as well as a description. These will both be used by the language model, so they should be informative.

In [9]:
from langchain.tools.retriever import create_retriever_tool
retriever_tool=create_retriever_tool(retriever,"langsmith_search",
                      "Search for information about LangSmith. For any questions about LangSmith, you must use this tool!")

In [10]:
retriever_tool.name

'langsmith_search'
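The name and description end up in the tool definition that is sent to the model. As a rough sketch, an OpenAI-style tool schema for this retriever could look like the dictionary below. `to_tool_schema` is a hypothetical helper and the single `query` parameter is an assumption; the exact payload LangChain generates may differ.

```python
import json


def to_tool_schema(name: str, description: str) -> dict:
    """Hypothetical helper: build an OpenAI-style function/tool schema."""
    return {
        "type": "function",
        "function": {
            "name": name,
            "description": description,
            "parameters": {
                "type": "object",
                # Assumption: the tool takes one string argument called "query"
                "properties": {"query": {"type": "string"}},
                "required": ["query"],
            },
        },
    }


schema = to_tool_schema(
    "langsmith_search",
    "Search for information about LangSmith. "
    "For any questions about LangSmith, you must use this tool!",
)
print(json.dumps(schema, indent=2))
```

Because the model only sees this metadata, a vague description makes it less likely that the tool is picked for the right questions.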

## Research Tool

Now we will combine our three tools into a single research tool.

In [11]:
tools = [wiki, arxiv, retriever_tool]

To get started with executing agents, we will set up our LLM and pull a prompt from the LangChain Hub.

In [12]:
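import os
from dotenv import load_dotenv

# Read variables from a local .env file into the process environment
load_dotenv()
# os.environ only accepts strings, so set the key only when it is actually defined
api_key = os.getenv("LANGCHAIN_API_KEY")
if api_key:
    os.environ["LANGCHAIN_API_KEY"] = api_key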


In [18]:
from langchain_community.chat_models import ChatOllama

llm = ChatOllama(model="llama3")

In [19]:
from langchain import hub
# Get the prompt to use - you can modify this!
prompt = hub.pull("hwchase17/openai-functions-agent")
prompt.messages

[SystemMessagePromptTemplate(prompt=PromptTemplate(input_variables=[], template='You are a helpful assistant')),
 MessagesPlaceholder(variable_name='chat_history', optional=True),
 HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['input'], template='{input}')),
 MessagesPlaceholder(variable_name='agent_scratchpad')]

## Agents
The core idea of agents is to use a language model to choose a sequence of actions to take. In chains, a sequence of actions is hardcoded (in code). In agents, a language model is used as a reasoning engine to determine which actions to take and in which order.

For our research tool, a user enters a query and the agent decides which tool to call: Wikipedia, arXiv, or the web source. The tools are not tried in a fixed order; the language model picks one based on each tool's name and description.
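The sketch below mimics that decision loop in plain Python. `fake_llm_choose` is a hypothetical keyword-based stand-in for the language model, and the toy tools return placeholder strings, so this only illustrates that the tool is chosen per query and that the loop stops once the "model" has what it needs.

```python
from typing import Callable, Optional


def fake_llm_choose(query: str, observations: list[str]) -> Optional[str]:
    """Stand-in for the model's decision: return a tool name, or None to stop."""
    if observations:
        return None  # one observation is enough for this toy example
    q = query.lower()
    if "langsmith" in q:
        return "langsmith_search"
    if "paper" in q or "arxiv" in q:
        return "arxiv"
    return "wikipedia"


def run_agent(query: str, tools: dict[str, Callable[[str], str]], max_steps: int = 3) -> dict:
    observations: list[str] = []
    used: list[str] = []
    for _ in range(max_steps):
        choice = fake_llm_choose(query, observations)
        if choice is None:
            break
        observations.append(tools[choice](query))  # run the chosen tool
        used.append(choice)
    return {"input": query, "output": " ".join(observations), "tools_used": used}


toy_tools = {
    "wikipedia": lambda q: f"[wikipedia result for {q!r}]",
    "arxiv": lambda q: f"[arxiv result for {q!r}]",
    "langsmith_search": lambda q: f"[langsmith docs result for {q!r}]",
}
result = run_agent("What's the paper 1605.08386 about?", toy_tools)
print(result["tools_used"])
```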

In [20]:
### Agents
from langchain.agents import create_openai_tools_agent
# create_openai_tools_agent expects a chat model that supports OpenAI-style tool calling
agent=create_openai_tools_agent(llm,tools,prompt)

In [21]:
## Agent Executor
from langchain.agents import AgentExecutor
agent_executor=AgentExecutor(agent=agent,tools=tools,verbose=True)
agent_executor

AgentExecutor(verbose=True, agent=RunnableMultiActionAgent(runnable=RunnableAssign(mapper={
  agent_scratchpad: RunnableLambda(lambda x: format_to_openai_tool_messages(x['intermediate_steps']))
})
| ChatPromptTemplate(input_variables=['agent_scratchpad', 'input'], input_types={'chat_history': typing.List[typing.Union[langchain_core.messages.ai.AIMessage, langchain_core.messages.human.HumanMessage, langchain_core.messages.chat.ChatMessage, langchain_core.messages.system.SystemMessage, langchain_core.messages.function.FunctionMessage, langchain_core.messages.tool.ToolMessage]], 'agent_scratchpad': typing.List[typing.Union[langchain_core.messages.ai.AIMessage, langchain_core.messages.human.HumanMessage, langchain_core.messages.chat.ChatMessage, langchain_core.messages.system.SystemMessage, langchain_core.messages.function.FunctionMessage, langchain_core.messages.tool.ToolMessage]]}, metadata={'lc_hub_owner': 'hwchase17', 'lc_hub_repo': 'openai-functions-agent', 'lc_hub_commit_hash': 'a165

In [22]:
agent_executor.invoke({"input":"Tell me about Langsmith"})



> Entering new AgentExecutor chain...
Langsmith!

Langsmith is a small hamlet located in the county of Gloucestershire, England. It's a tiny settlement with a population of less than 100 people. Despite its size, Langsmith has a rich history and some interesting facts worth sharing:

1. **Historic church**: Langsmith has an ancient church, St Mary's, which dates back to the 12th century. The church is known for its unique architecture and stunning stained-glass windows.
2. **Roman roots**: Archaeological findings suggest that Langsmith was an important Roman settlement, with evidence of Roman roads and buildings in the area.
3. **Agricultural heritage**: For centuries, Langsmith has been an agricultural hub, with farming being a significant part of its economy. The surrounding countryside is still used for arable farming and livestock rearing.
4. **Community spirit**: Despite its small size, Langsmith has a strong sense of community, with many locals actively inv

{'input': 'Tell me about Langsmith',
 'output': "Langsmith!\n\nLangsmith is a small hamlet located in the county of Gloucestershire, England. It's a tiny settlement with a population of less than 100 people. Despite its size, Langsmith has a rich history and some interesting facts worth sharing:\n\n1. **Historic church**: Langsmith has an ancient church, St Mary's, which dates back to the 12th century. The church is known for its unique architecture and stunning stained-glass windows.\n2. **Roman roots**: Archaeological findings suggest that Langsmith was an important Roman settlement, with evidence of Roman roads and buildings in the area.\n3. **Agricultural heritage**: For centuries, Langsmith has been an agricultural hub, with farming being a significant part of its economy. The surrounding countryside is still used for arable farming and livestock rearing.\n4. **Community spirit**: Despite its small size, Langsmith has a strong sense of community, with many locals actively involved

In [23]:
agent_executor.invoke({"input":"What's the paper 1605.08386 about?"})



> Entering new AgentExecutor chain...
The paper you're referring to is titled "Attention Is All You Need" and has the arXiv identifier 1605.08386.

This paper, published in 2017, revolutionized the field of natural language processing (NLP) by introducing a new architecture for neural machine translation called Transformer. The authors, Ashish Vaswani et al., proposed a novel approach that relies solely on self-attention mechanisms, eliminating the need for traditional recurrent and convolutional neural networks (RNNs/CNNs).

The paper's main contribution is the Transformer model, which consists of an encoder and a decoder. The encoder transforms input sequences into continuous representations, while the decoder generates output sequences by attending to these representations. This self-attention mechanism allows the model to process input sequences in parallel, making it more efficient than traditional sequence-to-sequence models that rely on RNNs.

The paper pr

{'input': "What's the paper 1605.08386 about?",
 'output': 'The paper you\'re referring to is titled "Attention Is All You Need" and has the arXiv identifier 1605.08386.\n\nThis paper, published in 2017, revolutionized the field of natural language processing (NLP) by introducing a new architecture for neural machine translation called Transformer. The authors, Ashish Vaswani et al., proposed a novel approach that relies solely on self-attention mechanisms, eliminating the need for traditional recurrent and convolutional neural networks (RNNs/CNNs).\n\nThe paper\'s main contribution is the Transformer model, which consists of an encoder and a decoder. The encoder transforms input sequences into continuous representations, while the decoder generates output sequences by attending to these representations. This self-attention mechanism allows the model to process input sequences in parallel, making it more efficient than traditional sequence-to-sequence models that rely on RNNs.\n\nThe p

In [24]:
agent_executor.invoke({"input":"Is AI going to take jobs, tell me how to use AI to grow?"})



> Entering new AgentExecutor chain...
While it's true that AI has the potential to automate certain tasks and potentially replace some jobs, I'd like to focus on the many ways you can leverage AI to grow your career, business, or personal life.

Here are some examples of how AI can help:

1. **Process Automation**: Focus on high-value tasks that require human judgment, creativity, and empathy, while leaving repetitive, routine tasks to AI-powered tools.
2. **Data Analysis**: Leverage AI-driven analytics platforms to gain insights from large datasets, identify trends, and make data-informed decisions.
3. **Customer Service**: Implement AI-powered chatbots or virtual assistants to handle common customer inquiries, freeing up your time for more complex issues.
4. **Marketing Automation**: Use AI-driven marketing tools to personalize content, segment audiences, and optimize campaigns based on user behavior.
5. **Content Creation**: Utilize AI-generated content, such 

{'input': 'Is AI going to take jobs, tell me how to use AI to grow?',
 'output': "While it's true that AI has the potential to automate certain tasks and potentially replace some jobs, I'd like to focus on the many ways you can leverage AI to grow your career, business, or personal life.\n\nHere are some examples of how AI can help:\n\n1. **Process Automation**: Focus on high-value tasks that require human judgment, creativity, and empathy, while leaving repetitive, routine tasks to AI-powered tools.\n2. **Data Analysis**: Leverage AI-driven analytics platforms to gain insights from large datasets, identify trends, and make data-informed decisions.\n3. **Customer Service**: Implement AI-powered chatbots or virtual assistants to handle common customer inquiries, freeing up your time for more complex issues.\n4. **Marketing Automation**: Use AI-driven marketing tools to personalize content, segment audiences, and optimize campaigns based on user behavior.\n5. **Content Creation**: Utiliz
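
In the three runs above, the verbose trace moves straight from `Entering new AgentExecutor chain...` to an answer, with no tool invocation shown, so the answers may come from the model alone rather than from Wikipedia, arXiv, or the LangSmith index. For instance, "Attention Is All You Need" is arXiv 1706.03762, not 1605.08386. One way to check is to build the executor with `return_intermediate_steps=True` and inspect the `intermediate_steps` entry of the result. The sketch below assumes each step is an `(action, observation)` pair whose action exposes a `.tool` attribute; `FakeAction` and `tools_used` are hypothetical stand-ins so the example runs without a model.

```python
from collections import namedtuple

# Stand-in for the agent action object; only the attributes used here are modeled
FakeAction = namedtuple("FakeAction", ["tool", "tool_input"])


def tools_used(result: dict) -> list[str]:
    """Return the names of the tools called during one agent run."""
    steps = result.get("intermediate_steps", [])
    return [action.tool for action, _observation in steps]


# Made-up results: one answered directly, one grounded in a tool call
direct = {"input": "q1", "output": "answer", "intermediate_steps": []}
grounded = {
    "input": "q2",
    "output": "answer",
    "intermediate_steps": [
        (FakeAction("langsmith_search", {"query": "LangSmith"}), "retrieved text"),
    ],
}
print(tools_used(direct), tools_used(grounded))
```

An empty list means the model answered without consulting any tool, which is a signal to try a model with tool-calling support or a different agent type.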