# Building an Agent with Tools (Wikipedia, Arxiv, and RAG with OpenAI Embeddings) using LangChain and a Groq LLM

In this notebook, we build a multi-tool AI agent using LangChain and a Llama 3 model served by Groq. The agent answers questions by choosing among three tools: a Wikipedia search, an Arxiv search, and a custom retriever over web documents embedded with OpenAI embeddings.


### Create Tools 

#### Tool for Wikipedia

In [1]:
from langchain_community.tools import WikipediaQueryRun
from langchain_community.utilities import WikipediaAPIWrapper

# Create wrapper for Wikipedia API
wiki_api_wrapper = WikipediaAPIWrapper(
    top_k_results=1,
    doc_content_chars_max=250
)
# Create function to query Wikipedia
wiki_func = WikipediaQueryRun(api_wrapper=wiki_api_wrapper)
wiki_func.name



'wikipedia'

#### Tool for Arxiv

In [2]:
from langchain_community.tools import ArxivQueryRun
from langchain_community.utilities import ArxivAPIWrapper

# Create wrapper for Arxiv API
arxiv_api_wrapper = ArxivAPIWrapper(
    top_k_results=1,
    doc_content_chars_max=250
)
# Create function to query Arxiv
arxiv_func = ArxivQueryRun(api_wrapper=arxiv_api_wrapper)
print(arxiv_func.name)

arxiv


#### Custom Tool (RAG Tool)

In [3]:
from langchain_community.document_loaders import WebBaseLoader
# Create a web loader for a specific URL
web_loader = WebBaseLoader(
    web_path="https://docs.smith.langchain.com/"
)
docs = web_loader.load()
# Print the loaded documents
for doc in docs:
    print(doc.page_content[:50])  # Print the first 50 characters of each document
    print(doc.metadata)  # Print metadata of the document

USER_AGENT environment variable not set, consider setting it to identify your requests.







Get started with LangSmith | 🦜️🛠️ LangSmith


{'source': 'https://docs.smith.langchain.com/', 'title': 'Get started with LangSmith | 🦜️🛠️ LangSmith', 'description': 'LangSmith is a platform for building production-grade LLM applications.', 'language': 'en'}
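The warning at the top of the output above is raised because no `USER_AGENT` environment variable was found. A minimal sketch of one way to avoid it, assuming the variable is set before `WebBaseLoader` is imported; the value used here is a hypothetical placeholder, not something from the original notebook.

```python
import os

# Hypothetical identifier; replace it with a string that describes your app.
# setdefault keeps any value that is already defined in the environment.
os.environ.setdefault("USER_AGENT", "langchain-agent-notebook/0.1")

print(os.environ["USER_AGENT"])
```

Running this in the first cell of the notebook, ahead of the loader import, is the simplest placement.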


In [6]:
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Create a text splitter to split documents into smaller chunks
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,
    chunk_overlap=200
)
# Split the loaded documents into smaller chunks
split_docs = text_splitter.split_documents(docs)

# Print the first chunk of the first document
print(split_docs[0].page_content[:50])

Get started with LangSmith | 🦜️🛠️ LangSmith
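To make the roles of `chunk_size` and `chunk_overlap` concrete, here is a simplified sketch using a plain sliding window over characters. It is not the algorithm `RecursiveCharacterTextSplitter` uses (that splitter also tries to break on separators such as paragraphs and sentences); the helper name `sliding_window_chunks` is hypothetical and only illustrates how overlapping chunks share text.

```python
def sliding_window_chunks(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into fixed-size windows that overlap by chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
chunks = sliding_window_chunks(sample, chunk_size=10, chunk_overlap=4)
print(chunks)
# ['abcdefghij', 'ghijklmnop', 'mnopqrstuv', 'stuvwxyz']
```

Each chunk repeats the last four characters of the previous one, which is the property the overlap setting is meant to provide: context that straddles a chunk boundary is still present in at least one chunk.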


In [7]:
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import FAISS

# Create a vector store using FAISS with the embeddings model
vector_store = FAISS.from_documents(
    split_docs,
    OpenAIEmbeddings()
)

# Print the number of documents in the vector store
print(len(vector_store.index_to_docstore_id))

6


In [8]:

# Create a retriever from the vector store
retriever = vector_store.as_retriever(
    search_type="similarity",
    search_kwargs={"k": 2}
)

# Display the retriever and its search settings
retriever


VectorStoreRetriever(tags=['FAISS', 'OpenAIEmbeddings'], vectorstore=<langchain_community.vectorstores.faiss.FAISS object at 0x0000025DBAD17580>, search_kwargs={'k': 2})
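Conceptually, a similarity retriever with `k=2` embeds the query, scores it against every stored chunk vector, and returns the two best matches. The sketch below shows that idea with hand-written toy vectors and cosine similarity. Both are assumptions made for illustration: the real vectors come from `OpenAIEmbeddings`, and the FAISS index may use a different distance metric.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec: list[float], doc_vecs: dict[str, list[float]], k: int = 2) -> list[str]:
    """Return the ids of the k documents most similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), doc_id) for doc_id, vec in doc_vecs.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [doc_id for _, doc_id in scored[:k]]


# Toy 3-dimensional "embeddings" (hypothetical values)
toy_vectors = {
    "tracing": [0.9, 0.1, 0.0],
    "evals": [0.1, 0.9, 0.0],
    "prompts": [0.0, 0.2, 0.9],
}
query = [0.8, 0.3, 0.0]

results = top_k(query, toy_vectors, k=2)
print(results)
# ['tracing', 'evals']
```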

In [9]:
from langchain.tools.retriever import create_retriever_tool

# Create a retriever tool using the retriever
retriever_tool = create_retriever_tool(
    retriever=retriever,
    name="LangSmithSearch",
    description="A tool to retrieve from the LangSmith knowledge base."
)

# Display the retriever tool
retriever_tool

Tool(name='LangSmithSearch', description='A tool to retrieve from the LangSmith knowledge base.', args_schema=<class 'langchain_core.tools.retriever.RetrieverInput'>, func=functools.partial(<function _get_relevant_documents at 0x0000025DC7B82710>, retriever=VectorStoreRetriever(tags=['FAISS', 'OpenAIEmbeddings'], vectorstore=<langchain_community.vectorstores.faiss.FAISS object at 0x0000025DBAD17580>, search_kwargs={'k': 2}), document_prompt=PromptTemplate(input_variables=['page_content'], input_types={}, partial_variables={}, template='{page_content}'), document_separator='\n\n', response_format='content'), coroutine=functools.partial(<function _aget_relevant_documents at 0x0000025DC7E339A0>, retriever=VectorStoreRetriever(tags=['FAISS', 'OpenAIEmbeddings'], vectorstore=<langchain_community.vectorstores.faiss.FAISS object at 0x0000025DBAD17580>, search_kwargs={'k': 2}), document_prompt=PromptTemplate(input_variables=['page_content'], input_types={}, partial_variables={}, template='{pag

#### Combine Tools into a List

In [10]:
tools = [wiki_func, arxiv_func, retriever_tool]
# Print the name, description, and input schema of each tool
for tool in tools:
    print(tool.name)
    print(tool.description)
    print(tool.args_schema)
    print()

wikipedia
A wrapper around Wikipedia. Useful for when you need to answer general questions about people, places, companies, facts, historical events, or other subjects. Input should be a search query.
<class 'langchain_community.tools.wikipedia.tool.WikipediaQueryInput'>

arxiv
A wrapper around Arxiv.org Useful for when you need to answer questions about Physics, Mathematics, Computer Science, Quantitative Biology, Quantitative Finance, Statistics, Electrical Engineering, and Economics from scientific articles on arxiv.org. Input should be a search query.
<class 'langchain_community.tools.arxiv.tool.ArxivInput'>

LangSmithSearch
A tool to retrieve from the LangSmith knowledge base.
<class 'langchain_core.tools.retriever.RetrieverInput'>
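Each tool exposes a name, a description, and an input schema. The model reads the descriptions, picks a tool by name, and the executor routes the call to the matching function. The following is a rough pure-Python sketch of that routing step only. The `fake_*` functions, `TOOL_REGISTRY`, and `dispatch` are hypothetical stand-ins and are not part of LangChain.

```python
from typing import Callable, Dict


# Hypothetical stand-ins for the real tools; each just echoes its input.
def fake_wikipedia(query: str) -> str:
    return f"[wikipedia] summary for {query!r}"


def fake_arxiv(query: str) -> str:
    return f"[arxiv] papers matching {query!r}"


def fake_langsmith_search(query: str) -> str:
    return f"[LangSmithSearch] docs matching {query!r}"


# Map tool names (as the model would emit them) to callables.
TOOL_REGISTRY: Dict[str, Callable[[str], str]] = {
    "wikipedia": fake_wikipedia,
    "arxiv": fake_arxiv,
    "LangSmithSearch": fake_langsmith_search,
}


def dispatch(tool_call: dict) -> str:
    """Look up the requested tool by name and run it with the given query."""
    name = tool_call["name"]
    if name not in TOOL_REGISTRY:
        raise KeyError(f"Unknown tool: {name}")
    return TOOL_REGISTRY[name](tool_call["args"]["query"])


observation = dispatch({"name": "arxiv", "args": {"query": "AI and Finance"}})
print(observation)
# [arxiv] papers matching 'AI and Finance'
```

Because the choice of tool depends on the names and descriptions, a vague description can lead the model to pick a less suitable tool for a given question.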



#### Create model

In [12]:
# Create Groq model using the Groq API
import os

from dotenv import load_dotenv
from langchain_groq import ChatGroq

# Load GROQ_API_KEY, OPENAI_API_KEY, and LANGCHAIN_API_KEY from .env into os.environ
load_dotenv()

groq_api_key = os.getenv("GROQ_API_KEY")
if groq_api_key is None:
    raise RuntimeError("GROQ_API_KEY is not set")

llm = ChatGroq(
    model_name="Llama3-8b-8192",
    api_key=groq_api_key
)
llm


ChatGroq(client=<groq.resources.chat.completions.Completions object at 0x0000025DBB298430>, async_client=<groq.resources.chat.completions.AsyncCompletions object at 0x0000025DBB29A380>, model_name='Llama3-8b-8192', model_kwargs={}, groq_api_key=SecretStr('**********'))

#### Pull structured prompt from LangChain Hub

In [13]:
from langchain import hub
prompt=hub.pull("hwchase17/openai-functions-agent")
prompt.messages

[SystemMessagePromptTemplate(prompt=PromptTemplate(input_variables=[], input_types={}, partial_variables={}, template='You are a helpful assistant'), additional_kwargs={}),
 MessagesPlaceholder(variable_name='chat_history', optional=True),
 HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['input'], input_types={}, partial_variables={}, template='{input}'), additional_kwargs={}),
 MessagesPlaceholder(variable_name='agent_scratchpad')]

#### Create Agent

In [15]:
from langchain.agents import create_openai_tools_agent

# Create an agent that uses the tools and the prompt
agent = create_openai_tools_agent(
    llm=llm,
    tools=tools,
    prompt=prompt
)

agent

RunnableAssign(mapper={
  agent_scratchpad: RunnableLambda(lambda x: format_to_openai_tool_messages(x['intermediate_steps']))
})
| ChatPromptTemplate(input_variables=['agent_scratchpad', 'input'], optional_variables=['chat_history'], input_types={'chat_history': list[typing.Annotated[typing.Union[typing.Annotated[langchain_core.messages.ai.AIMessage, Tag(tag='ai')], typing.Annotated[langchain_core.messages.human.HumanMessage, Tag(tag='human')], typing.Annotated[langchain_core.messages.chat.ChatMessage, Tag(tag='chat')], typing.Annotated[langchain_core.messages.system.SystemMessage, Tag(tag='system')], typing.Annotated[langchain_core.messages.function.FunctionMessage, Tag(tag='function')], typing.Annotated[langchain_core.messages.tool.ToolMessage, Tag(tag='tool')], typing.Annotated[langchain_core.messages.ai.AIMessageChunk, Tag(tag='AIMessageChunk')], typing.Annotated[langchain_core.messages.human.HumanMessageChunk, Tag(tag='HumanMessageChunk')], typing.Annotated[langchain_core.messages

In [16]:
from langchain.agents import AgentExecutor

# Create an agent executor to run the agent with the tools
agent_executor = AgentExecutor(
    agent=agent,
    tools=tools,
    verbose=True
)

agent_executor

AgentExecutor(verbose=True, agent=RunnableMultiActionAgent(runnable=RunnableAssign(mapper={
  agent_scratchpad: RunnableLambda(lambda x: format_to_openai_tool_messages(x['intermediate_steps']))
})
| ChatPromptTemplate(input_variables=['agent_scratchpad', 'input'], optional_variables=['chat_history'], input_types={'chat_history': list[typing.Annotated[typing.Union[typing.Annotated[langchain_core.messages.ai.AIMessage, Tag(tag='ai')], typing.Annotated[langchain_core.messages.human.HumanMessage, Tag(tag='human')], typing.Annotated[langchain_core.messages.chat.ChatMessage, Tag(tag='chat')], typing.Annotated[langchain_core.messages.system.SystemMessage, Tag(tag='system')], typing.Annotated[langchain_core.messages.function.FunctionMessage, Tag(tag='function')], typing.Annotated[langchain_core.messages.tool.ToolMessage, Tag(tag='tool')], typing.Annotated[langchain_core.messages.ai.AIMessageChunk, Tag(tag='AIMessageChunk')], typing.Annotated[langchain_core.messages.human.HumanMessageChunk, Tag

In [17]:
agent_executor.invoke({
    "input": "What is LangSmith?"
})



> Entering new AgentExecutor chain...

Invoking: `wikipedia` with `{'query': 'LangSmith'}`


Page: Will Smith
Summary: Willard Carroll  Smith II (born September 25, 1968) is an American actor, rapper, and film producer. He has received multiple accolades, including an Academy Award, a Golden Globe Award, a BAFTA Award, and four Grammy Awards

It seems that the tool call for "call_0c7g" yielded some information about Will Smith. However, I couldn't find any information about a knowledge base or database called "LangSmith". It's possible that it's a lesser-known or niche database.

Since we didn't get any relevant information about "LangSmith", I'll respond directly to your question: I couldn't find any information about LangSmith, but I did find information about Will Smith, an American actor, rapper, and film producer.

> Finished chain.


{'input': 'What is LangSmith?',
 'output': 'It seems that the tool call for "call_0c7g" yielded some information about Will Smith. However, I couldn\'t find any information about a knowledge base or database called "LangSmith". It\'s possible that it\'s a lesser-known or niche database.\n\nSince we didn\'t get any relevant information about "LangSmith", I\'ll respond directly to your question: I couldn\'t find any information about LangSmith, but I did find information about Will Smith, an American actor, rapper, and film producer.'}

In [20]:
agent_executor.invoke({"input":"Who is Alan Turing?"})



> Entering new AgentExecutor chain...

Invoking: `wikipedia` with `{'query': 'Alan Turing'}`


Page: Alan Turing
Summary: Alan Mathison Turing (; 23 June 1912 – 7 June 1954) was an English mathematician, computer scientist, logician, cryptanalyst, philosopher and theoretical biologist. He was highly influential in the development of theoretica

Based on the result, here is the response:

Alan Turing was an English mathematician, computer scientist, logician, cryptanalyst, philosopher, and theoretical biologist. He was highly influential in the development of theoretical computer science, artificial intelligence, and the concept of the universal Turing machine. Turing is widely considered to be the father of computer science and artificial intelligence, and his work laid the foundation for a wide range of modern computer and information technologies.

> Finished chain.


{'input': 'Who is Alan Turing?',
 'output': 'Based on the result, here is the response:\n\nAlan Turing was an English mathematician, computer scientist, logician, cryptanalyst, philosopher, and theoretical biologist. He was highly influential in the development of theoretical computer science, artificial intelligence, and the concept of the universal Turing machine. Turing is widely considered to be the father of computer science and artificial intelligence, and his work laid the foundation for a wide range of modern computer and information technologies.'}

In [21]:
agent_executor.invoke({"input":"Any recent papers on AI and Finance?"})



> Entering new AgentExecutor chain...

Invoking: `arxiv` with `{'query': 'Recent papers on AI and Finance'}`


Published: 2023-06-05
Title: Explaining AI in Finance: Past, Present, Prospects
Authors: Barry Quinn
Summary: This paper explores the journey of AI in finance, with a particular focus on
the crucial role and potential of Explainable AI (XAI). We trac

Here's a brief summary:

The paper "Explaining AI in Finance: Past, Present, Prospects" by Barry Quinn explores the journey of AI in finance, with a focus on the role and potential of Explainable AI (XAI). The paper tracks the evolution of AI applications in finance, from early attempts to the current state of the field, and discusses the challenges and opportunities of using XAI in finance.

Would you like me to call another tool or provide more information on this paper?

> Finished chain.


{'input': 'Any recent papers on AI and Finance?',
 'output': 'Here\'s a brief summary:\n\nThe paper "Explaining AI in Finance: Past, Present, Prospects" by Barry Quinn explores the journey of AI in finance, with a focus on the role and potential of Explainable AI (XAI). The paper tracks the evolution of AI applications in finance, from early attempts to the current state of the field, and discusses the challenges and opportunities of using XAI in finance.\n\nWould you like me to call another tool or provide more information on this paper?'}

In [22]:
agent_executor.invoke({"input":"How do I log traces in LangSmith?"})



> Entering new AgentExecutor chain...

Invoking: `LangSmithSearch` with `{'query': 'logging-traces'}`


LangSmith is a platform for building production-grade LLM applications.
It allows you to closely monitor and evaluate your application, so you can ship quickly and with confidence.
ObservabilityAnalyze traces in LangSmith and configure metrics, dashboards, alerts based on these.EvalsEvaluate your application over production traffic — score application performance and get human feedback on your data.Prompt EngineeringIterate on prompts, with automatic version control and collaboration features.
LangSmith + LangChain OSSLangSmith is framework-agnostic — it can be used with or without LangChain's open source frameworks langchain and langgraph.If you are using either of these, you can enable LangSmith tracing with a single environment variable.
For more see the how-to guide for setting up LangSmith with LangChain or setting up LangSmith with La

{'input': 'How do I log traces in LangSmith?',
 'output': "I've got the information you need. LangSmith is a platform for building production-grade LLM applications, and it allows you to closely monitor and evaluate your application. You can analyze traces in LangSmith, configure metrics, dashboards, and alerts, and also evaluate your application over production traffic."}