# How to build a tool-using agent with LangChain

This notebook takes you through how to use LangChain to augment an OpenAI model with access to external tools. In particular, you'll be able to create LLM agents that use custom tools to answer user queries.


## What is LangChain?
[LangChain](https://python.langchain.com/en/latest/index.html) is a framework for developing applications powered by language models. The framework enables you to build layered LLM-powered applications that are context-aware and able to interact dynamically with their environment as agents, leading to simplified code for you and a more dynamic user experience for your customers.

## Why do LLMs need to use Tools?
One of the most common challenges with LLMs is the lack of recency and specificity in their training data: answers can be out of date, and models are prone to hallucination given the breadth of their knowledge base. Tools let an LLM answer within a controlled context that draws on your existing knowledge bases and internal APIs. Instead of trying to prompt-engineer the LLM all the way to your intended answer, you give it access to tools that it calls dynamically for information, whose results it then parses and serves to the customer.

Providing LLMs access to tools can enable them to answer questions with context directly from search engines, APIs or your own databases. Instead of answering directly, an LLM with access to tools can perform intermediate steps to gather relevant information. Tools can also be used in combination. [For example](https://python.langchain.com/en/latest/modules/agents/agents/examples/mrkl_chat.html), a language model can be made to use a search tool to look up quantitative information and a calculator to execute calculations.

## Notebook Sections

- **Setup:** Import packages and connect to a Pinecone vector database.
- **LLM Agent:** Build an agent that leverages a modified version of the [ReAct](https://react-lm.github.io/) framework to do chain-of-thought reasoning.
- **LLM Agent with History:** Provide the LLM with access to previous steps in the conversation.
- **Knowledge Base:** Create a knowledge base of "Stuff You Should Know" podcast episodes, to be accessed through a tool.
- **LLM Agent with Tools:** Extend the agent with access to multiple tools and test that it uses them to answer questions.

# Setup

Import libraries and set up a connection to a [Pinecone](https://www.pinecone.io) vector database.

You can substitute any other vectorstore or database for Pinecone. A [selection](https://python.langchain.com/en/latest/modules/indexes/vectorstores.html) is supported by LangChain natively, while connectors for others will need to be developed yourself.

In [91]:
%load_ext autoreload
%autoreload 2

The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload


In [92]:
from langchain.document_loaders import AsyncHtmlLoader
from langchain.document_loaders.recursive_url_loader import RecursiveUrlLoader
from langchain.document_transformers import Html2TextTransformer
import datetime
from pprint import pprint, pp
import json
import openai
import os
from helper import fg, bg, control
import pandas as pd
import pinecone
import re
from tqdm.auto import tqdm
from typing import List, Union
import zipfile
#from pydantic_v1 import BaseModel, Extra, Field, root_validator
from bs4 import BeautifulSoup
# Langchain imports
from langchain.agents import Tool, AgentExecutor, LLMSingleActionAgent, AgentOutputParser, AgentType, initialize_agent
from langchain.prompts import BaseChatPromptTemplate, ChatPromptTemplate
from langchain import SerpAPIWrapper, LLMChain
from langchain.schema import AgentAction, AgentFinish, HumanMessage, SystemMessage
# LLM wrapper
from langchain.chat_models import ChatOpenAI
from langchain import OpenAI
# Conversational memory
from langchain.memory import ConversationBufferWindowMemory
# Embeddings and vectorstore
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import Pinecone

# Vectorstore Index
index_name = 'dfv'

api_key = os.getenv('PINECONE_API_KEY')
env = 'gcp-starter'

To acquire an API key to connect with Pinecone, you can set up a [free account](https://app.pinecone.io/) and store the key in the `api_key` variable above or in your environment variables under `PINECONE_API_KEY`.

In [93]:

# Use the key loaded above rather than hard-coding a secret in the notebook
pinecone.init(api_key=api_key, environment=env)
pinecone.whoami()

# Check whether the index with the same name already exists - if so, delete it
# if index_name in pinecone.list_indexes():
#     pinecone.delete_index(index_name)

# Creates new index
try:
    if index_name not in pinecone.list_indexes():
        pinecone.create_index(name=index_name, dimension=1536)
        print('created: ', index_name)
    index = pinecone.Index(index_name=index_name)
except Exception as e:
    print(f'pinecone: {e}')

# Confirm our index was created
pinecone.list_indexes()

['dfv']
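
Reading keys from the environment, as described above, can be made a little safer by failing early when a variable is missing. The following is a minimal sketch; the helper name `require_env` is hypothetical and not part of Pinecone or LangChain. It only wraps `os.getenv`.

```python
import os


def require_env(name: str) -> str:
    """Return the value of environment variable `name`, or raise if it is unset or empty."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"Set the {name} environment variable before running this notebook")
    return value


# Example usage (assumes the variable has been exported in your shell):
# pinecone_api_key = require_env("PINECONE_API_KEY")
```

Raising at the point of lookup gives a clearer message than the authentication error you would otherwise see later from the client.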

Run this code block if you want to clear the index, or if the index doesn't exist yet:

```
# Check whether the index with the same name already exists - if so, delete it
if index_name in pinecone.list_indexes():
    pinecone.delete_index(index_name)

# Creates new index
pinecone.create_index(name=index_name, dimension=1536)
index = pinecone.Index(index_name=index_name)

# Confirm our index was created
pinecone.list_indexes()
```

## LLM Agent

An [LLM agent](https://python.langchain.com/en/latest/modules/agents/agents/custom_llm_agent.html) in LangChain has many configurable components, which are detailed in the LangChain documentation.

We'll employ a few of the core concepts to make an agent that talks in the way we want, can use tools to answer questions, and uses the appropriate language model to power the conversation.
- **Prompt Template:** The input template to control the LLM's behaviour and how it accepts inputs and produces outputs - this is the brain that drives your application ([docs](https://python.langchain.com/en/latest/modules/prompts/prompt_templates.html)).
- **Output Parser:** A method of parsing the output from the prompt. If the LLM produces output using certain headers, you can enable complex interactions where variables are generated by the LLM in its response and passed into the next step of the chain ([docs](https://python.langchain.com/en/latest/modules/prompts/output_parsers.html)).
- **LLM Chain:** A Chain brings together a prompt template with an LLM that will execute it - in this case we'll be using ```gpt-3.5-turbo``` but this framework can be used with OpenAI completions models, or other LLMs entirely ([docs](https://python.langchain.com/en/latest/modules/chains.html)).
- **Tool:** An external service that the LLM can use to retrieve information or execute commands should the user require it ([docs](https://python.langchain.com/en/latest/modules/agents/tools.html)).
- **Agent:** The glue that brings all of this together, an agent can call multiple LLM Chains, each with their own tools. Agents can be extended with your own logic to allow retries, error handling and any other methods you choose to add reliability to your application ([docs](https://python.langchain.com/en/latest/modules/agents.html)).

**NB:** Before using this cookbook with the Search tool you'll need to sign up on https://serpapi.com/ and generate an API key. Once you have it, store it in an environment variable named ```SERPAPI_API_KEY```
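
How the components listed above fit together can be sketched as a plain-Python loop, without LangChain. This is an illustration under stated assumptions, not LangChain's implementation: `scripted_llm` is a fake model that returns canned text, `add_numbers` is a hypothetical calculator tool, and the prompt is reduced to a question plus scratchpad.

```python
import re
from typing import Callable, Dict


def add_numbers(expression: str) -> str:
    """Hypothetical tool: evaluates a single 'a + b' addition."""
    left, right = expression.split("+")
    return str(float(left) + float(right))


TOOLS: Dict[str, Callable[[str], str]] = {"calculator": add_numbers}
PROMPT = "Question: {question}\n{scratchpad}"
ACTION_RE = re.compile(r"Action: (.*?)\nAction Input: (.*)", re.DOTALL)


def run_agent(llm: Callable[[str], str], question: str, max_steps: int = 5) -> str:
    """Loop: format prompt -> call LLM -> parse output -> run tool -> record observation."""
    scratchpad = ""
    for _ in range(max_steps):
        output = llm(PROMPT.format(question=question, scratchpad=scratchpad))
        if "Final Answer:" in output:
            return output.split("Final Answer:")[-1].strip()
        match = ACTION_RE.search(output)
        if match is None:
            raise ValueError(f"Could not parse LLM output: {output!r}")
        tool_name, tool_input = match.group(1).strip(), match.group(2).strip()
        observation = TOOLS[tool_name](tool_input)
        scratchpad += f"{output}\nObservation: {observation}\nThought: "
    raise RuntimeError("Agent did not finish within max_steps")


def scripted_llm(prompt: str) -> str:
    """Fake model: requests the calculator first, then answers from the observation."""
    if "Observation:" not in prompt:
        return "Thought: I should add the numbers\nAction: calculator\nAction Input: 2 + 3"
    observation = prompt.split("Observation:")[-1].split("\n")[0].strip()
    return f"I now know the final answer\nFinal Answer: {observation}"


print(run_agent(scripted_llm, "What is 2 + 3?"))
```

The prompt template, output parser, tool and agent roles each map to one piece of this loop; the rest of the notebook builds the same pieces with LangChain classes.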

In [94]:
# Initiate a Search tool - note you'll need to have set SERPAPI_API_KEY as an environment variable as per the above instructions
# search = SerpAPIWrapper()
#
# class FunctionTool(BaseModel):
#     def run(self, query: str) -> str:
#         print(f'FunctionTool Item: {query}')
#         return f"answer: {bytes(m).hex().capitalize()}"
#
import requests
from langchain.tools import tool
from langchain.tools import StructuredTool


def connect_api(host: str, port: int, parameters: dict = None) -> str:
    # requests.post has no `port` argument, so the port goes into the URL itself
    url = f'http://{host}:{port}'
    result = requests.post(url=url)
    return f"{result.text}"


connect_tool = StructuredTool.from_function(connect_api,
                                            description='connect to specified host and port')


@tool('connect', return_direct=False)
def connect_api(host: str, port: int) -> str:
    """Searches the API for the query."""
    print(f'connect_api: {host}:{port}')
    results = f'Host: {host}\r\n\r\n'
    return f"reply at {host}:{port}: {results}"


@tool('fetch')
def fetch_url(url: str) -> str:
    """fetch url"""
    print(f'fetching {url}')
    #result = requests.post(url)
    #
    # soup = BeautifulSoup(result.text, 'lxml')
    # ps = [e.get_text() for e in soup.find_all('p')]
    # article = '\n'.join(ps)

    loader = RecursiveUrlLoader(url, max_depth=2, timeout=3)
    docs = loader.load()
    print(docs)
    html2text = Html2TextTransformer()
    docs_transformed = html2text.transform_documents(docs)
    print(docs_transformed)
    if not docs_transformed:
        return ""

    # page_content is a string attribute, not a method
    tr = docs_transformed[0].page_content
    print(f'result:{tr}')
    return tr


tools = [connect_tool]
search = SerpAPIWrapper()

tools.append(
    Tool.from_function(
        func=fetch_url,
        name="fetch",
        description='get remote document',
        handle_tool_error=True,
    )
)

# Define a list of tools
tools.append(
    Tool.from_function(
        func=connect_api,
        name="connect",
        description='network connect with the host port',
        handle_tool_error=True,
    ),
)

tools.append(
    Tool(
        name="Abstract Answer",
        func=search.run,
        description="useful for when you need to ask with search",
    )
)

print(tools)

[StructuredTool(name='connect_api', description='connect_api(host: str, port: int, parameters: dict = None) -> str - connect to specified host and port', args_schema=<class 'pydantic.main.connect_apiSchemaSchema'>, return_direct=False, verbose=False, callbacks=None, callback_manager=None, tags=None, metadata=None, handle_tool_error=False, func=<function connect_api at 0x00000259FFCED9D0>, coroutine=None), Tool(name='fetch', description='get remote document', args_schema=None, return_direct=False, verbose=False, callbacks=None, callback_manager=None, tags=None, metadata=None, handle_tool_error=True, func=StructuredTool(name='fetch', description='fetch(url: str) -> str - fetch url', args_schema=<class 'pydantic.main.fetchSchemaSchema'>, return_direct=False, verbose=False, callbacks=None, callback_manager=None, tags=None, metadata=None, handle_tool_error=False, func=<function fetch_url at 0x00000259FFE72040>, coroutine=None), coroutine=None), Tool(name='connect', description='network conn

In [95]:
# Set up the prompt with input variables for tools, user input and a scratchpad for the model to record its workings
template = """Answer the following questions as best you can. You have access to the following tools:

{tools}

Use the following format:

Question: the input question you must answer
Thought: you should always think about what to do and where to get output
Action: the action to take, should be one of [{tool_names}]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question

Question: {input}
{agent_scratchpad}"""

In [96]:

# Set up a prompt template
class CustomPromptTemplate(BaseChatPromptTemplate):
    # The template to use
    template: str
    # The list of tools available
    tools: List[Tool]

    def format_messages(self, **kwargs) -> List[HumanMessage]:
        # Get the intermediate steps (AgentAction, Observation tuples)

        # Format them in a particular way
        intermediate_steps = kwargs.pop("intermediate_steps")
        thoughts = ""
        for action, observation in intermediate_steps:
            thoughts += action.log

            thoughts += f"\nObservation: {observation[:4096]}\nThought: "

        # Set the agent_scratchpad variable to that value
        kwargs["agent_scratchpad"] = thoughts

        # Create a tools variable from the list of tools provided
        kwargs["tools"] = "\n".join([f"{tool.name}: {tool.description}" for tool in self.tools])

        # Create a list of tool names for the tools provided
        kwargs["tool_names"] = ", ".join([tool.name for tool in self.tools])
        formatted = self.template.format(**kwargs)
        print(control.bold + f'formatted template:\n---\n{formatted}\n---\n', control.reset)
        return [HumanMessage(content=formatted)]


prompt = CustomPromptTemplate(
    template=template,
    tools=tools,
    # This omits the `agent_scratchpad`, `tools`, and `tool_names` variables because those are generated dynamically
    # This includes the `intermediate_steps` variable because that is needed
    input_variables=["input", "intermediate_steps"]
)
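
The scratchpad logic in `format_messages` can be tried in isolation. The sketch below reproduces only that step; `Step` is a hypothetical stand-in for LangChain's `AgentAction` (only its `log` field is used), and `build_scratchpad` is a name introduced here for illustration.

```python
from typing import List, NamedTuple, Tuple


class Step(NamedTuple):
    """Hypothetical stand-in for an agent action; only `log` is needed here."""
    log: str


def build_scratchpad(steps: List[Tuple[Step, str]], max_chars: int = 4096) -> str:
    """Concatenate each action log with its observation, truncated to `max_chars`."""
    thoughts = ""
    for action, observation in steps:
        thoughts += action.log
        thoughts += f"\nObservation: {observation[:max_chars]}\nThought: "
    return thoughts


steps = [
    (Step(log="Thought: look it up\nAction: fetch\nAction Input: https://example.com"), "Example Domain"),
]
print(build_scratchpad(steps))
```

Truncating each observation keeps a single long tool result, such as a fetched web page, from filling the model's context window.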

In [97]:
class CustomOutputParser(AgentOutputParser):

    def parse(self, llm_output: str) -> Union[AgentAction, AgentFinish]:

        # Check if agent should finish
        if "Final Answer:" in llm_output:
            return AgentFinish(
                # Return values is generally always a dictionary with a single `output` key
                # It is not recommended to try anything else at the moment :)
                return_values={"output": llm_output.split("Final Answer:")[-1].strip()},
                log=llm_output,
            )

        # Parse out the action and action input
        regex = r"Action: (.*?)[\n]*Action Input:[\s]*(.*)"
        match = re.search(regex, llm_output, re.DOTALL)

        # If it can't parse the output it raises an error
        # You can add your own logic here to handle errors in a different way i.e. pass to a human, give a canned response
        if not match:
            raise ValueError(f"Could not parse LLM output: `{llm_output}`")
        action = match.group(1).strip()
        action_input = match.group(2)

        # Return the action and action input
        return AgentAction(tool=action, tool_input=action_input.strip(" ").strip('"'), log=llm_output)


output_parser = CustomOutputParser()
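
To see what the parser's regular expression extracts, it can be run against a sample string. The `sample_output` text below is made up for illustration and is not real model output.

```python
import re

# Same pattern as in CustomOutputParser.parse
ACTION_REGEX = r"Action: (.*?)[\n]*Action Input:[\s]*(.*)"

sample_output = (
    "Thought: I need to read the page\n"
    "Action: fetch\n"
    'Action Input: "https://example.com"'
)

match = re.search(ACTION_REGEX, sample_output, re.DOTALL)
tool_name = match.group(1).strip()
# Mirror the parser: drop surrounding spaces, then surrounding double quotes
tool_input = match.group(2).strip(" ").strip('"')
print(tool_name, tool_input)

# Text without an Action/Action Input pair does not match,
# which is the case where the parser raises ValueError
no_match = re.search(ACTION_REGEX, "I am not sure what to do", re.DOTALL)
```

Because everything after `Action Input:` is captured as one string, the tool receives a single text argument regardless of how many parameters it declares.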

In [98]:
# Initiate our LLM - default is 'gpt-3.5-turbo'
llm = ChatOpenAI(temperature=0)

# LLM chain consisting of the LLM and a prompt
llm_chain = LLMChain(llm=llm, prompt=prompt)

# Using tools, the LLM chain and output_parser to make an agent
tool_names = [tool.name for tool in tools]

agent = LLMSingleActionAgent(
    llm_chain=llm_chain,
    output_parser=output_parser,
    # We use "Observation" as our stop sequence so it will stop when it receives Tool output
    # If you change your prompt template you'll need to adjust this as well
    stop=["\nObservation:"],
    allowed_tools=tool_names
)
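
The stop sequence keeps the model from inventing its own `Observation:` line, so that the executor can supply the real tool result. The effect can be emulated locally as below; `truncate_at_stop` is a hypothetical helper for illustration, since the actual truncation is done by the model API.

```python
from typing import List


def truncate_at_stop(text: str, stop: List[str]) -> str:
    """Return `text` cut just before the earliest occurrence of any stop string."""
    cut = len(text)
    for marker in stop:
        position = text.find(marker)
        if position != -1:
            cut = min(cut, position)
    return text[:cut]


raw = (
    "Thought: check the page\n"
    "Action: fetch\n"
    "Action Input: https://example.com\n"
    "Observation: (text the model would otherwise make up)"
)
print(truncate_at_stop(raw, ["\nObservation:"]))
```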

In [99]:
# Initiate the agent that will respond to our queries
# Set verbose=True to share the CoT reasoning the LLM goes through
# print(tools)
# llm = OpenAI(temperature=0)
# agent_executor = initialize_agent(
#     tool_names,
#     llm=ll,
#     agent=AgentType.STRUCTURED_CHAT_ZERO_SHOT_REACT_DESCRIPTION,
#     verbose=True)
agent_executor = AgentExecutor.from_agent_and_tools(agent=agent, tools=tools, verbose=True)

In [100]:
#agent_executor.run("How many people live in canada as of 2023?")
agent_executor.run("what content can be on port 80 of google.com? ")



> Entering new AgentExecutor chain...
formatted template:
---
Answer the following questions as best you can. You have access to the following tools:

connect_api: connect_api(host: str, port: int, parameters: dict = None) -> str - connect to specified host and port
fetch: get remote document
connect: network connect with the host port
Abstract Answer: useful for when you need to ask with search

Use the following format:

Question: the input question you must answer
Thought: you should always think about what to do and where to get output
Action: the action to take, should be one of [connect_api, fetch, connect, Abstract Answer]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question

Question: what content can be on port 80 of google.com? 

---
Thought: To determi

ValidationError: 1 validation error for connect_apiSchemaSchema
port
  field required (type=value_error.missing)
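
The `ValidationError` above is consistent with the output parser handing the tool one string as `Action Input`, while the `connect_api` schema requires both `host` and `port`. One possible workaround, sketched here under the assumption that the model is told to write its input as `host:port`, is to parse that string before calling the function. `parse_host_port` is a hypothetical helper, not a LangChain API.

```python
from typing import Tuple


def parse_host_port(tool_input: str, default_port: int = 80) -> Tuple[str, int]:
    """Split an agent-supplied string such as 'google.com:80' into (host, port)."""
    cleaned = tool_input.strip().strip('"')
    host, separator, port_text = cleaned.rpartition(":")
    if not separator:
        # No colon present: treat the whole string as the host
        return cleaned, default_port
    return host, int(port_text)


print(parse_host_port("google.com:80"))
```

A single-input tool could call this helper first and then pass the two values on, with the expected `host:port` format stated in the tool description.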

In [None]:
agent_executor.run('what is on https://ya.ru/')

In [None]:
agent_executor.run("How many in 2022?")

## LLM Agent with History

Extend the LLM Agent with the ability to retain a [memory](https://python.langchain.com/en/latest/modules/agents/agents/custom_llm_agent.html#adding-memory) and use it as context as it continues the conversation.

We use a simple ```ConversationBufferWindowMemory``` for this example that keeps a rolling window of the last two conversation turns. LangChain has other [memory options](https://python.langchain.com/en/latest/modules/memory.html), with different tradeoffs suitable for different use cases.
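
The rolling-window idea can be sketched without LangChain using a bounded `deque`. `WindowMemory` is a hypothetical, simplified analogue, and the `Human:`/`AI:` labels are an assumption for illustration rather than a statement of LangChain's exact format.

```python
from collections import deque
from typing import Deque, Tuple


class WindowMemory:
    """Keeps only the most recent `k` (user, assistant) turns."""

    def __init__(self, k: int = 2) -> None:
        # A deque with maxlen drops the oldest turn automatically
        self.turns: Deque[Tuple[str, str]] = deque(maxlen=k)

    def save(self, user_input: str, ai_output: str) -> None:
        self.turns.append((user_input, ai_output))

    def history(self) -> str:
        return "\n".join(f"Human: {u}\nAI: {a}" for u, a in self.turns)


memory_sketch = WindowMemory(k=2)
memory_sketch.save("q1", "a1")
memory_sketch.save("q2", "a2")
memory_sketch.save("q3", "a3")  # pushes the first turn out of the window
print(memory_sketch.history())
```

A small window keeps the prompt short, at the cost of forgetting anything said more than `k` turns ago.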

In [None]:
# Set up a prompt template which can interpolate the history
template_with_history = """You are SearchGPT, a professional search engine who provides informative answers to users. Answer the following questions as best you can. You have access to the following tools:

{tools}

Use the following format:

Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [{tool_names}]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question

Begin! Remember to give detailed, informative answers

Previous conversation history:
{history}

New question: {input}
{agent_scratchpad}"""

In [None]:
prompt_with_history = CustomPromptTemplate(
    template=template_with_history,
    tools=tools,
    # The history template includes "history" as an input variable so we can interpolate it into the prompt
    input_variables=["input", "intermediate_steps", "history"]
)

llm_chain = LLMChain(llm=llm, prompt=prompt_with_history)
tool_names = [tool.name for tool in tools]
agent = LLMSingleActionAgent(
    llm_chain=llm_chain,
    output_parser=output_parser,
    stop=["\nObservation:"],
    allowed_tools=tool_names
)

In [None]:
# Initiate the memory with k=2 to keep the last two turns
# Provide the memory to the agent
memory = ConversationBufferWindowMemory(k=2)
agent_executor = AgentExecutor.from_agent_and_tools(agent=agent, tools=tools, verbose=True, memory=memory)

In [None]:
agent_executor.run("How many people live in canada as of 2023?")

In [None]:
agent_executor.run("how about in mexico?")

## Knowledge base

Create a custom vectorstore for the agent to use as a tool to answer questions with. We'll store the results in [Pinecone](https://docs.pinecone.io/docs/quickstart), which is supported by LangChain ([Docs](https://python.langchain.com/en/latest/modules/indexes/vectorstores/examples/pinecone.html), [API reference](https://python.langchain.com/en/latest/reference/modules/vectorstore.html)). For help getting started with Pinecone or other vector databases, see our [cookbook](https://github.com/openai/openai-cookbook/blob/colin/examples/vector_databases/Using_vector_databases_for_embeddings_search.ipynb).

You can check the LangChain documentation to see what other [vectorstores](https://python.langchain.com/en/latest/modules/indexes/vectorstores.html) and [databases](https://python.langchain.com/en/latest/modules/chains/examples/sqlite.html) are available.

For this example we'll use the transcripts of the Stuff You Should Know podcast, provided via OSF (DOI [10.17605/OSF.IO/VM9NT](https://doi.org/10.17605/OSF.IO/VM9NT)).

In [None]:
import wget

# Here is a URL to a zip archive containing the transcribed podcasts
# Note that this data has already been split into chunks and embeddings from OpenAI's text-embedding-ada-002 embedding model are included
content_url = 'https://cdn.openai.com/API/examples/data/sysk_podcast_transcripts_embedded.json.zip'

# Download the file (it is ~541 MB so this will take some time)
wget.download(content_url)

In [None]:
# Load podcasts
with zipfile.ZipFile("sysk_podcast_transcripts_embedded.json.zip", "r") as zip_ref:
    zip_ref.extractall("./data")
# Use a context manager so the file handle is closed after loading
with open('./data/sysk_podcast_transcripts_embedded.json') as f:
    processed_podcasts = json.load(f)

In [None]:
# Have a look at the contents
pd.DataFrame(processed_podcasts).head()

In [None]:
# Add the text embeddings to Pinecone

batch_size = 100  # how many embeddings we create and insert at once

for i in tqdm(range(0, len(processed_podcasts), batch_size)):
    # find end of batch
    i_end = min(len(processed_podcasts), i + batch_size)
    meta_batch = processed_podcasts[i:i_end]
    # get ids
    ids_batch = [x['cleaned_id'] for x in meta_batch]
    # get texts to encode
    texts = [x['text_chunk'] for x in meta_batch]
    # add embeddings
    embeds = [x['embedding'] for x in meta_batch]
    # cleanup metadata
    meta_batch = [{
        'filename':   x['filename'],
        'title':      x['title'],
        'text_chunk': x['text_chunk'],
        'url':        x['url']
    } for x in meta_batch]
    to_upsert = list(zip(ids_batch, embeds, meta_batch))
    # upsert to Pinecone
    index.upsert(vectors=to_upsert)
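
The slicing pattern used in the upsert loop can be factored into a small generator. This is a generic sketch; `batched` is a name chosen here and the integers stand in for podcast records.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def batched(items: Sequence[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive slices of `items`, each at most `batch_size` long."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        # Slicing past the end is safe, so the last batch is simply shorter
        yield list(items[start:start + batch_size])


print([len(batch) for batch in batched(list(range(250)), 100)])
```

Sending vectors in batches keeps each request to the index small, which is why the loop above does not upsert the whole corpus in one call.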

In [None]:
# Configuring the embeddings to be used by our retriever to be OpenAI Embeddings, matching our embedded corpus
embeddings = OpenAIEmbeddings()

# Loads a docsearch object from an existing Pinecone index so we can retrieve from it
docsearch = Pinecone.from_existing_index(index_name, embeddings, text_key='text_chunk')

In [None]:
retriever = docsearch.as_retriever()

In [None]:
query_docs = retriever.get_relevant_documents("can you live without a bank account")

In [None]:
# Print out the title and content for the most relevant retrieved documents
print("\n".join(['Title: ' + x.metadata['title'].strip() + '\n\n' + x.page_content + '\n\n' for x in query_docs]))
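
Conceptually, the retriever embeds the query and returns the stored chunks whose vectors are most similar to it. The sketch below shows that ranking step on toy two-dimensional vectors, assuming a cosine similarity metric; the vectors and document names are invented, and a real index uses approximate search over 1536-dimensional embeddings.

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query: List[float], corpus: Dict[str, List[float]], k: int = 2) -> List[Tuple[str, float]]:
    """Score every document against the query and keep the `k` best."""
    scored = [(doc_id, cosine_similarity(query, vector)) for doc_id, vector in corpus.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


toy_corpus = {"banking": [1.0, 0.0], "zoos": [0.0, 1.0], "money": [0.8, 0.2]}
print(top_k([1.0, 0.1], toy_corpus))
```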

## LLM Agent with Tools

Extend our list of tools by creating a [RetrievalQA](https://python.langchain.com/en/latest/modules/chains/index_examples/vector_db_qa.html) chain leveraging our Pinecone knowledge base.

In [None]:
from langchain.chains import RetrievalQA

retrieval_llm = OpenAI(temperature=0)

podcast_retriever = RetrievalQA.from_chain_type(llm=retrieval_llm, chain_type="stuff", retriever=docsearch.as_retriever())
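
The `"stuff"` chain type places all retrieved chunks into a single prompt alongside the question. A rough sketch of that idea follows; the prompt wording and the `build_stuff_prompt` helper are assumptions for illustration, not LangChain's actual template.

```python
from typing import List


def build_stuff_prompt(question: str, chunks: List[str]) -> str:
    """Put every retrieved chunk into one prompt, followed by the question."""
    context = "\n\n".join(chunks)
    return (
        "Use the following context to answer the question.\n\n"
        f"{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


print(build_stuff_prompt("Can you live without a bank account?", ["chunk one", "chunk two"]))
```

This approach is simple and needs only one LLM call, but the combined chunks must fit within the model's context window.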

In [None]:
expanded_tools = [
    Tool(
        name="Search",
        func=search.run,
        description="useful for when you need to answer questions about current events"
    ),
    Tool(
        name='Knowledge Base',
        func=podcast_retriever.run,
        description="Useful for general questions about how to do things and for details on interesting topics. Input should be a fully formed question."
    )
]

In [None]:
# Re-initialize the agent with our new list of tools
prompt_with_history = CustomPromptTemplate(
    template=template_with_history,
    tools=expanded_tools,
    input_variables=["input", "intermediate_steps", "history"]
)
llm_chain = LLMChain(llm=llm, prompt=prompt_with_history)
multi_tool_names = [tool.name for tool in expanded_tools]
multi_tool_agent = LLMSingleActionAgent(
    llm_chain=llm_chain,
    output_parser=output_parser,
    stop=["\nObservation:"],
    allowed_tools=multi_tool_names
)

In [None]:
multi_tool_memory = ConversationBufferWindowMemory(k=2)
multi_tool_executor = AgentExecutor.from_agent_and_tools(agent=multi_tool_agent, tools=expanded_tools, verbose=True, memory=multi_tool_memory)

In [None]:
multi_tool_executor.run("Hi, I'd like to know how you can live without a bank account")

In [None]:
multi_tool_executor.run('Can you tell me some interesting facts about whether zoos are good or bad for animals')

You now have a template to deploy conversational agents with tools. If you want to extend this with a Custom Agent to add your own retry behaviour or treatment of input/output variables, then follow [this article](https://python.langchain.com/en/latest/modules/agents/agents/custom_agent.html).
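
As one illustration of retry behaviour, a call such as an agent's `run` method can be wrapped in a small helper. This is a minimal sketch under stated assumptions: `run_with_retries` is a hypothetical name, and it retries on any exception, which you may want to narrow to parsing or network errors.

```python
import time
from typing import Callable, Optional


def run_with_retries(run: Callable[[str], str], query: str, attempts: int = 3, delay_s: float = 0.0) -> str:
    """Call `run(query)` up to `attempts` times, raising if every attempt fails."""
    last_error: Optional[Exception] = None
    for attempt in range(1, attempts + 1):
        try:
            return run(query)
        except Exception as error:
            last_error = error
            print(f"attempt {attempt} failed: {error}")
            time.sleep(delay_s)
    raise RuntimeError(f"all {attempts} attempts failed") from last_error


# Example usage (assumes an executor like the ones built above):
# run_with_retries(multi_tool_executor.run, "How can you live without a bank account?")
```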

We look forward to seeing what you build!