<a href="https://colab.research.google.com/github/anastaszi/GenAI/blob/main/LangChain_Agents.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# LangChain Agents

We've referenced `LangChain` several times throughout this course; this notebook dives deeper into the `Agents` component of the library. We've already discussed the concept of `chains` and how they can be used to architect complex LLM pipelines by chaining multiple models together. However, since these models are incredibly flexible in what they can do (classification, text generation, code generation, etc.), we can integrate them with other systems. For example, we can have a model generate the code to make an API call, write and execute data science code, or query tabular data; the list goes on. We can use `Agents` to interact with all these external systems and execute actions dictated by LLMs. The fundamental idea of an `Agent` is to let the LLM choose an action, or sequence of actions, to take given a set of tools.


This is a pretty general description, so let's take a look at some of the specifics via code.

We'll be using an OpenAI model for this example. Let's first use the `OpenAI` class to instantiate the `llm` object:

In [None]:
from getpass import getpass
OPENAI_API_KEY = getpass()

········


In [None]:
from langchain import OpenAI

llm = OpenAI(
    openai_api_key=OPENAI_API_KEY,
    temperature=0,
    model_name="text-davinci-003"
)

As stated earlier, `Agents` are all about letting LLMs pick the best tool to solve the problem. As part of these agents, we'll need to include a tool, or list of tools, for the LLM to choose from to execute its designated sequence of actions. Within `LangChain` there are a variety of tools that can be used as part of these `Agents`. In this simple example we'll use the calculator tool called `llm-math`, but you can take a look at a comprehensive list of built-in tools [here](https://github.com/langchain-ai/langchain/tree/master/libs/langchain/langchain/tools).

In [None]:
from langchain.agents import load_tools

tools = load_tools(
    ['llm-math'],
    llm=llm
)

Alternatively, you can create your own tools. If we were to create this calculator tool from scratch, it would look something like the following:

In [None]:
# Create a calculator tool
from langchain.chains import LLMMathChain
from langchain.agents import Tool

llm_math = LLMMathChain(llm=llm)

# initialize the math tool
math_tool = Tool(
    name='Calculator',
    func=llm_math.run,
    description='A tool that should be used exclusively for mathematical computation'
)
# when giving tools to the LLM, we must pass them as a list of tools
tools = [math_tool]
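
A tool's `func` does not have to be backed by an LLM chain. As a minimal sketch, assuming `Tool` accepts any callable that takes a string and returns a string, the block below builds an LLM-free calculator using only the standard library. The name `safe_calculate` and the `Answer: ` prefix are illustrative choices, not part of `LangChain`; such a function could be passed as `func=safe_calculate` in place of `llm_math.run`.

```python
import ast
import operator

# Map supported AST operator nodes to plain Python functions.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def _evaluate(node):
    """Recursively evaluate a parsed arithmetic expression."""
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
        return _BINARY_OPS[type(node.op)](_evaluate(node.left), _evaluate(node.right))
    if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
        return _UNARY_OPS[type(node.op)](_evaluate(node.operand))
    # Anything else (names, calls, attributes) is rejected.
    raise ValueError("Unsupported expression")


def safe_calculate(expression: str) -> str:
    """Hypothetical tool function: string in, string out."""
    # Treat the caret as exponentiation, as in the question used in this notebook.
    tree = ast.parse(expression.replace("^", "**"), mode="eval")
    return f"Answer: {_evaluate(tree.body)}"


print(safe_calculate("(2 + 3) * 4"))  # prints "Answer: 20"
```

Because only a small set of arithmetic nodes is evaluated, input such as a function call raises `ValueError` instead of being executed.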

Great! We now have a tool, but a tool on its own isn't much use to us: we need an `Agent`! As of July 2023, there are a few different agent types in `LangChain`:

- `Zero-shot ReAct`
    - Uses the ReAct framework to determine the proper tool. This decision is based purely on each tool's description, so all tools used with this agent must have a description. This is the most general-purpose action agent.
- `Structured input ReAct`
    - This agent is capable of using multi-input tools. Unlike older agents, it can use a tool's argument schema to create a structured action input. This agent can be really useful for complex tool usage.
- `OpenAI Functions`
    - Certain OpenAI models have been specifically fine-tuned to detect when a function should be called and respond with the inputs that should be passed to the function. These are models like gpt-3.5-turbo-0613 and gpt-4-0613. This agent is designed for these specific models and should be used accordingly.
- `Conversational`
    - As its name states, this agent is built to be used in conversational settings. It uses a prompt that's specifically designed for the conversational, text-generation use case. Similar to the agents above, it uses the ReAct framework to decide on the tool, and it uses memory to remember previous conversation interactions.
- `Self ask with search`
    - This agent uses a tool that must be named `Intermediate Answer`. The tool should be one that looks up *factual* answers to questions. This agent is equivalent to the one laid out in [this](https://ofir.io/self-ask.pdf) paper.
- `ReAct document store`
    - Similar to `Zero-shot ReAct`, this agent applies the ReAct framework to a document store.
- `Plan-and-execute agents`
    - Plan-and-execute agents accomplish an objective by first planning what to do, then executing the subtasks. You can read more about the implementation in [this](https://arxiv.org/abs/2305.04091) paper.

In order to work with an `Agent` we'll need to define three things:
1. A large language model that will decide on the path forward given the toolset
2. A tool or tools for the LLM to interact with
3. An agent to control the interaction pattern between the LLM and the tool(s)

For this agent, since it's a simple math problem, we'll use the `Zero-shot ReAct` agent. We'll initialize the object by specifying the `agent` type, `tools`, `llm`, the `verbose` flag (so we can easily see what's going on under the hood), and `max_iterations`.

When interacting with agents, it is really important to set the `max_iterations` parameter because agents can get stuck in infinite loops that consume plenty of tokens. The default value is 15 to allow for many tools and complex reasoning, but for most applications you should keep it much lower.
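
To make the role of the iteration cap concrete, here is a toy agent loop in plain Python. It is only a sketch of the idea and not how `LangChain` implements its executor: `run_agent`, `stuck_llm`, and the stop message are hypothetical names and wording chosen for illustration, and the "LLM" is a stub function that never produces a final answer.

```python
from typing import Callable, Dict, Tuple


def run_agent(
    decide: Callable[[str], Tuple[str, str]],
    tools: Dict[str, Callable[[str], str]],
    question: str,
    max_iterations: int = 3,
) -> str:
    """Toy agent loop: `decide` plays the role of the LLM (hypothetical stub)."""
    scratchpad = question
    for _ in range(max_iterations):
        action, action_input = decide(scratchpad)
        if action == "Final Answer":
            return action_input
        # Run the chosen tool and feed the result back to the "LLM".
        observation = tools[action](action_input)
        scratchpad += f"\nObservation: {observation}"
    # Budget exhausted: stop instead of looping (and spending tokens) forever.
    # The message wording is illustrative only.
    return "Agent stopped due to iteration limit."


def stuck_llm(scratchpad: str) -> Tuple[str, str]:
    # A stub "LLM" that keeps calling the tool and never finishes.
    return ("Calculator", "1+1")


calls = []


def calculator(expression: str) -> str:
    calls.append(expression)  # record every tool call
    return "2"


result = run_agent(stuck_llm, {"Calculator": calculator}, "What is 1+1?", max_iterations=3)
print(result, len(calls))  # the tool is called exactly 3 times before the loop gives up
```

Without the cap, the stub above would call the tool forever; with a real model, every one of those calls would also cost tokens.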

In [None]:
from langchain.agents import initialize_agent

zero_shot_agent = initialize_agent(
    agent="zero-shot-react-description",
    tools=tools,
    llm=llm,
    verbose=True,
    max_iterations=3
)

Let's test the agent!

In [None]:
zero_shot_agent("what is (6.78*3.2)^1.111?")



> Entering new  chain...
I need to calculate the power of a product
Action: Calculator
Action Input: 6.78*3.2^1.111
Observation: Answer: 24.686033827858196
Thought: I now know the final answer
Final Answer: 24.686033827858196

> Finished chain.


{'input': 'what is (6.78*3.2)^1.111?', 'output': '24.686033827858196'}
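
It's worth checking the agent's arithmetic. In the trace above, the `Action Input` sent to the calculator is `6.78*3.2^1.111`, without the parentheses from the question, so the tool evaluates a different expression than the one we asked about. A quick check in plain Python (no `LangChain` involved; the variable names are ours) shows the difference:

```python
# Expression as asked in the question: the whole product is raised to the power.
as_asked = (6.78 * 3.2) ** 1.111

# Expression as sent to the Calculator tool in the trace: only 3.2 is raised to the power.
as_sent = 6.78 * 3.2 ** 1.111

print(f"as asked: {as_asked}")
print(f"as sent:  {as_sent}")
```

The value in the trace matches the second expression, while the parenthesized expression from the question evaluates to roughly 30.5. Reading the verbose output is a cheap way to catch this kind of slip.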

Now that we have this simple agent up and running that can handle math computation, let's add some complexity. In the example below, we'll build an agent that can access the Google Search APIs to answer user queries. To do this, we'll first need to install some dependencies.



*The example below is adapted from [here](https://www.datasciencebyexample.com/2023/05/27/understanding-langchain-chains-and-agents/)*


In [None]:
%pip install google-search-results



In [None]:
from langchain.agents import Tool, AgentExecutor, LLMSingleActionAgent, AgentOutputParser
from langchain.prompts import BaseChatPromptTemplate
from langchain import SerpAPIWrapper, LLMChain
from langchain.chat_models import ChatOpenAI
from typing import List, Union
from langchain.schema import AgentAction, AgentFinish, HumanMessage
import re

For this example you'll need a SerpAPI key. You can create a SerpAPI account for free and generate a key [here](https://serpapi.com/).

In [None]:
SERPAPI_API_KEY = getpass()

········


Once we've defined the SerpAPI key, we'll instantiate the search tool:

In [None]:
# Define which tools the agent can use to answer user queries
search = SerpAPIWrapper(serpapi_api_key=SERPAPI_API_KEY)
tools = [
    Tool(
        name="Search",
        func=search.run,
        description="useful for when you need to answer questions about current events"
    )
]

Generate the prompt that we'll template later:

In [None]:
# Set up the base template
template = """Complete the objective as best you can. You have access to the following tools:

{tools}

Use the following format:

Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [{tool_names}]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question

These were previous tasks you completed:



Begin!

Question: {input}
{agent_scratchpad}"""

With this prompt, we'll need to do some custom parsing to ensure messages are passed in properly. We'll use this `CustomPromptTemplate` class, which extends `LangChain`'s `BaseChatPromptTemplate`, to generate and format the prompt and accompanying messages.

In [None]:
# Set up a prompt template
class CustomPromptTemplate(BaseChatPromptTemplate):
    # The template to use
    template: str
    # The list of tools available
    tools: List[Tool]

    def format_messages(self, **kwargs) -> List[HumanMessage]:
        # Get the intermediate steps (AgentAction, Observation tuples)
        # Format them in a particular way
        intermediate_steps = kwargs.pop("intermediate_steps")
        thoughts = ""
        for action, observation in intermediate_steps:
            thoughts += action.log
            thoughts += f"\nObservation: {observation}\nThought: "
        # Set the agent_scratchpad variable to that value
        kwargs["agent_scratchpad"] = thoughts
        # Create a tools variable from the list of tools provided
        kwargs["tools"] = "\n".join([f"{tool.name}: {tool.description}" for tool in self.tools])
        kwargs["tool_names"] = ", ".join([tool.name for tool in self.tools])
        formatted = self.template.format(**kwargs)
        return [HumanMessage(content=formatted)]
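
To see what ends up in `{agent_scratchpad}`, the loop from `format_messages` can be run on its own. This is a sketch under stated assumptions: `FakeAction` is a hypothetical stand-in for `AgentAction` that carries only the `log` field, and `build_scratchpad` is an illustrative helper name, not a `LangChain` function.

```python
from collections import namedtuple

# Hypothetical stand-in for AgentAction; only the `log` field is used here.
FakeAction = namedtuple("FakeAction", ["log"])


def build_scratchpad(intermediate_steps):
    """Mirror the loop in `format_messages`: append each action log and its observation."""
    thoughts = ""
    for action, observation in intermediate_steps:
        thoughts += action.log
        thoughts += f"\nObservation: {observation}\nThought: "
    return thoughts


steps = [
    (
        FakeAction(log="Thought: I should search.\nAction: Search\nAction Input: first stage"),
        "Clermont-Ferrand",
    ),
]
print(build_scratchpad(steps))
```

Each completed step is replayed to the model, and the trailing `Thought: ` invites it to continue reasoning from the latest observation.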

Generate the prompt using the custom class:

In [None]:
prompt = CustomPromptTemplate(
    template=template,
    tools=tools,
    # This omits the `agent_scratchpad`, `tools`, and `tool_names` variables because those are generated dynamically
    # This includes the `intermediate_steps` variable because that is needed
    input_variables=["input", "intermediate_steps"]
)

Next we'll define a `CustomOutputParser`, which extends the `AgentOutputParser` class. This class is responsible for parsing the LLM output into either an `AgentAction` (the tool to call and its input) or an `AgentFinish` (the final answer).

In [None]:
class CustomOutputParser(AgentOutputParser):

    def parse(self, llm_output: str) -> Union[AgentAction, AgentFinish]:
        # Check if agent should finish
        if "Final Answer:" in llm_output:
            return AgentFinish(
                # Return values is generally always a dictionary with a single `output` key
                # It is not recommended to try anything else at the moment :)
                return_values={"output": llm_output.split("Final Answer:")[-1].strip()},
                log=llm_output,
            )
        # Parse out the action and action input
        regex = r"Action\s*\d*\s*:(.*?)\nAction\s*\d*\s*Input\s*\d*\s*:[\s]*(.*)"
        match = re.search(regex, llm_output, re.DOTALL)
        if not match:
            raise ValueError(f"Could not parse LLM output: `{llm_output}`")
        action = match.group(1).strip()
        action_input = match.group(2)
        # Return the action and action input
        return AgentAction(tool=action, tool_input=action_input.strip(" ").strip('"'), log=llm_output)
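
The parsing logic can be tried out without `LangChain` installed. The sketch below reuses the same regular expression and branches, but returns plain tuples instead of `AgentAction` and `AgentFinish` objects; `parse_step` and the two sample strings are hypothetical and exist only for illustration.

```python
import re

# Same pattern as in CustomOutputParser.parse
ACTION_REGEX = r"Action\s*\d*\s*:(.*?)\nAction\s*\d*\s*Input\s*\d*\s*:[\s]*(.*)"


def parse_step(llm_output: str):
    """Return ("finish", answer) or ("action", tool, tool_input)."""
    if "Final Answer:" in llm_output:
        return ("finish", llm_output.split("Final Answer:")[-1].strip())
    match = re.search(ACTION_REGEX, llm_output, re.DOTALL)
    if not match:
        raise ValueError(f"Could not parse LLM output: `{llm_output}`")
    tool = match.group(1).strip()
    # Strip surrounding spaces and double quotes from the tool input.
    tool_input = match.group(2).strip(" ").strip('"')
    return ("action", tool, tool_input)


sample_action = (
    "Thought: I need to search.\n"
    "Action: Search\n"
    'Action Input: "Tour De France Femmes first stage location"'
)
sample_finish = "I now know the final answer.\nFinal Answer: Clermont-Ferrand"

print(parse_step(sample_action))  # ('action', 'Search', 'Tour De France Femmes first stage location')
print(parse_step(sample_finish))  # ('finish', 'Clermont-Ferrand')
```

Output that contains neither a `Final Answer:` marker nor an `Action:` / `Action Input:` pair raises `ValueError`, which is how the parser signals that the model drifted from the format requested in the prompt.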

Instantiate an instance of the class:

In [None]:
output_parser = CustomOutputParser()

Similar to other notebooks, we'll be using one of `OpenAI`'s chat models for this agent. In order to use this model, we'll need to pass a valid API key (via `getpass()`)

In [None]:
OPENAI_API_KEY = getpass()

········


Create the LLM that we'll feed to the agent

In [None]:
llm = ChatOpenAI(openai_api_key=OPENAI_API_KEY, temperature=0)

Chain together the LLM and the prompt

In [None]:
# LLM chain consisting of the LLM and a prompt
llm_chain = LLMChain(llm=llm, prompt=prompt)

Here we're using the `LLMSingleActionAgent`, since it's the base class for any single-action agent. Since this process only involves a Google search, we don't need to worry about multi-action agents.

In [None]:
tool_names = [tool.name for tool in tools]
agent = LLMSingleActionAgent(
    llm_chain=llm_chain,
    output_parser=output_parser,
    stop=["\nObservation:"],
    allowed_tools=tool_names
)
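
The `stop=["\nObservation:"]` argument deserves a note. Our understanding is that the stop sequence is passed along with the model call so that generation ends before the model can write its own `Observation:` line, leaving the real observation to come from the tool. The effect can be sketched in plain Python; `truncate_at_stop` is a hypothetical helper, not a `LangChain` function, and the actual cut-off is assumed to happen on the model side.

```python
from typing import List


def truncate_at_stop(text: str, stop: List[str]) -> str:
    """Cut `text` at the earliest occurrence of any stop sequence."""
    cut = len(text)
    for sequence in stop:
        index = text.find(sequence)
        if index != -1:
            cut = min(cut, index)
    return text[:cut]


# What a model might produce if nothing stopped it from inventing an observation.
raw = (
    "Thought: I need to search.\n"
    "Action: Search\n"
    "Action Input: first stage location\n"
    "Observation: Paris (made up by the model)\n"
)
print(truncate_at_stop(raw, ["\nObservation:"]))
```

Only the thought, action, and action input survive, which is the part the output parser needs.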

Invoke the agent and see what happens!

In [None]:
agent_executor = AgentExecutor.from_agent_and_tools(agent=agent, tools=tools, verbose=True)
agent_executor.run("Where is the first stage of the Tour De France Femmes?")



> Entering new  chain...
Thought: I need to find the location of the first stage of the Tour De France Femmes.
Action: Search
Action Input: "Tour De France Femmes first stage location"

Observation: Clermont-Ferrand
I now know the location of the first stage of the Tour De France Femmes.
Final Answer: The first stage of the Tour De France Femmes is in Clermont-Ferrand.

> Finished chain.


'The first stage of the Tour De France Femmes is in Clermont-Ferrand.'

![tour_de_femmes](../images/tour_de_femmes.png)