# Multi Input Tools

This notebook shows how to use a tool that requires multiple inputs with an agent.

The difficulty in doing so comes from the fact that an agent decides its next step from a language model, which outputs a string. So if that step requires multiple inputs, they need to either be supplied one at a time or parsed from a single string. The first currently supported way to do this is to write a small wrapper function that parses a string into multiple inputs.

For a concrete example, let's give an agent access to a multiplication function that takes two integers as input. To use it, we will tell the agent to generate the "Action Input" as a comma-separated list of length two. We will then write a thin wrapper that takes a string, splits it in two around the comma, and passes both sides, parsed as integers, to the multiplication function.

In [1]:
from langchain.llms import OpenAI
from langchain.agents import initialize_agent, Tool

Here is the multiplication function, as well as a wrapper to parse a string as input.

In [2]:
def multiplier(a, b):
    return a * b

def parsing_multiplier(string):
    a, b = string.split(",")
    return multiplier(int(a), int(b))

In [3]:
llm = OpenAI(temperature=0)
tools = [
    Tool(
        name = "Multiplier",
        func=parsing_multiplier,
        description="useful for when you need to multiply two numbers together. The input to this tool should be a comma separated list of numbers of length two, representing the two numbers you want to multiply together. For example, `1,2` would be the input if you wanted to multiply 1 by 2."
    )
]
mrkl = initialize_agent(tools, llm, agent="zero-shot-react-description", verbose=True)

In [4]:
mrkl.run("What is 3 times 4")



> Entering new AgentExecutor chain...
 I need to multiply two numbers
Action: Multiplier
Action Input: 3,4
Observation: 12
Thought: I now know the final answer
Final Answer: 3 times 4 is 12

> Finished chain.


'3 times 4 is 12'
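
The wrapper in this example assumes the model emits exactly two integers separated by a single comma. As a rough sketch, a more defensive variant could tolerate stray whitespace and quotes, and return an error string as the observation instead of raising, so the agent has a chance to retry. The name `safe_parsing_multiplier` is hypothetical and not part of LangChain:

```python
def multiplier(a, b):
    return a * b


def safe_parsing_multiplier(string):
    """Hypothetical defensive variant of parsing_multiplier."""
    # Drop whitespace and any quotes the model may wrap around the input.
    parts = [p.strip().strip("'\"`") for p in string.split(",")]
    if len(parts) != 2:
        return f"Error: expected two comma-separated integers, got {string!r}"
    try:
        a, b = (int(p) for p in parts)
    except ValueError:
        return f"Error: could not parse integers from {string!r}"
    return multiplier(a, b)


print(safe_parsing_multiplier(' "3, 4" '))  # 12
```

Returning the error as text rather than raising is a design choice for this sketch: the message becomes the next observation, which the model can read and correct.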

## Getting inputs one at a time

The second supported way is to retrieve multiple inputs, one at a time, using `GetMultipleOutputsChain`. This does away with the need for parsing the single input string, which can get complicated if the output includes code or other lengthy text. However, it is more expensive because it requires a separate LLM request for every input.
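
To make the cost trade-off concrete, here is a minimal sketch of the one-request-per-variable idea. It is not how `GetMultipleOutputsChain` is implemented; `collect_inputs` and `fake_complete` are hypothetical names, and the stub stands in for a real LLM call that takes a prompt and a list of stop strings:

```python
from typing import Callable, Dict, List


def collect_inputs(
    prefix: str,
    variables: Dict[str, str],
    complete: Callable[[str, List[str]], str],
) -> Dict[str, str]:
    """Ask for each variable in turn, growing the prompt as answers arrive."""
    prompt = prefix
    values: Dict[str, str] = {}
    for key, display in variables.items():
        prompt += f"{display}:"
        answer = complete(prompt, ["\n"])  # one request per variable
        values[key] = answer
        prompt += answer + "\n"
    return values


# Stub "LLM" that replays canned answers and records every prompt it receives.
calls: List[str] = []
canned = iter([" 3", " 4"])


def fake_complete(prompt: str, stop: List[str]) -> str:
    calls.append(prompt)
    return next(canned)


result = collect_inputs(
    "Action: Multiplier\n",
    {"a": "First number to multiply", "b": "Second number to multiply"},
    fake_complete,
)
print(result)      # {'a': ' 3', 'b': ' 4'}
print(len(calls))  # 2
```

Each variable costs one call, and every later call sees the earlier answers in its prompt, so no single string ever has to be split apart afterwards.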

Here is the above multiplier example, using `GetMultipleOutputsChain` for a single run of the loop:

In [5]:
from langchain.chains.multiple_outputs import GetMultipleOutputsChain

def multiplier_str(a: str, b: str):
    return int(a) * int(b)

tool = Tool(
    name="Multiplier",
    func=multiplier_str,
    description="useful for when you need to multiply two numbers together",
)

dummy_agent_prompt = f"""
Answer the following questions as best you can. You have access to the following tools:

{tool.name}: {tool.description}

Use the following format:

Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [{tool.name}]
<Action Inputs>: the inputs to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question

Begin!

Question: What is 3 times 4
Thought: I need to multiply two numbers
Action: Multiplier
""".lstrip()

multiplier_inputs_chain = GetMultipleOutputsChain(
    llm=llm,
    prefix=dummy_agent_prompt,
    variables={
        "a": "First number to multiply",
        "b": "Second number to multiply",
    },
)

multiplier_inputs = multiplier_inputs_chain({})
print(multiplier_inputs)

{'a': ' 3', 'b': ' 4'}


Now we can call the tool directly from the LLM outputs without the use of a wrapper function, so long as the tool takes in only strings:

In [6]:
tool.func(**multiplier_inputs)

12
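
Calling a tool with `**` unpacking only works when the keys produced by the model match the function's parameter names. A small sketch of checking this up front, using the standard library's `inspect.signature`; the helper `call_with_inputs` is a hypothetical name introduced for illustration:

```python
import inspect


def multiplier_str(a: str, b: str):
    return int(a) * int(b)


def call_with_inputs(func, inputs):
    """Call func(**inputs) after checking the keys match its parameters."""
    expected = set(inspect.signature(func).parameters)
    if set(inputs) != expected:
        raise ValueError(
            f"expected inputs {sorted(expected)}, got {sorted(inputs)}"
        )
    return func(**inputs)


print(call_with_inputs(multiplier_str, {"a": " 3", "b": " 4"}))  # 12
```

The leading spaces in the values are harmless here because `int()` ignores surrounding whitespace.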

As of now, the default `Agent` and `AgentExecutor` do not support inputs to tools in this manner. As such, to actually make use of the above tool, we'll have to implement our own custom `AgentExecutor`.

## Customizing AgentExecutor to use multi-input tooling

First, let's subclass `Tool` so that each tool can describe what variables it expects:

In [7]:
from dataclasses import dataclass
from typing import List

from langchain.chains.multiple_outputs import VariableConfig


@dataclass(kw_only=True)
class CustomTool(Tool):
    variable_configs: List[VariableConfig]


def read_file(filename):  # dummy read file function  
    if filename.strip().strip('"') == "script.py":
        return """File contents are:

```
words = ["Hello", "World!"]
print(" ".join(words))
```
""".lstrip()
    return "No such file exists"
        

read_tool = CustomTool(
    name="Read File",
    func=read_file,
    variable_configs=[
        VariableConfig(display="File to read", output_key="filename"),
    ],
    description=(
        "useful for when you need to read the contents of a file. The sole input "
        "is the path to the file to be read."
    ),
)
assert read_tool.variable_configs is not None


def write_file(filename, contents):  # dummy write file function  
    return "File written successfully"


write_tool = CustomTool(
    name="Write File",
    func=write_file,
    variable_configs=[
        VariableConfig(display="File to write", output_key="filename"),
        VariableConfig.for_code(display="File contents", output_key="contents"),
    ],
    description=(
        "useful for when you need to write contents to a file. The inputs are "
        "the path to the file to write to, and the contents to write to that file. "
        "Enclose the file contents with ```"
    ),
)

Because actions now take varying numbers of inputs with varying names, it is harder for the LLM to produce, and for us to parse, the tool name and all of its inputs in one go as we were doing before. To keep the loop simple and consistent, let's add a dummy "Finished" tool that breaks out of the loop.

(In reality, this is in a sense generalizing the concept of tool usage to the overarching concept of decision making. Now, "answering the question" or "finishing the task" becomes a first-class decision just as much as "using a tool" is, and "tool input" generalizes to "information associated with the execution of this decision." This is not a "dummy" tool so much as it is a legitimate decision for the LLM to make.)

In [8]:
def finished_fn() -> str:
    return "Dummy return"


finished_tool = CustomTool(
    name="Finished",
    func=finished_fn,
    variable_configs=[],
    description=(
        "useful for when you have finished the requested task. There are no inputs."
    ),
)
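
The idea of treating "finishing" as just another decision can be sketched without any LangChain classes. In the sketch below, `SimpleTool`, `run_loop`, and the scripted `decide` callable are all hypothetical stand-ins; a real agent would ask the LLM for each decision:

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class SimpleTool:  # hypothetical stand-in for CustomTool
    name: str
    func: Callable[..., str]


def run_loop(
    decide: Callable[[List[Tuple[str, str]]], Tuple[str, Dict[str, str]]],
    tools: Dict[str, SimpleTool],
    finish_name: str = "Finished",
    max_iterations: int = 4,
) -> List[Tuple[str, str]]:
    """Run decisions until the finishing tool is chosen or the budget runs out."""
    history: List[Tuple[str, str]] = []
    for _ in range(max_iterations):
        name, inputs = decide(history)
        if name == finish_name:  # finishing is a decision like any other
            break
        observation = tools[name].func(**inputs)
        history.append((name, observation))
    return history


tools = {
    "Read File": SimpleTool("Read File", lambda filename: f"contents of {filename}"),
    "Finished": SimpleTool("Finished", lambda: "Dummy return"),
}

# Scripted decisions standing in for the LLM.
script = iter([("Read File", {"filename": "script.py"}), ("Finished", {})])
history = run_loop(lambda _history: next(script), tools)
print(history)  # [('Read File', 'contents of script.py')]
```

Because every step has the same shape (a tool name plus a dictionary of named inputs), the loop needs no special parsing path for the final answer.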

Then, let's extend `AgentAction` with a `tool_inputs` field so that the LLM acting as an agent can actually use each of these tools:

In [9]:
from collections import namedtuple
from typing import Dict

from langchain.schema import AgentAction
from langchain.agents.step import StepOutput


CustomAction = namedtuple("CustomAction", AgentAction._fields + ("tool_inputs",))
# tool_inputs: Dict[str, str]

# test that we can construct a CustomAction successfully
dummy_action = CustomAction(
    tool="dummy tool",
    tool_input="dummy input",
    tool_inputs={"input": "real inputs"},
    log="dummy log",
)
# and that pydantic doesn't complain about the corresponding StepOutput
dummy_output = StepOutput.construct(decision=dummy_action, observation="foo")
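
The field-concatenation trick can be tried in isolation. This sketch uses a stand-in `BaseAction` namedtuple instead of LangChain's `AgentAction`; its field names are an assumption inferred from the keyword arguments used in this notebook:

```python
from collections import namedtuple

# Stand-in for langchain's AgentAction; the field names are an assumption.
BaseAction = namedtuple("BaseAction", ["tool", "tool_input", "log"])

# Append a field by concatenating the tuples of field names.
ExtendedAction = namedtuple("ExtendedAction", BaseAction._fields + ("tool_inputs",))

action = ExtendedAction(
    tool="Write File",
    tool_input="dummy",
    log="Thought: I need to write the file",
    tool_inputs={"filename": "script.py", "contents": "print('hi')"},
)
print(action._fields)                  # ('tool', 'tool_input', 'log', 'tool_inputs')
print(isinstance(action, BaseAction))  # False
```

Note that a namedtuple built this way is a new, unrelated type rather than a true subclass, so `isinstance` checks against the original type fail. That is plausibly why the notebook bypasses validation with `.construct`.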

Now, let's subclass `ZeroShotAgent` to produce this `CustomAction` as output during the planning phase:

In [18]:
from collections import OrderedDict
from typing import Any, Tuple

from langchain.agents.mrkl.base import ZeroShotAgent
from langchain.agents.agent import Agent
from langchain.agents.memory import BaseAgentMemory


PREFIX = """Finish the following task as best you can. You have access to the following tools:"""

FORMAT_INSTRUCTIONS = """
Use the following format:

Task: the task you must complete
Thought: you should always think about what to do
Action: the action to take; should be one of [{tool_names}].
<Action Inputs>: the inputs to the action, enclosed by quotes
Observation: the result of the action
... (this Thought/Action/Action Inputs/Observation can repeat N times)
Thought: I am now finished with the task
Action: Finished
""".strip()

SUFFIX = """Begin!

Task: {input}
{agent_scratchpad}"""


class CustomAgent(ZeroShotAgent):
    tools: Dict[str, Tool] = {}
    
    @classmethod
    def from_llm_and_tools(
        cls,
        tools: List[Tool],
        **kwargs: Any,
    ) -> Agent:
        custom_agent = super().from_llm_and_tools(tools=tools, **kwargs)
        custom_agent.tools = {tool.name: tool for tool in tools}
        return custom_agent
    
    @property
    def llm_prefix(self) -> str:
        return ""
    
    @property
    def finish_tool_name(self) -> str:
        """Name of the tool to use to finish the chain."""
        return "Finished"
    
    def get_full_inputs(
        self, intermediate_steps: List[Tuple[AgentAction, str]], **kwargs: Any
    ) -> Dict[str, Any]:
        # like ZeroShotAgent.get_full_inputs, but with no stop
        thoughts = self._construct_scratchpad(intermediate_steps)
        new_inputs = {"agent_scratchpad": thoughts}
        full_inputs = {**kwargs, **new_inputs}
        return full_inputs

    def _get_next_action(self, full_inputs: Dict[str, str]) -> AgentAction:
        prefix = self.llm_chain.prompt.format(**full_inputs)
        #print(prefix)
        thought_and_tool_chain = GetMultipleOutputsChain(
            llm=self.llm_chain.llm,
            prefix=prefix,
            variables=OrderedDict(
                thought="Thought",
                action="Action",
            )
        )
        thought_and_tool = thought_and_tool_chain({})
        #print(thought_and_tool)
        tool = self.tools[thought_and_tool["action"].strip()]
        log = "Thought: " + thought_and_tool_chain.log()

        if len(tool.variable_configs) == 0:
            tool_inputs = {}
        else:
            tool_inputs_chain = GetMultipleOutputsChain(
                llm=self.llm_chain.llm,
                prefix=prefix + log,
                variable_configs=tool.variable_configs,
            )
            tool_inputs = tool_inputs_chain({})
            #print(tool_inputs)
            log += "\n" + tool.variable_configs[0].prompt + ": " + tool_inputs_chain.log()

        return CustomAction(
            tool=tool.name,
            tool_input="dummy",
            tool_inputs=tool_inputs,
            log=log,
        )

And we subclass `AgentExecutor` to use our custom tools with this `CustomAction` as input during the observation feedback loop:

In [15]:
from langchain.schema import AgentFinish
from langchain.agents.agent import Agent, AgentExecutor
from langchain.agents.step import StepOutput


class CustomAgentExecutor(AgentExecutor):
    def _take_next_step(self, memory: BaseAgentMemory) -> StepOutput:
        """Take a single step in the thought-action-observation loop.

        Override this to take control of how the agent makes and acts on choices.
        """
        # Call the LLM to see what to do.
        output = self.agent.plan_structured(memory)
        # If the tool chosen is the finishing tool, then we end and return.
        if isinstance(output, AgentFinish):
            return StepOutput(decision=output)
        tool = self.name_to_tool_map[output.tool]
        self.callback_manager.on_tool_start(
            {"name": str(tool.func)[:60] + "..."},
            output,
            color="green",
            verbose=self.verbose,
        )
        observation = tool.func(**output.tool_inputs)
        color = self.color_mapping[output.tool]
        self.callback_manager.on_tool_end(
            observation,
            color=color,
            observation_prefix=self.agent.observation_prefix,
            llm_prefix=self.agent.llm_prefix,
            verbose=self.verbose,
        )
        # .construct because pydantic doesn't like us subclassing AgentAction
        return StepOutput.construct(decision=output, observation=observation)

Let's try running it now:

In [20]:
custom_tools = [read_tool, write_tool, finished_tool]

custom_agent = CustomAgent.from_llm_and_tools(
    llm=llm,
    tools=custom_tools,
    prefix=PREFIX,
    format_instructions=FORMAT_INSTRUCTIONS,
    suffix=SUFFIX,
    verbose=True,
)

custom_executor = CustomAgentExecutor.from_agent_and_tools(
    agent=custom_agent,
    tools=custom_tools,
    max_iterations=4,
    verbose=True,
)

custom_executor.run(
    "Ensure the file `script.py` has the shebang line \"#!/usr/bin/env python3\" at the top."
)



> Entering new CustomAgentExecutor chain...
Thought: I need to read the contents of the file and add the shebang line if it is not already present.
Action:  Read File
File to read: "script.py"
Observation: File contents are:

```
words = ["Hello", "World!"]
print(" ".join(words))
```
Thought: I need to check if the shebang line is present
Action:  Read File
File to read: "script.py"
Observation: File contents are:

```
words = ["Hello", "World!"]
print(" ".join(words))
```
Thought: The shebang line is not present, so I need to add it
Action:  Write File
File to write: "script.py"
File contents: ```
#!/usr/bin/env python3
words = ["Hello", "World!"]
print(" ".join(words))
```
Observation: File written successfully
Thought: I am now finished with the task
Action:  Finished

> Finished chain.


'dummy'

Note that because the agent reads from a file, the observation can contain arbitrary text, which makes it hard to write a regex that reliably parses the output. Using `VariableConfig` with `GetMultipleOutputsChain` lets you experiment with different prompting styles and stop sequences for each variable without having to write and maintain a parser.
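
As an illustration of why per-variable stops help, the sketch below trims a raw completion differently for a one-line value and for a fenced code value. `VarSpec` and `cut_at_stop` are hypothetical and only loosely modeled on the idea behind `VariableConfig`; they do not reflect its actual fields or behavior:

```python
from dataclasses import dataclass
from typing import Optional

FENCE = "`" * 3  # a Markdown code fence, built up to keep this example readable


@dataclass
class VarSpec:  # hypothetical, not LangChain's VariableConfig
    display: str
    output_key: str
    stop: str = "\n"              # text that marks the end of the value
    opener: Optional[str] = None  # text that must precede the value, if any


def cut_at_stop(raw: str, spec: VarSpec) -> str:
    """Trim a raw completion down to the value for one variable."""
    text = raw
    if spec.opener is not None and spec.opener in text:
        # Skip everything up to and including the opener (e.g. a code fence).
        text = text.split(spec.opener, 1)[1]
    value = text.split(spec.stop, 1)[0]
    return value.strip()


filename_spec = VarSpec(display="File to write", output_key="filename")
code_spec = VarSpec(
    display="File contents", output_key="contents", stop=FENCE, opener=FENCE
)

filename = cut_at_stop(' "script.py"\nFile contents: more text', filename_spec)
contents = cut_at_stop(
    f' {FENCE}\nprint("hi")\nprint("bye")\n{FENCE}\nObservation:', code_spec
)
print(filename)  # "script.py"
print(contents)
```

A newline stop would truncate multi-line code after its first line, while a fence stop lets the value span several lines. Making that choice per variable is what removes the need for one parser that handles every case.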