# Filtering Tool Output

Tools are interfaces that an agent uses to call functions. If you're building a chatbot on top of an agent, you may want to surface additional information in the UI without presenting it to the agent. For instance, if your tool returns a lengthy dataframe, you may have to truncate it for the agent, but you can still show the full value to the user in the app.

This notebook shows how to do this by nesting a runnable call inside the tool and reading that call's full output from a callback handler.
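Before bringing in any framework, the core idea can be sketched in plain Python. This is a minimal, library-free illustration rather than LangChain code: `load_rows`, `get_rows_tool`, and the `on_full_output` hook are hypothetical names invented for this sketch. The tool hands its full result to a listener (standing in for the UI) and returns only a short preview (standing in for what the agent sees).

```python
from typing import Callable, Dict, List, Optional

Row = Dict[str, int]


def load_rows() -> List[Row]:
    """Hypothetical data source standing in for a large dataframe."""
    return [{"id": i, "value": i * i} for i in range(10)]


def get_rows_tool(
    on_full_output: Optional[Callable[[List[Row]], None]] = None,
    preview: int = 3,
) -> str:
    """Return a short preview, and send the full data to the listener if one is given."""
    rows = load_rows()
    if on_full_output is not None:
        # The listener (for example, the app UI) receives every row.
        on_full_output(rows)
    # The caller (for example, the agent) only receives the first few rows.
    return "\n".join(str(row) for row in rows[:preview])


seen_by_ui: List[Row] = []
seen_by_agent = get_rows_tool(on_full_output=seen_by_ui.extend)
```

In the rest of this notebook, LangChain callbacks play the role of the `on_full_output` hook.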


## Prerequisites

This example uses LangSmith and OpenAI. Please make sure the dependencies and API keys are configured appropriately before continuing.

In [None]:
# %pip install -U langchain openai scikit-learn pandas tabulate

In [None]:
import os

os.environ["LANGCHAIN_API_KEY"] = "<your api key>"
os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_PROJECT"] = "Filtering Tool Output"
# os.environ["OPENAI_API_KEY"] = "<your openai api key>"

## 1. Define tool

We will show how to properly structure your nested calls both when using the decorator and when subclassing the `BaseTool` class.

### Option 1: Using the @tool decorator

The `@tool` decorator is the most concise way to define a LangChain tool. To propagate callbacks through the tool function, include a `callbacks` parameter in the decorated function's signature and pass it on to the nested call. Below is an example.

In [None]:
import pandas as pd
import sklearn.datasets
from langchain.callbacks.manager import Callbacks
from langchain.schema.runnable import RunnableLambda
from langchain.tools import tool


def load_dataframe(dataframe_name: str) -> dict:
    if dataframe_name == "iris":
        df = pd.DataFrame(sklearn.datasets.load_iris()["data"])
    else:
        raise ValueError(
            f"Dataframe with name '{dataframe_name}' not found. Please try again"
        )
    return {
        # We will expose the raw_output in our callback handler
        "raw_output": df.to_json(),
        "dataframe": df,
    }


load_dataframe_runnable = RunnableLambda(load_dataframe)


@tool
def get_dataframe(dataframe_name: str, callbacks: Callbacks = None) -> str:
    """Gets the dataframe"""
    # This will register an `on_chain_{start,end}` event where the `end` call contains the full dataframe
    results = load_dataframe_runnable.invoke(dataframe_name, {"callbacks": callbacks})
    # The agent will only see the head of the dataframe
    return results["dataframe"].head().to_markdown()

### Option 2: Subclass BaseTool

If you directly subclass `BaseTool`, you have more control over the class behavior and state management. You can choose to propagate callbacks by accepting a "run_manager" argument in your `_run` method.

Below is an equivalent definition of our tool using inheritance.

In [None]:
from typing import Optional

from langchain.callbacks.manager import (
    AsyncCallbackManagerForToolRun,
    CallbackManagerForToolRun,
)
from langchain.tools import BaseTool


class GetDataFrame(BaseTool):
    name: str = "get_dataframe"
    description: str = "Get the dataframe head"

    def _run(
        self,
        dataframe_name: str,
        run_manager: Optional[CallbackManagerForToolRun] = None,
    ) -> str:
        """Use the tool."""
        callbacks = run_manager.get_child() if run_manager else None
        # This will register an `on_chain_{start,end}` event where the `end` call contains the full dataframe
        results = load_dataframe_runnable.invoke(
            dataframe_name, {"callbacks": callbacks}
        )
        # The agent will only see the head of the dataframe
        return results["dataframe"].head().to_markdown()

    async def _arun(
        self,
        dataframe_name: str,
        run_manager: Optional[AsyncCallbackManagerForToolRun] = None,
    ) -> str:
        """Use the tool asynchronously."""
        callbacks = run_manager.get_child() if run_manager else None
        # This will register an `on_chain_{start,end}` event where the `end` call contains the full dataframe
        results = await load_dataframe_runnable.ainvoke(
            dataframe_name, {"callbacks": callbacks}
        )
        # The agent will only see the head of the dataframe
        return results["dataframe"].head().to_markdown()

## 2. Define agent

We will construct a simple agent using runnables and OpenAI functions, following the [Agents overview](https://python.langchain.com/docs/modules/agents/) in the LangChain documentation. The specifics aren't important to this example.

In [None]:
from langchain.prompts import ChatPromptTemplate, MessagesPlaceholder

prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "You are a very powerful assistant."),
        ("user", "{input}"),
        MessagesPlaceholder(variable_name="agent_scratchpad"),
    ]
)

In [None]:
from langchain.chat_models import ChatOpenAI
from langchain.tools.render import format_tool_to_openai_function

# Configure the LLM with access to the appropriate function definitions
llm = ChatOpenAI(temperature=0)
tools = [
    GetDataFrame()  # Could alternatively use the decorator-based method: get_dataframe
]
llm_with_tools = llm.bind(functions=[format_tool_to_openai_function(t) for t in tools])

In [None]:
from langchain.agents import AgentExecutor
from langchain.agents.format_scratchpad import format_to_openai_functions
from langchain.agents.output_parsers import OpenAIFunctionsAgentOutputParser

agent = (
    {
        "input": lambda x: x["input"],
        "agent_scratchpad": lambda x: format_to_openai_functions(
            x["intermediate_steps"]
        ),
        "chat_history": lambda x: x.get("chat_history") or [],
    }
    | prompt
    | llm_with_tools
    | OpenAIFunctionsAgentOutputParser()
).with_config(run_name="Agent")

agent_executor = AgentExecutor(agent=agent, tools=tools)

## 3. Create Callback Handler

Now we will create a custom callback handler to surface the events.

In [None]:
from langchain.callbacks.base import BaseCallbackHandler

In [None]:
from typing import Any, Dict, List, Optional
from uuid import UUID

from IPython.display import display


class MyToolCallbackHandler(BaseCallbackHandler):
    def __init__(self):
        self._run_map = {}

    def on_chain_start(
        self,
        serialized: Dict[str, Any],
        inputs: Dict[str, Any],
        *,
        run_id: UUID,
        tags: Optional[List[str]] = None,
        parent_run_id: Optional[UUID] = None,
        metadata: Optional[Dict[str, Any]] = None,
        run_type: Optional[str] = None,
        name: Optional[str] = None,
        **kwargs: Any,
    ) -> None:
        if name == "load_dataframe":
            self._run_map[run_id] = {"inputs": inputs, "name": name}

    def on_chain_end(
        self,
        outputs: Dict[str, Any],
        *,
        run_id: UUID,
        **kwargs: Any,
    ) -> None:
        if run_id in self._run_map:
            df = pd.read_json(outputs["raw_output"])
            # Display the full dataframe in the Jupyter notebook UI
            display(df)

## 4. Invoke

Now it's time to call the agent. All callbacks, including your custom callback handler, will be passed to the nested runnable, but only the filtered result will be shown to the agent.

In [None]:
agent_executor.invoke(
    {"input": "What's the first row of the iris dataset?"},
    {"callbacks": [MyToolCallbackHandler()]},
)

## Conclusion

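In this example, you created a tool in two ways: using the decorator and by subclassing `BaseTool`. You configured the callbacks so the nested call to the `load_dataframe` runnable is traced correctly, and you used a custom callback handler to surface its full output while the agent saw only the head of the dataframe.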

LangSmith uses LangChain's callbacks system to trace the execution of your application. To trace nested components, the callbacks have to be passed to that component. Any time you see a trace show up on the top level when it ought to be nested, it's likely that somewhere the callbacks weren't correctly passed between components.

Passing callbacks is straightforward when you compose functions and other calls as runnables (i.e., with the [LangChain Expression Language](https://python.langchain.com/docs/expression_language/)).
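To make the point about nesting concrete, here is a toy model of what happens when parent context is dropped. It is a simplified illustration, not LangChain's or LangSmith's actual implementation: `ToyTracer`, `nested_step`, `tool_passing_context`, and `tool_dropping_context` are hypothetical names, and the parent is passed as a plain string where LangChain would pass callbacks.

```python
from typing import List, Optional, Tuple


class ToyTracer:
    """Stand-in for a tracer: stores (run name, parent run name) pairs."""

    def __init__(self) -> None:
        self.runs: List[Tuple[str, Optional[str]]] = []

    def record(self, name: str, parent: Optional[str] = None) -> None:
        self.runs.append((name, parent))

    def top_level(self) -> List[str]:
        """Names of runs that were recorded without a parent."""
        return [name for name, parent in self.runs if parent is None]


def nested_step(tracer: ToyTracer, parent: Optional[str] = None) -> None:
    # The nested step only knows its parent if the caller passes it along.
    tracer.record("nested_step", parent)


def tool_passing_context(tracer: ToyTracer) -> None:
    tracer.record("tool_passing_context")
    nested_step(tracer, parent="tool_passing_context")


def tool_dropping_context(tracer: ToyTracer) -> None:
    tracer.record("tool_dropping_context")
    nested_step(tracer)  # parent not passed: recorded as its own top-level run


tracer = ToyTracer()
tool_passing_context(tracer)
tool_dropping_context(tracer)
top = tracer.top_level()
```

Here `top` contains both tools plus one stray `nested_step`, which corresponds to a nested run appearing as a separate top-level trace.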