# Building Custom LangChain Tools for Efficient Data Processing
This notebook demonstrates how to build custom LangChain tools and combine them in an agent to process text. The example is an insurance claim document, but the same approach applies to other textual data. You will create tools for damage detection, cost calculation, and online price checking, then integrate them into an agent that decides which tool to call.


## Initial Setup: Importing Libraries and Defining Tools
Begin by importing the necessary libraries, then define three specialized tools:
- `damage_detector`: Lists the damaged items described in a text, with the extent of damage where stated.
- `calculator`: Estimates the total cost of the damaged items.
- `online_checker`: Searches the web to check whether a stated price is in a reasonable range.
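
In LangChain, the `@tool` decorator takes the tool's name from the function name and its description from the docstring, and the model reads that description when deciding which tool to call. The block below is a simplified stand-in built on the standard library only, not LangChain's implementation; `toy_tool` and `ToolSpec` are hypothetical names used to show why a clear docstring matters.

```python
import inspect
from dataclasses import dataclass
from typing import Callable


@dataclass
class ToolSpec:
    """Hypothetical container for the metadata a model would see."""

    name: str
    description: str
    func: Callable[[str], str]


def toy_tool(func: Callable[[str], str]) -> ToolSpec:
    """Wrap a function, deriving name and description from the function itself."""
    description = inspect.getdoc(func) or ""
    if not description:
        raise ValueError(f"{func.__name__} needs a docstring to act as its description")
    return ToolSpec(name=func.__name__, description=description, func=func)


@toy_tool
def damage_detector(text: str) -> str:
    """Identifies and lists damaged items from a text description."""
    # Placeholder body: the notebook's real tool sends the text to an LLM.
    return f"(placeholder) would analyze: {text}"


print(damage_detector.name)
print(damage_detector.description)
```

Because the description is the only hint the model gets about a tool's purpose, a typo or vague wording in the docstring can affect tool selection.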



In [1]:
import os

from dotenv import load_dotenv, find_dotenv
from langchain.agents import AgentExecutor
from langchain.agents import tool
from langchain.agents.format_scratchpad.openai_tools import (
    format_to_openai_tool_messages,
)
from langchain.agents.output_parsers.openai_tools import (
    OpenAIToolsAgentOutputParser,
)
from langchain_community.document_loaders import PyPDFLoader
from langchain_community.tools.ddg_search import DuckDuckGoSearchRun
from langchain_community.tools.ddg_search.tool import DuckDuckGoSearchResults
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.prompts import PromptTemplate
from langchain_openai import ChatOpenAI
from pyprojroot import here

In [2]:
@tool
def damage_detector(text: str) -> str:
    """
    Identifies and lists damaged items from a text description,
    including the extent of damage when available.
    """

    template = (
        "text: {text} \n\n Instruction: "
        "Create a list of all damaged objects. Indicate, if applicable, "
        "the extent of the damage for each item."
    )

    prompt = PromptTemplate(
        input_variables=["text"],
        template=template,
    )

    summary_chain = prompt | ChatOpenAI(
        openai_api_key=api_key,
        temperature=0,
        model_name="gpt-4",
    )

    summary = summary_chain.invoke({"text": text})

    return summary.content

In [3]:
@tool
def calculator(text: str) -> str:
    """
    Estimates the total cost of the damaged items described in the text,
    based on the values stated for each item.
    """

    template = (
        "text: {text} \n\n Instruction: Calculate the total costs "
        "of the damaged objects."
    )

    prompt = PromptTemplate(
        input_variables=["text"],
        template=template,
    )

    calculator_chain = prompt | ChatOpenAI(
        openai_api_key=api_key,
        temperature=0,
        model_name="gpt-4",
    )

    cost = calculator_chain.invoke({"text": text})

    return cost.content
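
The `calculator` tool delegates the arithmetic to the language model, so its total is worth cross-checking. A minimal deterministic sketch, assuming amounts always appear as `EUR <number>` as they do in the sample report; `sum_eur_amounts` and `AMOUNT_PATTERN` are hypothetical names and are not part of the notebook:

```python
import re
from decimal import Decimal

# Matches amounts written like "EUR 200" or "EUR 1200.50" (assumed format).
AMOUNT_PATTERN = re.compile(r"EUR\s*(\d+(?:\.\d+)?)")


def sum_eur_amounts(text: str) -> Decimal:
    """Add up every 'EUR <number>' amount found in the text."""
    return sum((Decimal(m) for m in AMOUNT_PATTERN.findall(text)), Decimal("0"))


# Item lines copied from the sample damage report.
items = (
    "Oak living room table, water damage, estimated current value: EUR 200\n"
    "55-inch Samsung television, water damage, estimated replacement value: EUR 800\n"
    "Lenovo laptop, water damage, estimated replacement value: EUR 1200\n"
    "Various books and decorative items, estimated current value: EUR 150"
)

print(sum_eur_amounts(items))  # 2350
```

Comparing this figure with the tool's answer is a cheap way to catch an arithmetic slip by the model.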

In [4]:
@tool
def online_checker(text: str) -> str:
    """
    Checks on the web whether the price for an item is within a reasonable range.
    """

    search_engine = DuckDuckGoSearchRun()

    result = search_engine.run(text)

    return result
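
`online_checker` returns raw search text, so judging whether a price is in a "reasonable range" is left to the model. If reference prices were extracted from the search results, the range check itself could be done deterministically. A minimal sketch, assuming a hypothetical helper `is_plausible` and a 30% tolerance chosen only for illustration; the reference prices are made-up placeholders, not search results:

```python
from statistics import median


def is_plausible(
    claimed: float, reference_prices: list[float], tolerance: float = 0.3
) -> bool:
    """Return True if the claimed value is within +/- tolerance of the median reference price."""
    if not reference_prices:
        raise ValueError("need at least one reference price")
    mid = median(reference_prices)
    return abs(claimed - mid) <= tolerance * mid


# Placeholder reference prices (illustrative values, not real market data).
placeholder_prices = [750.0, 820.0, 900.0]

print(is_plausible(800.0, placeholder_prices))   # True
print(is_plausible(2000.0, placeholder_prices))  # False
```

Using the median rather than the mean keeps a single outlier listing from shifting the reference point.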

## Configuring Environment and Loading Data
Load the API key and other environment variables from a `.env` file with the `dotenv` package, which keeps secrets out of the notebook itself. The tools defined above read the module-level `api_key`, so run this cell before invoking them. The input data is the text of `damage_report.pdf`, which the tools will process.



In [5]:
load_dotenv(find_dotenv())
api_key = os.getenv("OPENAI_API_KEY")
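
`os.getenv` returns `None` when the variable is missing, so a forgotten `.env` entry tends to surface only later, as an error from inside a tool. A small fail-fast helper can make the problem visible immediately. This is a sketch; `require_env` is a hypothetical name, not part of `dotenv` or LangChain:

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise if it is missing or empty."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(
            f"Environment variable {name} is not set; check your .env file"
        )
    return value


# In this notebook the call would be: api_key = require_env("OPENAI_API_KEY")
```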

In [6]:
data = PyPDFLoader(str(here("./damage_report.pdf"))).load()
print(data)

[Document(page_content='Property Damage Report - Storm Damage\nDear Sir/Madam,\nI am writing to report damage incurred to my property during the storm on February 22, 2024. As a\nholder of a home contents insurance policy, I am seeking prompt attention to this matter.\nA severe storm caused significant damage to my residence in the southern part of Cologne,\nspecifically at Mainzer Straße 45, 50678 Köln. The damage includes broken windows allowing water\nto enter and damage various household items including furniture and electronic devices.\nThe following items were affected: \n- Oak living room table, water damage, estimated current value: EUR 200\n- 55-inch Samsung television, water damage, estimated replacement value: EUR 800\n- Lenovo laptop, water damage, estimated replacement value: EUR 1200\n- Various books and decorative items, estimated current value: EUR 150\nI have taken all necessary measures to prevent further damage and ensure safety. I also have\nphotos of the damaged it

## Integrating Tools with LangChain Agent
Collect the tools defined above, bind them to the chat model, and assemble the agent pipeline. The agent chooses among:
- `damage_detector`: Extracts the damaged items and the extent of damage from the claim text.
- `calculator`: Calculates the estimated total cost.
- `online_checker`: Runs a web search that can be used to sanity-check the stated values.



In [7]:
tools = [damage_detector, calculator, online_checker]
llm = ChatOpenAI(model_name="gpt-4", openai_api_key=api_key, temperature=0)

In [8]:
prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "I am a dedicated assistant for insurance claims handling. "
            "I help process relevant information and provide cost overviews for claims. "
            "However, I do not have access to current events. "
            "It is essential for the claims handler to verify the plausibility of the "
            "provided cost estimations to ensure they are realistic."
        ),
        ("user", "{input}"),
        MessagesPlaceholder(variable_name="agent_scratchpad"),
    ]
)

In [9]:
llm_with_tools = llm.bind_tools(tools)

agent = (
    {
        "input": lambda x: x["input"],
        "agent_scratchpad": lambda x: format_to_openai_tool_messages(
            x["intermediate_steps"]
        ),
    }
    | prompt
    | llm_with_tools
    | OpenAIToolsAgentOutputParser()
)
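
The `agent_scratchpad` entry above is built from `intermediate_steps`, the running list of tool calls and their results. The loop that fills this list can be pictured with the toy sketch below. It is a conceptual illustration in plain Python, not how `AgentExecutor` is implemented: `scripted_model` is a hypothetical stand-in for the LLM, and the two tools are placeholders that return fixed strings.

```python
from typing import Callable


# Toy tools standing in for the notebook's LLM-backed tools.
def damage_detector(text: str) -> str:
    return f"damaged items found in: {text}"


def calculator(text: str) -> str:
    return f"total computed from: {text}"


TOOLS: dict[str, Callable[[str], str]] = {
    "damage_detector": damage_detector,
    "calculator": calculator,
}


def scripted_model(user_input: str, intermediate_steps: list) -> dict:
    """Hypothetical stand-in for the LLM: request each tool once, then answer."""
    called = [name for (name, _arg), _observation in intermediate_steps]
    for name in TOOLS:
        if name not in called:
            return {"tool": name, "tool_input": user_input}
    return {"final": f"Finished after {len(intermediate_steps)} tool calls"}


def run_agent(user_input: str, max_iterations: int = 5) -> dict:
    """Ask the model, run the requested tool, record the result, repeat."""
    intermediate_steps = []
    for _ in range(max_iterations):
        decision = scripted_model(user_input, intermediate_steps)
        if "final" in decision:
            return {"output": decision["final"], "intermediate_steps": intermediate_steps}
        observation = TOOLS[decision["tool"]](decision["tool_input"])
        intermediate_steps.append(
            ((decision["tool"], decision["tool_input"]), observation)
        )
    return {"output": "Stopped: iteration limit reached", "intermediate_steps": intermediate_steps}


result = run_agent("storm damage report")
print(result["output"])  # Finished after 2 tool calls
```

The iteration cap is there so that a model which keeps requesting tools cannot loop forever.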

## Running the Agent to Process Claims
Run the agent on the sample insurance claim. `AgentExecutor` calls the model, executes the tools it requests, and feeds the results back until the model returns a final answer. With `verbose=True`, each tool invocation and its output is printed.

In [10]:
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)

response = agent_executor.invoke(
    {"input": data[0].page_content}, return_only_outputs=True
)



> Entering new AgentExecutor chain...

Invoking: `damage_detector` with `{'text': 'Oak living room table, water damage, estimated current value: EUR 200\n55-inch Samsung television, water damage, estimated replacement value: EUR 800\nLenovo laptop, water damage, estimated replacement value: EUR 1200\nVarious books and decorative items, estimated current value: EUR 150'}`


1. Oak living room table - Extent of damage: Water damage
2. 55-inch Samsung television - Extent of damage: Water damage
3. Lenovo laptop - Extent of damage: Water damage
4. Various books and decorative items - Extent of damage: Not specified

Invoking: `calculator` with `{'text': 'Oak living room table, water damage, estimated current value: EUR 200\n55-inch Samsung television, water damage, estimated replacement value: EUR 800\nLenovo laptop, water damage, estimated replacement value: EUR 1200\nVarious books and decorative items, estimated current value: EUR 1

## Results
Running the agent on the sample claim from `damage_report.pdf` yields three kinds of output:
- **Damage Identification**: A list of the damaged items and the extent of damage for each.
- **Cost Calculation**: An estimated total based on the values stated in the claim.
- **Price Verification**: Web search results for judging whether the stated values are plausible.

Together, the tools let the agent extract the damaged items, total their stated values, and gather material that a claims handler can use to verify the estimate.

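## Another Example
This variant replaces the custom `online_checker` tool with the `DuckDuckGoSearchResults` tool imported earlier, and reuses the same prompt and model.
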
In [11]:
ddg_search = DuckDuckGoSearchResults()
new_tools = [
    damage_detector,
    calculator,
    ddg_search,
]
llm_with_new_tools = llm.bind_tools(new_tools)

In [12]:
new_agent = (
    {
        "input": lambda x: x["input"],
        "agent_scratchpad": lambda x: format_to_openai_tool_messages(
            x["intermediate_steps"]
        ),
    }
    | prompt
    | llm_with_new_tools
    | OpenAIToolsAgentOutputParser()
)

In [13]:
new_agent_executor = AgentExecutor(
    agent=new_agent, tools=new_tools, verbose=True
)

new_response = new_agent_executor.invoke(
    {"input": data[0].page_content}, return_only_outputs=True
)



> Entering new AgentExecutor chain...

Invoking: `damage_detector` with `{'text': 'Oak living room table, water damage, estimated current value: EUR 200\n55-inch Samsung television, water damage, estimated replacement value: EUR 800\nLenovo laptop, water damage, estimated replacement value: EUR 1200\nVarious books and decorative items, estimated current value: EUR 150'}`


1. Oak living room table - Extent of damage: Water damage
2. 55-inch Samsung television - Extent of damage: Water damage
3. Lenovo laptop - Extent of damage: Water damage
4. Various books and decorative items - Extent of damage: Not specified

Invoking: `calculator` with `{'text': 'Oak living room table, water damage, estimated current value: EUR 200\n55-inch Samsung television, water damage, estimated replacement value: EUR 800\nLenovo laptop, water damage, estimated replacement value: EUR 1200\nVarious books and decorative items, estimated current value: EUR 1