To build capable assistants with LLMs, it's crucial to enhance their abilities with tools, specific agent architectures, and methods for extracting structured information and fact-checking. The following points summarize how to achieve this:

*   **Tool Integration**:
    *   LLMs can be augmented by connecting them to external data and services.
    *   Tools allow LLMs to interact with the real world, access real-time data, and perform tasks, thus overcoming their limitations with domain-specific or up-to-date knowledge.
    *   Examples include tools for web searches, database queries, email automation, or even handling phone calls.
    *   LangChain provides a platform to create these tools which can be used by agents, chains, or LLMs to interact with the world.
    *   Tools encapsulate a name, a function to execute, a description for the LLM, an optional schema for parameters, and a flag to return directly to the user.
    *   Tools can be built-in, like WikipediaQueryRun, or custom-defined using decorators, subclassing, or dataclasses.

*   **Custom Tool Definition**:
    *   The `@tool` decorator can be used to define a tool, using the function name as the tool name and the docstring as the description.
    *   Subclassing `BaseTool` allows for more control, enabling customization of behavior, complex logic, asynchronous operations, and error handling.
    *   `StructuredTool` offers a balance between the complexity of subclassing and the simplicity of the decorator.
    *   Error handling in tools is important to allow agents to continue executing even when a tool encounters an error.

*   **Agent Architectures**:
    *   Agents combine LLMs with tools to perform tasks, utilizing specific reasoning strategies.
    *   **Action agents** reason iteratively based on observations after each action.
    *   **Plan-and-execute agents** plan completely upfront before taking any action, then gather evidence to execute the plan.
    *   In plan-and-execute architectures, a Planner LLM creates a plan, then an agent gathers evidence with tools, and finally, a Solver LLM generates the output.

*   **Building a Research Assistant**:
    *   A basic research assistant can be constructed using LangChain by combining an LLM with external tools to gather information.
    *   Tools like DuckDuckGo, Wolfram Alpha, arXiv, and Wikipedia can be used to enhance the agent’s capabilities.
    *   The Streamlit framework provides an easy-to-use platform for wrapping the agent in a web application.
    *   Streamlit enables creating interactive web applications, integrating the agent's capabilities, and allows for real-time responses.

*   **Extracting Structured Information**:
    *   LangChain facilitates extracting structured information from documents by combining LLMs with schema definitions.
    *   Pydantic can be used to create custom data structures (models) for defining the desired output format.
    *   LangChain's output parsers transform LLM outputs into structured formats, such as JSON or Pydantic models.

*   **Mitigating Hallucinations**:
    *   Automatic fact-checking is crucial for verifying claims made by LLMs against external sources.
    *   Fact-checking includes claim detection, evidence retrieval, and verdict prediction.
    *   The `LLMCheckerChain` in LangChain can be used to make a model check its own assumptions.

By integrating these elements, more capable, reliable, and trustworthy LLM-powered assistants can be developed.
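The tool anatomy and the action-agent loop summarized above can be sketched in plain Python. This is only an illustration under stated assumptions: the `Tool` dataclass, `fake_llm`, and `run_action_agent` are hypothetical names rather than LangChain APIs, and the "LLM" is a hard-coded stand-in.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Tool:
    """Plain-Python stand-in for the five tool fields listed above."""
    name: str
    func: Callable[[str], str]
    description: str
    args_schema: Optional[type] = None  # optional parameter schema
    return_direct: bool = False         # hand the result straight to the user


def fake_llm(question: str, observations: list) -> tuple:
    """Hypothetical policy standing in for an LLM call.

    Returns ("call", tool_name) or ("finish", answer).
    """
    if not observations:
        return ("call", "lookup")
    return ("finish", f"Based on the lookup: {observations[-1]}")


def run_action_agent(question: str, tools: dict, max_steps: int = 3) -> str:
    """Action-agent loop: decide, act, observe, and repeat."""
    observations = []
    for _ in range(max_steps):
        kind, value = fake_llm(question, observations)
        if kind == "finish":
            return value
        tool = tools[value]
        result = tool.func(question)
        if tool.return_direct:
            return result            # skip any further reasoning
        observations.append(result)  # feed the observation back to the policy
    return "No answer within the step limit."


lookup = Tool(
    name="lookup",
    func=lambda q: "LangChain is a framework for LLM apps.",
    description="Looks up a short fact.",
)
answer = run_action_agent("What is LangChain?", {"lookup": lookup})
print(answer)
```

A plan-and-execute agent would differ in that the full list of tool calls is produced before the loop starts, instead of being decided one step at a time.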


## Answering questions with tools

```python
from langchain_community.tools import WikipediaQueryRun
from langchain_community.utilities import WikipediaAPIWrapper

# Return only the top hit, truncated to 100 characters of page content
api_wrapper = WikipediaAPIWrapper(top_k_results=1, doc_content_chars_max=100)
tool = WikipediaQueryRun(api_wrapper=api_wrapper)
```

Let us examine how to use a tool in LangChain. We will use `tool.name` to get the tool's name, `tool.description` to get its description, `tool.args` to get its arguments, `tool.return_direct` to check whether its output is returned directly to the user, and `tool.run()` to execute it.

```python
tool.name
```

```
'wikipedia'
```

```python
print(f"Tool Name: {tool.name}\n"
      f"Description: {tool.description}\n"
      f"Arguments: {tool.args}\n"
      f"Return Direct: {tool.return_direct}")
```

```
Tool Name: wikipedia
Description: A wrapper around Wikipedia. Useful for when you need to answer general questions about people, places, companies, facts, historical events, or other subjects. Input should be a search query.
Arguments: {'query': {'title': 'Query', 'type': 'string'}}
Return Direct: False
```

```python
tool.run({"query": "langchain"})
```

```
'Page: LangChain\nSummary: LangChain is a software framework that helps facilitate the integration of '
```

### Defining custom tools
To define custom tools in LangChain, you can follow approaches such as these:

* `@tool` decorator
* Subclassing `BaseTool`
* `StructuredTool` dataclass

### **1.    The `@tool` decorator**

In Python, we can define a tool using decorator syntax. By default, the `@tool` decorator uses the function's name as the tool's name and the function's docstring as the tool's description. Because the description is what tells the LLM what the tool does, always include a meaningful docstring.

```python
from langchain.tools import tool


@tool
def search(query: str) -> str:
    """Look up things online."""
    return "LangChain"

# Ask our question
search.run("What's the best application framework for LLMs?")
```

```
'LangChain'
```

We can also customize the tool name and the JSON argument schema by passing them to the `tool` decorator:

```python
# On recent LangChain releases (Pydantic v2), import these from `pydantic` instead
from langchain.pydantic_v1 import BaseModel, Field

class SearchInput(BaseModel):
    query: str = Field(description="should be a search query")

@tool("search-tool", args_schema=SearchInput, return_direct=True)
def search(query: str) -> str:
    """Look up things online."""
    return "LangChain"

search.run("How do we implement LLM apps?")
```

```
'LangChain'
```

### **2.    Subclassing BaseTool**

Subclassing `BaseTool` provides full control over tool definition but requires more effort. It’s ideal when you need to:

- Customize tool behavior
- Implement complex logic
- Handle async operations
- Manage errors or logging

This approach allows you to define:

- Tool metadata (name, description)
- Input schema
- Sync/async execution methods
- Custom error handling and callbacks

```python
from typing import Optional, Type
from langchain.tools import BaseTool
from langchain.callbacks.manager import (
    AsyncCallbackManagerForToolRun,
    CallbackManagerForToolRun,
)

# BaseModel and Field are imported in the previous example
class SearchInput(BaseModel):
    query: str = Field(description="should be a search query")

class CustomSearchTool(BaseTool):
    # Type annotations keep these overrides valid under Pydantic v2 as well
    name: str = "custom_search"
    description: str = "useful for when you need to answer questions about current events"
    args_schema: Type[BaseModel] = SearchInput

    def _run(self, query: str, run_manager: Optional[CallbackManagerForToolRun] = None) -> str:
        """Use the tool."""
        return "LangChain"

    async def _arun(self, query: str, run_manager: Optional[AsyncCallbackManagerForToolRun] = None) -> str:
        """Use the tool asynchronously."""
        raise NotImplementedError("custom_search does not support async")

search = CustomSearchTool()
search.run("What's the most popular tool for writing LLM apps?")
```

```
'LangChain'
```

**We define a custom input schema (SearchInput)**

```python
class SearchInput(BaseModel):
    query: str = Field(description="should be a search query")
```

Here, a class `SearchInput` is defined to serve as the input schema for the tool. It inherits from `BaseModel` and uses `Field` to specify that the input should include a `query` field. This is the user input that the tool will process, and the description provides clarity on its expected content.

**The tool’s name and description are class attributes**

```python
class CustomSearchTool(BaseTool):
    name: str = "custom_search"
    description: str = "useful for when you need to answer questions about current events"
```

The `CustomSearchTool` class inherits from `BaseTool` and sets two key class attributes: `name` and `description`. These attributes provide essential metadata for the tool, allowing it to be identified by name ("custom_search") and providing a brief overview of its purpose—answering questions about current events.

**The _run method implements the synchronous tool execution**

```python
def _run(self, query: str, run_manager: Optional[CallbackManagerForToolRun] = None) -> str:
    """Use the tool."""
    return "LangChain"
```

The `_run` method defines how the tool will behave when invoked synchronously. It accepts a `query` argument and an optional `run_manager`. In this case, the method simply returns the string `"LangChain"`. The optional `run_manager` enables advanced control over the execution, though it’s not used here.

**The _arun method is a placeholder for asynchronous execution**

```python
async def _arun(self, query: str, run_manager: Optional[AsyncCallbackManagerForToolRun] = None) -> str:
    """Use the tool asynchronously."""
    raise NotImplementedError("custom_search does not support async")
```

The `_arun` method is defined as an asynchronous counterpart to `_run`. However, it raises a `NotImplementedError`, indicating that asynchronous execution is not supported for this tool. The method still accepts the `query` and `run_manager` arguments, but it serves only as a placeholder for future potential asynchronous support.
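If asynchronous support were needed, one common pattern is to run the synchronous logic in a worker thread. The sketch below uses a plain Python class (the hypothetical `SketchSearchTool`, not a `BaseTool` subclass) so that it runs without LangChain installed; its method names mirror `_run` and `_arun` only for illustration, and the `run_manager` argument is left out for brevity.

```python
import asyncio


class SketchSearchTool:
    """Hypothetical stand-in that mirrors the _run/_arun pair."""

    def _run(self, query: str) -> str:
        # Synchronous logic (a fixed answer, as in the example above)
        return "LangChain"

    async def _arun(self, query: str) -> str:
        # Off-load the blocking call to a worker thread so the event loop stays free
        return await asyncio.to_thread(self._run, query)


result = asyncio.run(SketchSearchTool()._arun("Which framework?"))
print(result)
```

This keeps a single source of truth for the tool's logic, at the cost of occupying a thread for each call.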

**Both methods include optional callback managers for advanced control**

Both `_run` and `_arun` methods include optional `run_manager` parameters. These are instances of callback managers (`CallbackManagerForToolRun` and `AsyncCallbackManagerForToolRun`), which are used to control execution flow, handle callbacks, and manage advanced features like logging or error handling. The inclusion of these parameters makes the tool extensible and adaptable to complex workflows, even if they are not actively used in this example.

### **3.    StructuredTool dataclass**

`StructuredTool` offers a convenient way to define tools for LangChain workflows. It provides a balance between subclassing `BaseTool` (more complex) and simply using the decorator (less functionality).

---

Now, let us create a tool named "Search" using `StructuredTool.from_function` and see how it works.

```python
from langchain.tools import StructuredTool

def search_function(query: str):
    return "LangChain"

search = StructuredTool.from_function(
    func=search_function,
    name="Search",
    description="useful for when you need to answer questions about current events",
)
search.run("Which framework has hundreds of integrations to use with LLMs?")
```

```
'LangChain'
```

The code above defines a simple function `search_function` that always returns `"LangChain"`. It then creates a `StructuredTool` object called `search`, passing it the function (`func`), a `name`, and a `description`. Lastly, it calls the `run` method on the `search` tool with a query string, illustrating the tool’s usage.

`StructuredTool` allows you to define a custom input schema using a `BaseModel` subclass. This provides better type checking and documentation for your tool’s inputs.

```python
class CalculatorInput(BaseModel):
    a: int = Field(description="first number")
    b: int = Field(description="second number")

def multiply(a: int, b: int) -> int:
    """Multiply two numbers."""
    return a * b

calculator = StructuredTool.from_function(
    func=multiply,
    name="Calculator",
    description="multiply numbers",
    args_schema=CalculatorInput,
    return_direct=True,
)

calculator.run(dict(a=1_000_000_000, b=2))
```

```
2000000000
```

In this code, a `CalculatorInput` class is defined, specifying the input format for the tool. It uses `BaseModel` from the `pydantic` library to define two fields: `a` and `b`, both of which are integers. The `Field` function provides descriptions for these inputs, explaining that `a` is the first number and `b` is the second number. This class enforces type checking, ensuring that the tool only accepts valid input formats.

The `multiply` function is then defined, which simply multiplies the two input numbers `a` and `b`. The function returns the product of these two integers, and it is annotated to return an integer (`int`).

Next, a `StructuredTool` object named `calculator` is created using the `StructuredTool.from_function` method. This method binds the `multiply` function to the tool, setting the `name` to `"Calculator"` and providing a `description` that states it is used to multiply numbers. The `args_schema` is set to `CalculatorInput`, meaning the tool will expect input in the format defined by the `CalculatorInput` class.

The `return_direct=True` flag marks the tool's output as the final answer: when an agent calls the tool, the result of `multiply` is returned directly to the user instead of being passed back to the LLM for further processing.

Finally, the `run` method of the `calculator` tool is called, passing a dictionary with the values `a=1_000_000_000` and `b=2`. The tool then executes the `multiply` function with these inputs and returns the product, which is `2_000_000_000`.


### **Error Handling in Tools**
```python
from langchain_core.tools import StructuredTool, ToolException

def search_tool(s: str):
    # Simulate a tool that always fails
    raise ToolException("The search tool is not available.")

search = StructuredTool.from_function(
    func=search_tool,
    name="Search_tool",
    description="A bad tool",
    handle_tool_error=True,  # return the error message instead of raising
)

search.run("Search the internet and compress everything into a paragraph!")
```

This code snippet introduces error handling within a LangChain tool by leveraging the `ToolException` class. The `search_tool` function is designed to intentionally raise an error, simulating a failure in the tool. The error message `"The search tool is not available."` clearly indicates that the tool has encountered an issue.

Running the `search` tool raises the `ToolException`. Without error handling, the exception would propagate and stop the agent. Setting `handle_tool_error=True` when creating the `StructuredTool` tells LangChain to catch the exception and return its message as the tool's output, so the agent can continue executing.

Furthermore, LangChain offers the ability to define a custom error-handling strategy by passing a function to `handle_tool_error` instead of a simple `True`. This function should accept a `ToolException` and return a `str`, which allows for more refined control over the error response, possibly transforming the error into a more user-friendly message or logging it for later analysis.

In this case, the tool raises an error, but the agent does not stop; the tool simply informs the user that the search functionality is unavailable. This is part of LangChain's flexibility, which allows for efficient error management while maintaining the flow of execution—particularly useful when building more complex applications like a research assistant, where various tools might fail without bringing down the entire system.
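The two handler forms described above (`True`, or a function that takes a `ToolException` and returns a `str`) can be illustrated with a small emulation. The sketch below does not import LangChain: `ToolException` here is a local stand-in class, and `run_with_handling` is a hypothetical helper that only approximates how a handler setting is turned into the tool's output.

```python
from typing import Callable, Union


class ToolException(Exception):
    """Local stand-in for LangChain's exception of the same name."""


Handler = Union[bool, Callable[[ToolException], str], None]


def run_with_handling(func: Callable[[str], str], arg: str, handler: Handler = None) -> str:
    """Hypothetical helper: call func and convert a ToolException into output."""
    try:
        return func(arg)
    except ToolException as exc:
        if handler is None or handler is False:
            raise  # no handling configured: the error propagates
        if handler is True:
            return str(exc)  # use the exception message as the output
        return handler(exc)  # custom strategy supplied by the caller


def search_tool(query: str) -> str:
    raise ToolException("The search tool is not available.")


def friendly(error: ToolException) -> str:
    # Accepts a ToolException and returns a str, as described above
    return f"Search failed ({error}). Try another tool."


print(run_with_handling(search_tool, "q", True))
print(run_with_handling(search_tool, "q", friendly))
```

With the real library, the same `friendly` function would be passed as `handle_tool_error=friendly` when constructing the tool.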