## Building a ReAct Agent with Tool Integration

In this notebook, you will learn how to create and deploy a ReAct (Reasoning + Acting) agent capable of solving complex tasks by combining logical reasoning with external tool usage. The agent will use a structured loop of **Thought → Action → Observation** to iteratively reason through problems, gather data, and provide solutions.

By the end of this notebook, you will:

- Implement a ReAct agent that breaks down complex user queries into smaller, manageable steps.
- Integrate external tools (e.g., calculators, data retrieval functions) that the agent can use to enhance its problem-solving abilities.
- Efficiently handle tool outputs and errors, allowing the agent to adapt its actions based on real-time feedback.

## What is ReAct Prompting?

**ReAct prompting** is a technique that combines reasoning and acting within large language models (LLMs) to solve tasks more effectively. By prompting the model with **task-solving trajectories**, ReAct enables the model to think through a problem step by step while simultaneously taking actions, such as retrieving information or using external tools. This synergy between reasoning and acting allows the model to both plan and execute in a flexible manner.

ReAct prompting is particularly effective because it integrates two important processes:
- **Reasoning**: The model thinks through the task by generating reasoning traces, helping break down complex queries into manageable steps.
- **Acting**: The model performs actions, such as interacting with external tools, APIs, or databases, to gather real-time information that informs its reasoning.

For more information, you can read the original paper on ReAct prompting: [ReAct: Synergizing Reasoning and Acting in Language Models](https://react-lm.github.io/).

### How Does ReAct Prompting Work?

ReAct prompting follows a **Thought → Action → Observation** loop. Here's how it works:
- **Thought**: The model reasons about the task and decides what action to take next.
- **Action**: The model executes the chosen action, such as querying an API, calculating a result, or retrieving information.
- **Observation**: The model observes the result of the action, updates its internal reasoning, and decides if further actions are needed.

This loop continues until the model gathers enough information to provide a complete answer or determines that no further actions are required.
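
To make the loop concrete, here is a minimal sketch in plain Python. It is not the agent built later in this notebook: `scripted_model`, `TOOLS`, and `run_loop` are hypothetical names, and the "model" is a hard-coded stand-in that requests one tool call and then answers.

```python
import json


def scripted_model(history):
    """Stand-in for an LLM: ask for one tool call, then give a final answer."""
    if not any(step["role"] == "observation" for step in history):
        return json.dumps({
            "thought": "I need to add the two numbers.",
            "action": "add",
            "action_input": {"a": 2, "b": 3},
        })
    last_observation = history[-1]["content"]
    return json.dumps({"answer": f"The result is {last_observation}."})


# Hypothetical tool registry: tool name -> callable
TOOLS = {"add": lambda a, b: a + b}


def run_loop(question, model, tools, max_steps=5):
    history = [{"role": "user", "content": question}]
    for _ in range(max_steps):
        step = json.loads(model(history))  # Thought (and possibly an Action)
        if "answer" in step:
            return step["answer"]
        # Action: call the requested tool with the provided input
        observation = tools[step["action"]](**step["action_input"])
        # Observation: feed the result back for the next round of reasoning
        history.append({"role": "observation", "content": observation})
    return None


print(run_loop("What is 2 + 3?", scripted_model, TOOLS))
```

The loop ends either when the model returns an `answer` or when the step limit is reached, which mirrors the stopping condition described above.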

### Why Use ReAct Prompting?

ReAct prompting can improve LLM performance on tasks that require both reasoning and acting. It addresses two limitations of using either capability on its own:
- **Misinformation**: Reasoning alone (e.g., chain-of-thought) relies on the model's internal knowledge and can produce errors. Grounding the reasoning in the results of external actions reduces this risk.
- **Lack of Synthesis**: Acting alone without reasoning can result in incomplete or incoherent solutions. ReAct allows for better synthesis of final answers by combining both reasoning and actions.

For example, in the context of our notebook, we'll use ReAct prompting to instruct an agent to use tools like a `basic_calculator` to perform mathematical operations. The agent reasons about when to use the tool, performs the operation, and observes the result to determine if more actions are needed before completing the task.

![image.png](images/react-diagram.png)

## 1. Setup

Before we begin, let's make sure your environment is set up correctly. We'll start by installing the necessary Python packages.

### Installing Required Packages

To get started, you'll need to install a few Python libraries. Run the following command to install them:

In [47]:
%pip install langchain langgraph langgraph-checkpoint-sqlite requests termcolor

Note: you may need to restart the kernel to use updated packages.


These packages are used for:

- **langchain:** A framework for developing applications powered by language models, with support for building agents, managing prompts, and integrating external tools.
- **langgraph:** A workflow orchestration library designed to integrate with LangChain, allowing you to create complex AI-driven workflows and reasoning paths.
- **langgraph-checkpoint-sqlite:** A tool used for storing checkpoints and managing workflow states using SQLite, ensuring the agent's progress is saved and retrievable.
- **requests:** Making HTTP requests to interact with APIs and external services.
- **termcolor:** Adding colored text output to the terminal, which is useful for debugging and improving the readability of logs and outputs.

### Datetime Function

Next, we'll create a simple function to get the current time. The agent's system prompt includes the current date, and timestamps can also be useful for logging actions or events. Let's write a function that returns the current date and time in UTC:

In [48]:
from datetime import datetime, timezone


def get_current_utc_datetime():
    now_utc = datetime.now(timezone.utc)
    # Trim microseconds to milliseconds, then append the UTC suffix
    return now_utc.strftime("%Y-%m-%d %H:%M:%S.%f")[:-3] + " UTC"


# Example usage:
print("Current UTC datetime:", get_current_utc_datetime())

Current UTC datetime: 2024-09-10 14:09:39.446 UTC


## 2. Configuring a Simple Model

In this section, we configure the language model that our agent will use to process tasks. The `ModelService` class manages the interaction with the model (in this case, "llama3.1:8b-instruct-fp16"), allowing the agent to send prompts and receive the responses that drive its reasoning and tool use.

### Model Configuration

We initialize the `ModelService` with a specific model configuration. In this example only the model name is passed explicitly; other parameters, such as the model endpoint and temperature (which controls randomness), are not set here. This step enables our agent to perform model-based tasks using the provided configuration.

In [49]:
from services.model_service import ModelService

# Initialize the service with the model configuration
ollama_service = ModelService(model="llama3.1:8b-instruct-fp16")
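
The `ModelService` class itself is not shown in this notebook. As a rough illustration only, the sketch below shows what a service with the same three methods the agent calls later (`prepare_payload`, `request_model_generate`, `process_model_response`) might look like if it talked to a local Ollama server. The class name `SketchModelService`, the default endpoint, and the payload fields are assumptions, not the actual implementation.

```python
import json
import urllib.request


class SketchModelService:
    """Hypothetical stand-in for ModelService; not the notebook's real class."""

    def __init__(
        self,
        model,
        endpoint="http://localhost:11434/api/generate",  # assumed local Ollama endpoint
        temperature=0.0,
    ):
        self.model = model
        self.endpoint = endpoint
        self.temperature = temperature

    def prepare_payload(self, user_prompt, sys_prompt):
        # Argument order mirrors how the agent calls prepare_payload later on
        return {
            "model": self.model,
            "system": sys_prompt,
            "prompt": user_prompt,
            "format": "json",  # ask the server for JSON-formatted output
            "stream": False,   # return one complete response
            "options": {"temperature": self.temperature},
        }

    def request_model_generate(self, payload):
        # POST the payload as JSON and decode the JSON reply
        request = urllib.request.Request(
            self.endpoint,
            data=json.dumps(payload).encode("utf-8"),
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(request) as response:
            return json.loads(response.read().decode("utf-8"))

    def process_model_response(self, response_json):
        # Ollama's generate endpoint returns the generated text under "response"
        return response_json.get("response", "")
```

Only `request_model_generate` touches the network, so the payload construction and response handling can be checked without a running server.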

## 3. Custom Tools

In this section, we demonstrate how to integrate custom tools into the ReAct agent's workflow. These tools allow the agent to perform specific actions based on the task requirements. We'll start with a basic calculator tool that can perform fundamental arithmetic operations.

### Basic Calculator Tool

The `basic_calculator` tool performs basic arithmetic operations like addition, subtraction, multiplication, and division. The tool accepts two numbers and an operation as input and returns the result.

#### Supported Operations:
- `add`: Adds two numbers.
- `subtract`: Subtracts the second number from the first.
- `multiply`: Multiplies two numbers.
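- `divide`: Divides the first number by the second (a division by zero is caught and reported back rather than raised).
- `floor_divide`: Divides the first number by the second and rounds the result down to a whole number.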
- `modulus`: Finds the remainder when the first number is divided by the second.
- `power`: Raises the first number to the power of the second.
- Comparison operators: `lt` (less than), `le` (less than or equal to), `eq` (equal to), `ne` (not equal to), `ge` (greater than or equal to), `gt` (greater than).

The agent will invoke this tool based on the reasoning process and provide structured input in the form of JSON. Let's take a look at the implementation:

In [50]:
import operator
from typing import Union

from langchain.tools import tool


@tool(parse_docstring=True)
def basic_calculator(
    num1: Union[int, float], num2: Union[int, float], operation: str
) -> str:
    """Perform a numeric operation on two numbers.

    Args:
        num1: The first number.
        num2: The second number.
        operation: The operation to perform. Supported operations are 'add',
            'subtract', 'multiply', 'divide', 'floor_divide', 'modulus',
            'power', 'lt', 'le', 'eq', 'ne', 'ge', 'gt'.

    Returns:
        The formatted result of the operation, or an error message if the
        operation fails or is not supported.
    """

    # Define the supported operations
    operations = {
        "add": operator.add,
        "subtract": operator.sub,
        "multiply": operator.mul,
        "divide": operator.truediv,
        "floor_divide": operator.floordiv,
        "modulus": operator.mod,
        "power": operator.pow,
        "lt": operator.lt,
        "le": operator.le,
        "eq": operator.eq,
        "ne": operator.ne,
        "ge": operator.ge,
        "gt": operator.gt,
    }

    # Check if the operation is supported
    if operation in operations:
        try:
            # Perform the operation
            result = operations[operation](num1, num2)
            result_formatted = (
                f"The answer is: {result}.\nCalculated with basic_calculator."
            )
            return result_formatted
        except Exception as e:
            # Return a single string (not a tuple) so the agent can read it as an observation
            return f"Error during operation execution: {e}"
    else:
        return "Unsupported operation. Please provide a valid operation."

In [51]:
tools = [basic_calculator]
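
Later in the notebook, the system prompt is filled in using two helpers, `get_tools_name` and `get_tools_description`, imported from `utils.general.tools`. Their source is not shown here, so the following is only a sketch of what such helpers might do, relying on the `name` and `description` attributes that LangChain tools expose. `ToolStub`, `sketch_tools_name`, and `sketch_tools_description` are hypothetical names used for illustration.

```python
from dataclasses import dataclass


@dataclass
class ToolStub:
    """Minimal stand-in exposing the same `name` and `description` attributes as a tool."""

    name: str
    description: str


def sketch_tools_name(tools):
    # Comma-separated tool names, e.g. for the "Tools:" line of a prompt
    return ", ".join(tool.name for tool in tools)


def sketch_tools_description(tools):
    # One bullet per tool: its name followed by its description
    return "\n".join(f"- {tool.name}: {tool.description}" for tool in tools)


stub_tools = [
    ToolStub("basic_calculator", "Perform a numeric operation on two numbers.")
]
print(sketch_tools_name(stub_tools))
print(sketch_tools_description(stub_tools))
```

The actual helpers may format the descriptions differently, for example by including each tool's argument schema.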

## 4. Creating a ReAct Agent

In this section, we will build a ReAct agent that autonomously reasons through tasks and selects the appropriate tools to solve problems. The agent will follow a structured process to deliver actionable results.

### Architecture Explanation

- **Agent Components**:
  - **Memory**: The agent's memory module stores past interactions and relevant data to inform future decisions.
  - **Planning**: This module breaks down complex tasks into manageable steps, helping the agent decide the sequence of actions.
  - **Tools**: The tools module enables the agent to interact with external systems, such as performing calculations, retrieving data, or running APIs.

- **System Prompt & User Prompt**: 
  - The **System Prompt** provides context and instructions for how the agent should operate within its environment, including the tools available.
  - The **User Prompt** is the input provided by the user, specifying the task or query the agent needs to handle.

- **LLM Interaction**: 
  - The Large Language Model (LLM) is at the core of the agent’s decision-making. It processes both the system and user prompts, and its outputs guide the agent’s actions. The LLM can reason about tasks, plan, and adapt based on external tool outputs.
  
- **External Environments**:
  - The **Local Environment** consists of tools and systems the agent interacts with locally on its workstation.
  - The **External Environment** is any external system or API that the agent can call for additional information or actions.

This architecture allows the ReAct agent to efficiently break down complex tasks, take action, and observe the results in a structured, repeatable process.

![React Agent Architecture](images/react_agent_architecture_modules.png)

### System Prompt

The system prompt provides the instructions that guide the agent in reasoning through tasks, using tools, and generating structured responses. It establishes the context, including the environment (e.g., ipython) and knowledge cut-off date (December 2023), ensuring the agent understands the limits of its information.

The agent is tasked with using tools to solve problems and must decide which tool to use and in what sequence. Each interaction follows a structured JSON format, ensuring clarity in both tool inputs and outputs.

The agent operates in a cycle of **thought → action → observation**:
- **Thought**: The agent thinks about the task and determines the next action.
- **Action**: The agent selects and uses the appropriate tool.
- **Observation**: The agent analyzes the tool's result and decides the next step.

This process repeats until the agent reaches a sufficient conclusion to answer the user’s query. If the tool provides a clear result, the agent will stop further actions and present the final answer. If the task cannot be completed, the agent will explain the limitation and provide suggestions.

The system prompt ensures that the agent behaves logically, utilizes tools efficiently, and delivers structured and coherent responses.
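
Because every model response is expected to be a JSON object containing either an `action` or an `answer`, the agent has to tell these cases apart and cope with malformed output. The sketch below shows one way to do that; `classify_response` is a hypothetical helper, not part of the agent class defined later.

```python
import json


def classify_response(raw):
    """Return ("answer", text), ("action", (name, input)), or ("invalid", reason)."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return "invalid", f"not valid JSON: {exc.msg}"
    if not isinstance(data, dict):
        return "invalid", "expected a JSON object"
    # A final answer ends the loop
    if "answer" in data:
        return "answer", data["answer"]
    # Otherwise a non-empty action names the tool to call
    if data.get("action"):
        return "action", (data["action"], data.get("action_input") or {})
    return "invalid", "neither 'action' nor 'answer' present"


print(classify_response('{"answer": "I have the answer: 20."}'))
print(classify_response("this is not JSON")[0])
```

This sketch checks for `answer` first, so a response that contains both fields is treated as final. That ordering is a design choice for the sketch rather than a requirement of ReAct.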

For more details on Llama 3.1, refer to the [Llama 3.1 Model Card](https://llama.meta.com/docs/model-cards-and-prompt-formats/llama3_1/).

![React System Prompt](images/react_system_prompt.png)

In [52]:
DEFAULT_SYS_REACT_PROMPT = """
<|begin_of_text|><|start_header_id|>system<|end_header_id|>

Environment: ipython
Tools: {tools_name} 
Knowledge Cutoff Date: December 2023  
Current Date: {datetime}

You are an intelligent assistant designed to handle various tasks, including answering questions, providing summaries, and performing detailed analyses. All outputs must strictly be in JSON format.

---

## Tools
You have access to a variety of tools to assist in completing tasks. You are responsible for determining the appropriate sequence of tool usage to break down complex tasks into subtasks when necessary.

The available tools include:

{tools_description}

---

## Output Format:
To complete the task, please use the following format:

{{
  "thought": "Describe your thought process here, including why a tool may be necessary to proceed.",
  "action": "Specify the tool you want to use.",
  "action_input": {{ # Provide valid JSON input for the action, ensuring it matches the tool’s expected format and data types.
    "key": "Value inputs to the tool in valid JSON format."
  }}
}}

After performing an action, the tool will provide a response in the following format:

{{
  "observation": "The result of the tool invocation"
}}

You should keep repeating the format (thought → action → observation) until you have the answer to the original question. 

If the tool result is successful and the task is complete:

{{
  "answer": "I have the answer: {{tool_result}}."
}}


Or, if you cannot answer:

{{
  "answer": "Sorry, I cannot answer your query."
}}

---

### Remember:
- **If a tool provides a complete and clear answer, do not continue invoking further tools.**
- Use the tools effectively and ensure inputs match the required format exactly as described in the task.
- Maintain the JSON format and ensure all fields are filled out correctly.
- Do not include additional metadata such as `title`, `description`, or `type` in the `action_input`.

<|eot_id|>
{user_prompt}
{agent_scratchpad}
"""
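
A note on the doubled braces in the prompt above: the template is filled in with Python's `str.format`, which treats `{name}` as a placeholder and `{{` / `}}` as literal braces. The short sketch below uses a made-up miniature template (`template` and `example_action` are illustrative names) to show the effect.

```python
template = """Tools: {tools_name}

{{
  "thought": "...",
  "action": "{example_action}"
}}"""

# Single-brace fields are substituted; doubled braces become literal braces
rendered = template.format(
    tools_name="basic_calculator",
    example_action="basic_calculator",
)
print(rendered)

# A doubled-brace field survives formatting as a single-brace literal
print("I have the answer: {{tool_result}}.".format())
```

This is why the JSON examples in the prompt reach the model with ordinary single braces, and why `{{tool_result}}` arrives as the literal text `{tool_result}` rather than being substituted.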

### Creating a Simple Agent Class

In this section, we define the `ReactAgent` class, responsible for managing an agent that processes tasks based on user input. The class handles interactions with the language model, executes tools, and follows a thought-action-observation loop until a final answer is generated.

### Key Components:

1. **Model Invocation (`invoke_model`)**:
   - Prepares the input, sends it to the language model, and processes the response. This is the main interface for interacting with the model using system and user prompts.

2. **ReAct Loop (`react`)**:
   - Implements the **Thought → Action → Observation** cycle:
     - The agent processes the user’s request, interacts with the model, and checks for an action or final answer.
     - If an action is required, the agent executes the tool, observes the result, and continues the loop until a final answer is produced.

3. **Tool Execution (`execute_tool`)**:
   - Looks up the tool requested by the model and invokes it with the model's `action_input`. The result, or an error message if the tool fails or is not found, is returned to the agent as an observation.

In [53]:
import json
from typing import Dict, Any
from termcolor import colored
from langchain_core.messages.ai import AIMessage
from langchain_core.messages import SystemMessage
from langchain_core.messages import HumanMessage
from services.model_service import ModelService
from utils.general.tools import get_tools_name, get_tools_description


class ReactAgent:

    def __init__(
        self,
        state: Dict[str, Any],
        role: str,
        tools: list,
        ollama_service: ModelService,
    ):
        """
        Initialize the Agent with a state, role, and model configuration.
        """
        self.state = state
        self.role = role
        self.tools = tools
        self.ollama_service = ollama_service

        # Initialize lists to accumulate responses if they don't exist in the state
        if "messages" not in self.state:
            self.state["messages"] = []

    def invoke_model(self, sys_prompt: str, user_prompt: str):
        """
        Prepare the payload, send the request to the model, and process the response.
        """
        # Prepare the payload
        payload = self.ollama_service.prepare_payload(
            user_prompt,
            sys_prompt,
        )

        # Invoke the model and get the response
        response_json = self.ollama_service.request_model_generate(
            payload,
        )

        # Process the model's response
        response_content = self.ollama_service.process_model_response(response_json)

        # Return the processed response
        return response_content

    def write_react_prompt(
        self,
        user_prompt: str = "",
        agent_scratchpad: str = "",
    ) -> str:
        return DEFAULT_SYS_REACT_PROMPT.format(
            user_prompt=user_prompt,
            agent_scratchpad=agent_scratchpad,
            tools_name=get_tools_name(self.tools),
            tools_description=get_tools_description(self.tools),
            datetime=get_current_utc_datetime(),
        )

    # Function to format the scratchpad into a properly indented string
    def format_scratchpad(self, scratchpad):
        formatted_output = ""
        for entry in scratchpad:
            formatted_output += entry.strip() + "\n"
        return formatted_output

    def react(self, user_request: str) -> dict:
        """
        Execute the task based on the user's request by following the thought → action → observation loop.
        """

        self.state["messages"].append(HumanMessage(content=user_request))

        answer = None

        # Start with the user's request as the first input
        user_prompt = (
            f"""<|start_header_id|>user<|end_header_id|>\n\n{user_request}<|eot_id|>"""
        )

        sys_prompt = self.write_react_prompt(user_prompt=user_prompt)
        # user_prompt = user_request
        tool_response = None
        action = None
        action_input = None
        scratchpad = []

        print(colored(user_prompt, "green"))

        # Loop until a final answer is generated
        while answer is None:
            # Invoke the model with the system prompt and current user input

            response = self.invoke_model(sys_prompt=sys_prompt, user_prompt=user_prompt)

            try:
                # Parse the response assuming it's in JSON format
                response_dict = json.loads(
                    response
                )  # Assuming response is a JSON object

                assistant_message = f"""<|start_header_id|>assistant<|end_header_id|>\n\n{response}<|eot_id|>"""

                print(colored(assistant_message, "cyan"))
                self.state["messages"].append(AIMessage(content=response))

                scratchpad.append(assistant_message)

                formatted_scratchpad = self.format_scratchpad(scratchpad)
                sys_prompt = self.write_react_prompt(
                    user_prompt=user_prompt, agent_scratchpad=formatted_scratchpad
                )

                action = response_dict.get("action", None)
                action_input = response_dict.get("action_input", None)

                # If there is an action, execute the corresponding tool
                if action:
                    status, tool_response = self.execute_tool(action, action_input)

                    # Formulate the observation to feed back into the model
                    tool_response_dict = {
                        "observation": tool_response,
                    }

                    tool_response_json = json.dumps(tool_response_dict, indent=4)

                    result_message = f"""<|start_header_id|>ipython<|end_header_id|>\n\n{tool_response_json}<|eot_id|>"""

                    print(colored(result_message, "yellow"))

                    user_prompt = tool_response_json
                    # Append the tool response to the state
                    self.state["messages"].append(
                        SystemMessage(content=tool_response)
                    )

                # Check if the model has given an answer
                if "answer" in response_dict:
                    answer = response_dict["answer"]

            except Exception as e:
                print(str(e))
                system_message = f"""<|start_header_id|>ipython<|end_header_id|>\n\n{str(e)}<|eot_id|>"""
                scratchpad.append(system_message)
                formatted_scratchpad = self.format_scratchpad(scratchpad)
                sys_prompt = self.write_react_prompt(
                    user_prompt=user_prompt, agent_scratchpad=formatted_scratchpad
                )

        # Return the updated state; the final answer is the last message in it
        return self.state

    def execute_tool(self, action: str, action_input: dict):
        """
        Look up the tool named by `action` and invoke it with `action_input`.
        Returns a (success, result) tuple.
        """
        # Echo the tool call in a Llama 3.1 style format for readability
        tool_message = f"""<|python_tag|>{action}.call({action_input})\n<|eom_id|>"""
        print(
            colored(
                tool_message,
                "magenta",
            )
        )

        for tool in self.tools:
            if tool.name == action:
                try:
                    result = tool.invoke(action_input)
                    result_message = f"""<|start_header_id|>ipython<|end_header_id|>\n\n{result}<|eot_id|>"""
                    print(colored(result_message, "magenta"))
                    return True, result
                except Exception as e:
                    return False, f"Error executing tool {action}: {str(e)}"
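        # No tool matched the requested action; return a (status, message) tuple
        # so the caller can unpack it like a normal tool result
        return False, f"Tool {action} not found or unsupported operation."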
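
One thing to be aware of: the `while answer is None` loop in `react` only stops when the model returns an `answer`. If the model keeps producing invalid JSON or keeps requesting tools, the loop will not terminate. The sketch below shows one possible guard, a step limit, in isolation. `run_bounded`, `max_steps`, and the two stand-in step functions are hypothetical and not part of the class above; the fallback message reuses the "cannot answer" text from the system prompt.

```python
def run_bounded(step_fn, max_steps=5):
    """Call step_fn until it returns a non-None answer or max_steps is reached."""
    for step_number in range(1, max_steps + 1):
        answer = step_fn(step_number)
        if answer is not None:
            return answer, step_number
    # Step limit reached without an answer: fall back to a fixed message
    return "Sorry, I cannot answer your query.", max_steps


def never_answers(step_number):
    # Stand-in for a model that never produces a final answer
    return None


def answers_on_third(step_number):
    # Stand-in for a model that answers on its third step
    return "20" if step_number == 3 else None


print(run_bounded(never_answers, max_steps=4))
print(run_bounded(answers_on_third))
```

In the agent, the same idea could be applied by counting iterations inside `react` and breaking out once a limit is hit.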

## 5. Running the Agent

In this section, we will execute the `ReactAgent` to demonstrate how it processes user input, interacts with tools, and follows the **Thought → Action → Observation** loop to produce a final answer.

### Steps to Run the Agent:

1. **Initialize the Agent**: 
   - We start by setting up the initial state, defining the agent's role, specifying the available tools, and connecting the agent to a language model service.
   
2. **User Request**: 
   - A user query is provided as input, such as "What is the sum of 15 and 27?" The agent receives this request and starts reasoning through it.

3. **Processing the Request**:
   - The agent constructs a system prompt with the user’s input and interacts with the language model in a continuous loop, checking for actions or final answers.
   - If a tool is needed (e.g., a calculator), the agent will invoke the appropriate tool, observe the result, and feed the output back into its reasoning process.

4. **Final Output**:
   - The agent continues looping through the **Thought → Action → Observation** cycle until a final answer is generated or the task cannot be completed. Once complete, the final answer is stored as the last message in the agent’s state, and the updated state is returned.

By the end of this process, the agent will have used reasoning and external tools to autonomously solve the problem presented by the user.

![React Flow](images/react_flow.png)

### Example

In the next cell, we provide an example where the agent solves a mathematical problem by using a tool and returning the result.

Running this example will give you insight into how the ReAct agent works in practice, allowing it to break down complex tasks, interact with tools, and generate structured outputs.

In [54]:
# Define the initial state, role, tools, and the model service instance
initial_state = {
    "messages": [],  # To store agent's responses
}

react_agent = ReactAgent(
    state=initial_state,
    role="REACT_AGENT",
    tools=tools,
    ollama_service=ollama_service,
)

user_input = "What is 10+10?"

final_state = react_agent.react(user_request=user_input)

<|start_header_id|>user<|end_header_id|>

What is 10+10?<|eot_id|>
<|start_header_id|>assistant<|end_header_id|>

{
    "thought": "The problem requires a basic arithmetic operation, so I will use the 'basic_calculator' tool.",
    "action": "basic_calculator",
    "action_input": {
        "num1": 10,
        "num2": 10,
        "operation": "add"
    }
}<|eot_id|>
<|python_tag|>basic_calculator.call({'num1': 10, 'num2': 10, 'operation': 'add'})
<|eom_id|>
<|start_header_id|>ipython<|end_header_id|>

The answer is: 20.
Calculated with basic_calculator.<|eot_id|>
<|start_header_id|>ipython<|end_header_id|>

{
    "observation": "The answer is: 20.\nCalculated with basic_calculator."
}<|eot_id|>
<|start_header_id|>assistant<|end_header_id|>

{
    "answer": "I have the answer: 20."
}<|eot_id|>


In [55]:
print("Messages from the Agent's State:")

for message in final_state["messages"]:
    pretty_message = message.pretty_repr()
    if isinstance(message, AIMessage):
        print(colored(f"{pretty_message}", "cyan"))
    elif isinstance(message, SystemMessage):
        print(colored(f"{pretty_message}", "yellow"))
    elif isinstance(message, HumanMessage):
        print(colored(f"{pretty_message}", "green"))

Messages from the Agent's State:

What is 10+10?

{
    "thought": "The problem requires a basic arithmetic operation, so I will use the 'basic_calculator' tool.",
    "action": "basic_calculator",
    "action_input": {
        "num1": 10,
        "num2": 10,
        "operation": "add"
    }
}

The answer is: 20.
Calculated with basic_calculator.

{
    "answer": "I have the answer: 20."
}
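
Since `react` returns the whole state rather than just the answer, a small helper can pull the final answer out of the message list. The sketch below is one way to do it. `extract_final_answer` is a hypothetical helper, and the demo uses `SimpleNamespace` objects as stand-ins for the LangChain message objects; it relies only on each message having a `content` attribute.

```python
import json
from types import SimpleNamespace


def extract_final_answer(messages):
    """Return the "answer" field of the last message whose content is JSON with that key."""
    for message in reversed(messages):
        try:
            data = json.loads(message.content)
        except (TypeError, ValueError):
            continue  # not JSON (e.g., the user request or a plain tool result)
        if isinstance(data, dict) and "answer" in data:
            return data["answer"]
    return None


# Stand-in messages that mirror the contents printed above
demo_messages = [
    SimpleNamespace(content="What is 10+10?"),
    SimpleNamespace(content='{"thought": "...", "action": "basic_calculator"}'),
    SimpleNamespace(content="The answer is: 20.\nCalculated with basic_calculator."),
    SimpleNamespace(content='{"answer": "I have the answer: 20."}'),
]
print(extract_final_answer(demo_messages))
```

With the real state, the same helper would be called as `extract_final_answer(final_state["messages"])`.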


## Conclusion

In this notebook, we successfully built a ReAct agent capable of autonomously solving tasks by reasoning through problems and interacting with external tools. We explored key concepts such as the **Thought → Action → Observation** loop, where the agent breaks down complex tasks into manageable steps, executes actions via tools, and refines its approach based on real-time feedback.

### Key Takeaways:
- **ReAct Prompting**: The agent uses a combination of reasoning and action to solve tasks, invoking tools dynamically based on the problem at hand.
- **Agent Architecture**: We learned how different components such as memory, planning, and tools integrate with the LLM to enable the agent’s decision-making process.
- **Tool Integration**: We demonstrated how the agent interacts with external systems and tools to enhance its capabilities, making it more effective at solving complex, multi-step tasks.
- **System Prompts and User Prompts**: Properly crafting prompts ensures the agent operates within its environment and understands the scope of its tasks.

By the end of this notebook, you should have a solid understanding of how ReAct agents work and how they can be employed to tackle a wide range of tasks. You can now extend this framework to include more sophisticated tools or refine the agent’s reasoning abilities to suit more advanced applications.