## Building a ReAct Agent with Tool Integration

In this notebook, you will learn how to create and deploy a ReAct (Reasoning + Acting) agent capable of solving complex tasks by combining logical reasoning with external tool usage. The agent will use a structured loop of **Thought → Action → Observation** to iteratively reason through problems, gather data, and provide solutions.

By the end of this notebook, you will:

- Implement a ReAct agent that breaks down complex user queries into smaller, manageable steps.
- Integrate external tools (e.g., calculators, data retrieval functions) that the agent can use to enhance its problem-solving abilities.
- Efficiently handle tool outputs and errors, allowing the agent to adapt its actions based on real-time feedback.

## Basic Concepts

Before implementing a ReAct agent, it's important to understand the foundational concepts.

A **ReAct Agent (Reasoning + Acting)** alternates between reasoning through a task and taking actions using external tools. It operates in a loop of **Thought → Action → Observation** until it reaches a final answer or decides no further action is needed.

The **Thought → Action → Observation Loop** is the core of the ReAct agent’s operation. The agent reasons through a problem, chooses the appropriate tool or action, observes the results, and then decides the next step. This process continues iteratively until the task is completed.

**Tool Integration** enables the agent to use external functions or APIs to perform actions, such as retrieving data or making calculations. The agent selects tools dynamically based on its reasoning.

**Error Handling and Feedback Loops** ensure the agent can recover from failures or unexpected results. If a tool fails or produces incorrect output, the agent adjusts its reasoning and continues to process the task.

**Prompt Engineering** is crucial for guiding the ReAct agent’s decisions. Carefully designed prompts allow the agent to use tools efficiently and respond effectively based on the information gathered.

These concepts will provide the foundation to build a ReAct agent capable of reasoning through tasks, using tools, and delivering structured, actionable results.

## 1. Setup

Before we begin, let's make sure your environment is set up correctly. We'll start by installing the necessary Python packages.

### Installing Required Packages

To get started, you'll need to install a few Python libraries. Run the following command to install them:

In [21]:
%pip install langchain langgraph langgraph-checkpoint-sqlite requests termcolor

Note: you may need to restart the kernel to use updated packages.


These packages are used for:

- **langchain:** Defining tools with the `@tool` decorator.
- **langgraph:** Building and running the workflow graph.
- **langgraph-checkpoint-sqlite:** Persisting graph state in SQLite.
- **requests:** Making HTTP requests to interact with models.
- **termcolor:** Coloring the agent's console output.

### Datetime Function

We'll create a simple function to get the current time. This is important because our agent might need to timestamp certain actions or events. Let's write a function that returns the current date and time in UTC format:

In [22]:
from datetime import datetime, timezone


def get_current_utc_datetime():
    now_utc = datetime.now(timezone.utc)
    # Trim microseconds to milliseconds, then append the timezone label
    return now_utc.strftime("%Y-%m-%d %H:%M:%S.%f")[:-3] + " UTC"


# Example usage:
print("Current UTC datetime:", get_current_utc_datetime())

Current UTC datetime: 2024-09-09 21:10:16.524 UTC


## 2. Configuring a Simple Model

In this section, we configure the language model that our agent will use to process tasks. The `ModelService` class manages the interaction with the model (in this case, "llama3.1:8b-instruct-fp16"), allowing the agent to send prompts and receive responses.

### Model Configuration

We initialize the `ModelService` with a specific model configuration. Here only the model name is passed; any other settings, such as the model endpoint or temperature (which controls randomness), are left at whatever defaults `ModelService` defines. This step enables our agent to perform model-based tasks using the provided configuration.

In [23]:
from services.model_service import ModelService

# Initialize the service with the model configuration
ollama_service = ModelService(model="llama3.1:8b-instruct-fp16")

## 3. Custom Tools

In this section, we demonstrate how to integrate custom tools into the ReAct agent's workflow. These tools allow the agent to perform specific actions based on the task requirements. We'll start with a basic calculator tool that can perform fundamental arithmetic operations.

### Basic Calculator Tool

The `basic_calculator` tool performs basic arithmetic operations like addition, subtraction, multiplication, and division. The tool accepts two numbers and an operation as input and returns the result.

#### Supported Operations:
- `add`: Adds two numbers.
- `subtract`: Subtracts the second number from the first.
- `multiply`: Multiplies two numbers.
- `divide`: Divides the first number by the second (division by zero is caught and reported as an error message).
- `floor_divide`: Divides the first number by the second and rounds the quotient down to a whole number.
- `modulus`: Finds the remainder when the first number is divided by the second.
- `power`: Raises the first number to the power of the second.
- Comparison operators: `lt` (less than), `le` (less than or equal to), `eq` (equal to), `ne` (not equal to), `ge` (greater than or equal to), `gt` (greater than).

The agent will invoke this tool based on the reasoning process and provide structured input in the form of JSON. Let's take a look at the implementation:

In [24]:
import operator
from langchain.tools import tool


@tool(parse_docstring=True)
def basic_calculator(num1, num2, operation):
    """
    Perform a numeric operation on two numbers based on the input string.

    Parameters:
    'num1' (int): The first number.
    'num2' (int): The second number.
    'operation' (str): The operation to perform. Supported operations are 'add', 'subtract',
                        'multiply', 'divide', 'floor_divide', 'modulus', 'power', 'lt',
                        'le', 'eq', 'ne', 'ge', 'gt'.

    Returns:
    str: The formatted result of the operation.

    Raises:
    Exception: If an error occurs during the operation (e.g., division by zero).
    ValueError: If an unsupported operation is requested or input is invalid.
    """

    # Define the supported operations
    operations = {
        "add": operator.add,
        "subtract": operator.sub,
        "multiply": operator.mul,
        "divide": operator.truediv,
        "floor_divide": operator.floordiv,
        "modulus": operator.mod,
        "power": operator.pow,
        "lt": operator.lt,
        "le": operator.le,
        "eq": operator.eq,
        "ne": operator.ne,
        "ge": operator.ge,
        "gt": operator.gt,
    }

    # Check if the operation is supported
    if operation in operations:
        try:
            # Perform the operation
            result = operations[operation](num1, num2)
            result_formatted = (
                f"The answer is: {result}.\nCalculated with basic_calculator."
            )
            return result_formatted
        except Exception as e:
            # Return a single string (not a tuple) so the observation stays readable
            return f"{e}\n\nError during operation execution."
    else:
        return "\n\nUnsupported operation. Please provide a valid operation."

In [25]:
tools = [basic_calculator]
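
To see the tool's three possible outcomes (a result, a caught runtime error, and an unsupported operation) without LangChain installed, the same dispatch-table idea can be sketched as a plain function. `calculate` and `OPERATIONS` are hypothetical names used only for this illustration, and only a subset of the operations is included.

```python
import operator

# Subset of the notebook's operation table, for illustration only
OPERATIONS = {
    "add": operator.add,
    "divide": operator.truediv,
    "gt": operator.gt,
}


def calculate(num1, num2, operation):
    """Hypothetical plain-Python stand-in for the decorated tool."""
    if operation not in OPERATIONS:
        return "Unsupported operation. Please provide a valid operation."
    try:
        result = OPERATIONS[operation](num1, num2)
    except Exception as exc:  # e.g. ZeroDivisionError
        # Errors become text so the agent can read them as an observation
        return f"{exc}\n\nError during operation execution."
    return f"The answer is: {result}."


print(calculate(10, 10, "add"))
print(calculate(1, 0, "divide"))
print(calculate(2, 3, "sqrt"))
```

Returning errors as text rather than raising keeps the agent loop running: the model can read the message and choose a different action.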

## 4. State Management

State management is an essential concept when working with agents. The "state" of an agent refers to its current condition or the information it has at any given time. For instance, if an agent is working through a list of tasks, its state might include which tasks have been completed, which are in progress, and which are yet to be started. Without proper state management, an agent might lose track of its progress, repeat tasks, or skip important steps. In this notebook, you'll learn how to manage an agent's state effectively, ensuring that it operates smoothly and efficiently.

### Implementation of State Management

To implement state management for our ReAct agent, we define a structured data model (`AgentGraphState`) that holds all the relevant information the agent needs to function effectively. This model includes:
- **Input:** The current command or task that the agent is working on.
- **Response:** The outputs or actions the agent has generated.

Additionally, we provide a utility function, `update_state`, which allows for updating specific elements of the agent's state. This function ensures that the state is consistently and accurately maintained, which is critical for the agent to operate effectively. By checking for the existence of keys before updating, the function helps prevent errors and maintains the integrity of the state.

Together, these components form the backbone of the agent's state management system, enabling it to manage complex workflows and adapt to changes dynamically.

In [26]:
from langgraph.graph.message import add_messages
from typing import Annotated, TypedDict, Any


class AgentGraphState(TypedDict):
    input: str
    response: Annotated[list, add_messages]


def update_state(state: AgentGraphState, key: str, value: Any):
    """
    Update the state of the agent. Warn if the key doesn't exist.
    """
    if key in state:
        state[key] = value
    else:
        print(f"Warning: Attempting to update a non-existing state key '{key}'.")
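
The `Annotated[list, add_messages]` annotation attaches a reducer to the `response` key, so updates are merged into the existing list rather than overwriting it. The sketch below imitates that idea with the standard library only. `append_items`, `SketchState`, and `apply_update` are hypothetical names, and LangGraph's real `add_messages` does more than simple concatenation.

```python
from typing import Annotated, TypedDict, get_type_hints


def append_items(existing: list, new: list) -> list:
    """Hypothetical reducer: combine old and new items instead of replacing."""
    return existing + new


class SketchState(TypedDict):
    input: str
    response: Annotated[list, append_items]


def apply_update(state: dict, update: dict) -> dict:
    """Merge an update into the state, using a reducer where one is annotated."""
    hints = get_type_hints(SketchState, include_extras=True)
    merged = dict(state)
    for key, value in update.items():
        # Annotated types expose their extra arguments via __metadata__
        metadata = getattr(hints[key], "__metadata__", ())
        if metadata:
            merged[key] = metadata[0](merged.get(key, []), value)
        else:
            merged[key] = value
    return merged


state = {"input": "What is 10+10?", "response": []}
state = apply_update(state, {"response": ["thinking"]})
state = apply_update(state, {"response": ["I have the answer: 20."]})
print(state["response"])
```

Keys without a reducer, such as `input`, are simply overwritten by the latest value.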

## 5. What is a ReAct Agent?

A **ReAct agent** is a specialized type of intelligent agent designed to reason about a task, take appropriate actions, and adapt based on the outcomes of those actions. The term **ReAct** stands for **Reasoning + Acting**, which reflects how the agent alternates between thinking through a problem and interacting with tools or external systems to achieve the desired result.

### How Does a ReAct Agent Work?

A ReAct agent follows a cycle known as the **Thought → Action → Observation** loop:
1. **Thought**: The agent reasons about the task, breaks it down into smaller, manageable parts, and decides which action or tool to use next.
2. **Action**: The agent executes the chosen action, such as invoking a tool to perform a calculation or retrieve information.
3. **Observation**: After the action is completed, the agent observes the result, evaluates if more steps are necessary, and then returns to the reasoning stage if needed.

This cycle repeats until the agent has gathered enough information to provide a complete answer or complete the task.
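
The cycle can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the agent built later in this notebook: `scripted_model` is a hypothetical stand-in for a real language model, `calculator` and `run_loop` are illustrative names, and the `max_steps` cap is an added safeguard.

```python
import json
import operator

OPERATIONS = {"add": operator.add, "multiply": operator.mul}


def calculator(num1, num2, operation):
    return str(OPERATIONS[operation](num1, num2))


TOOLS = {"calculator": calculator}


def scripted_model(transcript):
    """Stand-in for an LLM: request a tool first, answer once a result exists."""
    if not any("observation" in step for step in transcript):
        return json.dumps({
            "thought": "I need to add the two numbers.",
            "action": "calculator",
            "action_input": {"num1": 10, "num2": 10, "operation": "add"},
        })
    last_observation = transcript[-1]["observation"]
    return json.dumps({"answer": f"I have the answer: {last_observation}."})


def run_loop(model, max_steps=5):
    transcript = []
    for _ in range(max_steps):
        step = json.loads(model(transcript))       # Thought (+ Action or answer)
        transcript.append(step)
        if "answer" in step:
            return step["answer"], transcript
        tool = TOOLS[step["action"]]               # Action
        observation = tool(**step["action_input"])
        transcript.append({"observation": observation})  # Observation
    return None, transcript  # Gave up after max_steps


answer, transcript = run_loop(scripted_model)
print(answer)
```

Capping the number of steps is a design choice worth considering in any such loop, since a model that never produces an answer would otherwise keep it running indefinitely.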

### Why Use a ReAct Agent?

The primary benefit of a ReAct agent is its ability to autonomously solve complex problems through a process of reasoning and interaction. Unlike simple agents that just follow pre-defined rules, a ReAct agent can:
- **Reason through multi-step tasks**: It can break down complex queries into smaller steps and handle them one by one.
- **Adapt to new information**: Based on the outcome of each action, the agent can modify its approach and try different strategies if needed.
- **Perform external actions**: ReAct agents can interact with tools, APIs, and systems to gather data, perform calculations, or execute external processes.

For example, in the context of our notebook, the ReAct agent uses tools like the `basic_calculator` to perform mathematical operations. The agent reasons about when to use the tool, performs the operation, and observes the result to determine if more actions are needed before completing the task.

In summary, a ReAct agent is a versatile system capable of reasoning, acting, and adapting to the results of its actions to accomplish tasks autonomously, making it effective for problem-solving in dynamic environments.

![image.png](images/react-diagram.png)

## System Prompt

The system prompt provides the instructions that guide the agent in reasoning through tasks, using tools, and generating structured responses. It establishes the context, including the environment (e.g., ipython) and knowledge cut-off date (December 2023), ensuring the agent understands the limits of its information.

The agent is tasked with using tools to solve problems and must decide which tool to use and in what sequence. Each interaction follows a structured JSON format, ensuring clarity in both tool inputs and outputs.

The agent operates in a cycle of **thought → action → observation**:
- **Thought**: The agent thinks about the task and determines the next action.
- **Action**: The agent selects and uses the appropriate tool.
- **Observation**: The agent analyzes the tool's result and decides the next step.

This process repeats until the agent reaches a sufficient conclusion to answer the user’s query. If the tool provides a clear result, the agent will stop further actions and present the final answer. If the task cannot be completed, the agent will explain the limitation and provide suggestions.

The system prompt ensures that the agent behaves logically, utilizes tools efficiently, and delivers structured and coherent responses.

For more details on Llama 3.1, refer to the [Llama 3.1 Model Card](https://llama.meta.com/docs/model-cards-and-prompt-formats/llama3_1/).
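
The prompt in this section is a Python string filled in with `str.format`, which is why placeholders use single braces (such as `{tools_name}`) while literal JSON braces are doubled (`{{` and `}}`). A reduced sketch of that mechanism follows. `ToolStub`, `render_tool_names`, and `render_tool_descriptions` are hypothetical stand-ins, not the notebook's actual helper functions.

```python
from dataclasses import dataclass


@dataclass
class ToolStub:
    """Hypothetical minimal tool record with the two fields the prompt needs."""
    name: str
    description: str


TEMPLATE = """Tools: {tools_name}

{tools_description}

Respond with:
{{
  "action": "tool name",
  "action_input": {{"key": "value"}}
}}
"""


def render_tool_names(tools):
    return ", ".join(tool.name for tool in tools)


def render_tool_descriptions(tools):
    return "\n".join(f"- {tool.name}: {tool.description}" for tool in tools)


stub_tools = [ToolStub("basic_calculator", "Perform a numeric operation on two numbers.")]
prompt = TEMPLATE.format(
    tools_name=render_tool_names(stub_tools),
    tools_description=render_tool_descriptions(stub_tools),
)
print(prompt)  # Doubled braces come out as single, literal braces
```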

In [27]:
DEFAULT_SYS_REACT_PROMPT = """
<|begin_of_text|><|start_header_id|>system<|end_header_id|>

Environment: ipython
Tools: {tools_name} 
Knowledge Cutoff Date: December 2023  
Current Date: {datetime}

You are an intelligent assistant designed to handle various tasks, including answering questions, providing summaries, and performing detailed analyses. All outputs must strictly be in JSON format.

---

## Tools
You have access to a variety of tools to assist in completing tasks. You are responsible for determining the appropriate sequence of tool usage to break down complex tasks into subtasks when necessary.

The available tools include:

{tools_description}

---

## Output Format:
To complete the task, please use the following format:

{{
  "thought": "Describe your thought process here, including why a tool may be necessary to proceed.",
  "action": "Specify the tool you want to use.",
  "action_input": {{ # Provide valid JSON input for the action, ensuring it matches the tool’s expected format and data types.
    "key": "Value inputs to the tool in valid JSON format."
  }}
}}

After performing an action, the tool will provide a response in the following format:

{{
  "observation": "The result of the tool invocation"
}}

You should keep repeating the format (thought → action → observation) until you have the answer to the original question. 

If the tool result is successful and the task is complete:

{{
  "answer": "I have the answer: {{tool_result}}."
}}


Or, if you cannot answer:

{{
  "answer": "Sorry, I cannot answer your query."
}}

---

### Remember:
- **If a tool provides a complete and clear answer, do not continue invoking further tools.**
- Use the tools effectively and ensure inputs match the required format exactly as described in the task.
- Maintain the JSON format and ensure all fields are filled out correctly.
- Do not include additional metadata such as `title`, `description`, or `type` in the `action_input`.

<|eot_id|>
{user_prompt}
{agent_scratchpad}
"""

### Creating a Simple Agent Class

In this section, we define the core `ReactAgent` class, responsible for managing the lifecycle of an agent that processes tasks based on user requests. This class handles interactions with the language model, executes tools, and manages an action-observation loop until a final answer is generated.

### Key Components:

1. **Model Invocation (`invoke_model`)**:
   - This method prepares the input payload, sends it to the model, and processes the returned response. It serves as the interface for querying the language model with system and user prompts.

2. **ReAct Loop (`react`)**:
   - This method implements the core thought → action → observation loop:
     - The agent begins by processing the user’s request.
     - It enters a loop where it interacts with the model, parses responses, and checks for an "action" or final "answer".
     - If an action is required, the agent executes the corresponding tool and observes the result, feeding it back to the model until a final answer is generated.
     - This loop ensures continuous interaction and adjustment based on model outputs.

3. **Tool Execution (`execute_tool`)**:
   - This method looks up the tool named by the model's action and invokes it with the input the model provided. The tool's result, or an error message if the call fails, is then returned to the agent as an observation.

### Workflow:

- The agent starts by receiving a user request.
- It constructs a system prompt with user input and continuously interacts with the model in a loop until the desired output (an answer) is obtained.
- If the model suggests an action, the agent executes the tool corresponding to that action, processes the result, and continues.
- The loop concludes once the agent successfully generates a final answer.
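
One step of this workflow, deciding whether a model response is an action, a final answer, or unusable, can be sketched on its own. This is an illustrative helper under stated assumptions: `parse_agent_step` is a hypothetical name and is not part of the agent class in this notebook.

```python
import json


def parse_agent_step(raw: str):
    """Classify a raw model response as ('action' | 'answer' | 'error', payload)."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        # Malformed JSON becomes feedback the model can see on the next turn
        return "error", f"Invalid JSON: {exc.msg}"
    if not isinstance(data, dict):
        return "error", "Expected a JSON object."
    if "answer" in data:
        return "answer", data["answer"]
    if "action" in data:
        return "action", (data["action"], data.get("action_input") or {})
    return "error", "Response has neither 'action' nor 'answer'."


print(parse_agent_step('{"action": "basic_calculator", "action_input": {"num1": 1}}'))
print(parse_agent_step('{"answer": "I have the answer: 20."}'))
print(parse_agent_step("not json"))
```

Separating parsing from execution makes each failure mode explicit, so the loop can feed a specific error message back to the model instead of a generic exception string.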

In [28]:
import json
from termcolor import colored
from langchain_core.messages.ai import AIMessage
from langchain_core.messages import SystemMessage
from state.agent_graph import AgentGraphState
from services.model_service import ModelService
from utils.general.helpers import get_current_utc_datetime
from utils.general.tools import get_tools_name, get_tools_description


class ReactAgent:

    def __init__(
        self,
        state: AgentGraphState,
        role: str,
        tools: list,
        ollama_service: ModelService,
    ):
        """
        Initialize the Agent with a state, role, and model configuration.
        """
        self.state = state
        self.role = role
        self.tools = tools
        self.ollama_service = ollama_service

    def invoke_model(self, sys_prompt: str, user_prompt: str):
        """
        Prepare the payload, send the request to the model, and process the response.
        """
        # Prepare the payload
        payload = self.ollama_service.prepare_payload(
            user_prompt,
            sys_prompt,
        )

        # Invoke the model and get the response
        response_json = self.ollama_service.request_model_generate(
            payload,
        )

        # Process the model's response
        response_content = self.ollama_service.process_model_response(response_json)

        # Return the processed response
        return response_content

    def write_react_prompt(
        self,
        user_prompt: str = "",
        agent_scratchpad: str = "",
    ) -> str:
        return DEFAULT_SYS_REACT_PROMPT.format(
            user_prompt=user_prompt,
            agent_scratchpad=agent_scratchpad,
            tools_name=get_tools_name(self.tools),
            tools_description=get_tools_description(self.tools),
            datetime=get_current_utc_datetime(),
        )

    # Function to format the scratchpad into a properly indented string
    def format_scratchpad(self, scratchpad):
        formatted_output = ""
        for entry in scratchpad:
            formatted_output += entry.strip() + "\n"
        return formatted_output

    def react(self, user_request: str) -> dict:
        """
        Execute the task based on the user's request by following the thought → action → observation loop.
        """

        answer = None

        # Start with the user's request as the first input
        user_prompt = (
            f"""<|start_header_id|>user<|end_header_id|>\n\n{user_request}<|eot_id|>"""
        )

        sys_prompt = self.write_react_prompt(user_prompt=user_prompt)
        # user_prompt = user_request
        tool_response = None
        action = None
        action_input = None
        scratchpad = []

        print(colored(user_prompt, "green"))

        # Loop until a final answer is generated
        while answer is None:
            # Invoke the model with the system prompt and current user input

            response = self.invoke_model(sys_prompt=sys_prompt, user_prompt=user_prompt)

            try:
                # Parse the response assuming it's in JSON format
                response_dict = json.loads(
                    response
                )  # Assuming response is a JSON object

                assistant_message = f"""<|start_header_id|>assistant<|end_header_id|>\n\n{response}<|eot_id|>"""

                print(colored(assistant_message, "cyan"))

                scratchpad.append(assistant_message)

                formatted_scratchpad = self.format_scratchpad(scratchpad)
                sys_prompt = self.write_react_prompt(
                    user_prompt=user_prompt, agent_scratchpad=formatted_scratchpad
                )

                action = response_dict.get("action", None)
                action_input = response_dict.get("action_input", None)

                # If there is an action, execute the corresponding tool
                if action:
                    status, tool_response = self.execute_tool(action, action_input)

                    # Formulate the observation to feed back into the model
                    tool_response_dict = {
                        "observation": tool_response,
                    }

                    tool_response_json = json.dumps(tool_response_dict, indent=4)

                    result_message = f"""<|start_header_id|>ipython<|end_header_id|>\n\n{tool_response_json}<|eot_id|>"""

                    print(colored(result_message, "yellow"))

                    user_prompt = tool_response_json

                # Check if the model has given an answer
                if "answer" in response_dict:
                    answer = response_dict["answer"]

            except Exception as e:
                print(str(e))
                system_message = f"""<|start_header_id|>ipython<|end_header_id|>\n\n{str(e)}<|eot_id|>"""
                scratchpad.append(system_message)
                formatted_scratchpad = self.format_scratchpad(scratchpad)
                sys_prompt = self.write_react_prompt(
                    user_prompt=user_prompt, agent_scratchpad=formatted_scratchpad
                )

        # Return the final answer
        return {
            "response": AIMessage(content=answer),
            "tool_response": SystemMessage(content=str(tool_response)),
        }

    def execute_tool(self, action: str, action_input: dict):
        """
        Look up the tool named by `action` and invoke it with `action_input`.
        Returns a (success, result) tuple so the caller can unpack it uniformly.
        """
        # Log the tool call in the model's tool-call format
        tool_message = f"""<|python_tag|>{action}.call({action_input})\n<|eom_id|>"""
        print(
            colored(
                tool_message,
                "magenta",
            )
        )

        for tool in self.tools:
            if tool.name == action:
                try:
                    result = tool.invoke(action_input)
                    result_message = f"""<|start_header_id|>ipython<|end_header_id|>\n\n{result}<|eot_id|>"""
                    print(colored(result_message, "magenta"))
                    return True, result
                except Exception as e:
                    return False, f"Error executing tool {action}: {str(e)}"

        # No tool matched: still return a tuple, since react() unpacks two values
        return False, f"Tool {action} not found or unsupported operation."

## 6. Creating and Compiling the Workflow Graph

In this step, we build the workflow graph that represents the agent's process. In this notebook the graph is deliberately small: a single node runs the ReAct agent, and edges connect it to the start and end of the workflow.

### Key Components:
- **Node Definitions**: Each node in the graph represents a step in the workflow, such as invoking the ReAct agent.

- **Edge Definitions**: Edges define the flow between nodes, determining the order in which the steps run.

- **Workflow Compilation**: Once the graph is defined, it is compiled into a workflow that can be executed.

By constructing and compiling this workflow graph, we ensure that the ReAct agent runs as a well-defined step whose inputs and outputs are tracked in the graph state.

In [29]:
def react_node_function(state: AgentGraphState):
    react_agent = ReactAgent(
        state=state,
        role="REACT_AGENT",
        tools=tools,
        ollama_service=ollama_service,
    )

    return react_agent.react(user_request=state["input"])

In [30]:
from langgraph.graph import StateGraph, START, END


def create_graph() -> StateGraph:
    """
    Create the state graph by defining nodes and edges.

    Returns:
    - StateGraph: The uncompiled state graph; call `.compile()` before running it.
    """
    graph = StateGraph(AgentGraphState)

    # Add nodes
    graph.add_node("react_agent", react_node_function)

    # Define the flow of the graph
    graph.add_edge(START, "react_agent")
    graph.add_edge("react_agent", END)

    return graph
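
To make the node-and-edge idea concrete without LangGraph, here is a plain Python sketch of a runner that follows edges from a start marker to an end marker and merges each node's output into the state. `run_graph`, `react_node`, and the marker strings are hypothetical. LangGraph's actual execution engine is considerably richer, so treat this purely as a conceptual illustration.

```python
START, END = "__start__", "__end__"


def run_graph(nodes, edges, state):
    """Follow edges from START to END, merging each node's output into state."""
    current = edges[START]
    while current != END:
        update = nodes[current](state)   # A node returns a partial state update
        state = {**state, **update}
        current = edges[current]
    return state


def react_node(state):
    # Stand-in for the ReAct agent node
    return {"response": f"echo: {state['input']}"}


nodes = {"react_agent": react_node}
edges = {START: "react_agent", "react_agent": END}

final_state = run_graph(nodes, edges, {"input": "What is 10+10?", "response": ""})
print(final_state)
```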

### Initializing Sqlite Persistence for Graph State

In this cell, we define and use a method to initialize an `SqliteSaver` instance from the `langgraph.checkpoint.sqlite` module. The `SqliteSaver` class allows the graph state to be persisted in an SQLite database, which is more durable and suitable for applications requiring longer-term storage compared to an in-memory solution.

The `from_conn_stringx` method is defined as a class method that takes a connection string as input, creates a connection to the SQLite database using `sqlite3.connect`, and then returns an `SqliteSaver` instance using this connection. This method simplifies the creation of an `SqliteSaver` instance directly from a connection string.

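This approach is particularly useful for ensuring that the state of the `StateGraph` is saved to a local or memory-based SQLite database, enabling the retention of context across multiple interactions in AI-driven applications. In this notebook we pass `:memory:`, so the database lives only as long as the Python process; pass a file path instead to keep state across restarts.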

In [31]:
from langgraph.checkpoint.sqlite import SqliteSaver
import sqlite3


def from_conn_stringx(
    cls,
    conn_string: str,
) -> "SqliteSaver":
    return SqliteSaver(conn=sqlite3.connect(conn_string, check_same_thread=False))


SqliteSaver.from_conn_stringx = classmethod(from_conn_stringx)

memory = SqliteSaver.from_conn_stringx(":memory:")
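
The underlying idea, storing state per conversation thread in SQLite, can be sketched with the standard library alone. The table layout and the `save_state` and `load_state` helpers here are hypothetical and do not reflect the schema `SqliteSaver` actually uses.

```python
import json
import sqlite3

# Same connection style as the cell above; ":memory:" keeps data only for this process
conn = sqlite3.connect(":memory:", check_same_thread=False)
conn.execute(
    "CREATE TABLE checkpoints (thread_id TEXT PRIMARY KEY, state_json TEXT NOT NULL)"
)


def save_state(thread_id: str, state: dict) -> None:
    """Store the latest state for a thread, replacing any earlier one."""
    conn.execute(
        "INSERT OR REPLACE INTO checkpoints (thread_id, state_json) VALUES (?, ?)",
        (thread_id, json.dumps(state)),
    )
    conn.commit()


def load_state(thread_id: str):
    """Return the saved state for a thread, or None if nothing was saved."""
    row = conn.execute(
        "SELECT state_json FROM checkpoints WHERE thread_id = ?", (thread_id,)
    ).fetchone()
    return json.loads(row[0]) if row else None


save_state("1", {"input": "What is 10+10?", "response": ["I have the answer: 20."]})
restored = load_state("1")
print(restored)
```

Keying rows by a thread identifier mirrors the role of `thread_id` in the workflow configuration: each conversation keeps its own saved state.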

## 7. Executing the Workflow

With the workflow graph defined, the final step is to compile and execute it. This involves providing the agent with an input query, here a simple arithmetic question, and allowing the workflow to run through its defined sequence.

### Creating and Running the Workflow

In this step, we create and execute the ReAct agent's workflow to process a query.

- **Graph Creation**: 
  - We first create the workflow graph using `create_graph()` and compile it with the SQLite checkpointer created earlier (backed by an in-memory database).
  - The compiled workflow runs the ReAct agent node and records the resulting state.

- **Workflow Parameters**:
  - We define the number of iterations (`iterations = 10`), set verbose mode to `True`, and configure the thread ID.
  - A single arithmetic query (`"What is 10+10?"`) is provided as input.

- **Workflow Execution**:
  - The workflow is executed using `workflow.stream()`, which yields an event each time a node finishes.
  - In verbose mode, each event is printed to track progress.

This step runs the agent on the query and prints the state changes for each event in the workflow.

In [32]:
# Create the graph and compile the workflow
graph = create_graph()
workflow = graph.compile(checkpointer=memory)
print("Graph and workflow created.")

# Define workflow parameters
iterations = 10
verbose = True
# thread_id keys the checkpoint; recursion_limit caps the number of graph steps
config = {"configurable": {"thread_id": "1"}, "recursion_limit": iterations}

query = "What is 10+10?"
dict_inputs = {"input": query}

# Execute the workflow and print state changes
for event in workflow.stream(dict_inputs, config):
    if verbose:
        print("\nEvent:", event)
    else:
        print("\n")

Graph and workflow created.
<|start_header_id|>user<|end_header_id|>

What is 10+10?<|eot_id|>
<|start_header_id|>assistant<|end_header_id|>

{
    "thought": "To calculate the result of 10+10, I need to use a tool that can perform addition.",
    "action": "basic_calculator",
    "action_input": {
        "num1": 10,
        "num2": 10,
        "operation": "add"
    }
}<|eot_id|>
<|python_tag|>basic_calculator.call({'num1': 10, 'num2': 10, 'operation': 'add'})
<|eom_id|>
<|start_header_id|>ipython<|end_header_id|>

The answer is: 20.
Calculated with basic_calculator.<|eot_id|>
<|start_header_id|>ipython<|end_header_id|>

{
    "observation": "The answer is: 20.\nCalculated with basic_calculator."
}<|eot_id|>
<|start_header_id|>assistant<|end_header_id|>

{
    "answer": "I have the answer: 20."
}<|eot_id|>

Event: {'react_agent': {'response': AIMessage(content='I have the answer: 20.'), 'tool_response': SystemMessage(content='The 