# 2. Understanding ReAct Prompting in Python

In this notebook, we will explore the ReAct (Reasoning + Acting) framework, a powerful approach for enhancing the capabilities of large language models (LLMs). ReAct allows models to not only reason through tasks step by step but also to take actions and observe the results, creating a more dynamic problem-solving process.

By the end of this tutorial, you'll have a solid understanding of how the ReAct framework works, how it integrates reasoning with actions, and how to set up a simple workflow to implement this approach in Python.

## What is ReAct Prompting?

**ReAct prompting** is a technique that combines reasoning and acting within large language models (LLMs) to solve tasks more effectively. By prompting the model with **task-solving trajectories**, ReAct enables the model to think through a problem step by step while simultaneously taking actions, such as retrieving information or using external tools. This synergy between reasoning and acting allows the model to both plan and execute in a flexible manner.

ReAct prompting is particularly effective because it integrates two important processes:
- **Reasoning**: The model thinks through the task by generating reasoning traces, helping break down complex queries into manageable steps.
- **Acting**: The model performs actions, such as interacting with external tools, APIs, or databases, to gather real-time information that informs its reasoning.

For more information, you can read the original paper on ReAct prompting: [ReAct: Synergizing Reasoning and Acting in Language Models](https://react-lm.github.io/).

### How Does ReAct Prompting Work?

ReAct prompting follows a **Thought → Action → Observation** loop. Here's how it works:
- **Thought**: The model reasons about the task and decides what action to take next.
- **Action**: The model executes the chosen action, such as querying an API, calculating a result, or retrieving information.
- **Observation**: The model observes the result of the action, updates its internal reasoning, and decides if further actions are needed.

This loop continues until the model gathers enough information to provide a complete answer or determines that no further actions are required.
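
The loop can be sketched without any real model. In the minimal example below, `scripted_model`, `TOOLS`, and `run_loop` are hypothetical names invented for illustration: a scripted function stands in for the LLM, and `max_steps` is an added safety limit that is not part of the description above.

```python
import json

# Hypothetical tool registry used only in this sketch
TOOLS = {"add": lambda a, b: a + b}


def scripted_model(transcript: list[str]) -> str:
    """Stand-in for an LLM: chooses a reply from what the transcript already holds."""
    observations = [line for line in transcript if line.startswith("Observation: ")]
    if not observations:
        # No tool result yet, so request an action
        return json.dumps(
            {
                "thought": "I should add the two numbers.",
                "action": "add",
                "action_input": {"a": 2, "b": 3},
            }
        )
    # A tool result is available, so finish with an answer
    result = observations[-1][len("Observation: "):]
    return json.dumps({"thought": "The tool gave me the sum.", "answer": result})


def run_loop(question: str, max_steps: int = 5) -> str:
    transcript = [f"Question: {question}"]
    for _ in range(max_steps):
        step = json.loads(scripted_model(transcript))  # Thought
        transcript.append(f"Thought: {step['thought']}")
        if "answer" in step:
            return step["answer"]
        result = TOOLS[step["action"]](**step["action_input"])  # Action
        transcript.append(f"Observation: {result}")  # Observation
    raise RuntimeError("No answer within max_steps")


print(run_loop("What is 2 + 3?"))  # prints 5
```

Each pass through the loop adds one thought and, when a tool is called, one observation to the transcript, so the next model call sees everything that came before it.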

### Why Use ReAct Prompting?

Interleaving reasoning with actions addresses limitations of using either one alone. The ReAct paper reports improvements over reasoning-only and acting-only baselines on question-answering and decision-making benchmarks. In particular, ReAct helps with:
- **Misinformation**: Reasoning alone (e.g., chain-of-thought) relies on the model's internal knowledge and can produce confident errors. Grounding each step in the results of external actions reduces that risk.
- **Lack of Synthesis**: Acting alone without reasoning can result in incomplete or incoherent solutions. ReAct allows for better synthesis of final answers by combining both reasoning and actions.

For example, in the context of our notebook, we'll use ReAct prompting to instruct an agent to use tools like a `basic_calculator` to perform mathematical operations. The agent reasons about when to use the tool, performs the operation, and observes the result to determine if more actions are needed before completing the task.

![image.png](images/react-diagram.png)

## 1. Setup

Before we begin, let's make sure your environment is set up correctly. We'll start by installing the necessary Python packages.

### Installing Required Packages

This notebook needs one extra Python package, `termcolor`. Run the following command to install it:

In [293]:
%pip install termcolor

Note: you may need to restart the kernel to use updated packages.


This package is used for:

- **termcolor:** Adding colored text output to the terminal, which is useful for debugging and improving the readability of logs and outputs.

### Datetime Function

We'll create a simple function to get the current time. This is important because our agent might need to timestamp certain actions or events. Let's write a function that returns the current date and time in UTC:

In [294]:
from datetime import datetime, timezone


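def get_current_utc_datetime():
    """Return the current UTC time with millisecond precision."""
    now_utc = datetime.now(timezone.utc)
    # Trim microseconds to milliseconds first, then append the zone label;
    # slicing after appending "UTC" would cut off the label instead.
    return now_utc.strftime("%Y-%m-%d %H:%M:%S.%f")[:-3] + " UTC"


# Example usage:
print("Current UTC datetime:", get_current_utc_datetime())

Current UTC datetime: 2024-09-11 19:41:43.928 UTC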


## 2. Configuring a Simple Model

In this section, we configure the machine learning model that we will use to process tasks. The `ModelService` class manages the interaction with the model (in this case, "llama3.1:8b-instruct-fp16").

### Model Configuration

We initialize `ModelService` with a model configuration. Here only the model name is passed; other settings, such as the model endpoint and temperature (which controls randomness), are presumably left at the service's defaults. This step lets us run model-based tasks with that configuration.

In [295]:
from services.model_service import ModelService

# Initialize the service with the model configuration
ollama_service = ModelService(model="llama3.1:8b-instruct-fp16")

### Invoking the Model

The `invoke_model` function handles the communication between the system and the language model. It takes two inputs: `sys_prompt` (system prompt) and `user_prompt` (the user's input), and performs the following steps:

1. **Prepare the Payload**: Combines the prompts into a format the model can understand.
2. **Send the Request**: Sends the prepared data to the model.
3. **Process the Response**: Once the model returns a response, it processes the output into a usable format.

The function returns the final processed response, which will be used to guide further actions.


In [296]:
def invoke_model(sys_prompt: str, user_prompt: str):
    """
    Prepare the payload, send the request to the model, and process the response.
    """
    # Prepare the payload
    payload = ollama_service.prepare_payload(
        user_prompt,
        sys_prompt,
    )

    # Invoke the model and get the response
    response_json = ollama_service.request_model_generate(
        payload,
    )

    # Process the model's response
    response_content = ollama_service.process_model_response(response_json)

    # Return the processed response
    return response_content
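
The `ModelService` source is not shown in this notebook, so the following is only a guess at what its three methods might do, assuming the model is served by a local Ollama instance through its `/api/generate` endpoint. The URL, the default temperature, and the function bodies are assumptions; only the method names come from the cell above.

```python
import json
import urllib.request

# Assumption: Ollama running locally on its default port
OLLAMA_URL = "http://localhost:11434/api/generate"


def prepare_payload(
    user_prompt: str,
    sys_prompt: str,
    model: str = "llama3.1:8b-instruct-fp16",
    temperature: float = 0.0,
) -> dict:
    """Combine the prompts into the JSON body expected by /api/generate."""
    return {
        "model": model,
        "prompt": user_prompt,
        "system": sys_prompt,
        "stream": False,  # ask for one complete response instead of chunks
        "options": {"temperature": temperature},
    }


def request_model_generate(payload: dict, url: str = OLLAMA_URL) -> dict:
    """POST the payload and decode the JSON reply (needs a running server)."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def process_model_response(response_json: dict) -> str:
    """Extract the generated text from the reply."""
    return response_json.get("response", "").strip()
```

Keeping the three steps as separate functions makes the payload and the response parsing easy to check without a running model server.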

## 3. Custom Tools

In this section, we'll show how to integrate custom tools into the workflow. These tools enable the model to perform specific actions when required by the task. We'll begin with a simple calculator tool that can handle basic arithmetic operations.

### Basic Calculator Tool

The `basic_calculator` tool can perform fundamental operations such as addition, subtraction, multiplication, and division. It takes two numbers and an operation as inputs and returns the result.

#### Supported Operations:
- `add`: Adds two numbers.
- `subtract`: Subtracts one number from another.
- `multiply`: Multiplies two numbers.
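- `divide`: Divides one number by another (division by zero is reported as an error message).
- `floor_divide`: Divides one number by another and rounds the result down to the nearest whole number.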
- `modulus`: Returns the remainder of a division.
- `power`: Raises one number to the power of another.
- Comparison operators: `lt` (less than), `le` (less than or equal to), `eq` (equal), `ne` (not equal), `ge` (greater than or equal to), `gt` (greater than).

The model will decide when to use this tool during its reasoning process and provide input in a structured format. Let's take a look at how it's implemented:

In [297]:
import operator
from langchain.tools import tool


@tool(parse_docstring=True)
def basic_calculator(num1, num2, operation):
    """
    Perform a numeric operation on two numbers based on the input string.

    Parameters:
    'num1' (int): The first number.
    'num2' (int): The second number.
    'operation' (str): The operation to perform. Supported operations are 'add', 'subtract',
                        'multiply', 'divide', 'floor_divide', 'modulus', 'power', 'lt',
                        'le', 'eq', 'ne', 'ge', 'gt'.

    Returns:
    str: The formatted result of the operation.

    Raises:
    Exception: If an error occurs during the operation (e.g., division by zero).
    ValueError: If an unsupported operation is requested or input is invalid.
    """

    # Define the supported operations
    operations = {
        "add": operator.add,
        "subtract": operator.sub,
        "multiply": operator.mul,
        "divide": operator.truediv,
        "floor_divide": operator.floordiv,
        "modulus": operator.mod,
        "power": operator.pow,
        "lt": operator.lt,
        "le": operator.le,
        "eq": operator.eq,
        "ne": operator.ne,
        "ge": operator.ge,
        "gt": operator.gt,
    }

    # Check if the operation is supported
    if operation in operations:
        try:
            # Perform the operation
            result = operations[operation](num1, num2)
            result_formatted = (
                f"The answer is: {result}.\nCalculated with basic_calculator."
            )
            return result_formatted
        except Exception as e:
            # Return a single string (not a tuple) so the result type is consistent
            return f"{e}\n\nError during operation execution."
    else:
        return "\n\nUnsupported operation. Please provide a valid operation."
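
To see the dispatch-table idea in isolation, here is a small sketch that does not depend on LangChain. `OPERATIONS` and `calculate` are hypothetical names for this example, and only three of the operations are included.

```python
import operator

# Map operation names to functions from the standard operator module
OPERATIONS = {
    "divide": operator.truediv,
    "floor_divide": operator.floordiv,
    "subtract": operator.sub,
}


def calculate(num1, num2, operation):
    if operation not in OPERATIONS:
        return "Unsupported operation."
    try:
        return f"The answer is: {OPERATIONS[operation](num1, num2)}."
    except ZeroDivisionError as error:
        # Report the problem as text so the caller always receives a string
        return f"Error: {error}"


print(calculate(15, 3, "divide"))        # The answer is: 5.0.
print(calculate(15, 3, "floor_divide"))  # The answer is: 5.
print(calculate(1, 0, "divide"))         # an error message, not an exception
```

Note that `divide` uses true division, so it returns a float even when the inputs divide evenly.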

In [298]:
tools = [basic_calculator]

In [299]:
from termcolor import colored


def execute_tool(action: str, action_input: dict):
    """
    Look up the tool named by `action` and invoke it with `action_input`.
    Returns a (success, result) tuple in every case.
    """
    # Echo the tool call in the Llama 3.1 tool-call format
    tool_message = f"""<|python_tag|>{action}.call({action_input})\n<|eom_id|>"""
    print(
        colored(
            tool_message,
            "magenta",
        )
    )

    for tool in tools:
        if tool.name == action:
            try:
                result = tool.invoke(action_input)
                result_message = f"""<|start_header_id|>ipython<|end_header_id|>\n\n{result}<|eot_id|>"""
                print(colored(result_message, "magenta"))
                return True, result
            except Exception as e:
                return False, f"Error executing tool {action}: {str(e)}"

    # No tool matched: keep the (success, result) shape so callers can unpack it
    return False, f"Tool {action} not found or unsupported operation."
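
The lookup-and-invoke pattern can also be illustrated with plain functions. In this sketch, `REGISTRY`, `add`, and `dispatch` are hypothetical names; the point is that every path returns the same `(success, result)` shape, so the caller can always unpack two values.

```python
from typing import Callable


def add(a: float, b: float) -> float:
    return a + b


# Hypothetical registry mapping tool names to callables
REGISTRY: dict[str, Callable[..., object]] = {"add": add}


def dispatch(action: str, action_input: dict) -> tuple[bool, str]:
    tool = REGISTRY.get(action)
    if tool is None:
        return False, f"Tool {action} not found."
    try:
        return True, str(tool(**action_input))
    except Exception as error:
        # Covers bad or missing arguments as well as errors inside the tool
        return False, f"Error executing tool {action}: {error}"


print(dispatch("add", {"a": 1, "b": 2}))  # (True, '3')
print(dispatch("multiply", {"a": 1, "b": 2}))  # (False, 'Tool multiply not found.')
```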

## 4. ReAct System Prompt

The system prompt provides instructions that guide the model in reasoning through tasks, using tools, and generating structured responses. It defines the context, including the environment (e.g., ipython) and knowledge cut-off date (December 2023), helping the model understand the scope of its information.

The model follows a structured process to solve problems by deciding which tools to use and when. Each interaction is formatted in JSON for clear communication.

The model operates in a cycle of **thought → action → observation**:
- **Thought**: The model reasons about the task and determines what to do next.
- **Action**: The model selects and uses the appropriate tool.
- **Observation**: The model analyzes the tool's result and decides the next step.

This loop continues until the model can provide a final result. If a tool gives a clear answer, the model stops further actions and presents the outcome. If the task cannot be completed, the model will explain the limitation.

The system prompt ensures that the model behaves logically, uses tools efficiently, and delivers clear, structured responses.
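
Because every model reply is expected to be a JSON object holding either an action or an answer, it can help to validate that shape before acting on it. The sketch below is one possible way to do so; `Step` and `parse_step` are hypothetical names, while the field names mirror the output format described in this section.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class Step:
    thought: str = ""
    action: Optional[str] = None
    action_input: Optional[dict] = None
    answer: Optional[str] = None


def parse_step(raw: str) -> Step:
    """Parse one model reply and check that it contains an action or an answer."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("Expected a JSON object")
    if "action" not in data and "answer" not in data:
        raise ValueError("Reply needs an 'action' or an 'answer'")
    return Step(
        thought=data.get("thought", ""),
        action=data.get("action"),
        action_input=data.get("action_input"),
        answer=data.get("answer"),
    )


step = parse_step('{"thought": "Divide first.", "action": "basic_calculator", '
                  '"action_input": {"num1": 15, "num2": 3, "operation": "divide"}}')
print(step.action)  # basic_calculator
```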

For more information on Llama 3.1, refer to the [Llama 3.1 Model Card](https://llama.meta.com/docs/model-cards-and-prompt-formats/llama3_1/).

![React System Prompt](images/react_system_prompt.png)

In [300]:
DEFAULT_SYS_REACT_PROMPT = """
<|begin_of_text|><|start_header_id|>system<|end_header_id|>

Environment: ipython
Tools: {tools_name} 
Knowledge Cutoff Date: December 2023  
Current Date: {datetime}

You are an intelligent assistant designed to handle various tasks, including answering questions, providing summaries, and performing detailed analyses. All outputs must strictly be in JSON format.

---

## Tools
You have access to a variety of tools to assist in completing tasks. You are responsible for determining the appropriate sequence of tool usage to break down complex tasks into subtasks when necessary.

The available tools include:

{tools_description}

---

## Output Format:
To complete the task, please use the following format:

{{
  "thought": "Describe your thought process here, including why a tool may be necessary to proceed.",
  "action": "Specify the tool you want to use.",
  "action_input": {{ # Provide valid JSON input for the action, ensuring it matches the tool’s expected format and data types.
    "key": "Value inputs to the tool in valid JSON format."
  }}
}}

After performing an action, the tool will provide a response in the following format:

{{
  "observation": "The result of the tool invocation"
}}

You should keep repeating the format (thought → action → observation) until you have the answer to the original question. 

If the tool result is successful and the task is complete:

{{
  "answer": "I have the answer: {{tool_result}}."
}}


Or, if you cannot answer:

{{
  "answer": "Sorry, I cannot answer your query."
}}

---

### Remember:
- **If a tool provides a complete and clear answer, do not continue invoking further tools.**
- Use the tools effectively and ensure inputs match the required format exactly as described in the task.
- Maintain the JSON format and ensure all fields are filled out correctly.
- Do not include additional metadata such as `title`, `description`, or `type` in the `action_input`.

<|eot_id|>
{first_user_prompt}
{history}
"""

In [301]:
def write_sys_prompt(
    first_user_prompt: str = "",
    history: str = "",
) -> str:
    return DEFAULT_SYS_REACT_PROMPT.format(
        first_user_prompt=first_user_prompt,
        history=history,
        tools_name=get_tools_name(tools),
        tools_description=get_tools_description(tools),
        datetime=get_current_utc_datetime(),
    )
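
The prompt template mixes `str.format` placeholders such as `{tools_name}` with literal JSON braces, which is why the JSON examples in it are written with doubled braces. A short illustration, using a made-up `TEMPLATE` rather than the full prompt:

```python
# Single braces mark placeholders; doubled braces become literal braces
TEMPLATE = """Tools: {tools_name}
Reply as:
{{
  "action": "tool name",
  "action_input": {{"key": "value"}}
}}"""

rendered = TEMPLATE.format(tools_name="basic_calculator")
print(rendered)
```

After formatting, the placeholder is filled in and each `{{` or `}}` collapses to a single brace, so the model sees ordinary JSON.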

## 5. ReAct Loop Implementation

The `react` function follows a Thought → Action → Observation loop to process user requests and generate responses using tools.

### Key Steps:
1. **Prompt Initialization**: The user's request is formatted into a `user_prompt` and a `sys_prompt` to guide the loop.
2. **Model Invocation**: The model is called to reason about the task, and the response is parsed.
3. **Action Execution**: If an action (e.g., using a tool) is required, the action is performed and the result is observed.
4. **History Tracking**: Each step (prompts, responses, and tool outputs) is recorded in a `history` list.
5. **Final Output**: The loop continues until a final answer is reached, which is returned along with the history.

This structure ensures the model handles tasks iteratively, using tools when needed.

In [302]:
import json
from termcolor import colored
from langchain_core.messages.ai import AIMessage
from langchain_core.messages import SystemMessage
from langchain_core.messages import HumanMessage
from utils.general.tools import get_tools_name, get_tools_description

# Function to format the scratchpad into a properly indented string
def format_history(history):
    formatted_output = ""
    for entry in history:
        formatted_output += entry.strip() + "\n"
    return formatted_output

def react(user_request: str) -> tuple[str, list[str]]:
    """
    Execute the task based on the user's request by following the thought → action → observation loop.
    """

    answer = None

    # Start with the user's request as the first input
    first_user_prompt = (
        f"""<|start_header_id|>user<|end_header_id|>\n\n{user_request}<|eot_id|>"""
    )

    sys_prompt = write_sys_prompt(first_user_prompt=first_user_prompt)

    # user_prompt = user_request
    tool_response = None
    action = None
    action_input = None
    history = []

    print(colored(first_user_prompt, "green"))

    user_prompt = first_user_prompt

    # Loop until a final answer is generated
    while answer is None:
        # Invoke the model with the system prompt and current user input

        response = invoke_model(sys_prompt=sys_prompt, user_prompt=user_prompt)

        try:
            # Parse the response assuming it's in JSON format
            response_dict = json.loads(
                response
            )  # Assuming response is a JSON object

            assistant_message = f"""<|start_header_id|>assistant<|end_header_id|>\n\n{response}<|eot_id|>"""

            print(colored(assistant_message, "cyan"))

            history.append(assistant_message)

            formatted_history = format_history(history)
            sys_prompt = write_sys_prompt(
                first_user_prompt=first_user_prompt, history=formatted_history
            )

            action = response_dict.get("action", None)
            action_input = response_dict.get("action_input", None)

            # If there is an action, execute the corresponding tool
            if action:
                tool_message = (
                    f"""<|python_tag|>{action}.call({action_input})\n<|eom_id|>"""
                )
                history.append(tool_message)
                status, tool_response = execute_tool(action, action_input)

                # Formulate the observation to feed back into the model
                tool_response_dict = {
                    "observation": tool_response,
                }

                tool_response_json = json.dumps(tool_response_dict, indent=4)

                result_message = f"""<|start_header_id|>ipython<|end_header_id|>\n\n{tool_response_json}<|eot_id|>"""

                print(colored(result_message, "yellow"))

                user_prompt = tool_response_json
                history.append(result_message)

            # Check if the model has given an answer
            if "answer" in response_dict:
                answer = response_dict["answer"]

        except Exception as e:
            print(str(e))
            system_message = f"""<|start_header_id|>ipython<|end_header_id|>\n\n{str(e)}<|eot_id|>"""
            history.append(system_message)
            formatted_history = format_history(history)
            sys_prompt = write_sys_prompt(
                first_user_prompt=first_user_prompt, history=formatted_history
            )

    # Return the final answer
    return answer, history
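
The loop above builds every history entry by hand with the same header pattern. As a sketch of how that pattern could be factored out, the hypothetical helper `wrap_message` below wraps content in the role headers used throughout this notebook (`user`, `assistant`, `ipython`), and `join_history` is a compact variant of `format_history`.

```python
def wrap_message(role: str, content: str) -> str:
    """Wrap content in the header and end-of-turn tokens used in this notebook."""
    return f"<|start_header_id|>{role}<|end_header_id|>\n\n{content}<|eot_id|>"


def join_history(history: list[str]) -> str:
    """Join stripped entries, one per line (no trailing newline)."""
    return "\n".join(entry.strip() for entry in history)


history = [
    wrap_message("assistant", '{"thought": "Divide first."}'),
    wrap_message("ipython", '{"observation": "The answer is: 5.0."}'),
]
print(join_history(history))
```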

## 6. Testing the ReAct Workflow

Now that we've set up the system, tools, and model, it's time to test the ReAct prompting loop with the `basic_calculator` tool. This will show how the agent reasons through tasks by utilizing thought-action-observation steps.

In this test, we will ask the agent to solve a simple arithmetic problem using the calculator tool. Let's see how the ReAct framework works in practice.

In [303]:
user_input = "Please calculate 15 divided by 3, then subtract 2."

final_answer, history = react(user_request=user_input)

<|start_header_id|>user<|end_header_id|>

Please calculate 15 divided by 3, then subtract 2.<|eot_id|>
<|start_header_id|>assistant<|end_header_id|>

{
    "thought": "To solve this problem, we first need to divide 15 by 3 and store the result.",
    "action": "basic_calculator",
    "action_input": {
        "num1": 15,
        "num2": 3,
        "operation": "divide"
    }
}<|eot_id|>
<|python_tag|>basic_calculator.call({'num1': 15, 'num2': 3, 'operation': 'divide'})
<|eom_id|>
<|start_header_id|>ipython<|end_header_id|>

The answer is: 5.0.
Calculated with basic_calculator.<|eot_id|>
<|start_header_id|>ipython<|end_header_id|>

{
    "observation": "The answer is: 5.0.\nCalculated with basic_calculator."
}<|eot_id|>
<|start_header_id|>assistant<|end_header_id|>

{
    "thought": "Now, we need to subtract 2 from the result of the division.",
    "action": "basic_calculator",
    "action_input": {
        "num1": 5,
        "num2": 2,
