We will modify ReAct slightly to take advantage of the progress made in LLMs since the paper was originally released. We make two modifications:

1. We use chat models. When ReAct was released, LLMs were not fine-tuned specifically for chat and were instead prompted to generate chat-like dialogues. Most SotA LLMs nowadays are built specifically for chat, so the input to them must be modified to be chat-model friendly.

2. We use JSON mode to force structured output from our LLM. The original ReAct method simply instructed the LLM to output everything in a particular format. That works but occasionally breaks. By forcing JSON output we reduce the likelihood of poorly structured output. To accommodate this, we modify the instructions to ask for `thought` and `action` steps in JSON format.
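
As a rough illustration of what the structured output buys us, the sketch below parses a model response and checks that it has the `thought` and `action` shape described above. The helper name `parse_step` is hypothetical and not part of this notebook's agent; it uses only Python's standard `json` module.

```python
import json


def parse_step(raw: str) -> dict:
    """Parse a model response and check it has the expected ReAct shape.

    `parse_step` is a hypothetical helper for illustration only.
    Raises ValueError if the response is not valid JSON or is missing keys.
    """
    try:
        step = json.loads(raw)
    except json.JSONDecodeError as e:
        raise ValueError(f"response is not valid JSON: {e}") from e
    if not isinstance(step, dict):
        raise ValueError("response must be a JSON object")
    if not isinstance(step.get("thought"), str):
        raise ValueError("missing or non-string `thought`")
    action = step.get("action")
    if not isinstance(action, dict):
        raise ValueError("missing or non-object `action`")
    if not isinstance(action.get("tool"), str):
        raise ValueError("`action` needs a string `tool`")
    if not isinstance(action.get("args"), dict):
        raise ValueError("`action` needs an object `args`")
    return step


raw = (
    '{"thought": "I need to find out the current temperature in Tokyo", '
    '"action": {"tool": "search", "args": {"query": "current temperature in Tokyo"}}}'
)
step = parse_step(raw)
print(step["action"]["tool"], step["action"]["args"])
```

With free-form text output we would need a custom parser for every format variation; with JSON, a few structural checks like these are enough to decide whether a step is usable.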

In [1]:
system_prompt = """
You are a helpful assistant. Given a user query you must provide a `thought` and
`action` step that take one step towards solving the user's query. Both the
`thought` and `action` steps will be contained in JSON output.

The `thought` is the first key mapping to your reasoning on how to solve the
user's query.

The `action` step is the second key mapping that describes how you wish to use
the chosen tool to solve the user's query. It contains a `tool` key mapping to
the name of the tool to use and an `args` key containing a JSON object of
arguments to pass to the tool.

Here is an example:

user: What is the weather in Tokyo?
assistant: {
  "thought": "I need to find out the current temperature in Tokyo",
  "action": {"tool": "search", "args": {"query": "current temperature in Tokyo"}}
}

If you have performed any previous thought and action steps, you will find them
below under the "Previous Steps" section. Alongside these you will find an
`observation` key containing the output of those previous actions.
"""

We haven't defined any tools or agent logic yet, but let's see what type of output
our LLM produces if we prompt it with this system prompt.

Make sure you have Llama 3.2 downloaded already via:

```
ollama run llama3.2:3b-instruct-fp16
```

In [2]:
import ollama

res = ollama.chat(
    model="llama3.2:3b-instruct-fp16",
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": "What is the date today?"},
    ],
    format="json",
)

print(res["message"]["content"])

{
  "thought": "I need to determine the current date",
  "action": {
    "tool": "calendar.today()",
    "args": {}
  }
}


Great, the output is in the correct format. *However*, the model has invented a `calendar.today()` tool that doesn't exist. In fact, we don't have *any* tools yet! Let's define some.

First, we'll define a search tool. This tool will allow our agent to search the web for information. To implement it we will use the Tavily API. It comes with a number of free requests, but we do need to [sign up for the API](https://app.tavily.com/home) and get an API key to use it.

In [3]:
import requests

TAVILY_API_KEY = "tvly-..."

tavily_url = "https://api.tavily.com"

res = requests.post(
    f"{tavily_url}/search",
    json={
        "api_key": TAVILY_API_KEY,
        "query": "What is the weather in Tokyo?"
    },
)

res.json()

{'query': 'What is the weather in Tokyo?',
 'follow_up_questions': None,
 'answer': None,
 'images': [],
 'results': [{'title': 'Weather in Tokyo',
   'url': 'https://www.weatherapi.com/',
   'content': "{'location': {'name': 'Tokyo', 'region': 'Tokyo', 'country': 'Japan', 'lat': 35.6895, 'lon': 139.6917, 'tz_id': 'Asia/Tokyo', 'localtime_epoch': 1730666031, 'localtime': '2024-11-04 05:33'}, 'current': {'last_updated_epoch': 1730665800, 'last_updated': '2024-11-04 05:30', 'temp_c': 15.2, 'temp_f': 59.4, 'is_day': 0, 'condition': {'text': 'Clear', 'icon': '//cdn.weatherapi.com/weather/64x64/night/113.png', 'code': 1000}, 'wind_mph': 2.2, 'wind_kph': 3.6, 'wind_degree': 146, 'wind_dir': 'SSE', 'pressure_mb': 1023.0, 'pressure_in': 30.21, 'precip_mm': 0.0, 'precip_in': 0.0, 'humidity': 72, 'cloud': 0, 'feelslike_c': 15.2, 'feelslike_f': 59.4, 'windchill_c': 16.6, 'windchill_f': 61.9, 'heatindex_c': 16.6, 'heatindex_f': 61.9, 'dewpoint_c': 9.1, 'dewpoint_f': 48.4, 'vis_km': 10.0, 'vis_mile

As you can see, the search results alone don't give us much beyond the URLs of pages that *do contain* the information we need. Fortunately, we can extract the content of those pages via Tavily's `/extract` endpoint.

In [4]:
urls = [x["url"] for x in res.json()["results"]]

res = requests.post(
    f"{tavily_url}/extract",
    json={
        "api_key": TAVILY_API_KEY,
        "urls": urls
    },
)

In [5]:
res.json()

{'results': [{'url': 'https://world-weather.info/forecast/japan/tokyo/march-2024/',
   'raw_content': "Weather in Tokyo in March 2024 (Tōkyō-to) - Detailed Weather Forecast for a Month\n\nAdd the current city\nSearch\n\nWeather\nArchive\nWidgets\n\n°F\n\nWorld\nJapan\nTōkyō-to\nWeather in Tokyo\n\nWeather in Tokyo in March 2024\nTokyo Weather Forecast for March 2024 is based on statistical data.\n20152016201720182019202020212022202320242025\nJanFebMarAprMayJunJulAugSepOctNovDec\nMarch\nStart Week On\nSunday\nMonday\n\nSun\nMon\nTue\nWed\nThu\nFri\n\nSat\n\n\n1 +57°+43°\n\n2 +48°+45°\n3 +54°+39°\n4 +55°+45°\n5 +46°+45°\n6 +48°+37°\n7 +50°+43°\n8 +46°+39°\n9 +52°+41°\n10 +52°+41°\n11 +54°+41°\n12 +50°+50°\n13 +54°+48°\n14 +57°+43°\n15 +59°+46°\n16 +63°+54°\n17 +66°+54°\n18 +54°+52°\n19 +54°+45°\n20 +52°+48°\n21 +50°+43°\n22 +54°+41°\n23 +50°+45°\n24 +57°+45°\n25 +54°+52°\n26 +52°+52°\n27 +57°+46°\n28 +57°+48°\n29 +66°+57°\n30 +68°+59°\n31 +73°+59°\n\nAverage weather in March 2024\n7 days

We can see that most of the extraction requests didn't work, but that's okay: we requested several URLs and fortunately received one result that looks like what we need. We can access that content like so:

In [6]:
print(res.json()["results"][0]["raw_content"])

Weather in Tokyo in March 2024 (Tōkyō-to) - Detailed Weather Forecast for a Month

Add the current city
Search

Weather
Archive
Widgets

°F

World
Japan
Tōkyō-to
Weather in Tokyo

Weather in Tokyo in March 2024
Tokyo Weather Forecast for March 2024 is based on statistical data.
20152016201720182019202020212022202320242025
JanFebMarAprMayJunJulAugSepOctNovDec
March
Start Week On
Sunday
Monday

Sun
Mon
Tue
Wed
Thu
Fri

Sat


1 +57°+43°

2 +48°+45°
3 +54°+39°
4 +55°+45°
5 +46°+45°
6 +48°+37°
7 +50°+43°
8 +46°+39°
9 +52°+41°
10 +52°+41°
11 +54°+41°
12 +50°+50°
13 +54°+48°
14 +57°+43°
15 +59°+46°
16 +63°+54°
17 +66°+54°
18 +54°+52°
19 +54°+45°
20 +52°+48°
21 +50°+43°
22 +54°+41°
23 +50°+45°
24 +57°+45°
25 +54°+52°
26 +52°+52°
27 +57°+46°
28 +57°+48°
29 +66°+57°
30 +68°+59°
31 +73°+59°

Average weather in March 2024
7 days
Precipitation
10 days
Cloudy
14 days
Sunny
Day
+55
°F
Night
+47
°F
Compare with another month
Extended weather forecast in Tokyo
HourlyWeek10-Day14-Day30-DayYear

Weather 

So this is how we use the Tavily API to search the web. Now let's implement this logic within a function that we can then use as a tool (i.e., an action) for our ReAct agent.

In [7]:
def search(query: str):
    """Use this tool to search the web for information."""
    # first we need to search the web for the query
    res = requests.post(
        f"{tavily_url}/search",
        json={
            "api_key": TAVILY_API_KEY,
            "query": query
        },
    )
    # now get all the URLs from the search results
    urls = [x["url"] for x in res.json()["results"]]
    # now extract the information from the URLs
    res = requests.post(
        f"{tavily_url}/extract",
        json={
            "api_key": TAVILY_API_KEY,
            "urls": urls
        },
    )
    # we return just the top result as otherwise we overload our LLM
    return res.json()["results"][0]["raw_content"]

Let's test our function:

In [8]:
print(search(query="What is the weather in Tokyo?"))

March 2024 Weather History in Tokyo Japan
The data for this report comes from the Tokyo International Airport. See all nearby weather stations
This report shows the past weather for Tokyo, providing a weather history for March 2024. It features all historical weather data series we have available, including the Tokyo temperature history for March 2024. You can drill down from year to month and even day level reports by clicking on the graphs.
Tokyo Temperature History March 2024
Hourly Temperature in March 2024 in Tokyo
Compare Tokyo to another city:
Cloud Cover in March 2024 in Tokyo
Observed Weather in March 2024 in Tokyo
Hours of Daylight and Twilight in March 2024 in Tokyo
Sunrise & Sunset with Twilight in March 2024 in Tokyo
Solar Elevation and Azimuth in March 2024 in Tokyo
Moon Rise, Set & Phases in March 2024 in Tokyo
Humidity Comfort Levels in March 2024 in Tokyo
Wind Speed in March 2024 in Tokyo
Hourly Wind Speed in March 2024 in Tokyo
Hourly Wind Direction in 2024 in Tokyo
A
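
The `search` function above indexes `results[0]` directly and returns the whole page, so it would raise an `IndexError` if no URL could be extracted and may return very long text. A minimal sketch of a more defensive selection step is shown below. The helper name `first_usable_content`, the `max_chars` budget, and the fallback message are assumptions made for illustration; the only response keys relied on are `results` and `raw_content`, as seen in the output earlier.

```python
def first_usable_content(extract_response: dict, max_chars: int = 2000) -> str:
    """Return the first non-empty `raw_content` from an /extract-style response.

    Hypothetical helper: it only assumes the `results` / `raw_content` keys
    visible in the output above, and `max_chars` is an arbitrary budget.
    """
    for result in extract_response.get("results", []):
        content = (result.get("raw_content") or "").strip()
        if content:
            # truncate so a single page cannot flood the LLM's context
            return content[:max_chars]
    return "No content could be extracted for this query."


# a made-up response used purely to exercise the helper
fake_response = {
    "results": [
        {"url": "https://example.com/a", "raw_content": ""},
        {"url": "https://example.com/b", "raw_content": "Tokyo is clear, 15C."},
    ]
}
print(first_usable_content(fake_response, max_chars=14))
print(first_usable_content({"results": []}))
```

Returning a plain message instead of raising lets the agent see the failure as an observation and try a different query.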

Great, now let's define two more tools...

We will create a simple "current date and time tool".

In [28]:
from datetime import datetime

def date():
    """Use this tool to get the current date and time."""
    now = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
    return f"The current date and time is {now}"

And, very importantly, we will define a tool that will be triggered when our LLM would like to provide its final answer to the user.

In [10]:
def answer(text: str):
    """Use this tool to provide your final answer to the user."""
    return text

Now we generate an additional section for our `system_prompt` that explains which tools are available to the LLM.

In [11]:
import inspect

# we get the various parameters/description from each tool function
tools = [search, date, answer]
tool_descriptions = [
    {
        "name": tool.__name__,
        "description": str(inspect.getdoc(tool)),
        "args": {
            k: str(v).split(": ")[1] for k, v in inspect.signature(tool).parameters.items()
        }
    }
    for tool in tools
]
tool_descriptions

[{'name': 'search',
  'description': 'Use this tool to search the web for information.',
  'args': {'query': 'str'}},
 {'name': 'date',
  'description': 'Use this tool to get the current date and time.',
  'args': {}},
 {'name': 'answer',
  'description': 'Use this tool to provide your final answer to the user.',
  'args': {'text': 'str'}}]
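
Note that `str(v).split(": ")[1]` relies on the printed form of each parameter: it raises an `IndexError` for a parameter without a type hint and includes the default value for a parameter like `limit: int = 5`. One possible alternative, sketched below, reads the annotation directly. The names `describe_tool` and `lookup`, and the `"any"` placeholder for unannotated parameters, are assumptions introduced for this example.

```python
import inspect


def describe_tool(tool) -> dict:
    """Build a tool description from a function's signature and docstring."""
    args = {}
    for name, param in inspect.signature(tool).parameters.items():
        annotation = param.annotation
        if annotation is inspect.Parameter.empty:
            args[name] = "any"  # no type hint given; "any" is our own placeholder
        else:
            # classes such as `str` have a __name__; fall back to str() otherwise
            args[name] = getattr(annotation, "__name__", str(annotation))
    return {
        "name": tool.__name__,
        "description": str(inspect.getdoc(tool)),
        "args": args,
    }


def lookup(query: str, limit: int = 5, raw=False):
    """A made-up tool used only to exercise describe_tool."""
    return query


print(describe_tool(lookup))
```

For the three tools in this notebook the result should match the output above, since they only use plain `str` annotations or take no arguments.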

Now let's parse these into text instructions that can be added to our `system_prompt`.

In [12]:
tool_instructions = (
    "You have access to the following tools ONLY, no other tools exist:\n\n"
    + "\n".join([str(x) for x in tool_descriptions])
)
print(tool_instructions)

You have access to the following tools ONLY, no other tools exist:

{'name': 'search', 'description': 'Use this tool to search the web for information.', 'args': {'query': 'str'}}
{'name': 'date', 'description': 'Use this tool to get the current date and time.', 'args': {}}
{'name': 'answer', 'description': 'Use this tool to provide your final answer to the user.', 'args': {'text': 'str'}}
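
The tool list above is rendered with `str()`, so it appears as Python dictionaries with single quotes. Since we ask the model to respond in JSON, one option is to render the tool descriptions as JSON too. Whether this changes model behavior is untested here; the sketch below only shows the formatting, and it redefines `tool_descriptions` by hand so that it runs on its own.

```python
import json

# copied from the output above so this block is self-contained
tool_descriptions = [
    {"name": "search",
     "description": "Use this tool to search the web for information.",
     "args": {"query": "str"}},
    {"name": "date",
     "description": "Use this tool to get the current date and time.",
     "args": {}},
    {"name": "answer",
     "description": "Use this tool to provide your final answer to the user.",
     "args": {"text": "str"}},
]

# one JSON object per line, in the same double-quoted style we ask the model to use
tool_instructions = (
    "You have access to the following tools ONLY, no other tools exist:\n\n"
    + "\n".join(json.dumps(x) for x in tool_descriptions)
)
print(tool_instructions)
```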


Now let's try calling our LLM again with these additional instructions.

In [13]:
res = ollama.chat(
    model="llama3.2:3b-instruct-fp16",
    messages=[
        {"role": "system", "content": f"{system_prompt}\n\n{tool_instructions}"},
        {"role": "user", "content": "What is Ollama in the context of AI?"},
    ],
    format="json",
    options={"temperature": 0.0}
)

step = res["message"]["content"]
print(step)

{
  "thought": "I need to find out what Ollama refers to in the context of AI",
  "action": {
    "tool": "search",
    "args": {"query": "Ollama AI"}
  }
}


Perfect! Our LLM has correctly generated the query we need. We can now parse this and pass it into the `search` tool as specified by our LLM.

In [14]:
import json

# parse the step once, then pull out the tool name and its arguments
action = json.loads(step)["action"]
tool_choice = action["tool"]
args = action["args"]

# we use a dictionary to map the tool name to the tool function
tool_selector = {x.__name__: x for x in tools}

# now we select the tool and call it with the arguments
observation = tool_selector[tool_choice](**args)
print(observation)

Ollama Explained: Transforming AI Accessibility and Language Processing
In the rapidly evolving landscape of artificial intelligence (AI), accessibility and innovation are paramount. Among the myriad platforms and tools emerging in this space, one name stands out: Ollama. But what exactly is Ollama, and why is it garnering attention in the AI community? This article delves into the intricacies of Ollama, its methodologies, its potential impact on AI applications, and what this could mean for the future of human-machine interaction.
Table of Content
Understanding Ollama
Ollama stands for (Omni-Layer Learning Language Acquisition Model), a novel approach to machine learning that promises to redefine how we perceive language acquisition and natural language processing.
At its core, Ollama is a groundbreaking platform that democratizes access to large language models (LLMs) by enabling users to run them locally on their machines. Developed with a vision to empower individuals and organizat
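
The lookup `tool_selector[tool_choice](**args)` assumes the model always picks a tool that exists and passes the right argument names. Earlier we saw the model invent a `calendar.today()` tool, so a more forgiving dispatch may be useful. The sketch below is one way to do that; `run_tool`, `echo`, and the error message wording are hypothetical and not part of the agent built in this notebook.

```python
def run_tool(tools: dict, tool_choice: str, args: dict) -> str:
    """Call the chosen tool, returning an error message instead of raising.

    Hypothetical helper: the error strings are our own wording, intended to be
    fed back to the model as an observation.
    """
    if tool_choice not in tools:
        return f"Error: unknown tool '{tool_choice}'. Available tools: {sorted(tools)}"
    try:
        return str(tools[tool_choice](**args))
    except TypeError as e:
        # raised, for example, when the model passes wrong or missing argument names
        return f"Error: bad arguments for '{tool_choice}': {e}"


def echo(text: str):
    """A stand-in tool used only for this example."""
    return text


demo_tools = {"echo": echo}
print(run_tool(demo_tools, "echo", {"text": "hello"}))
print(run_tool(demo_tools, "calendar.today()", {}))
print(run_tool(demo_tools, "echo", {"wrong": "x"}))
```

Because the error comes back as text, it can be stored as the `observation` for that step, giving the model a chance to correct itself on the next iteration.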

Nice! Within the ReAct framework we then pass this information back to our LLM via a new `observation` key. Let's try.

In [15]:
iteration = json.loads(step)
iteration["observation"] = observation
iteration_str = json.dumps(iteration, indent=2)
print(iteration_str)

{
  "thought": "I need to find out what Ollama refers to in the context of AI",
  "action": {
    "tool": "search",
    "args": {
      "query": "Ollama AI"
    }
  },
  "observation": "Ollama Explained: Transforming AI Accessibility and Language Processing\nIn the rapidly evolving landscape of artificial intelligence (AI), accessibility and innovation are paramount. Among the myriad platforms and tools emerging in this space, one name stands out: Ollama. But what exactly is Ollama, and why is it garnering attention in the AI community? This article delves into the intricacies of Ollama, its methodologies, its potential impact on AI applications, and what this could mean for the future of human-machine interaction.\nTable of Content\nUnderstanding Ollama\nOllama stands for (Omni-Layer Learning Language Acquisition Model), a novel approach to machine learning that promises to redefine how we perceive language acquisition and natural language processing.\nAt its core, Ollama is a groundb

Now we feed this back into our chat.

In [17]:
res = ollama.chat(
    model="llama3.2:3b-instruct-fp16",
    messages=[
        {
            "role": "system",
            "content": f"{system_prompt}\n\n{tool_instructions}"
        },
        {"role": "user", "content": "What is Ollama in the context of AI?"},
        {"role": "assistant", "content": f"Step 1:\n{iteration_str}\n\nWhat do I do next to answer the user's question..."},
    ],
    format="json",
    options={"temperature": 0.0}
)

step2 = res["message"]["content"]
print(step2)

{ "thought": "I need to summarize what Ollama is in simple terms", "action": {"tool": "answer", "args": {"text": "Ollama is a platform that enables users to run large language models locally on their machines, making it easier to leverage AI for various applications and use cases."}} }


There we go! Our final answer from the LLM is:

In [18]:
step2_json = json.loads(step2)
print(step2_json["action"]["args"]["text"])

Ollama is a platform that enables users to run large language models locally on their machines, making it easier to leverage AI for various applications and use cases.


Perfect, now let's take everything we've done so far and use it to construct our ReAct agent.

In [23]:
from typing import Callable


class ReActAgent:
    def __init__(self, tools: list[Callable]):
        self.messages = []
        self.tools = {x.__name__: x for x in tools}
        tool_instructions = self._format_tool_instructions(tools=tools)
        self.system_prompt = system_prompt
        # add system prompt and tool instructions to our messages
        self.messages.append({
            "role": "system",
            "content": f"{self.system_prompt}\n\n{tool_instructions}"
        })

    def _format_tool_instructions(self, tools: list[Callable]):
        # get the various parameters/description from each tool function
        tool_descriptions = [
            {
                "name": tool.__name__,
                "description": str(inspect.getdoc(tool)),
                "args": {
                    k: str(v).split(": ")[1] for k, v in inspect.signature(tool).parameters.items()
                }
            } for tool in tools
        ]
        # parse these into text instructions that can be added to our system prompt
        tool_instructions = (
            "You have access to the following tools ONLY, no other tools exist:\n\n"
            + "\n".join([str(x) for x in tool_descriptions])
        )
        return tool_instructions

    def __call__(self, prompt: str, max_steps: int = 10):
        self.messages.append({"role": "user", "content": prompt})
        step_count = 1
        steps = []
        while step_count < max_steps:
            # get the next step
            step_dict = self._call_llm(
                messages=self.messages + (
                    [self._format_scratchpad(steps)] if steps else []
                )
            )
            try:
                # get the tool choice and arguments
                tool_choice = step_dict["action"]["tool"]
                args = step_dict["action"]["args"]
            except (KeyError, TypeError) as e:
                print(f"! Handled error: {e}")
                # the step is malformed, so we return the raw content directly
                # via the answer tool (a partial `action` is discarded, as
                # reusing it would raise the same error again)
                step_dict = {
                    "thought": step_dict.get("thought", "No thought provided"),
                    "action": {
                        "tool": "answer",
                        "args": {"text": str(step_dict)}
                    },
                }
                tool_choice = step_dict["action"]["tool"]
                args = step_dict["action"]["args"]
            self._print_react(step_count=step_count, step_dict=step_dict)
            if tool_choice == "answer":
                # we've reached the final step
                return step_dict["action"]["args"]["text"]
            else:
                # otherwise we call the chosen tool
                observation = self.tools[tool_choice](**args)
                print(f"Observation {step_count}: {observation[:200]}...")
            # add the step to our scratchpad
            steps.append({
                "thought": step_dict["thought"],
                "action": step_dict["action"],
                "observation": observation
            })
            step_count += 1
        # if we get here we've hit the max steps so we force the answer tool
        # to do this we build a new system message that only shows the answer tool
        # (editing messages[0] on a shallow copy would also change self.messages)
        tool_instructions = self._format_tool_instructions(tools=[self.tools["answer"]])
        messages = [
            {"role": "system", "content": f"{self.system_prompt}\n\n{tool_instructions}"},
            *self.messages[1:],
        ]
        # include the scratchpad so the LLM can use the observations gathered so far
        if steps:
            messages.append(self._format_scratchpad(steps))
        # now we call the LLM with the modified system prompt
        step_dict = self._call_llm(messages=messages)
        return step_dict["action"]["args"]["text"]

    def _call_llm(self, messages: list[dict]) -> dict:
        res = ollama.chat(
            model="llama3.2:3b-instruct-fp16",
            messages=messages,
            format="json",
            options={"temperature": 0.0},
        )
        step_dict = json.loads(res["message"]["content"])
        return step_dict
        
    def _format_scratchpad(self, steps: list[dict]) -> dict:
        steps_str = ""
        for i, step in enumerate(steps):
            steps_str += f"Step {i+1}:\n{json.dumps(step, indent=2)}\n\n"
        steps_str += "What do I do next to answer the user's question..."
        return {"role": "assistant", "content": steps_str}

    def _print_react(self, step_count: int, step_dict: dict) -> None:
        """Prints the Reasoning (thought) and Action step"""
        react = "\n".join([
            f"Thought {step_count}: {step_dict['thought']}",
            f"Action {step_count}: {step_dict['action']}",
        ])
        print(react)

agent = ReActAgent(tools=[search, date, answer])

In [24]:
agent("What is Ollama in the context of AI?")

Thought 1: I need to find out what Ollama refers to in the context of AI
Action 1: {'tool': 'search', 'args': {'query': 'Ollama AI'}}
Observation 1: Ollama Explained: Transforming AI Accessibility and Language Processing
In the rapidly evolving landscape of artificial intelligence (AI), accessibility and innovation are paramount. Among the myriad ...
Thought 2: I need to summarize what Ollama is in simple terms
Action 2: {'tool': 'answer', 'args': {'text': 'Ollama is a platform that enables users to run large language models locally on their machines, making it easier to leverage AI for various applications and use cases.'}}


'Ollama is a platform that enables users to run large language models locally on their machines, making it easier to leverage AI for various applications and use cases.'
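
To see the control flow of the loop without a running model, here is a stripped-down sketch that replaces the LLM with a scripted stand-in. Everything in it (`react_loop`, `scripted_llm`, `fixed_date`, and the fallback message) is hypothetical and simplified compared with the `ReActAgent` class above; it only mirrors the think, act, observe cycle.

```python
import json
from typing import Callable


def react_loop(llm: Callable[[list[dict]], dict], tools: dict, prompt: str,
               max_steps: int = 5) -> str:
    """Minimal ReAct loop; `llm` maps a message list to a step dict."""
    messages = [{"role": "user", "content": prompt}]
    steps = []
    for _ in range(max_steps):
        # previous steps are passed back as a single assistant message
        scratchpad = [{"role": "assistant", "content": json.dumps(steps)}] if steps else []
        step = llm(messages + scratchpad)
        tool, args = step["action"]["tool"], step["action"]["args"]
        if tool == "answer":
            return args["text"]
        observation = tools[tool](**args)
        steps.append({**step, "observation": observation})
    return "No answer within the step limit."


def scripted_llm(messages: list[dict]) -> dict:
    """Stand-in for the model: looks up a date first, then answers."""
    if len(messages) == 1:  # no scratchpad yet, so this is the first step
        return {"thought": "I need the date", "action": {"tool": "date", "args": {}}}
    observation = json.loads(messages[-1]["content"])[-1]["observation"]
    return {"thought": "I can answer now",
            "action": {"tool": "answer", "args": {"text": observation}}}


def fixed_date() -> str:
    """Stand-in tool with a fixed return value so the run is repeatable."""
    return "The current date and time is 2024-11-03 21:46:43"


print(react_loop(scripted_llm, {"date": fixed_date}, "What time is it right now?"))
```

A scripted stand-in like this can also serve as a simple test harness for the loop logic, independent of model quality.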

To begin a new conversation, we must reinitialize our agent, because each call appends the user's prompt to `self.messages`:

In [29]:
agent = ReActAgent(tools=[search, date, answer])

agent("What time is it right now?")

Thought 1: I need to find out the current date and time
Action 1: {'tool': 'date', 'args': {}}
Observation 1: The current date and time is 2024-11-03 21:46:43...
! Handled error: 'action'
Thought 2: No thought provided
Action 2: {'tool': 'answer', 'args': {'text': '{}'}}


'{}'