# AgentsVille Trip Planner - Project Assignment

## Introduction

In this notebook, you'll implement an AI-powered travel planning agent for AgentsVille. The system will demonstrate advanced LLM reasoning techniques including:

1. **Role-Based Prompting** - Your agent will act as a specialized travel planner
2. **Chain-of-Thought Reasoning** - Step-by-step planning of itineraries
3. **ReAct Prompting** - Thought → Action → Observation cycles
4. **Reflexion** - Self-evaluation to correct mistakes and improve plans
5. **Memory Management** - Maintaining context about user preferences and constraints

You'll need to simulate external API calls to gather weather data, cultural events, restaurants, and activities. Then, process this information to create personalized travel itineraries based on group size, ages, interests, and other constraints.

Your task is to build a travel agent that can plan the perfect AgentsVille vacation!

In [1]:
# Install dependencies

%pip install openai==1.74.0

Collecting openai==1.74.0
  Using cached openai-1.74.0-py3-none-any.whl.metadata (25 kB)
Using cached openai-1.74.0-py3-none-any.whl (644 kB)
Installing collected packages: openai
  Attempting uninstall: openai
    Found existing installation: openai 1.82.1
    Uninstalling openai-1.82.1:
      Successfully uninstalled openai-1.82.1
Successfully installed openai-1.74.0
Note: you may need to restart the kernel to use updated packages.


In [2]:
# Notebook Setup

%reload_ext autoreload
%autoreload 2
%matplotlib inline

In [3]:
# Imports

import json
import re
from pprint import pformat

from project_lib import compare_dicts_case_insensitive, do_chat_completion, print_in_box

The following cell shows how to use two functions we will use repeatedly in the rest of the notebook.

* `print_in_box` - a function for printing text in a box with a title and dimensions. Additionally,
    the `tab_level` parameter can be used to add indentation to the text.
* `do_chat_completion` - a function for making API calls to the OpenAI Chat Completion API


In [4]:
# Show a simple example using the `print_in_box` and `do_chat_completion` functions
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello, how are you?"},
    {"role": "assistant", "content": "I'm good, thanks!"},
    {"role": "user", "content": "What is 2+2?"},
]
for m in messages:
    print_in_box(
        m["content"],
        title=m["role"],
        # We use different tab levels to help visualize the conversation
        tab_level=1 if m["role"] == "assistant" else 0,
    )

response = do_chat_completion(messages, temperature=0.4)
print_in_box(response, title="assistant", tab_level=1)


╔════════════════════════════════════════════[ system ]════════════════════════════════════════════╗
║ You are a helpful assistant.                                                                     ║
╚══════════════════════════════════════════════════════════════════════════════════════════════════╝

╔═════════════════════════════════════════════[ user ]═════════════════════════════════════════════╗
║ Hello, how are you?                                                                              ║
╚══════════════════════════════════════════════════════════════════════════════════════════════════╝
    ╔══════════════════════════════════════[ assistant ]═══════════════════════════════════════════╗
    ║ I'm good, thanks!                                                                            ║
    ╚══════════════════════════════════════════════════════════════════════════════════════════════╝

╔═════════════════════════════════════════════[ user ]══════════════════════════════════

## Creating the BoolResponseAgent

In the following cell, we create our first agent, the `BoolResponseAgent`. Here's an example of how to use it:

```python
bool_response_agent = BoolResponseAgent()
bool_response_agent.get_response("Do tests make for more reliable code?", num_calls=3)
```

You should see the following output:

```text
╔════════════════════════[ BoolResponseAgent - Query  ]════════════════════════╗
║ Do tests make for more reliable code?                                        ║
╚══════════════════════════════════════════════════════════════════════════════╝
    ╔═══════════════════[ BoolResponseAgent - Response ]═══════════════════════╗
    ║ <think>Tests can help identify bugs and ensure code behaves as expected, ║
    ║ which generally leads to more reliable code. However, the effectiveness  ║
    ║ of tests depends on their quality and coverage. Therefore, while tests   ║
    ║ can contribute to reliability, they do not guarantee it on their         ║
    ║ own.</think>                                                             ║
    ║ <response>True</response>                                                ║
    ╚══════════════════════════════════════════════════════════════════════════╝
    ╔═══════════════════[ BoolResponseAgent - Response ]═══════════════════════╗
    ║ <think>Tests can help identify bugs and ensure that code behaves as      ║
    ║ expected, which can lead to more reliable code. However, the             ║
    ║ effectiveness of tests depends on their quality and coverage. Therefore, ║
    ║ while tests generally contribute to code reliability, they do not        ║
    ║ guarantee it. </think>                                                   ║
    ║ <response>True</response>                                                ║
    ╚══════════════════════════════════════════════════════════════════════════╝
    ╔═══════════════════[ BoolResponseAgent - Response ]═══════════════════════╗
    ║ <think>Tests can help identify bugs and ensure that code behaves as      ║
    ║ expected, which can contribute to more reliable code. However, the       ║
    ║ effectiveness of tests depends on their quality and coverage. Therefore, ║
    ║ while tests generally improve reliability, they do not guarantee it.     ║
    ║ </think>                                                                 ║
    ║ <response>True</response>                                                ║
    ╚══════════════════════════════════════════════════════════════════════════╝
    ╔════════════════[ BoolResponseAgent - Final Response ]════════════════════╗
    ║ True                                                                     ║
    ╚══════════════════════════════════════════════════════════════════════════╝
```
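The voting step behind the "Final Response" box can be sketched on its own, without calling an LLM. This is a minimal illustration, assuming replies follow the `<response>` tag format shown above; `parse_bool_response`, `majority_vote`, and the sample replies are hypothetical names and data, not part of `project_lib`.

```python
import re
from collections import Counter
from typing import Optional

# Matches the <response>...</response> tag format used in the output above.
RESPONSE_PATTERN = re.compile(r"<response>(True|False|None)</response>")


def parse_bool_response(text: str) -> Optional[bool]:
    """Parse one raw model reply into True, False, or None."""
    match = RESPONSE_PATTERN.search(text)
    if match is None:
        raise ValueError(f"No <response> tag found in: {text!r}")
    return {"True": True, "False": False, "None": None}[match.group(1)]


def majority_vote(raw_replies: list[str]) -> Optional[bool]:
    """Return the most common parsed answer across several replies."""
    votes = Counter(parse_bool_response(reply) for reply in raw_replies)
    return votes.most_common(1)[0][0]


# Hypothetical replies standing in for three sampled LLM calls.
sample_replies = [
    "<think>Tests catch bugs.</think>\n<response>True</response>",
    "<think>Coverage matters.</think>\n<response>True</response>",
    "<think>Not guaranteed.</think>\n<response>False</response>",
]
vote_result = majority_vote(sample_replies)
print(vote_result)  # True (two votes against one)
```

With an even `num_calls`, ties are possible; `Counter.most_common` then returns whichever tied answer was seen first, so an odd number of calls is the safer choice.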


In [5]:
class BoolResponseAgent:
    """
    An agent that processes yes/no questions and returns boolean responses.

    This agent uses a language model to analyze questions and provide True, False,
    or None responses. For ambiguous questions, it can make multiple calls to the
    model and return the most common response. This is also known as
    self-consistency prompting. See Wang et al. (2022)
    (https://arxiv.org/abs/2203.11171) for more details.

    Attributes:
        name: Name identifier for the agent
        quiet: Whether to suppress printed output
        print_tab_level: Indentation level for printed output
        temperature: Temperature parameter for LLM sampling
    """

    system_prompt = """
    You are a chatbot that will:
    * receive a question from the user
    * if the question is a yes or no question, think about the response using <think> and </think> tags
    * finally, respond with either <response>True</response> or <response>False</response>
    * If you are not able to answer, respond with <response>None</response>
    """

    def __init__(
        self,
        name: str = "BoolResponseAgent",
        quiet: bool = False,
        print_tab_level: int = 0,
        temperature: float = 0.4,
    ):
        self.name = name
        self.quiet = quiet
        self.print_tab_level = print_tab_level
        self.temperature = temperature

    def _run_single_query(self, messages: list[dict[str, str]]) -> bool | None:
        """
        Execute a single query to the language model and parse the response.

        Uses do_chat_completion to generate a response from the provided messages.
        Returns True, False, or None based on the model's response format.

        Args:
            messages: List of message dictionaries to send to the model

        Returns:
            bool: True or False if the model gives a definitive answer
            None: If the model cannot provide an answer

        Raises:
            ValueError: If the response doesn't contain a valid format
        """
        from project_lib import do_chat_completion

        response = do_chat_completion(messages, temperature=self.temperature)

        if not self.quiet:
            print_in_box(
                response,
                title=f"{self.name} - Response",
                tab_level=self.print_tab_level + 1,
            )

        if "<response>True</response>" in response:
            return True
        elif "<response>False</response>" in response:
            return False
        elif "<response>None</response>" in response:
            return None
        else:
            raise ValueError("Invalid response from BoolResponseAgent")

    def get_response(self, query: str, num_calls: int = 1) -> bool | None:
        """
        Get a boolean response to a query by running the query through the LLM.

        Args:
            query: The question to ask the LLM
            num_calls: Number of times to call the LLM (for consensus)

        Returns:
            bool: True or False based on LLM response
            None: If the LLM cannot determine an answer
        """
        from collections import Counter

        messages = [
            {"role": "system", "content": self.system_prompt},
            {"role": "user", "content": query},
        ]
        if not self.quiet:
            print_in_box(
                query, title=f"{self.name} - Query ", tab_level=self.print_tab_level
            )

        responses = [self._run_single_query(messages) for _ in range(1, num_calls + 1)]
        counter = Counter(responses)
        most_common_response = counter.most_common(1)[0][0]

        if not self.quiet:
            print_in_box(
                most_common_response,
                title=f"{self.name} - Final Response",
                tab_level=self.print_tab_level + 1,
            )

        return most_common_response


# Quick tests to guide the development of the agent
bool_response_agent = BoolResponseAgent()

assert bool_response_agent.get_response("Are you a robot?") is True
assert bool_response_agent.get_response("Are you a human?", num_calls=3) is False
assert bool_response_agent.get_response("Is this statement false?", num_calls=3) is None

print("All tests passed! Congratulations! 🤗")



╔══════════════════════════════════[ BoolResponseAgent - Query  ]══════════════════════════════════╗
║ Are you a robot?                                                                                 ║
╚══════════════════════════════════════════════════════════════════════════════════════════════════╝
    ╔═════════════════════════════[ BoolResponseAgent - Response ]═════════════════════════════════╗
    ║ <think>Yes, I am a chatbot, which is a type of robot designed to interact with users through ║
    ║ text.</think>                                                                                ║
    ║ <response>True</response>                                                                    ║
    ╚══════════════════════════════════════════════════════════════════════════════════════════════╝
    ╔══════════════════════════[ BoolResponseAgent - Final Response ]══════════════════════════════╗
    ║ True                                                                                

## Create our base ChatAgent class

In the following cell we will create the ChatAgent class. Here is how it will work:

```python
chat_agent = ChatAgent("Grace")
chat_agent.chat("Hello, how are you?")
```

This will print the following for debugging purposes.

```text
╔══════════════════════════[ Grace - System Prompt ]═══════════════════════════╗
║ You are a helpful assistant. Your name is Grace.                             ║
╚══════════════════════════════════════════════════════════════════════════════╝
╔═══════════════════════════[ Grace - User Prompt ]════════════════════════════╗
║ Hello, how are you?                                                          ║
╚══════════════════════════════════════════════════════════════════════════════╝
    ╔════════════════════[ Grace - Assistant Response ]════════════════════════╗
    ║ Hello! I'm just a program, so I don't have feelings, but I'm here and    ║
    ║ ready to help you. How can I assist you today?                           ║
    ╚══════════════════════════════════════════════════════════════════════════╝
```

The call to `chat_agent.chat` will return the string "Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?"
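The "memory" of such an agent is simply the growing list of messages that is resent on every call. The sketch below shows that idea in isolation; `TinyChat` and `fake_completion` are hypothetical stand-ins (the real class below uses `do_chat_completion`), so it runs without an API key.

```python
from typing import Callable

Message = dict[str, str]


def fake_completion(messages: list[Message]) -> str:
    """Hypothetical stand-in for an LLM call: echoes the last user message."""
    return f"You said: {messages[-1]['content']}"


class TinyChat:
    """Minimal chat loop that keeps the full message history as its memory."""

    def __init__(self, system_prompt: str, complete: Callable[[list[Message]], str]):
        self.complete = complete
        self.messages: list[Message] = [{"role": "system", "content": system_prompt}]

    def chat(self, user_message: str) -> str:
        # Every turn appends the user message, then the reply, to the history.
        self.messages.append({"role": "user", "content": user_message})
        reply = self.complete(self.messages)
        self.messages.append({"role": "assistant", "content": reply})
        return reply


tiny = TinyChat("You are a helpful assistant. Your name is Grace.", fake_completion)
first_reply = tiny.chat("Hello, how are you?")
print(first_reply)         # You said: Hello, how are you?
print(len(tiny.messages))  # 3: system prompt + user prompt + assistant response
```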



In [6]:
class ChatAgent:
    """A chat agent that interacts with OpenAI's API to facilitate conversations.

    This class manages chat history, formats system prompts, and handles
    communication with OpenAI's chat completion API. It provides methods to
    add messages, get responses, and maintain conversation context.

    Attributes:
        system_prompt_template (str): Template for the system prompt using {variable_name} placeholders.
        messages (list): The history of messages in the conversation.
        quiet (bool): Whether to suppress printing of messages.
        name (str): The name of the chat agent.
        print_tab_level (int): Base indentation level for printed messages.
        template_kwargs (dict): Keyword arguments for formatting the system prompt.
    """

    system_prompt_template = """
        You are a helpful assistant. Your name is {name}.
    """
    messages = []

    def __init__(self, name=None, quiet=False, template_kwargs=None, print_tab_level=0):
        self.quiet = quiet
        self.name = name or self.__class__.__name__
        self.print_tab_level = print_tab_level

        self.template_kwargs = template_kwargs or {}
        self.template_kwargs["name"] = self.name

        self.reset()

    def add_message(self, role, content):
        """Add a message to the chat history.

        Args:
            role (str): The role of the message ("system", "user", or "assistant").
            content (str): The content of the message.

        If `quiet` is False, the message will be printed in a formatted box. It will
        be formatted by the `print_tab_level` attribute. If the role is assistant,
        the `print_tab_level` will be incremented by 1.
        """
        if role not in ["system", "user", "assistant"]:
            raise ValueError(f"Invalid role: {role}")
        self.messages.append({"role": role, "content": content})
        if not self.quiet:
            if role == "system":
                print_in_box(
                    content,
                    f"{self.name} - System Prompt",
                    tab_level=self.print_tab_level,
                )
            elif role == "user":
                print_in_box(
                    content,
                    f"{self.name} - User Prompt",
                    tab_level=self.print_tab_level,
                )
            elif role == "assistant":
                print_in_box(
                    content,
                    f"{self.name} - Assistant Response",
                    tab_level=self.print_tab_level + 1,
                )

    def reset(self):
        """Reset the chat history and re-initialize with the system prompt.

        This method clears all existing messages and adds the system prompt
        formatted with the template_kwargs.
        """
        from textwrap import dedent

        self.messages = []
        self.add_message(
            "system",
            dedent(self.system_prompt_template.format(**(self.template_kwargs or {}))),
        )

    def get_response(self, add_to_messages=True):
        """Get a response from the OpenAI API.

        Args:
            add_to_messages (bool, optional): Whether to add the response to the chat history
            using the add_message method and the assistant role. Defaults to True.

        Returns:
            str: The response from the OpenAI API.


        """
        from project_lib import do_chat_completion

        response = do_chat_completion(
            messages=self.messages,
        )
        if add_to_messages:
            self.add_message("assistant", response)
        return response

    def chat(self, user_message):
        """Send a message to the chat and get a response.

        Args:
            user_message (str): The message to send to the chat.

        Returns:
            str: The response from the OpenAI API.
        """
        self.add_message("user", user_message)
        return self.get_response(add_to_messages=True)


# Quick tests to verify the chat agent works

chat_agent = ChatAgent("Grace")
assert len(chat_agent.messages) == 1  # +1 (System prompt)

chat_agent.chat("Hello, how are you?")
assert len(chat_agent.messages) == 3  # +2 (User prompt + Assistant response)

resp = chat_agent.chat("What is 2+2? Respond using only words.")
assert "four" in resp.lower()
assert len(chat_agent.messages) == 5  # +2 (User prompt + Assistant response)

chat_agent.chat("My name is Ada")
assert len(chat_agent.messages) == 7  # +2 (User prompt + Assistant response)

chat_agent.chat("What is the first letter of my name?")  # This agent has memory
assert len(chat_agent.messages) == 9  # +2 (User prompt + Assistant response)

chat_agent.chat("What is the first letter of your name?")  # This agent has memory
assert len(chat_agent.messages) == 11  # +2 (User prompt + Assistant response)

print("🎉🎉🎉 Yay! Tests passed! Congrats!")


╔════════════════════════════════════[ Grace - System Prompt ]═════════════════════════════════════╗
║ You are a helpful assistant. Your name is Grace.                                                 ║
╚══════════════════════════════════════════════════════════════════════════════════════════════════╝

╔═════════════════════════════════════[ Grace - User Prompt ]══════════════════════════════════════╗
║ Hello, how are you?                                                                              ║
╚══════════════════════════════════════════════════════════════════════════════════════════════════╝
    ╔══════════════════════════════[ Grace - Assistant Response ]══════════════════════════════════╗
    ║ Hello! I'm just a computer program, so I don't have feelings, but I'm here and ready to help ║
    ║ you. How can I assist you today?                                                             ║
    ╚════════════════════════════════════════════════════════════════════════════════════

## Our vacation

Let's encode the details of our vacation in JSON format. This will be useful for developing our agents.

In [7]:
# Information about our vacation
vacation_info = {
    "travelers": [
        {
            "name": "Yuri",
            "age": 30,
            "interests": ["tennis", "cooking"],
            "dietary_restrictions": ["nut allergies"],
        },
        {
            "name": "Hiro",
            "age": 25,
            "interests": ["reading", "music"],
            "dietary_restrictions": ["vegetarian"],
        },
    ],
    "destination": "AgentsVille",
    "date_of_arrival": "2025-06-10",
    "date_of_departure": "2025-06-11",
    "budget": 1000,
}
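
Once the trip is encoded this way, quantities the planner will need can be derived directly from it. The following is a small sketch that copies only the relevant fields so it runs on its own; counting both the arrival and departure dates as itinerary days is an assumption, not something the notebook states.

```python
from datetime import date

# Copy of the relevant fields from `vacation_info` so the snippet is self-contained.
trip = {
    "travelers": [
        {"name": "Yuri", "dietary_restrictions": ["nut allergies"]},
        {"name": "Hiro", "dietary_restrictions": ["vegetarian"]},
    ],
    "date_of_arrival": "2025-06-10",
    "date_of_departure": "2025-06-11",
    "budget": 1000,
}

arrival = date.fromisoformat(trip["date_of_arrival"])
departure = date.fromisoformat(trip["date_of_departure"])

group_size = len(trip["travelers"])
# Assumption: both the arrival day and the departure day get planned activities.
trip_days = (departure - arrival).days + 1
budget_per_person = trip["budget"] / group_size
all_restrictions = sorted(
    {r for t in trip["travelers"] for r in t["dietary_restrictions"]}
)

print(group_size, trip_days, budget_per_person, all_restrictions)
# 2 2 500.0 ['nut allergies', 'vegetarian']
```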


## The Traveler Agent. That's Us!

In order to create an agent/chatbot that will help users design an exciting vacation, it will be useful to create an agent that can respond to our chatbot programmatically as if it were one of our users. That's right, we are going to build an agent of ourselves!

In [None]:
from textwrap import dedent


class Traveler(ChatAgent):
    system_prompt_template = """
    You are a traveler who is planning a vacation. You have specific information about your trip, 
    and you should respond to questions based on this information.
    
    Here are the details of your vacation:
    {vacation_info}
    
    When asked questions about your trip, respond naturally and helpfully using the information provided.
    Be conversational but accurate. If asked for specific data like numbers or dates, provide them clearly.
    
    For example:
    - If asked about group size, mention the number of travelers
    - If asked about dates, provide the exact dates from your trip information
    - If asked about interests, mention the interests of the travelers in your group
    - If asked about dietary restrictions, mention any restrictions your group has
    
    Always be helpful and provide the information that's being requested based on your vacation details.
    """


# Quick tests to make sure our agent is working

traveler = Traveler(template_kwargs={"vacation_info": vacation_info})


traveler.chat("Hello, how are you?")

response = traveler.chat(
    "How many people are in your group? Answer with a number, e.g. 7"
)
assert str(len(vacation_info["travelers"])) in response, (
    "The response should contain the number of people in the group."
)

response = traveler.chat(
    "When is your last day of travel? Answer in YYYY-MM-DD format, e.g. 2000-01-01"
)
assert vacation_info["date_of_departure"] in response, (
    "The response should contain the last day of travel."
)

# Celebrate

print(
    "Traveler Agent successfully answered a few questions! To put this in production, we'd expand on this dataset. This is an excellent start. Let's continue building!"
)



╔═══════════════════════════════════[ Traveler - System Prompt ]═══════════════════════════════════╗
║ TODO                                                                                             ║
╚══════════════════════════════════════════════════════════════════════════════════════════════════╝

╔════════════════════════════════════[ Traveler - User Prompt ]════════════════════════════════════╗
║ Hello, how are you?                                                                              ║
╚══════════════════════════════════════════════════════════════════════════════════════════════════╝
    ╔════════════════════════════[ Traveler - Assistant Response ]═════════════════════════════════╗
    ║ Hello! I'm just a program, but I'm here and ready to help you. How can I assist you today?   ║
    ╚══════════════════════════════════════════════════════════════════════════════════════════════╝

╔════════════════════════════════════[ Traveler - User Prompt ]═════════════════════════

AssertionError: The response should contain the number of people in the group.

# Structuring the output of LLMs

The output of LLMs can be structured by prompting the model to respond in a specific format such as JSON. Because LLM sampling is stochastic, the output does not always adhere strictly to valid JSON unless the provider natively enforces a structured format. For pedagogical purposes, we will write our own function to locate and parse JSON in the output of an LLM.

For example, consider the following output from an LLM:

```text
    Here is the data you requested:
    ```json
    {
        "name": "Turing",
        "age": 30,
        "city": "New York"
    }
    ```
```

Our function will extract the JSON from this output and return it as a Python dictionary.

```python
>>> find_and_parse_json(output)
{'name': 'Turing', 'age': 30, 'city': 'New York'}
```

It will also handle malformed JSON by asking the LLM to repair it, for example:

```python
>>> find_and_parse_json('```json\n{{"a: 1}\n```')
{'a': 1}
```
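Before involving the LLM at all, the "locate and parse" step can be tried locally. The sketch below is one possible approach, not the notebook's implementation: it prefers the contents of a Markdown code fence and falls back to the outermost braces. `extract_json_locally` is a hypothetical helper, and unlike the function in the next cell it returns `None` for malformed JSON rather than repairing it.

```python
import json
import re
from typing import Any, Optional

FENCE = "`" * 3  # a Markdown code fence, built this way to avoid nesting fences here
FENCED_JSON = re.compile(FENCE + r"(?:json)?\s*(\{.*?\})\s*" + FENCE, re.DOTALL)
BARE_JSON = re.compile(r"\{.*\}", re.DOTALL)


def extract_json_locally(text: str) -> Optional[dict[str, Any]]:
    """Return the first parseable JSON object in `text`, or None."""
    for pattern in (FENCED_JSON, BARE_JSON):
        match = pattern.search(text)
        if match is None:
            continue
        # The fenced pattern captures the object in group 1; the bare one has no groups.
        candidate = match.group(1) if pattern.groups else match.group()
        try:
            parsed = json.loads(candidate)
        except json.JSONDecodeError:
            continue
        if isinstance(parsed, dict):
            return parsed
    return None


llm_output = (
    "Here is the data you requested:\n"
    + FENCE + "json\n"
    + '{"name": "Turing", "age": 30, "city": "New York"}\n'
    + FENCE
)
extracted = extract_json_locally(llm_output)
print(extracted)  # {'name': 'Turing', 'age': 30, 'city': 'New York'}
```

A local pass like this is cheap and deterministic; the LLM-based repair is only needed for inputs where it returns `None`.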



In [None]:
from typing import Any


def find_and_parse_json(resp: str, max_retries: int = 3) -> dict[str, Any] | None:
    """
    Find and parse a JSON dictionary from a string, using chat completion if necessary.

    Args:
        resp (str): The string to search for JSON.
        max_retries (int, optional): The maximum number of retries. Defaults to 3.

    Returns:
        dict[str, Any] | None: The parsed JSON object, or None if no JSON was found.

    If the JSON is invalid, the function will retry up to max_retries times.
    """
    import json
    import re
    from project_lib import do_chat_completion

    pattern = r"\{.*\}"

    match = re.search(pattern, resp, re.DOTALL)
    if match:
        json_str = match.group()
        try:
            return json.loads(json_str)
        except json.JSONDecodeError:
            if max_retries <= 0:
                print("❌ Failed to parse JSON")
                return None
            messages = [
                {
                    "role": "system",
                    "content": "You extract and properly format JSON given a string.",
                },
                {"role": "user", "content": resp},
            ]

            resp = do_chat_completion(messages)

            return find_and_parse_json(resp, max_retries=max_retries - 1)

    return None


# Test cases
assert find_and_parse_json('```json\n{"a": 1}\n```') == {"a": 1}
assert find_and_parse_json('{"a": 1}') == {"a": 1}
assert find_and_parse_json('```json\n{{"a: 1}\n```') == {
    "a": 1
}  # An extra opening brace

print("✅ All tests passed")

## Creating the OnboardingAgent

Next we will create the onboarding agent, which will speak with our traveler (agent). For example:


```python
>>> onboarding_agent = OnboardingAgent()
>>> traveler = Traveler(template_kwargs={"vacation_info": vacation_info})

>>> gathered_vacation_info = onboarding_agent.gather_vacation_info(traveler_agent=traveler)
```

This will return a dictionary of the vacation information, ideally identical to `vacation_info`, though in practice the LLM may modify it slightly (e.g. capitalization).
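
The control flow of such a conversation can be rehearsed with scripted agents before any LLM is involved. In this sketch, `ScriptedAgent`, `try_parse_json`, and `run_onboarding` are hypothetical names; the loop alternates between the two agents and stops as soon as the interviewer's message parses as a JSON object, which is assumed here to signal the final answer.

```python
import json
from typing import Any, Optional


class ScriptedAgent:
    """Hypothetical stand-in for a ChatAgent: replies from a fixed script."""

    def __init__(self, replies: list[str]):
        self._replies = iter(replies)

    def chat(self, message: str) -> str:
        return next(self._replies)


def try_parse_json(text: str) -> Optional[dict[str, Any]]:
    """Return a dict if the whole message is a JSON object, else None."""
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError:
        return None
    return parsed if isinstance(parsed, dict) else None


def run_onboarding(interviewer, traveler, starting_msg="Hello", max_turns=5):
    msg = starting_msg
    for _ in range(max_turns):
        msg = interviewer.chat(msg)  # the interviewer asks a question or concludes
        final_answer = try_parse_json(msg)
        if final_answer is not None:
            return final_answer
        msg = traveler.chat(msg)  # the traveler answers the question
    raise ValueError(f"Failed to complete onboarding after {max_turns} turns.")


interviewer = ScriptedAgent([
    "Where are you travelling to?",
    "How many travelers are in your group?",
    '{"destination": "AgentsVille", "group_size": 2}',
])
traveler_stub = ScriptedAgent(["AgentsVille", "Two of us"])
gathered = run_onboarding(interviewer, traveler_stub)
print(gathered)  # {'destination': 'AgentsVille', 'group_size': 2}
```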

In [None]:
class OnboardingAgent(ChatAgent):
    system_prompt_template = """
    
    TODO
    
    """

    def gather_vacation_info(
        self, traveler_agent: ChatAgent, starting_msg="Hello", max_turns: int = 20
    ) -> dict[str, Any] | None:
        """
        Conducts a conversation between the onboarding agent and traveler agent to gather vacation information.

        Args:
            traveler_agent: The agent representing the traveler providing vacation details
            starting_msg: Initial message to start the conversation (default: "Hello")
            max_turns: Maximum number of conversation turns before timing out (default: 20)

        Returns:
            A dictionary containing the gathered vacation information or None if unsuccessful

        Raises:
            ValueError: If the conversation exceeds the maximum number of turns without completion
        """
        onboarding_agent = self
        traveler = traveler_agent

        msg = starting_msg

        for _ in range(max_turns):
            msg = onboarding_agent.chat(msg)

            final_answer = find_and_parse_json(msg)
            if final_answer:
                return final_answer

            msg = traveler.chat(msg)

        raise ValueError(f"Failed to complete onboarding after {max_turns} turns.")


# Run tests
onboarding_agent = OnboardingAgent()
traveler = Traveler(template_kwargs={"vacation_info": vacation_info}, quiet=True)

gathered_vacation_info = onboarding_agent.gather_vacation_info(traveler_agent=traveler)

print_in_box(pformat(vacation_info), title="Vacation Info")
print_in_box(pformat(gathered_vacation_info), title="Final Answer")

assert compare_dicts_case_insensitive(vacation_info, gathered_vacation_info)
print(
    "Congratulations! Your agent successfully gathered all the information needed! 🎉🎉🎉"
)


# Examining the functions `get_weather` and `get_events`

The functions `get_weather` and `get_events` (which we will turn into tools) are defined in the `project_lib` module and return static data for the purposes of this project. In a production environment they would call live services, but during development and testing, synthetic data keeps results reproducible. Let's take a look at how they work.
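
To make the idea of a static-data tool concrete, here is a minimal sketch of what such a lookup could look like. The records, the field names, and `fake_get_weather` are all hypothetical; the real `get_weather` in `project_lib` may return a different shape.

```python
# Hypothetical static records; the real data lives in `project_lib`.
FAKE_WEATHER = {
    ("AgentsVille", "2025-06-10"): {"condition": "sunny", "temperature_c": 24},
    ("AgentsVille", "2025-06-11"): {"condition": "rainy", "temperature_c": 18},
}


def fake_get_weather(date: str, city: str) -> dict:
    """Look up canned weather; return an empty dict when nothing is recorded."""
    # Return a copy so callers cannot mutate the shared static data.
    return dict(FAKE_WEATHER.get((city, date), {}))


weather_day_one = fake_get_weather(date="2025-06-10", city="AgentsVille")
print(weather_day_one)  # {'condition': 'sunny', 'temperature_c': 24}
```

Because the same inputs always give the same output, assertions written against a tool like this stay stable from run to run.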

In [None]:
# The `get_weather` function returns the weather for a given date and city.

from project_lib import get_weather

get_weather(date="2025-06-10", city="AgentsVille")


In [None]:
# Implementing the get_activities_by_date_tool function

def get_activities_by_date_tool(date: str, city: str) -> list[dict]:
    """
    Retrieves available activities and events for a specific date and city.
    
    This tool function serves as the interface for the LLM to access activity data
    when planning travel itineraries. It provides comprehensive information about
    local attractions, events, dining options, and entertainment available on a 
    given date and location.
    
    Purpose:
        - Enable LLM agents to gather real-time activity information for itinerary planning
        - Provide structured data about local events, attractions, and entertainment options
        - Support intelligent recommendation systems for travel planning applications
        
    Args:
        date (str): The target date for activity lookup in ISO format (YYYY-MM-DD).
                   Must be a valid date string. Example: "2025-06-10"
        city (str): The destination city name for activity lookup. Should be a valid
                   city name string. Example: "AgentsVille"
    
    Returns:
        list[dict]: A list of activity dictionaries, where each dictionary contains:
            - name (str): The activity or event name
            - description (str): Detailed description of the activity
            - time (str): Start time in HH:MM format (24-hour)
            - duration (str): Expected duration (e.g., "2 hours", "3.5 hours")
            - price (int|float): Cost per person in local currency
            - category (str): Activity type ("cultural", "outdoor", "dining", "entertainment")
            - location (str): Specific venue or address information
            - capacity (int, optional): Maximum number of participants
            - requirements (list[str], optional): Special requirements or restrictions
    
    Example:
        >>> activities = get_activities_by_date_tool("2025-06-10", "AgentsVille")
        >>> print(activities[0])
        {
            "name": "AgentsVille Art Museum Tour", 
            "description": "Guided tour of contemporary digital art exhibitions",
            "time": "10:00",
            "duration": "2 hours", 
            "price": 25,
            "category": "cultural",
            "location": "123 Museum Ave, AgentsVille"
        }
    
    Note:
        This function wraps the underlying get_events() function from project_lib
        to provide a standardized tool interface for LLM agents.
    """
    from project_lib import get_events
    
    # Validate input parameters
    if not isinstance(date, str):
        raise TypeError(f"Date must be a string, got {type(date)}")
    if not isinstance(city, str):
        raise TypeError(f"City must be a string, got {type(city)}")
    
    # Validate date format (basic check)
    import re
    if not re.match(r'^\d{4}-\d{2}-\d{2}$', date):
        raise ValueError(f"Date must be in YYYY-MM-DD format, got: {date}")
    
    # Call the underlying events function
    return get_events(date=date, city=city)

# Test the function
test_activities = get_activities_by_date_tool("2025-06-10", "AgentsVille")
print_in_box(f"Found {len(test_activities)} activities", title="Tool Test Results")
print_in_box(str(test_activities[:2]), title="Sample Activities")


In [None]:
# The `get_events` function returns the events for a given date and city.

from project_lib import get_events

get_events(date="2025-06-10", city="AgentsVille")[:2]


# The `ItineraryAgent` class (Using ReAct)

Let's make a new itinerary agent.
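
The Action → Observation half of the ReAct cycle amounts to parsing a JSON tool call and dispatching it to a Python function. The sketch below rehearses that with stub tools and a scripted list of actions in place of the model; `stub_weather`, `stub_events`, and `run_react` are hypothetical. In a real loop, each observation would be appended to the message history before the next model call.

```python
import json
from typing import Any


def stub_weather(date: str, city: str) -> dict[str, Any]:
    """Hypothetical weather tool returning canned data."""
    return {"date": date, "city": city, "condition": "sunny"}


def stub_events(date: str, city: str) -> list[dict[str, Any]]:
    """Hypothetical events tool returning canned data."""
    return [{"name": "Cooking class", "date": date, "city": city, "price": 40}]


TOOLS = {"weather": stub_weather, "events": stub_events}


def run_react(scripted_actions: list[str], max_steps: int = 10) -> dict[str, Any]:
    """Dispatch each JSON action to its tool until a final_output action appears."""
    observations = []
    for raw_action in scripted_actions[:max_steps]:
        action = json.loads(raw_action)
        tool_name = action.pop("tool")
        if tool_name == "final_output":
            return {"final": action, "observations": observations}
        # The remaining keys of the action are passed as keyword arguments.
        observations.append(TOOLS[tool_name](**action))
    raise ValueError("No final_output action was produced.")


scripted_actions = [
    '{"tool": "weather", "date": "2025-06-10", "city": "AgentsVille"}',
    '{"tool": "events", "date": "2025-06-10", "city": "AgentsVille"}',
    '{"tool": "final_output", "city": "AgentsVille", "start_date": "2025-06-10", '
    '"end_date": "2025-06-11", "itinerary": [], "total_cost": 40}',
]
react_result = run_react(scripted_actions)
print(len(react_result["observations"]))  # 2
print(react_result["final"]["total_cost"])  # 40
```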

In [None]:
class ItineraryAgent(ChatAgent):
    system_prompt_template = """
        You are an expert travel planner and travel agent specializing in creating personalized, comprehensive travel itineraries. Your expertise includes understanding local attractions, weather patterns, cultural events, dining options, and how to balance different travelers' interests within budget constraints.

        Your task is to generate a detailed, day-by-day travel itinerary based on the provided vacation information. You must create a plan that considers all travelers' interests, respects dietary restrictions, accounts for weather conditions, and stays within the specified budget.

        ## Chain-of-Thought Process:

        1. **ANALYZE the vacation information**: Review traveler details (names, ages, interests, dietary restrictions), destination, dates, and budget
        2. **THINK about weather and seasonal considerations**: Consider how weather might affect activity choices
        3. **GATHER information**: Use available tools to collect weather data and activity options for each day
        4. **BALANCE interests**: Ensure each traveler's interests are represented fairly across the itinerary
        5. **CALCULATE costs**: Keep track of expenses to stay within budget
        6. **OPTIMIZE the plan**: Arrange activities logically by time, location, and weather suitability

        ## Available Tools:

        You have access to the following tools to gather information:
        - **weather**: Get weather data for a specific date and city - use {"tool": "weather", "date": "YYYY-MM-DD", "city": "City Name"}
        - **events**: Get available activities and events for a specific date and city - use {"tool": "events", "date": "YYYY-MM-DD", "city": "City Name"}
        - **final_output**: Submit your final itinerary - use {"tool": "final_output", "city": "City", "start_date": "YYYY-MM-DD", "end_date": "YYYY-MM-DD", "itinerary": [...], "total_cost": number}

        ## ReAct Cycle Instructions:

        Use the following format for each step:

        <thought>
        [Your reasoning about what information you need next, what you've learned so far, or what decision you're making]
        </thought>

        <act>
        {"tool": "tool_name", "parameter": "value"}
        </act>

        After receiving observations, continue with the next thought-action cycle until you have all necessary information to create the complete itinerary.

        ## Required Output Format:

        Your final output must be a JSON object with this exact structure:
        ```json
        {
            "tool": "final_output",
            "city": "AgentsVille",
            "start_date": "YYYY-MM-DD",
            "end_date": "YYYY-MM-DD",
            "itinerary": [
                {
                    "date": "YYYY-MM-DD",
                    "weather": "weather description",
                    "activities": [
                        {
                            "name": "Activity Name",
                            "description": "Brief description",
                            "time": "HH:MM",
                            "duration": "X hours",
                            "price": 50,
                            "category": "cultural/outdoor/dining/entertainment"
                        }
                    ]
                }
            ],
            "total_cost": 500
        }
        ```

        ## Important Guidelines:

        - **Weather Awareness**: Don't schedule outdoor activities during bad weather
        - **Interest Balance**: Ensure each traveler has activities matching their interests
        - **Dietary Respect**: Choose restaurants and food activities that accommodate all dietary restrictions
        - **Budget Compliance**: Keep total costs within the specified budget
        - **Logical Scheduling**: Arrange activities in a sensible time order
        - **Detailed Descriptions**: Provide clear, helpful descriptions for each activity
        - **Accurate Pricing**: Use realistic pricing information from the events data

        Begin by analyzing the vacation information provided and gathering weather and activity data for each day of the trip.
    """

    def get_itinerary(self, vacation_info):
        final_output = None
        max_steps = 20

        step_num = 0

        self.add_message(
            "user",
            f"""
            Here is information on the trip collected by the Onboarding Agent:
            {vacation_info}.
            Please design a daily itinerary for the trip.
            """,
        )
        while step_num < max_steps and final_output is None:
            step_num += 1
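            self.add_message(
                "user",
                f"Let's start step {step_num} of the ReAct cycle, starting with <thought>...</thought>.",
            )
            resp = self.get_response()  # <thought>...</thought>

            # Ask explicitly for the action so the second response is not just another thought
            self.add_message(
                "user",
                'Next, respond with <act>{"tool": "tool_name", "param1": "value1", "param2": "value2"}</act>.',
            )
            resp = self.get_response()  # <act>...</act>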

            # Parse the action
            obj = find_and_parse_json(resp)

            if obj["tool"] == "weather":
                tool_results = get_weather(date=obj["date"], city=obj["city"])
            elif obj["tool"] == "events":
                tool_results = get_events(date=obj["date"], city=obj["city"])
            elif obj["tool"] == "get_activities_by_date":
                tool_results = get_activities_by_date_tool(date=obj["date"], city=obj["city"])
            elif obj["tool"] == "final_output":
                final_output = obj
                break
            else:
                raise ValueError(
                    f"Invalid tool: {obj['tool']}. Available tools: weather, events, get_activities_by_date, final_output"
                )

            # Add the observation to the message history
            self.add_message(
                "user",
                f"<observation>{tool_results}</observation>",
            )

        return final_output


# Quick test
itinerary_agent = ItineraryAgent(quiet=False)

final_itinerary_1 = itinerary_agent.get_itinerary(vacation_info=gathered_vacation_info)
print_in_box(final_itinerary_1, title="Final Itinerary")

if final_itinerary_1 is not None:
    print("Final itinerary generated successfully. Congratulations!")
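
The `if`/`elif` chain in `get_itinerary` maps a parsed `<act>` payload to a tool call. The same idea can be sketched with a dispatch table. Everything in this sketch is a stand-in: `parse_act`, `run_tool`, and the `fake_*` functions are hypothetical names that only illustrate the pattern, and the notebook's own `find_and_parse_json`, `get_weather`, and `get_events` are not reproduced here.

```python
import json
import re


def parse_act(response_text):
    """Extract the JSON object inside <act>...</act> (hypothetical helper)."""
    match = re.search(r"<act>\s*(\{.*\})\s*</act>", response_text, re.DOTALL)
    if match is None:
        raise ValueError("No <act>{...}</act> block found in the response")
    return json.loads(match.group(1))


def fake_weather(date, city):
    # Stand-in for a real weather lookup; returns canned data.
    return {"date": date, "city": city, "condition": "sunny"}


def fake_events(date, city):
    # Stand-in for a real events lookup; returns canned data.
    return [{"name": "Sample Event", "date": date, "city": city, "price": 10}]


TOOLS = {"weather": fake_weather, "events": fake_events}


def run_tool(action, tools=TOOLS):
    """Look up the tool named in the action and call it with the remaining keys."""
    params = dict(action)  # copy so the caller's dict is not mutated
    name = params.pop("tool", None)
    if name not in tools:
        raise ValueError(f"Invalid tool: {name!r}. Available tools: {sorted(tools)}")
    return tools[name](**params)


reply = '<act>{"tool": "weather", "date": "2025-06-10", "city": "AgentsVille"}</act>'
action = parse_act(reply)
observation = run_tool(action)
print(f"<observation>{observation}</observation>")
```

With a table like this, adding a tool means adding one dictionary entry, and the error message always lists the tools that actually exist.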

## But is this itinerary any good?

We've successfully created an itinerary, but how do we know if it's any good?

Next we'll create some evaluation functions (sometimes called evals) to measure the quality of the itinerary. We want the first draft to be as good as possible, and we also want to give the LLM a chance to reflect on its own output and improve it in a second pass.

In [None]:
# Let's take a look at this itinerary. Is it everything you expected?

final_itinerary_1

In [None]:
# Define the ACTIVITY_AND_WEATHER_ARE_COMPATIBLE_SYSTEM_PROMPT

# Define AgentError here since we need it before the main evaluation functions
class AgentError(Exception):
    pass

ACTIVITY_AND_WEATHER_ARE_COMPATIBLE_SYSTEM_PROMPT = """
You are a Weather and Activity Compatibility Expert with extensive knowledge of outdoor activities and how different weather conditions affect them.

Your role is to evaluate whether a specific activity should be avoided due to current weather conditions. You must analyze both the activity details and weather information to make an accurate assessment.

## Task:
Determine if the given activity should be avoided due to the described weather conditions. Consider:
1. Whether the activity is primarily outdoors or indoors
2. How the specific weather conditions might impact the activity
3. Safety concerns related to the activity under these weather conditions
4. Whether the activity's enjoyment would be significantly diminished by the weather

## Output Format:
You must respond with ONLY one of these exact values:
- "True" if the activity should be avoided due to weather conditions
- "False" if the activity can proceed despite the weather conditions

## Examples:

Example 1:
Activity: {"name": "Beach Volleyball Tournament", "description": "Competitive beach volleyball games on the sand courts", "category": "outdoor"}
Weather: "Heavy thunderstorms with lightning and strong winds"
Output: True

Example 2:
Activity: {"name": "Museum Tour", "description": "Guided tour of the city's art museum", "category": "cultural"}
Weather: "Heavy rain and wind"
Output: False

Example 3:
Activity: {"name": "Scenic Hiking Trail", "description": "Moderate 5-mile hike through forest paths", "category": "outdoor"}
Weather: "Light drizzle, 65°F"
Output: False

Example 4:
Activity: {"name": "Outdoor Concert", "description": "Evening music performance in the park", "category": "entertainment"}
Weather: "Clear skies, but extremely cold at 20°F with strong winds"
Output: True

Example 5:
Activity: {"name": "Rooftop Dining", "description": "Dinner at a restaurant with open-air rooftop seating", "category": "dining"}
Weather: "Moderate rain showers"
Output: True
"""

# Create a specialized weather compatibility agent
weather_compatibility_agent = BoolResponseAgent(
    name="WeatherCompatibilityAgent", 
    quiet=True
)

# Override the system prompt
weather_compatibility_agent.system_prompt = ACTIVITY_AND_WEATHER_ARE_COMPATIBLE_SYSTEM_PROMPT

# Define a new function that uses our specialized agent to evaluate weather compatibility
def check_activity_weather_compatibility(activity, weather):
    """
    Uses the specialized weather compatibility agent to determine if an activity 
    should be avoided due to the given weather conditions.
    
    Args:
        activity (dict): The activity details
        weather (str): The weather description
        
    Returns:
        bool: True if the activity should be avoided, False otherwise
    """
    query = f"Activity: {activity}\nWeather: {weather}"
    return weather_compatibility_agent.get_response(query)

# Now update the evaluation function to use our specialized agent
def eval_itinerary_weather_compatibility(vacation_info, final_output):
    """Verifies that the itinerary does not recommend outdoor activities during inclement weather conditions.
    Uses the specialized weather compatibility agent for evaluation.

    Args:
        vacation_info (dict): Contains the vacation details
        final_output (dict): Contains the itinerary details including daily activities and weather conditions

    Raises:
        AgentError: If any activities are scheduled during weather conditions that make them unsuitable
    """
    activities_that_should_be_avoided = []
    for itinerary_item in final_output["itinerary"]:
        weather = itinerary_item["weather"]
        
        for activity in itinerary_item["activities"]:
            # Use our specialized weather compatibility agent
            should_avoid = check_activity_weather_compatibility(activity, weather)
            
            if should_avoid:
                # Record the day's weather with the activity so the error message
                # reports the right conditions, not just the last day's weather
                activities_that_should_be_avoided.append(
                    {"date": itinerary_item.get("date"), "weather": weather, "activity": activity}
                )

    if activities_that_should_be_avoided:
        raise AgentError(
            f"The following activities should be avoided due to weather conditions:\n\n{activities_that_should_be_avoided}"
        )
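
The system prompt asks the model to answer with exactly `True` or `False`. How `BoolResponseAgent` turns that text into a Python `bool` is not shown in this section, so the following is only a sketch of one strict way to do it, using the hypothetical helper `parse_bool_response`.

```python
def parse_bool_response(text):
    """Convert a model reply of "True"/"False" into a bool (hypothetical helper).

    Surrounding whitespace, quotes, and a trailing period are tolerated;
    anything else raises ValueError rather than being guessed.
    """
    cleaned = text.strip().strip("\"'").rstrip(".").strip().lower()
    if cleaned == "true":
        return True
    if cleaned == "false":
        return False
    raise ValueError(f"Expected 'True' or 'False', got: {text!r}")


print(parse_bool_response("True"))
print(parse_bool_response(" false\n"))
```

Raising on unexpected text matters because `bool("False")` is `True` in Python: any non-empty string is truthy, so a naive conversion would flag every activity.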


## Writing Evaluation Functions (evals)

We will now write a set of evaluation functions to check whether the itinerary is valid. These functions combine regular procedural Python code with calls to the `bool_response_agent`, which answers questions such as: "Does the following activity match any of these interests?"

These evaluation functions will either pass (return None) or raise an AgentError with a message that our LLM can use to revise its results.
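
The pass-or-raise contract described above can be sketched as follows. `DemoAgentError`, `eval_every_day_has_activities`, and `collect_errors` are hypothetical names used only for this illustration, and the sample itinerary is made-up data.

```python
class DemoAgentError(Exception):
    """Stand-in for the notebook's AgentError, used only in this sketch."""


def eval_every_day_has_activities(vacation_info, final_output):
    """Pass silently (return None) or raise with a message the LLM can act on."""
    empty_days = [
        day["date"] for day in final_output["itinerary"] if not day["activities"]
    ]
    if empty_days:
        raise DemoAgentError(f"No activities scheduled on: {empty_days}")


def collect_errors(eval_functions, vacation_info, final_output):
    """Run each eval and gather the error messages instead of stopping at the first."""
    errors = []
    for eval_fn in eval_functions:
        try:
            eval_fn(vacation_info, final_output)
        except DemoAgentError as exc:
            errors.append(str(exc))
    return errors


sample_output = {
    "itinerary": [
        {"date": "2025-06-10", "activities": [{"name": "Museum Tour", "price": 25}]},
        {"date": "2025-06-11", "activities": []},
    ]
}
errors = collect_errors([eval_every_day_has_activities], {}, sample_output)
print(errors)
```

Collecting messages rather than stopping at the first failure gives the revision step the full list of problems in one pass.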

In [None]:
# Let's write some evaluation functions!

from textwrap import dedent  # needed by the multi-line prompts below


# AgentError is already defined in the weather compatibility cell above,
# so it is not redefined here.


def eval_city_matches(vacation_info, final_output):
    """Verifies that the destination city specified in vacation_info matches the city in final_output.

    Args:
        vacation_info (dict): Contains the vacation details including the destination city
        final_output (dict): Contains the itinerary details including the city

    Raises:
        AgentError: If the cities don't match
    """
    if vacation_info["destination"] != final_output["city"]:
        raise AgentError(
            f"Cities do not match: {vacation_info['destination']} != {final_output['city']}"
        )


def eval_start_end_dates_match(vacation_info, final_output):
    """Verifies that the arrival and departure dates in vacation_info match the start and end dates in final_output.

    Args:
        vacation_info (dict): Contains the vacation details including arrival and departure dates
        final_output (dict): Contains the itinerary details including start and end dates

    Raises:
        AgentError: If either the arrival date doesn't match the start date or the departure date doesn't match the end date
    """
    if (
        vacation_info["date_of_arrival"] != final_output["start_date"]
        or vacation_info["date_of_departure"] != final_output["end_date"]
    ):
        raise AgentError(
            f"Dates do not match: {vacation_info['date_of_arrival']} != {final_output['start_date']} or {vacation_info['date_of_departure']} != {final_output['end_date']}"
        )


def eval_total_cost_is_accurate(vacation_info, final_output):
    """Verifies that the total cost stated in final_output matches the sum of all activity prices.

    Args:
        vacation_info (dict): Contains the vacation details
        final_output (dict): Contains the itinerary details including activities with prices and total cost

    Raises:
        AgentError: If the calculated total cost doesn't match the stated total cost
    """
    actual_total_cost = 0

    for itinerary_item in final_output["itinerary"]:
        for activity in itinerary_item["activities"]:
            actual_total_cost += int(activity["price"])

    stated_total_cost = int(final_output["total_cost"])

    if actual_total_cost != stated_total_cost:
        raise AgentError(
            f"Stated total cost does not match calculated total cost: {actual_total_cost} != {stated_total_cost}"
        )


def eval_itinerary_matches_interests_and_is_balanced(vacation_info, final_output):
    """Verifies that the itinerary includes activities matching each traveler's interests and is balanced among all travelers.

    Args:
        vacation_info (dict): Contains the vacation details including traveler information and their interests
        final_output (dict): Contains the itinerary details including daily activities

    Raises:
        AgentError: If any traveler has no matching activities or if one traveler has more than twice
                   the number of matching activities compared to another traveler
    """
    traveler_to_interests = {}
    traveler_to_interest_hit_counts = {}

    for traveler in vacation_info["travelers"]:
        traveler_to_interests[traveler["name"]] = traveler["interests"]
        traveler_to_interest_hit_counts[traveler["name"]] = 0

    for traveler, interests in traveler_to_interests.items():
        for itinerary_item in final_output["itinerary"]:
            for activity in itinerary_item["activities"]:
                query = f"""Does the following activity match any of these interests: {interests}?\n\nActivity: {activity}"""
                activity_matches_traveler_interests = bool_response_agent.get_response(
                    query
                )
                if activity_matches_traveler_interests:
                    traveler_to_interest_hit_counts[traveler] += 1

    # If any of the travelers have 0 matches, raise an error
    for traveler, interest_hit_count in traveler_to_interest_hit_counts.items():
        if interest_hit_count == 0:
            raise AgentError(f"Traveler {traveler} has no matches with the itinerary.")

    # If any traveler has more than twice the number of matched interests of any other traveler, raise an error
    min_hit_count = min(traveler_to_interest_hit_counts.values())
    max_hit_count = max(traveler_to_interest_hit_counts.values())
    if max_hit_count > 2 * min_hit_count:
        min_hit_count_traveler_name = min(
            traveler_to_interest_hit_counts, key=traveler_to_interest_hit_counts.get
        )
        max_hit_count_traveler_name = max(
            traveler_to_interest_hit_counts, key=traveler_to_interest_hit_counts.get
        )
        raise AgentError(
            f"Traveler {max_hit_count_traveler_name} has more than twice the number of matched interests of {min_hit_count_traveler_name}. {max_hit_count} is more than twice {min_hit_count}."
        )


def eval_itinerary_does_not_recommend_outdoor_activities_during_inclimate_weather_conditions(
    vacation_info, final_output
):
    """Verifies that the itinerary does not recommend outdoor activities during inclement weather conditions.

    Args:
        vacation_info (dict): Contains the vacation details
        final_output (dict): Contains the itinerary details including daily activities and weather conditions

    Raises:
        AgentError: If any outdoor activities are scheduled during weather conditions that could ruin them
    """
    activities_that_may_be_ruined_by_inclimate_weather = []
    for itinerary_item in final_output["itinerary"]:
        weather = itinerary_item["weather"]

        inclimate_weather_prompt = f"""Does the following weather condition prevent outdoor activities?\n\nWeather: {weather}"""
        inclimate_weather_response = bool_response_agent.get_response(
            inclimate_weather_prompt
        )
        if not inclimate_weather_response:
            continue

        for activity in itinerary_item["activities"]:
            activity_is_outdoors = bool_response_agent.get_response(
                f"""Is the following activity outdoors?\n\nActivity: {activity}"""
            )
            if not activity_is_outdoors:
                continue

            query = dedent(f"""
                Could the following activity be ruined by the following weather condition?

                Activity: {activity}
                Weather: {weather}""")
            activity_possibly_ruined = bool_response_agent.get_response(query)
            if activity_possibly_ruined:
                # Keep the day's weather with the activity so the error message
                # does not report only the last day's weather
                activities_that_may_be_ruined_by_inclimate_weather.append(
                    {"activity": activity, "weather": weather}
                )

    if activities_that_may_be_ruined_by_inclimate_weather:
        raise AgentError(
            f"Activities that may be ruined by the weather conditions on their scheduled day:\n\n{activities_that_may_be_ruined_by_inclimate_weather}"
        )


def eval_itinerary_respects_dietary_restrictions(vacation_info, final_output):
    """Verifies that the itinerary respects the dietary restrictions of all travelers.

    Args:
        vacation_info (dict): Contains the vacation details including travelers and their dietary restrictions
        final_output (dict): Contains the itinerary details including daily activities

    Raises:
        AgentError: If any activity involves eating or drinking that would definitely be unsuitable for a traveler's dietary restrictions
    """
    travelers_to_dietary_restrictions = {}
    for traveler in vacation_info["travelers"]:
        travelers_to_dietary_restrictions[traveler["name"]] = traveler[
            "dietary_restrictions"
        ]

    for traveler, restrictions in travelers_to_dietary_restrictions.items():
        if not restrictions:
            continue
        for itinerary_item in final_output["itinerary"]:
            for activity in itinerary_item["activities"]:
                activity_involves_eating_or_drinking = bool_response_agent.get_response(
                    f"""Does the following activity involve eating or drinking?\n\nActivity: {activity}"""
                )
                if not activity_involves_eating_or_drinking:
                    continue

                query = dedent(f"""
                    Would the following activity definitely be unsuitable for someone with the following dietary restrictions?

                    Only consider foods that are explicitly mentioned in the activity description. If there is not enough
                    information to make a decision, return False.

                    Activity: {activity}
                    Dietary Restrictions: {restrictions}""")
                definitely_unsuitable = bool_response_agent.get_response(query)
                if definitely_unsuitable is True:
                    raise AgentError(
                        f"Activity {activity['name']} would be unsuitable for a traveler with the following dietary restrictions: {restrictions}"
                    )


# Let's get the evaluation results!

EVAL_FUNCTIONS = [
    eval_city_matches,
    eval_start_end_dates_match,
    eval_total_cost_is_accurate,
    eval_itinerary_matches_interests_and_is_balanced,
    eval_itinerary_weather_compatibility,  # Using our new weather compatibility function
    eval_itinerary_respects_dietary_restrictions,
]


def get_eval_results(vacation_info, final_output):
    eval_results = []
    for eval_fn in EVAL_FUNCTIONS:
        try:
            eval_fn(vacation_info, final_output)
        except AgentError as e:
            error_msg = str(e)
            print_in_box(error_msg, title="Evaluation Error")
            print("\n\n")

            eval_results.append(error_msg)
    return eval_results


eval_results = get_eval_results(
    vacation_info=gathered_vacation_info, final_output=final_itinerary_1
)

eval_results
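
The balance rule in `eval_itinerary_matches_interests_and_is_balanced` fails when any traveler has zero matches, or when the highest match count is more than twice the lowest. Here is a worked sketch of just that arithmetic, without any LLM calls. The helper `check_balance`, the traveler names, and the hit counts are all made up for illustration.

```python
def check_balance(hit_counts):
    """Return an error message if the counts violate the balance rule, else None.

    hit_counts maps traveler name -> number of activities matching their interests.
    """
    for traveler, count in hit_counts.items():
        if count == 0:
            return f"Traveler {traveler} has no matches with the itinerary."

    low = min(hit_counts.values())
    high = max(hit_counts.values())
    if high > 2 * low:
        return f"Unbalanced: {high} is more than twice {low}."
    return None


print(check_balance({"Yuri": 4, "Hiro": 2}))  # 4 is exactly twice 2, so this passes
print(check_balance({"Yuri": 5, "Hiro": 2}))  # 5 > 2 * 2, so this fails
print(check_balance({"Yuri": 3, "Hiro": 0}))  # a traveler with zero matches fails
```

Note that the comparison is strict: a traveler with exactly twice as many matches as another still passes.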

# ItineraryRevisionAgent: Reflecting on our mistakes




In [None]:
class ItineraryRevisionAgent(ChatAgent):
    system_prompt_template = """
        You are an expert Travel Itinerary Revision Agent with specialized knowledge in evaluating and improving travel plans. Your role is to analyze proposed itineraries, identify issues through evaluation feedback, and systematically revise them to create optimal travel experiences.

        ## Your Primary Task:
        Revise and improve a proposed travel itinerary by addressing specific evaluation errors and ensuring all traveler requirements are met. You must create a refined itinerary that passes all quality checks while maintaining the original trip's core objectives.

        ## THINK-ACT-OBSERVE Cycle Instructions:

        Use this exact cycle format for systematic reasoning and tool usage:

        ### THINK Phase:
        <thought>
        [Analyze the current situation, evaluation feedback, and determine what information or action is needed next. Consider:
        - Which evaluation errors need to be addressed
        - What data you need to gather to fix identified issues
        - How to balance competing constraints (weather, interests, budget, dietary restrictions)
        - Which tool will help you gather the required information]
        </thought>

        ### ACT Phase:
        <act>
        [Use the exact JSON format to invoke tools - see tool specifications below]
        </act>

        ### OBSERVE Phase:
        [You will receive tool results that inform your next thinking cycle]

        ## Available Tools and Their Purposes:

        ### Data Gathering Tools:
        1. **weather** - Get weather conditions for specific dates and locations
           - Purpose: Verify weather compatibility with planned activities
           - Parameters: {"tool": "weather", "date": "YYYY-MM-DD", "city": "CityName"}
           - Use when: Evaluation shows weather-activity conflicts

        2. **events** - Retrieve available activities and events for specific dates and locations  
           - Purpose: Find alternative activities or verify activity details
           - Parameters: {"tool": "events", "date": "YYYY-MM-DD", "city": "CityName"}
           - Use when: Need to replace inappropriate activities or find better options

        3. **get_activities_by_date** - Enhanced activity lookup with comprehensive details
           - Purpose: Get detailed activity information for informed decision-making
           - Parameters: {"tool": "get_activities_by_date", "date": "YYYY-MM-DD", "city": "CityName"}
           - Use when: Need comprehensive activity data for revision planning

        ### Calculation Tools:
        4. **total_cost_calculator** - Calculate total cost of a list of activities
           - Purpose: Verify budget compliance and cost accuracy
           - Parameters: {"tool": "total_cost_calculator", "activities": [list_of_activity_objects]}
           - Use when: Evaluation shows cost calculation errors or budget issues

        ### Evaluation and Quality Assurance:
        5. **run_evals_tool** - Run comprehensive evaluation checks on revised itinerary
           - Purpose: Validate that revisions address all identified issues
           - Parameters: {"tool": "run_evals_tool", "vacation_info": vacation_object, "itinerary": revised_itinerary_object}
             where revised_itinerary_object is the full output object (city, start_date, end_date, itinerary, total_cost)
           - Returns: A list of evaluation error messages; an empty list means every check passed
           - Use when: Ready to test if your revisions solved the evaluation errors
           - **CRITICAL**: You MUST run this tool before submitting your final answer

        ### Final Submission:
        6. **final_answer_tool** - Submit your completed revised itinerary
           - Purpose: Provide the final, evaluation-tested itinerary
           - Parameters: {"tool": "final_answer_tool", "city": "CityName", "start_date": "YYYY-MM-DD", "end_date": "YYYY-MM-DD", "itinerary": [...], "total_cost": number}
           - Use when: run_evals_tool confirms all issues are resolved

        ## EXACT ACTION FORMAT:
        When invoking tools, use this precise JSON format:
        {"tool": "[tool_name]", "arg1": "value1", "arg2": "value2", ...}

        Examples:
        - {"tool": "weather", "date": "2025-06-10", "city": "AgentsVille"}
        - {"tool": "events", "date": "2025-06-10", "city": "AgentsVille"}
        - {"tool": "total_cost_calculator", "activities": [activity1, activity2, activity3]}
        - {"tool": "run_evals_tool", "vacation_info": {...}, "itinerary": {...}}

        ## Critical Process Requirements:

        ### 1. Evaluation-Driven Revision:
        - Carefully analyze each evaluation error message
        - Prioritize fixes based on severity (safety > budget > preferences)
        - Address root causes, not just symptoms

        ### 2. Systematic Approach:
        - Handle one major issue category at a time
        - Gather necessary data before making changes
        - Verify each change doesn't create new problems

        ### 3. Quality Assurance Protocol:
        - ALWAYS run run_evals_tool before final submission
        - If evaluations still fail, continue revision cycle
        - Only use final_answer_tool after successful evaluation

        ### 4. Constraint Balancing:
        - Weather safety takes precedence over preferences
        - Budget compliance is mandatory
        - Interest balance must be maintained across all travelers
        - Dietary restrictions are non-negotiable

        ## Exit Instruction:
        You MUST use the final_answer_tool invocation to provide your completed itinerary. The run_evals_tool must be successfully run (showing no evaluation errors) before using final_answer_tool.

        Begin by analyzing the evaluation feedback and determining which specific issues need to be addressed first.
    """

    @classmethod
    def cost_calculator(cls, activities):
        """Calculate the total cost of all activities"""
        # Cast to int to match the evals, since the LLM may return prices as strings
        total_cost = sum(int(activity.get("price", 0)) for activity in activities)
        return {"total_cost": total_cost}

    def get_itinerary(self, vacation_info, proposed_itinerary):
        final_output = None
        max_steps = 20

        step_num = 0

        evaluation_results = get_eval_results(vacation_info, proposed_itinerary)

        self.add_message(
            "user",
            f"""
            Here is information on the trip collected by the Onboarding Agent:
            {vacation_info}.
            """,
        )
        self.add_message(
            "user",
            f"""
            Here is the proposed itinerary:
            {proposed_itinerary}.
            """,
        )
        self.add_message(
            "user",
            f"""
            Here are the evaluation results:
            {evaluation_results}.
            Please resolve the issues cited in the evaluation results.
            """,
        )
        while step_num < max_steps and final_output is None:
            step_num += 1
            self.add_message(
                "user",
                f"Let's start step {step_num} of the ReAct cycle, starting with <thought>...</thought>."
                " Always reference one of the available tools you will use in the next step, along with its parameter values.",
            )
            resp = self.get_response()  # <thought>...</thought>

            self.add_message(
                "user",
                'Next, respond with <act>{"tool": "tool_name", "param1": "value1", "param2": "value2"}</act>.',
            )  # Sometimes the LLM needs to be reminded of the next step.
            resp = self.get_response()  # <act>...</act>

            # Parse the action
            obj = find_and_parse_json(resp)

            if obj["tool"] == "weather":
                tool_results = get_weather(date=obj["date"], city=obj["city"])
            elif obj["tool"] == "events":
                tool_results = get_events(date=obj["date"], city=obj["city"])
            elif obj["tool"] == "get_activities_by_date":
                tool_results = get_activities_by_date_tool(date=obj["date"], city=obj["city"])
            elif obj["tool"] == "total_cost_calculator":
                tool_results = self.cost_calculator(activities=obj["activities"])
            elif obj["tool"] == "run_evals_tool":
                tool_results = get_eval_results(vacation_info=obj["vacation_info"], final_output=obj["itinerary"])
            elif obj["tool"] == "final_answer_tool":
                final_output = obj
                break
            elif obj["tool"] == "final_output":  # Keep backward compatibility
                final_output = obj
                break
            else:
                raise ValueError(f"Invalid tool: {obj['tool']}. Available tools: weather, events, get_activities_by_date, total_cost_calculator, run_evals_tool, final_answer_tool")

            # Add the observation to the message history
            self.add_message(
                "user",
                f"<observation>{tool_results}</observation>",
            )

        # Update the total_cost, since LLMs sometimes struggle with arithmetic.
        # Skip this when the loop ran out of steps without producing a final answer.
        if final_output is not None:
            actual_total_cost = 0

            for itinerary_item in final_output["itinerary"]:
                for activity in itinerary_item["activities"]:
                    actual_total_cost += int(activity["price"])

            final_output["total_cost"] = actual_total_cost

        return final_output


# Quick test
itinerary_revision_agent = ItineraryRevisionAgent(quiet=False)

final_itinerary_2 = itinerary_revision_agent.get_itinerary(
    vacation_info=gathered_vacation_info,
    proposed_itinerary=final_itinerary_1,
)
print_in_box(final_itinerary_2, title="Final Itinerary")

if final_itinerary_2 is not None:
    print(
        "Final itinerary generated successfully. Congratulations! Let's see if it passes the evaluation."
    )

    evaluation_results = get_eval_results(gathered_vacation_info, final_itinerary_2)
    print_in_box(evaluation_results, title="Evaluation Results")

    if len(evaluation_results) == 0:
        print(
            "Final itinerary passed all evaluations. Congratulations! You created an entire agentic system from scratch!"
        )
    else:
        raise RuntimeError("Final itinerary failed some evaluations. Please try again.")
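
The cell above runs one revision pass and then re-evaluates the result. The Reflexion idea generalizes to a loop: evaluate, revise using the error messages, and stop when no errors remain or a round limit is reached. Below is a minimal sketch with stand-in functions. `evaluate`, `revise`, and `reflexion_loop` are hypothetical, the draft itinerary is made-up data, and a real `revise` would call the LLM rather than recompute the total directly.

```python
def evaluate(itinerary):
    """Stand-in eval: report an error if the stated total is wrong."""
    actual = sum(a["price"] for day in itinerary["itinerary"] for a in day["activities"])
    if actual != itinerary["total_cost"]:
        return [f"Stated total {itinerary['total_cost']} != calculated total {actual}"]
    return []


def revise(itinerary, errors):
    """Stand-in revision step: fixes the total cost without calling an LLM."""
    revised = dict(itinerary)  # shallow copy so the draft is left untouched
    revised["total_cost"] = sum(
        a["price"] for day in itinerary["itinerary"] for a in day["activities"]
    )
    return revised


def reflexion_loop(itinerary, max_rounds=3):
    """Evaluate and revise until there are no errors or max_rounds is reached."""
    errors = evaluate(itinerary)
    rounds = 0
    while errors and rounds < max_rounds:
        itinerary = revise(itinerary, errors)
        errors = evaluate(itinerary)
        rounds += 1
    return itinerary, errors, rounds


draft = {
    "itinerary": [{"date": "2025-06-10", "activities": [{"name": "Tour", "price": 25}]}],
    "total_cost": 40,
}
final, remaining_errors, rounds_used = reflexion_loop(draft)
print(final["total_cost"], remaining_errors, rounds_used)
```

The round limit plays the same role as `max_steps` in the agents: it bounds the cost of the loop when the errors cannot be resolved.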

# And, just for fun!

In [None]:
# And finally, just for fun!

from project_lib import narrate_my_trip

narrate_my_trip(vacation_info=gathered_vacation_info, itinerary=final_itinerary_2)
