# Chapter 4 – Reflection and Introspection in Agents
---

## Install dependencies

In [None]:
!pip install -U openai ipywidgets crewai pysqlite3-binary

# 1. Meta Reasoning - example
---

Let's take a look at a simple meta-reasoning approach without AI.

In [16]:
import random

## Simulated travel agent with meta-reasoning capabilities

- recommend_destination: The agent recommends a destination based on user preferences (budget, luxury, adventure) and internal weightings.

- get_user_feedback: The agent receives feedback on the recommendation (positive or negative).

- meta_reasoning: The agent adjusts its reasoning by updating the weights based on feedback, improving future recommendations.


In [2]:
# Simulated travel agent with meta-reasoning capabilities
class ReflectiveTravelAgent:
    def __init__(self):
        # Initialize preference weights that determine how user preferences influence recommendations
        self.preferences_weights = {
            "budget": 0.5,    # Weight for budget-related preferences
            "luxury": 0.3,    # Weight for luxury-related preferences
            "adventure": 0.2  # Weight for adventure-related preferences
        }
        self.user_feedback = []  # List to store user feedback for meta-reasoning

    def recommend_destination(self, user_preferences):
        """
        Recommend a destination based on user preferences and internal weightings.

        Args:
            user_preferences (dict): User's preferences with keys like 'budget', 'luxury', 'adventure'

        Returns:
            str: Recommended destination
        """
        # Calculate scores for each destination based on weighted user preferences
        score = {
            "Paris": (self.preferences_weights["luxury"] * user_preferences["luxury"] + 
                      self.preferences_weights["adventure"] * user_preferences["adventure"]),
            "Bangkok": (self.preferences_weights["budget"] * user_preferences["budget"] +
                        self.preferences_weights["adventure"] * user_preferences["adventure"]),
            "New York": (self.preferences_weights["luxury"] * user_preferences["luxury"] +
                         self.preferences_weights["budget"] * user_preferences["budget"])
        }
        # Select the destination with the highest calculated score
        recommendation = max(score, key=score.get)
        return recommendation

    def get_user_feedback(self, actual_experience):
        """
        Simulate receiving user feedback and trigger meta-reasoning to adjust recommendations.

        Args:
            actual_experience (str): The destination the user experienced
        """
        # Simulate user feedback: 1 for positive, -1 for negative
        feedback = random.choice([1, -1])
        print(f"Feedback for {actual_experience}: {'Positive' if feedback == 1 else 'Negative'}")
        
        # Store the feedback for later analysis
        self.user_feedback.append((actual_experience, feedback))
        
        # Trigger meta-reasoning to adjust the agent's reasoning process based on feedback
        self.meta_reasoning()

    def meta_reasoning(self):
        """
        Analyze collected feedback and adjust preference weights to improve future recommendations.
        This simulates the agent reflecting on its reasoning process and making adjustments.
        """
        # Apply only the most recent feedback; earlier entries were already applied on previous calls
        for destination, feedback in self.user_feedback[-1:]:
            if feedback == -1:  # Negative feedback indicates dissatisfaction
                # Reduce the weight of the main attribute associated with the destination
                if destination == "Paris":
                    self.preferences_weights["luxury"] *= 0.9  # Decrease luxury preference
                elif destination == "Bangkok":
                    self.preferences_weights["budget"] *= 0.9  # Decrease budget preference
                elif destination == "New York":
                    self.preferences_weights["budget"] *= 0.9  # Decrease budget preference
            elif feedback == 1:  # Positive feedback indicates satisfaction
                # Increase the weight of the main attribute associated with the destination
                if destination == "Paris":
                    self.preferences_weights["luxury"] *= 1.1  # Increase luxury preference
                elif destination == "Bangkok":
                    self.preferences_weights["budget"] *= 1.1  # Increase budget preference
                elif destination == "New York":
                    self.preferences_weights["budget"] *= 1.1  # Increase budget preference

        # Normalize weights to ensure they sum up to 1 for consistency
        total_weight = sum(self.preferences_weights.values())
        for key in self.preferences_weights:
            self.preferences_weights[key] /= total_weight

        # Display updated weights after meta-reasoning adjustments
        print(f"Updated weights: {self.preferences_weights}\n")

## Simulation

- User Preferences: Defines the user's preferences for budget, luxury, and adventure.

- First Recommendation: The agent recommends a destination based on the initial weights and user preferences.

- User Feedback Simulation: Simulates the user providing feedback on the recommended destination.

- Second Recommendation: After adjusting the weights based on feedback, the agent makes a new recommendation that reflects the updated reasoning process.


In [5]:
# Simulate agent usage
if __name__ == "__main__":
    agent = ReflectiveTravelAgent()

    # User's initial preferences
    user_preferences = {
        "budget": 0.8,      # High preference for budget-friendly options
        "luxury": 0.2,      # Low preference for luxury
        "adventure": 0.5    # Moderate preference for adventure activities
    }

    # First recommendation based on initial preferences and weights
    recommended = agent.recommend_destination(user_preferences)
    print(f"Recommended destination: {recommended}")

    # Simulate user experience and provide feedback
    agent.get_user_feedback(recommended)

    # Second recommendation after adjusting weights based on feedback
    recommended = agent.recommend_destination(user_preferences)
    print(f"Updated recommendation: {recommended}")


Recommended destination: Bangkok
Feedback for Bangkok: Positive
Updated weights: {'budget': 0.5238095238095238, 'luxury': 0.2857142857142857, 'adventure': 0.19047619047619047}

Updated recommendation: Bangkok
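
The numbers in the run above can be checked by hand. The sketch below recomputes the initial destination scores and the weight update for positive feedback on Bangkok as standalone arithmetic; it copies the values from the cells above rather than calling the class.

```python
# Initial weights and user preferences, copied from the cells above
weights = {"budget": 0.5, "luxury": 0.3, "adventure": 0.2}
prefs = {"budget": 0.8, "luxury": 0.2, "adventure": 0.5}

# Each destination sums two weighted preference terms
scores = {
    "Paris": weights["luxury"] * prefs["luxury"] + weights["adventure"] * prefs["adventure"],
    "Bangkok": weights["budget"] * prefs["budget"] + weights["adventure"] * prefs["adventure"],
    "New York": weights["luxury"] * prefs["luxury"] + weights["budget"] * prefs["budget"],
}
best = max(scores, key=scores.get)
print({city: round(s, 2) for city, s in scores.items()}, "->", best)

# Positive feedback on Bangkok scales the budget weight by 1.1,
# then all weights are renormalized so they sum to 1
weights["budget"] *= 1.1
total = sum(weights.values())
updated = {k: v / total for k, v in weights.items()}
print({k: round(v, 4) for k, v in updated.items()})
```

Bangkok wins with a score of 0.50 (against 0.46 for New York and 0.16 for Paris), and after renormalization the budget weight rises from 0.5 to about 0.524 while the other two shrink slightly, which matches the printed weights.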


## Meta-reasoning with AI
---

Now let's bring in AI to perform meta-reasoning with agents. In this case we will use the CrewAI framework to create our meta-reasoning agents with OpenAI LLMs. We will also emulate user feedback with AI, purely for demonstration purposes. First we set the OpenAI API key, then we define the "Crew" (with CrewAI) and the agents.

In [1]:
import getpass
__import__('pysqlite3')
import sys
import os

sys.modules['sqlite3'] = sys.modules.pop('pysqlite3')

api_key = getpass.getpass(prompt="Enter OpenAI API Key: ")

os.environ["OPENAI_API_KEY"] = api_key

We will define three tools that our agents will use:

1. `recommend_destination`: This tool starts from a set of base weights that prioritize budget, luxury, and adventure equally, and combines them with the user's preference weights to recommend a destination. In the scoring code, Bangkok doubles the budget term and NYC boosts the luxury and adventure terms by 1.5, while Paris uses the unscaled terms.
2. `update_weights_on_feedback`: This tool updates the internal base weights based on the user's feedback on the recommended destination. Positive feedback tells the model that its recommendation was correct, so it increases the relevant base weights by a given (arbitrary) adjustment factor; dissatisfied feedback reduces those weights by the same factor.
3. `feedback_emulator`: This tool emulates a user providing "satisfied" or "dissatisfied" feedback on the AI agent's destination recommendation.

In [None]:
from crewai.tools import tool

# Tool 1
@tool("Recommend travel destination based on preferences.")
def recommend_destination(user_preferences: dict) -> str:
    """
    Recommend a destination based on user preferences and internal weightings.

    Args:
        user_preferences (dict): User's preferences with keys - 'budget', 'luxury', 'adventure'
                                default user_preference weights 'budget' = 0.8, 'luxury' = 0.2, 'adventure' = 0.5
                                user_preferences = {
                                                "budget": 0.8,
                                                "luxury": 0.2,
                                                "adventure": 0.5
                                            }
    Returns:
        str: Recommended destination
    """
    internal_default_weights = {
            "budget": 0.33,    # Weight for budget-related preferences
            "luxury": 0.33,    # Weight for luxury-related preferences
            "adventure": 0.33  # Weight for adventure-related preferences
        }
    # Calculate weighted scores for each destination
    score = {
        "Paris": (
            internal_default_weights["luxury"] * user_preferences["luxury"] +      # Paris applies no multiplier to any term
            internal_default_weights["adventure"] * user_preferences["adventure"] +
            internal_default_weights["budget"] * user_preferences["budget"]
        ),
        "Bangkok": (
            internal_default_weights["budget"] * user_preferences["budget"] * 2 +  # Bangkok emphasizes budget
            internal_default_weights["luxury"] * user_preferences["luxury"] +
            internal_default_weights["adventure"] * user_preferences["adventure"]
        ),
        "New York": (
            internal_default_weights["luxury"] * user_preferences["luxury"] * 1.5 +  # NYC emphasizes luxury and adventure
            internal_default_weights["adventure"] * user_preferences["adventure"] * 1.5 +
            internal_default_weights["budget"] * user_preferences["budget"]
        )
    }
    
    # Select the destination with the highest calculated score
    recommendation = max(score, key=score.get)
    return recommendation

# Tool 2
@tool("Reasoning tool to adjust preference weights based on user feedback.")
def update_weights_on_feedback(destination: str, feedback: int, adjustment_factor: float) -> dict:
    """
    Analyze collected feedback and adjust internal preference weights based on user feedback for better future recommendations.

    Args:        
        destination (str): The destination recommended ('New York', 'Bangkok' or 'Paris')
        feedback (int): Feedback score; 1 = Satisfied, -1 = dissatisfied
        adjustment_factor (float): The adjustment factor between 0 and 1 that will be used to adjust the internal weights.
                                 Value will be used as (1 - adjustment_factor) for dissatisfied feedback and (1 + adjustment_factor)
                                 for satisfied feedback.
    Returns:
        dict: Adjusted internal weights
    """
    internal_default_weights = {
        "budget": 0.33,    # Weight for budget-related preferences
        "luxury": 0.33,    # Weight for luxury-related preferences
        "adventure": 0.33  # Weight for adventure-related preferences
    }

    # Define primary and secondary characteristics for each destination
    destination_characteristics = {
        "Paris": {
            "primary": "luxury",
            "secondary": "adventure"
        },
        "Bangkok": {
            "primary": "budget",
            "secondary": "adventure"
        },
        "New York": {
            "primary": "luxury",
            "secondary": "adventure"
        }
    }

    # Get the characteristics for the given destination
    dest_chars = destination_characteristics.get(destination, {})
    primary_feature = dest_chars.get("primary")
    secondary_feature = dest_chars.get("secondary")

    # adjustment_factor = 0.2  # How much to adjust weights by

    if feedback == -1:  # Negative feedback
        # Decrease weights for the destination's characteristics
        if primary_feature:
            internal_default_weights[primary_feature] *= (1 - adjustment_factor)
        if secondary_feature:
            internal_default_weights[secondary_feature] *= (1 - adjustment_factor/2)
            
    elif feedback == 1:  # Positive feedback
        # Increase weights for the destination's characteristics
        if primary_feature:
            internal_default_weights[primary_feature] *= (1 + adjustment_factor)
        if secondary_feature:
            internal_default_weights[secondary_feature] *= (1 + adjustment_factor/2)

    # Normalize weights to ensure they sum up to 1
    total_weight = sum(internal_default_weights.values())
    for key in internal_default_weights:
        internal_default_weights[key] = round(internal_default_weights[key] / total_weight, 2)

    # Ensure weights sum to exactly 1.0 after rounding
    adjustment = 1.0 - sum(internal_default_weights.values())
    if adjustment != 0:
        # Add any rounding difference to the largest weight
        max_key = max(internal_default_weights, key=internal_default_weights.get)
        internal_default_weights[max_key] = round(internal_default_weights[max_key] + adjustment, 2)

    return internal_default_weights

# Tool 3
@tool("User feedback emulator tool")
def feedback_emulator(destination: str) -> int:
    """
    Given a destination recommendation (such as 'New York' or 'Bangkok') this tool will emulate to provide
    a user feedback as 1 (satisfied) or -1 (dissatisfied)
    """
    import random
    feedback = random.choice([-1, 1])
    return feedback
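
To see what the weight update does numerically, the following plain-Python sketch repeats the arithmetic of `update_weights_on_feedback` without the `@tool` decorator. `adjust_weights_sketch` is a hypothetical helper written for illustration; unlike the tool, it takes the primary and secondary features directly and rejects feedback values other than 1 or -1.

```python
def adjust_weights_sketch(primary: str, secondary: str, feedback: int, adjustment_factor: float) -> dict:
    """Hypothetical helper mirroring the arithmetic in update_weights_on_feedback."""
    if feedback not in (1, -1):
        raise ValueError("feedback must be 1 (satisfied) or -1 (dissatisfied)")

    weights = {"budget": 0.33, "luxury": 0.33, "adventure": 0.33}

    # The primary feature moves by the full factor, the secondary by half of it
    weights[primary] *= 1 + feedback * adjustment_factor
    weights[secondary] *= 1 + feedback * adjustment_factor / 2

    # Normalize so the weights sum to 1, rounding to two decimals
    total = sum(weights.values())
    weights = {k: round(v / total, 2) for k, v in weights.items()}

    # Push any rounding remainder onto the largest weight
    remainder = round(1.0 - sum(weights.values()), 2)
    if remainder:
        largest = max(weights, key=weights.get)
        weights[largest] = round(weights[largest] + remainder, 2)
    return weights


# New York is characterized as primary "luxury", secondary "adventure"
satisfied = adjust_weights_sketch("luxury", "adventure", feedback=1, adjustment_factor=0.2)
dissatisfied = adjust_weights_sketch("luxury", "adventure", feedback=-1, adjustment_factor=0.2)
print("Satisfied:   ", satisfied)
print("Dissatisfied:", dissatisfied)
```

With an adjustment factor of 0.2, satisfied feedback on New York shifts weight toward luxury, and dissatisfied feedback shifts it toward budget, the one feature New York is not associated with.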

Once the tools are defined, we declare three CrewAI agents, each of which uses one of the tools above. The `meta_agent` performs the meta-reasoning: it takes the emulated user feedback and the previously recommended destination and updates the internal weights using an `adjustment_factor`.

Note that the model assigns the adjustment factor dynamically when it adjusts the internal system weights (initially `{"budget": 0.33, "luxury": 0.33, "adventure": 0.33}`); we are not hard-coding it. Although the feedback in this example is limited to "satisfied" or "dissatisfied" (1 or -1), feedback can take many forms and may contain more detail, in which case your AI agent may choose different values for `adjustment_factor`. More contextual, detailed feedback helps the model perform better meta-reasoning on its previous responses.
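
As a rough illustration of how richer feedback could lead to different adjustment factors, the sketch below maps a 1-5 star rating to a `(feedback, adjustment_factor)` pair. The star scale, the function name `rating_to_adjustment`, and the linear mapping are all assumptions made for this example; in the notebook the LLM picks the factor itself.

```python
from typing import Tuple


def rating_to_adjustment(stars: int) -> Tuple[int, float]:
    """Hypothetical mapping from a 1-5 star rating to (feedback, adjustment_factor)."""
    if not 1 <= stars <= 5:
        raise ValueError("stars must be between 1 and 5")

    distance = stars - 3                     # -2 (very unhappy) .. +2 (very happy)
    feedback = 1 if distance >= 0 else -1    # sign of the feedback
    adjustment_factor = abs(distance) / 4    # 0.0, 0.25 or 0.5; stronger opinions adjust more
    return feedback, adjustment_factor


for stars in range(1, 6):
    print(stars, rating_to_adjustment(stars))
```

A neutral 3-star rating yields a factor of 0.0, so the weights stay unchanged, while 1-star and 5-star ratings produce the largest adjustment in opposite directions.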

In [None]:
from crewai import Agent, Task, Crew
from typing import Dict, Union
import random

# Utility functions
def process_recommendation_output(output: str) -> str:
    """Extract the clean destination string from the agent's output."""
    # Handle various ways the agent might format the destination
    for city in ["Paris", "Bangkok", "New York"]:
        if city.lower() in output.lower():
            return city
    return output.strip()

def process_feedback_output(output: Union[Dict, str]) -> int:
    """Extract the feedback value from the agent's output."""
    if isinstance(output, dict):
        return output.get('feedback', 0)
    try:
        # Try to parse as integer if it's a string
        return int(output)
    except (ValueError, TypeError):
        return 0

def generate_random_preferences():
    # Generate 3 random numbers and normalize them
    values = [random.random() for _ in range(3)]
    total = sum(values)
    
    return {
        "budget": round(values[0]/total, 2),
        "luxury": round(values[1]/total, 2),
        "adventure": round(values[2]/total, 2)
    }

# Initial shared state for weights, preferences, and results
state = {
    "weights": {"budget": 0.33, "luxury": 0.33, "adventure": 0.33},
    "preferences": generate_random_preferences()
}

# Agents
preference_agent = Agent(
    name="Preference Agent",
    role="Travel destination recommender",
    goal="Provide the best travel destination based on user preferences and weights.",
    backstory="An AI travel expert adept at understanding user preferences.",
    verbose=True,
    llm='gpt-4o-mini',
    tools=[recommend_destination]
)

feedback_agent = Agent(
    name="Feedback Agent",
    role="Simulated feedback provider",
    goal="Provide simulated feedback for the recommended travel destination.",
    backstory="An AI that mimics user satisfaction or dissatisfaction for travel recommendations.",
    verbose=True,
    llm='gpt-4o-mini',
    tools=[feedback_emulator]
)

meta_agent = Agent(
    name="Meta-Reasoning Agent",
    role="Preference weight adjuster",
    goal="Reflect on feedback and adjust the preference weights to improve future recommendations.",
    backstory="An AI optimizer that learns from user experiences to fine-tune recommendation preferences.",
    verbose=True,
    llm='gpt-4o-mini',
    tools=[update_weights_on_feedback]
)


# Tasks with data passing
generate_recommendation = Task(
    name="Generate Recommendation",
    agent=preference_agent,
    description=(
        f"Use the recommend_destination tool with these preferences: {state['preferences']}\n"
        "Return only the destination name as a simple string (Paris, Bangkok, or New York)."
    ),
    expected_output="A destination name as a string",
    output_handler=process_recommendation_output
)

simulate_feedback = Task(
    name="Simulate User Feedback",
    agent=feedback_agent,
    description=(
        "Use the feedback_emulator tool with the destination from the previous task.\n"
        "Instructions:\n"
        "1. Get the destination string from the previous task\n"
        "2. Pass it directly to the feedback_emulator tool\n"
        "3. Return the feedback value (1 or -1)\n\n"
        "IMPORTANT: Pass the destination as a plain string, not a dictionary."
    ),
    expected_output="An integer feedback value: 1 or -1",
    context=[generate_recommendation],
    output_handler=process_feedback_output
)

adjust_weights = Task(
    name="Adjust Weights Based on Feedback",
    agent=meta_agent,
    description=(
        "Use the update_weights_on_feedback tool with:\n"
        "1. destination: Get from first task's output (context[0])\n"
        "2. feedback: Get from second task's output (context[1])\n"
        "3. adjustment_factor: a number between 0 and 1 that will be used to adjust internal weights based on feedback\n\n"
        "Ensure all inputs are in their correct types (string for destination, integer for feedback)."
    ),
    expected_output="Updated weights as a dictionary",
    context=[generate_recommendation, simulate_feedback]
)

# Crew Definition
crew = Crew(
    agents=[preference_agent, feedback_agent, meta_agent],
    tasks=[generate_recommendation, simulate_feedback, adjust_weights],
    verbose=False
)

# Execute the workflow
result = crew.kickoff()
print("\nFinal Results:", result)

# Agent: Travel destination recommender
## Task: Use the recommend_destination tool with these preferences: {'budget': 0.04, 'luxury': 0.02, 'adventure': 0.94}
Return only the destination name as a simple string (Paris, Bangkok, or New York).


# Agent: Travel destination recommender
## Thought: I need to analyze the user's preferences which heavily favor adventure and very little for budget and luxury.
## Using tool: Recommend travel destination based on preferences.
## Tool Input:
"{\"user_preferences\": {\"budget\": 0.04, \"luxury\": 0.02, \"adventure\": 0.94}}"
## Tool Output:
New York


# Agent: Travel destination recommender
## Final Answer:
New York


# Agent: Simulated feedback provider
## Task: Use the feedback_emulator tool with t


# 2. Self Explanation - example

## ReflectiveTravelAgentWithSelfExplanation

The `ReflectiveTravelAgentWithSelfExplanation` class simulates a travel agent that not only recommends destinations based on user preferences but also explains the reasoning behind its recommendations. 

1. **Preference-Based Recommendations**: It takes user preferences (like budget, luxury, and adventure preferences) and calculates scores for different travel destinations by weighing those preferences. The destination with the highest score is recommended to the user.

2. **Self-Explanation**: For each recommendation, the agent generates a detailed self-explanation. This explanation outlines the factors that led to the recommendation, such as proximity to popular attractions, budget-friendly options, or the presence of adventure activities. The purpose is to provide transparency into how the decision was made, helping the user understand the reasoning process.

3. **Feedback Reflection**: The agent doesn't stop after making the recommendation. It actively reflects on user feedback (whether positive or negative). If the feedback is negative, it introspects on its decision-making process and adjusts the importance (weights) it assigns to user preferences for future recommendations. For instance, if a user dislikes a budget-friendly recommendation, the agent might reduce the emphasis it places on budget-related preferences.

4. **User Engagement**: The class also simulates a dialogue with the user. After giving the recommendation and the self-explanation, it collects feedback from the user, allowing for a more collaborative interaction. This feedback is then used to refine future recommendations, making the agent more adaptive and personalized.



In [1]:
import getpass
import os
__import__('pysqlite3')
import sys

sys.modules['sqlite3'] = sys.modules.pop('pysqlite3')
api_key = getpass.getpass(prompt="Enter OpenAI API Key: ")
os.environ["OPENAI_API_KEY"] = api_key

### 2.1 Transparency: Verbalizing Reasoning in Decisions

Let's use the OpenAI SDK to see how a model can explain the reasoning behind the decisions it makes. Here, the agent generates explanations for its reasoning when recommending a travel itinerary. It uses GPT-4o-mini to generate the self-explanations.

In [None]:
import openai

# Mock data for the travel recommendation
user_preferences = {
    "location": "Paris",
    "budget": 200,
    "preferences": ["proximity to attractions", "user ratings"],
}

# Input reasoning factors for the GPT model
reasoning_prompt = f"""
You are an AI-powered travel assistant. Explain your reasoning behind a hotel recommendation for a user traveling to {user_preferences['location']}.
Consider:
1. Proximity to popular attractions.
2. High ratings from similar travelers.
3. Competitive pricing within ${user_preferences['budget']} budget.
4. Preferences: {user_preferences['preferences']}.
Provide a clear, transparent self-explanation.
"""

response = openai.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "You are a reflective travel assistant."},
        {"role": "user", "content": reasoning_prompt},
    ]
)

# Print self-explanation
print("Agent Self-Explanation:")
print(response.choices[0].message.content)

Agent Self-Explanation:
Sure! When making a hotel recommendation for a user traveling to Paris, there are several key factors to consider based on the given preferences and criteria. Here's how I would arrive at a recommendation:

1. **Proximity to Popular Attractions**: Paris is renowned for its iconic landmarks, such as the Eiffel Tower, the Louvre, and Notre-Dame Cathedral. Staying near these attractions can significantly enhance a visitor's experience by reducing travel time and allowing for greater flexibility to explore at their leisure. Thus, I would prioritize hotels that are within walking distance or a short metro ride to these major sites.

2. **High Ratings from Similar Travelers**: User ratings provide insight into the quality of the hotel, service, cleanliness, and overall satisfaction from previous guests. To ensure a positive experience, I would look for hotels that have a high average rating (usually 4 stars and above) from travelers who have similar interests or prefe

#### Using Crew AI

Now we will perform similar reasoning with a CrewAI Agent/Task combination.

In [8]:
from crewai import Agent, Task, Crew
from crewai.process import Process

travel_agent = Agent(
    role="Travel Advisor",
    goal="Provide hotel recommendations with transparent reasoning.",
    backstory="An AI travel advisor specializing in personalized travel planning."
)

recommendation_task = Task(
    name="Recommend hotel",
    description="""
    Recommend a hotel in Paris for a user with the following preferences:
    - Budget: $200 per night
    - Preferences: Proximity to attractions, high user ratings
    Provide a transparent explanation of the reasoning behind the recommendation.
    """,
    agent=travel_agent,
    expected_output="The name of the hotel with explanations"
)

travel_crew = Crew(
    agents=[travel_agent],
    tasks=[recommendation_task],
    process=Process.sequential,
    verbose=True
)

travel_crew.kickoff()

# Retrieve and print the output
output = recommendation_task.output
print("Hotel Recommendation and Explanation:")
print(output)

# Agent: Travel Advisor
## Task:
    Recommend a hotel in Paris for a user with the following preferences:
    - Budget: $200 per night
    - Preferences: Proximity to attractions, high user ratings
    Provide a transparent explanation of the reasoning behind the recommendation.


# Agent: Travel Advisor
## Final Answer:
Based on your preferences of a budget of $200 per night, a desire for proximity to attractions, and high user ratings, I recommend the "Hôtel de la Bourdonnais."

This hotel is situated in the 7th arrondissement of Paris, just a short walk from iconic attractions such as the Eiffel Tower and the Champ de Mars. Its prime location allows you to explore one of the most beautiful areas in Paris conveniently, as you'll also find numerous restaurants, cafes, and shops nearby, enhancing your local experience.

The Hôtel de la Bourdonnais has excellent user ratings, often noted f

In [None]:
class ReflectiveTravelAgentWithSelfExplanation:
    def __init__(self):
        # Initialize the internal weights for user preferences (e.g., budget, luxury, adventure)
        self.preferences_weights = {
            "budget": 0.4,    # Weight for budget-related preferences
            "luxury": 0.3,    # Weight for luxury-related preferences
            "adventure": 0.3  # Weight for adventure-related preferences
        }

    def recommend_destination(self, user_preferences):
        """
        Recommend a destination based on user preferences and provide a self-explanation.

        Args:
            user_preferences (dict): User's preferences for different factors (e.g., budget, luxury, adventure)
        
        Returns:
            (str, str): Recommended destination and the self-explanation
        """
        # Score each destination by multiplying preference weights with user preferences
        score = {
            "Paris": (self.preferences_weights["luxury"] * user_preferences["luxury"] + 
                      self.preferences_weights["adventure"] * user_preferences["adventure"]),
            "Bangkok": (self.preferences_weights["budget"] * user_preferences["budget"] +
                        self.preferences_weights["adventure"] * user_preferences["adventure"]),
            "New York": (self.preferences_weights["luxury"] * user_preferences["luxury"] +
                         self.preferences_weights["budget"] * user_preferences["budget"])
        }
        
        # Choose the destination with the highest score
        recommendation = max(score, key=score.get)
        
        # Generate and return a self-explanation for the recommendation
        explanation = self.generate_self_explanation(recommendation, score[recommendation], user_preferences)
        
        return recommendation, explanation

    def generate_self_explanation(self, destination, score, user_preferences):
        """
        Generate a self-explanation for the recommended destination.
        
        Args:
            destination (str): The recommended destination
            score (float): The score assigned to the destination
            user_preferences (dict): The user's preferences used for the recommendation
        
        Returns:
            str: Self-explanation of the recommendation
        """
        # Start the explanation with the destination and its score
        explanation = (
            f"I recommended {destination} because it aligns with your preferences. "
            f"The destination scored {score:.2f} based on the following factors:\n"
        )
        
        # Customize the explanation for each destination based on user preferences
        if destination == "Paris":
            explanation += (
                "- High luxury offerings (aligned with your luxury preference).\n"
                "- Availability of adventure activities.\n"
            )
        elif destination == "Bangkok":
            explanation += (
                "- Budget-friendly options (aligned with your budget preference).\n"
                "- Availability of adventure experiences.\n"
            )
        elif destination == "New York":
            explanation += (
                "- Combination of luxury experiences and budget-friendly options.\n"
            )
        
        return explanation

    def reflect_on_feedback(self, destination, user_feedback):
        """
        Reflect on user feedback to improve decision-making in future recommendations.
        
        Args:
            destination (str): The destination that was recommended
            user_feedback (str): User feedback ('positive' or 'negative')
        """
        # If the user provides negative feedback, adjust the internal reasoning process
        if user_feedback == 'negative':
            print(f"User provided negative feedback for {destination}. Reflecting on reasoning...")
            
            # Example: If Bangkok was chosen and the user disliked it, reduce budget weight
            if destination == "Bangkok":
                print("Realizing that budget weight might have been overemphasized. Reconsidering weights...")
                self.preferences_weights["budget"] *= 0.9  # Reduce budget importance slightly

            # If Paris, reduce importance of luxury if feedback is negative
            elif destination == "Paris":
                print("Luxury might have been over-prioritized. Adjusting luxury weight...")
                self.preferences_weights["luxury"] *= 0.9

            # Normalize weights after adjustment to maintain balance
            total_weight = sum(self.preferences_weights.values())
            for key in self.preferences_weights:
                self.preferences_weights[key] /= total_weight  # Normalize weights

            print(f"Updated weights: {self.preferences_weights}\n")
        else:
            # Positive feedback indicates no changes are needed
            print(f"User provided positive feedback for {destination}. No changes needed.")

    def engage_with_user(self, recommendation, explanation):
        """
        Simulate user interaction by providing a self-explanation and inviting feedback.

        Args:
            recommendation (str): The recommended destination
            explanation (str): Self-explanation for the recommendation
        """
        # Show the recommendation and its explanation to the user
        print(f"Recommended destination: {recommendation}")
        print(f"Self-explanation: {explanation}")

        # Simulate user feedback (positive or negative)
        # Normalize the reply so that " Negative " or "NEGATIVE" is still recognized
        user_feedback = input(f"Did you like the recommendation for {recommendation}? (positive/negative): ").strip().lower()
        
        # Reflect on feedback and adjust the agent's reasoning process if needed
        self.reflect_on_feedback(recommendation, user_feedback)
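
To make the weight update concrete, the arithmetic for one round of negative feedback can be traced in isolation. The sketch below is a standalone illustration, assuming the starting weights 0.5 / 0.3 / 0.2 from the first example in this chapter; `apply_negative_feedback` is a hypothetical helper name and is not part of the class above.

```python
def apply_negative_feedback(weights, penalized_key, factor=0.9):
    """Scale one weight down, then renormalize so all weights sum to 1."""
    updated = dict(weights)              # copy so the caller's dict is untouched
    updated[penalized_key] *= factor     # same 10% reduction the agent applies
    total = sum(updated.values())
    return {key: value / total for key, value in updated.items()}


# Assumed starting weights (taken from the first agent in this chapter)
weights = {"budget": 0.5, "luxury": 0.3, "adventure": 0.2}

# Negative feedback on Bangkok penalizes the budget weight
new_weights = apply_negative_feedback(weights, "budget")
for key, value in new_weights.items():
    print(f"{key}: {value:.3f}")
```

Because of the renormalization step, the budget weight ends up near 0.474 rather than 0.45, and the luxury and adventure weights rise slightly even though they were never adjusted directly.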


### 2.2 Learning and Refinement: Using Self-Explanation to Identify Gaps

Just as before, let's use direct API calls with the OpenAI SDK to see what learning from self-explanation looks like. In this example, the agent learns from user feedback: if the user rejects a recommendation, the agent reviews its reasoning and identifies areas for improvement.

In [2]:
import openai

# User feedback indicating dissatisfaction
user_feedback = "The hotel you recommended was too far from public transport. I prefer locations closer to metro stations."

# Prompt for refinement based on feedback
refinement_prompt = f"""
You are an AI travel assistant that reflects on feedback to improve recommendations.
Here is the user feedback: 
"{user_feedback}"

- Review your previous recommendation and identify the oversight in your reasoning.
- Update your reasoning process to include aspects that were missed.
- Provide the refined steps that you will use to recommend hotels.
"""

response = openai.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "You are a self-improving travel assistant."},
        {"role": "user", "content": refinement_prompt},
    ]
)

# Print refined explanation
print("Refined Self-Explanation:")
print(response.choices[0].message.content)


Refined Self-Explanation:
Thank you for your feedback, which is invaluable for improving my recommendations. Here’s how I will address your concerns:

### Reflection on Oversight
The oversight in my previous recommendation was primarily focused on factors like amenities, price, and general area appeal but did not adequately prioritize proximity to public transport, particularly metro stations. This is critical, especially for travelers who rely heavily on public transit for mobility.

### Updated Reasoning Process
To ensure better recommendations in the future, I will include the following aspects in my reasoning process:

1. **Proximity to Public Transport**: Make logical priority for metro stations, tram stops, and bus stations. Understanding that easy access to public transportation is often a priority for travelers.

2. **Accessibility Mapping**: Utilize mapping tools to assess the distance from hotels to public transport stations, aiming for locations within a reasonable walking d
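
The prompt above asks the model to review its previous recommendation, but the request itself carries only the feedback. One way to give the model something concrete to reflect on is to replay the earlier turn as an `assistant` message. The sketch below only builds the `messages` list, so it runs without an API key; `build_refinement_messages` is a hypothetical helper name, and the recommendation text is a placeholder for whatever the assistant actually said earlier.

```python
def build_refinement_messages(previous_recommendation, user_feedback):
    """Build a chat history that includes the turn being reflected on."""
    return [
        {"role": "system", "content": "You are a self-improving travel assistant."},
        # Replay the earlier answer so the model can inspect its own reasoning
        {"role": "assistant", "content": previous_recommendation},
        {
            "role": "user",
            "content": (
                f'Here is my feedback: "{user_feedback}"\n'
                "- Identify the oversight in your reasoning.\n"
                "- Provide the refined steps you will use to recommend hotels."
            ),
        },
    ]


# Placeholder values for illustration only
previous_recommendation = "I recommend a hotel with strong ratings and a low nightly price."
user_feedback = "The hotel you recommended was too far from public transport."

messages = build_refinement_messages(previous_recommendation, user_feedback)
for message in messages:
    print(message["role"], "->", message["content"][:60])
```

The resulting list can be passed as `messages=` to the same `openai.chat.completions.create` call used in the cell above.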

#### Using CrewAI

Just like before, we can define an Agent/Task pair with the CrewAI framework whose job is to learn from previous responses and user feedback and refine future responses.

In [11]:
from crewai import Agent, Task, Crew
from crewai.process import Process

reflective_travel_agent = Agent(
    role="Self-Improving Travel Advisor",
    goal="Refine hotel recommendations based on user feedback to improve decision-making.",
    backstory="""
    A reflective AI travel assistant that learns from user feedback. 
    When a user highlights an issue with a recommendation, it revisits its reasoning,
    identifies overlooked factors, and updates its decision process accordingly.
    """
)

user_feedback = "The hotel you recommended was too far from public transport. I prefer locations closer to metro stations."

feedback_task = Task(
    description=f"""
    Reflect on user feedback:
    "{user_feedback}"

    - Identify any oversight in your previous reasoning process.
    - Update your reasoning process to include aspects that were missed.
    - Provide the refined steps that you will use to recommend hotels.
    """,
    expected_output="""
    A refined explanation that acknowledges the oversight, includes missed factors,
    and provides revised steps to recommend hotels tailored to the user's feedback.
    """,
    agent=reflective_travel_agent
)

travel_feedback_crew = Crew(
    agents=[reflective_travel_agent],
    tasks=[feedback_task],
    process=Process.sequential,
    verbose=True
)

print("Running the Self-Improving Travel Assistant...\n")
response = travel_feedback_crew.kickoff()

print("Refined Self-Explanation:")
print(response)



Running the Self-Improving Travel Assistant...

[1m[95m# Agent:[00m [1m[92mSelf-Improving Travel Advisor[00m
[95m## Task:[00m [92m
    Reflect on user feedback:
    "The hotel you recommended was too far from public transport. I prefer locations closer to metro stations."

    - Identify any oversight in your previous reasoning process.
    - Update your reasoning process to include aspects that were missed.
    - Provide the refined steps that you will use to recommend hotels.
    [00m


[1m[95m# Agent:[00m [1m[92mSelf-Improving Travel Advisor[00m
[95m## Final Answer:[00m [92m
Upon reflecting on the user feedback regarding the hotel recommendation that was too far from public transport, I recognize the oversight in my previous reasoning process. I failed to prioritize proximity to metro stations, which is essential for users who depend on public transport for convenience and accessibility during their travels.

In revisiting my approach, I identified several factors

### 2.3. User Engagement and Collaboration: Enabling Interactive Explanations

Let's use direct API calls with the OpenAI SDK to see how user engagement and collaboration works. In this example, the agent explains its decisions and engages the user to refine suggestions interactively.

In [6]:
import openai

# Initial recommendation and explanation
initial_recommendation = "I recommend Hotel Lumière in Paris for its proximity to the Eiffel Tower, high ratings, and budget-friendly price."

# User query for clarification
user_query = "Why did you prioritize proximity to attractions over public transport access?"

# Engage GPT-4o-mini to respond interactively
interactive_prompt = f"""
You are an AI travel assistant facilitating an interactive dialogue with a user.
Here is your initial recommendation: "{initial_recommendation}"
The user asks: "{user_query}"
Respond by explaining your reasoning and inviting the user to clarify their preferences further.
"""

response = openai.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "You are a collaborative AI travel assistant."},
        {"role": "user", "content": interactive_prompt},
    ]
)

# Print interactive response
print("Interactive Response:")
print(response.choices[0].message.content)


Interactive Response:
I prioritized proximity to attractions like the Eiffel Tower because it can significantly enhance the overall travel experience, especially for first-time visitors who want to make the most of iconic sights. Being close to major landmarks can save time and add convenience, allowing for more spontaneous exploration and enjoyment of the area.

However, I understand that public transport access is also crucial for getting around the city efficiently and reaching farther attractions. If your focus is more on ease of travel or exploring beyond just the immediate neighborhood, I’d love to hear more about your preferences! Are you looking for a balance between attractions and transport, or do you have a particular area in mind that you’d like to explore?
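
The call above covers a single exchange. An interactive dialogue also needs the conversation history carried into each new request. The following is a minimal sketch of that bookkeeping, assuming a hypothetical `get_reply` callable that stands in for the model call; a stub is used here so the example runs offline.

```python
def run_dialogue(user_turns, get_reply, system_prompt):
    """Feed user turns one at a time, keeping the full history between turns."""
    history = [{"role": "system", "content": system_prompt}]
    for user_text in user_turns:
        history.append({"role": "user", "content": user_text})
        reply = get_reply(history)  # the callable sees every earlier turn
        history.append({"role": "assistant", "content": reply})
    return history


def stub_reply(history):
    """Stand-in for a model call: reports how many user turns it has seen."""
    user_count = sum(1 for message in history if message["role"] == "user")
    return f"(stub reply to user turn {user_count})"


history = run_dialogue(
    [
        "Why did you prioritize proximity to attractions over public transport access?",
        "Public transport matters more to me. Can you adjust?",
    ],
    stub_reply,
    "You are a collaborative AI travel assistant.",
)
for message in history:
    print(f"{message['role']}: {message['content']}")
```

To talk to a real model, `get_reply` could wrap the same `openai.chat.completions.create` call used above, passing `messages=history` and returning `response.choices[0].message.content`.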


#### Using CrewAI

Just like before, we can define an Agent/Task pair with the CrewAI framework whose job is to interact with users by asking clarifying questions about their preferences.

In [10]:
from crewai import Agent, Task, Crew
from crewai.process import Process

# Step 1: Define the Collaborative Agent
collaborative_travel_agent = Agent(
    role="Collaborative AI Travel Assistant",
    goal="""
    Engage in an interactive dialogue with the user to clarify hotel recommendations.
    Explain reasoning for prioritizing certain factors and invite the user to share their preferences.
    """,
    backstory="""
    An AI travel assistant that values user input and ensures recommendations are well-aligned with user needs.
    It provides clear explanations for its decisions and encourages collaborative planning.
    """
)

# Step 2: Define the Task for Clarification Dialogue
initial_recommendation = "I recommend Hotel Lumière in Paris for its proximity to the Eiffel Tower, high ratings, and budget-friendly price."
user_query = "Why did you prioritize proximity to attractions over public transport access?"

interactive_task = Task(
    description=f"""
    Facilitate an interactive dialogue with the user.

    - Here is the initial recommendation: "{initial_recommendation}"
    - The user has asked: "{user_query}"

    Respond by:
    1. Explaining the reasoning behind prioritizing proximity to attractions.
    2. Inviting the user to clarify whether proximity to public transport is more important.
    """,
    expected_output="""
    A clear and polite response explaining the reasoning and inviting the user to share further input.
    """,
    agent=collaborative_travel_agent
)

# Step 3: Assemble the Crew
interactive_crew = Crew(
    agents=[collaborative_travel_agent],
    tasks=[interactive_task],
    process=Process.sequential,
    verbose=True
)

# Step 4: Run the Crew and Output the Results
print("Starting Interactive Dialogue with User...\n")
result = interactive_crew.kickoff()

print("Final Interactive Response:")
print(result)




Starting Interactive Dialogue with User...

[1m[95m# Agent:[00m [1m[92mCollaborative AI Travel Assistant[00m
[95m## Task:[00m [92m
    Facilitate an interactive dialogue with the user.

    - Here is the initial recommendation: "I recommend Hotel Lumière in Paris for its proximity to the Eiffel Tower, high ratings, and budget-friendly price."
    - The user has asked: "Why did you prioritize proximity to attractions over public transport access?"

    Respond by:
    1. Explaining the reasoning behind prioritizing proximity to attractions.
    2. Inviting the user to clarify whether proximity to public transport is more important.
    [00m


[1m[95m# Agent:[00m [1m[92mCollaborative AI Travel Assistant[00m
[95m## Final Answer:[00m [92m
Thank you for your insightful question! I prioritized proximity to attractions, like the Eiffel Tower, because many travelers to Paris often want to maximize their sightseeing experience. Being close to major landmarks can save time and

# 3. Self Modeling - example

The `ReflectiveTravelAgentWithSelfModeling` class is a travel recommendation agent that uses **self-modeling**: it keeps an internal representation of its own goals and knowledge, and updates that representation to adapt its decisions.

### 1. **Initialization:**
   - **Self-Model and Knowledge Base:** The agent starts with an internal model that includes its goals and a knowledge base. 
     - **Goals:** Initially, the goals are set to provide personalized recommendations, optimize user satisfaction, and not prioritize eco-friendly options by default.
     - **Knowledge Base:** It contains information about various travel destinations, including their ratings, costs, luxury levels, and sustainability. This base also tracks user preferences.

### 2. **Updating Goals:**
   - **Adapting to Preferences:** When new user preferences are provided, the agent can update its goals accordingly. For example, if the user prefers eco-friendly options, the agent switches on its eco-friendly goal so that sustainable destinations score higher. If the user signals a budget change, the agent acknowledges the new budget goal; in this simplified example it only prints a message and does not yet change how destinations are scored.

### 3. **Updating Knowledge Base:**
   - **Incorporating Feedback:** After receiving feedback from users, the agent updates its knowledge base. If the feedback is positive, the agent increases the rating of the recommended destination. If the feedback is negative, the rating is decreased. This helps the agent refine its recommendations based on real user experiences.

### 4. **Making Recommendations:**
   - **Calculating Scores:** The agent evaluates each destination by calculating a score based on its rating and, if eco-friendly options are a goal, it adjusts the score by adding the sustainability rating.
   - **Selecting the Best Destination:** The destination with the highest score is recommended to the user. This process ensures that the recommendation aligns with both user preferences and the agent’s goals.

### 5. **Engaging with the User:**
   - **Providing Recommendations:** The agent presents the recommended destination to the user and asks for feedback.
   - **Feedback Handling:** The feedback (positive or negative) is used to update the knowledge base, which helps improve future recommendations. 



In [None]:
class ReflectiveTravelAgentWithSelfModeling:
    def __init__(self):
        # Initialize the agent with a self-model that includes goals and a knowledge base
        self.self_model = {
            "goals": {
                "personalized_recommendations": True,
                "optimize_user_satisfaction": True,
                "eco_friendly_options": False  # Default: Not prioritizing eco-friendly options
            },
            "knowledge_base": {
                "destinations": {
                    "Paris": {"rating": 4.8, "cost": 2000, "luxury": 0.9, "sustainability": 0.3},
                    "Bangkok": {"rating": 4.5, "cost": 1500, "luxury": 0.7, "sustainability": 0.6},
                    "Barcelona": {"rating": 4.7, "cost": 1800, "luxury": 0.8, "sustainability": 0.7}
                },
                "user_preferences": {}
            }
        }

    def update_goals(self, new_preferences):
        """Update the agent's goals based on new user preferences."""
        if new_preferences.get("eco_friendly"):
            self.self_model["goals"]["eco_friendly_options"] = True
            print("Updated goal: Prioritize eco-friendly travel options.")
        if new_preferences.get("adjust_budget"):
            print("Updated goal: Adjust travel options based on new budget constraints.")
    
    def update_knowledge_base(self, feedback):
        """Update the agent's knowledge base based on user feedback."""
        destination = feedback["destination"]
        if feedback["positive"]:
            # Increase rating for positive feedback
            self.self_model["knowledge_base"]["destinations"][destination]["rating"] += 0.1
            print(f"Positive feedback received for {destination}; rating increased.")
        else:
            # Decrease rating for negative feedback
            self.self_model["knowledge_base"]["destinations"][destination]["rating"] -= 0.2
            print(f"Negative feedback received for {destination}; rating decreased.")
    
    def recommend_destination(self, user_preferences):
        """Recommend a destination based on user preferences and the agent's self-model."""
        # Store user preferences in the agent's self-model
        self.self_model["knowledge_base"]["user_preferences"] = user_preferences
        
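        # Update agent's goals based on new preferences.
        # update_goals checks each preference key itself, so call it unconditionally;
        # otherwise "adjust_budget" would be ignored unless "eco_friendly" is also set.
        self.update_goals(user_preferences)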
        
        # Calculate scores for each destination
        best_destination = None
        highest_score = float("-inf")  # ensures a destination is chosen even if scores drop to zero or below
        for destination, info in self.self_model["knowledge_base"]["destinations"].items():
            score = info["rating"]
            if self.self_model["goals"]["eco_friendly_options"]:
                # Boost score for eco-friendly options if that goal is prioritized
                score += info["sustainability"]
            
            # Update the best destination if current score is higher
            if score > highest_score:
                best_destination = destination
                highest_score = score
        
        return best_destination

    def engage_with_user(self, destination):
        """Simulate user engagement by providing the recommendation and receiving feedback."""
        print(f"Recommended destination: {destination}")
        # Simulate receiving user feedback (e.g., through input in a real application)
        feedback = input(f"Did you like the recommendation of {destination}? (yes/no): ").strip().lower()
        positive_feedback = feedback == "yes"
        return {"destination": destination, "positive": positive_feedback}




The following code cell simulates the usage of the `ReflectiveTravelAgentWithSelfModeling` class. Its steps are:

### 1. **Creating an Instance of the Agent:**
   ```python
   agent = ReflectiveTravelAgentWithSelfModeling()
   ```
   - **Purpose:** Initializes a new instance of the `ReflectiveTravelAgentWithSelfModeling` class.
   - **Outcome:** This instance represents a travel agent equipped with self-modeling capabilities, including goal management and a knowledge base.

### 2. **Setting User Preferences:**
   ```python
   user_preferences = {
       "budget": 0.6,            # Moderate budget constraint
       "luxury": 0.4,            # Moderate preference for luxury
       "adventure": 0.7,         # High preference for adventure
       "eco_friendly": True      # User prefers eco-friendly options
   }
   ```
   - **Purpose:** Defines a set of preferences provided by the user.
   - **Outcome:** These preferences indicate that the user has a moderate budget, moderate luxury preferences, a high interest in adventure, and a strong preference for eco-friendly options.

### 3. **Getting a Recommendation:**
   ```python
   recommendation = agent.recommend_destination(user_preferences)
   ```
   - **Purpose:** Requests a travel destination recommendation from the agent based on the provided user preferences.
   - **Outcome:** The agent processes the preferences, updates its goals if necessary (e.g., prioritizing eco-friendly options), and selects the best destination to recommend.

### 4. **Engaging with the User:**
   ```python
   feedback = agent.engage_with_user(recommendation)
   ```
   - **Purpose:** Simulates interaction with the user by presenting the recommendation and gathering feedback.
   - **Outcome:** The user provides feedback on the recommended destination, which is used to evaluate the effectiveness of the recommendation.

### 5. **Updating the Knowledge Base:**
   ```python
   agent.update_knowledge_base(feedback)
   ```
   - **Purpose:** Updates the agent’s knowledge base with the feedback received from the user.
   - **Outcome:** The agent adjusts the rating of the recommended destination: it adds 0.1 for positive feedback and subtracts 0.2 for negative feedback. Since the rating feeds directly into the score, this update influences future recommendations.

### Summary:
In essence, this code snippet demonstrates how the `ReflectiveTravelAgentWithSelfModeling` class operates in a simulated environment. It initializes the agent, sets user preferences, obtains a recommendation, engages the user for feedback, and updates the agent’s knowledge base based on that feedback. This simulation helps illustrate the agent’s self-modeling capabilities and its ability to adapt and improve recommendations over time.

In [None]:
# Simulating agent usage
if __name__ == "__main__":
    # Create an instance of the reflective travel agent with self-modeling
    agent = ReflectiveTravelAgentWithSelfModeling()
    
    # Example user preferences including a focus on eco-friendly options
    user_preferences = {
        "budget": 0.6,            # Moderate budget constraint
        "luxury": 0.4,            # Moderate preference for luxury
        "adventure": 0.7,         # High preference for adventure
        "eco_friendly": True      # User prefers eco-friendly options
    }
    
    # Get the recommended destination based on user preferences
    recommendation = agent.recommend_destination(user_preferences)
    
    # Engage with the user to provide feedback on the recommendation
    feedback = agent.engage_with_user(recommendation)
    
    # Update the knowledge base with the user feedback
    agent.update_knowledge_base(feedback)


Updated goal: Prioritize eco-friendly travel options.
Recommended destination: Barcelona
Did you like the recommendation of Barcelona? (yes/no): no
Negative feedback received for Barcelona; rating decreased.
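
The output above can be checked by hand. The sketch below recomputes the scores outside the class, using the ratings and sustainability values from the knowledge base; `score` and `best` are hypothetical helper names introduced only for this illustration.

```python
# Ratings and sustainability values copied from the agent's knowledge base
destinations = {
    "Paris": {"rating": 4.8, "sustainability": 0.3},
    "Bangkok": {"rating": 4.5, "sustainability": 0.6},
    "Barcelona": {"rating": 4.7, "sustainability": 0.7},
}


def score(info, eco_friendly):
    """Rating, plus the sustainability bonus when the eco-friendly goal is on."""
    return info["rating"] + (info["sustainability"] if eco_friendly else 0.0)


def best(destinations, eco_friendly):
    """Name of the highest-scoring destination."""
    return max(destinations, key=lambda name: score(destinations[name], eco_friendly))


print(best(destinations, eco_friendly=True))   # Barcelona (4.7 + 0.7)
print(best(destinations, eco_friendly=False))  # Paris (4.8)

# Apply the same penalty the agent uses for negative feedback
destinations["Barcelona"]["rating"] -= 0.2
print(best(destinations, eco_friendly=True))   # still Barcelona (about 5.2 vs 5.1)
```

With the eco-friendly goal enabled, one negative response is not enough to change the recommendation: Barcelona still scores about 5.2 against roughly 5.1 for Paris and Bangkok. A second negative response would bring it to about 5.0 and let another destination take over.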
