# Metacognition in AI Agents with Semantic Kernel

This notebook demonstrates metacognitive capabilities in AI agents: an agent's ability to be aware of and reason about its own thinking processes. We'll build a travel booking agent that remembers user preferences across the turns of a conversation, reflects on its own suggestions, and adapts to user feedback.

## Import the Needed Packages

We'll import all the necessary libraries for creating metacognitive agents, including Semantic Kernel for agent management, function handling, and conversation threading.

In [1]:
import json
import os

from typing import Annotated

from dotenv import load_dotenv

from azure.identity import DefaultAzureCredential

from semantic_kernel.connectors.ai.open_ai import AzureChatCompletion
from semantic_kernel.contents import FunctionCallContent, FunctionResultContent, StreamingTextContent
from semantic_kernel.agents import ChatCompletionAgent, ChatHistoryAgentThread
from semantic_kernel.functions import kernel_function

## Create Function Plugins

Define custom functions that the agent can use to access external data. These functions demonstrate how agents can interact with tools while maintaining awareness of their capabilities and limitations.

In [2]:
# Define a sample plugin that exposes destination and flight-time tools to the agent
class DestinationsPlugin:
    """Vacation destinations and their available flight times."""

    @kernel_function(description="Provides a list of vacation destinations.")
    def get_destinations(self) -> Annotated[str, "Returns the available vacation destinations."]:
        return """
        Barcelona, Spain
        Paris, France
        Berlin, Germany
        Tokyo, Japan
        New York, USA
        """

    @kernel_function(description="Provides available flight times for a destination.")
    def get_flight_times(
        self, destination: Annotated[str, "The destination to check flight times for."]
    ) -> Annotated[str, "Returns flight times for the specified destination."]:
        flight_times = {
            "Barcelona": ["08:30 AM", "02:15 PM", "10:45 PM"],
            "Paris": ["06:45 AM", "12:30 PM", "07:15 PM"],
            "Berlin": ["07:20 AM", "01:45 PM", "09:30 PM"],
            "Tokyo": ["11:00 AM", "05:30 PM", "11:55 PM"],
            "New York": ["05:15 AM", "03:00 PM", "08:45 PM"]
        }

        # Extract just the city name from input that might contain country
        city = destination.split(',')[0].strip()

        if city in flight_times:
            times = ", ".join(flight_times[city])
            return f"Flight times for {city}: {times}"
        else:
            return f"No flight information available for {city}."
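
The lookup in `get_flight_times` can be checked without an agent or an Azure connection. The sketch below copies the same normalization logic into a plain function so its behavior on inputs such as "Paris, France" is easy to inspect. The name `lookup_flight_times` and the shortened table are illustrative and are not part of the notebook.

```python
# Illustrative stand-in for DestinationsPlugin.get_flight_times; not part of the notebook.
FLIGHT_TIMES = {
    "Barcelona": ["08:30 AM", "02:15 PM", "10:45 PM"],
    "Paris": ["06:45 AM", "12:30 PM", "07:15 PM"],
}


def lookup_flight_times(destination: str) -> str:
    # Keep only the city, so "Paris, France" and "Paris" resolve to the same key.
    city = destination.split(",")[0].strip()
    times = FLIGHT_TIMES.get(city)
    if times is None:
        return f"No flight information available for {city}."
    return f"Flight times for {city}: {', '.join(times)}"


print(lookup_flight_times("Paris, France"))
print(lookup_flight_times("paris"))
```

The match is case-sensitive, so "paris" falls through to the "No flight information" branch. This is one reason the agent instructions later ask for exact destination names.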

## Setting up Azure OpenAI Connection

Configure the Azure OpenAI service that will power our metacognitive agent. The agent will use this connection to process requests and maintain conversation context.

In [3]:
load_dotenv()

# Option 1: Using API Key (recommended for development)
chat_completion_service = AzureChatCompletion(
    deployment_name=os.environ.get("AZURE_OPENAI_DEPLOYMENT_NAME", "gpt-4o-mini"),
    endpoint=os.environ.get("AZURE_OPENAI_ENDPOINT"),
    api_version=os.environ.get("AZURE_OPENAI_API_VERSION", "2024-02-01"),
    api_key=os.environ.get("AZURE_OPENAI_API_KEY")
)

# Option 2: Using Azure AD authentication (uncomment the service block below to use)
# Create the Azure credential
credential = DefaultAzureCredential()

# Create a token provider function
def get_azure_ad_token():
    """Get an Azure AD token for Azure OpenAI."""
    token = credential.get_token("https://cognitiveservices.azure.com/.default")
    return token.token

# chat_completion_service = AzureChatCompletion(
#     deployment_name=os.environ.get("AZURE_OPENAI_DEPLOYMENT_NAME", "gpt-4o-mini"),
#     endpoint=os.environ.get("AZURE_OPENAI_ENDPOINT"),
#     api_version=os.environ.get("AZURE_OPENAI_API_VERSION", "2024-02-01"),
#     ad_token=get_azure_ad_token()
# )
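
Because `os.environ.get` returns `None` for an unset variable, a missing endpoint or key is passed straight into the service constructor. A small check before building the service can report the gap by name. This is a minimal sketch that assumes the API-key option, so it treats the endpoint and key as required; the helper name `missing_settings` is hypothetical.

```python
import os

# Variable names come from the cell above; treating these two as required
# is an assumption of this sketch (Option 2 does not need the API key).
REQUIRED_SETTINGS = ("AZURE_OPENAI_ENDPOINT", "AZURE_OPENAI_API_KEY")


def missing_settings(env=None) -> list:
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_SETTINGS if not env.get(name)]


missing = missing_settings()
if missing:
    print("Missing settings:", ", ".join(missing))
```

Run it after `load_dotenv()` so values from the `.env` file are already in the environment.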

## Create the Metacognitive Travel Agent

Build a sophisticated agent with metacognitive capabilities. This agent can:
- Remember user preferences across conversations
- Reflect on its own reasoning process
- Learn from user feedback and adapt its responses
- Maintain awareness of its capabilities and limitations

In [4]:
AGENT_NAME = "TravelAgent"
AGENT_INSTRUCTIONS = """\
You are a Flight Booking Agent that provides information about available flights and gives travel activity suggestions when asked.
Travel activity suggestions should be specific to the customer, the location, and the amount of time at the location.

You have access to the following tools to help users plan their trips:
1. get_destinations: Returns a list of available vacation destinations that users can choose from.
2. get_flight_times: Provides available flight times for specific destinations.


Your process for assisting users:
- When users first inquire about flight booking with no prior history, ask for their preferred flight time ONCE.
- MAINTAIN a customer_preferences object throughout the conversation to track preferred flight times.
- When a user books a flight to any destination, RECORD their chosen flight time in the customer_preferences object.
- For ALL subsequent flight inquiries to ANY destination, AUTOMATICALLY apply their existing preferred flight time without asking.
- NEVER ask about time preferences again after they've been established for any destination.
- When suggesting flights for a new destination, explicitly say: "Based on your previous preference for [time] flights, I recommend..."
- Only after showing options matching their preferred time, ask if they want to see alternative times.
- After each booking, UPDATE the customer_preferences object with any new information.
- ALWAYS mention which specific preference you used when making a suggestion.

Guidelines:
- Use the exact destination names when using tools (Barcelona, Paris, Berlin, Tokyo, New York)
- Respond in a helpful and enthusiastic manner about travel possibilities
- Always seek feedback to ensure your suggestions meet the user's expectations
- Acknowledge when a request falls outside your capabilities
- For better formatting, always display flight times in a list format
- When giving any timed suggestions, reflect if the time frames are reasonable. Respond again if not.

Your goal is to help users explore vacation options efficiently and make informed travel decisions by understanding their preferences and providing tailored recommendations.
"""
# Create the agent
agent = ChatCompletionAgent(
    service=chat_completion_service,
    plugins=[DestinationsPlugin()],
    name=AGENT_NAME,
    instructions=AGENT_INSTRUCTIONS,
)
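
In this notebook the `customer_preferences` object exists only as an instruction in the prompt, so the model tracks it implicitly through the conversation. To make the idea concrete, here is a sketch of what an explicit store could look like. The `CustomerPreferences` class and the `closest_flight` helper are hypothetical and are not used by the agent above.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional


@dataclass
class CustomerPreferences:
    """Hypothetical explicit store for what the prompt calls customer_preferences."""

    preferred_time: Optional[str] = None
    history: list = field(default_factory=list)

    def record_booking(self, destination: str, flight_time: str) -> None:
        # The latest booking overwrites the preference, mirroring the UPDATE rule.
        self.preferred_time = flight_time
        self.history.append((destination, flight_time))


def closest_flight(preferred: str, available: list) -> str:
    """Pick the available departure nearest to the preferred clock time."""

    def minutes(clock: str) -> int:
        parsed = datetime.strptime(clock, "%I:%M %p")
        return parsed.hour * 60 + parsed.minute

    return min(available, key=lambda t: abs(minutes(t) - minutes(preferred)))


prefs = CustomerPreferences()
prefs.record_booking("Barcelona", "08:30 AM")
paris = ["06:45 AM", "12:30 PM", "07:15 PM"]
choice = closest_flight(prefs.preferred_time, paris)
print(f"Based on your previous preference for {prefs.preferred_time} flights, I recommend {choice}.")
```

Picking the nearest clock time is only one possible reading of "apply the preferred flight time". The model may instead interpret a preference as "earliest" or "latest", as the test conversation below suggests.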

## Test Metacognitive Capabilities

Run a series of interactions that demonstrate the agent's metacognitive abilities. Watch how the agent:
- Learns and remembers user preferences
- Reflects on time constraints and feasibility
- Adapts its responses based on feedback

In [None]:
from IPython.display import display, HTML

# Define a series of user inputs that will test metacognitive capabilities
# These inputs are designed to show learning, memory, and self-reflection
user_inputs = [
    "Book me a flight to Barcelona",                    # Initial request - agent will learn preferences
    "I prefer a later flight",                          # User feedback - agent records this preference
    "That is too late, choose the earliest flight",     # Preference change - agent adapts
    "I want to leave the same day, give me some suggestions of things to do in Barcelona during my layover if I take the last flight out",  # Complex request requiring reasoning
    "I am stressed this wont be enough time"            # Emotional context - agent should respond with metacognitive awareness
]

# Create a thread to hold the conversation
# This thread enables METACOGNITIVE MEMORY - the agent remembers context across multiple interactions
thread: ChatHistoryAgentThread | None = None

async def main():
    global thread
    
    # Process each user input sequentially to demonstrate learning and adaptation
    for user_input in user_inputs:
        # Start building HTML output for display
        html_output = (
            f"<div style='margin-bottom:10px'>"
            f"<div style='font-weight:bold'>User:</div>"
            f"<div style='margin-left:20px'>{user_input}</div></div>"
        )

        # Initialize variables to capture the agent's response components
        agent_name = None
        full_response: list[str] = []    # Collects the agent's text response
        function_calls: list[str] = []   # Tracks what tools the agent uses (shows self-awareness)

        # Variables to handle streaming function calls
        current_function_name = None
        argument_buffer = ""

        # METACOGNITIVE STREAMING: Process the agent's response in real-time
        # This allows us to observe the agent's thinking process as it happens
        async for response in agent.invoke_stream(
            messages=user_input,
            thread=thread,  # Pass the existing thread to maintain MEMORY across conversations
        ):
            # Update the conversation thread - this is where MEMORY PERSISTENCE happens
            thread = response.thread
            agent_name = response.name
            content_items = list(response.items)

            # Process each piece of the agent's response
            for item in content_items:
                # METACOGNITIVE TOOL USE: Agent decides which functions to call
                # This demonstrates SELF-AWARENESS of available capabilities
                if isinstance(item, FunctionCallContent):
                    if item.function_name:
                        current_function_name = item.function_name

                    # Accumulate function arguments (streamed in chunks)
                    if isinstance(item.arguments, str):
                        argument_buffer += item.arguments
                        
                # METACOGNITIVE LEARNING: Agent processes function results
                # This shows how the agent integrates new information into its reasoning
                elif isinstance(item, FunctionResultContent):
                    # Finalize any pending function call before showing result
                    if current_function_name:
                        formatted_args = argument_buffer.strip()
                        try:
                            # Try to format the arguments as JSON for better readability
                            parsed_args = json.loads(formatted_args)
                            formatted_args = json.dumps(parsed_args)
                        except Exception:
                            pass  # leave as raw string if JSON parsing fails

                        # Record the function call - this shows SELF-MONITORING
                        function_calls.append(f"Calling function: {current_function_name}({formatted_args})")
                        current_function_name = None
                        argument_buffer = ""

                    # Record the function result - shows how agent processes external information
                    function_calls.append(f"\nFunction Result:\n\n{item.result}")
                    
                # METACOGNITIVE RESPONSE GENERATION: Agent formulates its response
                # This is where SELF-REFLECTION and preference application happens
                elif isinstance(item, StreamingTextContent) and item.text:
                    full_response.append(item.text)

        # Display function calls in an expandable section
        # This transparency shows the agent's SELF-AWARENESS of its actions
        if function_calls:
            html_output += (
                "<div style='margin-bottom:10px'>"
                "<details>"
                "<summary style='cursor:pointer; font-weight:bold; color:#0066cc;'>Function Calls (click to expand)</summary>"
                "<div style='margin:10px; padding:10px; background-color:#f8f8f8; "
                "border:1px solid #ddd; border-radius:4px; white-space:pre-wrap; font-size:14px; color:#333;'>"
                f"{chr(10).join(function_calls)}"
                "</div></details></div>"
            )

        # Display the agent's final response
        # Look for metacognitive phrases like "Based on your previous preference" or self-reflection
        html_output += (
            "<div style='margin-bottom:20px'>"
            f"<div style='font-weight:bold'>{agent_name or 'Assistant'}:</div>"
            f"<div style='margin-left:20px; white-space:pre-wrap'>{''.join(full_response)}</div></div><hr>"
        )

        # Render the complete interaction
        display(HTML(html_output))

# Execute the metacognitive conversation sequence
await main()
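
The loop above rebuilds each function call from streamed fragments: it remembers the function name, concatenates argument chunks, and finalizes the call when the result arrives. The sketch below isolates that logic so it can be run without a model. `CallChunk`, `ResultChunk`, and `TextChunk` are simplified stand-ins invented for this example, not Semantic Kernel classes.

```python
import json
from dataclasses import dataclass
from typing import Optional


# Simplified stand-ins for the streamed content items; the real classes carry more fields.
@dataclass
class CallChunk:
    function_name: Optional[str]
    arguments: str


@dataclass
class ResultChunk:
    result: str


@dataclass
class TextChunk:
    text: str


def summarize_stream(items):
    """Rebuild complete function calls and the reply text from streamed chunks."""
    calls, text_parts = [], []
    name, buffer = None, ""
    for item in items:
        if isinstance(item, CallChunk):
            # In this sketch only the first chunk of a call carries the name.
            name = item.function_name or name
            buffer += item.arguments
        elif isinstance(item, ResultChunk):
            if name:
                try:
                    args = json.dumps(json.loads(buffer))
                except json.JSONDecodeError:
                    args = buffer.strip()  # fall back to the raw text
                calls.append(f"{name}({args})")
                name, buffer = None, ""
            calls.append(f"result: {item.result}")
        elif isinstance(item, TextChunk):
            text_parts.append(item.text)
    return calls, "".join(text_parts)


stream = [
    CallChunk("get_flight_times", '{"destin'),
    CallChunk(None, 'ation": "Paris"}'),
    ResultChunk("Flight times for Paris: 06:45 AM, 12:30 PM, 07:15 PM"),
    TextChunk("Here are "),
    TextChunk("the Paris flights."),
]
calls, reply = summarize_stream(stream)
print(calls)
print(reply)
```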

## Test Memory Across Different Destinations

This cell reuses the thread from the previous cell, so the agent still has the earlier conversation in its history. The agent should apply the flight-time preference established there to the new destination instead of asking for it again.

In [9]:
# This will use the same thread that was defined earlier
async def continue_chat():
    global thread
    
    # Continue the conversation with new user inputs
    user_inputs = [
        "Book me a flight to Paris",
    ]

    for user_input in user_inputs:
        # Start building HTML output
        html_output = "<div style='margin-bottom:10px'>"
        html_output += "<div style='font-weight:bold'>User:</div>"
        html_output += f"<div style='margin-left:20px'>{user_input}</div>"
        html_output += "</div>"

        agent_name = None
        full_response: list[str] = []
        function_calls: list[str] = []

        # Buffer to reconstruct streaming function call
        current_function_name = None
        argument_buffer = ""

        async for response in agent.invoke_stream(
            messages=user_input,
            thread=thread,
        ):
            thread = response.thread
            agent_name = response.name
            content_items = list(response.items)

            for item in content_items:
                if isinstance(item, FunctionCallContent):
                    if item.function_name:
                        current_function_name = item.function_name

                    # Accumulate arguments (streamed in chunks)
                    if isinstance(item.arguments, str):
                        argument_buffer += item.arguments
                elif isinstance(item, FunctionResultContent):
                    # Finalize any pending function call before showing result
                    if current_function_name:
                        formatted_args = argument_buffer.strip()
                        try:
                            parsed_args = json.loads(formatted_args)
                            formatted_args = json.dumps(parsed_args)
                        except Exception:
                            pass  # leave as raw string

                        function_calls.append(f"Calling function: {current_function_name}({formatted_args})")
                        current_function_name = None
                        argument_buffer = ""

                    function_calls.append(f"\nFunction Result:\n\n{item.result}")
                elif isinstance(item, StreamingTextContent) and item.text:
                    full_response.append(item.text)

        if function_calls:
            html_output += (
                "<div style='margin-bottom:10px'>"
                "<details>"
                "<summary style='cursor:pointer; font-weight:bold; color:#0066cc;'>Function Calls (click to expand)</summary>"
                "<div style='margin:10px; padding:10px; background-color:#f8f8f8; "
                "border:1px solid #ddd; border-radius:4px; white-space:pre-wrap; font-size:14px; color:#333;'>"
                f"{chr(10).join(function_calls)}"
                "</div></details></div>"
            )

        html_output += (
            "<div style='margin-bottom:20px'>"
            f"<div style='font-weight:bold'>{agent_name or 'Assistant'}:</div>"
            f"<div style='margin-left:20px; white-space:pre-wrap'>{''.join(full_response)}</div></div><hr>"
        )

        display(HTML(html_output))

await continue_chat()
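
Both conversation cells build the same HTML by hand and interpolate user and model text into it unescaped. One way to remove the duplication is a shared rendering helper, sketched below with only the standard library. The function name `render_turn` and the trimmed styling are assumptions of this example; the result would still be passed to `display(HTML(...))`.

```python
from html import escape


def render_turn(user_input: str, agent_name: str, reply: str, function_calls: list) -> str:
    """Build the HTML for one user/agent exchange, escaping all dynamic text."""
    parts = [
        "<div style='margin-bottom:10px'>"
        "<div style='font-weight:bold'>User:</div>"
        f"<div style='margin-left:20px'>{escape(user_input)}</div></div>"
    ]
    if function_calls:
        # Only show the expandable section when the agent actually used a tool.
        joined = escape("\n".join(function_calls))
        parts.append(
            "<details><summary style='cursor:pointer; font-weight:bold'>"
            "Function Calls (click to expand)</summary>"
            f"<div style='white-space:pre-wrap'>{joined}</div></details>"
        )
    parts.append(
        f"<div style='font-weight:bold'>{escape(agent_name)}:</div>"
        f"<div style='margin-left:20px; white-space:pre-wrap'>{escape(reply)}</div><hr>"
    )
    return "".join(parts)


print(render_turn("Flights under <100 EUR?", "TravelAgent", "Let me check.", []))
```

Escaping matters here because replies and tool results are free text and can contain characters such as `<` that would otherwise be interpreted as markup.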

## Understanding Metacognition in AI Agents

Metacognition refers to "thinking about thinking": the awareness and understanding of one's own thought processes. In AI agents, this translates to several key capabilities:

### **1. Self-Awareness**
- The agent knows what it can and cannot do
- It's aware of its available tools and functions
- It understands its role and limitations

### **2. Memory and Learning**
- Maintains context across conversations
- Learns from user feedback and preferences
- Adapts behavior based on past interactions

### **3. Self-Reflection**
- Evaluates the reasonableness of its suggestions
- Questions its own reasoning process
- Considers multiple perspectives before responding

### **4. Meta-Level Reasoning**
- Thinks about how to approach problems
- Plans its response strategy
- Monitors its own performance

### **Key Examples in This Agent:**

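1. **Preference Learning**: The instructions tell the agent to maintain a `customer_preferences` object and update it based on user choices. No such object exists in code; the preferences live in the conversation history held by the thread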
2. **Self-Reflection**: When making time-based suggestions, it "reflects if the time frames are reasonable"
3. **Memory Application**: It automatically applies learned preferences to new situations
4. **Capability Awareness**: It acknowledges when requests fall outside its capabilities

Together, these behaviors let the agent adapt within a conversation: it reuses what it has learned about the user, checks its own suggestions for feasibility, and says so when a request is out of scope.

## Observing Metacognitive Behaviors

When you run the conversations above, watch for these metacognitive behaviors:

### **First Conversation Sequence:**
1. **Initial Learning**: Agent asks for preferred flight time and records it
2. **Preference Storage**: Agent maintains customer preferences across multiple interactions
3. **Self-Reflection**: Agent evaluates if layover time suggestions are reasonable
4. **Stress Recognition**: Agent responds to user's emotional state about time constraints

### **Second Conversation (Paris booking):**
1. **Memory Application**: Agent automatically applies previously learned flight time preferences
2. **Explicit Reference**: Agent mentions "Based on your previous preference for [time] flights"
3. **Consistent Behavior**: No repeated questions about preferences already established

### **Key Phrases to Look For:**
- Mentions of a recorded or saved preference (the agent acting on the "MAINTAIN a customer_preferences object" instruction)
- "Based on your previous preference"
- "I recommend..." (showing application of learned preferences)
- References to time feasibility and reasonableness
- Acknowledgment of stress or concerns
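
Scanning replies for these phrases can be automated when running many conversations. The following is a minimal sketch: the marker list is illustrative and should be extended to match what the agent actually says, and `find_markers` is a hypothetical helper name.

```python
# Marker phrases are illustrative; they are plain substrings, matched case-insensitively.
MARKERS = (
    "based on your previous preference",
    "i recommend",
    "reasonable",
    "enough time",
)


def find_markers(reply: str) -> list:
    """Return the marker phrases that occur in the reply, ignoring case."""
    lowered = reply.lower()
    return [marker for marker in MARKERS if marker in lowered]


sample = "Based on your previous preference for 08:30 AM flights, I recommend the 06:45 AM departure."
print(find_markers(sample))
```

Substring matching is a rough signal. A reply can show the behavior without using any of these exact words, so treat an empty result as a prompt to read the reply rather than as a failure.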

### **What Makes This Metacognitive:**
- **Self-monitoring**: The agent tracks its own knowledge state
- **Strategy adjustment**: Changes approach based on what it has learned
- **Explicit reasoning**: Shows awareness of why it's making certain choices
- **Memory management**: Actively maintains and applies learned information

## Experiment with Metacognitive Features

Try these experiments to explore different aspects of metacognition:

### **Experiment 1: Test Memory Persistence**
```python
# Replace user_inputs in the continue_chat() function with:
user_inputs = [
    "Book me a flight to Tokyo",
    "What time preferences do you have for me?",
    "Book a flight to Berlin using those same preferences"
]
```

### **Experiment 2: Test Self-Reflection**
```python
# Try requests that require reasoning about time:
user_inputs = [
    "I have a 3-hour layover in Barcelona, suggest activities",
    "Actually, make that a 30-minute layover, what should I do?",
    "What about a 10-minute layover?"
]
```
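
To judge the agent's answers in this experiment, it helps to have a deterministic baseline for what "reasonable" means. The sketch below filters activities by the time left after a fixed airport buffer. The activity durations and the 120-minute buffer are made-up values for illustration, and `Activity` and `feasible_activities` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Activity:
    name: str
    minutes: int


def feasible_activities(layover_minutes: int, activities: list, buffer_minutes: int = 120) -> list:
    """Return the names of activities that fit once the airport buffer is subtracted.

    The 120-minute default buffer is an assumption for illustration only.
    """
    usable = layover_minutes - buffer_minutes
    return [a.name for a in activities if a.minutes <= usable]


options = [Activity("Coffee at the terminal", 20), Activity("Walk down La Rambla", 90)]
for layover in (180, 30, 10):
    print(layover, feasible_activities(layover, options))
```

With these numbers only the short activity fits a 3-hour layover and nothing fits the 30-minute or 10-minute cases, which is the kind of conclusion the agent's self-reflection should also reach.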

### **Experiment 3: Test Capability Awareness**
```python
# Try requests outside the agent's capabilities:
user_inputs = [
    "Book me a hotel in Paris",
    "What's the weather like in Tokyo?",
    "Can you help me with car rentals?"
]
```

### **Experiment 4: Test Learning and Adaptation**
```python
# Test how the agent learns from feedback:
user_inputs = [
    "I prefer morning flights",
    "Actually, I changed my mind, I prefer evening flights now",
    "Book me a flight to New York with my updated preferences"
]
```

Modify `user_inputs` in the functions above to run these experiments, then observe how the agent demonstrates metacognitive awareness in each case.

## Applications and Benefits of Metacognitive Agents

Memory, self-reflection, and awareness of limitations are useful well beyond travel booking:

### **Real-World Applications:**

1. **Customer Service Agents**
   - Remember customer preferences and history
   - Adapt communication style based on customer feedback
   - Escalate when encountering limitations

2. **Educational Tutors**
   - Track student learning progress and preferences
   - Adjust teaching methods based on effectiveness
   - Self-assess the difficulty of explanations

3. **Personal Assistants**
   - Learn user habits and preferences over time
   - Anticipate needs based on context and history
   - Reflect on task success and improve approaches

4. **Healthcare Assistants**
   - Remember patient preferences and sensitivities
   - Recognize limitations and suggest human consultation
   - Track interaction patterns for better care

### **Key Benefits:**

- **Improved User Experience**: Agents become more personalized and efficient
- **Reduced Repetition**: No need to re-explain preferences every time
- **Better Decision Making**: Self-reflection leads to more thoughtful responses
- **Adaptive Learning**: Agents improve their performance over time
- **Transparency**: Users understand why agents make certain recommendations

### **Design Principles for Metacognitive Agents:**

1. **Explicit Memory Management**: Clearly maintain and update learned information
2. **Self-Monitoring**: Regularly evaluate the appropriateness of responses
3. **Transparency**: Make reasoning processes visible to users
4. **Adaptive Behavior**: Change strategies based on feedback and context
5. **Limitation Awareness**: Acknowledge capabilities and boundaries

Applied together, these principles move an agent from answering each request in isolation to building on what it has already learned about the user.