# Multimodal Multi-Agent Framework with the Azure OpenAI Assistants API

This notebook demonstrates a collaborative multi-agent system with:
- **User Proxy Assistant** (orchestrator)
- **DALL-E Assistant** (image generation)
- **Vision Assistant** (image analysis)

The design follows the architecture described in the Azure OpenAI Assistants API documentation.

## 1. Setup and Installation

First, let's install the required packages:

In [None]:
%pip install openai pillow requests python-dotenv

## 2. Environment Configuration

Set up your Azure OpenAI credentials:

In [None]:
import os
from dotenv import load_dotenv

# Load environment variables
load_dotenv()

# Azure OpenAI Configuration
AZURE_OPENAI_ENDPOINT = os.getenv("AZURE_OPENAI_ENDPOINT", "your-endpoint-here")
AZURE_OPENAI_API_KEY = os.getenv("AZURE_OPENAI_API_KEY", "your-api-key-here")
API_VERSION = "2024-02-15-preview"
ASSISTANT_DEPLOYMENT_NAME = "gpt-4-1106-preview"  # Change to your deployment name

# Confirm the settings loaded without echoing the key or revealing its length
print(f"Endpoint: {AZURE_OPENAI_ENDPOINT[:50]}...")
print(f"API Key: {'Set' if AZURE_OPENAI_API_KEY != 'your-api-key-here' else 'Not set'}")
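
Because the configuration cell falls back to placeholder strings, a missing `.env` file only surfaces later as an authentication error. A minimal sketch of a fail-fast check follows. `find_missing_settings` and `mask_secret` are hypothetical helpers written for this example; they are not part of the framework or the `openai` package.

```python
import os

# Placeholder defaults used by the configuration cell
PLACEHOLDERS = {"your-endpoint-here", "your-api-key-here"}


def find_missing_settings(settings: dict) -> list:
    """Return the names of settings that are empty or still hold a placeholder."""
    return [
        name for name, value in settings.items()
        if not value or value in PLACEHOLDERS
    ]


def mask_secret(secret: str, visible: int = 4) -> str:
    """Show only the last few characters of a secret and hide its length."""
    if not secret or len(secret) <= visible:
        return "****"
    return "****" + secret[-visible:]


settings = {
    "AZURE_OPENAI_ENDPOINT": os.getenv("AZURE_OPENAI_ENDPOINT", "your-endpoint-here"),
    "AZURE_OPENAI_API_KEY": os.getenv("AZURE_OPENAI_API_KEY", "your-api-key-here"),
}

missing = find_missing_settings(settings)
if missing:
    print(f"Missing settings: {', '.join(missing)}")
else:
    print(f"API Key: {mask_secret(settings['AZURE_OPENAI_API_KEY'])}")
```

In the notebook you could raise `ValueError` instead of printing, so that later cells do not run with placeholder credentials.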

## 3. Multi-Agent Framework Implementation

Let's import the framework class:

In [None]:
# Import the framework from the local multiagent_framework module
# (the module must be importable, e.g. saved next to this notebook)
from multiagent_framework import MultiAgentFramework

# Alternatively, paste the class definition into this cell
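
If you want to dry-run the remaining cells without Azure credentials, an offline stand-in can help. The sketch below mirrors only the methods and attributes this notebook calls (`start_conversation`, `send_user_message`, `send_message_to_agent`, `generate_image`, `cleanup`, `main_thread`, `agents_threads`, `assistants`). It is not the real `MultiAgentFramework`. The class name, the `vision_assistant` and `user_proxy` keys, and all return strings are assumptions made for illustration.

```python
from itertools import count
from types import SimpleNamespace


class StubMultiAgentFramework:
    """Offline stand-in that mirrors the interface used in this notebook."""

    # "dalle_assistant" appears in the notebook; "vision_assistant" is assumed
    AGENT_NAMES = ("dalle_assistant", "vision_assistant")

    def __init__(self, azure_endpoint, api_key, api_version, assistant_deployment_name):
        self._ids = count(1)
        self.main_thread = None
        # One fake assistant object per role, each exposing an .id attribute
        self.assistants = {
            name: SimpleNamespace(id=f"asst_stub_{next(self._ids)}")
            for name in ("user_proxy",) + self.AGENT_NAMES
        }
        # Agent threads are created lazily on first use
        self.agents_threads = {name: {"thread": None} for name in self.AGENT_NAMES}

    def _new_thread(self):
        return SimpleNamespace(id=f"thread_stub_{next(self._ids)}")

    def start_conversation(self):
        self.main_thread = self._new_thread()
        return self.main_thread.id

    def send_user_message(self, message):
        return f"[stub] user proxy received: {message}"

    def send_message_to_agent(self, agent_name, message):
        if agent_name not in self.agents_threads:
            return f"[stub] unknown agent: {agent_name}"
        info = self.agents_threads[agent_name]
        if info["thread"] is None:
            info["thread"] = self._new_thread()
        return f"[stub] {agent_name} received: {message}"

    def generate_image(self, prompt, size="1024x1024"):
        return f"[stub] image ({size}) for prompt: {prompt}"

    def cleanup(self):
        self.assistants.clear()
        for info in self.agents_threads.values():
            info["thread"] = None


stub = StubMultiAgentFramework("endpoint", "key", "version", "deployment")
print(stub.start_conversation())
print(stub.send_message_to_agent("dalle_assistant", "hello"))
```

Swapping `MultiAgentFramework` for the stub lets you check cell logic such as the thread listing and cleanup counts before spending API quota.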

## 4. Initialize the Multi-Agent Framework

Create an instance of the framework with your Azure OpenAI credentials:

In [None]:
# Initialize the framework
framework = MultiAgentFramework(
    azure_endpoint=AZURE_OPENAI_ENDPOINT,
    api_key=AZURE_OPENAI_API_KEY,
    api_version=API_VERSION,
    assistant_deployment_name=ASSISTANT_DEPLOYMENT_NAME
)

print("Framework initialized successfully!")

## 5. Start the Conversation

Initialize the main conversation thread:

In [None]:
# Start the main conversation thread
thread_id = framework.start_conversation()
print(f"Main thread ID: {thread_id}")

## 6. Example 1: Simple Image Generation and Analysis

Let's test the basic functionality with a simple request:

In [None]:
# Simple image generation request
user_query = "Generate an image of a futuristic city skyline at night with neon lights"

print("=" * 60)
print("EXAMPLE 1: Simple Image Generation")
print("=" * 60)

response = framework.send_user_message(user_query)
print("\nResponse received!")

## 7. Example 2: Complex Multi-Agent Workflow

Now let's test the full multi-agent collaboration with image generation and analysis:

In [None]:
# Complex multi-agent workflow
user_query = """
Create a beautiful landscape image of a sunset over mountains, 
then analyze it for quality and suggest improvements. 
Based on the analysis, create an improved version of the image.
"""

print("=" * 60)
print("EXAMPLE 2: Multi-Agent Collaboration")
print("=" * 60)

response = framework.send_user_message(user_query)
print("\nMulti-agent workflow completed!")
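
The request above asks for a generate, analyze, regenerate cycle. How the framework routes this between agents is not shown in this notebook, but the control flow can be sketched as a plain loop over two callables. `refine_image`, `fake_generate`, and `fake_analyze` are hypothetical names; the stand-in functions return canned strings instead of calling DALL-E or a vision model.

```python
from typing import Callable, List, Tuple


def refine_image(
    prompt: str,
    generate: Callable[[str], str],
    analyze: Callable[[str], str],
    rounds: int = 1,
) -> List[Tuple[str, str, str]]:
    """Generate an image, critique it, and regenerate `rounds` times.

    Returns one (prompt, image, feedback) tuple per attempt.
    """
    history = []
    current_prompt = prompt
    for _ in range(rounds + 1):
        image = generate(current_prompt)
        feedback = analyze(image)
        history.append((current_prompt, image, feedback))
        # Fold the critique into the prompt for the next attempt
        current_prompt = f"{prompt}. Address this feedback: {feedback}"
    return history


# Stand-ins for the DALL-E and Vision assistants (assumptions for this sketch)
def fake_generate(prompt: str) -> str:
    return f"image for: {prompt}"


def fake_analyze(image: str) -> str:
    return "increase contrast in the foreground"


for step_prompt, image, feedback in refine_image(
    "A sunset over mountains", fake_generate, fake_analyze
):
    print(f"prompt:   {step_prompt}")
    print(f"feedback: {feedback}")
```

Keeping the attempt history makes it easy to compare the first and the improved prompt side by side.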

## 8. Example 3: Interactive Conversation

Let's have an interactive conversation with the agents:

In [None]:
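# Interactive conversation loop
EXIT_COMMANDS = {"quit", "exit", "stop"}

print("=" * 60)
print("EXAMPLE 3: Interactive Conversation")
print("Type 'quit', 'exit', or 'stop' to end the session")
print("=" * 60)

while True:
    try:
        user_input = input("\nYou: ").strip()
    except (KeyboardInterrupt, EOFError):
        # Ctrl+C or end-of-input at the prompt ends the session cleanly
        print("\nConversation interrupted.")
        break

    if not user_input:
        continue

    if user_input.lower() in EXIT_COMMANDS:
        print("Ending conversation...")
        break

    try:
        response = framework.send_user_message(user_input)
    except KeyboardInterrupt:
        print("\nConversation interrupted.")
        break
    except Exception as e:
        # Report the failure and keep the loop alive for the next message
        print(f"Error: {e}")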
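
A loop that reads from `input()` is awkward to test. One option, sketched below, is to separate the loop logic from the I/O so it can be driven by a list of strings. `run_chat` is a hypothetical helper, and `handle` stands in for `framework.send_user_message`.

```python
from typing import Callable, Iterable, List

EXIT_COMMANDS = {"quit", "exit", "stop"}


def run_chat(inputs: Iterable[str], handle: Callable[[str], str]) -> List[str]:
    """Feed each non-empty input to `handle` until an exit command appears."""
    replies = []
    for raw in inputs:
        text = raw.strip()
        if not text:
            continue  # skip blank lines
        if text.lower() in EXIT_COMMANDS:
            break
        try:
            replies.append(handle(text))
        except Exception as exc:
            # Record the failure and keep processing later inputs
            replies.append(f"Error: {exc}")
    return replies


# Drive the loop with scripted input; str.upper stands in for the framework call
print(run_chat(["hello", "   ", " QUIT ", "never reached"], str.upper))
```

For live use, the same function accepts any iterable of lines, for example `iter(lambda: input("\nYou: "), None)`.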

## 9. Agent Communication Analysis

Let's examine how agents communicate with each other:

In [None]:
# Display agent thread information
print("=" * 60)
print("AGENT THREAD ANALYSIS")
print("=" * 60)

print(f"Main thread ID: {framework.main_thread.id if framework.main_thread else 'Not created'}")
print("\nAgent threads:")

for agent_name, info in framework.agents_threads.items():
    # Use a separate name so the main thread_id from Section 5 is not overwritten
    agent_thread_id = info['thread'].id if info['thread'] else 'Not created'
    print(f"  {agent_name}: {agent_thread_id}")

print(f"\nTotal assistants created: {len(framework.assistants)}")
for name, assistant in framework.assistants.items():
    print(f"  {name}: {assistant.id}")

## 10. Advanced Features and Customization

Let's call individual framework methods directly instead of going through `send_user_message`:

In [None]:
# Test specific agent functions directly
print("=" * 60)
print("DIRECT AGENT FUNCTION TESTING")
print("=" * 60)

# Test direct image generation
print("\n1. Testing direct image generation:")
image_result = framework.generate_image(
    "A serene Japanese garden with cherry blossoms and a small pond",
    "1024x1024"
)
print(f"Result: {image_result[:100]}...")

# Test direct message to agent
print("\n2. Testing direct message to DALL-E assistant:")
dalle_response = framework.send_message_to_agent(
    "dalle_assistant",
    "Create an abstract art piece with vibrant colors"
)
print(f"DALL-E Response: {dalle_response[:100]}...")
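
In the Assistants API, a run that needs a function pauses with the status `requires_action` and lists tool calls, each carrying a function name and a JSON-encoded argument string. The sketch below shows one way such calls could be routed to Python functions. It is an assumption about how an orchestrator might work, not the framework's actual code; `dispatch_tool_call` and the stand-in handlers are hypothetical.

```python
import json
from typing import Callable, Dict


def dispatch_tool_call(
    name: str,
    arguments_json: str,
    registry: Dict[str, Callable[..., str]],
) -> str:
    """Look up a function by name, decode its JSON arguments, and call it."""
    handler = registry.get(name)
    if handler is None:
        return f"Error: unknown function '{name}'"
    try:
        kwargs = json.loads(arguments_json or "{}")
    except json.JSONDecodeError as exc:
        return f"Error: invalid arguments for '{name}': {exc}"
    if not isinstance(kwargs, dict):
        return f"Error: invalid arguments for '{name}': expected a JSON object"
    try:
        return handler(**kwargs)
    except TypeError as exc:
        # Raised when arguments are missing or unexpected; note that this also
        # catches a TypeError raised inside the handler itself
        return f"Error: bad arguments for '{name}': {exc}"


# Stand-in handlers named after the framework methods used in this notebook
def generate_image(prompt: str, size: str = "1024x1024") -> str:
    return f"generated {size} image for: {prompt}"


def send_message_to_agent(agent_name: str, message: str) -> str:
    return f"{agent_name} received: {message}"


registry = {
    "generate_image": generate_image,
    "send_message_to_agent": send_message_to_agent,
}

print(dispatch_tool_call("generate_image", '{"prompt": "a koi pond"}', registry))
print(dispatch_tool_call("unknown_tool", "{}", registry))
```

Returning error strings rather than raising keeps the conversation going: the string can be submitted back as the tool output so the model can react to the failure.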

## 11. Performance Monitoring

Let's time a single request end to end:

In [None]:
import time

# Performance test
print("=" * 60)
print("PERFORMANCE MONITORING")
print("=" * 60)

# perf_counter is a monotonic clock, so system clock adjustments do not skew the result
start_time = time.perf_counter()

# Test a simple request
test_query = "Generate a simple cartoon character"
response = framework.send_user_message(test_query)

duration = time.perf_counter() - start_time

print(f"\nRequest completed in {duration:.2f} seconds")
print(f"Response length: {len(response)} characters")
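
A single measurement says little, since image generation latency varies from call to call. A minimal sketch of a reusable timer that collects several samples per label follows; `timed` and `summarize` are hypothetical helpers, and `time.sleep` stands in for a framework call.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label: str, results: dict):
    """Append the elapsed wall-clock seconds of the block to results[label]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Record the duration even if the block raised
        results.setdefault(label, []).append(time.perf_counter() - start)


def summarize(durations: list) -> dict:
    """Return count, min, mean, and max for a non-empty list of durations."""
    return {
        "count": len(durations),
        "min": min(durations),
        "mean": sum(durations) / len(durations),
        "max": max(durations),
    }


results = {}
for _ in range(3):
    with timed("simple_request", results):
        time.sleep(0.01)  # replace with framework.send_user_message(...)

stats = summarize(results["simple_request"])
print(f"{stats['count']} runs, mean {stats['mean']:.3f}s, max {stats['max']:.3f}s")
```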

## 12. Error Handling and Debugging

Let's test error handling capabilities:

In [None]:
# Test error handling
print("=" * 60)
print("ERROR HANDLING TESTING")
print("=" * 60)

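# Test invalid agent name
print("1. Testing invalid agent name:")
try:
    error_response = framework.send_message_to_agent(
        "invalid_agent",
        "This should fail"
    )
    print(f"Response: {error_response}")
except Exception as e:
    # Covers the case where the framework raises instead of returning an error message
    print(f"Error with invalid agent name: {e}")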

# Test with a long prompt (2,219 characters, truncated to 1,000 before sending)
print("\n2. Testing with a long prompt (truncated to 1000 characters):")
long_prompt = "Create an image of " + "a very detailed scene " * 100
try:
    long_response = framework.generate_image(long_prompt[:1000])  # Truncate to reasonable length
    print(f"Long prompt result: {long_response[:100]}...")
except Exception as e:
    print(f"Error with long prompt: {e}")
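
Slicing with `long_prompt[:1000]` can cut the prompt in the middle of a word. A small sketch of a truncation helper that prefers a word boundary is shown below. `truncate_prompt` is a hypothetical helper, and `max_chars` is left as a parameter because the prompt limit depends on the image model you deploy.

```python
def truncate_prompt(prompt: str, max_chars: int) -> str:
    """Collapse whitespace and cut to max_chars, preferring a word boundary."""
    text = " ".join(prompt.split())  # trims ends and collapses whitespace runs
    if len(text) <= max_chars:
        return text
    cut = text[:max_chars]
    # If the cut landed inside a word, back up to the previous space
    if text[max_chars] != " " and " " in cut:
        cut = cut[: cut.rfind(" ")]
    return cut.rstrip()


long_prompt = "Create an image of " + "a very detailed scene " * 100
short_prompt = truncate_prompt(long_prompt, 1000)
print(f"{len(long_prompt)} characters reduced to {len(short_prompt)}")
```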

## 13. Cleanup and Resource Management

Assistants persist in your Azure OpenAI resource until they are deleted, so always clean up when you are done:

In [None]:
# Cleanup resources
print("=" * 60)
print("CLEANUP AND RESOURCE MANAGEMENT")
print("=" * 60)

# Show current resource usage
print(f"Assistants to cleanup: {len(framework.assistants)}")
print(f"Active threads: {sum(1 for info in framework.agents_threads.values() if info['thread'])}")

# Perform cleanup
framework.cleanup()

print("\n✅ Cleanup completed!")
print("All assistants have been deleted from Azure OpenAI.")
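
If an earlier cell raises, the cleanup cell may never run. One way to guard against that in script form is a context manager that calls `cleanup()` on exit, sketched below. `managed` is a hypothetical helper, and `FakeFramework` is a stand-in that only records whether `cleanup()` was called.

```python
from contextlib import contextmanager


@contextmanager
def managed(framework):
    """Yield the framework and call its cleanup() even if the block raises."""
    try:
        yield framework
    finally:
        framework.cleanup()


class FakeFramework:
    """Stand-in for MultiAgentFramework; records cleanup calls only."""

    def __init__(self):
        self.cleaned_up = False

    def cleanup(self):
        self.cleaned_up = True


fake = FakeFramework()
try:
    with managed(fake) as fw:
        raise RuntimeError("simulated failure mid-workflow")
except RuntimeError as exc:
    print(f"Workflow failed: {exc}")

print(f"Cleanup ran: {fake.cleaned_up}")
```

With the real class, the body of the `with` block would hold the `start_conversation()` and `send_user_message()` calls.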

## 14. Summary and Next Steps

This notebook demonstrates a multimodal multi-agent framework built on the Azure OpenAI Assistants API.

### Key Features Demonstrated:
1. **Multi-Agent Architecture**: User Proxy, DALL-E, and Vision assistants
2. **Persistent Threading**: Agents maintain conversation context
3. **Function Calling**: Agents can call specialized functions
4. **Inter-Agent Communication**: Agents collaborate to solve complex tasks
5. **Error Handling**: An invalid agent name and an over-long prompt are exercised in Section 12
6. **Resource Management**: `framework.cleanup()` deletes the assistants created in Azure OpenAI

### Possible Extensions:
- Add more specialized agents (e.g., text analysis, code generation)
- Implement agent scheduling and task queuing
- Add persistent storage for conversation history
- Integrate with external APIs and services
- Create a web interface for easier interaction
- Add metrics and analytics for agent performance

### Best Practices:
- Always clean up resources after use
- Monitor API usage and costs
- Implement proper error handling
- Use environment variables for sensitive data
- Test with various input types and edge cases