# FileSystemPlugin AI Agent Testing with Enhanced Group Chat Orchestration

**Major updates**: This notebook adds session-aware message tracking, a task-agnostic termination check, a second agent for architectural analysis, and more detailed orchestration logging:

## 🆕 Key Enhancements

### 1. **Robust Message Tracking & Session Management**
- **Timestamped outputs**: All runs save to unique files like `agent_responses_2024-01-27_15-30-45_run_abc123.json`
- **Session management**: Each session gets a unique ID with comprehensive metadata tracking
- **Auto-reset**: Messages reset automatically between runs, with a notice printed when earlier messages are discarded
- **Error recovery**: If the primary save fails, a fallback file is written so that collected messages are not silently lost
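
The naming scheme above can be sketched as follows. This is a minimal illustration, assuming the filename combines the session timestamp with a short run ID as in the example filename; `build_output_filename` and `new_run_id` are hypothetical helpers, not functions defined in this notebook.

```python
import uuid
from datetime import datetime


def build_output_filename(session_start: datetime, run_id: str) -> str:
    """Combine the session timestamp and a run ID into a unique filename."""
    timestamp = session_start.strftime("%Y-%m-%d_%H-%M-%S")
    return f"agent_responses_{timestamp}_run_{run_id}.json"


def new_run_id() -> str:
    """Return a short run ID: the first 8 characters of a random UUID."""
    return str(uuid.uuid4())[:8]


example = build_output_filename(datetime(2024, 1, 27, 15, 30, 45), "abc123")
print(example)  # agent_responses_2024-01-27_15-30-45_run_abc123.json
```

Because the timestamp is fixed per session and the run ID changes per run, two runs in the same session write to different files.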

### 2. **Generic Termination System**
- **Flexible criteria**: Termination prompts adapt to any agent task, not just codebase analysis
- **Task-agnostic**: Works for any general-purpose agent responsibility
- **Configurable objectives**: Support for single or multiple completion criteria
- **Progress tracking**: Intelligent evaluation of task completion status
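
One way to picture configurable completion criteria is a small function that renders the termination prompt from a list. This is only a sketch under that assumption; `build_termination_prompt` is a hypothetical helper, and the notebook itself defines its prompts as class attributes on the manager classes.

```python
def build_termination_prompt(topic: str, criteria: list[str]) -> str:
    """Render a task-agnostic termination prompt from completion criteria."""
    if not criteria:
        raise ValueError("At least one completion criterion is required.")
    # Number the criteria so the evaluating model can refer to them individually.
    numbered = " ".join(f"{i}. {text}" for i, text in enumerate(criteria, start=1))
    return (
        f"You are monitoring an AI agent working on the following task: '{topic}'. "
        f"Evaluate if the agent has completed ALL of the following criteria: {numbered} "
        "Respond with True ONLY if ALL criteria are met. "
        "Otherwise, respond with False and explain what still needs to be done."
    )


prompt = build_termination_prompt(
    "Summarize the repository",
    ["The main objectives were addressed.", "A final report was provided."],
)
print(prompt)
```

Passing a single-item list covers the single-criterion case; longer lists cover tasks with several objectives.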

### 3. **Dual-Agent Architecture**
- **Original Agent**: `CodebaseAnalysisAndTestingAgent` - Focuses on codebase analysis + FileSystemPlugin tool testing
- **New Agent**: `CodebaseArchitectureAnalyst` - Deep architectural analysis with Mermaid diagrams
- **Complementary analysis**: Two different perspectives on the same codebase
- **Comparative insights**: Side-by-side analysis of different approaches

### 4. **Advanced Orchestration Features**
- **Enhanced callbacks**: Improved logging with timestamps and metadata
- **Execution tracking**: Comprehensive run history and performance metrics  
- **Error handling**: Graceful failure recovery with detailed error reporting
- **Timeout management**: Execution timeouts for long-running tasks (15 minutes per run in the cells below)
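
The timeout behaviour can be illustrated independently of the orchestration classes. The sketch below uses only `asyncio` from the standard library; `run_with_timeout` and `slow_task` are hypothetical stand-ins for an agent invocation and are not part of this notebook.

```python
import asyncio


async def run_with_timeout(coro, timeout_seconds: float):
    """Await a coroutine, returning (result, error) instead of raising on timeout."""
    try:
        result = await asyncio.wait_for(coro, timeout=timeout_seconds)
        return result, None
    except asyncio.TimeoutError as exc:
        # The awaited task is cancelled by wait_for before this is raised.
        return None, exc


async def slow_task(delay: float) -> str:
    """Stand-in for a long-running agent invocation."""
    await asyncio.sleep(delay)
    return "done"


async def demo():
    finished = await run_with_timeout(slow_task(0.01), timeout_seconds=1.0)
    timed_out = await run_with_timeout(slow_task(1.0), timeout_seconds=0.05)
    return finished, timed_out
```

In a notebook cell, `await demo()` runs the example directly; in a script, use `asyncio.run(demo())`.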

## 🏗️ Architecture Analysis Agent

The new `CodebaseArchitectureAnalyst` specializes in:
- **System architecture documentation** with expert-level analysis
- **Mermaid diagram creation** (at least three diagrams per analysis, typically three to five)
- **Component relationship mapping** and dependency analysis
- **Technology stack evaluation** and architectural decision rationale
- **End-to-end workflow explanation** with concrete examples
- **Scalability and performance considerations**
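
A diagram-count requirement like the one above can be checked mechanically on the final report. The following is a rough sketch, assuming the report is Markdown and each diagram sits in its own fenced `mermaid` block; `count_mermaid_diagrams` and `meets_diagram_minimum` are hypothetical helpers, not part of this notebook.

```python
import re

FENCE = "`" * 3  # built programmatically so this example nests cleanly in Markdown
MERMAID_OPEN = re.compile(rf"^\s*{FENCE}mermaid\b", re.MULTILINE)


def count_mermaid_diagrams(report: str) -> int:
    """Count fenced Mermaid blocks in a Markdown report."""
    return len(MERMAID_OPEN.findall(report))


def meets_diagram_minimum(report: str, minimum: int = 3) -> bool:
    """Return True when the report contains at least `minimum` Mermaid diagrams."""
    return count_mermaid_diagrams(report) >= minimum


sample_report = "\n".join(
    [
        "# Architecture",
        f"{FENCE}mermaid",
        "graph TD; A-->B;",
        FENCE,
        f"{FENCE}mermaid",
        "sequenceDiagram",
        FENCE,
    ]
)
print(count_mermaid_diagrams(sample_report))  # 2
print(meets_diagram_minimum(sample_report))   # False
```

A deterministic check like this could complement the LLM-based termination check, which only estimates whether enough diagrams were produced.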

## 🔧 Production-Ready Features

- ✅ **Session isolation**: Each run is completely independent
- ✅ **Data persistence**: All conversation data saved with rich metadata
- ✅ **Error resilience**: Comprehensive error handling and recovery
- ✅ **Flexible configuration**: Easy adaptation to any agent task
- ✅ **Run monitoring**: Start and end timestamps and message counts recorded for each run
- ✅ **Comparative analysis**: Multiple agent perspectives on the same data

This notebook is intended as a **reusable starting point for general-purpose agent tasks**, with per-run persistence and error handling built in.

---

## Setup and Imports

Import necessary components and configure the environment for both Azure OpenAI and OpenAI providers.

In [1]:
import asyncio
import json
import os
from pathlib import Path
from datetime import datetime
import uuid
from dotenv import load_dotenv
from IPython.display import display, Markdown

# Core Semantic Kernel imports
from semantic_kernel import Kernel
from semantic_kernel.agents import ChatCompletionAgent
from semantic_kernel.connectors.ai.open_ai import AzureChatCompletion, OpenAIChatCompletion
from semantic_kernel.connectors.ai.open_ai import OpenAIChatPromptExecutionSettings, AzureChatPromptExecutionSettings
from semantic_kernel.contents import ChatMessageContent, FunctionCallContent, FunctionResultContent
from semantic_kernel.contents.utils.author_role import AuthorRole
from semantic_kernel.contents.chat_history import ChatHistory
from semantic_kernel.functions import KernelArguments

# Group Chat Orchestration imports
from semantic_kernel.agents import GroupChatOrchestration
from semantic_kernel.agents.orchestration.group_chat import BooleanResult, GroupChatManager, MessageResult, StringResult
from semantic_kernel.agents.runtime import InProcessRuntime
from semantic_kernel.connectors.ai.prompt_execution_settings import PromptExecutionSettings
from semantic_kernel.prompt_template import KernelPromptTemplate, PromptTemplateConfig
from semantic_kernel.connectors.ai.chat_completion_client_base import ChatCompletionClientBase
from typing_extensions import override

# Import FileSystemPlugin
from plugins.file_system import FileSystemPlugin

# Load environment variables
load_dotenv()

print("✅ All imports loaded successfully!")

# Initialize session management
SESSION_ID = str(uuid.uuid4())[:8]
SESSION_START_TIME = datetime.now()
SESSION_TIMESTAMP = SESSION_START_TIME.strftime("%Y-%m-%d_%H-%M-%S")

print(f"🔄 Session ID: {SESSION_ID}")
print(f"⏰ Session started: {SESSION_START_TIME.strftime('%Y-%m-%d %H:%M:%S')}")

✅ All imports loaded successfully!
🔄 Session ID: adfd0bc9
⏰ Session started: 2025-07-28 23:19:15


## Configure Services and Agent

Set up the reasoning model (o4-mini) from either Azure OpenAI or OpenAI, and initialize the FileSystemPlugin with the `consult/` directory as the base path.

In [2]:
# Configure reasoning model - try Azure OpenAI first, then OpenAI
reasoning_completion = None
provider_name = None

if os.getenv("AZURE_REASONING_ENDPOINT"):
    print("🔵 Configuring Azure OpenAI o4-mini...")
    reasoning_completion = AzureChatCompletion(
        api_key=os.getenv("AZURE_REASONING_API_KEY"),
        endpoint=os.getenv("AZURE_REASONING_ENDPOINT"),
        deployment_name="o4-mini",  # o4-mini deployment
        instruction_role="developer",  # Required for o4 models
        service_id="reasoning"
    )
    
    chat_completion = AzureChatCompletion(
        api_key=os.getenv("AZURE_OPENAI_API_KEY"),
        endpoint=os.getenv("AZURE_OPENAI_ENDPOINT"),
        deployment_name=os.getenv("AZURE_OPENAI_DEPLOYMENT_NAME"),
    )

    print("✅ Chat completion services configured!")
    
    
    provider_name = "Azure OpenAI"
    
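elif os.getenv("OPENAI_API_KEY"):
    print("🟢 Configuring OpenAI o4-mini...")
    reasoning_completion = OpenAIChatCompletion(
        api_key=os.getenv("OPENAI_API_KEY"),
        ai_model_id="o4-mini",  # o4-mini model
        instruction_role="developer",  # Required for o4 models
        service_id="reasoning"
    )
    reasoning_settings = OpenAIChatPromptExecutionSettings(
        service_id="reasoning",
        reasoning_effort="high"  # low | medium | high
    )

    # The termination managers below use `chat_completion`, so it must exist on
    # this branch too. The env var name and default model here are assumptions.
    chat_completion = OpenAIChatCompletion(
        api_key=os.getenv("OPENAI_API_KEY"),
        ai_model_id=os.getenv("OPENAI_CHAT_MODEL_ID", "gpt-4o"),
    )

    print("✅ Chat completion services configured!")

    provider_name = "OpenAI"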
    
else:
    raise ValueError("❌ No reasoning model configured. Please set either AZURE_REASONING_* or OPENAI_API_KEY environment variables.")

print(f"✅ {provider_name} o4-mini reasoning model configured!")

🔵 Configuring Azure OpenAI o4-mini...
✅ Chat completion services configured!
✅ Azure OpenAI o4-mini reasoning model configured!


In [3]:
# Initialize FileSystemPlugin with consult/ as base directory
consult_path = Path("consult").resolve()
print(f"📁 Setting FileSystemPlugin base path to: {consult_path}")

# Verify the directory exists before handing it to the plugin
if not consult_path.is_dir():
    raise ValueError(f"❌ Directory {consult_path} does not exist!")

file_system_plugin = FileSystemPlugin(base_path=str(consult_path))

print(f"✅ FileSystemPlugin initialized with base path: {consult_path}")

📁 Setting FileSystemPlugin base path to: /home/agangwal/lseg-migration-agent/migration-agent/consult
✅ FileSystemPlugin initialized with base path: /home/agangwal/lseg-migration-agent/migration-agent/consult


In [4]:
# Create the AI agent with dual objectives
analysis_agent = ChatCompletionAgent(
    service=reasoning_completion,
    name="CodebaseAnalysisAndTestingAgent",
    description="Code analysis agent with dual objectives: analyze codebase and test FileSystemPlugin tools.",
    instructions="""You are a comprehensive code analysis and testing agent with two primary objectives:

OBJECTIVE 1: CODEBASE ANALYSIS
- Analyze and understand the codebase in the current directory
- Identify the project structure, key components, and architecture
- Document main functionality, frameworks used, and purpose
- Understand what this system does and how it's organized
- Create a comprehensive summary of the codebase

OBJECTIVE 2: TOOL EFFECTIVENESS TESTING
- Test all FileSystemPlugin functions systematically
- Use various scenarios to test each tool's capabilities
- Document inputs, outputs, and effectiveness
- Note limitations, errors, and suggestion quality
- Evaluate token efficiency and response usefulness

Your tools are restricted to your working directory - all file operations focus on this directory.
Use the available tools naturally to explore and understand the codebase first, then systematically test each tool.
Provide detailed reasoning for your approach and findings.

At the end, provide a comprehensive markdown report with two main sections:
1. **Codebase Analysis Summary** - What you learned about the consult/ project
2. **FileSystemPlugin Tool Effectiveness Report** - How well each tool performed

Be thorough, analytical, and provide specific examples and insights.

IMPORTANT: Use tools continuously until you have finished both objectives and have a complete understanding of the codebase and tool effectiveness. 
IMPORTANT: Test ALL tools available to you. Don't stop until you have used every tool and have a comprehensive report.
DO NOT INVENT TOOLS THAT DO NOT EXIST. YOU MUST DOUBLE CHECK THE TOOLS AVAILABLE AND ONLY USE THOSE.
""",
    plugins=[file_system_plugin]
)

# Create the CodebaseArchitectureAnalyst agent
architecture_agent = ChatCompletionAgent(
    service=reasoning_completion,
    name="CodebaseArchitectureAnalyst",
    description="Senior software architect specialized in deep codebase analysis and architectural documentation with Mermaid diagrams.",
    instructions="""You are a senior software architect and codebase analysis expert with expertise in:
- System architecture analysis and documentation
- Technology stack identification and evaluation  
- Component relationship mapping and dependency analysis
- Data flow analysis and workflow documentation
- Design pattern recognition and architectural decision analysis
- Architecture diagram creation using Mermaid syntax

Your mission: Conduct comprehensive codebase analysis and create detailed architectural documentation including:

**PHASE 1: DEEP ARCHITECTURAL ANALYSIS**
- Systematically explore the entire codebase structure
- Identify all major components, modules, and their responsibilities
- Map inter-component relationships and dependencies
- Analyze data flow and communication patterns between components
- Document technology stack, frameworks, libraries, and their integration
- Identify design patterns, architectural styles, and key decisions

**PHASE 2: SYSTEM WORKFLOW ANALYSIS**
- Trace end-to-end user journeys and data flows
- Document API endpoints, database interactions, and external integrations
- Analyze authentication, authorization, and security patterns
- Map deployment architecture and infrastructure patterns
- Identify scalability considerations, bottlenecks, and optimization opportunities

**PHASE 3: COMPREHENSIVE ARCHITECTURAL DOCUMENTATION**
Create detailed Mermaid diagrams showing:
- **High-level System Architecture**: Major components and their relationships
- **Component Dependency Graph**: Detailed inter-module dependencies  
- **Data Flow Diagrams**: How data moves through the system
- **Database Schema Relationships**: Entity relationships and data models (if applicable)
- **Deployment Architecture**: Infrastructure and deployment patterns
- **User Journey Flows**: Key user workflows and system interactions

**PHASE 4: FINAL ARCHITECTURAL REPORT**
Provide comprehensive analysis including:
- Executive summary of system architecture and purpose
- Technical deep-dive into each major component and its role
- Explanation of how the system works end-to-end with concrete examples
- Architecture strengths, weaknesses, and potential improvements
- Technology choices rationale and alternative considerations
- Scalability analysis and performance considerations
- Security architecture and compliance considerations

**CRITICAL REQUIREMENTS:**
- Continue exploring until you have complete understanding of ALL major directories and components
- Must create AT LEAST 3-5 different Mermaid diagrams showing different architectural views
- Final report must be comprehensive (minimum 2000 words) with technical depth
- Include specific code examples and file references in your analysis
- Explain the "why" behind architectural decisions, not just the "what"

IMPORTANT: Do not stop until you have thoroughly analyzed the entire codebase architecture and created comprehensive documentation with multiple Mermaid diagrams. Your analysis should demonstrate expert-level understanding of the system's design and implementation.
""",
    plugins=[file_system_plugin]
)

print(f"🤖 AI Agent '{analysis_agent.name}' created with FileSystemPlugin!")
print(f"🏗️ Architecture Agent '{architecture_agent.name}' created with enhanced analysis capabilities!")
print(f"🧠 Using {provider_name} o4-mini reasoning model")

🤖 AI Agent 'CodebaseAnalysisAndTestingAgent' created with FileSystemPlugin!
🏗️ Architecture Agent 'CodebaseArchitectureAnalyst' created with enhanced analysis capabilities!
🧠 Using Azure OpenAI o4-mini reasoning model


In [5]:
# Initialize robust message tracking with session management
MESSAGES = []
EXECUTION_METADATA = {
    "session_id": SESSION_ID,
    "session_start": SESSION_START_TIME.isoformat(),
    "notebook_version": "5-file_system_plugin_with_ai_agent.ipynb",
    "agent_runs": []
}

def reset_messages_for_new_run(task_description: str, agent_name: str):
    """Reset message tracking for a new agent run and register the run's metadata.

    No confirmation is requested; a notice is printed if earlier messages are discarded.
    """
    global MESSAGES
    
    if MESSAGES:
        print(f"⚠️  Found {len(MESSAGES)} messages from previous run")
        print("🔄 Resetting message tracking for new run...")
        
    MESSAGES = []
    
    # Add run metadata
    run_metadata = {
        "run_id": str(uuid.uuid4())[:8],
        "agent_name": agent_name,
        "task_description": task_description,
        "start_time": datetime.now().isoformat(),
        "status": "started"
    }
    
    EXECUTION_METADATA["agent_runs"].append(run_metadata)
    print(f"✅ Message tracking reset. Run ID: {run_metadata['run_id']}")
    return run_metadata["run_id"]

def save_messages_with_metadata(run_id: str, final_response=None, error=None):
    """Save messages with comprehensive metadata and error handling."""
    try:
        # Update run metadata
        for run in EXECUTION_METADATA["agent_runs"]:
            if run["run_id"] == run_id:
                run["end_time"] = datetime.now().isoformat()
                run["status"] = "completed" if not error else "failed"
                run["message_count"] = len(MESSAGES)
                run["error"] = str(error) if error else None
                break
        
        # Prepare comprehensive output
        output_data = {
            "execution_metadata": EXECUTION_METADATA,
            "messages": MESSAGES,
            "final_response": final_response.model_dump() if hasattr(final_response, 'model_dump') else str(final_response) if final_response else None
        }
        
        # Generate timestamped filename
        filename = f"agent_responses_{SESSION_TIMESTAMP}_run_{run_id}.json"
        
        with open(filename, "w") as f:
            json.dump(output_data, f, indent=2, default=str)
        
        print(f"💾 Messages saved to: {filename}")
        print(f"📊 Saved {len(MESSAGES)} messages with metadata")
        
        return filename
        
    except Exception as save_error:
        print(f"❌ Error saving messages: {save_error}")
        # Fallback save attempt with a minimal payload
        try:
            fallback_filename = f"agent_responses_emergency_{SESSION_TIMESTAMP}.json"
            with open(fallback_filename, "w") as f:
                json.dump({"messages": MESSAGES, "error": str(save_error)}, f, indent=2, default=str)
            print(f"💾 Emergency save to: {fallback_filename}")
            return fallback_filename
        except Exception as fallback_error:
            # Avoid a bare `except:` so KeyboardInterrupt and SystemExit still propagate
            print(f"❌ Emergency save also failed: {fallback_error}")
            return None

print("✅ Enhanced message tracking system initialized!")

✅ Enhanced message tracking system initialized!


In [None]:
class SingleAgentGroupChatManager(GroupChatManager):
    """Group chat manager for single agent that continues until objectives are complete.
    
    This manager is designed for a single agent scenario where we want the agent
    to continue working until it has completed all its objectives and created
    a final report.
    """

    service: ChatCompletionClientBase
    topic: str

    termination_prompt: str = (
        "You are monitoring an AI agent working on the following task: "
        "'{{$topic}}'. "
        "Evaluate if the agent has completed ALL of the following criteria: "
        "1. The agent has systematically addressed the main task objectives, "
        "2. All required analysis, exploration, or testing has been completed, "
        "3. A comprehensive final report or summary has been provided, "
        "4. The response demonstrates thorough completion of the assigned work. "
        "Respond with True ONLY if ALL criteria are met and the task is genuinely complete. "
        "Otherwise, respond with False and explain what specific work still needs to be done."
    )

    def __init__(self, topic: str, service: ChatCompletionClientBase, **kwargs) -> None:
        """Initialize the single agent group chat manager."""
        super().__init__(topic=topic, service=service, **kwargs)

    async def _render_prompt(self, prompt: str, arguments: KernelArguments) -> str:
        """Helper to render a prompt with arguments."""
        prompt_template_config = PromptTemplateConfig(template=prompt)
        prompt_template = KernelPromptTemplate(prompt_template_config=prompt_template_config)
        return await prompt_template.render(Kernel(), arguments=arguments)

    @override
    async def should_request_user_input(self, chat_history: ChatHistory) -> BooleanResult:
        """Single agent doesn't need user input."""
        return BooleanResult(
            result=False,
            reason="This group chat manager does not require user input.",
        )

    @override
    async def should_terminate(self, chat_history: ChatHistory) -> BooleanResult:
        """Check if the agent has completed all objectives."""
        should_terminate = await super().should_terminate(chat_history)
        if should_terminate.result:
            return should_terminate

        chat_history.messages.insert(
            0,
            ChatMessageContent(
                role=AuthorRole.SYSTEM,
                content=await self._render_prompt(
                    self.termination_prompt,
                    KernelArguments(topic=self.topic),
                ),
            ),
        )
        chat_history.add_message(
            ChatMessageContent(role=AuthorRole.USER, content="Determine if the agent has completed all objectives."),
        )

        response = await self.service.get_chat_message_content(
            chat_history,
            settings=PromptExecutionSettings(response_format=BooleanResult),
        )

        termination_with_reason = BooleanResult.model_validate_json(response.content)

        print("="*60)
        print(f"🤖 Termination Check - Should terminate: {termination_with_reason.result}")
        print(f"📝 Reason: {termination_with_reason.reason}")
        print("="*60)

        MESSAGES.append({
            "role": "termination_check",
            "content": termination_with_reason.reason,
            "should_terminate": termination_with_reason.result,
            "timestamp": datetime.now().isoformat()
        })

        return termination_with_reason

    @override
    async def select_next_agent(
        self,
        chat_history: ChatHistory,
        participant_descriptions: dict[str, str],
    ) -> StringResult:
        """For single agent, always select the same agent."""
        agent_name = list(participant_descriptions.keys())[0]
        
        return StringResult(
            result=agent_name,
            reason="Single agent scenario - continuing with the only available agent."
        )

    @override
    async def filter_results(
        self,
        chat_history: ChatHistory,
    ) -> MessageResult:
        """Return the last message which should contain the final report."""
        if not chat_history.messages:
            raise RuntimeError("No messages in the chat history.")

        # Find the last assistant message (from our agent)
        for message in reversed(chat_history.messages):
            if message.role == AuthorRole.ASSISTANT:
                return MessageResult(
                    result=message,
                    reason="Returning the agent's final message containing the comprehensive report."
                )
        
        # Fallback to last message if no assistant message found
        return MessageResult(
            result=chat_history.messages[-1],
            reason="Returning the last message in the conversation."
        )


class AnalysisAgentManager(SingleAgentGroupChatManager):
    """Specialized manager for the analysis agent with specific termination criteria."""
    
    termination_prompt: str = (
        "You are monitoring a codebase analysis agent working on: '{{$topic}}'. "
        "Check if the agent has completed BOTH objectives: "
        "1. Comprehensive codebase analysis - explored directory structure, examined key files, understood system architecture. "
        "2. Testing all FileSystemPlugin functions - tested all 5 functions: find_files, list_directory, read_file, search_in_files, get_file_info. "
        "The agent should have provided a final markdown report with both 'Codebase Analysis Summary' and 'FileSystemPlugin Tool Effectiveness Report' sections. "
        "Respond with True ONLY if both objectives are complete with the final markdown report. "
        "Otherwise, respond with False and explain what still needs to be done."
    )


class ArchitectureAgentManager(SingleAgentGroupChatManager):
    """Specialized manager for the architecture agent with specific termination criteria."""
    
    termination_prompt: str = (
        "You are monitoring an architecture analysis agent working on: '{{$topic}}'. "
        "Check if the agent has completed ALL requirements: "
        "1. Systematically explored all major directories and components. "
        "2. Created at least 3-5 detailed Mermaid diagrams showing different architectural views. "
        "3. Provided comprehensive technical analysis of system architecture. "
        "4. Explained end-to-end workflows and component interactions. "
        "5. Delivered a detailed final report (minimum 2000 words) with architectural insights. "
        "6. Included specific code examples, file references, and architectural decision analysis. "
        "Respond with True ONLY if ALL requirements are met with comprehensive documentation and multiple Mermaid diagrams. "
        "Otherwise, respond with False and explain what specific work still needs to be done."
    )

print("✅ Enhanced GroupChatManager classes created!")

## Agent Selection and Execution

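This section runs either agent on its own or both in sequence. Set the flags in the next cell to choose which agents run; the cell for a disabled agent prints a skip notice and does nothing else. Run the cells in order, because the Architecture Agent cell reuses the response callback defined in the Analysis Agent cell.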

In [7]:
# Agent Selection Configuration
# You can choose which agent(s) to run by setting these flags

# Set to True to run the Analysis Agent (codebase analysis + tool testing)
RUN_ANALYSIS_AGENT = False

# Set to True to run the Architecture Agent (deep architectural analysis + Mermaid diagrams)  
RUN_ARCHITECTURE_AGENT = True

# Set to True to show detailed execution logs during agent runs
SHOW_EXECUTION_LOGS = True

print("🎛️ AGENT EXECUTION CONFIGURATION")
print("="*40)
print(f"📊 Run Analysis Agent: {'✅ YES' if RUN_ANALYSIS_AGENT else '❌ NO'}")
print(f"🏗️ Run Architecture Agent: {'✅ YES' if RUN_ARCHITECTURE_AGENT else '❌ NO'}")
print(f"📝 Show Execution Logs: {'✅ YES' if SHOW_EXECUTION_LOGS else '❌ NO'}")
print()
print("💡 TIP: You can modify these flags above to run only specific agents")
print("💡 TIP: Set SHOW_EXECUTION_LOGS=False to reduce output during long runs")

🎛️ AGENT EXECUTION CONFIGURATION
📊 Run Analysis Agent: ❌ NO
🏗️ Run Architecture Agent: ✅ YES
📝 Show Execution Logs: ✅ YES

💡 TIP: You can modify these flags above to run only specific agents
💡 TIP: Set SHOW_EXECUTION_LOGS=False to reduce output during long runs


### Option 1: Run Analysis Agent (Codebase Analysis + Tool Testing)

This agent focuses on understanding the codebase structure and systematically testing all FileSystemPlugin functions.
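
Whether every tool was exercised can also be checked after a run by comparing recorded function-call names against the expected set. The sketch below takes the five function names from the termination prompt used for this agent; `untested_tools` is a hypothetical helper, and the `Plugin-function` form of qualified names is an assumption about how calls appear in the log.

```python
EXPECTED_TOOLS = {
    "find_files",
    "list_directory",
    "read_file",
    "search_in_files",
    "get_file_info",
}


def untested_tools(called_names: list[str]) -> list[str]:
    """Return the expected tools that never appear in the recorded calls, sorted."""
    # Plugin-qualified names such as "FileSystemPlugin-read_file" are reduced
    # to their final segment before comparison (assumed naming convention).
    seen = {name.rsplit("-", 1)[-1] for name in called_names}
    return sorted(EXPECTED_TOOLS - seen)


calls = ["FileSystemPlugin-list_directory", "read_file", "find_files"]
print(untested_tools(calls))  # ['get_file_info', 'search_in_files']
```

An empty result means every expected tool was called at least once; it says nothing about how thoroughly each tool was tested.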

In [None]:
# Analysis Agent Execution

# Shared response callback. It is defined outside the RUN_ANALYSIS_AGENT guard
# because the Architecture Agent cell reuses it even when this agent is skipped.
def agent_response_callback(message: ChatMessageContent) -> None:
    """Display agent responses with function call details and enhanced tracking."""
    if SHOW_EXECUTION_LOGS:
        print(f"\n{'='*60}")
        print(f"📝 {message.name}: {message.role}")
        print(f"{'='*60}")

    # Add message to tracking with timestamp
    message_data = message.model_dump()
    message_data["timestamp"] = datetime.now().isoformat()
    MESSAGES.append(message_data)

    if SHOW_EXECUTION_LOGS:
        # Display message content
        if message.content:
            print("\n💭 AGENT REASONING:")
            print(message.content)

        # Display function calls and results
        for item in message.items or []:
            if isinstance(item, FunctionCallContent):
                print(f"\n🔧 FUNCTION CALL: {item.name}")
                print(f"📥 Arguments: {json.dumps(item.arguments, indent=2)}")

            elif isinstance(item, FunctionResultContent):
                print("\n📤 FUNCTION RESULT:")
                try:
                    # Try to parse and prettify JSON result
                    result_data = json.loads(item.result) if isinstance(item.result, str) else item.result
                    print(json.dumps(result_data, indent=2))
                except (json.JSONDecodeError, TypeError):
                    # If not JSON, display as string
                    print(str(item.result))


if RUN_ANALYSIS_AGENT:
    print("🎯 EXECUTING ANALYSIS AGENT")
    print("="*50)

    # Define task for Analysis Agent
    analysis_task = """Please perform a comprehensive analysis of the current directory codebase and thoroughly test all FileSystemPlugin tools.

Your dual mission:
1. Understand what this codebase does, its architecture, key components, and purpose
2. Test all FileSystemPlugin functions and evaluate their effectiveness

Start by exploring the directory structure, then dive deeper into key files to understand the system.
Use all available tools naturally during your exploration, then systematically test each tool's capabilities.

Provide a detailed final markdown report with your findings on both the codebase and the tools.
Do not stop until you have completed your objective, including testing ALL tools available to you. Do not forget the search_in_files function."""

    async def run_analysis_agent():
        """Run the analysis agent with enhanced tracking."""
        # Reset messages and get run ID
        run_id = reset_messages_for_new_run(
            "Comprehensive codebase analysis and FileSystemPlugin tool testing", 
            analysis_agent.name
        )
        
        # Create group chat orchestration with specialized manager
        group_chat = GroupChatOrchestration(
            members=[analysis_agent],
            manager=AnalysisAgentManager(
                topic=f"{analysis_agent.name} Analysis Task",
                service=chat_completion,
                max_rounds=25,
            ),
            agent_response_callback=agent_response_callback,
        )

        print("✅ Group chat orchestration created!")
        print(f"🚀 Starting {analysis_agent.name}...")
        
        if SHOW_EXECUTION_LOGS:
            print("\n" + "="*80)
            print("AGENT EXECUTION LOG")
            print("="*80)

        # Create runtime for orchestration
        runtime = InProcessRuntime()
        runtime.start()

        final_response = None
        error = None
        
        try:
            # Invoke the group chat orchestration
            orchestration_result = await group_chat.invoke(
                task=analysis_task,
                runtime=runtime
            )
            
            # Get the final result
            final_response = await orchestration_result.get(timeout=900)  # 15 minute timeout
            
            print("\n" + "="*80)
            print("🎉 ANALYSIS AGENT COMPLETED")
            print("="*80)
            
            if final_response:
                print("✅ Final response received")
                print(f"📊 Response length: {len(final_response.content) if hasattr(final_response, 'content') else len(str(final_response))} characters")
            else:
                print("❌ No final response received")
                
        except Exception as e:
            error = e
            print(f"❌ Error during agent execution: {str(e)}")
            
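        finally:
            # Always save messages, even when the run failed
            filename = save_messages_with_metadata(run_id, final_response, error)
            await runtime.stop_when_idle()

        # Return after the finally block: a return inside finally would swallow
        # cancellation and keyboard interrupts.
        return final_response, filename, error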

    # Execute the analysis agent
    analysis_response, analysis_filename, analysis_error = await run_analysis_agent()
    
    print(f"📄 Analysis results saved to: {analysis_filename}")
    if analysis_error:
        print(f"⚠️ Analysis agent error: {analysis_error}")
    else:
        print("✅ Analysis agent completed successfully")
else:
    print("⏭️ Skipping Analysis Agent (RUN_ANALYSIS_AGENT = False)")
    analysis_response, analysis_filename, analysis_error = None, None, None

### Option 2: Run Architecture Agent (Deep Architectural Analysis + Mermaid Diagrams)

This agent focuses on comprehensive architectural analysis with detailed Mermaid diagrams and system documentation.

In [None]:
# Architecture Agent Execution
if RUN_ARCHITECTURE_AGENT:
    print("🏗️ EXECUTING ARCHITECTURE AGENT")
    print("="*50)
    
    # Define task for Architecture Agent
    architecture_task = """Conduct comprehensive architectural analysis of the current directory codebase with focus on system design and component relationships.

Your mission is to become a complete expert on this system's architecture by:

**DEEP EXPLORATION PHASE:**
- Systematically explore ALL major directories and understand their purpose
- Examine key configuration files, models, views, and core modules  
- Identify the technology stack, frameworks, and architectural patterns used
- Map component dependencies and understand how modules interact

**ARCHITECTURAL ANALYSIS PHASE:**
- Document the overall system architecture and design philosophy
- Analyze data models, API structures, and integration patterns
- Identify key workflows, user journeys, and system processes
- Examine deployment, infrastructure, and scalability considerations

**DOCUMENTATION PHASE:**
Create comprehensive architectural documentation including:
- At least 3-5 detailed Mermaid diagrams showing different architectural views
- Complete technical analysis explaining how the system works end-to-end
- Architecture decision rationale and technology choice explanations
- Strengths, weaknesses, and improvement recommendations

Your final report should demonstrate expert-level architectural understanding with visual diagrams and detailed technical explanations. Do not stop until you have thoroughly documented the complete system architecture."""

    async def run_architecture_agent():
        """Run the architecture agent with enhanced tracking."""
        # Reset messages and get run ID
        run_id = reset_messages_for_new_run(
            "Deep architectural analysis with Mermaid diagrams and comprehensive documentation", 
            architecture_agent.name
        )
        
        # Create group chat orchestration with specialized manager
        group_chat = GroupChatOrchestration(
            members=[architecture_agent],
            manager=ArchitectureAgentManager(
                topic=f"{architecture_agent.name} Architecture Analysis Task",
                service=chat_completion,
                max_rounds=25,
            ),
            agent_response_callback=agent_response_callback,
        )

        print("✅ Group chat orchestration created!")
        print(f"🚀 Starting {architecture_agent.name}...")
        
        if SHOW_EXECUTION_LOGS:
            print("\n" + "="*80)
            print("AGENT EXECUTION LOG")
            print("="*80)

        # Create runtime for orchestration
        runtime = InProcessRuntime()
        runtime.start()

        final_response = None
        error = None
        
        try:
            # Invoke the group chat orchestration
            orchestration_result = await group_chat.invoke(
                task=architecture_task,
                runtime=runtime
            )
            
            # Get the final result
            final_response = await orchestration_result.get(timeout=900)  # 15 minute timeout
            
            print("\n" + "="*80)
            print("🎉 ARCHITECTURE AGENT COMPLETED")
            print("="*80)
            
            if final_response:
                print("✅ Final response received")
                # Fall back to str() when the response has no usable .content
                response_text = getattr(final_response, "content", None) or str(final_response)
                print(f"📊 Response length: {len(response_text)} characters")
            else:
                print("❌ No final response received")
                
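        except Exception as e:
            error = e
            print(f"❌ Error during agent execution: {e}")

        finally:
            # Always save messages, even when the run failed
            filename = save_messages_with_metadata(run_id, final_response, error)
            await runtime.stop_when_idle()

        # Return after the `finally` block: a `return` inside `finally` would
        # silently swallow cancellation and any other in-flight exception
        return final_response, filename, error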

    # Execute the architecture agent
    architecture_response, architecture_filename, architecture_error = await run_architecture_agent()
    
    print(f"📄 Architecture results saved to: {architecture_filename}")
    if architecture_error:
        print(f"⚠️ Architecture agent error: {architecture_error}")
    else:
        print("✅ Architecture agent completed successfully")
else:
    print("⏭️ Skipping Architecture Agent (RUN_ARCHITECTURE_AGENT = False)")
    architecture_response, architecture_filename, architecture_error = None, None, None

## Report Rendering

View the results from the executed agents in formatted markdown.
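
Both rendering cells below repeat the same type checks to pull text out of a response. One way to factor that out is sketched here, assuming a response is either a plain string or an object exposing a `content` attribute. `extract_report_text` and `FakeMessage` are hypothetical names used only for illustration. `FakeMessage` stands in for a chat message object and is not a Semantic Kernel class.

```python
from typing import Any, Optional


def extract_report_text(response: Any) -> Optional[str]:
    """Return report text from a str or from an object with a .content attribute."""
    if response is None:
        return None
    if isinstance(response, str):
        return response or None
    content = getattr(response, "content", None)
    if content is None:
        return None
    return str(content) or None


class FakeMessage:
    """Hypothetical stand-in for a chat message object (illustration only)."""

    def __init__(self, content: Optional[str]) -> None:
        self.content = content


# Usage: every supported response shape goes through the same function
for candidate in ("# Plain string report", FakeMessage("# Message report"), FakeMessage(None)):
    text = extract_report_text(candidate)
    print(type(candidate).__name__, "->", "no content" if text is None else text)
```

Returning `None` for empty or missing content keeps the "could not extract report content" branch in one place instead of in every cell.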

In [None]:
# Render Analysis Agent Report
if analysis_response and not analysis_error:
    print("📋 ANALYSIS AGENT REPORT")
    print("="*40)
    
    # Extract the content based on the response type
    report_content = None
    
    if isinstance(analysis_response, ChatMessageContent):
        report_content = analysis_response.content
    elif isinstance(analysis_response, str):
        report_content = analysis_response
    elif hasattr(analysis_response, 'content'):
        report_content = analysis_response.content
    
    if report_content:
        # Display the final report as formatted markdown
        display(Markdown(f"# Analysis Agent Report\n\n{report_content}"))
    else:
        print("⚠️ Could not extract report content from analysis response")
        print(f"Response type: {type(analysis_response)}")
        print(f"Response preview: {str(analysis_response)[:500]}...")
elif analysis_error:
    print(f"❌ Analysis agent failed: {analysis_error}")
elif not RUN_ANALYSIS_AGENT:
    print("⏭️ Analysis agent was not executed (RUN_ANALYSIS_AGENT = False)")
else:
    print("❌ No analysis agent report available")

In [None]:
# Render Architecture Agent Report
if architecture_response and not architecture_error:
    print("📋 ARCHITECTURE AGENT REPORT")
    print("="*40)
    
    # Extract the content based on the response type
    report_content = None
    
    if isinstance(architecture_response, ChatMessageContent):
        report_content = architecture_response.content
    elif isinstance(architecture_response, str):
        report_content = architecture_response
    elif hasattr(architecture_response, 'content'):
        report_content = architecture_response.content
    
    if report_content:
        # Display the final report as formatted markdown
        display(Markdown(f"# Architecture Agent Report\n\n{report_content}"))
        
        # Count Mermaid diagrams (report_content is known to be non-empty here)
        mermaid_count = report_content.count("```mermaid")
        print("\n📊 Report Statistics:")
        print(f"   📝 Length: {len(report_content)} characters")
        print(f"   📈 Mermaid diagrams: {mermaid_count}")
    else:
        print("⚠️ Could not extract report content from architecture response")
        print(f"Response type: {type(architecture_response)}")
        print(f"Response preview: {str(architecture_response)[:500]}...")
elif architecture_error:
    print(f"❌ Architecture agent failed: {architecture_error}")
elif not RUN_ARCHITECTURE_AGENT:
    print("⏭️ Architecture agent was not executed (RUN_ARCHITECTURE_AGENT = False)")
else:
    print("❌ No architecture agent report available")

## Execution Summary

Summary of the session: duration, per-agent status and output files, run statistics, and the active configuration.
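
The statistics printed below (agents executed, successful runs, failed runs, output files) can also be derived from one small record per agent. This is a sketch under stated assumptions: `RunOutcome` and `summarize` are hypothetical names that do not exist in the notebook, and each agent is assumed to yield an enabled flag, an optional output filename, and an optional error.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RunOutcome:
    """Hypothetical per-agent record mirroring the notebook's result triples."""
    name: str
    enabled: bool
    filename: Optional[str] = None
    error: Optional[Exception] = None


def summarize(outcomes: list[RunOutcome]) -> dict[str, int]:
    """Count executed, succeeded, and failed runs plus the output files written."""
    executed = [o for o in outcomes if o.enabled]
    failed = [o for o in executed if o.error is not None]
    return {
        "executed": len(executed),
        "succeeded": len(executed) - len(failed),
        "failed": len(failed),
        "output_files": sum(1 for o in executed if o.filename),
    }


# Usage with made-up values: one successful run, one failed run, one disabled agent
outcomes = [
    RunOutcome("analysis", enabled=True, filename="analysis.json"),
    RunOutcome("architecture", enabled=True, filename="arch.json", error=TimeoutError("timed out")),
    RunOutcome("extra", enabled=False),
]
print(summarize(outcomes))
```

Keeping the counts in one function avoids the per-agent counter updates drifting apart as more agents are added.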

In [None]:
# Comprehensive Execution Summary
print("📊 SESSION SUMMARY")
print("="*50)

# Calculate session duration
session_duration = datetime.now() - SESSION_START_TIME
duration_minutes = session_duration.total_seconds() / 60

print(f"🕒 Session Duration: {duration_minutes:.1f} minutes")
print(f"🆔 Session ID: {SESSION_ID}")
print(f"📅 Session Date: {SESSION_START_TIME.strftime('%Y-%m-%d %H:%M:%S')}")
print(f"🧠 Model: {provider_name} o4-mini")
print(f"📁 Base Directory: {consult_path}")

# Agent execution summary
print("\n🤖 AGENT EXECUTION SUMMARY")
print("-" * 40)

agents_run = 0
agents_failed = 0

if RUN_ANALYSIS_AGENT:
    agents_run += 1
    status = "✅ COMPLETED" if not analysis_error else "❌ FAILED"
    print(f"{status} Analysis Agent (Codebase Analysis + Tool Testing)")
    if analysis_filename:
        print(f"   📄 Output: {analysis_filename}")
    if analysis_error:
        agents_failed += 1
        print(f"   ⚠️  Error: {str(analysis_error)[:100]}...")
    elif analysis_response:
        # Use .content when present, otherwise fall back to the string form
        report_text = getattr(analysis_response, 'content', None) or str(analysis_response)
        print(f"   📊 Report: {len(report_text)} characters")

if RUN_ARCHITECTURE_AGENT:
    agents_run += 1
    status = "✅ COMPLETED" if not architecture_error else "❌ FAILED"
    print(f"{status} Architecture Agent (Deep Analysis + Mermaid Diagrams)")
    if architecture_filename:
        print(f"   📄 Output: {architecture_filename}")
    if architecture_error:
        agents_failed += 1
        print(f"   ⚠️  Error: {str(architecture_error)[:100]}...")
    elif architecture_response:
        # Use .content when present, otherwise the string form, so plain-string responses are counted too
        report_text = getattr(architecture_response, 'content', None) or str(architecture_response)
        mermaid_count = report_text.count("```mermaid")
        print(f"   📊 Report: {len(report_text)} characters")
        print(f"   📈 Mermaid diagrams: {mermaid_count}")

if agents_run == 0:
    print("⏭️ No agents were executed (check RUN_* flags)")

# Summary statistics
print("\n📈 EXECUTION STATISTICS")
print("-" * 40)
print(f"🎯 Agents executed: {agents_run}")
print(f"✅ Successful runs: {agents_run - agents_failed}")
print(f"❌ Failed runs: {agents_failed}")
print(f"💾 Output files: {len([f for f in [analysis_filename, architecture_filename] if f])}")
print(f"🗂️  Session runs tracked: {len(EXECUTION_METADATA['agent_runs'])}")

# Configuration summary
print("\n⚙️ SESSION CONFIGURATION")
print("-" * 40)
print(f"📊 Analysis Agent: {'✅ Enabled' if RUN_ANALYSIS_AGENT else '❌ Disabled'}")
print(f"🏗️ Architecture Agent: {'✅ Enabled' if RUN_ARCHITECTURE_AGENT else '❌ Disabled'}")
print(f"📝 Execution Logs: {'✅ Shown' if SHOW_EXECUTION_LOGS else '❌ Hidden'}")

# Key improvements
print("\n🎯 KEY IMPROVEMENTS DELIVERED")
print("-" * 40)
improvements = [
    "✨ Enhanced message tracking with timestamps and session management",
    "🔧 Flexible termination criteria system for any agent task",
    "🏗️ Dual-agent approach with complementary analysis perspectives",
    "📊 Independent agent execution with configurable selection",
    "💾 Timestamped output files with comprehensive metadata",
    "🛡️ Robust error handling and recovery mechanisms"
]

for improvement in improvements:
    print(improvement)

print("\n🚀 Each agent can be run independently, with tracked and timestamped outputs.")
print("📈 The termination criteria are task-agnostic, so the same setup can be reused for other agent tasks.")

In [None]:
# Optional: Quick Test Cell
# You can use this cell for quick testing or debugging

print("🧪 This cell can be used for testing or experimentation")
print("✏️ Feel free to modify it as needed for your specific use case")

In [None]:
# Optional: Configuration Examples
# You can quickly modify agent execution settings here

print("🎛️ QUICK CONFIGURATION EXAMPLES")
print("="*40)
print()
print("To run ONLY the Analysis Agent:")
print("   RUN_ANALYSIS_AGENT = True")
print("   RUN_ARCHITECTURE_AGENT = False")
print()
print("To run ONLY the Architecture Agent:")
print("   RUN_ANALYSIS_AGENT = False") 
print("   RUN_ARCHITECTURE_AGENT = True")
print()
print("To run with minimal output:")
print("   SHOW_EXECUTION_LOGS = False")
print()
print("💡 Modify the configuration cell above and re-run the execution cells")
