<a href="https://colab.research.google.com/github/ShaliniAnandaPhD/Neuron/blob/main/Tutorial_5_Basic_Monitoring_Watching_Your_Agents_Work.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In previous tutorials, you've built agents that can communicate, remember, and make intelligent decisions. Now we're adding comprehensive monitoring so you can see exactly what's happening inside your agent systems in real time.

 What you'll build:

 • NeuroMonitor system for real-time agent observation

 • Performance metrics collection and analysis

 • Event logging and trace visualization

 • Health monitoring with alerts and notifications

 • Interactive debugging dashboard with live updates

 Why this matters:

 Production AI systems need observability to debug issues, optimize performance, and ensure reliability. Understanding how to monitor distributed agent systems is crucial for building scalable, maintainable AI applications.

 By the end, you'll understand:

 • How to instrument agent systems for observability

 • Real-time monitoring and alerting patterns

 • Performance bottleneck identification

 • Event correlation and trace analysis

 • Building production-ready monitoring dashboards

In [5]:
print("Tutorial 5: Basic Monitoring - Watching Your Agents Work")
print("=" * 58)
print()
print("Building comprehensive observability for intelligent agent systems...")
print()

Tutorial 5: Basic Monitoring - Watching Your Agents Work

Building comprehensive observability for intelligent agent systems...



In [6]:
# Essential imports
import uuid
import time
import threading
import queue
import json
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional, Set, Callable, Tuple
from enum import Enum
from collections import defaultdict, deque
import copy
from datetime import datetime, timedelta
import statistics

In [10]:
# Re-create the foundation types from previous tutorials
# (defined inline here so this notebook runs on its own)
AgentID = str
MessageID = str

class MessagePriority(Enum):
    LOW = 1
    NORMAL = 2
    HIGH = 3
    URGENT = 4

@dataclass
class Message:
    id: MessageID
    sender: AgentID
    recipients: List[AgentID]
    content: Any
    priority: MessagePriority = MessagePriority.NORMAL
    metadata: Dict[str, Any] = field(default_factory=dict)
    created_at: float = field(default_factory=time.time)

    @classmethod
    def create(cls, sender: AgentID, recipients: List[AgentID], content: Any,
               priority: MessagePriority = MessagePriority.NORMAL) -> 'Message':
        return cls(
            id=str(uuid.uuid4()),
            sender=sender,
            recipients=recipients,
            content=content,
            priority=priority
        )

class EventType(Enum):
    """
    Different types of events that can occur in the agent system

    This provides a structured way to categorize and filter
    system events for monitoring and debugging.
    """
    AGENT_STARTED = "agent_started"
    AGENT_STOPPED = "agent_stopped"
    MESSAGE_SENT = "message_sent"
    MESSAGE_RECEIVED = "message_received"
    MESSAGE_PROCESSED = "message_processed"
    ERROR_OCCURRED = "error_occurred"
    PERFORMANCE_ALERT = "performance_alert"
    CUSTOM_EVENT = "custom_event"

class AlertLevel(Enum):
    """Alert severity levels for monitoring system"""
    INFO = 1
    WARNING = 2
    ERROR = 3
    CRITICAL = 4

@dataclass
class MonitoringEvent:
    """
    A single monitoring event with context and metadata

    Events are the basic unit of observability - they capture
    what happened, when, and in what context.
    """
    id: str
    event_type: EventType
    agent_id: AgentID
    timestamp: float = field(default_factory=time.time)
    data: Dict[str, Any] = field(default_factory=dict)
    alert_level: AlertLevel = AlertLevel.INFO

    @classmethod
    def create(cls, event_type: EventType, agent_id: AgentID,
               data: Optional[Dict[str, Any]] = None,
               alert_level: AlertLevel = AlertLevel.INFO) -> 'MonitoringEvent':
        """Factory method to create monitoring events"""
        return cls(
            id=str(uuid.uuid4()),
            event_type=event_type,
            agent_id=agent_id,
            data=data or {},
            alert_level=alert_level
        )

@dataclass
class PerformanceMetrics:
    """
    Performance metrics for an agent or system component
    """
    component_id: str
    timestamp: float = field(default_factory=time.time)
    messages_processed: int = 0
    processing_time_total: float = 0.0
    processing_time_avg: float = 0.0
    messages_sent: int = 0
    messages_received: int = 0
    errors_total: int = 0
    queue_size: int = 0
    uptime_seconds: float = 0.0

    def calculate_derived_metrics(self):
        """Calculate derived metrics from base measurements"""
        if self.messages_processed > 0:
            self.processing_time_avg = self.processing_time_total / self.messages_processed

class NeuroMonitor:
    """
    Comprehensive monitoring system for agent networks

    NeuroMonitor provides real-time observability into agent behavior,
    performance metrics, and system health.
    """

    def __init__(self, max_events: int = 1000):
        self.events: deque = deque(maxlen=max_events)
        self.monitored_agents: Dict[AgentID, 'MonitoredAgent'] = {}
        self.metrics: Dict[str, PerformanceMetrics] = {}
        self.event_lock = threading.Lock()
        self.metrics_lock = threading.Lock()
        self.running = False
        self.monitor_thread = None

        # Statistics
        self.system_stats = {
            'start_time': time.time(),
            'total_events': 0,
            'events_by_type': defaultdict(int),
            'alerts_by_level': defaultdict(int)
        }

        print(f"🔍 NeuroMonitor initialized with max {max_events} events")

    def start(self):
        """Start the monitoring system"""
        if self.running:
            print("⚠️  NeuroMonitor is already running")
            return

        self.running = True
        self.monitor_thread = threading.Thread(
            target=self._monitoring_loop,
            daemon=True,
            name="NeuroMonitor"
        )
        self.monitor_thread.start()
        print("▶️  NeuroMonitor started")

    def stop(self):
        """Stop the monitoring system"""
        if not self.running:
            return

        self.running = False
        if self.monitor_thread:
            self.monitor_thread.join(timeout=2.0)
        print("⏹️  NeuroMonitor stopped")

    def register_agent(self, agent: 'MonitoredAgent'):
        """Register an agent for monitoring"""
        try:
            with self.event_lock:
                self.monitored_agents[agent.id] = agent

                with self.metrics_lock:
                    self.metrics[agent.id] = PerformanceMetrics(component_id=agent.id)

                print(f"📊 Registered agent for monitoring: {agent.name}")

            # Set monitor after registration to avoid deadlock
            agent._set_monitor(self)

        except Exception as e:
            print(f"❌ Error registering agent {agent.name}: {e}")
            # Remove from monitored agents if registration failed
            with self.event_lock:
                if agent.id in self.monitored_agents:
                    del self.monitored_agents[agent.id]
            raise

    def log_event(self, event: MonitoringEvent):
        """Log a monitoring event"""
        with self.event_lock:
            self.events.append(event)

            # Update statistics
            self.system_stats['total_events'] += 1
            self.system_stats['events_by_type'][event.event_type.value] += 1
            self.system_stats['alerts_by_level'][event.alert_level.value] += 1

        # Print important events
        if event.alert_level in [AlertLevel.WARNING, AlertLevel.ERROR, AlertLevel.CRITICAL]:
            level_emoji = {"WARNING": "⚠️", "ERROR": "❌", "CRITICAL": "🚨"}
            emoji = level_emoji.get(event.alert_level.name, "📊")
            agent_name = "Unknown"
            if event.agent_id in self.monitored_agents:
                agent_name = self.monitored_agents[event.agent_id].name

            print(f"{emoji} {event.alert_level.name}: {agent_name} - {event.event_type.value}")

    def update_metrics(self, agent_id: AgentID, metrics_update: Dict[str, Any]):
        """Update performance metrics for an agent"""
        with self.metrics_lock:
            if agent_id not in self.metrics:
                self.metrics[agent_id] = PerformanceMetrics(component_id=agent_id)

            current_metrics = self.metrics[agent_id]
            for key, value in metrics_update.items():
                if hasattr(current_metrics, key):
                    setattr(current_metrics, key, value)

            current_metrics.timestamp = time.time()
            current_metrics.calculate_derived_metrics()

    def get_recent_events(self, count: int = 50) -> List[MonitoringEvent]:
        """Get recent events"""
        with self.event_lock:
            return list(self.events)[-count:]

    def get_system_health(self) -> Dict[str, Any]:
        """Get overall system health summary"""
        with self.metrics_lock:
            active_agents = len(self.monitored_agents)
            total_messages = sum(m.messages_processed for m in self.metrics.values())
            total_errors = sum(m.errors_total for m in self.metrics.values())

            avg_processing_times = [m.processing_time_avg for m in self.metrics.values() if m.processing_time_avg > 0]
            system_avg_processing = statistics.mean(avg_processing_times) if avg_processing_times else 0.0

            uptime = time.time() - self.system_stats['start_time']

            return {
                'system_uptime_seconds': uptime,
                'active_agents': active_agents,
                'total_events': self.system_stats['total_events'],
                'total_messages_processed': total_messages,
                'total_errors': total_errors,
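                # Failed messages are not counted in messages_processed,
                # so the denominator is successes plus failures
                'error_rate': total_errors / max(total_messages + total_errors, 1),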
                'average_processing_time': system_avg_processing,
                'events_by_type': dict(self.system_stats['events_by_type'])
            }

    def get_agent_metrics(self, agent_id: AgentID) -> Optional[PerformanceMetrics]:
        """Get current metrics for a specific agent"""
        with self.metrics_lock:
            return self.metrics.get(agent_id)

    def _monitoring_loop(self):
        """Main monitoring loop"""
        print("🔄 NeuroMonitor background loop started")

        while self.running:
            try:
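                # Collect metrics from all agents. Iterate over a snapshot so that
                # register_agent() on another thread cannot resize the dict mid-loop.
                with self.event_lock:
                    agents_snapshot = list(self.monitored_agents.items())
                for agent_id, agent in agents_snapshot: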
                    try:
                        agent_metrics = agent.get_monitoring_metrics()
                        self.update_metrics(agent_id, agent_metrics)

                        # Check for performance alerts
                        metrics = self.metrics[agent_id]
                        if metrics.processing_time_avg > 0.5:  # Alert if avg > 0.5s
                            event = MonitoringEvent.create(
                                event_type=EventType.PERFORMANCE_ALERT,
                                agent_id=agent_id,
                                data={'avg_processing_time': metrics.processing_time_avg},
                                alert_level=AlertLevel.WARNING
                            )
                            self.log_event(event)

                    except Exception as e:
                        print(f"❌ Error collecting metrics for {agent_id[:8]}...: {e}")

                time.sleep(2.0)  # Check every 2 seconds

            except Exception as e:
                print(f"❌ Error in monitoring loop: {e}")
                time.sleep(1.0)

        print("🛑 NeuroMonitor background loop stopped")

class MonitoredAgent:
    """
    Base class for agents that can be monitored by NeuroMonitor
    """

    def __init__(self, agent_id: Optional[AgentID] = None, name: str = ""):
        self.id = agent_id or str(uuid.uuid4())
        self.name = name or self.__class__.__name__

        # Message processing
        self._message_queue = queue.Queue()
        self._stop_event = threading.Event()
        self._processing_thread = None

        # Monitoring integration
        self._monitor: Optional[NeuroMonitor] = None
        self._start_time = time.time()
        self._monitoring_metrics = {
            'messages_processed': 0,
            'processing_time_total': 0.0,
            'messages_sent': 0,
            'messages_received': 0,
            'errors_total': 0
        }
        self._metrics_lock = threading.Lock()
        self._running = False

        print(f"🤖 Initialized MonitoredAgent: {self.name} ({self.id[:8]}...)")

    def _set_monitor(self, monitor: NeuroMonitor):
        """Set the monitor for this agent"""
        self._monitor = monitor

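        # Log a registration event on a helper thread so the caller is not blocked.
        # Note: this reuses EventType.AGENT_STARTED and start() logs the same type,
        # so an agent that is registered and then started produces two
        # agent_started events.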
        if self._monitor:
            try:
                event = MonitoringEvent.create(
                    event_type=EventType.AGENT_STARTED,
                    agent_id=self.id,
                    data={'agent_name': self.name, 'agent_type': self.__class__.__name__}
                )
                # Use a separate thread to avoid blocking
                threading.Thread(
                    target=lambda: self._monitor.log_event(event),
                    daemon=True
                ).start()
            except Exception as e:
                print(f"⚠️  Warning: Could not log agent registration event: {e}")

    def start(self):
        """Start the agent with monitoring"""
        if self._running:
            print(f"⚠️  Agent {self.name} is already running")
            return

        self._stop_event.clear()
        self._processing_thread = threading.Thread(
            target=self._processing_loop,
            daemon=True,
            name=f"MonitoredAgent-{self.name}"
        )
        self._processing_thread.start()
        self._running = True

        print(f"▶️  MonitoredAgent {self.name} started")

        if self._monitor:
            event = MonitoringEvent.create(
                event_type=EventType.AGENT_STARTED,
                agent_id=self.id,
                data={'action': 'started'}
            )
            self._monitor.log_event(event)

    def stop(self):
        """Stop the agent with monitoring"""
        if not self._running:
            return

        self._stop_event.set()

        if self._processing_thread:
            self._processing_thread.join(timeout=2.0)

        self._running = False
        print(f"⏹️  MonitoredAgent {self.name} stopped")

        if self._monitor:
            event = MonitoringEvent.create(
                event_type=EventType.AGENT_STOPPED,
                agent_id=self.id,
                data={'uptime': time.time() - self._start_time}
            )
            self._monitor.log_event(event)

    def receive_message(self, message: Message):
        """Receive a message with monitoring"""
        self._message_queue.put(message)

        with self._metrics_lock:
            self._monitoring_metrics['messages_received'] += 1

        print(f"📬 {self.name} received: {str(message.content)[:50]}{'...' if len(str(message.content)) > 50 else ''}")

        if self._monitor:
            event = MonitoringEvent.create(
                event_type=EventType.MESSAGE_RECEIVED,
                agent_id=self.id,
                data={
                    'message_id': message.id,
                    'sender': message.sender,
                    'content_length': len(str(message.content))
                }
            )
            self._monitor.log_event(event)

    def send_message(self, recipients: List[AgentID], content: Any) -> Message:
        """Send a message with monitoring"""
        message = Message.create(
            sender=self.id,
            recipients=recipients,
            content=content
        )

        with self._metrics_lock:
            self._monitoring_metrics['messages_sent'] += 1

        print(f"📤 {self.name} sent: {str(content)[:50]}{'...' if len(str(content)) > 50 else ''}")

        if self._monitor:
            event = MonitoringEvent.create(
                event_type=EventType.MESSAGE_SENT,
                agent_id=self.id,
                data={
                    'message_id': message.id,
                    'recipients': recipients,
                    'content_length': len(str(content))
                }
            )
            self._monitor.log_event(event)

        return message

    def process_message(self, message: Message):
        """Process a message - override this in subclasses"""
        print(f"🎯 {self.name} processing: {message.content}")

        # Simulate processing
        time.sleep(0.01)

        # Echo response
        response = self.send_message(
            recipients=[message.sender],
            content=f"Echo: {message.content}"
        )

        if self._monitor:
            event = MonitoringEvent.create(
                event_type=EventType.MESSAGE_PROCESSED,
                agent_id=self.id,
                data={
                    'input_message_id': message.id,
                    'output_message_id': response.id
                }
            )
            self._monitor.log_event(event)

    def _processing_loop(self):
        """Main message processing loop with monitoring"""
        print(f"🔄 {self.name} processing loop started")

        while not self._stop_event.is_set():
            try:
                message = self._message_queue.get(timeout=0.1)

                start_time = time.time()
                try:
                    self.process_message(message)
                finally:
                    # Mark the item done even if processing raised, so a
                    # later queue.join() cannot block forever
                    self._message_queue.task_done()
                processing_time = time.time() - start_time

                with self._metrics_lock:
                    self._monitoring_metrics['messages_processed'] += 1
                    self._monitoring_metrics['processing_time_total'] += processing_time

                print(f"✅ {self.name} processed message in {processing_time:.3f}s")

            except queue.Empty:
                continue
            except Exception as e:
                print(f"❌ Error processing message in {self.name}: {e}")

                with self._metrics_lock:
                    self._monitoring_metrics['errors_total'] += 1

                if self._monitor:
                    event = MonitoringEvent.create(
                        event_type=EventType.ERROR_OCCURRED,
                        agent_id=self.id,
                        data={
                            'error_type': type(e).__name__,
                            'error_message': str(e)
                        },
                        alert_level=AlertLevel.ERROR
                    )
                    self._monitor.log_event(event)

        print(f"🛑 {self.name} processing loop stopped")

    def get_monitoring_metrics(self) -> Dict[str, Any]:
        """Get current monitoring metrics for this agent"""
        with self._metrics_lock:
            metrics = self._monitoring_metrics.copy()

        uptime = time.time() - self._start_time
        queue_size = self._message_queue.qsize()

        metrics.update({
            'uptime_seconds': uptime,
            'queue_size': queue_size,
            'timestamp': time.time()
        })

        return metrics

class SmartMonitoredAgent(MonitoredAgent):
    """
    A sophisticated monitored agent with personality-based behaviors
    """

    def __init__(self, agent_id: Optional[AgentID] = None, name: str = "",
                 personality: str = "balanced"):
        super().__init__(agent_id, name)
        self.personality = personality
        self.conversation_count = 0

        print(f"🧠 SmartMonitoredAgent personality: {personality}")

    def process_message(self, message: Message):
        """Smart message processing with personality"""
        content = str(message.content).lower()
        self.conversation_count += 1

        print(f"🎯 {self.name} ({self.personality}) processing: {message.content}")

        # Determine response based on personality
        response = self._generate_response(content)

        # Add personality-based delay
        if self.personality == "thoughtful":
            time.sleep(0.05)
        elif self.personality == "quick":
            time.sleep(0.01)
        else:
            time.sleep(0.02)

        # Send response
        response_msg = self.send_message(
            recipients=[message.sender],
            content=response
        )

        # Log processing
        if self._monitor:
            event = MonitoringEvent.create(
                event_type=EventType.MESSAGE_PROCESSED,
                agent_id=self.id,
                data={
                    'personality': self.personality,
                    'conversation_count': self.conversation_count,
                    'input_message_id': message.id,
                    'output_message_id': response_msg.id
                }
            )
            self._monitor.log_event(event)

    def _generate_response(self, content: str) -> str:
        """Generate personality-based responses"""
        if self.personality == "friendly":
            if "hello" in content:
                return "Hello there! It's wonderful to meet you! 😊"
            elif "?" in content:
                return "That's such an interesting question! I love chatting!"
            else:
                return "That's really cool! Thanks for sharing!"

        elif self.personality == "analytical":
            if "hello" in content:
                return "Hello. I'm ready to analyze topics you'd like to discuss."
            elif "?" in content:
                return "Let me analyze this systematically. Based on patterns..."
            else:
                return "Interesting. Let me process this information."

        elif self.personality == "helpful":
            if "hello" in content:
                return "Hello! How can I help you today?"
            elif "?" in content:
                return "I'd be happy to help answer that!"
            else:
                return "Thanks! Is there anything I can help you with?"

        elif self.personality == "thoughtful":
            if "hello" in content:
                return "Hello... *pauses thoughtfully* Good to meet you."
            elif "?" in content:
                return "Hmm, that's profound. Let me think carefully..."
            else:
                return "I see... *considers deeply* Much to reflect upon."

        elif self.personality == "quick":
            if "hello" in content:
                return "Hi! Quick chat?"
            elif "?" in content:
                return "Fast answer: depends on context!"
            else:
                return "Got it! Next?"

        else:  # balanced
            if "hello" in content:
                return "Hello! Nice to meet you."
            elif "?" in content:
                return "That's a good question. Let me think about it."
            else:
                return "Thanks for sharing that."

# =============================================================================
# INITIALIZATION COMPLETE
# =============================================================================

print("🔧 Tutorial 5 initialization complete!")
print("✅ All classes loaded successfully:")
print("   - EventType and AlertLevel enums")
print("   - MonitoringEvent and PerformanceMetrics classes")
print("   - NeuroMonitor comprehensive monitoring system")
print("   - MonitoredAgent base class with monitoring")
print("   - SmartMonitoredAgent with personality behaviors")
print()
print("🚀 Ready to start monitoring intelligent agent systems!")
print()


🔧 Tutorial 5 initialization complete!
✅ All classes loaded successfully:
   - EventType and AlertLevel enums
   - MonitoringEvent and PerformanceMetrics classes
   - NeuroMonitor comprehensive monitoring system
   - MonitoredAgent base class with monitoring
   - SmartMonitoredAgent with personality behaviors

🚀 Ready to start monitoring intelligent agent systems!
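
One detail of the classes above is worth seeing in isolation: `NeuroMonitor.events` is a `deque(maxlen=max_events)`, so once the buffer is full the oldest events are dropped, while `system_stats['total_events']` keeps counting. The following is a minimal standalone sketch of that behavior; `buffer` and `total_seen` are illustrative names and not part of the tutorial code.

```python
from collections import deque

# Standalone illustration of the bounded buffer behind NeuroMonitor.events.
# A tiny maxlen is used here so the eviction is easy to see.
buffer = deque(maxlen=3)
total_seen = 0

for event_number in range(1, 6):
    buffer.append(event_number)  # once full, the oldest entry is evicted
    total_seen += 1

retained = list(buffer)
dropped = total_seen - len(buffer)
print(f"retained={retained} total_seen={total_seen} dropped={dropped}")
# prints: retained=[3, 4, 5] total_seen=5 dropped=2
```

In practice this means `get_recent_events()` can return fewer events than the cumulative counters suggest once more than `max_events` events have been logged.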



In [11]:
# DEMO SECTION: Let's monitor our agents in real-time!
# =============================================================================

print("=" * 60)
print("🚀 Tutorial 5: Basic Monitoring - Watching Your Agents Work")
print("=" * 60)
print()

# Step 1: Create the monitoring system
print("📝 Step 1: Creating the NeuroMonitor system...")
monitor = NeuroMonitor(max_events=500)
monitor.start()
print()

🚀 Tutorial 5: Basic Monitoring - Watching Your Agents Work

📝 Step 1: Creating the NeuroMonitor system...
🔍 NeuroMonitor initialized with max 500 events
🔄 NeuroMonitor background loop started
▶️  NeuroMonitor started
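
The background loop started here sleeps with `time.sleep(2.0)` between collection cycles, so a call to `stop()` may have to wait for that sleep to finish. One possible alternative, shown as a sketch rather than as the tutorial's implementation, is to wait on a `threading.Event` so the loop wakes up as soon as it is asked to stop. `PollingLoop` is a hypothetical stand-in class, not `NeuroMonitor`.

```python
import threading
import time

class PollingLoop:
    """Hypothetical stand-in for a monitor thread; not the tutorial's NeuroMonitor."""

    def __init__(self, interval: float = 2.0):
        self.interval = interval
        self.ticks = 0
        self._stop_event = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self) -> None:
        self._thread.start()

    def stop(self) -> None:
        self._stop_event.set()           # wakes the loop immediately
        self._thread.join(timeout=5.0)

    def _run(self) -> None:
        while not self._stop_event.is_set():
            self.ticks += 1              # placeholder for metric collection
            # wait() returns early as soon as stop() sets the event
            self._stop_event.wait(self.interval)

loop = PollingLoop(interval=2.0)
loop.start()
time.sleep(0.1)                          # let the first cycle run

started = time.monotonic()
loop.stop()
stop_latency = time.monotonic() - started
print(f"stopped in {stop_latency:.3f}s after {loop.ticks} tick(s)")
```

With this pattern the stop latency should be well below the 2-second polling interval, because the wait is interrupted instead of running to completion.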



In [12]:
# Step 2: Create monitored agents with different personalities
print("📝 Step 2: Creating monitored agents with different personalities...")

# Create agents
alice = SmartMonitoredAgent(name="Alice", personality="friendly")
bob = SmartMonitoredAgent(name="Bob", personality="analytical")
charlie = SmartMonitoredAgent(name="Charlie", personality="helpful")
diana = SmartMonitoredAgent(name="Diana", personality="thoughtful")
erik = SmartMonitoredAgent(name="Erik", personality="quick")

agents = [alice, bob, charlie, diana, erik]

# Register agents with monitor (with safety checks)
print("   Registering agents with monitor...")
for i, agent in enumerate(agents, 1):
    try:
        print(f"   🔄 Registering {agent.name}...")
        monitor.register_agent(agent)
        print(f"   ✅ Registered {agent.name} ({i}/{len(agents)})")
    except Exception as e:
        print(f"   ❌ Failed to register {agent.name}: {e}")
        continue

# Small delay for registration
time.sleep(0.5)

# Start all agents
print("   Starting agent processing threads...")
for i, agent in enumerate(agents, 1):
    agent.start()
    print(f"   ▶️  Started {agent.name} ({i}/{len(agents)})")
    time.sleep(0.1)

# Verify agents are running
print("   Verifying agent status...")
running_count = 0
for agent in agents:
    if agent._running:
        running_count += 1
        print(f"   ✅ {agent.name} is running")
    else:
        print(f"   ❌ {agent.name} failed to start")

print(f"   Successfully started {running_count}/{len(agents)} agents")
print()


📝 Step 2: Creating monitored agents with different personalities...
🤖 Initialized MonitoredAgent: Alice (dec8ba13...)
🧠 SmartMonitoredAgent personality: friendly
🤖 Initialized MonitoredAgent: Bob (e0ba6081...)
🧠 SmartMonitoredAgent personality: analytical
🤖 Initialized MonitoredAgent: Charlie (bb465d88...)
🧠 SmartMonitoredAgent personality: helpful
🤖 Initialized MonitoredAgent: Diana (785bbe44...)
🧠 SmartMonitoredAgent personality: thoughtful
🤖 Initialized MonitoredAgent: Erik (fc5d3552...)
🧠 SmartMonitoredAgent personality: quick
   Registering agents with monitor...
   🔄 Registering Alice...
📊 Registered agent for monitoring: Alice
   ✅ Registered Alice (1/5)
   🔄 Registering Bob...
📊 Registered agent for monitoring: Bob
   ✅ Registered Bob (2/5)
   🔄 Registering Charlie...
📊 Registered agent for monitoring: Charlie
   ✅ Registered Charlie (3/5)
   🔄 Registering Diana...
📊 Registered agent for monitoring: Diana
   ✅ Registered Diana (4/5)
   🔄 Registering Erik...
📊 Registered agent f

In [13]:
# Step 3: Simulate conversations to generate monitoring data
print("📝 Step 3: Simulating conversations to generate monitoring data...")

conversations = [
    ("Hello everyone!", [alice, bob, charlie]),
    ("What do you think about AI?", [alice, diana]),
    ("Can someone help me?", [bob, charlie]),
    ("How are you doing?", [alice, diana, erik]),
    ("Thanks and goodbye!", [alice, bob, charlie, diana, erik])
]

for i, (message_content, target_agents) in enumerate(conversations, 1):
    print(f"--- Conversation {i} ---")
    print(f"Broadcasting: '{message_content}'")

    for agent in target_agents:
        test_msg = Message.create(
            sender="user_simulator",
            recipients=[agent.id],
            content=message_content
        )
        agent.receive_message(test_msg)

    time.sleep(1.0)
    print()


📝 Step 3: Simulating conversations to generate monitoring data...
--- Conversation 1 ---
Broadcasting: 'Hello everyone!'
📬 Alice received: Hello everyone!
📬 Bob received: Hello everyone!
📬 Charlie received: Hello everyone!
🎯 Bob (analytical) processing: Hello everyone!
🎯 Alice (friendly) processing: Hello everyone!
🎯 Charlie (helpful) processing: Hello everyone!
📤 Bob sent: Hello. I'm ready to analyze topics you'd like to d...
✅ Bob processed message in 0.020s
📤 Alice sent: Hello there! It's wonderful to meet you! 😊
✅ Alice processed message in 0.021s
📤 Charlie sent: Hello! How can I help you today?
✅ Charlie processed message in 0.020s

--- Conversation 2 ---
Broadcasting: 'What do you think about AI?'
📬 Alice received: What do you think about AI?
📬 Diana received: What do you think about AI?
🎯 Alice (friendly) processing: What do you think about AI?
🎯 Diana (thoughtful) processing: What do you think about AI?
📤 Alice sent: That's such an interesting question! I love chatti...
✅ Alice
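
The events logged during these conversations carry enough ids to reconstruct a per-message trace: `message_received` and `message_sent` store `message_id`, and `message_processed` stores `input_message_id` and `output_message_id`. The sketch below shows one way such correlation could work. It uses plain dictionaries with made-up ids (`m1`, `r1`, and so on) instead of real `MonitoringEvent` objects, and `build_traces` is a hypothetical helper.

```python
from collections import defaultdict
from typing import Any, Dict, List

# Hand-written sample events shaped like MonitoringEvent.data in this tutorial.
# The ids are invented for illustration.
events: List[Dict[str, Any]] = [
    {"type": "message_received", "agent": "Alice", "data": {"message_id": "m1"}},
    {"type": "message_received", "agent": "Bob", "data": {"message_id": "m2"}},
    {"type": "message_sent", "agent": "Bob", "data": {"message_id": "r2"}},
    {"type": "message_processed", "agent": "Bob",
     "data": {"input_message_id": "m2", "output_message_id": "r2"}},
    {"type": "message_sent", "agent": "Alice", "data": {"message_id": "r1"}},
    {"type": "message_processed", "agent": "Alice",
     "data": {"input_message_id": "m1", "output_message_id": "r1"}},
]

def build_traces(events: List[Dict[str, Any]]) -> Dict[str, List[str]]:
    """Group event types into one trace per incoming message id."""
    # Map each reply id back to the request that produced it.
    reply_to_request = {
        e["data"]["output_message_id"]: e["data"]["input_message_id"]
        for e in events if e["type"] == "message_processed"
    }
    traces: Dict[str, List[str]] = defaultdict(list)
    for e in events:
        data = e["data"]
        if e["type"] == "message_processed":
            root = data["input_message_id"]
        else:
            message_id = data["message_id"]
            # A sent reply belongs to the trace of its originating request.
            root = reply_to_request.get(message_id, message_id)
        traces[root].append(e["type"])
    return dict(traces)

traces = build_traces(events)
for root, steps in traces.items():
    print(root, "->", steps)
```

Each trace reads received, sent, processed, which mirrors the order in which `process_message` logs its events: the reply is sent before the processed event is recorded.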

In [14]:
# Step 4: Wait for metrics collection and check system health
print("📝 Step 4: Analyzing system performance...")
print("   Waiting for metrics collection...")
time.sleep(3.0)

# Get system health
health = monitor.get_system_health()
print("   System Health Overview:")
for key, value in health.items():
    if isinstance(value, float):
        if 'time' in key or 'rate' in key:
            print(f"     {key}: {value:.3f}")
        else:
            print(f"     {key}: {value:.2f}")
    elif isinstance(value, dict):
        print(f"     {key}:")
        for sub_key, sub_value in value.items():
            print(f"       {sub_key}: {sub_value}")
    else:
        print(f"     {key}: {value}")

print()


📝 Step 4: Analyzing system performance...
   Waiting for metrics collection...
   System Health Overview:
     system_uptime_seconds: 54.629
     active_agents: 5
     total_events: 55
     total_messages_processed: 15
     total_errors: 0
     error_rate: 0.000
     average_processing_time: 0.024
     events_by_type:
       agent_started: 10
       message_received: 15
       message_sent: 15
       message_processed: 15
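
The numbers in this health summary can be cross-checked against each other. Every logged event has exactly one type, so the per-type counts should add up to `total_events`. The `agent_started` count of 10 for 5 agents follows from the code above, where both registration and `start()` log `EventType.AGENT_STARTED`. The snippet below is a small sanity-check sketch with the values copied by hand from the output; it does not query the monitor.

```python
# Counts copied from the Step 4 output above.
total_events = 55
events_by_type = {
    "agent_started": 10,
    "message_received": 15,
    "message_sent": 15,
    "message_processed": 15,
}
total_messages_processed = 15

# Every logged event has exactly one type, so the per-type counts should add up.
type_sum = sum(events_by_type.values())

# In this demo each processed message is received once and answered once.
balanced = (
    events_by_type["message_received"]
    == events_by_type["message_sent"]
    == events_by_type["message_processed"]
    == total_messages_processed
)

print(f"sum of per-type counts: {type_sum} (reported total: {total_events})")
print(f"received/sent/processed balanced: {balanced}")
# prints: sum of per-type counts: 55 (reported total: 55)
# prints: received/sent/processed balanced: True
```

A mismatch in a check like this would hint at events being logged without a counter update, or at messages still waiting in a queue when the snapshot was taken.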



In [15]:
# Step 5: Individual agent metrics
print("📝 Step 5: Individual agent performance analysis...")
for agent in agents:
    metrics = monitor.get_agent_metrics(agent.id)
    if metrics:
        print(f"   {agent.name} ({agent.personality}):")
        print(f"     Messages processed: {metrics.messages_processed}")
        print(f"     Messages sent: {metrics.messages_sent}")
        print(f"     Average processing time: {metrics.processing_time_avg:.3f}s")
        print(f"     Queue size: {metrics.queue_size}")
        print(f"     Errors: {metrics.errors_total}")
        print(f"     Uptime: {metrics.uptime_seconds/60:.1f} minutes")

print()


📝 Step 5: Individual agent performance analysis...
   Alice (friendly):
     Messages processed: 4
     Messages sent: 4
     Average processing time: 0.020s
     Queue size: 0
     Errors: 0
     Uptime: 0.9 minutes
   Bob (analytical):
     Messages processed: 3
     Messages sent: 3
     Average processing time: 0.021s
     Queue size: 0
     Errors: 0
     Uptime: 0.9 minutes
   Charlie (helpful):
     Messages processed: 3
     Messages sent: 3
     Average processing time: 0.021s
     Queue size: 0
     Errors: 0
     Uptime: 0.9 minutes
   Diana (thoughtful):
     Messages processed: 3
     Messages sent: 3
     Average processing time: 0.050s
     Queue size: 0
     Errors: 0
     Uptime: 0.9 minutes
   Erik (quick):
     Messages processed: 2
     Messages sent: 2
     Average processing time: 0.010s
     Queue size: 0
     Errors: 0
     Uptime: 0.9 minutes
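
One way to act on these per-agent numbers is to compare each average against the group. The following is a minimal sketch, not part of NeuroMonitor: the `flag_slow_agents` helper and the "more than 2x the median" rule are assumptions chosen for illustration, and the sample values are copied from the output above.

```python
import statistics


def flag_slow_agents(avg_times, factor=2.0):
    """Return names whose average time exceeds factor x the group median.

    Hypothetical helper: the 2x-median rule is an illustrative assumption.
    """
    median = statistics.median(avg_times.values())
    return sorted(name for name, t in avg_times.items() if t > factor * median)


# Average processing times in seconds, copied from the Step 5 output
avg_times = {
    "Alice": 0.020,
    "Bob": 0.021,
    "Charlie": 0.021,
    "Diana": 0.050,
    "Erik": 0.010,
}

slow = flag_slow_agents(avg_times)
print(f"Agents slower than 2x the median: {slow}")
```

A median-based rule is less sensitive to a single slow agent than a mean-based one, which matters with only five agents.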



In [16]:
# Step 6: Event analysis
print("📝 Step 6: Event analysis...")
recent_events = monitor.get_recent_events(20)
print(f"   Recent events: {len(recent_events)}")

event_summary = defaultdict(int)
for event in recent_events:
    event_summary[event.event_type.value] += 1

print("   Event breakdown:")
for event_type, count in event_summary.items():
    print(f"     {event_type}: {count}")

print()

📝 Step 6: Event analysis...
   Recent events: 20
   Event breakdown:
     message_processed: 8
     message_sent: 7
     message_received: 5
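
The same breakdown can be produced with `collections.Counter`, which also sorts by frequency. This is a sketch under the assumption that event types are available as plain strings (the `summarize_event_types` name is hypothetical); with real events you would pass `e.event_type.value` for each event.

```python
from collections import Counter


def summarize_event_types(event_types):
    """Count events per type, returned as (type, count) pairs, most frequent first."""
    return Counter(event_types).most_common()


# Plain-string event types reproducing the counts printed above
sample = (
    ["message_processed"] * 8
    + ["message_sent"] * 7
    + ["message_received"] * 5
)

summary = summarize_event_types(sample)
for event_type, count in summary:
    print(f"     {event_type}: {count}")
```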



In [17]:
# Step 7: Test error conditions
print("📝 Step 7: Testing error monitoring...")

# Simulate high load
print("   Simulating high load on Alice...")
for i in range(10):
    load_msg = Message.create(
        sender="load_tester",
        recipients=[alice.id],
        content=f"Load test message {i+1}"
    )
    alice.receive_message(load_msg)

time.sleep(2.0)

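# Check for alerts. Each message in the burst can log several events
# (received, sent, processed), so inspect a wider window than the last 10
# events to avoid missing alerts raised early in the burst.
recent_events = monitor.get_recent_events(50)
alerts = [e for e in recent_events if e.alert_level != AlertLevel.INFO]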
print(f"   Performance alerts generated: {len(alerts)}")

print()

📝 Step 7: Testing error monitoring...
   Simulating high load on Alice...
📬 Alice received: Load test message 1
📬 Alice received: Load test message 2
📬 Alice received: Load test message 3
📬 Alice received: Load test message 4
📬 Alice received: Load test message 5
📬 Alice received: Load test message 6
📬 Alice received: Load test message 7
📬 Alice received: Load test message 8
📬 Alice received: Load test message 9
📬 Alice received: Load test message 10
🎯 Alice (friendly) processing: Load test message 1
📤 Alice sent: That's really cool! Thanks for sharing!
✅ Alice processed message in 0.020s
🎯 Alice (friendly) processing: Load test message 2
📤 Alice sent: That's really cool! Thanks for sharing!
✅ Alice processed message in 0.020s
🎯 Alice (friendly) processing: Load test message 3
📤 Alice sent: That's really cool! Thanks for sharing!
✅ Alice processed message in 0.020s
🎯 Alice (friendly) processing: Load test message 4
📤 Alice sent: That's really cool! Thanks for sharing!
✅ Alice processed
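
Step 7 counts events whose alert level is not `INFO`, but how an event earns a higher level is not shown in this section. One common approach is a threshold check on queue size. The sketch below is an illustration under stated assumptions: `Level`, `QueueThresholds`, `classify_queue`, and the threshold values 5 and 20 are all hypothetical and stand in for whatever the tutorial's `AlertLevel` and monitor actually use.

```python
from dataclasses import dataclass
from enum import Enum


class Level(Enum):
    """Hypothetical stand-in for the tutorial's AlertLevel."""
    INFO = "info"
    WARNING = "warning"
    CRITICAL = "critical"


@dataclass(frozen=True)
class QueueThresholds:
    """Illustrative thresholds; tune them to your own workload."""
    warning: int = 5
    critical: int = 20


def classify_queue(queue_size, thresholds=QueueThresholds()):
    """Map a queue size to an alert level, checking the highest level first."""
    if queue_size >= thresholds.critical:
        return Level.CRITICAL
    if queue_size >= thresholds.warning:
        return Level.WARNING
    return Level.INFO


for size in (0, 10, 25):
    print(f"queue_size={size}: {classify_queue(size).value}")
```

With thresholds like these, a burst of ten queued messages would raise a warning, while the idle queues seen in Step 5 would stay at the informational level.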

In [19]:
# Step 8: Visualize if possible
print("📝 Step 8: Creating monitoring visualization...")

try:
    import plotly.graph_objects as go
    from plotly.subplots import make_subplots

    # Prepare data
    agent_names = []
    messages_processed = []
    processing_times = []
    personalities = []

    for agent in agents:
        metrics = monitor.get_agent_metrics(agent.id)
        if metrics:
            agent_names.append(agent.name)
            messages_processed.append(metrics.messages_processed)
            processing_times.append(metrics.processing_time_avg)
            personalities.append(agent.personality)

    # Create dashboard
    fig = make_subplots(
        rows=2, cols=2,
        subplot_titles=(
            'Message Processing by Agent',
            'Processing Time by Personality',
            'System Health Score',
            'Event Distribution'
        ),
        specs=[[{"type": "xy"}, {"type": "xy"}],
               [{"type": "xy"}, {"type": "domain"}]]  # domain type for pie chart
    )

    # Messages processed
    fig.add_trace(
        go.Bar(
            x=agent_names,
            y=messages_processed,
            name='Messages',
            marker_color='lightblue'
        ),
        row=1, col=1
    )

    # Processing times
    fig.add_trace(
        go.Scatter(
            x=personalities,
            y=processing_times,
            mode='markers',
            name='Processing Time',
            marker=dict(size=10, color='red')
        ),
        row=1, col=2
    )

    # System health (mock trend: placeholder values, not measured by the monitor)
    health_times = list(range(10))
    health_scores = [95 + i * 0.5 for i in range(10)]
    fig.add_trace(
        go.Scatter(
            x=health_times,
            y=health_scores,
            mode='lines+markers',
            name='Health %',
            line=dict(color='green')
        ),
        row=2, col=1
    )

    # Event distribution (fetch a fresh snapshot so recently logged events are included)
    event_counts = monitor.get_system_health()['events_by_type']
    fig.add_trace(
        go.Pie(
            labels=list(event_counts.keys()),
            values=list(event_counts.values()),
            name="Events"
        ),
        row=2, col=2
    )

    # Update layout
    fig.update_layout(
        title_text="NeuroMonitor Dashboard - Real-time Agent Monitoring",
        height=800,
        showlegend=True,
        template='plotly_white'
    )

    # Update axes
    fig.update_xaxes(title_text="Agents", row=1, col=1)
    fig.update_yaxes(title_text="Message Count", row=1, col=1)

    fig.update_xaxes(title_text="Personality", row=1, col=2)
    fig.update_yaxes(title_text="Processing Time (s)", row=1, col=2)

    fig.update_xaxes(title_text="Time", row=2, col=1)
    fig.update_yaxes(title_text="Health Score %", row=2, col=1)

    fig.show()

    print("   ✅ Monitoring dashboard created!")
    print("   📊 The dashboard shows:")
    print("      - Message processing capacity per agent")
    print("      - Processing time by personality type")
    print("      - System health trend over time")
    print("      - Distribution of event types")
    print()

except ImportError:
    print("   ⚠️  Plotly not available - skipping visualization")
    print("   💡 To see monitoring visualizations, install plotly: pip install plotly")
    print("   📊 Monitoring summary:")
    print(f"      Total events: {health['total_events']}")
    print(f"      Active agents: {health['active_agents']}")
    print(f"      System uptime: {health['system_uptime_seconds']/60:.1f} minutes")
    print()



📝 Step 8: Creating monitoring visualization...


   ✅ Monitoring dashboard created!
   📊 The dashboard shows:
      - Message processing capacity per agent
      - Processing time by personality type
      - System health trend over time
      - Distribution of event types
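
When Plotly is unavailable, a text chart can still convey the event distribution. This is a minimal sketch: `text_bar_chart` is a hypothetical helper, and the counts are copied from the `events_by_type` output in Step 4.

```python
def text_bar_chart(counts, width=30):
    """Render a mapping of label -> count as lines of '#' bars scaled to `width`."""
    if not counts:
        return []
    peak = max(counts.values())
    lines = []
    for label, value in counts.items():
        # Scale each bar relative to the largest count; guard against peak == 0
        bar = "#" * (round(width * value / peak) if peak else 0)
        lines.append(f"{label:<20} {bar} {value}")
    return lines


# Counts copied from the Step 4 system health output
events_by_type = {
    "agent_started": 10,
    "message_received": 15,
    "message_sent": 15,
    "message_processed": 15,
}

chart = text_bar_chart(events_by_type)
print("\n".join(chart))
```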



In [20]:
# Step 9: Performance analysis
print("📝 Step 9: Performance analysis and optimization insights...")

# Find best and worst performers
best_performer = None
worst_performer = None
best_time = float('inf')
worst_time = 0.0

for agent in agents:
    metrics = monitor.get_agent_metrics(agent.id)
    if metrics and metrics.messages_processed > 0:
        avg_time = metrics.processing_time_avg
        if avg_time < best_time:
            best_time = avg_time
            best_performer = agent
        if avg_time > worst_time:
            worst_time = avg_time
            worst_performer = agent

if best_performer and worst_performer:
    print(f"   🏆 Best performer: {best_performer.name} ({best_performer.personality})")
    print(f"      Average processing time: {best_time:.3f}s")
    print(f"   🐌 Slowest performer: {worst_performer.name} ({worst_performer.personality})")
    print(f"      Average processing time: {worst_time:.3f}s")

    # Guard the divisor: best_time is the denominator of the ratio
    if best_time > 0:
        speedup = worst_time / best_time
        print(f"   📈 Performance difference: {speedup:.1f}x")

print()


📝 Step 9: Performance analysis and optimization insights...
   🏆 Best performer: Erik (quick)
      Average processing time: 0.010s
   🐌 Slowest performer: Diana (thoughtful)
      Average processing time: 0.050s
   📈 Performance difference: 4.9x
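
The printed averages are rounded to three decimals, so the ratio computed from the unrounded values (4.9x here) can differ slightly from the ratio of the displayed values (0.050 / 0.010, about 5.0). The same search can also be written with `min` and `max`. This is a sketch with a hypothetical `compare_extremes` helper, using the rounded values from Step 5.

```python
def compare_extremes(avg_times):
    """Return (fastest, slowest, ratio) for a mapping of name -> average seconds.

    Entries with a non-positive time are skipped so the ratio never divides by zero.
    Returns None when no usable entries remain.
    """
    timed = {name: t for name, t in avg_times.items() if t > 0}
    if not timed:
        return None
    fastest = min(timed, key=timed.get)
    slowest = max(timed, key=timed.get)
    return fastest, slowest, timed[slowest] / timed[fastest]


# Rounded averages (seconds) copied from the Step 5 output
result = compare_extremes({
    "Alice": 0.020,
    "Bob": 0.021,
    "Charlie": 0.021,
    "Diana": 0.050,
    "Erik": 0.010,
})

fastest, slowest, ratio = result
print(f"Fastest: {fastest}, slowest: {slowest}, ratio: {ratio:.1f}x")
```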



In [22]:
# Step 10: Cleanup and final summary
print("📝 Step 10: Final analysis and cleanup...")

# Final metrics
final_health = monitor.get_system_health()
print("   Final System Statistics:")
print(f"     Runtime: {final_health['system_uptime_seconds']/60:.1f} minutes")
print(f"     Total events: {final_health['total_events']}")
print(f"     Total messages: {final_health['total_messages_processed']}")
print(f"     Error rate: {final_health['error_rate']:.2%}")
print(f"     Avg processing time: {final_health['average_processing_time']:.3f}s")

# Calculate throughput
total_runtime = final_health['system_uptime_seconds']
total_messages = final_health['total_messages_processed']
if total_runtime > 0:
    throughput = total_messages / total_runtime
    print(f"     System throughput: {throughput:.1f} messages/second")

print()

# Stop everything
print("   Stopping all agents and monitoring...")
for agent in agents:
    agent.stop()

monitor.stop()

print("✅ Tutorial 5 Complete!")
print()


📝 Step 10: Final analysis and cleanup...
   Final System Statistics:
     Runtime: 3.8 minutes
     Total events: 85
     Total messages: 25
     Error rate: 0.00%
     Avg processing time: 0.024s
     System throughput: 0.1 messages/second

   Stopping all agents and monitoring...
🛑 Alice processing loop stopped
⏹️  MonitoredAgent Alice stopped
🛑 Bob processing loop stopped
⏹️  MonitoredAgent Bob stopped
🛑 Charlie processing loop stopped
⏹️  MonitoredAgent Charlie stopped
🛑 Diana processing loop stopped
⏹️  MonitoredAgent Diana stopped
🛑 Erik processing loop stopped
⏹️  MonitoredAgent Erik stopped
🛑 NeuroMonitor background loop stopped
⏹️  NeuroMonitor stopped
✅ Tutorial 5 Complete!
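
The throughput figure divides total messages by total uptime, so idle time between steps pulls it down: 25 messages over roughly 3.8 minutes (about 228 seconds) is about 0.11 messages per second. A small worked sketch of that arithmetic follows; the `throughput` helper is hypothetical, and the runtime is the rounded value printed above, so the result is approximate.

```python
def throughput(messages, seconds):
    """Messages per second; returns 0.0 when no time has elapsed."""
    return messages / seconds if seconds > 0 else 0.0


runtime_s = 3.8 * 60   # runtime as printed above, rounded to 0.1 minute
total_messages = 25    # total messages as printed above

rate = throughput(total_messages, runtime_s)
print(f"{rate:.2f} messages/second ({rate * 60:.1f} messages/minute)")
```

For a low-traffic demo like this one, messages per minute is often the more readable unit, and measuring only the time agents spend processing would give a better picture of capacity.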



In [23]:
# SUMMARY OF WHAT WE LEARNED
# =============================================================================

print("📚 WHAT WE LEARNED:")
print("=" * 40)
print("1. 🔍 Built a comprehensive monitoring system")
print("   - Event-driven monitoring with structured logging")
print("   - Performance metrics collection and analysis")
print("   - Real-time alerting with customizable thresholds")
print("   - Multi-agent system observability")
print()
print("2. 📊 Implemented production-ready monitoring patterns")
print("   - Automatic instrumentation of agent behaviors")
print("   - Health monitoring and trend analysis")
print("   - Performance bottleneck identification")
print("   - Error tracking and recovery monitoring")
print()
print("3. 🎯 Created intelligent monitored agents")
print("   - Personality-based behavior monitoring")
print("   - Conversation pattern analysis")
print("   - Resource utilization measurement")
print("   - Comparative performance analysis")
print()
print("4. 📈 Added interactive monitoring dashboards")
print("   - Real-time performance visualization")
print("   - System health trend monitoring")
print("   - Event correlation and analysis")
print("   - Agent performance comparison")
print()


📚 WHAT WE LEARNED:
1. 🔍 Built a comprehensive monitoring system
   - Event-driven monitoring with structured logging
   - Performance metrics collection and analysis
   - Real-time alerting with customizable thresholds
   - Multi-agent system observability

2. 📊 Implemented production-ready monitoring patterns
   - Automatic instrumentation of agent behaviors
   - Health monitoring and trend analysis
   - Performance bottleneck identification
   - Error tracking and recovery monitoring

3. 🎯 Created intelligent monitored agents
   - Personality-based behavior monitoring
   - Conversation pattern analysis
   - Resource utilization measurement
   - Comparative performance analysis

4. 📈 Added interactive monitoring dashboards
   - Real-time performance visualization
   - System health trend monitoring
   - Event correlation and analysis
   - Agent performance comparison



In [24]:
# COMMON ERRORS AND SOLUTIONS
# =============================================================================

print("⚠️  COMMON ERRORS AND SOLUTIONS:")
print("=" * 40)
print("1. 🐛 Missing monitoring events")
print("   Problem: Events not being logged or lost")
print("   Solution: Check if monitor.log_event() is being called")
print("   Solution: Verify agent is registered with monitor")
print("   Solution: Ensure monitoring system is started")
print()
print("2. 🐛 Performance alerts not triggering")
print("   Problem: Thresholds too high or metrics not collected")
print("   Solution: Lower alert thresholds for testing")
print("   Solution: Check metrics collection timing")
print("   Solution: Verify agent performance data updates")
print()
print("3. 🐛 Memory usage growing over time")
print("   Problem: Event storage growing without bounds")
print("   Solution: Already handled with deque(maxlen=...)")
print("   Solution: Monitor NeuroMonitor's own memory usage")
print("   Solution: Adjust max_events parameter as needed")
print()
print("4. 🐛 Monitoring affecting agent performance")
print("   Problem: Too much monitoring overhead")
print("   Solution: Use threading locks judiciously")
print("   Solution: Batch metrics updates when possible")
print("   Solution: Sample events rather than logging everything")
print()
print("5. 🐛 Agents not starting properly")
print("   Problem: Race conditions during startup")
print("   Solution: Register agents before starting them")
print("   Solution: Add delays between agent starts")
print("   Solution: Verify agent status after starting")
print()
print("6. 🐛 Visualization not displaying data")
print("   Problem: Empty or missing metrics data")
print("   Solution: Ensure agents have processed messages")
print("   Solution: Wait for metrics collection cycle")
print("   Solution: Check that Plotly is installed")
print()
print("🎉 Ready for Tutorial 6: Configuration Management!")
print("   Next we'll add flexible configuration systems...")

⚠️  COMMON ERRORS AND SOLUTIONS:
1. 🐛 Missing monitoring events
   Problem: Events not being logged or lost
   Solution: Check if monitor.log_event() is being called
   Solution: Verify agent is registered with monitor
   Solution: Ensure monitoring system is started

2. 🐛 Performance alerts not triggering
   Problem: Thresholds too high or metrics not collected
   Solution: Lower alert thresholds for testing
   Solution: Check metrics collection timing
   Solution: Verify agent performance data updates

3. 🐛 Memory usage growing over time
   Problem: Event storage growing without bounds
   Solution: Already handled with deque(maxlen=...)
   Solution: Monitor NeuroMonitor's own memory usage
   Solution: Adjust max_events parameter as needed

4. 🐛 Monitoring affecting agent performance
   Problem: Too much monitoring overhead
   Solution: Use threading locks judiciously
   Solution: Batch metrics updates when possible
   Solution: Sample events rather than logging everything

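5. 🐛 Agents not starting properly
   Problem: Race conditions during startup
   Solution: Register agents before starting them
   Solution: Add delays between agent starts
   Solution: Verify agent status after starting

6. 🐛 Visualization not displaying data
   Problem: Empty or missing metrics data
   Solution: Ensure agents have processed messages
   Solution: Wait for metrics collection cycle
   Solution: Check that Plotly is installed

🎉 Ready for Tutorial 6: Configuration Management!
   Next we'll add flexible configuration systems...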
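
Two of the remedies listed above, bounding event storage with `deque(maxlen=...)` and sampling events instead of logging everything, can be combined in a few lines. This is a minimal sketch, not NeuroMonitor's implementation: the `SampledEventBuffer` class and its `sample_rate` and `seed` parameters are hypothetical names introduced for illustration.

```python
import random
from collections import deque


class SampledEventBuffer:
    """Keep at most `max_events` entries and record only a fraction of incoming events."""

    def __init__(self, max_events=1000, sample_rate=1.0, seed=None):
        # A deque with maxlen silently discards the oldest entry when full
        self.events = deque(maxlen=max_events)
        self.sample_rate = sample_rate
        self._rng = random.Random(seed)

    def log(self, event):
        """Store the event with probability `sample_rate`; return True if stored."""
        if self._rng.random() < self.sample_rate:
            self.events.append(event)
            return True
        return False


# With sample_rate=1.0 every event is kept, but only the newest three survive
buf = SampledEventBuffer(max_events=3)
for i in range(10):
    buf.log(i)
print(list(buf.events))
```

Sampling trades completeness for overhead, so it suits high-volume informational events better than errors or alerts, which you would normally record unconditionally.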

---

🔒 **INTELLECTUAL PROPERTY & LICENSE NOTICE**

This tutorial and its contents — including code, architecture, narrative examples, and educational structure — are the intellectual property of **Shalini Ananda, PhD** and part of the **Neuron Framework** under a **Modified MIT License with Attribution**.

- Commercial use, redistribution, or derivative works **must** include clear and visible attribution to the original author.
- Use in products, consulting engagements, or educational materials **must reference this repository and author name.**
- Removal of author credit or misrepresentation of origin constitutes **a violation of the license and may trigger legal action.**
- You may **not white-label, obfuscate, or rebrand** this work without explicit, written permission.

Use of this tutorial in Colab or any other platform implies agreement with these terms.

📘 **License**: [LICENSE.md](../LICENSE.md)  
📌 **Notice**: [NOTICE.md](../NOTICE.md)  
🧠 **Author**: [Shalini Ananda, PhD](https://github.com/ShaliniAnandaPhD)

---