# Lab 5: Real-time Stock Analytics with Kafka

## 🎯 **Learning Objectives:**
- Build real-time stock analytics pipeline
- Implement multiple consumer groups for different analytics
- Practice stream processing patterns
- Create real-time dashboards and monitoring
- Handle high-throughput data streams
- Learn about Kafka Streams concepts

## 📚 **Key Concepts:**
1. **Real-time Analytics**: Processing data as it arrives
2. **Multiple Consumer Groups**: Different processing pipelines
3. **Stream Processing**: Continuous data processing patterns
4. **Real-time Dashboards**: Live data visualization
5. **High-throughput Processing**: Handling large data volumes
6. **Alert Systems**: Real-time notifications and triggers
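
To make concept 2 concrete, here is a minimal in-memory sketch (no broker involved) of how separate consumer groups keep independent read positions over the same log. `TinyLog` and its methods are hypothetical names used only for illustration; they are not part of any Kafka client.

```python
from collections import defaultdict


class TinyLog:
    """Toy append-only log with one read offset per consumer group."""

    def __init__(self):
        self.records = []
        self.offsets = defaultdict(int)  # group_id -> next offset to read

    def append(self, record):
        self.records.append(record)

    def poll(self, group_id, max_records=10):
        """Return the next batch for a group and advance only that group's offset."""
        start = self.offsets[group_id]
        batch = self.records[start:start + max_records]
        self.offsets[group_id] = start + len(batch)
        return batch


log = TinyLog()
for price in (150.0, 150.4, 149.9):
    log.append({"symbol": "AAPL", "close": price})

# Each group reads the same records at its own pace
analytics_batch = log.poll("stock-analytics-group")
alerts_batch = log.poll("stock-alerts-group", max_records=2)

print(len(analytics_batch), len(alerts_batch))
print(dict(log.offsets))
```

Reading with one group does not move the other group's offset, which is why the analytics, alert, storage, and dashboard pipelines can each process every message.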

## 🏗️ **Architecture Overview:**
```
┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│   Stock Data    │───▶│   Kafka Topic    │───▶│   Analytics     │
│   Producer      │    │   (3 Partitions) │    │   Consumer      │
│                 │    │                  │    │   Groups        │
└─────────────────┘    └──────────────────┘    └─────────────────┘
         │                        │                        │
         ▼                        ▼                        ▼
┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│ High-frequency  │    │ Partition 0      │    │ Analytics       │
│ Data Generation │    │ Partition 1      │    │ • Moving Avg    │
│                 │    │ Partition 2      │    │ • Price Alerts  │
│                 │    │                  │    │ • Volume        │
└─────────────────┘    └──────────────────┘    └─────────────────┘
                                │                        │
                                ▼                        ▼
                       ┌─────────────────┐    ┌─────────────────┐
                       │ Storage Group   │    │ Dashboard Group │
                       │ • PostgreSQL    │    │ • Real-time UI  │
                       │ • Redis Cache   │    │ • Charts        │
                       └─────────────────┘    └─────────────────┘
```
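
The producer in this lab keys every record by stock symbol, so all records for one symbol go to the same one of the three partitions and stay in order there. The sketch below illustrates that idea only: `toy_partition` is a hypothetical helper and `zlib.crc32` is a stand-in hash. Kafka's default partitioner uses a murmur2 hash, so the real partition numbers will differ.

```python
import zlib

NUM_PARTITIONS = 3  # matches the topic layout in the diagram


def toy_partition(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a key to a partition with a stable hash (stand-in for Kafka's murmur2)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


symbols = ["AAPL", "GOOGL", "MSFT", "TSLA", "AMZN"]
assignment = {symbol: toy_partition(symbol) for symbol in symbols}

for symbol, partition in assignment.items():
    print(f"{symbol} -> partition {partition}")
```

Because the mapping depends only on the key, repeated sends for the same symbol always pick the same partition.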


In [1]:
# Install and Import Dependencies
import json
import random
import time
import threading
from datetime import datetime, timedelta
from collections import defaultdict, deque
from kafka import KafkaProducer, KafkaConsumer, KafkaAdminClient
from kafka.errors import KafkaError
from kafka.admin import NewTopic
import uuid
import statistics

print("✅ All dependencies imported successfully!")


✅ All dependencies imported successfully!


In [2]:
# Kafka Configuration for Real-time Analytics
KAFKA_BOOTSTRAP_SERVERS = 'localhost:9092'
TOPIC_NAME = 'stock-analytics'
CONSUMER_GROUPS = {
    'analytics': 'stock-analytics-group',
    'alerts': 'stock-alerts-group', 
    'storage': 'stock-storage-group',
    'dashboard': 'stock-dashboard-group'
}

# Analytics Configuration
ANALYTICS_CONFIG = {
    'moving_average_window': 10,
    'price_alert_threshold': 0.05,  # 5% price change
    'volume_alert_threshold': 2.0,  # 2x average volume
    'trend_detection_window': 5,
    'high_frequency_interval': 0.1  # 100ms between messages
}

print("🔧 Real-time Analytics Configuration:")
print(f"   Bootstrap Servers: {KAFKA_BOOTSTRAP_SERVERS}")
print(f"   Topic: {TOPIC_NAME}")
print(f"   Consumer Groups: {list(CONSUMER_GROUPS.keys())}")
print(f"   Analytics Window: {ANALYTICS_CONFIG['moving_average_window']} messages")
print(f"   Price Alert Threshold: {ANALYTICS_CONFIG['price_alert_threshold']*100}%")
print(f"   High-frequency Interval: {ANALYTICS_CONFIG['high_frequency_interval']}s")


🔧 Real-time Analytics Configuration:
   Bootstrap Servers: localhost:9092
   Topic: stock-analytics
   Consumer Groups: ['analytics', 'alerts', 'storage', 'dashboard']
   Analytics Window: 10 messages
   Price Alert Threshold: 5.0%
   High-frequency Interval: 0.1s


In [3]:
# Create Topic for Real-time Analytics Lab
def create_analytics_topic(topic_name: str, num_partitions: int = 3):
    """Create topic for real-time analytics"""
    try:
        admin_client = KafkaAdminClient(
            bootstrap_servers=KAFKA_BOOTSTRAP_SERVERS,
            client_id='analytics_lab_admin'
        )
        
        # Verify the connection before touching topics
        admin_client.describe_cluster()
        print("✅ Connected to Kafka cluster")
        
        new_topic = NewTopic(
            name=topic_name,
            num_partitions=num_partitions,
            replication_factor=1
        )
        
        if topic_name in admin_client.list_topics():
            # kafka-python's describe_topics() returns a list of dicts, one per topic
            topic_info = admin_client.describe_topics([topic_name])[0]
            existing_partitions = len(topic_info['partitions'])
            
            if existing_partitions == num_partitions:
                print(f"✅ Topic '{topic_name}' already exists with {num_partitions} partitions")
            else:
                print(f"⚠️ Topic '{topic_name}' exists but has {existing_partitions} partitions, need {num_partitions}")
                print("🔄 Deleting and recreating topic...")
                
                admin_client.delete_topics([topic_name])
                time.sleep(2)  # Give the broker time to finish the deletion
                
                admin_client.create_topics([new_topic])
                print(f"✅ Topic '{topic_name}' recreated with {num_partitions} partitions")
        else:
            print(f"📝 Creating topic '{topic_name}' with {num_partitions} partitions...")
            admin_client.create_topics([new_topic])
            print(f"✅ Topic '{topic_name}' created successfully")
        
        admin_client.close()
        
    except Exception as e:
        print(f"❌ Error managing topic: {e}")
        print("💡 Make sure Kafka is running: docker-compose up -d")

# Create the analytics topic
create_analytics_topic(TOPIC_NAME, 3)


✅ Connected to Kafka cluster
📝 Creating topic 'stock-analytics' with 3 partitions...
✅ Topic 'stock-analytics' created successfully


In [4]:
# High-Frequency Stock Data Producer
class HighFrequencyStockProducer:
    """Producer for high-frequency stock data simulation"""
    
    def __init__(self, bootstrap_servers: str, topic: str):
        self.topic = topic
        self.symbols = ['AAPL', 'GOOGL', 'MSFT', 'TSLA', 'AMZN', 'META', 'NVDA', 'NFLX', 'ADBE', 'CRM']
        self.base_prices = {
            'AAPL': 150.0, 'GOOGL': 2800.0, 'MSFT': 350.0, 'TSLA': 250.0, 'AMZN': 3200.0,
            'META': 300.0, 'NVDA': 450.0, 'NFLX': 400.0, 'ADBE': 500.0, 'CRM': 200.0
        }
        self.current_prices = self.base_prices.copy()
        self.message_count = 0
        
    def generate_realistic_price(self, symbol: str) -> float:
        """Generate realistic price movement using random walk"""
        current_price = self.current_prices[symbol]
        
        # Random walk with slight upward bias
        change_percent = random.gauss(0.0001, 0.01)  # mean +0.01% drift, 1% standard deviation per tick
        new_price = current_price * (1 + change_percent)
        
        # Ensure price stays positive and realistic
        new_price = max(new_price, current_price * 0.95)  # Max 5% drop
        new_price = min(new_price, current_price * 1.05)  # Max 5% gain
        
        self.current_prices[symbol] = new_price
        return round(new_price, 2)
    
    def generate_ohlcv_data(self, symbol: str) -> dict:
        """Generate realistic OHLCV data"""
        close_price = self.generate_realistic_price(symbol)
        
        # Generate OHLC around close price
        volatility = 0.002  # 0.2% intraday volatility
        open_price = round(close_price * random.uniform(1 - volatility, 1 + volatility), 2)
        high_price = round(max(open_price, close_price) * random.uniform(1.001, 1.005), 2)
        low_price = round(min(open_price, close_price) * random.uniform(0.995, 0.999), 2)
        
        # Volume with some randomness
        base_volume = random.randint(50000, 500000)
        volume_multiplier = random.uniform(0.5, 2.0)
        volume = int(base_volume * volume_multiplier)
        
        self.message_count += 1
        
        return {
            "symbol": symbol,
            # The "Z" suffix denotes UTC, so take the timestamp in UTC rather than local time
            "timestamp": datetime.utcnow().isoformat() + "Z",
            "open": open_price,
            "high": high_price,
            "low": low_price,
            "close": close_price,
            "volume": volume,
            "exchange": "NASDAQ",
            "message_id": str(uuid.uuid4()),
            "sequence_number": self.message_count
        }
    
    def start_high_frequency_stream(self, duration_seconds: int = 30, interval: float = 0.1):
        """Start high-frequency data stream"""
        print(f"🚀 Starting high-frequency stream for {duration_seconds} seconds...")
        print(f"   Interval: {interval}s between messages")
        print(f"   Expected messages: ~{int(duration_seconds / interval)}")
        
        producer_config = {
            'bootstrap_servers': KAFKA_BOOTSTRAP_SERVERS,
            'value_serializer': lambda v: json.dumps(v).encode('utf-8'),
            'key_serializer': lambda k: k.encode('utf-8') if k else None,
            'acks': 'all',
            'retries': 3,
            'retry_backoff_ms': 100,
            'batch_size': 16384,
            'linger_ms': 5,  # Batch messages for 5ms
            'compression_type': 'gzip'
        }
        
        producer = KafkaProducer(**producer_config)
        start_time = time.time()
        messages_sent = 0
        
        try:
            while time.time() - start_time < duration_seconds:
                symbol = random.choice(self.symbols)
                ohlcv_data = self.generate_ohlcv_data(symbol)
                
                # Send message
                future = producer.send(
                    self.topic,
                    key=symbol,
                    value=ohlcv_data
                )
                
                try:
                    record_metadata = future.get(timeout=1)
                    messages_sent += 1
                    
                    if messages_sent % 50 == 0:  # Print every 50 messages
                        print(f"📊 Sent {messages_sent} messages - Latest: {symbol} ${ohlcv_data['close']}")
                    
                except KafkaError as e:
                    print(f"❌ Error sending message: {e}")
                
                time.sleep(interval)
                
        except KeyboardInterrupt:
            print("\n⏹️ Stream stopped by user")
        finally:
            producer.flush()
            producer.close()
            
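        # Use the measured run time so the rate stays correct if the stream is stopped early
        elapsed = time.time() - start_time
        print("✅ High-frequency stream completed!")
        print(f"   Total messages sent: {messages_sent}")
        print(f"   Average rate: {messages_sent / elapsed:.1f} messages/second")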

# Initialize producer
producer = HighFrequencyStockProducer(KAFKA_BOOTSTRAP_SERVERS, TOPIC_NAME)
print("✅ High-Frequency Stock Producer initialized!")


✅ High-Frequency Stock Producer initialized!


In [5]:
# Real-time Analytics Consumer Classes
class RealTimeAnalyticsConsumer:
    """Base class for real-time analytics consumers"""
    
    def __init__(self, bootstrap_servers: str, topic: str, group_id: str):
        self.bootstrap_servers = bootstrap_servers
        self.topic = topic
        self.group_id = group_id
        self.consumer = None
        self.processed_count = 0
        self.start_time = None
        
    def create_consumer(self, **kwargs):
        """Create consumer with optimized configuration for real-time processing"""
        default_config = {
            'bootstrap_servers': self.bootstrap_servers,
            'group_id': self.group_id,
            'value_deserializer': lambda m: json.loads(m.decode('utf-8')),
            'key_deserializer': lambda m: m.decode('utf-8') if m else None,
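            # Start new groups at the beginning of the log; with 'latest', messages
            # produced before the consumer starts (as in Exercise 1) are never read
            'auto_offset_reset': 'earliest',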
            'enable_auto_commit': True,
            'auto_commit_interval_ms': 1000,
            'session_timeout_ms': 30000,
            'heartbeat_interval_ms': 10000,
            'consumer_timeout_ms': 1000,
            'fetch_min_bytes': 1,
            'fetch_max_wait_ms': 100,
            'max_poll_records': 100
        }
        
        config = {**default_config, **kwargs}
        self.consumer = KafkaConsumer(self.topic, **config)
        return self.consumer
    
    def process_message(self, message):
        """Process a single message - override in subclasses"""
        data = message.value
        self.processed_count += 1
        
        if self.start_time is None:
            self.start_time = time.time()
        
        return data
    
    def get_processing_rate(self):
        """Get current processing rate"""
        if self.start_time is None:
            return 0
        elapsed = time.time() - self.start_time
        return self.processed_count / elapsed if elapsed > 0 else 0

class AnalyticsConsumer(RealTimeAnalyticsConsumer):
    """Consumer for real-time analytics calculations"""
    
    def __init__(self, bootstrap_servers: str, topic: str, group_id: str):
        super().__init__(bootstrap_servers, topic, group_id)
        self.price_history = defaultdict(lambda: deque(maxlen=ANALYTICS_CONFIG['moving_average_window']))
        self.volume_history = defaultdict(lambda: deque(maxlen=ANALYTICS_CONFIG['moving_average_window']))
        self.analytics_results = []
    
    def calculate_moving_average(self, symbol: str, prices: deque) -> float:
        """Calculate simple moving average"""
        if len(prices) < 2:
            return prices[-1] if prices else 0
        return statistics.mean(prices)
    
    def detect_trend(self, symbol: str, prices: deque) -> str:
        """Detect price trend"""
        if len(prices) < ANALYTICS_CONFIG['trend_detection_window']:
            return "insufficient_data"
        
        recent_prices = list(prices)[-ANALYTICS_CONFIG['trend_detection_window']:]
        first_price = recent_prices[0]
        last_price = recent_prices[-1]
        
        change_percent = (last_price - first_price) / first_price
        
        if change_percent > 0.02:  # 2% increase
            return "uptrend"
        elif change_percent < -0.02:  # 2% decrease
            return "downtrend"
        else:
            return "sideways"
    
    def process_message(self, message):
        """Process message and calculate analytics"""
        data = super().process_message(message)
        symbol = data['symbol']
        close_price = data['close']
        volume = data['volume']
        
        # Update history
        self.price_history[symbol].append(close_price)
        self.volume_history[symbol].append(volume)
        
        # Calculate analytics
        moving_avg = self.calculate_moving_average(symbol, self.price_history[symbol])
        trend = self.detect_trend(symbol, self.price_history[symbol])
        avg_volume = statistics.mean(self.volume_history[symbol]) if self.volume_history[symbol] else volume
        
        analytics_result = {
            'symbol': symbol,
            'timestamp': data['timestamp'],
            'close_price': close_price,
            'moving_average': round(moving_avg, 2),
            'trend': trend,
            'volume': volume,
            'avg_volume': round(avg_volume, 0),
            'volume_ratio': round(volume / avg_volume, 2) if avg_volume > 0 else 1.0,
            'message_id': data['message_id'],
            'sequence_number': data['sequence_number']
        }
        
        self.analytics_results.append(analytics_result)
        
        # Print analytics every 10 messages
        if self.processed_count % 10 == 0:
            print(f"📈 Analytics: {symbol} ${close_price} | MA: ${moving_avg:.2f} | Trend: {trend} | Vol: {volume:,}")
        
        return analytics_result

class AlertConsumer(RealTimeAnalyticsConsumer):
    """Consumer for price and volume alerts"""
    
    def __init__(self, bootstrap_servers: str, topic: str, group_id: str):
        super().__init__(bootstrap_servers, topic, group_id)
        self.price_alerts = []
        self.volume_alerts = []
        self.symbol_previous_prices = {}
        # Rolling per-symbol volume window used as the baseline for spike detection
        self.volume_history = defaultdict(lambda: deque(maxlen=ANALYTICS_CONFIG['moving_average_window']))
    
    def check_price_alert(self, symbol: str, current_price: float) -> bool:
        """Check for a significant price change relative to the stored baseline.
        
        The baseline is the first price seen for the symbol and is reset
        to the current price each time an alert fires.
        """
        if symbol not in self.symbol_previous_prices:
            self.symbol_previous_prices[symbol] = current_price
            return False
        
        previous_price = self.symbol_previous_prices[symbol]
        change_percent = abs(current_price - previous_price) / previous_price
        
        if change_percent >= ANALYTICS_CONFIG['price_alert_threshold']:
            alert = {
                'symbol': symbol,
                'timestamp': datetime.now().isoformat(),
                'previous_price': previous_price,
                'current_price': current_price,
                'change_percent': round(change_percent * 100, 2),
                'alert_type': 'price_change'
            }
            self.price_alerts.append(alert)
            self.symbol_previous_prices[symbol] = current_price
            return True
        
        return False
    
    def check_volume_alert(self, symbol: str, volume: float, avg_volume: float) -> bool:
        """Check for unusual volume spikes"""
        if avg_volume <= 0:
            return False
        
        volume_ratio = volume / avg_volume
        
        if volume_ratio >= ANALYTICS_CONFIG['volume_alert_threshold']:
            alert = {
                'symbol': symbol,
                'timestamp': datetime.now().isoformat(),
                'volume': volume,
                'avg_volume': avg_volume,
                'volume_ratio': round(volume_ratio, 2),
                'alert_type': 'volume_spike'
            }
            self.volume_alerts.append(alert)
            return True
        
        return False
    
    def process_message(self, message):
        """Process message and check for alerts"""
        data = super().process_message(message)
        symbol = data['symbol']
        close_price = data['close']
        volume = data['volume']
        
        # Check price alert
        if self.check_price_alert(symbol, close_price):
            alert = self.price_alerts[-1]
            print(f"🚨 PRICE ALERT: {symbol} changed {alert['change_percent']}% from ${alert['previous_price']} to ${alert['current_price']}")
        
        # Check volume alert against the rolling average of this symbol's earlier volumes.
        # (Passing volume * 0.5 as the average would make the ratio exactly 2.0 every time.)
        history = self.volume_history[symbol]
        if history and self.check_volume_alert(symbol, volume, statistics.mean(history)):
            alert = self.volume_alerts[-1]
            print(f"📊 VOLUME ALERT: {symbol} volume {alert['volume_ratio']}x average ({alert['volume']:,} vs {alert['avg_volume']:,.0f})")
        history.append(volume)
        
        return data

class DashboardConsumer(RealTimeAnalyticsConsumer):
    """Consumer for real-time dashboard updates"""
    
    def __init__(self, bootstrap_servers: str, topic: str, group_id: str):
        super().__init__(bootstrap_servers, topic, group_id)
        self.dashboard_data = defaultdict(dict)
        self.update_count = 0
    
    def update_dashboard(self, data: dict):
        """Update dashboard data structure"""
        symbol = data['symbol']
        
        self.dashboard_data[symbol] = {
            'symbol': symbol,
            'price': data['close'],
            'volume': data['volume'],
            'timestamp': data['timestamp'],
            'last_update': datetime.now().isoformat()
        }
        
        self.update_count += 1
        
        # Print dashboard update every 20 messages
        if self.update_count % 20 == 0:
            print(f"📊 Dashboard Update #{self.update_count}:")
            for sym, info in list(self.dashboard_data.items())[-5:]:  # Show last 5 symbols
                print(f"   {sym}: ${info['price']} (Vol: {info['volume']:,})")
            print()
    
    def process_message(self, message):
        """Process message for dashboard updates"""
        data = super().process_message(message)
        self.update_dashboard(data)
        return data

print("📋 Real-time Analytics Consumer Classes:")
print("   - AnalyticsConsumer: Moving averages, trends, volume analysis")
print("   - AlertConsumer: Price and volume alerts")
print("   - DashboardConsumer: Real-time dashboard updates")
print("   - RealTimeAnalyticsConsumer: Base class for custom implementations")


📋 Real-time Analytics Consumer Classes:
   - AnalyticsConsumer: Moving averages, trends, volume analysis
   - AlertConsumer: Price and volume alerts
   - DashboardConsumer: Real-time dashboard updates
   - RealTimeAnalyticsConsumer: Base class for custom implementations


## Exercise 1: High-Frequency Data Stream

### 🎯 **Learning Objectives:**
- Generate high-frequency stock data
- Understand real-time data processing challenges
- Practice handling high-throughput streams
- Learn about data generation patterns

### 📚 **Key Concepts:**
1. **High-Frequency Data**: Rapid data generation and processing
2. **Real-time Processing**: Handling data as it arrives
3. **Throughput**: Messages per second processing capability
4. **Data Quality**: Ensuring realistic and consistent data
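
Throughput in this exercise is bounded by the producer loop itself: each iteration sends one message, blocks until the broker acknowledges it, and then sleeps for the interval. The sketch below works through that arithmetic. Both helper functions are hypothetical, the 15 ms latency is an assumed figure, and 262 is the message count from one sample 30-second run of this lab.

```python
def estimate_messages(duration_s: float, interval_s: float, send_latency_s: float) -> int:
    """Messages completed when each iteration costs interval + blocking send latency."""
    return int(duration_s / (interval_s + send_latency_s))


def implied_latency(duration_s: float, interval_s: float, messages_sent: int) -> float:
    """Average per-message overhead implied by an observed message count."""
    return duration_s / messages_sent - interval_s


ideal = estimate_messages(30, 0.1, 0.0)            # no send overhead at all
with_latency = estimate_messages(30, 0.1, 0.015)   # 15 ms is an assumed latency
overhead = implied_latency(30, 0.1, 262)           # 262 = count from a sample run

print(ideal, with_latency, f"{overhead * 1000:.1f} ms")
```

A run that lands below the ideal 300 messages is therefore expected rather than a sign of lost data; sending asynchronously instead of waiting on every acknowledgement would be one way to close the gap.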


In [6]:
# Exercise 1: Start High-Frequency Data Stream
print("🚀 Exercise 1: High-Frequency Data Stream Generation")

print("\n📊 Starting 30-second high-frequency stream...")
print("   This will generate ~300 messages at 0.1s intervals")
print("   Press Ctrl+C to stop early if needed")

# Start the high-frequency stream
producer.start_high_frequency_stream(
    duration_seconds=30,
    interval=ANALYTICS_CONFIG['high_frequency_interval']
)

print("\n✅ High-frequency stream completed!")
print("📈 Data is now available in the Kafka topic for real-time processing")


🚀 Exercise 1: High-Frequency Data Stream Generation

📊 Starting 30-second high-frequency stream...
   This will generate ~300 messages at 0.1s intervals
   Press Ctrl+C to stop early if needed
🚀 Starting high-frequency stream for 30 seconds...
   Interval: 0.1s between messages
   Expected messages: ~300
📊 Sent 50 messages - Latest: META $303.45
📊 Sent 100 messages - Latest: AAPL $142.49
📊 Sent 150 messages - Latest: ADBE $488.35
📊 Sent 200 messages - Latest: ADBE $471.65
📊 Sent 250 messages - Latest: MSFT $343.86
✅ High-frequency stream completed!
   Total messages sent: 262
   Average rate: 8.7 messages/second

✅ High-frequency stream completed!
📈 Data is now available in the Kafka topic for real-time processing


## Exercise 2: Real-time Analytics Processing

### 🎯 **Learning Objectives:**
- Implement real-time analytics calculations
- Practice moving average calculations
- Learn trend detection algorithms
- Understand volume analysis patterns

### 📚 **Key Concepts:**
1. **Moving Averages**: Smoothing price data over time windows
2. **Trend Detection**: Identifying price direction patterns
3. **Volume Analysis**: Understanding trading activity
4. **Real-time Calculations**: Processing data as it arrives
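
The first two concepts can be tried without Kafka. The following is a minimal standalone sketch that mirrors the approach of `AnalyticsConsumer` (a bounded `deque` for the window, first-versus-last comparison for the trend); `rolling_average` and `classify_trend` are hypothetical helper names, and the sample prices are made up.

```python
from collections import deque
from statistics import mean


def rolling_average(prices, window=10):
    """Yield the simple moving average after each new price, over at most `window` prices."""
    recent = deque(maxlen=window)  # old prices fall out automatically
    for price in prices:
        recent.append(price)
        yield mean(recent)


def classify_trend(prices, window=5, threshold=0.02):
    """Compare the first and last price of the most recent `window` prices."""
    if len(prices) < window:
        return "insufficient_data"
    first, last = prices[-window], prices[-1]
    change = (last - first) / first
    if change > threshold:
        return "uptrend"
    if change < -threshold:
        return "downtrend"
    return "sideways"


closes = [100.0, 101.0, 102.0, 103.0, 104.0]
averages = list(rolling_average(closes, window=3))

print(averages)
print(classify_trend(closes))
```

With a window of 3 the average lags the latest price, which is the smoothing effect a moving average is meant to provide.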


In [7]:
# Exercise 2: Real-time Analytics Processing
print("📈 Exercise 2: Real-time Analytics Processing")

print("\n🔧 Starting Analytics Consumer...")
print("   Processing messages for moving averages, trends, and volume analysis")

# Create analytics consumer
analytics_consumer = AnalyticsConsumer(
    KAFKA_BOOTSTRAP_SERVERS, 
    TOPIC_NAME, 
    CONSUMER_GROUPS['analytics']
)

analytics_consumer.create_consumer()

print("\n📊 Processing messages (will timeout after 10 seconds)...")
print("   Watch for analytics calculations below:")

# Process messages for 10 seconds
start_time = time.time()
timeout_seconds = 10

try:
    for message in analytics_consumer.consumer:
        if time.time() - start_time > timeout_seconds:
            print(f"\n⏰ Timeout reached after {timeout_seconds} seconds")
            break
            
        analytics_consumer.process_message(message)
        
except Exception as e:
    print(f"❌ Error during processing: {e}")
finally:
    if analytics_consumer.consumer:
        analytics_consumer.consumer.close()

# Show analytics results
print(f"\n📊 Analytics Processing Results:")
print(f"   ✅ Messages processed: {analytics_consumer.processed_count}")
print(f"   📈 Processing rate: {analytics_consumer.get_processing_rate():.1f} messages/second")
print(f"   📋 Analytics results: {len(analytics_consumer.analytics_results)} calculations")

if analytics_consumer.analytics_results:
    print(f"\n📈 Sample Analytics Results:")
    for result in analytics_consumer.analytics_results[-3:]:  # Show last 3
        print(f"   {result['symbol']}: ${result['close_price']} | MA: ${result['moving_average']} | Trend: {result['trend']} | Vol Ratio: {result['volume_ratio']}")

print("\n✅ Real-time analytics processing completed!")


📈 Exercise 2: Real-time Analytics Processing

🔧 Starting Analytics Consumer...
   Processing messages for moving averages, trends, and volume analysis

📊 Processing messages (will timeout after 10 seconds)...
   Watch for analytics calculations below:

📊 Analytics Processing Results:
   ✅ Messages processed: 0
   📈 Processing rate: 0.0 messages/second
   📋 Analytics results: 0 calculations

✅ Real-time analytics processing completed!
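
A count of zero processed messages, as in the recorded output above, is what a brand-new consumer group sees when it starts at the end of the log (`auto_offset_reset='latest'`) after the producer has already finished. The toy model below sketches that rule; `starting_offset` is a hypothetical function, not a Kafka API, and the log length of 262 simply reuses the message count from the Exercise 1 run.

```python
def starting_offset(log_length: int, committed, reset_policy: str) -> int:
    """Pick the first offset to read, mimicking the auto_offset_reset rule."""
    if committed is not None:
        return committed              # resume where the group left off
    if reset_policy == "earliest":
        return 0                      # replay the whole log
    if reset_policy == "latest":
        return log_length             # only records appended from now on
    raise ValueError(f"unknown reset policy: {reset_policy}")


log = [f"msg-{i}" for i in range(262)]   # records already in the partition

for policy in ("latest", "earliest"):
    start = starting_offset(len(log), committed=None, reset_policy=policy)
    print(policy, "->", len(log[start:]), "records visible")
```

To process data that was produced earlier, either start the consumer before the producer or let new groups begin at the earliest offset.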


## Exercise 3: Real-time Alert System

### 🎯 **Learning Objectives:**
- Implement price change alerts
- Practice volume spike detection
- Learn alert threshold management
- Understand real-time notification patterns

### 📚 **Key Concepts:**
1. **Price Alerts**: Detecting significant price movements
2. **Volume Alerts**: Identifying unusual trading activity
3. **Threshold Management**: Setting appropriate alert levels
4. **Real-time Notifications**: Immediate alert delivery
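
As a rough illustration of the first three concepts, the sketch below applies the lab's two thresholds (a 5% price move and 2x average volume) to hand-picked numbers. The helper functions and the sample values are hypothetical; the volume baseline here is assumed to be a rolling average of earlier volumes for the same symbol.

```python
from collections import deque
from statistics import mean

PRICE_THRESHOLD = 0.05   # 5% move relative to the stored baseline price
VOLUME_THRESHOLD = 2.0   # 2x the rolling average volume


def price_alert(baseline: float, current: float, threshold: float = PRICE_THRESHOLD) -> bool:
    """True when the relative move from the baseline reaches the threshold."""
    return abs(current - baseline) / baseline >= threshold


def volume_alert(history, volume: float, threshold: float = VOLUME_THRESHOLD) -> bool:
    """True when volume is at least `threshold` times the average of earlier volumes."""
    if not history:
        return False  # no baseline yet, so no alert
    return volume / mean(history) >= threshold


history = deque([100_000, 120_000, 110_000], maxlen=10)

print(price_alert(200.0, 211.0))        # 5.5% move
print(price_alert(200.0, 204.0))        # 2% move
print(volume_alert(history, 250_000))   # about 2.3x the average of 110,000
print(volume_alert(history, 150_000))   # about 1.4x the average
```

Lower thresholds catch more events but also produce more noise, so the values are worth tuning against the volatility of the data being monitored.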


In [8]:
# Exercise 3: Real-time Alert System
print("🚨 Exercise 3: Real-time Alert System")

print("\n🔧 Starting Alert Consumer...")
print("   Monitoring for price changes and volume spikes")

# Create alert consumer
alert_consumer = AlertConsumer(
    KAFKA_BOOTSTRAP_SERVERS, 
    TOPIC_NAME, 
    CONSUMER_GROUPS['alerts']
)

alert_consumer.create_consumer()

print("\n📊 Processing messages for alerts (will timeout after 10 seconds)...")
print("   Watch for alerts below:")

# Process messages for 10 seconds
start_time = time.time()
timeout_seconds = 10

try:
    for message in alert_consumer.consumer:
        if time.time() - start_time > timeout_seconds:
            print(f"\n⏰ Timeout reached after {timeout_seconds} seconds")
            break
            
        alert_consumer.process_message(message)
        
except Exception as e:
    print(f"❌ Error during processing: {e}")
finally:
    if alert_consumer.consumer:
        alert_consumer.consumer.close()

# Show alert results
print(f"\n🚨 Alert System Results:")
print(f"   ✅ Messages processed: {alert_consumer.processed_count}")
print(f"   📈 Processing rate: {alert_consumer.get_processing_rate():.1f} messages/second")
print(f"   🚨 Price alerts triggered: {len(alert_consumer.price_alerts)}")
print(f"   📊 Volume alerts triggered: {len(alert_consumer.volume_alerts)}")

if alert_consumer.price_alerts:
    print(f"\n🚨 Price Alerts Summary:")
    for alert in alert_consumer.price_alerts[-3:]:  # Show last 3
        print(f"   {alert['symbol']}: {alert['change_percent']}% change (${alert['previous_price']} → ${alert['current_price']})")

if alert_consumer.volume_alerts:
    print(f"\n📊 Volume Alerts Summary:")
    for alert in alert_consumer.volume_alerts[-3:]:  # Show last 3
        print(f"   {alert['symbol']}: {alert['volume_ratio']}x volume spike ({alert['volume']:,} vs {alert['avg_volume']:,.0f})")

print("\n✅ Real-time alert system completed!")


🚨 Exercise 3: Real-time Alert System

🔧 Starting Alert Consumer...
   Monitoring for price changes and volume spikes

📊 Processing messages for alerts (will timeout after 10 seconds)...
   Watch for alerts below:

🚨 Alert System Results:
   ✅ Messages processed: 0
   📈 Processing rate: 0.0 messages/second
   🚨 Price alerts triggered: 0
   📊 Volume alerts triggered: 0

✅ Real-time alert system completed!
