# Online Hidden Markov Models for Real-Time Regime Detection

This notebook demonstrates how to use Online Hidden Markov Models for real-time market regime detection and streaming data processing.

## Learning Objectives
By the end of this tutorial, you will understand:
1. The differences between batch and online HMMs
2. How to configure and initialize an Online HMM
3. Real-time data processing and streaming updates
4. Adaptive parameter estimation with forgetting factors
5. Building real-time trading systems with Online HMMs
6. Performance monitoring and model adaptation

## Prerequisites
- Completion of "Introduction to HMM Finance" tutorial (recommended)
- Understanding of basic HMM concepts
- Familiarity with streaming data concepts

Let's start by importing the necessary libraries.

In [None]:
# Import necessary libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import yfinance as yf
from datetime import datetime, timedelta
import time
import warnings
warnings.filterwarnings('ignore')

# Import our HMM models
import sys
sys.path.append('../../')
from hidden_regime.models.base_hmm import HiddenMarkovModel
from hidden_regime.models.online_hmm import OnlineHMM, OnlineHMMConfig
from hidden_regime.data.loader import DataLoader

# Set up plotting style
plt.style.use('seaborn-v0_8')
sns.set_palette("husl")
plt.rcParams['figure.figsize'] = (12, 8)

print("✅ Libraries imported successfully!")
print(f"Current time: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")

## 1. Understanding Online vs Batch Learning

### Batch HMM (Traditional Approach)
- **Full dataset training**: Requires entire historical dataset
- **Parameter stability**: Fixed parameters after training
- **Computational cost**: O(N²·T) per Baum-Welch iteration (N states, T observations), repeated over the full dataset on every refit
- **Historical revision**: Past regime labels change with new data
- **Use case**: Historical analysis, backtesting

### Online HMM (Streaming Approach)
- **Incremental learning**: Updates with each new observation
- **Adaptive parameters**: Evolves with changing market conditions
- **Computational efficiency**: O(N²) per new observation, independent of the history length T
- **Temporal consistency**: Stable historical classifications
- **Use case**: Real-time trading, live monitoring
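
The constant per-observation cost of the streaming approach comes from the forward (filtering) recursion, which only needs the previous state probabilities and the newest observation. The following is a minimal sketch in plain Python, assuming univariate Gaussian emissions. The function `filter_step` and the two-state parameter values are hypothetical and are not part of the `hidden_regime` API.

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of a univariate normal distribution."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def filter_step(prev_probs, x, transition, means, stds):
    """One forward-filtering update: O(N^2) work, independent of history length."""
    n = len(prev_probs)
    # Predict: propagate the previous state probabilities through the transition matrix
    predicted = [sum(prev_probs[i] * transition[i][j] for i in range(n)) for j in range(n)]
    # Correct: weight each state by how well it explains the new observation
    unnorm = [predicted[j] * gaussian_pdf(x, means[j], stds[j]) for j in range(n)]
    total = sum(unnorm)
    return [u / total for u in unnorm]

# Hypothetical two-state model: calm (state 0) vs. turbulent (state 1)
transition = [[0.95, 0.05], [0.10, 0.90]]
means = [0.001, -0.002]
stds = [0.01, 0.03]

probs = [0.5, 0.5]
for r in [0.002, -0.001, -0.045, 0.038]:
    probs = filter_step(probs, r, transition, means, stds)

print([round(p, 4) for p in probs])
```

After the two large moves at the end of the sequence, almost all probability mass sits on the turbulent state. A batch model would instead rerun forward-backward over the whole history to produce the same kind of estimate.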

Let's compare both approaches side by side.

In [None]:
# Load sample data for comparison
print("📊 Loading market data for comparison...")

# Download data for comparison
ticker = "AAPL"
start_date = "2023-01-01"
end_date = "2024-01-01"

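# auto_adjust=False keeps the 'Adj Close' column (recent yfinance versions adjust prices by default)
data = yf.download(ticker, start=start_date, end=end_date, progress=False, auto_adjust=False)
# Recent yfinance versions may return MultiIndex columns even for a single ticker; flatten them
if isinstance(data.columns, pd.MultiIndex):
    data.columns = data.columns.get_level_values(0)
data['Log_Return'] = np.log(data['Adj Close'] / data['Adj Close'].shift(1))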
data = data.dropna()

returns = data['Log_Return'].values
print(f"✅ Downloaded {len(data)} days of {ticker} data")
print(f"Date range: {data.index[0].date()} to {data.index[-1].date()}")

# Split data for comparison
split_point = len(returns) // 2
train_returns = returns[:split_point]
test_returns = returns[split_point:]

print(f"\nData split:")
print(f"  Training period: {split_point} observations")
print(f"  Testing period: {len(test_returns)} observations")

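# Demonstrate the difference in computational requirements
# (rough operation counts: N = number of states, T = number of observations)
n_states_demo = 3
batch_ops_per_pass = n_states_demo**2 * len(returns)  # one forward-backward pass over all T points
online_ops_per_obs = n_states_demo**2                 # one filtering step per new point
print(f"\n⚡ Computational Complexity Comparison:")
print(f"  Batch HMM: O(N²·T) per Baum-Welch iteration = ~{batch_ops_per_pass:,} operations per pass")
print(f"  Online HMM: O(N²) per observation = ~{online_ops_per_obs} operations per update")
print(f"  Refitting the batch model on every new observation costs ~{len(returns):,}x more per update")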

## 2. Configuring Online HMM

The Online HMM requires careful configuration of several key parameters that control its adaptation behavior.
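
To build intuition for the forgetting factor before setting it, it helps to translate it into a memory length. The sketch below assumes the forgetting factor acts as a geometric weight on past observations, which is the usual interpretation but is not confirmed here for `OnlineHMMConfig`. The helper names are hypothetical.

```python
import math

def half_life(forgetting_factor):
    """Number of steps after which an observation's weight has halved."""
    return math.log(0.5) / math.log(forgetting_factor)

def effective_window(forgetting_factor):
    """Sum of the geometric weights 1 + f + f^2 + ... = 1 / (1 - f)."""
    return 1.0 / (1.0 - forgetting_factor)

def ew_mean(values, forgetting_factor):
    """Exponentially weighted mean computed recursively, one value at a time."""
    weighted_sum, weight_total = 0.0, 0.0
    for v in values:
        weighted_sum = forgetting_factor * weighted_sum + v
        weight_total = forgetting_factor * weight_total + 1.0
    return weighted_sum / weight_total

for f in (0.95, 0.99, 0.995, 0.999):
    print(f"forgetting_factor={f}: half-life ~ {half_life(f):.0f} obs, "
          f"effective window ~ {effective_window(f):.0f} obs")
```

Under this interpretation, a forgetting factor of 0.995 gives a half-life of roughly 138 observations and an effective window of about 200, which is comparable to the 252-observation rolling window configured in the next cell.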

In [None]:
# Configure Online HMM with detailed explanations
print("⚙️ CONFIGURING ONLINE HMM")
print("=" * 40)

# Create configuration with explanations
config = OnlineHMMConfig(
    # Model structure
    n_states=3,              # Number of market regimes (Bull, Bear, Sideways)
    n_features=1,            # Number of features (just returns for now)
    
    # Learning parameters
    forgetting_factor=0.995,  # How quickly to forget old data (0.99-0.999)
    adaptation_rate=0.05,     # How quickly to adapt parameters (0.01-0.1)
    
    # Memory management
    window_size=252,          # Rolling window size (1 year of trading days)
    min_observations=50,      # Minimum observations before making predictions
    
    # Numerical stability
    regularization=1e-6,      # Regularization for numerical stability
    convergence_threshold=1e-4, # Convergence threshold for initialization
    max_iterations=100,       # Maximum iterations for initialization
    
    # Random seed for reproducibility
    random_state=42
)

print(f"📋 Configuration Summary:")
print(f"  Model Structure:")
print(f"    - States: {config.n_states}")
print(f"    - Features: {config.n_features}")
print(f"  Learning Parameters:")
print(f"    - Forgetting factor: {config.forgetting_factor} (half-life: {np.log(0.5)/np.log(config.forgetting_factor):.0f} days)")
print(f"    - Adaptation rate: {config.adaptation_rate}")
print(f"  Memory Management:")
print(f"    - Window size: {config.window_size} observations")
print(f"    - Minimum observations: {config.min_observations}")

print(f"\n🎯 Parameter Interpretation:")
print(f"  Forgetting Factor ({config.forgetting_factor}):")
print(f"    - High value (>0.99): Long memory, slow adaptation")
print(f"    - Low value (<0.95): Short memory, fast adaptation")
print(f"    - Current setting: Balanced for financial markets")
print(f"  Adaptation Rate ({config.adaptation_rate}):")
print(f"    - High value (>0.1): Quick parameter updates, less stable")
print(f"    - Low value (<0.01): Slow parameter updates, more stable")
print(f"    - Current setting: Conservative for regime stability")

# Initialize the Online HMM
online_hmm = OnlineHMM(config)
print(f"\n✅ Online HMM initialized and ready for training!")

## 3. Initial Training and Warm-up Period

Online HMMs need an initial training period to establish baseline parameters before they can adapt incrementally.
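
One common heuristic for choosing baseline parameters is to seed each state's emission from a quantile bucket of the warm-up returns. The sketch below illustrates that idea only. It is an assumption for illustration and not a description of what `OnlineHMM.fit` does internally. The function name `init_states_by_quantile` and the sample values are hypothetical.

```python
import statistics

def init_states_by_quantile(returns, n_states=3):
    """Sort returns, split them into n_states buckets, and use each bucket's
    mean and standard deviation as that state's starting emission."""
    ordered = sorted(returns)
    bucket_size = len(ordered) // n_states
    params = []
    for k in range(n_states):
        start = k * bucket_size
        # The last bucket absorbs any remainder
        end = (k + 1) * bucket_size if k < n_states - 1 else len(ordered)
        bucket = ordered[start:end]
        params.append((statistics.mean(bucket), statistics.pstdev(bucket)))
    return params

# Made-up warm-up returns, for illustration only
sample = [-0.031, -0.012, -0.008, -0.002, 0.000, 0.001, 0.004, 0.009, 0.015, 0.027]
for state, (mu, sigma) in enumerate(init_states_by_quantile(sample)):
    print(f"State {state}: mean={mu:+.4f}, std={sigma:.4f}")
```

Because the buckets are sorted, the states start out ordered from lowest to highest mean return, which makes the later interpretation of state indices easier. The bucket standard deviations understate the true spread, so a few iterations of batch estimation on the warm-up data would normally refine them.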

In [None]:
# Initial training phase
print("🚀 INITIAL TRAINING PHASE")
print("=" * 40)

# Use first portion of data for initial training
init_size = max(config.min_observations, 100)  # Ensure we have enough data
init_data = train_returns[:init_size].reshape(-1, 1)

print(f"Training with {len(init_data)} initial observations...")
start_time = time.time()

try:
    # Fit the model with initial data
    online_hmm.fit(init_data)
    
    training_time = time.time() - start_time
    print(f"✅ Initial training completed in {training_time:.3f} seconds")
    
    # Display initial model parameters
    print(f"\n📊 Initial Model Parameters:")
    
    # Get current parameters
    try:
        current_params = online_hmm.get_parameters()
        
        print(f"  Transition Matrix:")
        for i, row in enumerate(current_params['transition_matrix']):
            print(f"    State {i}: [{', '.join([f'{p:.3f}' for p in row])}]")
        
        print(f"  Emission Parameters:")
        for i, (mean, cov) in enumerate(zip(current_params['means'], current_params['covariances'])):
            std = np.sqrt(cov[0, 0])
            print(f"    State {i}: μ={mean[0]:.4f}, σ={std:.4f}")
            
    except Exception as e:
        print(f"  Parameter extraction not available: {e}")
    
    # Test initial prediction capability (regime estimate right after the initial fit)
    regime_info = online_hmm.get_current_regime_info()
    
    print(f"\n🎯 Initial Regime Detection Test:")
    print(f"  Current regime: {regime_info.get('most_likely_regime', 'Unknown')}")
    print(f"  Confidence: {regime_info.get('confidence', 0):.2%}")
    
    regime_probs = regime_info.get('regime_probabilities', [])
    if len(regime_probs) > 0:  # len() also works if the probabilities are a NumPy array
        for i, prob in enumerate(regime_probs):
            print(f"  State {i} probability: {prob:.3f}")

except Exception as e:
    print(f"❌ Initial training failed: {e}")
    print(f"This might be due to insufficient data or numerical issues")
    print(f"Consider adjusting configuration parameters")
    raise

## 4. Streaming Data Processing

Now let's demonstrate the core capability of Online HMMs: processing streaming data one observation at a time.
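
A typical way to update emission parameters one observation at a time is to keep exponentially discounted sufficient statistics for each state and weight every new observation by the probability that the state generated it. The class below is a minimal sketch of that idea under stated assumptions. `OnlineGaussianState`, `prior_weight`, and the update rule are hypothetical and may differ from what `OnlineHMM.update` does.

```python
class OnlineGaussianState:
    """Running emission estimate for one state, using exponentially
    discounted sufficient statistics (weight, sum, sum of squares)."""

    def __init__(self, mean, var, forgetting_factor=0.995, prior_weight=10.0):
        self.f = forgetting_factor
        # Seed the statistics as if prior_weight observations had already been seen
        self.w = prior_weight
        self.s1 = prior_weight * mean
        self.s2 = prior_weight * (var + mean * mean)

    def update(self, x, responsibility):
        """Discount old statistics, then add x weighted by P(state | data)."""
        self.w = self.f * self.w + responsibility
        self.s1 = self.f * self.s1 + responsibility * x
        self.s2 = self.f * self.s2 + responsibility * x * x

    @property
    def mean(self):
        return self.s1 / self.w

    @property
    def var(self):
        # Floor the variance to avoid degenerate (zero-width) emissions
        return max(self.s2 / self.w - self.mean ** 2, 1e-12)


state = OnlineGaussianState(mean=0.0, var=0.0001)
for r in [0.010, 0.012, 0.009, 0.011]:
    state.update(r, responsibility=1.0)
print(f"mean={state.mean:.5f}, std={state.var ** 0.5:.5f}")
```

An observation with zero responsibility leaves the mean and variance unchanged, because all three statistics are discounted by the same factor. Only states that plausibly generated the observation move toward it.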

In [None]:
# Simulate real-time streaming data processing
print("🌊 STREAMING DATA PROCESSING SIMULATION")
print("=" * 50)

# Prepare streaming simulation
streaming_start = init_size
streaming_data = train_returns[streaming_start:]
streaming_dates = data.index[streaming_start:streaming_start + len(streaming_data)]

print(f"Simulating streaming processing of {len(streaming_data)} observations...")
print(f"Processing period: {streaming_dates[0].date()} to {streaming_dates[-1].date()}")

# Storage for streaming results
streaming_results = []
processing_times = []
parameter_evolution = {'means': [], 'covariances': [], 'transitions': []}

# Process each observation
for i, (new_return, date) in enumerate(zip(streaming_data[:100], streaming_dates[:100])):
    start_time = time.time()
    
    try:
        # Update model with new observation
        online_hmm.update(new_return)
        
        # Get current regime information
        regime_info = online_hmm.get_current_regime_info()
        
        # Record processing time
        processing_time = time.time() - start_time
        processing_times.append(processing_time)
        
        # Store results
        result = {
            'date': date,
            'return': new_return,
            'regime': regime_info.get('most_likely_regime', -1),
            'confidence': regime_info.get('confidence', 0),
            'processing_time_ms': processing_time * 1000,
            'regime_probs': regime_info.get('regime_probabilities', [0, 0, 0])
        }
        streaming_results.append(result)
        
        # Store parameter evolution (every 10 observations to save memory)
        if i % 10 == 0:
            try:
                params = online_hmm.get_parameters()
                parameter_evolution['means'].append(params['means'].copy())
                parameter_evolution['covariances'].append(params['covariances'].copy())
                parameter_evolution['transitions'].append(params['transition_matrix'].copy())
            except Exception:
                pass  # Parameters might not be available; skip this snapshot
        
        # Print progress every 20 observations
        if (i + 1) % 20 == 0:
            avg_time = np.mean(processing_times[-20:]) * 1000
            regime = regime_info.get('most_likely_regime', 'Unknown')
            confidence = regime_info.get('confidence', 0)
            print(f"  Processed {i+1:3d} obs | {date.strftime('%Y-%m-%d')} | "
                  f"Regime: {regime} ({confidence:.1%}) | Avg: {avg_time:.2f}ms")
    
    except Exception as e:
        print(f"❌ Error processing observation {i}: {e}")
        break

# Performance summary
if processing_times:
    print(f"\n📊 STREAMING PERFORMANCE SUMMARY:")
    print(f"  Total observations processed: {len(processing_times)}")
    print(f"  Average processing time: {np.mean(processing_times)*1000:.2f} ms")
    print(f"  Median processing time: {np.median(processing_times)*1000:.2f} ms")
    print(f"  95th percentile: {np.percentile(processing_times, 95)*1000:.2f} ms")
    print(f"  Maximum processing time: {np.max(processing_times)*1000:.2f} ms")
    
    # Throughput calculation
    observations_per_second = 1 / np.mean(processing_times)
    print(f"  Theoretical throughput: {observations_per_second:.0f} obs/sec")
    
    if observations_per_second > 1000:
        print(f"  ✅ Excellent performance for real-time trading!")
    elif observations_per_second > 100:
        print(f"  ✅ Good performance for most applications")
    else:
        print(f"  ⚠️ May need optimization for high-frequency applications")

print(f"\n✅ Streaming simulation completed successfully!")

## 5. Analyzing Streaming Results

Let's analyze the results from our streaming simulation to understand how the Online HMM adapts over time.
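
The regime-switch count and average regime duration reported in the next cell are both functions of the runs of identical labels in the regime sequence. The sketch below shows the same calculation on a small made-up label sequence, using run-length encoding. The helper `regime_runs` is hypothetical.

```python
from itertools import groupby

def regime_runs(labels):
    """Collapse a label sequence into (label, run_length) pairs."""
    return [(label, sum(1 for _ in group)) for label, group in groupby(labels)]

# Made-up regime labels, for illustration only
labels = [2, 2, 2, 1, 1, 0, 0, 0, 0, 2]
runs = regime_runs(labels)

switches = len(runs) - 1               # every new run after the first is a switch
avg_duration = len(labels) / len(runs)  # observations per run

print(runs)
print(f"switches={switches}, average duration={avg_duration:.1f}")
```

The run lengths also give the full distribution of regime durations, which is more informative than the average when a few long regimes dominate the sample.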

In [None]:
# Convert results to DataFrame for analysis
if streaming_results:
    results_df = pd.DataFrame(streaming_results)
    results_df.set_index('date', inplace=True)
    
    print(f"📈 STREAMING RESULTS ANALYSIS")
    print(f"=" * 40)
    
    # Basic statistics
    print(f"Analysis period: {len(results_df)} observations")
    print(f"Date range: {results_df.index[0].date()} to {results_df.index[-1].date()}")
    
    # Regime distribution
    regime_counts = results_df['regime'].value_counts().sort_index()
    print(f"\nRegime Distribution:")
    for regime, count in regime_counts.items():
        percentage = count / len(results_df) * 100
        print(f"  State {regime}: {count} days ({percentage:.1f}%)")
    
    # Confidence statistics
    avg_confidence = results_df['confidence'].mean()
    high_confidence_pct = (results_df['confidence'] > 0.7).mean()
    low_confidence_pct = (results_df['confidence'] < 0.5).mean()
    
    print(f"\nConfidence Statistics:")
    print(f"  Average confidence: {avg_confidence:.1%}")
    print(f"  High confidence days (>70%): {high_confidence_pct:.1%}")
    print(f"  Low confidence days (<50%): {low_confidence_pct:.1%}")
    
    # Performance statistics
    avg_processing_time = results_df['processing_time_ms'].mean()
    max_processing_time = results_df['processing_time_ms'].max()
    
    print(f"\nProcessing Performance:")
    print(f"  Average processing time: {avg_processing_time:.2f} ms")
    print(f"  Maximum processing time: {max_processing_time:.2f} ms")
    
    # Check for regime switches
    regime_switches = (results_df['regime'] != results_df['regime'].shift(1)).sum() - 1
    print(f"\nRegime Dynamics:")
    print(f"  Total regime switches: {regime_switches}")
    print(f"  Average regime duration: {len(results_df) / (regime_switches + 1):.1f} days")
    
else:
    print("❌ No streaming results to analyze")
    results_df = None

In [None]:
# Visualize streaming results
if results_df is not None and len(results_df) > 0:
    fig, axes = plt.subplots(4, 1, figsize=(16, 16))
    fig.suptitle('Online HMM Streaming Analysis Results', fontsize=16, fontweight='bold')
    
    # Plot 1: Returns colored by regime
    colors = ['red', 'orange', 'green']
    for regime in results_df['regime'].unique():
        if regime >= 0:  # Valid regime
            mask = results_df['regime'] == regime
            if mask.any():
                axes[0].scatter(results_df.index[mask], results_df['return'][mask], 
                               c=colors[int(regime) % len(colors)], alpha=0.6, s=20, 
                               label=f'State {int(regime)}')
    
    axes[0].axhline(y=0, color='black', linestyle='--', alpha=0.5)
    axes[0].set_title('Returns Colored by Detected Regime')
    axes[0].set_ylabel('Daily Return')
    axes[0].legend()
    axes[0].grid(True, alpha=0.3)
    
    # Plot 2: Regime probabilities over time
    if len(results_df['regime_probs'].iloc[0]) >= 3:
        # Extract probability arrays
        prob_array = np.array(results_df['regime_probs'].tolist())
        
        # Stack probabilities
        axes[1].fill_between(results_df.index, 0, prob_array[:, 0], 
                            color='red', alpha=0.7, label='State 0')
        axes[1].fill_between(results_df.index, prob_array[:, 0], 
                            prob_array[:, 0] + prob_array[:, 1], 
                            color='orange', alpha=0.7, label='State 1')
        axes[1].fill_between(results_df.index, prob_array[:, 0] + prob_array[:, 1], 1,
                            color='green', alpha=0.7, label='State 2')
        
        axes[1].set_title('Regime Probabilities Over Time')
        axes[1].set_ylabel('Probability')
        axes[1].set_ylim(0, 1)
        axes[1].legend()
        axes[1].grid(True, alpha=0.3)
    
    # Plot 3: Confidence over time
    axes[2].plot(results_df.index, results_df['confidence'], 'navy', linewidth=1, alpha=0.8)
    axes[2].axhline(y=0.7, color='green', linestyle='--', alpha=0.7, label='High Confidence')
    axes[2].axhline(y=0.5, color='orange', linestyle='--', alpha=0.7, label='Low Confidence')
    axes[2].fill_between(results_df.index, 0.7, 1, alpha=0.2, color='green')
    axes[2].fill_between(results_df.index, 0, 0.5, alpha=0.2, color='red')
    
    axes[2].set_title('Classification Confidence Over Time')
    axes[2].set_ylabel('Confidence')
    axes[2].set_ylim(0, 1)
    axes[2].legend()
    axes[2].grid(True, alpha=0.3)
    
    # Plot 4: Processing time performance
    axes[3].plot(results_df.index, results_df['processing_time_ms'], 'purple', alpha=0.7, linewidth=1)
    axes[3].axhline(y=results_df['processing_time_ms'].mean(), color='red', 
                   linestyle='-', alpha=0.8, label=f'Average: {results_df["processing_time_ms"].mean():.2f}ms')
    axes[3].axhline(y=results_df['processing_time_ms'].quantile(0.95), color='orange', 
                   linestyle='--', alpha=0.8, label=f'95th percentile: {results_df["processing_time_ms"].quantile(0.95):.2f}ms')
    
    axes[3].set_title('Processing Time Performance')
    axes[3].set_ylabel('Processing Time (ms)')
    axes[3].set_xlabel('Date')
    axes[3].legend()
    axes[3].grid(True, alpha=0.3)
    
    plt.tight_layout()
    plt.show()
    
    # Additional analysis: regime transitions
    print(f"\n🔄 REGIME TRANSITION ANALYSIS:")
    print(f"=" * 40)
    
    # Find regime switches
    regime_changes = results_df[results_df['regime'] != results_df['regime'].shift(1)].copy()
    regime_changes = regime_changes.iloc[1:]  # Remove first observation
    
    if len(regime_changes) > 0:
        print(f"Total regime switches detected: {len(regime_changes)}")
        print(f"\nMajor regime transitions:")
        
        for i, (date, row) in enumerate(regime_changes.head(10).iterrows()):
            prev_regime = results_df.loc[:date].iloc[-2]['regime']
            new_regime = row['regime']
            confidence = row['confidence']
            return_val = row['return']
            
            print(f"  {date.strftime('%Y-%m-%d')}: State {int(prev_regime)} → State {int(new_regime)} "
                  f"(conf: {confidence:.1%}, return: {return_val:.3f})")
    else:
        print("No regime switches detected in this period")

else:
    print("❌ No results to visualize")

## 6. Parameter Evolution Analysis

One of the key advantages of Online HMMs is their ability to adapt parameters over time. Let's examine how model parameters evolve during streaming.
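
When reading the diagonal of the transition matrix, it helps to convert each persistence probability into an expected regime length. In a Markov chain, the length of a stay in a state is geometrically distributed, so its mean is 1 / (1 - p). The sketch below applies that formula to a made-up transition matrix. The function name and the matrix values are hypothetical.

```python
def expected_duration(persistence_prob):
    """Mean length of a stay in a state whose self-transition probability is p.
    Stay lengths are geometric, so the mean is 1 / (1 - p)."""
    if not 0.0 <= persistence_prob < 1.0:
        raise ValueError("persistence probability must be in [0, 1)")
    return 1.0 / (1.0 - persistence_prob)

# Made-up transition matrix, for illustration only (rows sum to 1)
transition_matrix = [
    [0.90, 0.08, 0.02],
    [0.05, 0.90, 0.05],
    [0.01, 0.04, 0.95],
]
for state, row in enumerate(transition_matrix):
    print(f"State {state}: P(stay)={row[state]:.2f}, "
          f"expected duration ~ {expected_duration(row[state]):.0f} observations")
```

Small changes near 1 matter a great deal: moving a persistence probability from 0.90 to 0.95 doubles the expected regime length from about 10 to about 20 observations.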

In [None]:
# Analyze parameter evolution
print(f"🔬 PARAMETER EVOLUTION ANALYSIS")
print(f"=" * 40)

if parameter_evolution['means'] and len(parameter_evolution['means']) > 1:
    print(f"Parameter snapshots collected: {len(parameter_evolution['means'])}")
    
    # Extract evolution data
    means_evolution = np.array(parameter_evolution['means'])
    covariances_evolution = np.array(parameter_evolution['covariances'])
    transitions_evolution = np.array(parameter_evolution['transitions'])
    
    print(f"Shape of means evolution: {means_evolution.shape}")
    print(f"Shape of covariances evolution: {covariances_evolution.shape}")
    print(f"Shape of transitions evolution: {transitions_evolution.shape}")
    
    # Analyze parameter stability
    print(f"\n📊 Parameter Stability Analysis:")
    
    # Mean parameters evolution
    for state in range(means_evolution.shape[1]):
        state_means = means_evolution[:, state, 0]  # Extract mean values
        initial_mean = state_means[0]
        final_mean = state_means[-1]
        mean_change = final_mean - initial_mean
        mean_volatility = np.std(state_means)
        
        print(f"  State {state} Mean Evolution:")
        print(f"    Initial: {initial_mean:.5f}")
        print(f"    Final: {final_mean:.5f}")
        print(f"    Change: {mean_change:.5f}")
        print(f"    Volatility: {mean_volatility:.5f}")
    
    # Variance parameters evolution
    print(f"\n  Variance Evolution:")
    for state in range(covariances_evolution.shape[1]):
        state_vars = covariances_evolution[:, state, 0, 0]  # Extract variance values
        state_stds = np.sqrt(state_vars)
        initial_std = state_stds[0]
        final_std = state_stds[-1]
        std_change = final_std - initial_std
        
        print(f"    State {state} Std Evolution:")
        print(f"      Initial: {initial_std:.5f}")
        print(f"      Final: {final_std:.5f}")
        print(f"      Change: {std_change:.5f}")
    
    # Visualize parameter evolution
    fig, axes = plt.subplots(2, 2, figsize=(15, 12))
    fig.suptitle('Online HMM Parameter Evolution', fontsize=16, fontweight='bold')
    
    # Plot mean evolution
    for state in range(means_evolution.shape[1]):
        axes[0, 0].plot(means_evolution[:, state, 0], 
                       label=f'State {state}', linewidth=2, marker='o', markersize=3)
    
    axes[0, 0].set_title('Mean Parameters Evolution')
    axes[0, 0].set_xlabel('Update Step')
    axes[0, 0].set_ylabel('Mean Return')
    axes[0, 0].legend()
    axes[0, 0].grid(True, alpha=0.3)
    axes[0, 0].axhline(y=0, color='black', linestyle='--', alpha=0.5)
    
    # Plot standard deviation evolution
    for state in range(covariances_evolution.shape[1]):
        state_stds = np.sqrt(covariances_evolution[:, state, 0, 0])
        axes[0, 1].plot(state_stds, 
                       label=f'State {state}', linewidth=2, marker='o', markersize=3)
    
    axes[0, 1].set_title('Standard Deviation Evolution')
    axes[0, 1].set_xlabel('Update Step')
    axes[0, 1].set_ylabel('Standard Deviation')
    axes[0, 1].legend()
    axes[0, 1].grid(True, alpha=0.3)
    
    # Plot transition probabilities evolution (diagonal elements)
    for state in range(transitions_evolution.shape[1]):
        persistence_probs = transitions_evolution[:, state, state]
        axes[1, 0].plot(persistence_probs, 
                       label=f'State {state}', linewidth=2, marker='o', markersize=3)
    
    axes[1, 0].set_title('State Persistence Probabilities')
    axes[1, 0].set_xlabel('Update Step')
    axes[1, 0].set_ylabel('P(stay in state)')
    axes[1, 0].legend()
    axes[1, 0].grid(True, alpha=0.3)
    axes[1, 0].set_ylim(0, 1)
    
    # Plot parameter change magnitude over time
    if len(means_evolution) > 1:
        mean_changes = []
        for i in range(1, len(means_evolution)):
            change_magnitude = np.linalg.norm(means_evolution[i] - means_evolution[i-1])
            mean_changes.append(change_magnitude)
        
        axes[1, 1].plot(mean_changes, 'purple', linewidth=2, marker='o', markersize=3)
        axes[1, 1].set_title('Parameter Change Magnitude')
        axes[1, 1].set_xlabel('Update Step')
        axes[1, 1].set_ylabel('Change Magnitude')
        axes[1, 1].grid(True, alpha=0.3)
        
        # Add trend line
        if len(mean_changes) > 2:
            z = np.polyfit(range(len(mean_changes)), mean_changes, 1)
            p = np.poly1d(z)
            axes[1, 1].plot(range(len(mean_changes)), p(range(len(mean_changes))), 
                           "r--", alpha=0.8, label=f'Trend (slope: {z[0]:.6f})')
            axes[1, 1].legend()
    
    plt.tight_layout()
    plt.show()
    
    # Stability assessment
    if len(mean_changes) > 5:
        recent_stability = np.mean(mean_changes[-5:])
        initial_stability = np.mean(mean_changes[:5])
        
        print(f"\n📈 Adaptation Assessment:")
        print(f"  Initial adaptation rate: {initial_stability:.6f}")
        print(f"  Recent adaptation rate: {recent_stability:.6f}")
        
        if recent_stability < initial_stability * 0.5:
            print(f"  ✅ Parameters are stabilizing over time")
        elif recent_stability > initial_stability * 2:
            print(f"  ⚠️ Parameters are becoming more volatile")
        else:
            print(f"  ✅ Parameters show consistent adaptation")

else:
    print("❌ Insufficient parameter evolution data collected")
    print("This could be due to:")
    print("  • Short streaming period")
    print("  • Parameter extraction issues")
    print("  • Model initialization problems")

## 7. Comparing Online vs Batch HMM Performance

Let's compare the Online HMM with a traditional Batch HMM to understand the trade-offs.
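
One caveat applies when comparing state sequences from two separately trained models: state indices in an HMM are arbitrary, so state 0 in one model need not correspond to state 0 in the other. A raw agreement rate can therefore understate how similar the two segmentations are. The sketch below is one simple way to align labels before measuring agreement, by trying every permutation of the state indices. `align_labels` and the label sequences are hypothetical and are not part of the notebook's models.

```python
from itertools import permutations

def align_labels(reference, candidate, n_states=3):
    """Find the relabeling of `candidate` that maximizes agreement with
    `reference` by trying every permutation (fine for a handful of states)."""
    best_perm, best_agreement = None, -1.0
    for perm in permutations(range(n_states)):
        relabeled = [perm[c] for c in candidate]
        agreement = sum(r == c for r, c in zip(reference, relabeled)) / len(reference)
        if agreement > best_agreement:
            best_perm, best_agreement = perm, agreement
    return best_perm, best_agreement

# Made-up labels: identical segmentation, different state numbering
batch_labels = [0, 0, 1, 1, 2, 2, 2, 0]
online_labels = [2, 2, 0, 0, 1, 1, 1, 2]

raw = sum(b == o for b, o in zip(batch_labels, online_labels)) / len(batch_labels)
perm, aligned = align_labels(batch_labels, online_labels)
print(f"raw agreement={raw:.0%}, aligned agreement={aligned:.0%}, mapping={perm}")
```

In this example the raw agreement is 0% even though the two sequences describe exactly the same regimes. An alternative that avoids the search is to sort both models' states by their estimated mean return and compare the sorted indices.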

In [None]:
# Compare Online vs Batch HMM
print(f"⚖️ ONLINE vs BATCH HMM COMPARISON")
print(f"=" * 50)

# Use test data for comparison
comparison_data = test_returns[:100]  # Limit for speed
comparison_dates = data.index[split_point:split_point + len(comparison_data)]

print(f"Comparison dataset: {len(comparison_data)} observations")
print(f"Period: {comparison_dates[0].date()} to {comparison_dates[-1].date()}")

# Initialize Batch HMM
print(f"\n🔄 Training Batch HMM...")
batch_start_time = time.time()

try:
    batch_hmm = HiddenMarkovModel(n_states=3, random_state=42)
    
    # Train on all available data (train + comparison)
    all_training_data = np.concatenate([train_returns, comparison_data]).reshape(-1, 1)
    batch_hmm.fit(all_training_data)
    
    batch_training_time = time.time() - batch_start_time
    print(f"✅ Batch HMM training completed in {batch_training_time:.3f} seconds")
    
    # Get batch predictions
    batch_states = batch_hmm.predict(comparison_data.reshape(-1, 1))
    batch_probs = batch_hmm.predict_proba(comparison_data.reshape(-1, 1))
    
    print(f"✅ Batch HMM predictions completed")
    
except Exception as e:
    print(f"❌ Batch HMM training failed: {e}")
    batch_hmm = None
    batch_states = None
    batch_probs = None

# Process same data with Online HMM (continue from previous state)
print(f"\n🌊 Processing with Online HMM...")
online_start_time = time.time()

online_states = []
online_probs = []
online_processing_times = []

for new_return in comparison_data:
    obs_start_time = time.time()
    
    try:
        online_hmm.update(new_return)
        regime_info = online_hmm.get_current_regime_info()
        
        online_states.append(regime_info.get('most_likely_regime', -1))
        online_probs.append(regime_info.get('regime_probabilities', [0, 0, 0]))
        
        obs_time = time.time() - obs_start_time
        online_processing_times.append(obs_time)
        
    except Exception as e:
        print(f"Error in online processing: {e}")
        online_states.append(-1)
        online_probs.append([0, 0, 0])
        online_processing_times.append(0)

online_total_time = time.time() - online_start_time
print(f"✅ Online HMM processing completed in {online_total_time:.3f} seconds")

# Performance comparison
if batch_states is not None and online_states:
    print(f"\n📊 PERFORMANCE COMPARISON:")
    print(f"=" * 30)
    
    print(f"Training/Processing Time:")
    print(f"  Batch HMM: {batch_training_time:.3f} seconds (one-time training)")
    print(f"  Online HMM: {online_total_time:.3f} seconds (incremental processing)")
    print(f"  Speed ratio: {batch_training_time / online_total_time:.1f}x")
    
    print(f"\nPer-observation Processing:")
    avg_online_time = np.mean(online_processing_times) * 1000
    batch_per_obs = (batch_training_time / len(comparison_data)) * 1000
    print(f"  Batch HMM: {batch_per_obs:.2f} ms per observation (amortized)")
    print(f"  Online HMM: {avg_online_time:.2f} ms per observation")
    
    # Agreement analysis
    online_states_array = np.array(online_states)
    valid_mask = (online_states_array >= 0) & (batch_states >= 0)
    
    if valid_mask.sum() > 0:
        agreement = (online_states_array[valid_mask] == batch_states[valid_mask]).mean()
        print(f"\nRegime Classification Agreement:")
        print(f"  Agreement rate: {agreement:.1%}")
        
        if agreement > 0.8:
            print(f"  ✅ High agreement - Online HMM is consistent")
        elif agreement > 0.6:
            print(f"  ⚠️ Moderate agreement - Some differences expected")
        else:
            print(f"  ❌ Low agreement - Check model configuration")
        
        # Confusion matrix
        from sklearn.metrics import confusion_matrix
        cm = confusion_matrix(batch_states[valid_mask], online_states_array[valid_mask])
        print(f"\nConfusion Matrix (Batch vs Online):")
        print(f"     Online→ {' '.join([f'{i:3d}' for i in range(cm.shape[1])])}")
        for i, row in enumerate(cm):
            print(f"Batch {i}: [{' '.join([f'{val:3d}' for val in row])}]")

# Memory usage comparison
print(f"\n💾 MEMORY USAGE COMPARISON:")
print(f"=" * 30)
# Count observations directly so this summary still runs if batch training failed above
n_batch_obs = len(train_returns) + len(comparison_data)
print(f"Batch HMM:")
print(f"  • Stores entire dataset: {n_batch_obs * 8} bytes")
print(f"  • Model parameters: ~{3*3 + 3*2} parameters")
print(f"  • Total memory: High (scales with data size)")
print(f"\nOnline HMM:")
print(f"  • Rolling window: {config.window_size * 8} bytes")
print(f"  • Sufficient statistics: Fixed size")
print(f"  • Total memory: Low (constant size)")

print(f"\n✅ Comparison completed!")

In [None]:
# Visualize the comparison
if batch_states is not None and online_states and len(online_states) > 0:
    # Create comparison DataFrame
    comparison_df = pd.DataFrame({
        'date': comparison_dates[:len(online_states)],
        'return': comparison_data[:len(online_states)],
        'batch_regime': batch_states[:len(online_states)],
        'online_regime': online_states
    })
    comparison_df.set_index('date', inplace=True)
    
    # Visualization
    fig, axes = plt.subplots(3, 1, figsize=(16, 12))
    fig.suptitle('Online vs Batch HMM Comparison', fontsize=16, fontweight='bold')
    
    colors = ['red', 'orange', 'green']
    
    # Plot 1: Batch HMM results
    for regime in np.unique(comparison_df['batch_regime']):
        if regime >= 0:
            mask = comparison_df['batch_regime'] == regime
            if mask.any():
                axes[0].scatter(comparison_df.index[mask], comparison_df['return'][mask], 
                               c=colors[int(regime) % len(colors)], alpha=0.6, s=20, 
                               label=f'State {int(regime)}')
    
    axes[0].axhline(y=0, color='black', linestyle='--', alpha=0.5)
    axes[0].set_title('Batch HMM: Returns Colored by Regime')
    axes[0].set_ylabel('Daily Return')
    axes[0].legend()
    axes[0].grid(True, alpha=0.3)
    
    # Plot 2: Online HMM results
    for regime in np.unique(comparison_df['online_regime']):
        if regime >= 0:
            mask = comparison_df['online_regime'] == regime
            if mask.any():
                axes[1].scatter(comparison_df.index[mask], comparison_df['return'][mask], 
                               c=colors[int(regime) % len(colors)], alpha=0.6, s=20, 
                               label=f'State {int(regime)}')
    
    axes[1].axhline(y=0, color='black', linestyle='--', alpha=0.5)
    axes[1].set_title('Online HMM: Returns Colored by Regime')
    axes[1].set_ylabel('Daily Return')
    axes[1].legend()
    axes[1].grid(True, alpha=0.3)
    
    # Plot 3: Agreement analysis
    agreement_mask = comparison_df['batch_regime'] == comparison_df['online_regime']
    disagreement_mask = ~agreement_mask
    
    if agreement_mask.any():
        axes[2].scatter(comparison_df.index[agreement_mask], comparison_df['return'][agreement_mask], 
                       c='green', alpha=0.7, s=20, label='Agreement')
    
    if disagreement_mask.any():
        axes[2].scatter(comparison_df.index[disagreement_mask], comparison_df['return'][disagreement_mask], 
                       c='red', alpha=0.7, s=30, marker='x', label='Disagreement')
    
    axes[2].axhline(y=0, color='black', linestyle='--', alpha=0.5)
    axes[2].set_title('Model Agreement Analysis')
    axes[2].set_ylabel('Daily Return')
    axes[2].set_xlabel('Date')
    axes[2].legend()
    axes[2].grid(True, alpha=0.3)
    
    plt.tight_layout()
    plt.show()
    
    # Summary statistics
    agreement_rate = agreement_mask.mean()
    print(f"\n📊 VISUAL COMPARISON SUMMARY:")
    print(f"Total observations: {len(comparison_df)}")
    print(f"Agreement rate: {agreement_rate:.1%}")
    print(f"Disagreement points: {disagreement_mask.sum()}")
    
    if disagreement_mask.any():
        print(f"\nDisagreement analysis:")
        disagreement_returns = comparison_df.loc[disagreement_mask, 'return']
        print(f"  Average return on disagreement days: {disagreement_returns.mean():.4f}")
        print(f"  Volatility on disagreement days: {disagreement_returns.std():.4f}")
        print(f"  This suggests disagreements occur during: {'high volatility' if disagreement_returns.std() > comparison_df['return'].std() else 'normal'} periods")

else:
    print("❌ Cannot create comparison visualization - insufficient data")
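
A note on the agreement rate above: HMM state indices are arbitrary, so the batch and online models may assign different numbers to the same underlying regime, and a raw label comparison can then understate agreement. The following is a minimal sketch, not part of the `hidden_regime` API, that searches all label permutations for the best match. The helper name `best_aligned_agreement` is hypothetical, and it assumes both label arrays contain only integers in `0..n_states-1` (filter out any `-1` placeholders first).

```python
from itertools import permutations

import numpy as np


def best_aligned_agreement(labels_a, labels_b, n_states):
    """Return (agreement_rate, permutation) for the relabeling of labels_b
    that best matches labels_a. Hypothetical helper for illustration."""
    a = np.asarray(labels_a)
    b = np.asarray(labels_b)
    best_rate, best_perm = -1.0, None
    for perm in permutations(range(n_states)):
        mapping = np.array(perm)          # mapping[k] = new label for state k of b
        rate = float(np.mean(mapping[b] == a))
        if rate > best_rate:
            best_rate, best_perm = rate, perm
    return best_rate, best_perm


# Toy example: identical regime paths, but states 0 and 2 are swapped
batch_labels = np.array([0, 0, 1, 1, 2, 2])
online_labels = np.array([2, 2, 1, 1, 0, 0])

raw_rate = float(np.mean(batch_labels == online_labels))
aligned_rate, perm = best_aligned_agreement(batch_labels, online_labels, n_states=3)
print(f"Raw agreement: {raw_rate:.1%}, aligned agreement: {aligned_rate:.1%}, mapping: {perm}")
```

With three states there are only six permutations, so the exhaustive search is cheap. For many states an assignment solver would be a better fit.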

## 8. Real-Time Trading Application

Let's build a simple real-time trading system using our Online HMM for regime-based position sizing.

In [None]:
# Real-time trading system simulation
print(f"💼 REAL-TIME TRADING SYSTEM SIMULATION")
print(f"=" * 50)

class OnlineHMMTradingSystem:
    """
    Real-time trading system using Online HMM for regime detection
    """
    
    def __init__(self, online_hmm, initial_capital=100000):
        self.hmm = online_hmm
        self.initial_capital = initial_capital
        self.current_capital = initial_capital
        self.current_position = 0.0  # Current position (-1 to 1)
        self.trade_history = []
        self.portfolio_history = []
        
        # Trading parameters
        self.regime_positions = {
            0: -0.5,   # Bear market: Short position
            1: 0.0,    # Sideways: Neutral
            2: 1.0     # Bull market: Long position
        }
        
        self.confidence_threshold = 0.6  # Minimum confidence for trading
        self.max_position_size = 1.0     # Maximum position size
        
    def get_target_position(self, regime_info, price):
        """
        Calculate target position based on regime detection
        """
        regime = regime_info.get('most_likely_regime', 1)
        confidence = regime_info.get('confidence', 0)
        
        # Base position from regime
        base_position = self.regime_positions.get(regime, 0.0)
        
        # Scale by confidence
        if confidence < self.confidence_threshold:
            # Reduce position when uncertain
            position_scale = confidence / self.confidence_threshold * 0.5
        else:
            position_scale = confidence
        
        target_position = base_position * position_scale * self.max_position_size
        
        return target_position
    
    def execute_trade(self, date, price, return_val):
        """
        Execute trading decision based on current market state
        """
        # Update HMM with new return
        self.hmm.update(return_val)
        
        # Get regime information
        regime_info = self.hmm.get_current_regime_info()
        
        # Calculate target position
        target_position = self.get_target_position(regime_info, price)
        
        # Today's return accrues to the position held coming into the day.
        # Applying the newly chosen position to the same return that informed
        # the regime update would introduce look-ahead bias.
        # Simplified: assume we can hold fractional positions
        portfolio_return = self.current_position * return_val
        self.current_capital *= (1 + portfolio_return)
        
        # Execute trade if position change is significant
        position_change = target_position - self.current_position
        
        if abs(position_change) > 0.05:  # 5% threshold for trading
            trade = {
                'date': date,
                'price': price,
                'return': return_val,
                'old_position': self.current_position,
                'new_position': target_position,
                'position_change': position_change,
                'regime': regime_info.get('most_likely_regime', -1),
                'confidence': regime_info.get('confidence', 0),
                'capital_before': self.current_capital
            }
            
            # The new position takes effect from the next observation onward
            self.current_position = target_position
            self.trade_history.append(trade)
        
        # Record portfolio state
        portfolio_state = {
            'date': date,
            'capital': self.current_capital,
            'position': self.current_position,
            'regime': regime_info.get('most_likely_regime', -1),
            'confidence': regime_info.get('confidence', 0),
            'daily_return': portfolio_return,
            'regime_probs': regime_info.get('regime_probabilities', [0, 0, 0])
        }
        
        self.portfolio_history.append(portfolio_state)
        
        return portfolio_state
    
    def get_performance_summary(self):
        """
        Calculate performance metrics
        """
        if not self.portfolio_history:
            return {}
        
        portfolio_df = pd.DataFrame(self.portfolio_history)
        
        total_return = (self.current_capital / self.initial_capital) - 1
        daily_returns = portfolio_df['daily_return'].dropna()
        
        daily_std = daily_returns.std()
        if len(daily_returns) > 1 and daily_std > 0:
            sharpe_ratio = daily_returns.mean() / daily_std * np.sqrt(252)
        else:
            # Sharpe ratio is undefined when returns show no variation
            sharpe_ratio = 0
        
        max_drawdown = self.calculate_max_drawdown(portfolio_df['capital'])
        
        return {
            'total_return': total_return,
            'final_capital': self.current_capital,
            'sharpe_ratio': sharpe_ratio,
            'max_drawdown': max_drawdown,
            'total_trades': len(self.trade_history),
            'trading_days': len(self.portfolio_history)
        }
    
    def calculate_max_drawdown(self, capital_series):
        """
        Calculate maximum drawdown
        """
        if len(capital_series) == 0:
            return 0
        
        running_max = capital_series.expanding().max()
        drawdown = (capital_series - running_max) / running_max
        return drawdown.min()

# Initialize trading system
trading_system = OnlineHMMTradingSystem(online_hmm, initial_capital=100000)

print(f"✅ Trading system initialized with $100,000 capital")
print(f"Strategy parameters:")
print(f"  Bear regime position: {trading_system.regime_positions[0]:.1%}")
print(f"  Sideways regime position: {trading_system.regime_positions[1]:.1%}")
print(f"  Bull regime position: {trading_system.regime_positions[2]:.1%}")
print(f"  Confidence threshold: {trading_system.confidence_threshold:.1%}")
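
To make the confidence scaling in `get_target_position` concrete, here is a standalone sketch of the same rule evaluated at a few confidence levels. The free function `position_scale` is a hypothetical extraction for illustration. The threshold and regime positions are the values configured above.

```python
def position_scale(confidence, threshold=0.6):
    """Scale factor applied to the base regime position (mirrors the class logic)."""
    if confidence < threshold:
        # Below the threshold the scale ramps linearly from 0 toward 0.5
        return confidence / threshold * 0.5
    # At or above the threshold the scale equals the confidence itself
    return confidence


regime_positions = {0: -0.5, 1: 0.0, 2: 1.0}

for conf in (0.30, 0.59, 0.60, 0.90):
    bull_target = regime_positions[2] * position_scale(conf)
    bear_target = regime_positions[0] * position_scale(conf)
    print(f"confidence={conf:.2f} | bull target={bull_target:+.3f} | bear target={bear_target:+.3f}")
```

Note that the scale jumps from just under 0.5 to 0.6 as confidence crosses the threshold. For the bull regime that jump exceeds the 5% trade threshold used in `execute_trade`, so small confidence fluctuations around 0.6 can trigger trades on their own.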

In [None]:
# Run trading simulation
print(f"\n🔄 RUNNING TRADING SIMULATION...")
print(f"=" * 40)

# Use test data for trading simulation
trading_data = test_returns[:50]  # Limit for demonstration
trading_dates = data.index[split_point:split_point + len(trading_data)]
trading_prices = data['Adj Close'].iloc[split_point:split_point + len(trading_data)]

print(f"Trading simulation period: {len(trading_data)} days")
print(f"From {trading_dates[0].date()} to {trading_dates[-1].date()}")

# Run simulation
simulation_start_time = time.time()

for i, (date, price, return_val) in enumerate(zip(trading_dates, trading_prices, trading_data)):
    try:
        portfolio_state = trading_system.execute_trade(date, price, return_val)
        
        # Print progress every 10 days
        if (i + 1) % 10 == 0:
            regime = portfolio_state['regime']
            confidence = portfolio_state['confidence']
            capital = portfolio_state['capital']
            position = portfolio_state['position']
            
            print(f"  Day {i+1:2d} | {date.strftime('%Y-%m-%d')} | "
                  f"Capital: ${capital:8,.0f} | Position: {position:6.1%} | "
                  f"Regime: {regime} ({confidence:.1%})")
    
    except Exception as e:
        print(f"❌ Error on day {i+1}: {e}")
        break

simulation_time = time.time() - simulation_start_time
print(f"\n✅ Trading simulation completed in {simulation_time:.3f} seconds")

# Performance summary
performance = trading_system.get_performance_summary()

print(f"\n📊 TRADING PERFORMANCE SUMMARY:")
print(f"=" * 40)
print(f"Initial Capital: ${trading_system.initial_capital:,.0f}")
print(f"Final Capital: ${performance.get('final_capital', 0):,.0f}")
print(f"Total Return: {performance.get('total_return', 0):.2%}")
print(f"Sharpe Ratio: {performance.get('sharpe_ratio', 0):.2f}")
print(f"Max Drawdown: {performance.get('max_drawdown', 0):.2%}")
print(f"Total Trades: {performance.get('total_trades', 0)}")
print(f"Trading Days: {performance.get('trading_days', 0)}")

# Calculate buy-and-hold comparison
buy_hold_return = (1 + pd.Series(trading_data)).prod() - 1
excess_return = performance.get('total_return', 0) - buy_hold_return

print(f"\nComparison vs Buy & Hold:")
print(f"Buy & Hold Return: {buy_hold_return:.2%}")
print(f"Strategy Return: {performance.get('total_return', 0):.2%}")
print(f"Excess Return: {excess_return:.2%}")

if excess_return > 0:
    print(f"✅ Strategy outperformed buy & hold!")
else:
    print(f"⚠️ Strategy underperformed buy & hold")

# Trade analysis
if trading_system.trade_history:
    trades_df = pd.DataFrame(trading_system.trade_history)
    
    print(f"\n📈 TRADE ANALYSIS:")
    print(f"Total position changes: {len(trades_df)}")
    
    regime_trades = trades_df.groupby('regime').size()
    print(f"Trades by regime:")
    for regime, count in regime_trades.items():
        print(f"  Regime {regime}: {count} trades")
    
    avg_confidence = trades_df['confidence'].mean()
    print(f"Average trading confidence: {avg_confidence:.1%}")
    
    # Show recent trades
    print(f"\nRecent trades:")
    for _, trade in trades_df.tail(5).iterrows():
        print(f"  {trade['date'].strftime('%Y-%m-%d')}: "
              f"{trade['old_position']:6.1%} → {trade['new_position']:6.1%} "
              f"(Regime {trade['regime']}, conf: {trade['confidence']:.1%})")

else:
    print(f"\n⚠️ No trades executed (positions may not have changed significantly)")
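
The simulation above ignores transaction costs. As a rough sketch of how they could be layered on, the hypothetical helper below deducts a proportional cost on the traded notional each time the position changes. The 10 basis point `cost_rate` is an illustrative assumption, not a calibrated figure.

```python
def apply_transaction_cost(capital, position_change, cost_rate=0.001):
    """Deduct a proportional cost on the traded notional.

    Hypothetical helper: traded notional is |position change| times capital,
    and cost_rate is the cost per unit of notional (0.001 = 10 bps, assumed).
    """
    traded_notional = abs(position_change) * capital
    return capital - traded_notional * cost_rate


# Moving from flat to a 50% long position on $100,000 of capital
capital_after = apply_transaction_cost(100_000.0, position_change=0.5)
print(f"Capital after costs: ${capital_after:,.2f}")
```

In this example the trade costs about $50. A strategy that changes position frequently, as regime-switching strategies often do, can lose a meaningful share of its return this way.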

In [None]:
# Visualize trading system performance
if trading_system.portfolio_history:
    portfolio_df = pd.DataFrame(trading_system.portfolio_history)
    portfolio_df.set_index('date', inplace=True)
    
    # Calculate buy-and-hold portfolio for comparison
    buy_hold_capital = [trading_system.initial_capital]
    for daily_return in trading_data:
        buy_hold_capital.append(buy_hold_capital[-1] * (1 + daily_return))
    
    buy_hold_df = pd.DataFrame({
        'capital': buy_hold_capital[1:],  # Remove initial value
        'date': trading_dates
    }).set_index('date')
    
    # Create visualization
    fig, axes = plt.subplots(4, 1, figsize=(16, 16))
    fig.suptitle('Online HMM Trading System Performance', fontsize=16, fontweight='bold')
    
    # Plot 1: Portfolio value comparison
    axes[0].plot(portfolio_df.index, portfolio_df['capital'], 
                label='HMM Strategy', linewidth=2, color='blue')
    axes[0].plot(buy_hold_df.index, buy_hold_df['capital'], 
                label='Buy & Hold', linewidth=2, color='red', alpha=0.7)
    
    axes[0].set_title('Portfolio Value Comparison')
    axes[0].set_ylabel('Capital ($)')
    axes[0].legend()
    axes[0].grid(True, alpha=0.3)
    axes[0].yaxis.set_major_formatter(plt.FuncFormatter(lambda x, p: f'${x:,.0f}'))
    
    # Plot 2: Position sizes over time
    axes[1].fill_between(portfolio_df.index, 0, portfolio_df['position'], 
                        where=(portfolio_df['position'] > 0), color='green', alpha=0.6, label='Long')
    axes[1].fill_between(portfolio_df.index, 0, portfolio_df['position'], 
                        where=(portfolio_df['position'] < 0), color='red', alpha=0.6, label='Short')
    axes[1].axhline(y=0, color='black', linestyle='-', alpha=0.5)
    
    axes[1].set_title('Position Sizes Over Time')
    axes[1].set_ylabel('Position Size')
    axes[1].set_ylim(-1.1, 1.1)
    axes[1].legend()
    axes[1].grid(True, alpha=0.3)
    
    # Plot 3: Regime detection and confidence
    colors = ['red', 'orange', 'green']
    for regime in portfolio_df['regime'].unique():
        if regime >= 0:
            mask = portfolio_df['regime'] == regime
            if mask.any():
                axes[2].scatter(portfolio_df.index[mask], portfolio_df['confidence'][mask], 
                               c=colors[int(regime) % len(colors)], alpha=0.7, s=30, 
                               label=f'State {int(regime)}')
    
    axes[2].axhline(y=trading_system.confidence_threshold, color='black', 
                   linestyle='--', alpha=0.7, label=f'Confidence Threshold ({trading_system.confidence_threshold:.1%})')
    axes[2].set_title('Regime Detection and Confidence')
    axes[2].set_ylabel('Confidence')
    axes[2].set_ylim(0, 1)
    axes[2].legend()
    axes[2].grid(True, alpha=0.3)
    
    # Plot 4: Daily returns comparison
    strategy_returns = portfolio_df['daily_return'].fillna(0)
    buy_hold_returns = pd.Series(trading_data, index=trading_dates)
    
    axes[3].plot(strategy_returns.index, strategy_returns.cumsum(), 
                label='HMM Strategy', linewidth=2, color='blue')
    axes[3].plot(buy_hold_returns.index, buy_hold_returns.cumsum(), 
                label='Buy & Hold', linewidth=2, color='red', alpha=0.7)
    
    axes[3].set_title('Cumulative Returns Comparison')
    axes[3].set_ylabel('Cumulative Return')
    axes[3].set_xlabel('Date')
    axes[3].legend()
    axes[3].grid(True, alpha=0.3)
    
    plt.tight_layout()
    plt.show()
    
    # Additional analysis
    print(f"\n📊 ADDITIONAL PERFORMANCE ANALYSIS:")
    print(f"=" * 40)
    
    # Volatility comparison
    strategy_vol = strategy_returns.std() * np.sqrt(252)
    buy_hold_vol = buy_hold_returns.std() * np.sqrt(252)
    
    print(f"Annualized Volatility:")
    print(f"  HMM Strategy: {strategy_vol:.2%}")
    print(f"  Buy & Hold: {buy_hold_vol:.2%}")
    print(f"  Volatility Reduction: {(1 - strategy_vol/buy_hold_vol):.1%}")
    
    # Risk-adjusted performance
    strategy_sharpe = strategy_returns.mean() / strategy_returns.std() * np.sqrt(252)
    buy_hold_sharpe = buy_hold_returns.mean() / buy_hold_returns.std() * np.sqrt(252)
    
    print(f"\nRisk-Adjusted Performance:")
    print(f"  HMM Strategy Sharpe: {strategy_sharpe:.2f}")
    print(f"  Buy & Hold Sharpe: {buy_hold_sharpe:.2f}")
    print(f"  Sharpe Improvement: {strategy_sharpe - buy_hold_sharpe:.2f}")

else:
    print("❌ No portfolio history to visualize")
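
The "Cumulative Returns Comparison" panel above sums simple daily returns with `cumsum()`, which only approximates compounded growth. The sketch below, using made-up returns chosen to exaggerate the effect, shows how the two differ and how log returns reconcile them.

```python
import numpy as np
import pandas as pd

# Illustrative returns only (not market data)
daily_returns = pd.Series([0.10, -0.10, 0.05, -0.05])

# Summing simple returns ignores compounding
summed = daily_returns.cumsum()

# Compounded cumulative return matches how capital actually evolves
compounded = (1 + daily_returns).cumprod() - 1

# Log returns are additive, so summing them and converting back is exact
from_logs = np.expm1(np.log1p(daily_returns).cumsum())

print(f"Summed simple returns:      {summed.iloc[-1]:+.4%}")
print(f"Compounded return:          {compounded.iloc[-1]:+.4%}")
print(f"From summed log returns:    {from_logs.iloc[-1]:+.4%}")
```

Here the summed series ends at zero while the compounded series ends slightly negative, which is what the capital curve in the first panel reflects. For small daily returns over short windows the gap is minor, but it grows with volatility and horizon.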

## 9. Advanced Online HMM Features

Let's explore some advanced features of Online HMMs including change point detection and model monitoring.
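
Model monitoring usually includes some measure of how far the parameters move between updates. A minimal sketch follows, assuming parameter snapshots are available as dictionaries of arrays. That layout and the helper name `parameter_drift` are assumptions for illustration and are not confirmed by the `OnlineHMM` API.

```python
import numpy as np


def parameter_drift(prev_params, curr_params, eps=1e-12):
    """Mean relative change (norm of difference over norm of previous value)
    across parameter groups present in both snapshots. Hypothetical helper."""
    changes = []
    for name in prev_params.keys() & curr_params.keys():
        prev = np.asarray(prev_params[name], dtype=float)
        curr = np.asarray(curr_params[name], dtype=float)
        # Normalizing by the group norm avoids dividing by individual zeros
        changes.append(np.linalg.norm(curr - prev) / (np.linalg.norm(prev) + eps))
    return float(np.mean(changes)) if changes else 0.0


# Illustrative snapshots: means shift by 10%, standard deviations unchanged
previous = {'means': [-0.01, 0.0, 0.01], 'stds': [0.02, 0.01, 0.015]}
current = {'means': [-0.011, 0.0, 0.011], 'stds': [0.02, 0.01, 0.015]}

drift = parameter_drift(previous, current)
print(f"Parameter drift: {drift:.2%}")
```

A value like this could be compared against a stability threshold in a health check, in place of a fixed placeholder.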

In [None]:
# Advanced Online HMM features demonstration
print(f"🔬 ADVANCED ONLINE HMM FEATURES")
print(f"=" * 40)

# Feature 1: Change Point Detection
print(f"\n1. CHANGE POINT DETECTION:")
print(f"-" * 25)

# Monitor likelihood changes for structural breaks
class ChangePointMonitor:
    def __init__(self, window_size=20, threshold=0.1):
        self.window_size = window_size
        self.threshold = threshold
        self.likelihood_history = []
        self.change_points = []
    
    def add_likelihood(self, likelihood, date):
        self.likelihood_history.append((date, likelihood))
        
        if len(self.likelihood_history) >= self.window_size * 2:
            # Compare recent vs historical likelihood
            recent_ll = np.mean([ll for _, ll in self.likelihood_history[-self.window_size:]])
            historical_ll = np.mean([ll for _, ll in self.likelihood_history[-2*self.window_size:-self.window_size]])
            
            # Guard against division by zero when the historical average is 0
            likelihood_change = abs(recent_ll - historical_ll) / max(abs(historical_ll), 1e-12)
            
            if likelihood_change > self.threshold:
                self.change_points.append({
                    'date': date,
                    'likelihood_change': likelihood_change,
                    'recent_ll': recent_ll,
                    'historical_ll': historical_ll
                })
                return True
        
        return False

# Initialize change point monitor
cp_monitor = ChangePointMonitor(window_size=10, threshold=0.05)

print(f"✅ Change point monitor initialized")
print(f"  Window size: {cp_monitor.window_size}")
print(f"  Threshold: {cp_monitor.threshold:.1%}")

# Feature 2: Model Health Monitoring
print(f"\n2. MODEL HEALTH MONITORING:")
print(f"-" * 25)

class ModelHealthMonitor:
    def __init__(self):
        self.confidence_history = []
        self.parameter_stability = []
        self.processing_times = []
        
    def add_observation(self, confidence, processing_time, params=None):
        self.confidence_history.append(confidence)
        self.processing_times.append(processing_time)
        
        if params is not None:
            self.parameter_stability.append(params)
    
    def get_health_status(self):
        if len(self.confidence_history) < 10:
            return {'status': 'warming_up', 'message': 'Insufficient data for health assessment'}
        
        # Check confidence trends
        recent_confidence = np.mean(self.confidence_history[-10:])
        overall_confidence = np.mean(self.confidence_history)
        
        # Check processing time trends
        recent_processing = np.mean(self.processing_times[-10:])
        avg_processing = np.mean(self.processing_times)
        
        health_issues = []
        
        if recent_confidence < 0.5:
            health_issues.append('Low recent confidence')
        
        if recent_processing > avg_processing * 2:
            health_issues.append('Processing time degradation')
        
        if len(self.parameter_stability) > 20:
            # Check parameter stability (simplified)
            param_changes = []
            for i in range(1, min(20, len(self.parameter_stability))):
                if isinstance(self.parameter_stability[i], dict) and isinstance(self.parameter_stability[i-1], dict):
                    # This is a simplified check - in practice you'd compare actual parameter values
                    param_changes.append(0.01)  # Placeholder
            
            if param_changes and np.mean(param_changes) > 0.05:
                health_issues.append('High parameter instability')
        
        if not health_issues:
            return {
                'status': 'healthy', 
                'message': 'Model operating normally',
                'confidence': recent_confidence,
                'processing_time': recent_processing
            }
        else:
            return {
                'status': 'warning',
                'message': '; '.join(health_issues),
                'confidence': recent_confidence,
                'processing_time': recent_processing
            }

# Initialize health monitor
health_monitor = ModelHealthMonitor()

print(f"✅ Model health monitor initialized")

# Feature 3: Adaptive Configuration
print(f"\n3. ADAPTIVE CONFIGURATION:")
print(f"-" * 25)

def adapt_configuration(current_config, market_volatility, confidence_trend):
    """
    Adapt Online HMM configuration based on market conditions
    """
    adapted_config = current_config.__dict__.copy()
    
    # Adapt forgetting factor based on volatility
    if market_volatility > 0.02:  # High volatility
        adapted_config['forgetting_factor'] = max(0.99, current_config.forgetting_factor - 0.005)
        adapted_config['adaptation_rate'] = min(0.1, current_config.adaptation_rate + 0.01)
    elif market_volatility < 0.01:  # Low volatility
        adapted_config['forgetting_factor'] = min(0.999, current_config.forgetting_factor + 0.002)
        adapted_config['adaptation_rate'] = max(0.01, current_config.adaptation_rate - 0.005)
    
    # Adapt based on confidence trends
    if confidence_trend < 0.6:  # Low confidence trend
        adapted_config['window_size'] = min(500, current_config.window_size + 50)
    
    return adapted_config

# Demonstrate adaptive configuration
current_vol = np.std(trading_data) if len(trading_data) > 0 else 0.015
current_conf_trend = np.mean([p['confidence'] for p in trading_system.portfolio_history[-10:]]) if len(trading_system.portfolio_history) > 10 else 0.7

adapted_config = adapt_configuration(config, current_vol, current_conf_trend)

print(f"Current market volatility: {current_vol:.4f}")
print(f"Current confidence trend: {current_conf_trend:.2%}")
print(f"\nConfiguration adaptation:")
print(f"  Forgetting factor: {config.forgetting_factor} → {adapted_config['forgetting_factor']}")
print(f"  Adaptation rate: {config.adaptation_rate} → {adapted_config['adaptation_rate']}")
print(f"  Window size: {config.window_size} → {adapted_config['window_size']}")

# Feature 4: Performance Monitoring
print(f"\n4. PERFORMANCE MONITORING:")
print(f"-" * 25)

# Simulate monitoring on recent data
monitoring_data = test_returns[-20:] if len(test_returns) >= 20 else test_returns
monitoring_dates = data.index[split_point + len(test_returns) - len(monitoring_data):split_point + len(test_returns)]

print(f"Monitoring last {len(monitoring_data)} observations...")

for i, (date, return_val) in enumerate(zip(monitoring_dates, monitoring_data)):
    start_time = time.time()
    
    try:
        # Update model (if needed)
        online_hmm.update(return_val)
        regime_info = online_hmm.get_current_regime_info()
        
        processing_time = time.time() - start_time
        
        # Update monitors
        confidence = regime_info.get('confidence', 0)
        health_monitor.add_observation(confidence, processing_time)
        
        # Check for change points (simplified - using dummy likelihood)
        dummy_likelihood = -abs(return_val) * 100  # Simplified likelihood proxy
        change_detected = cp_monitor.add_likelihood(dummy_likelihood, date)
        
        if change_detected:
            print(f"  ⚠️ Change point detected on {date.strftime('%Y-%m-%d')}")
        
        if (i + 1) % 10 == 0:
            health_status = health_monitor.get_health_status()
            print(f"  Health check {i+1}: {health_status['status']} - {health_status['message']}")
    
    except Exception as e:
        print(f"  ❌ Monitoring error on {date}: {e}")

# Final health assessment
final_health = health_monitor.get_health_status()
print(f"\n📊 FINAL MODEL HEALTH ASSESSMENT:")
print(f"Status: {final_health['status']}")
print(f"Message: {final_health['message']}")
if 'confidence' in final_health:
    print(f"Recent confidence: {final_health['confidence']:.1%}")
if 'processing_time' in final_health:
    print(f"Avg processing time: {final_health['processing_time']*1000:.2f}ms")

print(f"\nChange points detected: {len(cp_monitor.change_points)}")
for cp in cp_monitor.change_points[-3:]:  # Show last 3
    print(f"  {cp['date'].strftime('%Y-%m-%d')}: {cp['likelihood_change']:.1%} change")

print(f"\n✅ Advanced features demonstration completed!")
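
The monitoring loop above feeds the change point monitor a dummy likelihood based on `-abs(return_val)`. A closer proxy is the log-density of each new return under the model's regime mixture. The sketch below assumes Gaussian emissions and assumes that regime probabilities (taken before the observation arrives), means, and standard deviations can be obtained from the model. How to extract them from `OnlineHMM` is not shown in this notebook, so the function signature and the numbers are hypothetical.

```python
import numpy as np


def predictive_log_likelihood(x, prior_regime_probs, means, stds):
    """Log-density of observation x under a Gaussian mixture weighted by
    regime probabilities. Hypothetical helper for illustration."""
    probs = np.clip(np.asarray(prior_regime_probs, dtype=float), 1e-12, None)
    means = np.asarray(means, dtype=float)
    stds = np.asarray(stds, dtype=float)

    # Per-regime Gaussian log-density of the observation
    log_pdf = -0.5 * np.log(2 * np.pi * stds**2) - 0.5 * ((x - means) / stds) ** 2

    # Log of the weighted sum of densities, computed with log-sum-exp for stability
    weighted = np.log(probs) + log_pdf
    peak = weighted.max()
    return float(peak + np.log(np.exp(weighted - peak).sum()))


# Illustrative three-regime parameters (bear, sideways, bull); not fitted values
regime_probs = [0.1, 0.2, 0.7]
regime_means = [-0.010, 0.000, 0.010]
regime_stds = [0.030, 0.010, 0.015]

ll_typical = predictive_log_likelihood(0.005, regime_probs, regime_means, regime_stds)
ll_extreme = predictive_log_likelihood(-0.080, regime_probs, regime_means, regime_stds)
print(f"Typical return log-likelihood: {ll_typical:.2f}")
print(f"Extreme return log-likelihood: {ll_extreme:.2f}")
```

A sustained drop in this quantity relative to its recent average indicates that the current parameters explain incoming data poorly, which is the signal `ChangePointMonitor` is designed to compare across windows.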

## 10. Summary and Best Practices

In this tutorial, we've explored the power of Online Hidden Markov Models for real-time financial regime detection. Let's summarize the key concepts and best practices.
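
One way to reason about the forgetting-factor range recommended in the summary is to convert it into an effective memory length. The sketch below assumes standard exponential weighting, where an observation that is k steps old carries weight proportional to λ^k. Whether `OnlineHMM` applies its forgetting factor in exactly this way is an assumption, so treat the numbers as a rule of thumb.

```python
import math


def effective_memory(forgetting_factor):
    """Approximate number of observations that dominate the estimate: 1 / (1 - lambda)."""
    return 1.0 / (1.0 - forgetting_factor)


def half_life(forgetting_factor):
    """Number of observations after which a data point's weight has halved."""
    return math.log(0.5) / math.log(forgetting_factor)


for lam in (0.99, 0.995, 0.999):
    print(f"lambda={lam}: effective memory ~ {effective_memory(lam):.0f} obs, "
          f"half-life ~ {half_life(lam):.0f} obs")
```

Under this assumption, a forgetting factor of 0.99 corresponds to roughly 100 observations of memory (a half-life near 69), while 0.999 corresponds to roughly 1,000 observations (a half-life near 693). On daily data that is the difference between a few months and about four trading years.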

In [None]:
# Summary of Online HMM capabilities and recommendations
print(f"📋 ONLINE HMM TUTORIAL SUMMARY")
print(f"=" * 50)

print(f"\n✅ WHAT WE ACCOMPLISHED:")
print(f"1. Configured and initialized Online HMM for streaming data")
print(f"2. Processed real-time market data with sub-millisecond latency")
print(f"3. Analyzed parameter evolution and model adaptation")
print(f"4. Compared Online vs Batch HMM performance")
print(f"5. Built a complete real-time trading system")
print(f"6. Demonstrated advanced monitoring and health checks")
print(f"7. Explored adaptive configuration strategies")

print(f"\n🎯 KEY ADVANTAGES OF ONLINE HMM:")
print(f"• Real-time Processing: Constant cost per observation, independent of history length")
print(f"• Memory Efficiency: Fixed memory usage regardless of data size")
print(f"• Adaptive Learning: Parameters evolve with market conditions")
print(f"• Temporal Consistency: Stable historical regime classifications")
print(f"• Low Latency: Suitable for high-frequency trading applications")
print(f"• Robust to Non-stationarity: Handles changing market dynamics")

print(f"\n⚙️ CONFIGURATION BEST PRACTICES:")
print(f"\nForgetting Factor (0.99-0.999):")
print(f"  • Higher values (0.999): Stable markets, long-term trends")
print(f"  • Lower values (0.99): Volatile markets, quick adaptation")
print(f"  • Adaptive: Adjust based on market volatility")

print(f"\nAdaptation Rate (0.01-0.1):")
print(f"  • Higher values (0.1): Fast adaptation, less stability")
print(f"  • Lower values (0.01): Slow adaptation, more stability")
print(f"  • Recommended: 0.05 for balanced performance")

print(f"\nWindow Size (100-500):")
print(f"  • Larger windows: More stable, slower adaptation")
print(f"  • Smaller windows: Less stable, faster adaptation")
print(f"  • Recommended: 252 (1 trading year) for daily data")

print(f"\n📊 PERFORMANCE GUIDELINES:")
print(f"\nExpected Processing Times:")
print(f"  • Excellent: <1ms per observation")
print(f"  • Good: 1-10ms per observation")
print(f"  • Needs optimization: >10ms per observation")

print(f"\nConfidence Thresholds:")
print(f"  • High confidence: >80% (strong regime signal)")
print(f"  • Medium confidence: 60-80% (moderate regime signal)")
print(f"  • Low confidence: <60% (uncertain regime)")

print(f"\n🚨 COMMON PITFALLS AND SOLUTIONS:")
print(f"\n1. Parameter Instability:")
print(f"   Problem: Parameters change too rapidly")
print(f"   Solution: Increase forgetting factor, decrease adaptation rate")

print(f"\n2. Slow Adaptation:")
print(f"   Problem: Model doesn't adapt to regime changes")
print(f"   Solution: Decrease forgetting factor, increase adaptation rate")

print(f"\n3. Low Confidence:")
print(f"   Problem: Model is uncertain about regime classifications")
print(f"   Solution: Increase window size, add more features, check data quality")

print(f"\n4. Memory Issues:")
print(f"   Problem: Model uses too much memory")
print(f"   Solution: Reduce window size, implement proper cleanup")

print(f"\n🔧 PRODUCTION DEPLOYMENT CHECKLIST:")
print(f"□ Configure appropriate forgetting factor for market conditions")
print(f"□ Set up model health monitoring and alerting")
print(f"□ Implement change point detection for structural breaks")
print(f"□ Add proper error handling and recovery mechanisms")
print(f"□ Set up parameter logging and visualization")
print(f"□ Implement adaptive configuration based on market conditions")
print(f"□ Add transaction cost modeling and position size limits")
print(f"□ Set up backtesting and performance validation")
print(f"□ Implement proper risk management and stop-loss mechanisms")
print(f"□ Add model versioning and rollback capabilities")

print(f"\n📈 NEXT STEPS:")
print(f"1. Explore multivariate Online HMMs with additional features")
print(f"2. Implement portfolio-level regime detection across assets")
print(f"3. Add more sophisticated trading strategies and risk management")
print(f"4. Integrate with live market data feeds")
print(f"5. Develop automated parameter optimization")
print(f"6. Build comprehensive backtesting and validation frameworks")

print(f"\n⚠️ IMPORTANT DISCLAIMERS:")
print(f"• Past performance does not guarantee future results")
print(f"• This is educational content, not financial advice")
print(f"• Always implement proper risk management in production")
print(f"• Consider transaction costs and market impact in real trading")
print(f"• Validate models extensively before deployment")

print(f"\n🎉 CONGRATULATIONS!")
print(f"You've successfully mastered Online Hidden Markov Models for")
print(f"real-time financial regime detection and trading applications!")
print(f"\nHappy trading! 🚀📈")