# Performance Indicators for Hybrid ARIMA-LSTM Model

This notebook computes trading-performance indicators (ARC, ASD, MD, IR, IR\*, SR) for the hybrid ARIMA-LSTM volatility forecasts and compares them with a buy-and-hold benchmark.

**Required:** Run your main hybrid model notebook first to generate:
- `sp500_hybrid_results`
- `bitcoin_hybrid_results`
- `sp500_clean` data
- `bitcoin_clean` data

In [None]:
# Import required libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.metrics import mean_squared_error, mean_absolute_error
import warnings
warnings.filterwarnings('ignore')

print("Libraries imported successfully!")

## Step 1: Define volatility_predictions_to_returns() Function

In [None]:
def volatility_predictions_to_returns(predictions, true_values, actual_returns):
    """
    Convert volatility predictions into trading returns.
    
    Strategy: Go long when predicted volatility is below median (low risk),
              go short when predicted volatility is above median (high risk).
    
    Parameters:
    -----------
    predictions : np.array
        Predicted volatility values
    true_values : np.array
        Actual volatility values
    actual_returns : pd.Series or np.array
        Actual market returns aligned with predictions
    
    Returns:
    --------
    pd.Series
        Strategy returns based on volatility forecasts
    """
    # Ensure we have the same length
    min_len = min(len(predictions), len(true_values), len(actual_returns))
    predictions = predictions[:min_len]
    true_values = true_values[:min_len]
    actual_returns = actual_returns[:min_len] if isinstance(actual_returns, np.ndarray) else actual_returns.iloc[:min_len]
    
    # Calculate prediction median
    pred_median = np.median(predictions)
    
    # Trading signal: 1 when low predicted volatility, -1 when high
    # Rationale: Low volatility = favorable conditions (go long)
    #           High volatility = unfavorable conditions (go short or defensive)
    signals = np.where(predictions < pred_median, 1, -1)
    
    # Strategy returns = signal * actual_returns
    strategy_returns = signals * actual_returns
    
    return pd.Series(strategy_returns)

print("volatility_predictions_to_returns() function defined!")
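
To make the median rule concrete, here is a minimal sketch on made-up numbers. The arrays are illustrative assumptions, not output from the hybrid model, and the three core lines mirror the body of `volatility_predictions_to_returns()`.

```python
import numpy as np

# Toy inputs, made up for illustration only
predictions = np.array([0.010, 0.030, 0.020, 0.040])       # predicted volatility
actual_returns = np.array([0.005, -0.010, 0.002, -0.020])  # realized returns

# Median of the predictions splits the days into "low" and "high" volatility
pred_median = np.median(predictions)

# Long (+1) on low predicted volatility, short (-1) otherwise
signals = np.where(predictions < pred_median, 1, -1)

# A short position flips the sign of the realized return
strategy_returns = signals * actual_returns

print("median:", pred_median)
print("signals:", signals)
print("strategy returns:", strategy_returns)
```

In this toy case both high-volatility days have negative returns, so every short position earns a positive strategy return. Note that the median is taken over the full prediction window, so the signal for an early day depends on predictions made for later days.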

## Step 2: Define Performance Metrics Functions

In [None]:
# Configuration
TRADING_DAYS = 252  # standard convention
RISK_FREE_RATE = 0.0  # set if you have T-bill data

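def annualized_return(daily_returns):
    """ARC - Annualized compound return.

    Assumes simple daily returns. This notebook passes log returns, so the
    compounding below is an approximation that is close only for small moves.
    """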
    cumulative = (1 + daily_returns).prod()
    n = daily_returns.shape[0]
    return (cumulative ** (TRADING_DAYS / n)) - 1

def annualized_std(daily_returns):
    """ASD - Annualized standard deviation."""
    return daily_returns.std() * np.sqrt(TRADING_DAYS)

def max_drawdown(daily_returns):
    """MD - Maximum drawdown from equity curve."""
    # Compound (1 + r) day by day; cumprod() on the raw returns would be wrong
    equity = (1 + daily_returns).cumprod()
    peak = equity.cummax()
    drawdown = (equity - peak) / peak
    return drawdown.min()  # negative value

def information_ratio(strategy_returns, benchmark_returns):
    """IR - Information ratio relative to benchmark."""
    active_returns = strategy_returns - benchmark_returns
    tracking_error = active_returns.std()
    if tracking_error == 0:
        return np.nan
    return active_returns.mean() / tracking_error

def modified_information_ratio(strategy_returns, benchmark_returns):
    """IR* - Annualized active return divided by annualized tracking error."""
    active_daily = strategy_returns - benchmark_returns
    
    # Annualized active return
    ann_active_return = annualized_return(active_daily)
    
    # Annualized tracking error
    ann_tracking_error = annualized_std(active_daily)
    
    if ann_tracking_error == 0:
        return np.nan
    
    return ann_active_return / ann_tracking_error

def sharpe_ratio(daily_returns, risk_free_rate=RISK_FREE_RATE):
    """SR - Sharpe ratio."""
    excess = (daily_returns - risk_free_rate / TRADING_DAYS)
    denom = daily_returns.std()
    if denom == 0:
        return np.nan
    return (excess.mean() / denom) * np.sqrt(TRADING_DAYS)

def compute_performance_indicators(strategy_returns, benchmark_returns):
    """Compute ARC, ASD, MD, IR, IR*, SR for a model's trading strategy."""
    return {
        'ARC': annualized_return(strategy_returns),
        'ASD': annualized_std(strategy_returns),
        'MD': max_drawdown(strategy_returns),
        'IR': information_ratio(strategy_returns, benchmark_returns),
        'IR*': modified_information_ratio(strategy_returns, benchmark_returns),
        'SR': sharpe_ratio(strategy_returns)
    }

print("Performance metrics functions defined!")
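
As a quick check of the drawdown logic, the sketch below works through three made-up daily returns using the usual compounding definition of the equity curve, `(1 + r).cumprod()`. The numbers are assumptions chosen so the arithmetic is easy to follow by hand.

```python
import pandas as pd

# Made-up returns: up 10%, down 20%, up 5%
daily_returns = pd.Series([0.10, -0.20, 0.05])

# Equity curve of 1 unit invested: roughly 1.10, 0.88, 0.924
equity = (1 + daily_returns).cumprod()

# Running peak of the equity curve: 1.10 on every day
peak = equity.cummax()

# Fractional loss relative to the running peak
drawdown = (equity - peak) / peak

# Maximum drawdown is the most negative value
md = drawdown.min()

print(drawdown.round(4).tolist())
print(round(md, 4))
```

The worst point is day two, where equity sits 20% below its peak, so the maximum drawdown is about -0.20. The partial recovery on day three still leaves a 16% drawdown.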

## Step 3: Define Data Extraction Function

In [None]:
def extract_hybrid_returns_from_results(model_results, data_clean, window_indices=None):
    """
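    Extract and aggregate strategy returns from hybrid model cross-validation results.
    
    Reads 'teststart', 'testend', 'hybridpredictions' and 'actuals' from each
    window, aligns the last len(predictions) log returns of the test period
    with the predictions, and concatenates the per-window strategy returns.
    Pass window_indices to restrict the calculation to specific 'windowid' values.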
    """
    all_strategy_returns = []
    
    windows_to_use = model_results['windowsprocessed']
    if window_indices is not None:
        windows_to_use = [w for w in windows_to_use if w['windowid'] in window_indices]
    
    for window_result in windows_to_use:
        try:
            # Get test period dates
            test_start = window_result['teststart']
            test_end = window_result['testend']
            
            # Get actual returns during test period
            test_data = data_clean[test_start:test_end]
            
            # Get predictions - hybrid model stores hybridpredictions
            predictions = window_result['hybridpredictions']
            true_values = window_result['actuals']
            
            # Align returns with predictions
            actual_returns = test_data['LogReturns'].iloc[-len(predictions):]
            
            # Generate strategy returns for this window
            window_returns = volatility_predictions_to_returns(predictions, true_values, actual_returns.values)
            
            all_strategy_returns.append(window_returns)
            
        except Exception as e:
            print(f"Warning: Failed to process window {window_result.get('windowid', '?')}: {str(e)}")
            continue
    
    # Concatenate all returns
    if all_strategy_returns:
        return pd.concat(all_strategy_returns, ignore_index=True)
    else:
        # Explicit dtype keeps the empty result numeric
        return pd.Series(dtype=float)

print("Data extraction function defined!")
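
The extraction function relies on a specific result layout. The sketch below builds a toy input with the keys the function reads and repeats its alignment step. All dates and values are made up, and the real `model_results` may carry additional keys. It uses `.loc` for the date slice, which selects the same rows as the `data_clean[test_start:test_end]` form used above when the index is a `DatetimeIndex`.

```python
import numpy as np
import pandas as pd

# Toy price data with a DatetimeIndex and a LogReturns column (made-up values)
dates = pd.date_range("2024-01-01", periods=6, freq="D")
data_clean = pd.DataFrame(
    {"LogReturns": [0.01, -0.02, 0.005, 0.0, 0.015, -0.01]}, index=dates
)

# Toy results dict with the keys read by extract_hybrid_returns_from_results()
model_results = {
    "windowsprocessed": [
        {
            "windowid": 0,
            "teststart": "2024-01-03",
            "testend": "2024-01-06",
            "hybridpredictions": np.array([0.02, 0.03, 0.01]),
            "actuals": np.array([0.025, 0.028, 0.012]),
        }
    ]
}

window = model_results["windowsprocessed"][0]

# Label-based slice, inclusive of both endpoints: four rows here
test_data = data_clean.loc[window["teststart"]:window["testend"]]

# Keep only the last len(predictions) returns so both arrays line up
aligned = test_data["LogReturns"].iloc[-len(window["hybridpredictions"]):]

print(aligned)
```

Here the test period has four rows but only three predictions, so the tail alignment drops the first row and pairs the predictions with 4 to 6 January. Whether dropping leading rows is the right alignment depends on how the main notebook produces its predictions, which is an assumption worth verifying against that notebook.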

## Step 4: Define Comparison Function

In [None]:
def compare_hybrid_models_performance(sp500_hybrid, bitcoin_hybrid, 
                                      sp500_data, bitcoin_data):
    """
    Generate comprehensive performance indicators comparison for hybrid models.
    
    Parameters:
    -----------
    sp500_hybrid : dict
        S&P 500 hybrid cross-validation results
    bitcoin_hybrid : dict
        Bitcoin hybrid cross-validation results
    sp500_data : pd.DataFrame
        Original S&P 500 clean data with returns
    bitcoin_data : pd.DataFrame
        Original Bitcoin clean data with returns
    """
    print("="*100)
    print("COMPREHENSIVE HYBRID MODEL PERFORMANCE INDICATORS")
    print("="*100)
    print("Computing performance metrics from hybrid predictions...\n")
    
    # Extract strategy returns for S&P 500
    print("Processing S&P 500 Hybrid Model...")
    sp500_strategy_returns = extract_hybrid_returns_from_results(sp500_hybrid, sp500_data)
    
    # Extract strategy returns for Bitcoin
    print("Processing Bitcoin Hybrid Model...")
    bitcoin_strategy_returns = extract_hybrid_returns_from_results(bitcoin_hybrid, bitcoin_data)
    
    print("Computing performance indicators...\n")
    
    # Create benchmark returns (buy-and-hold) for S&P 500
    sp500_all_test_returns = []
    for window_result in sp500_hybrid['windowsprocessed']:
        test_start = window_result['teststart']
        test_end = window_result['testend']
        test_data = sp500_data[test_start:test_end]
        predictions_len = len(window_result['hybridpredictions'])
        window_benchmark = test_data['LogReturns'].iloc[-predictions_len:]
        sp500_all_test_returns.append(window_benchmark)
    
    sp500_benchmark = pd.concat(sp500_all_test_returns, ignore_index=True)
    
    # Create benchmark returns (buy-and-hold) for Bitcoin
    bitcoin_all_test_returns = []
    for window_result in bitcoin_hybrid['windowsprocessed']:
        test_start = window_result['teststart']
        test_end = window_result['testend']
        test_data = bitcoin_data[test_start:test_end]
        predictions_len = len(window_result['hybridpredictions'])
        window_benchmark = test_data['LogReturns'].iloc[-predictions_len:]
        bitcoin_all_test_returns.append(window_benchmark)
    
    bitcoin_benchmark = pd.concat(bitcoin_all_test_returns, ignore_index=True)
    
    # Compute indicators
    sp500_indicators = compute_performance_indicators(sp500_strategy_returns, sp500_benchmark)
    bitcoin_indicators = compute_performance_indicators(bitcoin_strategy_returns, bitcoin_benchmark)
    sp500_bench_indicators = compute_performance_indicators(sp500_benchmark, sp500_benchmark)
    bitcoin_bench_indicators = compute_performance_indicators(bitcoin_benchmark, bitcoin_benchmark)
    
    # Create comparison DataFrames
    sp500_comparison = pd.DataFrame({
        'Model': ['Hybrid ARIMA-LSTM', 'Buy-and-Hold Benchmark'],
        'ARC': [sp500_indicators['ARC']*100, sp500_bench_indicators['ARC']*100],
        'ASD': [sp500_indicators['ASD']*100, sp500_bench_indicators['ASD']*100],
        'MD': [sp500_indicators['MD']*100, sp500_bench_indicators['MD']*100],
        'IR': [sp500_indicators['IR'], sp500_bench_indicators['IR']],
        'IR*': [sp500_indicators['IR*'], sp500_bench_indicators['IR*']],
        'SR': [sp500_indicators['SR'], sp500_bench_indicators['SR']]
    })
    
    bitcoin_comparison = pd.DataFrame({
        'Model': ['Hybrid ARIMA-LSTM', 'Buy-and-Hold Benchmark'],
        'ARC': [bitcoin_indicators['ARC']*100, bitcoin_bench_indicators['ARC']*100],
        'ASD': [bitcoin_indicators['ASD']*100, bitcoin_bench_indicators['ASD']*100],
        'MD': [bitcoin_indicators['MD']*100, bitcoin_bench_indicators['MD']*100],
        'IR': [bitcoin_indicators['IR'], bitcoin_bench_indicators['IR']],
        'IR*': [bitcoin_indicators['IR*'], bitcoin_bench_indicators['IR*']],
        'SR': [bitcoin_indicators['SR'], bitcoin_bench_indicators['SR']]
    })
    
    # Print S&P 500 results
    print("="*100)
    print("S&P 500 PERFORMANCE INDICATORS")
    print("="*100)
    print(sp500_comparison.to_string(index=False, float_format='%.4f'))
    print("\n")
    
    # Print Bitcoin results
    print("="*100)
    print("BITCOIN PERFORMANCE INDICATORS")
    print("="*100)
    print(bitcoin_comparison.to_string(index=False, float_format='%.4f'))
    print("\n")
    
    # Print interpretations
    print("INDICATOR EXPLANATIONS")
    print("-"*100)
    print("ARC : Annualized Return - Higher is better - annual compound return")
    print("ASD : Annualized Std Dev - Lower is better - annual volatility")
    print("MD  : Maximum Drawdown - Closer to 0 is better - worst peak-to-trough loss")
    print("IR  : Information Ratio - Higher is better - excess return per tracking error")
    print("IR* : Modified Information Ratio - Higher is better - annualized IR")
    print("SR  : Sharpe Ratio - Higher is better - risk-adjusted return")
    print("-"*100)
    print("\n")
    
    # Visualization
    fig, axes = plt.subplots(2, 3, figsize=(18, 10))
    
    # S&P 500 plots
    models_sp500 = sp500_comparison['Model'].tolist()
    colors = ['blue', 'gray']
    
    axes[0, 0].bar(models_sp500, sp500_comparison['ARC'], color=colors, alpha=0.7)
    axes[0, 0].set_title('S&P 500 - Annualized Return (ARC)', fontweight='bold')
    axes[0, 0].set_ylabel('Return (%)')
    axes[0, 0].tick_params(axis='x', rotation=45)
    axes[0, 0].grid(True, alpha=0.3, axis='y')
    axes[0, 0].axhline(y=0, color='black', linestyle='-', linewidth=0.8)
    
    axes[0, 1].bar(models_sp500, sp500_comparison['ASD'], color=colors, alpha=0.7)
    axes[0, 1].set_title('S&P 500 - Annualized Std Dev (ASD)', fontweight='bold')
    axes[0, 1].set_ylabel('Volatility (%)')
    axes[0, 1].tick_params(axis='x', rotation=45)
    axes[0, 1].grid(True, alpha=0.3, axis='y')
    
    axes[0, 2].bar(models_sp500, sp500_comparison['MD'], color=colors, alpha=0.7)
    axes[0, 2].set_title('S&P 500 - Maximum Drawdown (MD)', fontweight='bold')
    axes[0, 2].set_ylabel('Drawdown (%)')
    axes[0, 2].tick_params(axis='x', rotation=45)
    axes[0, 2].grid(True, alpha=0.3, axis='y')
    axes[0, 2].axhline(y=0, color='black', linestyle='-', linewidth=0.8)
    
    # Bitcoin plots
    models_btc = bitcoin_comparison['Model'].tolist()
    
    axes[1, 0].bar(models_btc, bitcoin_comparison['ARC'], color=colors, alpha=0.7)
    axes[1, 0].set_title('Bitcoin - Annualized Return (ARC)', fontweight='bold')
    axes[1, 0].set_ylabel('Return (%)')
    axes[1, 0].tick_params(axis='x', rotation=45)
    axes[1, 0].grid(True, alpha=0.3, axis='y')
    axes[1, 0].axhline(y=0, color='black', linestyle='-', linewidth=0.8)
    
    axes[1, 1].bar(models_btc, bitcoin_comparison['ASD'], color=colors, alpha=0.7)
    axes[1, 1].set_title('Bitcoin - Annualized Std Dev (ASD)', fontweight='bold')
    axes[1, 1].set_ylabel('Volatility (%)')
    axes[1, 1].tick_params(axis='x', rotation=45)
    axes[1, 1].grid(True, alpha=0.3, axis='y')
    
    axes[1, 2].bar(models_btc, bitcoin_comparison['MD'], color=colors, alpha=0.7)
    axes[1, 2].set_title('Bitcoin - Maximum Drawdown (MD)', fontweight='bold')
    axes[1, 2].set_ylabel('Drawdown (%)')
    axes[1, 2].tick_params(axis='x', rotation=45)
    axes[1, 2].grid(True, alpha=0.3, axis='y')
    axes[1, 2].axhline(y=0, color='black', linestyle='-', linewidth=0.8)
    
    plt.suptitle('Hybrid ARIMA-LSTM Model Performance Comparison', 
                 fontsize=16, fontweight='bold')
    plt.tight_layout()
    plt.show()
    
    print("="*100)
    print("PERFORMANCE INDICATORS ANALYSIS COMPLETE")
    print("="*100)
    
    return sp500_comparison, bitcoin_comparison

print("Comparison function defined!")

## Step 5: Execute the Analysis

**IMPORTANT:** Make sure you have run your main hybrid model notebook first!

Required variables that must exist:
- `sp500_hybrid_results` - from your hybrid model
- `bitcoin_hybrid_results` - from your hybrid model
- `sp500_clean` - S&P 500 data with LogReturns
- `bitcoin_clean` - Bitcoin data with LogReturns
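
A small pre-flight check can catch a missing variable before the analysis starts. This is a sketch only: `find_missing` is a hypothetical helper, not part of the original notebook, and it only tests that the names exist, not that their contents are valid.

```python
# Names this notebook expects from the main hybrid model notebook
REQUIRED = ["sp500_hybrid_results", "bitcoin_hybrid_results",
            "sp500_clean", "bitcoin_clean"]

def find_missing(names, namespace):
    """Return the names that are not defined in the given namespace (hypothetical helper)."""
    return [name for name in names if name not in namespace]

# globals() holds the variables defined so far in the notebook session
missing = find_missing(REQUIRED, globals())
if missing:
    print("Missing variables, run the main notebook first:", ", ".join(missing))
else:
    print("All required variables are defined.")
```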

In [None]:
# Run the comprehensive performance analysis
print("Starting comprehensive performance analysis...\n")

# NOTE: Update these variable names if your variables are named differently
sp500_comp, bitcoin_comp = compare_hybrid_models_performance(
    sp500_hybrid_results,      # ← Change if your variable name is different
    bitcoin_hybrid_results,    # ← Change if your variable name is different
    sp500_clean,               # ← Change if your variable name is different
    bitcoin_clean              # ← Change if your variable name is different
)

print("\nAnalysis complete!")
print("See the cell output above for the detailed metrics and charts.")

## Step 6: Optional - Save Results to CSV

In [None]:
# Optional: Save the results to CSV files
sp500_comp.to_csv('sp500_performance_indicators.csv', index=False)
bitcoin_comp.to_csv('bitcoin_performance_indicators.csv', index=False)

print("Results saved to:")
print("- sp500_performance_indicators.csv")
print("- bitcoin_performance_indicators.csv")

## Step 7: Metric Interpretations

| Metric | Full Name | Good Value | Interpretation |
|--------|-----------|------------|----------------|
| **ARC** | Annualized Return | Higher | Annual compound growth (%) |
| **ASD** | Annualized Std Dev | Lower | Annual volatility (%) - lower = smoother |
| **MD** | Maximum Drawdown | Closer to 0 | Worst loss from peak (%) - e.g., -25% |
| **IR** | Information Ratio | Higher | Mean daily excess return per unit of daily tracking error (not annualized) |
| **IR\*** | Modified Information Ratio | > 0.5 | Annualized version (better for short periods) |
| **SR** | Sharpe Ratio | > 1.0 | Annualized excess return over the risk-free rate per unit of volatility |

### What These Mean:

- **Higher ARC** = The strategy compounds to a higher return per year; compare it with the buy-and-hold row
- **Lower ASD** = The strategy has less volatile returns (smoother ride)
- **MD closer to 0** = The strategy's worst peak-to-trough loss is smaller
- **Higher IR/IR\*** = The strategy earns more excess return over the benchmark per unit of tracking error
- **Higher SR** = The strategy earns more return for the risk it takes
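
To see how the annualized figures relate to daily numbers, the sketch below applies the same formulas as `annualized_return()`, `annualized_std()` and `sharpe_ratio()` to a made-up year of returns. The return pattern is an assumption chosen for easy arithmetic and is far smoother than real market data, so the resulting ratios are not realistic targets.

```python
import numpy as np
import pandas as pd

TRADING_DAYS = 252  # same convention as in Step 2

# Made-up daily returns: one year alternating +2% and -1%
daily = pd.Series([0.02, -0.01] * (TRADING_DAYS // 2))

# ARC: compound the daily returns, then scale the exponent to one year
arc = (1 + daily).prod() ** (TRADING_DAYS / len(daily)) - 1

# ASD: daily standard deviation scaled by the square root of time
asd = daily.std() * np.sqrt(TRADING_DAYS)

# SR with a risk-free rate of 0: mean over std, annualized
sr = daily.mean() / daily.std() * np.sqrt(TRADING_DAYS)

print(f"ARC = {arc:.2%}, ASD = {asd:.2%}, SR = {sr:.2f}")
```

The mean daily return of 0.5% against a daily standard deviation of about 1.5% gives a daily ratio near 0.33, which the square-root-of-time scaling lifts to an annualized Sharpe ratio above 5.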