# Comprehensive Portfolio Optimization Backtest

This notebook demonstrates a comprehensive backtest of multiple portfolio optimization algorithms available in the AlloOptim library.

## Overview

This backtest compares the performance of:
- **13 individual optimizers** from the AlloOptim library
- **Ensemble methods** (average of all optimizer weights)
- **S&P 500 benchmark** for comparison
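
The ensemble entry above is described as the average of all optimizer weights. A minimal sketch of that averaging follows, assuming each optimizer produces one weight per asset as a dict. The function name `average_weights`, the optimizer labels, and the tickers are hypothetical illustrations and are not part of the AlloOptim API.

```python
def average_weights(optimizer_weights):
    """Equal-weight average of per-optimizer weight dicts (asset -> weight)."""
    if not optimizer_weights:
        raise ValueError("need at least one optimizer")
    # Union of all assets that any optimizer holds
    assets = sorted({a for w in optimizer_weights.values() for a in w})
    n = len(optimizer_weights)
    # Assumption: an asset missing from an optimizer's output has weight 0
    return {
        a: sum(w.get(a, 0.0) for w in optimizer_weights.values()) / n
        for a in assets
    }


# Hypothetical optimizer outputs
weights = {
    "opt_a": {"AAPL": 0.5, "MSFT": 0.5},
    "opt_b": {"AAPL": 1.0},
}
ensemble = average_weights(weights)
print(ensemble)
```

If every input sums to 1, the average also sums to 1, so no renormalization is needed in this simple case.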

### Backtest Configuration
- **Period**: 2014-12-31 to 2024-12-31 (10 years)
- **Rebalancing**: Every 5 trading days
- **Lookback**: 90 days for optimizer estimation
- **Universe**: ~400 assets from Alpaca universe
- **Execution**: Perfect execution (target = actual weights)
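
To make the rebalancing and lookback settings concrete, the sketch below enumerates rebalance days and their estimation windows over integer trading-day indices. It assumes the 90-day lookback counts trading-day observations, that the first rebalance waits for a full window, and that the window excludes the rebalance day itself. The actual `BacktestEngine` may handle these details differently, and `rebalance_windows` is a hypothetical helper.

```python
def rebalance_windows(n_days, rebalance_every=5, lookback=90):
    """Yield (rebalance_index, window_start, window_end) over trading-day indices.

    The half-open window [window_start, window_end) ends just before the
    rebalance day, so the estimate never uses same-day data.
    """
    for t in range(lookback, n_days, rebalance_every):
        yield t, t - lookback, t


# With 100 trading days there is room for two rebalances
windows = list(rebalance_windows(n_days=100))
print(windows)
```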

### Performance Metrics
- Sharpe ratio, maximum drawdown, time underwater
- Risk-adjusted returns, portfolio turnover
- Daily return statistics, computation time
- Optimizer clustering analysis
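
As a rough illustration of the first few metrics, the sketch below computes an annualized Sharpe ratio, the maximum drawdown, and the longest underwater stretch from a list of daily returns. The 252-day annualization factor, the zero risk-free rate, and the function names are assumptions made for this example. The library's own metric definitions may differ.

```python
import math
import statistics


def sharpe_ratio(daily_returns, periods_per_year=252, risk_free_daily=0.0):
    """Annualized Sharpe ratio from daily returns (sample standard deviation)."""
    excess = [r - risk_free_daily for r in daily_returns]
    sd = statistics.stdev(excess)
    if sd == 0:
        return float("nan")
    return statistics.mean(excess) / sd * math.sqrt(periods_per_year)


def drawdown_stats(daily_returns):
    """Return (max_drawdown, longest_underwater_days) from the equity curve."""
    equity, peak = 1.0, 1.0
    max_dd, underwater, longest = 0.0, 0, 0
    for r in daily_returns:
        equity *= 1.0 + r
        if equity >= peak:
            # New high: reset the underwater counter
            peak, underwater = equity, 0
        else:
            underwater += 1
            longest = max(longest, underwater)
            max_dd = max(max_dd, 1.0 - equity / peak)
    return max_dd, longest


# Toy return series: one gain, a 50% loss, then a partial recovery
returns = [0.10, -0.50, 0.20, 0.10]
max_dd, longest = drawdown_stats(returns)
print(f"Sharpe: {sharpe_ratio(returns):.3f}")
print(f"Max drawdown: {max_dd:.1%}, longest underwater: {longest} days")
```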

## 1. Import Required Libraries

Import the necessary libraries for backtesting, data analysis, and visualization.

In [None]:
import logging
import warnings
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

# AlloOptim imports
from allo_optim.backtest.backtest_config import BacktestConfig, config
from allo_optim.backtest.backtest_engine import BacktestEngine
from allo_optim.backtest.backtest_report import generate_report
from allo_optim.backtest.backtest_visualizer import create_visualizations
from allo_optim.backtest.cluster_analyzer import ClusterAnalyzer

# Notebook utilities
from notebook_utils import (
    display_optimizer_comparison,
    plot_returns_distribution,
    create_performance_summary,
    plot_cumulative_returns,
    save_notebook_results,
    print_backtest_summary
)

# Configure plotting style
plt.style.use('default')
sns.set_palette("husl")

# Suppress warnings for cleaner output
warnings.filterwarnings("ignore")

# Configure logging to show in notebook
logging.basicConfig(level=logging.INFO, format="%(asctime)s - %(levelname)s - %(message)s")
logger = logging.getLogger(__name__)

print("Libraries imported successfully!")

## 2. Backtest Configuration

Let's examine and display the backtest configuration parameters.

In [None]:
# Display backtest configuration
print("Backtest Configuration:")
print("=" * 50)
print(f"Start Date: {config.start_date}")
print(f"End Date: {config.end_date}")
print(f"Rebalancing Frequency: Every {config.rebalance_frequency} trading days")
print(f"Lookback Window: {config.lookback_days} days")
print(f"Results Directory: {config.results_dir}")
print("Random Seed: Not used in current implementation")

# Show date range for the report
start_date, end_date = config.get_report_date_range()
print(f"Report Date Range: {start_date} to {end_date}")

print("\nBacktest configuration loaded successfully!")

## 3. Run the Backtest

Now we'll execute the comprehensive backtest. This will test all available optimizers and may take several minutes to complete.

In [None]:
%%time
# Initialize and run the backtest
print("Initializing backtest engine...")
backtest_engine = BacktestEngine(config)

print("Running comprehensive backtest...")
print("This may take several minutes depending on your hardware...")

# Run the backtest
results = backtest_engine.run_backtest()

if results:
    print("\n✅ Backtest completed successfully!")
    print(f"   Tested {len(results)} optimizers")
    print("   Results available for analysis")
else:
    print("❌ Backtest failed - no results generated")
    raise RuntimeError("Backtest execution failed")

## 4. Analyze Results

Let's examine the backtest results and performance metrics.

In [None]:
# Extract performance data for analysis
performance_data = []
optimizer_names = []

for name, data in results.items():
    if 'metrics' in data:
        row = {'optimizer': name}
        row.update(data['metrics'])
        performance_data.append(row)
        optimizer_names.append(name)

# Create DataFrame for analysis
df_results = pd.DataFrame(performance_data)
df_results = df_results.sort_values('sharpe_ratio', ascending=False)

print("Backtest Results Summary:")
print("=" * 60)
print(f"Total optimizers tested: {len(df_results)}")
print(f"Best Sharpe ratio: {df_results.iloc[0]['sharpe_ratio']:.3f} ({df_results.iloc[0]['optimizer']})")
print(f"Worst Sharpe ratio: {df_results.iloc[-1]['sharpe_ratio']:.3f} ({df_results.iloc[-1]['optimizer']})")
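# Aggregate Sharpe statistics across all optimizers (assumed intent of the two truncated print calls)
print(f"Mean Sharpe ratio: {df_results['sharpe_ratio'].mean():.3f}")
print(f"Median Sharpe ratio: {df_results['sharpe_ratio'].median():.3f}")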

In [None]:
# Display top 10 performers using utility function
print("\nTop 10 Optimizers by Sharpe Ratio:")
print("-" * 80)
top_10_df = display_optimizer_comparison(results, top_n=10)
print(top_10_df.to_string(index=False))

## 5. Clustering Analysis

Let's analyze how the optimizers cluster based on their performance and portfolio similarity.
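
The internals of `ClusterAnalyzer` are not shown in this notebook. As a hedged illustration of what grouping optimizers by similarity can look like, the sketch below uses a deliberately simple stand-in technique: single-linkage grouping on the correlation of daily returns, where two optimizers land in the same cluster when their correlation reaches a threshold. The function names, the 0.8 threshold, and the sample data are all hypothetical.

```python
import math


def correlation(x, y):
    """Pearson correlation of two equal-length, non-constant series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def cluster_by_correlation(series, min_corr=0.8):
    """Single-linkage grouping: join optimizers whose correlation >= min_corr."""
    names = list(series)
    parent = {n: n for n in names}

    def find(n):
        # Union-find lookup with path halving
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if correlation(series[a], series[b]) >= min_corr:
                parent[find(a)] = find(b)

    clusters = {}
    for n in names:
        clusters.setdefault(find(n), []).append(n)
    return sorted(sorted(c) for c in clusters.values())


# Hypothetical daily returns for three optimizers
daily_returns = {
    "opt_a": [0.01, -0.02, 0.03, 0.00],
    "opt_b": [0.02, -0.04, 0.06, 0.00],   # scaled copy of opt_a
    "opt_c": [-0.01, 0.02, -0.03, 0.00],  # mirror image of opt_a
}
clusters = cluster_by_correlation(daily_returns)
print(clusters)
```

Scores such as the silhouette or Calinski-Harabasz index, which the summary below reports, would then measure how well separated the resulting groups are.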

In [None]:
# Perform clustering analysis
print("Performing optimizer clustering analysis...")
cluster_analyzer = ClusterAnalyzer(results)
clustering_results = cluster_analyzer.analyze_clusters()

print("✅ Clustering analysis completed!")

# Display clustering summary
if clustering_results and 'summary' in clustering_results:
    print("\nClustering Analysis Summary:")
    print("-" * 40)
    summary = clustering_results['summary']
    print(f"Number of clusters: {summary.get('n_clusters', 'N/A')}")
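    # Apply numeric formatting only when a score is present;
    # formatting the 'N/A' fallback string with :.3f would raise ValueError
    silhouette = summary.get('silhouette_score')
    calinski = summary.get('calinski_harabasz_score')
    print(f"Silhouette score: {silhouette:.3f}" if silhouette is not None else "Silhouette score: N/A")
    print(f"Calinski-Harabasz score: {calinski:.1f}" if calinski is not None else "Calinski-Harabasz score: N/A")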

    # Show cluster sizes
    if 'cluster_sizes' in summary:
        print(f"\nCluster sizes: {summary['cluster_sizes']}")

## 6. Returns Distribution Analysis

Let's examine the distribution of daily returns for the top performing optimizers.

In [None]:
# Plot returns distribution for top 5 optimizers
print("Creating returns distribution plots...")
fig = plot_returns_distribution(results, top_n=5)
plt.show()

print("✅ Returns distribution analysis completed!")

## 7. Visualizations

Let's create visualizations to better understand the backtest results.

In [None]:
# Create visualizations
print("Creating visualizations...")
create_visualizations(results, clustering_results, config.results_dir)
print("✅ Visualizations created and saved to results directory")

In [None]:
# Plot cumulative returns for top optimizers
print("Creating cumulative returns comparison...")
fig = plot_cumulative_returns(results, top_n=5)
plt.show()

print("✅ Cumulative returns plot created!")

## 8. Generate Report

Let's generate a comprehensive report of the backtest results.

In [None]:
# Generate comprehensive report
print("Generating comprehensive backtest report...")
report = generate_report(results, clustering_results, config)

# Save report to file
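report_path = config.results_dir / "comprehensive_backtest_report.md"
# Make sure the target directory exists before writing
report_path.parent.mkdir(parents=True, exist_ok=True)
# Write as UTF-8 so non-ASCII characters in the report are preserved on all platforms
with open(report_path, "w", encoding="utf-8") as f:
    f.write(report)

print(f"✅ Report saved to: {report_path}")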

# Display first part of the report
print("\n" + "="*80)
print("REPORT PREVIEW (first 20 lines)")
print("="*80)
lines = report.split('\n')[:20]
for line in lines:
    print(line)
print("...")
print(f"\n📄 Full report available at: {report_path}")

In [None]:
# Save notebook results to CSV files
print("Saving notebook results to CSV files...")
results_dir = save_notebook_results(results, clustering_results, "notebook_results")
print(f"✅ Results saved to: {results_dir}")

## 9. Summary and Next Steps

### What We've Accomplished
✅ **Comprehensive backtest** of 13+ portfolio optimization algorithms  
✅ **Performance analysis** with key risk-adjusted metrics  
✅ **Clustering analysis** to understand optimizer relationships  
✅ **Returns distribution analysis** for top performers  
✅ **Cumulative returns comparison** over time  
✅ **Visualizations** for intuitive result exploration  
✅ **Detailed report** with all findings and insights  
✅ **CSV exports** for further analysis  

### Key Findings
- **Best performing optimizer**: the first row of `df_results`, which is sorted by Sharpe ratio in Section 4
- **Total optimizers tested**: `len(df_results)`, as printed in Section 4
- **Test period**: 10 years (2014-2024)
- **Rebalancing**: Every 5 trading days

### Files Generated
- 📊 **Visualizations**: PNG plots in the `plots/` subdirectory of `config.results_dir`
- 📋 **Report**: Markdown report at `report_path` (`comprehensive_backtest_report.md` in `config.results_dir`)
- 📈 **Results**: CSV data in `backtest_results.csv` under `config.results_dir`
- 💾 **Notebook Results**: CSV files in `notebook_results/`

### Next Steps
1. **Analyze specific optimizers** in more detail
2. **Compare different rebalancing frequencies**
3. **Test on different time periods**
4. **Incorporate transaction costs**
5. **Add custom optimizers** to the comparison
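
For the transaction-cost step, one common starting point is to charge a proportional fee on the weight traded at each rebalance. The sketch below is an illustration under stated assumptions: a flat cost in basis points on traded notional, no weight drift between rebalances (consistent with the perfect-execution setting used here), and hypothetical function names and tickers.

```python
def turnover(old_weights, new_weights):
    """One-way turnover: half the sum of absolute weight changes."""
    assets = set(old_weights) | set(new_weights)
    return 0.5 * sum(
        abs(new_weights.get(a, 0.0) - old_weights.get(a, 0.0)) for a in assets
    )


def net_return(gross_return, old_weights, new_weights, cost_bps=10.0):
    """Deduct a proportional cost on traded notional (buys plus sells)."""
    traded = 2.0 * turnover(old_weights, new_weights)
    return gross_return - traded * cost_bps / 10_000.0


# Hypothetical rebalance: 20% of the portfolio moves from AAPL to GOOG
old = {"AAPL": 0.6, "MSFT": 0.4}
new = {"AAPL": 0.4, "MSFT": 0.4, "GOOG": 0.2}
print(f"Turnover: {turnover(old, new):.2f}")
print(f"Net return: {net_return(0.01, old, new):.4f}")
```

With rebalancing every 5 trading days, even small per-trade costs accumulate, so high-turnover optimizers can lose much of their gross advantage once costs are applied.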

---
**Backtest completed successfully!** 🎉

In [None]:
# Print comprehensive backtest summary
print_backtest_summary(results)