# AlphaForge Research Notebook 🚀

**Interactive environment for systematic alpha research and factor model validation**

This notebook walks through an AlphaForge workflow for quantitative factor research: data loading, factor analysis, portfolio construction, backtesting, walk-forward validation, and performance analysis.

## Table of Contents
1. [Setup and Data Loading](#setup)
2. [Factor Analysis](#factors)
3. [Portfolio Construction](#portfolio)
4. [Backtesting](#backtest)
5. [Walk-Forward Analysis](#walkforward)
6. [Performance Analysis](#performance)
7. [Advanced Techniques](#advanced)

## 1. Setup and Data Loading {#setup}

In [None]:
# Import required libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from datetime import datetime, timedelta
import warnings
warnings.filterwarnings('ignore')

# Import AlphaForge framework
from factor_backtester import (
    Backtester, BacktestConfig, DataProvider, 
    FactorCalculator, PortfolioConstructor, 
    PerformanceAnalyzer
)

# Set up plotting
plt.style.use('seaborn-v0_8')
sns.set_palette("husl")
%matplotlib inline

print("🚀 AlphaForge Research Environment Ready!")
print("📊 Systematic alpha research toolkit loaded")

In [None]:
# AlphaForge Configuration
config = BacktestConfig(
    start_date="2015-01-01",
    end_date="2023-12-31",
    rebalance_freq="M",  # Monthly rebalancing
    transaction_cost=0.001,  # 10 bps transaction cost
    max_weight=0.05,  # 5% max position
    min_weight=-0.05,  # 5% max short position
    leverage=1.0  # No leverage
)

# Initialize AlphaForge components
data_provider = DataProvider()
backtester = Backtester(config)

print(f"⚙️ AlphaForge Configuration:")
print(f"   📅 Period: {config.start_date} to {config.end_date}")
print(f"   🔄 Rebalancing: {config.rebalance_freq}")
print(f"   💰 Transaction Cost: {config.transaction_cost:.1%}")
print(f"   📊 Position Limits: {config.min_weight:.1%} to {config.max_weight:.1%}")
print(f"   🎯 Leverage: {config.leverage}x")

In [None]:
# Load market universe
print("📈 Loading S&P 500 universe...")
tickers = data_provider.get_universe("SP500")
print(f"🌟 Universe: {len(tickers)} stocks")
print(f"📋 Sample tickers: {tickers[:15]}")

# Fetch market data with caching
print("\n🔄 Fetching market data (this may take a moment)...")
raw_data = data_provider.fetch_yahoo_data(tickers, config.start_date, config.end_date)
print(f"✅ Loaded {len(raw_data):,} observations")
print(f"📊 {len(raw_data['ticker'].unique())} unique tickers")
print(f"📅 Date range: {raw_data['Date'].min()} to {raw_data['Date'].max()}")

# Display sample data
print("\n📋 Sample Data:")
display(raw_data.head(10))

## 2. Factor Analysis {#factors}

Calculate and analyze classic risk factors using AlphaForge's factor engineering capabilities.
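
As a rough illustration of what a factor score and its cross-sectional rank can look like, here is a minimal pandas sketch. It is not AlphaForge's implementation: the 12-1 momentum definition, the wide price layout, and the helper names `momentum_12_1` and `cross_sectional_rank` are assumptions made for this example.

```python
import numpy as np
import pandas as pd

def momentum_12_1(prices: pd.DataFrame, lookback: int = 252, skip: int = 21) -> pd.DataFrame:
    """Hypothetical 12-1 momentum: return over `lookback` days, skipping the last `skip` days.

    `prices` is a wide frame (rows = dates, columns = tickers).
    """
    return prices.shift(skip) / prices.shift(lookback) - 1.0

def cross_sectional_rank(scores: pd.DataFrame) -> pd.DataFrame:
    """Percentile-rank each date's scores across tickers (values in (0, 1])."""
    return scores.rank(axis=1, pct=True)

# Toy data: three synthetic tickers with constant daily growth rates
dates = pd.bdate_range("2020-01-01", periods=300)
growth = {"AAA": 1.001, "BBB": 1.000, "CCC": 0.999}
prices = pd.DataFrame(
    {ticker: 100.0 * g ** np.arange(300) for ticker, g in growth.items()},
    index=dates,
)

mom = momentum_12_1(prices)
latest_ranks = cross_sectional_rank(mom).iloc[-1]
print(latest_ranks)
```

The first `lookback` rows are NaN because there is not enough history, which is one reason factor coverage can start later than the raw price data.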

In [None]:
# Calculate factors using AlphaForge
print("🔬 Calculating systematic risk factors...")
factor_calculator = FactorCalculator(raw_data)
factor_data = factor_calculator.calculate_all_factors()

print(f"📊 Factor data shape: {factor_data.shape}")
print(f"📅 Factor coverage: {factor_data['Date'].min()} to {factor_data['Date'].max()}")
print(f"🎯 {len(factor_data['ticker'].unique())} stocks with factor scores")

# Display factor summary statistics
factor_cols = ['momentum', 'value', 'quality', 'size', 'low_vol']
print("\n📈 Factor Summary Statistics:")
factor_summary = factor_data[factor_cols].describe()
display(factor_summary)

In [None]:
# Factor correlation and distribution analysis
fig, axes = plt.subplots(2, 2, figsize=(16, 12))

# Factor correlation heatmap
corr_matrix = factor_data[factor_cols].corr()
sns.heatmap(corr_matrix, annot=True, cmap='RdBu_r', center=0, 
            square=True, ax=axes[0, 0], fmt='.3f')
axes[0, 0].set_title('📊 Factor Correlation Matrix')

# Factor rank distributions
factor_ranks = [col + '_rank' for col in factor_cols if col + '_rank' in factor_data.columns]
if factor_ranks:
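    # Use plot.hist so all columns overlay on this one axis;
    # DataFrame.hist with several columns would clear the whole figure.
    factor_data[factor_ranks].plot.hist(bins=50, ax=axes[0, 1], alpha=0.7)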
    axes[0, 1].set_title('📈 Factor Rank Distributions')

# Factor stability over time (cross-sectional means)
factor_ts = factor_data.groupby('Date')[factor_cols].mean()
for factor in factor_cols:
    axes[1, 0].plot(factor_ts.index, factor_ts[factor], label=factor.title(), alpha=0.8)
axes[1, 0].set_title('🔄 Factor Evolution Over Time')
axes[1, 0].set_ylabel('Cross-Sectional Mean')
axes[1, 0].legend()
axes[1, 0].grid(True, alpha=0.3)

# Factor volatility
factor_vol = factor_data.groupby('Date')[factor_cols].std().mean()
factor_vol.plot(kind='bar', ax=axes[1, 1], alpha=0.7)
axes[1, 1].set_title('📊 Average Factor Volatility')
axes[1, 1].set_ylabel('Cross-Sectional Std Dev')
axes[1, 1].tick_params(axis='x', rotation=45)

plt.tight_layout()
plt.show()

## 3. Portfolio Construction {#portfolio}

Demonstrate systematic portfolio construction with factor-based signals.
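
To make the position limits concrete, the sketch below turns cross-sectional scores into long-short weights. The internals of `PortfolioConstructor.construct_portfolio` are not shown in this notebook, so the function `long_short_weights` and its scaling rule are assumptions; only the limit values (±5%, leverage 1.0) come from the configuration above.

```python
import numpy as np
import pandas as pd

def long_short_weights(scores: pd.Series, max_weight: float = 0.05,
                       min_weight: float = -0.05, leverage: float = 1.0) -> pd.Series:
    """Hypothetical score-to-weight rule for a long-short book.

    Scores are demeaned, scaled so gross exposure equals `leverage`,
    then clipped to the per-position limits. Clipping can leave gross
    exposure below `leverage`; this sketch does not redistribute it.
    """
    demeaned = scores - scores.mean()
    gross = demeaned.abs().sum()
    if gross == 0:
        return pd.Series(0.0, index=scores.index)
    raw = demeaned / gross * leverage
    return raw.clip(lower=min_weight, upper=max_weight)

# Toy universe of 20 names with evenly spaced scores
demo_scores = pd.Series(np.linspace(-1.0, 1.0, 20),
                        index=[f"T{i:02d}" for i in range(20)])
demo_weights = long_short_weights(demo_scores)
print(f"net={demo_weights.sum():+.4f}  gross={demo_weights.abs().sum():.4f}")
```

With symmetric scores the clipped book stays dollar-neutral, but with skewed scores clipping can introduce net exposure, so a production rule would typically re-balance after applying the caps.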

In [None]:
# Portfolio construction example
portfolio_constructor = PortfolioConstructor(config)

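# Select a sample date for analysis (median of the sorted unique dates,
# so the choice does not depend on the row order of factor_data)
unique_dates = factor_data['Date'].sort_values().unique()
sample_date = pd.Timestamp(unique_dates[len(unique_dates) // 2])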
print(f"📅 Portfolio construction date: {sample_date.strftime('%Y-%m-%d')}")

# Construct portfolio with AlphaForge
weights = portfolio_constructor.construct_portfolio(
    factor_data, sample_date, use_shrinkage=True, use_lasso=True
)

print(f"\n📊 Portfolio Statistics:")
print(f"   🎯 Total positions: {len(weights)}")
print(f"   📈 Long positions: {(weights > 0).sum()}")
print(f"   📉 Short positions: {(weights < 0).sum()}")
print(f"   💰 Total long weight: {weights[weights > 0].sum():.2%}")
print(f"   💸 Total short weight: {weights[weights < 0].sum():.2%}")
print(f"   🎪 Net exposure: {weights.sum():.2%}")
print(f"   🌐 Gross exposure: {weights.abs().sum():.2%}")

# Display top holdings
if len(weights) > 0:
    print("\n🔝 Top 10 Long Positions:")
    display(weights.nlargest(10).to_frame('Weight'))
    
    print("\n🔻 Top 10 Short Positions:")
    display(weights.nsmallest(10).to_frame('Weight'))

In [None]:
# Visualize portfolio construction
if len(weights) > 0:
    fig, axes = plt.subplots(2, 2, figsize=(15, 10))
    
    # Weight distribution
    axes[0, 0].hist(weights, bins=30, alpha=0.7, color='steelblue', edgecolor='black')
    axes[0, 0].set_title('📊 Portfolio Weight Distribution')
    axes[0, 0].set_xlabel('Weight')
    axes[0, 0].set_ylabel('Frequency')
    axes[0, 0].axvline(0, color='red', linestyle='--', alpha=0.7)
    
    # Long vs Short exposure
    long_weights = weights[weights > 0]
    short_weights = weights[weights < 0]
    exposure_data = ['Long', 'Short']
    exposure_values = [long_weights.sum(), abs(short_weights.sum())]
    colors = ['green', 'red']
    
    axes[0, 1].bar(exposure_data, exposure_values, color=colors, alpha=0.7)
    axes[0, 1].set_title('📈 Long vs Short Exposure')
    axes[0, 1].set_ylabel('Total Weight')
    
    # Top positions by absolute weight
    top_positions = weights.abs().nlargest(15)
    position_colors = ['green' if weights[ticker] > 0 else 'red' for ticker in top_positions.index]
    
    y_pos = range(len(top_positions))
    axes[1, 0].barh(y_pos, top_positions.values, color=position_colors, alpha=0.7)
    axes[1, 0].set_yticks(y_pos)
    axes[1, 0].set_yticklabels(top_positions.index, fontsize=8)
    axes[1, 0].set_title('🎯 Top 15 Positions by Absolute Weight')
    axes[1, 0].set_xlabel('Absolute Weight')
    
    # Portfolio utilization
    gross_exposure = weights.abs().sum()
    cash_allocation = max(0.0, 1 - gross_exposure)  # Clamp at zero: pie() rejects negative values
    
    utilization_labels = ['Invested', 'Cash']
    utilization_values = [gross_exposure, cash_allocation]
    
    axes[1, 1].pie(utilization_values, labels=utilization_labels, 
                   autopct='%1.1f%%', startangle=90, colors=['lightblue', 'lightgray'])
    axes[1, 1].set_title('💰 Portfolio Utilization')
    
    plt.tight_layout()
    plt.show()
else:
    print("⚠️ No portfolio weights generated for visualization")

## 4. Backtesting {#backtest}

Execute comprehensive backtesting with transaction costs and performance analytics.
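
One common way to model transaction costs is a proportional charge on turnover. The sketch below shows that idea under stated assumptions: the function name `net_returns` is hypothetical, weights are treated as targets with no intra-period drift, and the 10 bps rate mirrors `transaction_cost=0.001` from the configuration. AlphaForge's own cost model may differ.

```python
import numpy as np

def net_returns(gross_returns, weights, cost_rate=0.001):
    """Subtract proportional trading costs from gross portfolio returns.

    gross_returns: shape (T,), portfolio return per period
    weights:       shape (T, N), target weights held during each period
    cost_rate:     cost per unit of traded notional (0.001 = 10 bps)
    """
    gross_returns = np.asarray(gross_returns, dtype=float)
    weights = np.asarray(weights, dtype=float)
    # Turnover is the sum of absolute weight changes; the first period trades from cash.
    previous = np.vstack([np.zeros((1, weights.shape[1])), weights[:-1]])
    turnover = np.abs(weights - previous).sum(axis=1)
    costs = cost_rate * turnover
    return gross_returns - costs, costs

# Toy example: hold a pair trade for two periods, then flip it
demo_weights = [[0.5, -0.5], [0.5, -0.5], [-0.5, 0.5]]
demo_gross = [0.01, 0.02, -0.01]
demo_net, demo_costs = net_returns(demo_gross, demo_weights)
print(demo_costs, demo_net)
```

Flipping the book in the last period doubles the turnover relative to the initial build, which is why high-turnover signals are penalized most by this kind of cost model.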

In [None]:
# Run comprehensive backtest
print("🚀 Running AlphaForge backtest...")
print("   🔬 Applying Bayesian shrinkage and Lasso regularization")
print("   💰 Including realistic transaction costs")
print("   📊 Computing comprehensive performance metrics")

results = backtester.run_backtest(tickers=tickers, use_shrinkage=True, use_lasso=True)

if results:
    print("\n✅ Backtest completed successfully!")
    print("\n📊 Performance Summary:")
    print("=" * 40)
    
    metrics = results['metrics']
    
    print(f"📈 Total Return:        {metrics['total_return']:>8.2%}")
    print(f"📊 Annualized Return:   {metrics['annualized_return']:>8.2%}")
    print(f"📉 Volatility:          {metrics['volatility']:>8.2%}")
    print(f"⚡ Sharpe Ratio:        {metrics['sharpe_ratio']:>8.2f}")
    print(f"🔻 Maximum Drawdown:    {metrics['max_drawdown']:>8.2%}")
    print(f"🎯 Win Rate:            {metrics['win_rate']:>8.2%}")
    print(f"📐 Skewness:            {metrics['skewness']:>8.2f}")
    print(f"📊 Kurtosis:            {metrics['kurtosis']:>8.2f}")
    print(f"📋 Observations:        {metrics['num_observations']:>8d}")
    
    # Transaction cost analysis
    total_costs = results['transaction_costs'].sum()
    print(f"\n💸 Transaction Cost Analysis:")
    print(f"   Total Costs:         {total_costs:.4f}")
    print(f"   Average per Period:  {results['transaction_costs'].mean():.4f}")
    print(f"   Annual Cost Drag:    {total_costs / len(results['returns']) * 252:.2%}")
    
else:
    print("❌ Backtest failed - check data availability and configuration")

In [None]:
# Comprehensive performance visualization
if results:
    returns = results['returns']
    gross_returns = results['gross_returns']
    transaction_costs = results['transaction_costs']
    
    # Create performance dashboard
    fig, axes = plt.subplots(2, 3, figsize=(20, 12))
    fig.suptitle('📊 AlphaForge Performance Dashboard', fontsize=16, fontweight='bold')
    
    # 1. Cumulative returns
    cum_returns = (1 + returns).cumprod()
    cum_gross_returns = (1 + gross_returns).cumprod()
    
    axes[0, 0].plot(cum_returns.index, cum_returns.values, 
                    label='Net Returns', linewidth=2.5, color='steelblue')
    axes[0, 0].plot(cum_gross_returns.index, cum_gross_returns.values, 
                    label='Gross Returns', linewidth=2, alpha=0.7, color='orange')
    axes[0, 0].set_title('📈 Cumulative Returns')
    axes[0, 0].set_ylabel('Cumulative Return')
    axes[0, 0].legend()
    axes[0, 0].grid(True, alpha=0.3)
    
    # 2. Rolling Sharpe ratio
    rolling_sharpe = returns.rolling(60).mean() / returns.rolling(60).std() * np.sqrt(252)
    axes[0, 1].plot(rolling_sharpe.index, rolling_sharpe.values, 
                    color='green', linewidth=2)
    axes[0, 1].set_title('⚡ Rolling Sharpe Ratio (60-day)')
    axes[0, 1].set_ylabel('Sharpe Ratio')
    axes[0, 1].axhline(y=0, color='red', linestyle='--', alpha=0.7)
    axes[0, 1].axhline(y=1, color='green', linestyle='--', alpha=0.5, label='Target')
    axes[0, 1].legend()  # Show the 'Target' label
    axes[0, 1].grid(True, alpha=0.3)
    
    # 3. Drawdown analysis
    rolling_max = cum_returns.expanding().max()
    drawdown = (cum_returns / rolling_max) - 1
    axes[0, 2].fill_between(drawdown.index, drawdown.values, 0, 
                           alpha=0.4, color='red', label='Drawdown')
    axes[0, 2].plot(drawdown.index, drawdown.values, color='darkred', linewidth=1)
    axes[0, 2].set_title('🔻 Drawdown Analysis')
    axes[0, 2].set_ylabel('Drawdown')
    axes[0, 2].grid(True, alpha=0.3)
    
    # 4. Return distribution
    axes[1, 0].hist(returns, bins=50, alpha=0.7, color='lightblue', 
                    edgecolor='black', density=True)
    axes[1, 0].axvline(returns.mean(), color='red', linestyle='--', 
                       alpha=0.8, linewidth=2, label=f'Mean: {returns.mean():.3f}')
    axes[1, 0].axvline(returns.median(), color='green', linestyle='--', 
                       alpha=0.8, linewidth=2, label=f'Median: {returns.median():.3f}')
    axes[1, 0].set_title('📊 Return Distribution')
    axes[1, 0].set_xlabel('Daily Return')
    axes[1, 0].set_ylabel('Density')
    axes[1, 0].legend()
    axes[1, 0].grid(True, alpha=0.3)
    
    # 5. Transaction costs over time
    axes[1, 1].plot(transaction_costs.index, transaction_costs.cumsum(), 
                    color='purple', linewidth=2)
    axes[1, 1].set_title('💸 Cumulative Transaction Costs')
    axes[1, 1].set_ylabel('Cumulative Costs')
    axes[1, 1].grid(True, alpha=0.3)
    
    # 6. Rolling volatility
    rolling_vol = returns.rolling(60).std() * np.sqrt(252)
    axes[1, 2].plot(rolling_vol.index, rolling_vol.values, 
                    color='orange', linewidth=2)
    axes[1, 2].set_title('📉 Rolling Volatility (60-day)')
    axes[1, 2].set_ylabel('Annualized Volatility')
    axes[1, 2].grid(True, alpha=0.3)
    
    plt.tight_layout()
    plt.show()
    
    print("📊 Performance dashboard generated successfully!")

## 5. Walk-Forward Analysis {#walkforward}

Rigorous out-of-sample testing with expanding windows for unbiased performance estimates.
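
The expanding-window scheme can be sketched as a generator of train/test index ranges. The helper `expanding_window_splits` is hypothetical and only illustrates the windowing logic; the defaults mirror the `initial_window=504` and `step_size=21` arguments used in this section, and `walk_forward_analysis` itself may handle edge cases differently.

```python
from typing import Iterator, Tuple

def expanding_window_splits(n_obs: int, initial_window: int = 504,
                            step_size: int = 21) -> Iterator[Tuple[range, range]]:
    """Yield (train, test) index ranges for an expanding-window walk-forward.

    The training range always starts at 0 and grows by `step_size` each
    step; the test range is the next `step_size` observations, so test
    periods never overlap and never precede their training data.
    """
    train_end = initial_window
    while train_end < n_obs:
        test_end = min(train_end + step_size, n_obs)
        yield range(0, train_end), range(train_end, test_end)
        train_end = test_end

splits = list(expanding_window_splits(n_obs=600))
first_train, first_test = splits[0]
last_train, last_test = splits[-1]
print(len(splits), len(first_train), len(first_test), len(last_test))
```

Because each test range starts exactly where its training range ends, concatenating the test ranges gives one continuous out-of-sample track with no look-ahead.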

In [None]:
# Walk-forward out-of-sample analysis
print("🔄 Running walk-forward analysis...")
print("   📈 Expanding window approach for robust validation")
print("   🎯 Out-of-sample performance estimation")
print("   ⚡ Statistical significance testing")

oos_results = backtester.walk_forward_analysis(
    tickers=tickers,
    initial_window=504,  # 2 years initial training
    step_size=21  # Monthly steps
)

if oos_results:
    print("\n✅ Walk-forward analysis completed!")
    print("\n📊 Out-of-Sample Results:")
    print("=" * 45)
    
    oos_metrics = oos_results['oos_metrics']
    
    print(f"📈 OOS Total Return:      {oos_metrics['total_return']:>8.2%}")
    print(f"📊 OOS Annualized Return: {oos_metrics['annualized_return']:>8.2%}")
    print(f"📉 OOS Volatility:        {oos_metrics['volatility']:>8.2%}")
    print(f"⚡ OOS Sharpe Ratio:      {oos_metrics['sharpe_ratio']:>8.2f}")
    print(f"🔻 OOS Maximum Drawdown:  {oos_metrics['max_drawdown']:>8.2%}")
    print(f"🎯 OOS Win Rate:          {oos_metrics['win_rate']:>8.2%}")
    print(f"📋 OOS Observations:      {oos_metrics['num_observations']:>8d}")
    
else:
    print("❌ Walk-forward analysis failed")

In [None]:
# Compare in-sample vs out-of-sample performance
if results and oos_results:
    print("📊 In-Sample vs Out-of-Sample Comparison")
    print("=" * 50)
    
    # Create comparison table
    comparison_data = {
        'Metric': ['Total Return', 'Annualized Return', 'Volatility', 'Sharpe Ratio', 'Max Drawdown', 'Win Rate'],
        'In-Sample': [
            results['metrics']['total_return'],
            results['metrics']['annualized_return'],
            results['metrics']['volatility'],
            results['metrics']['sharpe_ratio'],
            results['metrics']['max_drawdown'],
            results['metrics']['win_rate']
        ],
        'Out-of-Sample': [
            oos_results['oos_metrics']['total_return'],
            oos_results['oos_metrics']['annualized_return'],
            oos_results['oos_metrics']['volatility'],
            oos_results['oos_metrics']['sharpe_ratio'],
            oos_results['oos_metrics']['max_drawdown'],
            oos_results['oos_metrics']['win_rate']
        ]
    }
    
    comparison_df = pd.DataFrame(comparison_data)
    
    # Calculate degradation
    comparison_df['Degradation (%)'] = (
        (comparison_df['Out-of-Sample'] - comparison_df['In-Sample']) / 
        comparison_df['In-Sample'].abs() * 100
    )
    
    display(comparison_df)
    
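    # Visualization comparison (grouped bars for the key metrics)
    fig, ax = plt.subplots(figsize=(10, 6))
    
    metrics_to_plot = ['Annualized Return', 'Volatility', 'Sharpe Ratio']
    plot_df = comparison_df.set_index('Metric').loc[metrics_to_plot]
    x = np.arange(len(metrics_to_plot))
    width = 0.35
    
    ax.bar(x - width/2, plot_df['In-Sample'], width, label='In-Sample', alpha=0.8)
    ax.bar(x + width/2, plot_df['Out-of-Sample'], width, label='Out-of-Sample', alpha=0.8)
    ax.set_xticks(x)
    ax.set_xticklabels(metrics_to_plot)
    ax.set_title('📊 In-Sample vs Out-of-Sample Metrics')
    ax.legend()
    ax.grid(True, alpha=0.3)
    
    plt.tight_layout()
    plt.show()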


## 6. Performance Analysis {#performance}

Deep dive into risk metrics, attribution analysis, and performance characteristics.
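
For reference, the historical risk measures computed in this section can be written as follows. This assumes, as the code does, that VaR is quoted as a return quantile (so it is typically negative) and that $r_t$ denotes the daily net return:

$$
\begin{aligned}
\mathrm{VaR}_{95\%} &= Q_{0.05}(r_t), \\
\mathrm{CVaR}_{95\%} &= \mathbb{E}\left[\, r_t \mid r_t \le \mathrm{VaR}_{95\%} \,\right], \\
R_{\text{month}} &= \prod_{t \in \text{month}} (1 + r_t) - 1,
\end{aligned}
$$

where $Q_{0.05}$ is the empirical 5th percentile of the return sample. The 99% VaR and the yearly return follow the same pattern with the 1st percentile and a yearly product.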

In [None]:
# Advanced risk analysis
if results:
    returns = results['returns']

    print("⚠️ Risk Analysis Dashboard")
    print("=" * 30)

    # Value at Risk (VaR) analysis
    var_95 = np.percentile(returns, 5)
    var_99 = np.percentile(returns, 1)
    cvar_95 = returns[returns <= var_95].mean()

    print(f"📊 Value at Risk:")
    print(f"   Daily VaR (95%): {var_95:.3%}")
    print(f"   Daily VaR (99%): {var_99:.3%}")
    print(f"   Conditional VaR (95%): {cvar_95:.3%}")

    # Tail risk analysis
    extreme_losses = returns[returns < np.percentile(returns, 5)]
    extreme_gains = returns[returns > np.percentile(returns, 95)]

    print(f"\n📈 Tail Risk Analysis:")
    print(f"   Extreme loss days: {len(extreme_losses)}")
    print(f"   Average extreme loss: {extreme_losses.mean():.3%}")
    print(f"   Extreme gain days: {len(extreme_gains)}")
    print(f"   Average extreme gain: {extreme_gains.mean():.3%}")
    print(f"   Gain/Loss ratio: {extreme_gains.mean() / abs(extreme_losses.mean()):.2f}")

    # Monthly and yearly performance breakdown
    monthly_returns = returns.resample('M').apply(lambda x: (1 + x).prod() - 1)
    yearly_returns = returns.resample('Y').apply(lambda x: (1 + x).prod() - 1)

    print(f"\n📅 Period Performance:")
    print(f"   Best month: {monthly_returns.max():.2%} ({monthly_returns.idxmax().strftime('%Y-%m')})")
    print(f"   Worst month: {monthly_returns.min():.2%} ({monthly_returns.idxmin().strftime('%Y-%m')})")
    print(f"   Positive months: {(monthly_returns > 0).sum()}/{len(monthly_returns)} ({(monthly_returns > 0).mean():.1%})")

    print(f"\n📊 Yearly Performance Breakdown:")
    for year, ret in yearly_returns.items():
        print(f"   {year.year}: {ret:>8.2%}")

In [None]:
# Advanced factor importance analysis
if results and 'factor_data' in results:
    factor_data = results['factor_data']
    
    print("📊 Factor Importance Analysis")
    print("=" * 35)
    
    # Calculate factor importance using correlation with future returns
    factor_cols = ['momentum_rank', 'value_rank', 'quality_rank', 'size_rank', 'low_vol_rank']
    
    # Calculate next period returns
    factor_data['next_return'] = factor_data.groupby('ticker')['returns'].shift(-1)
    
    # Calculate correlations and information coefficients
    factor_importance = {}
    for factor in factor_cols:
        if factor in factor_data.columns:
            # Pearson correlation
            corr = factor_data[factor].corr(factor_data['next_return'])
            # Spearman rank correlation (more robust)
            rank_corr = factor_data[factor].corr(factor_data['next_return'], method='spearman')
            
            factor_name = factor.replace('_rank', '')
            factor_importance[factor_name] = {
                'Pearson_Corr': abs(corr),
                'Spearman_Corr': abs(rank_corr),
                'Average': (abs(corr) + abs(rank_corr)) / 2
            }
    
    # Create importance DataFrame
    importance_df = pd.DataFrame(factor_importance).T
    importance_df = importance_df.sort_values('Average', ascending=False)
    
    print("🔬 Factor Predictive Power (Correlation with Future Returns):")
    display(importance_df)
    
    # Visualization
    fig, axes = plt.subplots(1, 2, figsize=(15, 6))
    
    # Factor importance bar chart
    importance_df['Average'].plot(kind='bar', ax=axes[0], alpha=0.7, color='steelblue')
    axes[0].set_title('📊 Factor Importance (Predictive Power)')
    axes[0].set_xlabel('Factor')
    axes[0].set_ylabel('Average Correlation')
    axes[0].tick_params(axis='x', rotation=45)
    axes[0].grid(True, alpha=0.3)
    
    # Correlation comparison
    x = np.arange(len(importance_df))
    width = 0.35
    
    axes[1].bar(x - width/2, importance_df['Pearson_Corr'], width, 
               label='Pearson', alpha=0.8, color='lightblue')
    axes[1].bar(x + width/2, importance_df['Spearman_Corr'], width, 
               label='Spearman', alpha=0.8, color='orange')
    
    axes[1].set_xlabel('Factor')
    axes[1].set_ylabel('Correlation (Absolute)')
    axes[1].set_title('📈 Correlation Methods Comparison')
    axes[1].set_xticks(x)
    axes[1].set_xticklabels(importance_df.index, rotation=45)
    axes[1].legend()
    axes[1].grid(True, alpha=0.3)
    
    plt.tight_layout()
    plt.show()
    
    # Factor insights
    best_factor = importance_df.index[0]
    worst_factor = importance_df.index[-1]
    
    print(f"\n🏆 Key Insights:")
    print(f"   🥇 Most predictive factor: {best_factor.title()}")
    print(f"   📊 Predictive power: {importance_df.loc[best_factor, 'Average']:.4f}")
    print(f"   🥉 Least predictive factor: {worst_factor.title()}")
    print(f"   📉 Predictive power: {importance_df.loc[worst_factor, 'Average']:.4f}")
    print(f"   📈 Factor spread: {importance_df['Average'].max() - importance_df['Average'].min():.4f}")
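
The cell above pools all dates into one correlation and takes its absolute value. A complementary view, sketched here under assumptions, is the per-date information coefficient (IC): the Spearman rank correlation between a factor and the next-period return, computed separately for each date so the sign and its stability over time stay visible. The helper `information_coefficient` is hypothetical; the column names `Date`, `momentum_rank`, and `next_return` follow the ones used in this notebook.

```python
import numpy as np
import pandas as pd

def information_coefficient(panel: pd.DataFrame, factor: str,
                            fwd_ret: str = "next_return") -> pd.Series:
    """Per-date Spearman rank correlation between a factor and the forward return."""
    def _rank_corr(group: pd.DataFrame) -> float:
        clean = group[[factor, fwd_ret]].dropna()
        if len(clean) < 3:
            return np.nan  # Too few names for a meaningful correlation
        # Spearman correlation equals the Pearson correlation of the ranks
        return clean[factor].rank().corr(clean[fwd_ret].rank())
    return panel.groupby("Date")[[factor, fwd_ret]].apply(_rank_corr)

# Toy panel: the factor predicts returns perfectly on the first date
# and is perfectly wrong on the second
demo_panel = pd.DataFrame({
    "Date": pd.to_datetime(["2023-01-31"] * 4 + ["2023-02-28"] * 4),
    "ticker": ["A", "B", "C", "D"] * 2,
    "momentum_rank": [0.1, 0.4, 0.7, 1.0, 0.1, 0.4, 0.7, 1.0],
    "next_return": [-0.02, 0.00, 0.01, 0.03, 0.03, 0.01, 0.00, -0.02],
})
ic = information_coefficient(demo_panel, "momentum_rank")
print(ic)
```

In this toy panel the mean IC is zero even though each date shows a perfect (positive or negative) relationship, which is the kind of sign instability an absolute pooled correlation would hide.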

## 7. Advanced Techniques {#advanced}

Explore regularization methods and sensitivity analysis.
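
The notebook does not show how AlphaForge implements `use_shrinkage` or `use_lasso`, so the following is only a generic sketch of the two ideas. Linear shrinkage blends noisy estimates with a prior, and soft-thresholding is the closed-form lasso solution when the regressors are orthonormal. The function names, the choice of prior, and the sample numbers are all assumptions for illustration.

```python
import numpy as np

def shrink_toward_prior(estimates, prior_mean=0.0, intensity=0.5):
    """Linear shrinkage: blend noisy estimates with a prior mean.

    intensity=0 keeps the raw estimates; intensity=1 returns the prior.
    """
    estimates = np.asarray(estimates, dtype=float)
    return (1.0 - intensity) * estimates + intensity * prior_mean

def soft_threshold(coefs, penalty):
    """Lasso-style soft-thresholding: pull values toward zero and zero out small ones."""
    coefs = np.asarray(coefs, dtype=float)
    return np.sign(coefs) * np.maximum(np.abs(coefs) - penalty, 0.0)

# Made-up factor premia estimates, one per factor
raw_premia = np.array([0.08, -0.01, 0.03, 0.002, -0.05])
shrunk = shrink_toward_prior(raw_premia, prior_mean=raw_premia.mean(), intensity=0.5)
sparse = soft_threshold(raw_premia, penalty=0.02)
print(shrunk)
print(sparse)
```

Shrinkage keeps every factor but compresses the spread between them, while soft-thresholding drops the weakest factors entirely, which is why the two can behave differently in the comparison that follows.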

In [None]:
# Compare regularization techniques
print("🔬 Advanced Regularization Comparison")
print("=" * 45)

# Test different regularization configurations
configs = [
    {'name': 'No Regularization', 'shrinkage': False, 'lasso': False},
    {'name': 'Bayesian Shrinkage Only', 'shrinkage': True, 'lasso': False},
    {'name': 'Lasso Only', 'shrinkage': False, 'lasso': True},
    {'name': 'Both Techniques', 'shrinkage': True, 'lasso': True}
]

comparison_results = []

for config_dict in configs:
    print(f"🔄 Testing: {config_dict['name']}...")
    
    result = backtester.run_backtest(
        tickers=tickers[:50],  # Smaller universe for speed
        use_shrinkage=config_dict['shrinkage'],
        use_lasso=config_dict['lasso']
    )
    
    if result:
        comparison_results.append({
            'Strategy': config_dict['name'],
            'Total Return': result['metrics']['total_return'],
            'Annualized Return': result['metrics']['annualized_return'],
            'Volatility': result['metrics']['volatility'],
            'Sharpe Ratio': result['metrics']['sharpe_ratio'],
            'Max Drawdown': result['metrics']['max_drawdown'],
            'Win Rate': result['metrics']['win_rate']
        })

# Display comparison results
if comparison_results:
    comparison_df = pd.DataFrame(comparison_results)
    print("\n📊 Regularization Comparison Results:")
    display(comparison_df)
    
    # Find best performing strategy
    best_strategy = comparison_df.loc[comparison_df['Sharpe Ratio'].idxmax()]
    print(f"\n🏆 Best Strategy: {best_strategy['Strategy']}")
    print(f"   ⚡ Sharpe Ratio: {best_strategy['Sharpe Ratio']:.3f}")
    print(f"   📈 Annualized Return: {best_strategy['Annualized Return']:.2%}")

In [None]:
# Final summary and conclusions
print("\n🎯 AlphaForge Analysis Summary")
print("=" * 40)

if results:
    final_sharpe = results['metrics']['sharpe_ratio']
    final_return = results['metrics']['annualized_return']
    final_vol = results['metrics']['volatility']
    final_dd = results['metrics']['max_drawdown']
    
    print(f"📊 Final Strategy Performance:")
    print(f"   📈 Annualized Return: {final_return:.2%}")
    print(f"   📉 Volatility: {final_vol:.2%}")
    print(f"   ⚡ Sharpe Ratio: {final_sharpe:.3f}")
    print(f"   🔻 Max Drawdown: {final_dd:.2%}")
    
    # Strategy assessment
    if final_sharpe > 1.0:
        assessment = "🏆 Excellent strategy with strong risk-adjusted returns"
    elif final_sharpe > 0.5:
        assessment = "✅ Good strategy with solid performance"
    elif final_sharpe > 0.0:
        assessment = "⚠️ Moderate strategy with room for improvement"
    else:
        assessment = "❌ Poor strategy requiring significant revision"
    
    print(f"\n🎯 Assessment: {assessment}")
    
    if oos_results:
        oos_sharpe = oos_results['oos_metrics']['sharpe_ratio']
        degradation = (oos_sharpe - final_sharpe) / abs(final_sharpe) * 100
        print(f"\n🔬 Out-of-Sample Validation:")
        print(f"   📊 OOS Sharpe Ratio: {oos_sharpe:.3f}")
        print(f"   📉 Performance Degradation: {degradation:.1f}%")
        
        if degradation > -20:
            validation = "✅ Strategy passes out-of-sample validation"
        else:
            validation = "⚠️ Strategy shows significant overfitting"
        print(f"   🎯 Validation: {validation}")

print(f"\n🚀 Next Steps:")
print(f"   📈 Experiment with custom factors using examples/custom_factors.py")
print(f"   🔬 Explore ML techniques for enhanced alpha generation")
print(f"   📊 Test on different universes and time periods")
print(f"   🎯 Implement risk management overlays")
print(f"   💰 Consider transaction cost optimization")

print(f"\n✨ AlphaForge analysis complete! Ready to forge alpha! 🔥")

## Summary and Conclusions

This notebook demonstrated the core workflow of **AlphaForge**, a systematic alpha research platform:

### 🎯 Key Features Demonstrated:
1. **🔗 Data Integration**: Seamless multi-source data fetching with intelligent caching
2. **⚙️ Factor Engineering**: Classic factors (momentum, value, quality, size, low-vol) with robust calculation
3. **🎪 Portfolio Construction**: Long-short strategies with position limits and risk controls
4. **🧠 Advanced Techniques**: Bayesian shrinkage and Lasso regularization for overfitting prevention
5. **🔬 Rigorous Testing**: Walk-forward analysis for unbiased out-of-sample validation
6. **📊 Comprehensive Analytics**: Risk metrics, performance attribution, and sensitivity analysis

### 🚀 Next Steps for Research:
- **Custom Factors**: Develop proprietary signals using the extensible framework
- **ML Integration**: Implement ensemble methods and deep learning models
- **Alternative Data**: Incorporate sentiment, satellite, and patent data
- **Multi-Asset**: Extend to fixed income, commodities, and currencies
- **Regime Models**: Add market regime detection and adaptive strategies

### 💼 Framework Benefits:
- **🏗️ Modular Design**: Easy to extend and customize for specific research needs
- **⚡ Performance**: Parallel processing and caching for institutional-scale research
- **🛡️ Robust**: Accounts for transaction costs and survivorship bias, with statistical validation
- **📈 Research-Ready**: Publication-quality analytics and visualizations

### 🔥 Production Applications:
- **Hedge Funds**: Systematic strategy development and risk management
- **Asset Managers**: Portfolio optimization and performance attribution
- **Risk Teams**: Factor exposure monitoring and stress testing
- **Academic Research**: Empirical asset pricing and factor model validation

**AlphaForge** provides a professional-grade foundation for systematic alpha research that scales from academic studies to institutional trading strategies.

---

🚀 **Ready to forge alpha?** Explore the examples directory for advanced use cases and custom factor development patterns.
