## Summary: Strategy vs Research Targets

**Implementation Status:**
✓ Complete implementation of market-labeled LLM sentiment strategy  
✓ Cross-sectional market-neutral portfolio construction  
✓ Realistic transaction costs and market impact modeling  
✓ Comprehensive risk management framework  

**Key Takeaways:**
1. **Market-Labeled Approach**: Training BERT on abnormal returns (not human labels) directly aligns model with trading objective
2. **Transaction Costs Critical**: 10-30bps can significantly impact performance
3. **Signal Decay Monitoring**: Rolling Sharpe <0.5 triggers retrain/review
4. **Survivorship Bias**: MANDATORY control using CRSP delisted returns

**Research Citation:**
"Quant Radio: How AI Reads Market Moods to Predict Stock Success"  
Target: 35.56% return, 2.21 Sharpe (equal-weighted)

**Production Requirements:**
- Bloomberg/Reuters/RavenPack/Accern API access for text data
- CRSP for survivorship-bias-free market data
- GPU compute (A100/V100) for LLM inference
- Factor regression data (Fama-French library)

In [None]:
# Simulate with different cost assumptions
cost_scenarios = {
    'Optimistic (10bps)': 0.0010,
    'Moderate (20bps)': 0.0020,
    'Conservative (30bps)': 0.0030
}

cost_sensitivity = []
for scenario, cost_bps in cost_scenarios.items():
    # Recalculate returns with different costs
    adjusted_returns = results_df['gross_return'] - (results_df['turnover'] * cost_bps)
    adj_annual_return = (1 + adjusted_returns.sum()) ** (252 / len(adjusted_returns)) - 1
    adj_sharpe = adjusted_returns.mean() / adjusted_returns.std() * np.sqrt(252)
    cost_sensitivity.append({
        'Scenario': scenario,
        'Annual Return': f"{adj_annual_return:.2%}",
        'Sharpe Ratio': f"{adj_sharpe:.2f}"
    })

cost_df = pd.DataFrame(cost_sensitivity)
print("\nTransaction Cost Sensitivity Analysis")
print("="*80)
print(cost_df.to_string(index=False))
print("\nNote: Higher costs significantly impact performance - execution critical!")

## 6. Transaction Cost Sensitivity

In [None]:
# Rolling 3-month Sharpe ratio
if len(results_df) > 63:
    rolling_sharpe = results_df['net_return'].rolling(window=63).apply(
        lambda x: x.mean() / x.std() * np.sqrt(252) if x.std() > 0 else 0
    )
    
    fig, ax = plt.subplots(figsize=(14, 6))
    ax.plot(results_df['date'], rolling_sharpe, linewidth=2, color='purple')
    ax.axhline(target_sharpe, color='green', linestyle='--', label=f'Target: {target_sharpe:.2f}')
    ax.axhline(0.5, color='red', linestyle='--', label='Alert Threshold: 0.5')
    ax.set_title('Rolling 3-Month Sharpe Ratio (Model Decay Detection)', fontsize=14, fontweight='bold')
    ax.set_xlabel('Date')
    ax.set_ylabel('Sharpe Ratio')
    ax.legend()
    ax.grid(True, alpha=0.3)
    plt.tight_layout()
    plt.show()
    
    # Model decay check
    recent_sharpe = rolling_sharpe.iloc[-21:].mean()
    print(f"Recent 1-month avg Sharpe: {recent_sharpe:.2f}")
    print(f"Alert if < 0.5: {'⚠️ MODEL DECAY ALERT!' if recent_sharpe < 0.5 else '✓ Normal'}")

## 5. Rolling Performance Metrics

In [None]:
# Placeholder factor regression (requires Fama-French data)
print("Factor Regression Analysis")
print("="*80)
print("Objective: Test for significant alpha independent of traditional factors")
print("\nTarget Regression: Strategy ~ Mkt-RF + SMB + HML + RMW + CMA + UMD")
print("\nData Required:")
print("- Fama-French 5 factors (Mkt-RF, SMB, HML, RMW, CMA)")
print("- Momentum factor (UMD)")
print("- Source: Kenneth French Data Library")
print("\nExpected Results:")
print("- Significant positive alpha (p < 0.05)")
print("- Low factor loadings (|β| < 0.3 for market neutrality)")
print("\nNote: Implement with actual Fama-French data for production")

## 4. Factor Regression Analysis (Fama-French)

In [None]:
# Calculate key metrics
returns = results_df['net_return'].values
n_days = len(returns)
n_years = n_days / 252

# Annualized metrics
total_return = (results_df['portfolio_value'].iloc[-1] / config['backtest']['initial_capital']) - 1
annualized_return = (1 + total_return) ** (1 / n_years) - 1
volatility = np.std(returns) * np.sqrt(252)
sharpe = annualized_return / volatility
sortino = annualized_return / (np.std(returns[returns < 0]) * np.sqrt(252))
max_dd = results_df['drawdown'].max()
calmar = annualized_return / max_dd if max_dd > 0 else 0
win_rate = np.sum(returns > 0) / len(returns)
avg_turnover = results_df['turnover'].mean()

# Comparison table
metrics_data = {
    'Metric': ['Annualized Return', 'Sharpe Ratio', 'Sortino Ratio', 'Calmar Ratio', 
               'Max Drawdown', 'Win Rate', 'Ann. Turnover'],
    'Strategy': [f"{annualized_return:.2%}", f"{sharpe:.2f}", f"{sortino:.2f}", f"{calmar:.2f}",
                 f"{max_dd:.2%}", f"{win_rate:.2%}", f"{avg_turnover * 252:.2%}"],
    'Target': [f"{target_return:.2%}", f"{target_sharpe:.2f}", 'N/A', 'N/A', 
               'N/A', 'N/A', 'N/A'],
    'Status': [
        '✓' if annualized_return >= target_return * 0.8 else '✗',
        '✓' if sharpe >= target_sharpe * 0.8 else '✗',
        'N/A', 'N/A', 'N/A', 'N/A', 'N/A'
    ]
}

metrics_df = pd.DataFrame(metrics_data)
print("\n" + "="*80)
print("PERFORMANCE METRICS COMPARISON")
print("="*80)
print(metrics_df.to_string(index=False))
print("\n* Target ±20% tolerance considered passing")

## 3. Performance Metrics vs Targets

In [None]:
# Underwater plot
fig, ax = plt.subplots(figsize=(14, 6))
ax.fill_between(results_df['date'], -results_df['drawdown'] * 100, 0, color='red', alpha=0.3)
ax.plot(results_df['date'], -results_df['drawdown'] * 100, linewidth=2, color='darkred')
ax.set_title('Strategy Drawdown (Underwater Plot)', fontsize=14, fontweight='bold')
ax.set_xlabel('Date')
ax.set_ylabel('Drawdown (%)')
ax.axhline(-15, color='orange', linestyle='--', label='15% Stop-Loss Threshold')
ax.grid(True, alpha=0.3)
ax.legend()
plt.tight_layout()
plt.show()

print(f"Maximum Drawdown: {results_df['drawdown'].max() * 100:.2f}%")
print(f"Stop-Loss Threshold: 15%")
print(f"Breached: {'Yes ⚠️' if results_df['drawdown'].max() > 0.15 else 'No ✓'}")

## 2. Drawdown Analysis

In [None]:
# Cumulative returns plot
results_df['cumulative_return'] = (1 + results_df['net_return']).cumprod()

fig, ax = plt.subplots(figsize=(14, 7))
ax.plot(results_df['date'], results_df['cumulative_return'], linewidth=2.5, label='Strategy', color='darkblue')
ax.set_title('Cumulative Returns: Sentiment-LLM Strategy', fontsize=16, fontweight='bold')
ax.set_xlabel('Date', fontsize=12)
ax.set_ylabel('Cumulative Return', fontsize=12)
ax.set_yscale('log')
ax.grid(True, alpha=0.3)
ax.legend(fontsize=12)
plt.tight_layout()
plt.show()

print(f"Final portfolio value: ${results_df['portfolio_value'].iloc[-1]:,.2f}")
print(f"Total return: {(results_df['portfolio_value'].iloc[-1] / config['backtest']['initial_capital'] - 1) * 100:.2f}%")

## 1. Cumulative Returns Visualization

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from scipy import stats
from sklearn.linear_model import LinearRegression
import yaml
import warnings
warnings.filterwarnings('ignore')

sns.set_style('whitegrid')
plt.rcParams['figure.figsize'] = (14, 6)

# Load config
with open('config.yaml', 'r') as f:
    config = yaml.safe_load(f)

# Load backtest results
results_df = pd.read_csv('data/results/daily_returns.csv')
results_df['date'] = pd.to_datetime(results_df['date'])
print(f"Backtest results loaded: {len(results_df)} days")

# Extract benchmarks
target_return = config['evaluation']['benchmarks']['annualized_return']
target_sharpe = config['evaluation']['benchmarks']['sharpe_ratio']
print(f"\nTargets: {target_return:.2%} return, {target_sharpe:.2f} Sharpe")

# Backtest Evaluation: Market-Labeled LLM Strategy

This notebook evaluates the full backtest performance against research targets.

**Research Targets (from Quant Radio paper):**
- Annualized Return: **35.56%**
- Sharpe Ratio: **2.21**
- Strategy: Equal-weighted, cross-sectional market-neutral

**Evaluation Framework:**
1. Performance metrics vs benchmarks
2. Factor regression (alpha significance)
3. Drawdown analysis
4. Capacity and cost sensitivity
5. Failure modes validation