# Factor-Adjusted Portfolio Returns Analysis

This notebook demonstrates how to compute risk-adjusted alphas using Fama-French factor models.

## Key Features:
- Fetch Fama-French 5-Factor + Momentum data
- Estimate factor loadings (betas) for portfolios
- Calculate Jensen's alpha and Carhart 4-factor alpha
- Decompose returns into factor premiums
- Compute long-short alpha for the opacity trading strategy

In [None]:
import sys
sys.path.insert(0, "..")

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

from src.utils.factor_data import fetch_all_factors, save_factors_to_dvc
from src.analysis.factor_models.fama_french import estimate_factor_loadings
from src.analysis.factor_models.alpha_decomposition import (
    jensen_alpha,
    carhart_alpha,
    alpha_attribution,
    summarize_decile_alphas,
)
from src.analysis.decile_backtest import run_factor_adjusted_backtest

sns.set_style("whitegrid")
plt.rcParams["figure.figsize"] = (12, 6)

print("✓ Imports successful")

## 1. Fetch Fama-French Factor Data

Download daily FF5 + Momentum factors from the Ken French Data Library.

In [None]:
# Fetch factors (cached for performance)
factors = fetch_all_factors(
    start_date="2020-01-01",
    end_date="2024-12-31",
    disable_cache=False
)

print(f"Fetched {len(factors)} days of factor data")
print(f"Columns: {factors.columns.tolist()}")

# Save to DVC for versioning
save_factors_to_dvc(factors, "../data/factors/ff5_momentum_daily.csv")

factors.head()

## 2. Factor Summary Statistics

Analyze factor returns and correlations.
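
The annualization used in the next cell can be sketched on its own. This is a minimal illustration on synthetic data, assuming daily returns in decimal units and 252 trading days per year; `annualize`, `demo`, and the column names are hypothetical and not part of `src`.

```python
import numpy as np
import pandas as pd

TRADING_DAYS = 252  # assumed number of trading days per year


def annualize(daily: pd.DataFrame) -> pd.DataFrame:
    """Annualize daily decimal returns: mean scales with T, volatility with sqrt(T)."""
    mean_annual = daily.mean() * TRADING_DAYS
    std_annual = daily.std() * np.sqrt(TRADING_DAYS)
    return pd.DataFrame({
        "mean_annual": mean_annual,
        "std_annual": std_annual,
        "sharpe": mean_annual / std_annual,
    })


# Synthetic stand-in for two factor series (not real Ken French data)
rng = np.random.default_rng(0)
demo = pd.DataFrame(
    rng.normal(0.0004, 0.01, size=(1000, 2)), columns=["factor_a", "factor_b"]
)
demo_stats = annualize(demo)
print(demo_stats.round(3))
```

Because the factors are already excess or long-short returns, the Sharpe ratio here needs no further risk-free adjustment.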

In [None]:
# Annualized statistics
factor_cols = ['Mkt-RF', 'SMB', 'HML', 'RMW', 'CMA', 'MOM']
daily_mean = factors[factor_cols].mean()
daily_std = factors[factor_cols].std()

factor_stats = pd.DataFrame({
    'Mean (Annual %)': daily_mean * 252 * 100,
    'Std (Annual %)': daily_std * np.sqrt(252) * 100,
    'Sharpe': (daily_mean / daily_std) * np.sqrt(252),
})

print("\nFama-French Factor Statistics (Annualized)")
print("=" * 60)
print(factor_stats.round(2))

In [None]:
# Correlation matrix
plt.figure(figsize=(10, 8))
corr = factors[['Mkt-RF', 'SMB', 'HML', 'RMW', 'CMA', 'MOM']].corr()
sns.heatmap(corr, annot=True, fmt=".2f", cmap="coolwarm", center=0, 
            square=True, linewidths=1)
plt.title("Fama-French Factor Correlations")
plt.tight_layout()
plt.show()

## 3. Example: Estimating Factor Loadings

Demonstrate beta estimation with simulated portfolio returns.
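
For intuition, the regression behind a loading estimate can be sketched with plain NumPy. This is an illustrative least-squares fit on synthetic factors, not the implementation of `estimate_factor_loadings`; `ols_loadings` and all variable names below are hypothetical.

```python
import numpy as np


def ols_loadings(excess_returns: np.ndarray, factor_matrix: np.ndarray) -> dict:
    """Regress excess returns on factors (with an intercept) via least squares."""
    n_obs = factor_matrix.shape[0]
    design = np.column_stack([np.ones(n_obs), factor_matrix])  # intercept column first
    coefs, *_ = np.linalg.lstsq(design, excess_returns, rcond=None)
    fitted = design @ coefs
    ss_res = np.sum((excess_returns - fitted) ** 2)
    ss_tot = np.sum((excess_returns - excess_returns.mean()) ** 2)
    return {"alpha": coefs[0], "betas": coefs[1:], "r_squared": 1.0 - ss_res / ss_tot}


# Synthetic factors and a portfolio with known exposures (assumed values)
rng = np.random.default_rng(42)
synthetic_factors = rng.normal(0.0, 0.01, size=(2000, 3))
true_betas = np.array([1.2, 0.3, -0.2])
synthetic_excess = (
    0.0005 + synthetic_factors @ true_betas + rng.normal(0.0, 0.01, 2000)
)

fit = ols_loadings(synthetic_excess, synthetic_factors)
print(fit["alpha"], fit["betas"], fit["r_squared"])
```

The estimated betas should land close to the values used to build the series, which is the same sanity check the simulated portfolio in the next cell provides.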

In [None]:
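# Simulate a portfolio with known factor exposures
np.random.seed(42)  # fixed seed so the noise term is reproducible
simulated_returns = (
    0.0005  # daily alpha; 0.0005 * 252 = 12.6% annualized
    + 1.2 * factors['Mkt-RF']
    + 0.3 * factors['SMB']
    - 0.2 * factors['HML']
    + np.random.randn(len(factors)) * 0.01  # idiosyncratic noise
    + factors['RF']  # add the risk-free rate back to get total returns
)
# True RMW and CMA loadings are zero, so their estimated betas should be near zero.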

simulated_returns = pd.Series(simulated_returns, index=factors.index)

# Estimate factor loadings
betas = estimate_factor_loadings(simulated_returns, factors, model="FF5")

print("\nEstimated Factor Loadings")
print("=" * 60)
print(f"Alpha (daily):    {betas['alpha']:.6f}")
print(f"Alpha (annual):   {betas['alpha'] * 252:.4f} ({betas['alpha'] * 252 * 100:.2f}%)")
print(f"t-stat (alpha):   {betas['t_alpha']:.2f}")
print(f"\nBeta (Market):    {betas['beta_mkt']:.3f}")
print(f"Beta (SMB):       {betas['beta_smb']:.3f}")
print(f"Beta (HML):       {betas['beta_hml']:.3f}")
print(f"Beta (RMW):       {betas['beta_rmw']:.3f}")
print(f"Beta (CMA):       {betas['beta_cma']:.3f}")
print(f"\nR-squared:        {betas['r_squared']:.4f}")
print(f"Observations:     {betas['n_obs']}")

## 4. Alpha Calculation

Compute Jensen's alpha and Carhart 4-factor alpha.
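
For reference, the five-factor regression whose intercept is reported as alpha can be written in its standard textbook form. This is not taken from the `src` modules, so the exact factor set used by `jensen_alpha` and `carhart_alpha` should be checked against their implementation.

$$
R_{p,t} - R_{f,t} = \alpha + \beta_{MKT}\,(R_{m,t} - R_{f,t}) + \beta_{SMB}\,SMB_t + \beta_{HML}\,HML_t + \beta_{RMW}\,RMW_t + \beta_{CMA}\,CMA_t + \varepsilon_t
$$

A momentum-augmented model adds the term $\beta_{MOM}\,MOM_t$ to the right-hand side. The original Carhart four-factor specification uses only MKT, SMB, HML, and MOM; adding MOM to the five-factor model gives a six-factor regression. With daily data and simple scaling, the annualized figure is

$$
\alpha_{annual} = 252 \cdot \alpha_{daily}
$$

Jensen's alpha was originally defined against the single-factor CAPM; here the same intercept is taken from a multi-factor regression.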

In [None]:
# Jensen's alpha (FF5)
jensen_result = jensen_alpha(simulated_returns, factors, model="FF5")

print("\nJensen's Alpha (FF5 Model)")
print("=" * 60)
print(f"Annual Alpha:     {jensen_result['alpha_annual']:.4f} ({jensen_result['alpha_annual'] * 100:.2f}%)")
print(f"t-statistic:      {jensen_result['t_stat']:.2f}")
print(f"p-value:          {jensen_result['p_value']:.4f}")
print(f"R-squared:        {jensen_result['r_squared']:.4f}")

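# Carhart-style alpha with a momentum factor.
# Note: the original Carhart model is Mkt-RF, SMB, HML + MOM (four factors);
# FF5 + MOM has six. Check carhart_alpha for the factor set it actually uses.
carhart_result = carhart_alpha(simulated_returns, factors)

print("\nCarhart Alpha (with Momentum)")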
print("=" * 60)
print(f"Annual Alpha:     {carhart_result['alpha_annual']:.4f} ({carhart_result['alpha_annual'] * 100:.2f}%)")
print(f"t-statistic:      {carhart_result['t_stat']:.2f}")
print(f"p-value:          {carhart_result['p_value']:.4f}")

## 5. Return Attribution

Decompose returns into factor premiums.
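
The decomposition can be sketched as follows: each period's return is split into the risk-free rate, a constant alpha, one `beta * factor` term per factor, and a residual. This is a minimal illustration on synthetic data, assuming an OLS fit with an intercept; `attribute_returns` and its column names are hypothetical and may differ from what `alpha_attribution` returns.

```python
import numpy as np
import pandas as pd


def attribute_returns(
    returns: pd.Series, factors: pd.DataFrame, rf: pd.Series
) -> pd.DataFrame:
    """Split each period's return into rf, alpha, beta_k * factor_k, and a residual."""
    design = np.column_stack([np.ones(len(factors)), factors.to_numpy()])
    coefs, *_ = np.linalg.lstsq(design, (returns - rf).to_numpy(), rcond=None)
    parts = (factors * coefs[1:]).add_suffix("_premium")  # per-period factor premiums
    parts["alpha"] = coefs[0]
    parts["rf"] = rf
    parts["residual"] = returns - parts.sum(axis=1)  # whatever is left unexplained
    return parts


# Synthetic inputs (assumed values, not real factor data)
rng = np.random.default_rng(1)
demo_factors = pd.DataFrame(
    rng.normal(0.0003, 0.01, size=(500, 2)), columns=["mkt", "smb"]
)
demo_rf = pd.Series(0.0001, index=demo_factors.index)
demo_returns = (
    demo_rf
    + 0.0002
    + 1.1 * demo_factors["mkt"]
    + 0.4 * demo_factors["smb"]
    + rng.normal(0.0, 0.005, 500)
)

breakdown = attribute_returns(demo_returns, demo_factors, demo_rf)
print((breakdown.mean() * 252 * 100).round(2))  # annualized contribution in %
```

With an intercept in the regression, the residual averages to zero by construction, so a near-zero residual bar in an attribution chart of this kind is expected rather than informative.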

In [None]:
# Attribution analysis
attribution = alpha_attribution(simulated_returns, factors, model="FF5")

# Average contribution by factor
avg_contribution = pd.Series({
    'Alpha': attribution['alpha'].mean(),
    'Market': attribution['mkt_premium'].mean(),
    'Size (SMB)': attribution['smb_premium'].mean(),
    'Value (HML)': attribution['hml_premium'].mean(),
    'Profitability (RMW)': attribution['rmw_premium'].mean(),
    'Investment (CMA)': attribution['cma_premium'].mean(),
    'Risk-Free': attribution['rf'].mean(),
    'Residual': attribution['residual'].mean()
}) * 252 * 100  # Annualize

print("\nAverage Annual Contribution by Factor (%)")
print("=" * 60)
print(avg_contribution.round(2))

# Visualize
plt.figure(figsize=(10, 6))
avg_contribution.plot(kind='barh', color='steelblue')
plt.xlabel('Annual Return Contribution (%)')
plt.title('Return Attribution: Factor Decomposition')
plt.axvline(0, color='black', linewidth=0.8)
plt.tight_layout()
plt.show()

## 6. Interpretation Guide

**Key Metrics:**

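- **Alpha**: The regression intercept, i.e. the average return not explained by factor exposures (Jensen's alpha)
  - Positive alpha = outperformance after risk adjustment
  - |t-stat| > 3.0 = significant under the stricter Harvey-Liu-Zhu hurdle, which accounts for multiple testing (the conventional 5% cutoff is about 2.0)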

- **Beta (Market)**: Sensitivity to market returns
  - β = 1.0 → moves with market
  - β > 1.0 → amplifies market moves (higher systematic risk)

- **R-squared**: % of variance explained by factors
  - Higher R² → returns mostly driven by factors
  - Lower R² → more idiosyncratic risk

**For CNOI Research:**
- If the opacity premium survives factor adjustment (alpha > 0, t > 3.0), it is not just compensation for systematic risk
- Compare raw returns vs. alphas: if alpha is much smaller than the raw return, most of the opacity premium is compensation for known factor exposures
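
The t > 3.0 check can be illustrated with a small sketch that computes the alpha t-statistic from an OLS fit. It assumes classical (homoskedastic) standard errors, which is a simplification; daily returns often call for robust errors such as Newey-West. `alpha_t_stat`, `HLZ_THRESHOLD`, and the synthetic series are hypothetical.

```python
import numpy as np

HLZ_THRESHOLD = 3.0  # t-stat hurdle discussed above


def alpha_t_stat(excess_returns: np.ndarray, factor_matrix: np.ndarray) -> tuple:
    """Return (alpha, t-stat of alpha) from OLS with classical standard errors."""
    n_obs, n_factors = factor_matrix.shape
    design = np.column_stack([np.ones(n_obs), factor_matrix])
    coefs, *_ = np.linalg.lstsq(design, excess_returns, rcond=None)
    residuals = excess_returns - design @ coefs
    sigma2 = residuals @ residuals / (n_obs - n_factors - 1)  # residual variance
    cov = sigma2 * np.linalg.inv(design.T @ design)  # coefficient covariance
    return float(coefs[0]), float(coefs[0] / np.sqrt(cov[0, 0]))


# Two synthetic portfolios sharing the same noise: one with alpha, one without
rng = np.random.default_rng(7)
mkt = rng.normal(0.0003, 0.01, size=(1500, 1))
noise = rng.normal(0.0, 0.004, 1500)
with_alpha = 0.001 + 0.9 * mkt[:, 0] + noise
no_alpha = 0.9 * mkt[:, 0] + noise

_, t_with = alpha_t_stat(with_alpha, mkt)
_, t_without = alpha_t_stat(no_alpha, mkt)
print(f"with alpha:    t = {t_with:.2f}, passes = {abs(t_with) > HLZ_THRESHOLD}")
print(f"without alpha: t = {t_without:.2f}, passes = {abs(t_without) > HLZ_THRESHOLD}")
```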

## 7. Example Application: Decile Backtest with Alphas

This section shows how to integrate factor models with decile backtests.

**Note**: Requires real CNOI and return data. See `run_factor_adjusted_backtest()` for full implementation.
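
The decile step can be sketched independently of the project code. The example sorts stocks into ten bins per date by a score and takes the top bin minus the bottom bin, equal-weighted. The data are synthetic, `long_short_returns` and the column names are hypothetical, and the long/short direction is an assumption; the project's convention for CNOI may be the reverse.

```python
import numpy as np
import pandas as pd


def long_short_returns(
    panel: pd.DataFrame, score_col: str, ret_col: str, n_bins: int = 10
) -> pd.Series:
    """Per-date equal-weighted return of the top score bin minus the bottom bin."""
    bins = (
        panel.groupby("date")[score_col]
        .transform(lambda s: pd.qcut(s, n_bins, labels=False))  # 0 = lowest scores
        .rename("bin")
    )
    mean_by_bin = panel.groupby(["date", bins])[ret_col].mean().unstack("bin")
    return mean_by_bin[n_bins - 1] - mean_by_bin[0]


# Synthetic panel: 50 dates x 100 stocks, returns increasing in the score
rng = np.random.default_rng(3)
panel = pd.DataFrame({
    "date": np.repeat(np.arange(50), 100),
    "score": rng.uniform(size=5000),
})
panel["ret"] = 0.02 * panel["score"] + rng.normal(0.0, 0.01, 5000)

ls = long_short_returns(panel, "score", "ret")
print(ls.describe())
```

The resulting long-short series is what would then be regressed on the factors to obtain a long-short alpha and its t-statistic.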

In [None]:
# Example usage (left commented out: cnoi_data and returns_data are not loaded in this notebook)
# from src.analysis.decile_backtest import run_factor_adjusted_backtest
#
# results = run_factor_adjusted_backtest(
#     cnoi_df=cnoi_data,
#     returns_df=returns_data,
#     factors_df=factors,
#     model="FF5_MOM"
# )
#
# print(f"Long-Short Raw Return: {results['raw_returns']['long_short']['mean_ret'].iloc[0]:.4f}")
# print(f"Long-Short Alpha: {results['factor_adjusted']['LS_alpha']:.4f}")
# print(f"Alpha t-stat: {results['factor_adjusted']['LS_alpha_tstat']:.2f}")

print("Refer to full backtest notebooks for real data analysis.")

## Summary

This notebook demonstrated:
1. ✅ Fetching Fama-French factor data
2. ✅ Estimating factor loadings (betas)
3. ✅ Computing Jensen's and Carhart alphas
4. ✅ Decomposing returns into factor premiums

**Next Steps:**
- Run factor-adjusted decile backtests with real CNOI data
- Compare alphas across FF3, FF5, and Carhart models
- Test if opacity premium survives factor adjustment (key research question!)