# Empirical Asset Pricing — Portfolio Construction & CAPM Evidence

**Goal:** Build and evaluate equal-weighted and value-weighted portfolios from historical stock returns, then test whether the **CAPM** and its multi-factor extensions (Fama-French 5 factors, plus momentum) explain their excess returns.

## What this notebook delivers
- Clean return + excess return computation
- Portfolio construction:
  - Equally-weighted portfolio
  - Value-weighted portfolio (using market cap weights)
- Risk-adjusted performance:
  - Annualised mean, volatility, Sharpe ratio
- CAPM evidence:
  - Estimate betas (and alpha) via OLS
  - Interpret statistical significance + economic meaning
- Multi-factor extensions:
  - Fama-French 5-factor and momentum-augmented (FF6) regressions

In [None]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [None]:
# --- Setup & Imports (reproducible + clean) ---

import warnings
import zipfile  # needed to read the Fama-French archives below
from pathlib import Path

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

import statsmodels.api as sm

warnings.filterwarnings("ignore")
pd.set_option("display.max_columns", 200)
pd.set_option("display.width", 120)

# Project paths
PROJECT_ROOT = Path("/content/drive/MyDrive/Empirical_Asset_Pricing_Project")
DATA_DIR = PROJECT_ROOT / "data"
OUTPUTS_DIR = PROJECT_ROOT / "outputs"
FIG_DIR = OUTPUTS_DIR / "figures"

OUTPUTS_DIR.mkdir(exist_ok=True, parents=True)
FIG_DIR.mkdir(exist_ok=True, parents=True)

print("Working directory:", PROJECT_ROOT)
print("Data directory:", DATA_DIR)
print("Outputs directory:", OUTPUTS_DIR)

Working directory: /content/drive/MyDrive/Empirical_Asset_Pricing_Project
Data directory: /content/drive/MyDrive/Empirical_Asset_Pricing_Project/data
Outputs directory: /content/drive/MyDrive/Empirical_Asset_Pricing_Project/outputs


## Notebook roadmap
1. Data description & preparation (CRSP, Fama-French 5 factors, momentum)  
2. Align CRSP stock data with factor data; compute excess returns  
3. Portfolio construction (equal-weighted, value-weighted)  
4. Portfolio performance evaluation (annualised mean/vol/Sharpe)  
5. CAPM regression (alpha, beta) + interpretation  
6. Fama-French 5-factor model  
7. FF6 model (adding momentum) and saving outputs  

# 1. Data Description & Preparation

This project combines:

### 1️⃣ CRSP Monthly Stock Data
- Individual stock returns
- Market capitalization proxy (price × shares outstanding)
- Monthly frequency

### 2️⃣ Fama-French 5 Factors (2×3)
- Mkt-RF, SMB, HML, RMW, CMA
- Risk-free rate (RF)

### 3️⃣ Momentum Factor
- Momentum factor (`Mom`) from the Fama-French data library, commonly used as the Carhart momentum factor

⚠️ Important:
- CRSP returns are in decimal format.
- Fama-French factors are in percentage format and must be divided by 100.
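- All datasets are monthly, but CRSP rows are stamped with the last trading day of the month while the factor dates parse to the first of the month, so they must be aligned on a year-month key rather than on the raw date.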
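
The alignment and unit caveats above can be illustrated on toy data. This is a minimal sketch, not the notebook's loader: the frames `stocks`, `factors_raw`, and `factors_toy` are hypothetical, and their values reuse a few sample rows shown elsewhere in this notebook.

```python
import pandas as pd

# Toy stock returns stamped with month-end trading days, already in decimals
stocks = pd.DataFrame({
    "date": pd.to_datetime(["2015-01-30", "2015-02-27"]),
    "ret": [-0.097913, 0.031288],
})

# Toy factor returns keyed by YYYYMM strings and quoted in percent
factors_raw = pd.DataFrame({
    "date": ["201501", "201502"],
    "Mkt-RF": [-3.09, 6.14],
})

factors_toy = factors_raw.copy()
factors_toy["date"] = pd.to_datetime(factors_toy["date"], format="%Y%m")
factors_toy["Mkt-RF"] = factors_toy["Mkt-RF"] / 100  # percent -> decimal

# The raw timestamps never coincide (month-end vs first of month) ...
exact_matches = stocks.merge(factors_toy, on="date", how="inner")

# ... so both sides are mapped to a monthly period before merging
stocks["year_month"] = stocks["date"].dt.to_period("M")
factors_toy["year_month"] = factors_toy["date"].dt.to_period("M")
aligned = stocks.merge(
    factors_toy.drop(columns="date"), on="year_month", how="inner"
)

print(len(exact_matches), len(aligned))
```

Merging on the raw dates returns no rows here, while merging on the monthly period keeps both months.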

In [None]:
# --- Load CRSP Monthly Data ---

crsp_path = DATA_DIR / "raw" / "crsp_monthly_subsample.csv"

crsp = pd.read_csv(crsp_path)

# Convert date
crsp["date"] = pd.to_datetime(crsp["date"])

# Sort properly
crsp = crsp.sort_values(["permno", "date"]).reset_index(drop=True)

crsp.head()

Unnamed: 0,permno,date,tsymbol,comnam,prc,ret,vol,shrout
0,10026,2015-01-30,JJSF,J & J SNACK FOODS CORP,98.12,-0.097913,15844.0,18688.0
1,10026,2015-02-27,JJSF,J & J SNACK FOODS CORP,101.19,0.031288,11316.0,18688.0
2,10026,2015-03-31,JJSF,J & J SNACK FOODS CORP,106.7,0.05801,19048.0,18689.0
3,10026,2015-04-30,JJSF,J & J SNACK FOODS CORP,104.33,-0.022212,11603.0,18691.0
4,10026,2015-05-29,JJSF,J & J SNACK FOODS CORP,107.8,0.03326,9415.0,18691.0


## 1.2 Fama-French 5 Factors

The Fama-French dataset provides monthly factor returns in **percentage format**.

We will:
- Load the CSV from the zip file
- Remove metadata rows
- Convert date column to datetime
- Convert factors from percent to decimal
- Keep only relevant columns
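
The row filter matters because the raw file stacks several blocks in one CSV. The sketch below is a toy illustration that assumes a layout like the one described above: the annual row and footer text are made up, and the two monthly rows reuse values shown in this notebook's output. The names `raw`, `toy`, and `monthly` are hypothetical.

```python
import io
import pandas as pd

# Toy text: a header row, two monthly rows, then an annual block and a footer
raw = """,Mkt-RF,RF
196307,-0.39,0.27
196308,5.08,0.25
 Annual Factors: January-December
,Mkt-RF,RF
  1964,12.50,3.54
Copyright notice
"""

toy = pd.read_csv(io.StringIO(raw))
toy = toy.rename(columns={toy.columns[0]: "date"})
toy["date"] = toy["date"].astype(str).str.strip()

# Keep only rows whose key is exactly six digits (YYYYMM)
monthly = toy[toy["date"].str.fullmatch(r"\d{6}")].copy()
monthly["date"] = pd.to_datetime(monthly["date"], format="%Y%m")

# Convert the surviving rows to numeric decimals
for col in ["Mkt-RF", "RF"]:
    monthly[col] = pd.to_numeric(monthly[col], errors="coerce") / 100

print(monthly)
```

Matching on six digits, rather than on string length alone, also rejects any six-character text row that would otherwise make `pd.to_datetime` fail.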

In [None]:
# --- Load Fama-French 5 Factors ---

ff5_zip_path = DATA_DIR / "raw" / "F-F_Research_Data_5_Factors_2x3_CSV.zip"

with zipfile.ZipFile(ff5_zip_path, 'r') as z:
    file_name = z.namelist()[0]
    with z.open(file_name) as f:
        ff5 = pd.read_csv(f, skiprows=3)

# Rename first column properly
ff5.rename(columns={ff5.columns[0]: "date"}, inplace=True)

# Strip any whitespace from the date column
ff5["date"] = ff5["date"].astype(str).str.strip()

# Keep only monthly rows, whose key is exactly six digits (YYYYMM).
# This drops the annual block (four-digit years), repeated headers, and text footers.
ff5 = ff5[ff5["date"].str.fullmatch(r"\d{6}")].copy()

# Convert date to datetime (YYYYMM format)
ff5["date"] = pd.to_datetime(ff5["date"], format="%Y%m")

# Convert factors from percent to decimal
factor_cols = ["Mkt-RF", "SMB", "HML", "RMW", "CMA", "RF"]

# Convert factor columns to numeric, coercing errors to NaN
for col in factor_cols:
    ff5[col] = pd.to_numeric(ff5[col], errors='coerce')

ff5[factor_cols] = ff5[factor_cols] / 100

ff5.head()

Unnamed: 0,date,Mkt-RF,SMB,HML,RMW,CMA,RF
0,1963-07-01,-0.0039,-0.0048,-0.0081,0.0064,-0.0115,0.0027
1,1963-08-01,0.0508,-0.008,0.017,0.004,-0.0038,0.0025
2,1963-09-01,-0.0157,-0.0043,0.0,-0.0078,0.0015,0.0027
3,1963-10-01,0.0254,-0.0134,-0.0004,0.0279,-0.0225,0.0029
4,1963-11-01,-0.0086,-0.0085,0.0173,-0.0043,0.0227,0.0027


## 1.3 Momentum Factor

We now load the Carhart momentum factor.
As with FF5 factors, values are in percentage format and must be converted to decimal.

In [None]:
# --- Load Momentum Factor ---

mom_zip_path = DATA_DIR / "raw" / "F-F_Momentum_Factor_CSV.zip"

with zipfile.ZipFile(mom_zip_path, 'r') as z:
    file_name = z.namelist()[0]
    with z.open(file_name) as f:
        mom = pd.read_csv(f, skiprows=13)

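# Rename first column and strip stray whitespace from the headers (harmless if there is none)
mom.rename(columns={mom.columns[0]: "date"}, inplace=True)
mom.columns = mom.columns.str.strip()

# Clean date column: keep only monthly rows with a six-digit YYYYMM key
mom["date"] = mom["date"].astype(str).str.strip()
mom = mom[mom["date"].str.fullmatch(r"\d{6}")].copy()
mom["date"] = pd.to_datetime(mom["date"], format="%Y%m")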

# Convert to numeric
mom["Mom"] = pd.to_numeric(mom["Mom"], errors="coerce")

# Convert from percent to decimal
mom["Mom"] = mom["Mom"] / 100

mom.head()

Unnamed: 0,date,Mom
0,1927-01-01,0.0057
1,1927-02-01,-0.015
2,1927-03-01,0.0352
3,1927-04-01,0.0436
4,1927-05-01,0.0278


## 1.4 Combine All Factors

We merge:
- Fama-French 5 factors
- Momentum factor

This creates a unified monthly factor dataset that will be merged with CRSP stock returns.

In [None]:
# --- Merge FF5 and Momentum ---

factors = ff5.merge(mom, on="date", how="inner")

# Sort by date
factors = factors.sort_values("date").reset_index(drop=True)

factors.head()

Unnamed: 0,date,Mkt-RF,SMB,HML,RMW,CMA,RF,Mom
0,1963-07-01,-0.0039,-0.0048,-0.0081,0.0064,-0.0115,0.0027,0.0101
1,1963-08-01,0.0508,-0.008,0.017,0.004,-0.0038,0.0025,0.01
2,1963-09-01,-0.0157,-0.0043,0.0,-0.0078,0.0015,0.0027,0.0012
3,1963-10-01,0.0254,-0.0134,-0.0004,0.0279,-0.0225,0.0029,0.0313
4,1963-11-01,-0.0086,-0.0085,0.0173,-0.0043,0.0227,0.0027,-0.0078


# 2. Align CRSP Stock Data With Factor Data

We now:

1. Compute market capitalization  
2. Merge stock-level returns with factor data  
3. Compute stock excess returns  

This step prepares the dataset for portfolio construction and regression analysis.

In [None]:
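# --- Compute Market Capitalization ---
# CRSP reports a negative price when it is a bid/ask average, hence abs().
# shrout is in thousands of shares, so market_cap is in thousands of dollars.

crsp["market_cap"] = crsp["prc"].abs() * crsp["shrout"]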

crsp[["permno", "date", "market_cap"]].head()

Unnamed: 0,permno,date,market_cap
0,10026,2015-01-30,1833666.56
1,10026,2015-02-27,1891038.72
2,10026,2015-03-31,1994116.3
3,10026,2015-04-30,1950032.03
4,10026,2015-05-29,2014889.8


In [None]:
# Convert both to monthly period format

crsp["year_month"] = crsp["date"].dt.to_period("M")
factors["year_month"] = factors["date"].dt.to_period("M")

crsp.head()

Unnamed: 0,permno,date,tsymbol,comnam,prc,ret,vol,shrout,market_cap,year_month
0,10026,2015-01-30,JJSF,J & J SNACK FOODS CORP,98.12,-0.097913,15844.0,18688.0,1833666.56,2015-01
1,10026,2015-02-27,JJSF,J & J SNACK FOODS CORP,101.19,0.031288,11316.0,18688.0,1891038.72,2015-02
2,10026,2015-03-31,JJSF,J & J SNACK FOODS CORP,106.7,0.05801,19048.0,18689.0,1994116.3,2015-03
3,10026,2015-04-30,JJSF,J & J SNACK FOODS CORP,104.33,-0.022212,11603.0,18691.0,1950032.03,2015-04
4,10026,2015-05-29,JJSF,J & J SNACK FOODS CORP,107.8,0.03326,9415.0,18691.0,2014889.8,2015-05


## 2.1 Merge Stock Returns With Factor Data

We merge CRSP stock data with monthly factor returns on the `year_month` key.

This ensures each stock-month observation has corresponding:
- Market factor
- Size, value, profitability, investment factors
- Momentum factor
- Risk-free rate
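
Whether every stock-month really finds its factor row can be checked at merge time. The sketch below is one possible check on made-up data (the frames `stocks` and `factors_toy`, the second `permno`, and its returns are hypothetical). It relies on the `validate` and `indicator` options of `DataFrame.merge`.

```python
import pandas as pd

# Toy stock-month panel: two stocks, with 2015-03 missing from the factor table
stocks = pd.DataFrame({
    "permno": [10026, 10026, 10032, 10032],
    "year_month": pd.PeriodIndex(
        ["2015-01", "2015-03", "2015-01", "2015-03"], freq="M"
    ),
    "ret": [-0.0979, 0.0580, 0.0120, -0.0300],
})
factors_toy = pd.DataFrame({
    "year_month": pd.PeriodIndex(["2015-01", "2015-02"], freq="M"),
    "Mkt-RF": [-0.0309, 0.0614],
})

checked = stocks.merge(
    factors_toy,
    on="year_month",
    how="left",
    validate="many_to_one",  # raises MergeError if a month appears twice in the factor table
    indicator=True,          # adds a _merge column: "both" or "left_only"
)

# Months present in the stock data but absent from the factor data
unmatched = checked.loc[checked["_merge"] == "left_only", "year_month"].unique()
print("Stock-months without factor data:", [str(p) for p in unmatched])
```

An inner merge would silently drop the unmatched months; a left merge with an indicator makes the loss visible before any rows are discarded.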

In [None]:
# --- Merge CRSP with Factors ---

data = crsp.merge(factors, on="year_month", how="inner")

# Clean up duplicate date columns
data = data.rename(columns={"date_x": "date"})
data = data.drop(columns=["date_y"])

data.head()

Unnamed: 0,permno,date,tsymbol,comnam,prc,ret,vol,shrout,market_cap,year_month,Mkt-RF,SMB,HML,RMW,CMA,RF,Mom
0,10026,2015-01-30,JJSF,J & J SNACK FOODS CORP,98.12,-0.097913,15844.0,18688.0,1833666.56,2015-01,-0.0309,-0.0093,-0.0345,0.0158,-0.0164,0.0,0.0374
1,10026,2015-02-27,JJSF,J & J SNACK FOODS CORP,101.19,0.031288,11316.0,18688.0,1891038.72,2015-02,0.0614,0.0036,-0.0179,-0.011,-0.0175,0.0,-0.031
2,10026,2015-03-31,JJSF,J & J SNACK FOODS CORP,106.7,0.05801,19048.0,18689.0,1994116.3,2015-03,-0.0109,0.0308,-0.0038,0.0007,-0.0062,0.0,0.027
3,10026,2015-04-30,JJSF,J & J SNACK FOODS CORP,104.33,-0.022212,11603.0,18691.0,1950032.03,2015-04,0.006,-0.0301,0.018,0.0005,-0.0062,0.0,-0.0727
4,10026,2015-05-29,JJSF,J & J SNACK FOODS CORP,107.8,0.03326,9415.0,18691.0,2014889.8,2015-05,0.0138,0.0082,-0.0111,-0.0176,-0.0083,0.0,0.0568


## 2.2 Compute Excess Stock Returns

We compute stock excess returns as:

$$R^{e}_{i,t} = R_{i,t} - R_{f,t}$$

where:
- $R_{i,t}$ is the return on stock $i$ in month $t$
- $R_{f,t}$ is the risk-free rate in month $t$

In [None]:
# --- Compute Excess Returns ---

data["excess_ret"] = data["ret"] - data["RF"]

data[["ret", "RF", "excess_ret"]].head()

Unnamed: 0,ret,RF,excess_ret
0,-0.097913,0.0,-0.097913
1,0.031288,0.0,0.031288
2,0.05801,0.0,0.05801
3,-0.022212,0.0,-0.022212
4,0.03326,0.0,0.03326


# 3. Portfolio Construction

We construct two benchmark portfolios:

### 1️⃣ Equal-Weighted Portfolio (EW)
Each stock receives equal weight each month.

### 2️⃣ Value-Weighted Portfolio (VW)
Stocks are weighted by market capitalization each month.

These portfolios will be evaluated under:
- CAPM
- Fama-French 5-factor model
- Momentum-augmented model

In [None]:
# --- Equal-Weighted Portfolio ---

ew_portfolio = (
    data
    .groupby("year_month")["ret"]
    .mean()
    .reset_index()
    .rename(columns={"ret": "EW_ret"})
)

ew_portfolio.head()

Unnamed: 0,year_month,EW_ret
0,2015-01,-0.07208
1,2015-02,0.069038
2,2015-03,-0.007718
3,2015-04,-0.021759
4,2015-05,0.026515


## 3.1 Value-Weighted Portfolio

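Weights are computed monthly as:

$$w_{i,t} = \frac{\text{MarketCap}_{i,t}}{\sum_{j} \text{MarketCap}_{j,t}}$$

The value-weighted portfolio reflects how most market indices are constructed.

Note that `market_cap` is measured at the end of month $t$, the same month as the return it weights. Stocks that rose during the month therefore receive larger weights, which tends to bias the value-weighted return upward. The usual convention is to weight by market capitalization at the end of month $t-1$.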
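
Weighting by the previous month's market capitalization, so that the weights are known before the return is realised, can be sketched as follows. This is an illustration on made-up numbers, not the computation used in this notebook. The frame `panel` and the columns `lag_mcap` and `lag_weight` are hypothetical names, and the `shift(1)` assumes each stock has consecutive monthly rows with no gaps.

```python
import pandas as pd

# Toy panel: two stocks over three months (all numbers are made up)
panel = pd.DataFrame({
    "permno": [1, 1, 1, 2, 2, 2],
    "year_month": pd.PeriodIndex(["2015-01", "2015-02", "2015-03"] * 2, freq="M"),
    "ret": [0.10, -0.05, 0.02, 0.00, 0.04, -0.01],
    "market_cap": [110.0, 104.5, 106.59, 200.0, 208.0, 205.92],
})

panel = panel.sort_values(["permno", "year_month"])

# Market cap at the end of the previous month, per stock
panel["lag_mcap"] = panel.groupby("permno")["market_cap"].shift(1)

# Drop stock-months without a lagged cap or a return, then normalise within each month
valid = panel.dropna(subset=["lag_mcap", "ret"]).copy()
valid["lag_weight"] = (
    valid["lag_mcap"] / valid.groupby("year_month")["lag_mcap"].transform("sum")
)

# Value-weighted return per month using the lagged weights
vw_lagged = (
    (valid["lag_weight"] * valid["ret"])
    .groupby(valid["year_month"])
    .sum()
    .rename("VW_ret_lagged")
)
print(vw_lagged)
```

The first month of each stock drops out because it has no lagged capitalization, so the lagged series starts one month later than the contemporaneous one.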

In [None]:
# --- Compute Monthly Weights ---

# Compute total market cap per month
monthly_total_mcap = (
    data
    .groupby("year_month")["market_cap"]
    .sum()
    .reset_index()
    .rename(columns={"market_cap": "total_mcap"})
)

# Merge back to main data
data = data.merge(monthly_total_mcap, on="year_month", how="left")

# Compute weight
data["weight"] = data["market_cap"] / data["total_mcap"]

data[["year_month", "market_cap", "total_mcap", "weight"]].head()

Unnamed: 0,year_month,market_cap,total_mcap,weight
0,2015-01,1833666.56,608516200.0,0.003013
1,2015-02,1891038.72,653937200.0,0.002892
2,2015-03,1994116.3,622219300.0,0.003205
3,2015-04,1950032.03,695011400.0,0.002806
4,2015-05,2014889.8,663318900.0,0.003038


In [None]:
# --- Value-Weighted Portfolio ---

vw_portfolio = (
    data
    .assign(weighted_ret=data["weight"] * data["ret"])
    .groupby("year_month")["weighted_ret"]
    .sum()
    .reset_index()
    .rename(columns={"weighted_ret": "VW_ret"})
)

vw_portfolio.head()

Unnamed: 0,year_month,VW_ret
0,2015-01,-0.102319
1,2015-02,0.077682
2,2015-03,-0.035991
3,2015-04,0.105985
4,2015-05,-0.020701


# 4. Portfolio Performance Evaluation

We compare Equal-Weighted (EW) and Value-Weighted (VW) portfolios using:

- Average return
- Volatility
- Sharpe ratio

We compute excess portfolio returns using the risk-free rate.

In [None]:
# --- Merge EW and VW ---

portfolio = ew_portfolio.merge(vw_portfolio, on="year_month", how="inner")

portfolio.head()

Unnamed: 0,year_month,EW_ret,VW_ret
0,2015-01,-0.07208,-0.102319
1,2015-02,0.069038,0.077682
2,2015-03,-0.007718,-0.035991
3,2015-04,-0.021759,0.105985
4,2015-05,0.026515,-0.020701


In [None]:
# Extract monthly RF (one per month)
rf_monthly = factors[["year_month", "RF"]].drop_duplicates()

portfolio = portfolio.merge(rf_monthly, on="year_month", how="left")

portfolio.head()

Unnamed: 0,year_month,EW_ret,VW_ret,RF
0,2015-01,-0.07208,-0.102319,0.0
1,2015-02,0.069038,0.077682,0.0
2,2015-03,-0.007718,-0.035991,0.0
3,2015-04,-0.021759,0.105985,0.0
4,2015-05,0.026515,-0.020701,0.0


In [None]:
# Compute Portfolio Excess Returns

portfolio["EW_excess"] = portfolio["EW_ret"] - portfolio["RF"]
portfolio["VW_excess"] = portfolio["VW_ret"] - portfolio["RF"]

portfolio.head()

Unnamed: 0,year_month,EW_ret,VW_ret,RF,EW_excess,VW_excess
0,2015-01,-0.07208,-0.102319,0.0,-0.07208,-0.102319
1,2015-02,0.069038,0.077682,0.0,0.069038,0.077682
2,2015-03,-0.007718,-0.035991,0.0,-0.007718,-0.035991
3,2015-04,-0.021759,0.105985,0.0,-0.021759,0.105985
4,2015-05,0.026515,-0.020701,0.0,0.026515,-0.020701


## 4.1 Annualised Performance Metrics

We annualise monthly statistics:

- Annualised Mean = Monthly Mean × 12
- Annualised Volatility = Monthly Std × √12
- Sharpe Ratio = Annualised Excess Return / Annualised Volatility
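
The three formulas above can be wrapped in a small helper. This is a sketch under the same simple scaling conventions (no compounding); the function name `annualise` and the toy series are hypothetical.

```python
import numpy as np
import pandas as pd


def annualise(monthly_excess: pd.Series, periods_per_year: int = 12) -> pd.Series:
    """Annualised mean, volatility and Sharpe ratio from monthly excess returns."""
    mean_ann = monthly_excess.mean() * periods_per_year
    vol_ann = monthly_excess.std(ddof=1) * np.sqrt(periods_per_year)
    return pd.Series({"mean": mean_ann, "vol": vol_ann, "sharpe": mean_ann / vol_ann})


# Hypothetical monthly excess returns, used only to exercise the function
toy = pd.Series([0.02, -0.01, 0.03, 0.00, 0.01, -0.02])
stats = annualise(toy)
print(stats.round(4))
```

Because the mean scales with 12 and the volatility with √12, the annualised Sharpe ratio equals √12 times the monthly mean-to-standard-deviation ratio.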

In [None]:
import numpy as np

# Monthly statistics
monthly_stats = portfolio[["EW_excess", "VW_excess"]].agg(["mean", "std"])

# Annualise
annual_mean = monthly_stats.loc["mean"] * 12
annual_vol = monthly_stats.loc["std"] * np.sqrt(12)

# Sharpe ratio
sharpe_ratio = annual_mean / annual_vol

performance_summary = pd.DataFrame({
    "Annualised Mean Return": annual_mean,
    "Annualised Volatility": annual_vol,
    "Sharpe Ratio": sharpe_ratio
})

performance_summary

Unnamed: 0,Annualised Mean Return,Annualised Volatility,Sharpe Ratio
EW_excess,0.100523,0.188226,0.534056
VW_excess,0.215764,0.197139,1.09448


### Interpretation of Portfolio Performance

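The value-weighted portfolio earns an annualised excess return of 21.6%, against 10.1% for the equal-weighted portfolio, with only slightly higher volatility (19.7% vs 18.8%). Its Sharpe ratio is therefore roughly twice as large (1.09 vs 0.53).

This suggests that larger-cap stocks in the sample earned more excess return per unit of risk over the period. The gap should be read with caution: the value weights use same-month market capitalization, which mechanically favours stocks that rose during the month.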

# 5. CAPM Regression

We test whether portfolio excess returns can be explained by the market factor:

$$R^{e}_{p,t} = \alpha + \beta\,(\text{Mkt} - \text{RF})_t + \varepsilon_t$$

Where:
- α measures abnormal performance
- β measures market exposure
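
To make the roles of α and β concrete, the regression can be run on simulated data where the true parameters are known. The sketch below uses plain NumPy least squares rather than the `statsmodels` call used in this notebook. All inputs are simulated, and the parameter values (`true_alpha`, `true_beta`, the noise levels) are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated monthly data with known parameters (hypothetical, for illustration only)
n = 120
true_alpha, true_beta = 0.002, 0.9
mkt_excess = rng.normal(0.008, 0.045, size=n)
port_excess = true_alpha + true_beta * mkt_excess + rng.normal(0.0, 0.02, size=n)

# Design matrix with a column of ones so the intercept estimates alpha
X = np.column_stack([np.ones(n), mkt_excess])
coef, *_ = np.linalg.lstsq(X, port_excess, rcond=None)
alpha_hat, beta_hat = coef

# Classical (homoskedastic) standard errors: sigma^2 * (X'X)^{-1}
resid = port_excess - X @ coef
sigma2 = resid @ resid / (n - X.shape[1])
se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
t_stats = coef / se

print(f"alpha = {alpha_hat:.4f} (t = {t_stats[0]:.2f})")
print(f"beta  = {beta_hat:.4f} (t = {t_stats[1]:.2f})")
```

The slope is the sample covariance of portfolio and market excess returns divided by the market variance, and the intercept is the average excess return left over once that market exposure is accounted for.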

In [None]:
# Add market factor
portfolio = portfolio.merge(
    factors[["year_month", "Mkt-RF"]],
    on="year_month",
    how="left"
)

portfolio.head()

Unnamed: 0,year_month,EW_ret,VW_ret,RF,EW_excess,VW_excess,Mkt-RF
0,2015-01,-0.07208,-0.102319,0.0,-0.07208,-0.102319,-0.0309
1,2015-02,0.069038,0.077682,0.0,0.069038,0.077682,0.0614
2,2015-03,-0.007718,-0.035991,0.0,-0.007718,-0.035991,-0.0109
3,2015-04,-0.021759,0.105985,0.0,-0.021759,0.105985,0.006
4,2015-05,0.026515,-0.020701,0.0,0.026515,-0.020701,0.0138


In [None]:
import statsmodels.api as sm

# Define dependent and independent variables
Y_ew = portfolio["EW_excess"]
X = portfolio["Mkt-RF"]

# Add constant for alpha
X = sm.add_constant(X)

# Run regression
capm_ew = sm.OLS(Y_ew, X).fit()

capm_ew.summary()

0,1,2,3
Dep. Variable:,EW_excess,R-squared:,0.699
Model:,OLS,Adj. R-squared:,0.696
Method:,Least Squares,F-statistic:,273.5
Date:,"Fri, 20 Feb 2026",Prob (F-statistic):,1.63e-32
Time:,02:23:10,Log-Likelihood:,251.69
No. Observations:,120,AIC:,-499.4
Df Residuals:,118,BIC:,-493.8
Df Model:,1,,
Covariance Type:,nonrobust,,

0,1,2,3,4,5,6
,coef,std err,t,P>|t|,[0.025,0.975]
const,-0.0013,0.003,-0.480,0.632,-0.007,0.004
Mkt-RF,0.9898,0.060,16.537,0.000,0.871,1.108

0,1,2,3
Omnibus:,6.153,Durbin-Watson:,1.786
Prob(Omnibus):,0.046,Jarque-Bera (JB):,10.124
Skew:,0.003,Prob(JB):,0.00633
Kurtosis:,4.423,Cond. No.,21.9


### CAPM Interpretation — Equal-Weighted Portfolio

The estimated beta is 0.99 (95% confidence interval 0.87 to 1.11), indicating that the portfolio moves almost one-for-one with the market.

The alpha of -0.13% per month is not statistically significant (p = 0.63), suggesting no abnormal performance after controlling for market risk.

The R-squared of 0.70 indicates that the market factor explains about 70% of the variation in the portfolio's excess returns; the remaining 30% is residual risk.

In [None]:
# CAPM for Value-Weighted Portfolio

Y_vw = portfolio["VW_excess"]
X = portfolio["Mkt-RF"]
X = sm.add_constant(X)

capm_vw = sm.OLS(Y_vw, X).fit()

capm_vw.summary()

0,1,2,3
Dep. Variable:,VW_excess,R-squared:,0.579
Model:,OLS,Adj. R-squared:,0.576
Method:,Least Squares,F-statistic:,162.6
Date:,"Fri, 20 Feb 2026",Prob (F-statistic):,6.08e-24
Time:,02:23:10,Log-Likelihood:,226.16
No. Observations:,120,AIC:,-448.3
Df Residuals:,118,BIC:,-442.7
Df Model:,1,,
Covariance Type:,nonrobust,,

0,1,2,3,4,5,6
,coef,std err,t,P>|t|,[0.025,0.975]
const,0.0087,0.003,2.517,0.013,0.002,0.016
Mkt-RF,0.9442,0.074,12.752,0.000,0.798,1.091

0,1,2,3
Omnibus:,3.1,Durbin-Watson:,2.183
Prob(Omnibus):,0.212,Jarque-Bera (JB):,2.518
Skew:,0.309,Prob(JB):,0.284
Kurtosis:,3.35,Cond. No.,21.9


### CAPM Interpretation — Value-Weighted Portfolio

The value-weighted portfolio has an estimated beta of 0.94, slightly below the equal-weighted portfolio's 0.99. The 95% confidence intervals of both estimates include 1 and overlap substantially, so the difference in systematic risk exposure is not statistically meaningful.

The alpha of 0.87% per month (about 10% per year) is statistically significant at the 5% level (t = 2.52, p = 0.013), suggesting that CAPM alone does not fully explain the portfolio's excess returns.

The lower R-squared (0.58 vs 0.70 for the equal-weighted portfolio) indicates that additional factors may be relevant in explaining value-weighted performance.
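
Whether either beta differs from 1 can be checked directly from the reported coefficients. The sketch below is simple arithmetic on the figures printed in the two CAPM summaries above. Because the standard errors are rounded to three decimals there, the resulting t-statistics are approximate. The names `capm_reported` and `results` are illustrative.

```python
# Coefficients and standard errors copied from the two CAPM summaries above
capm_reported = {
    "EW": {"beta": 0.9898, "se": 0.060, "alpha": -0.0013},
    "VW": {"beta": 0.9442, "se": 0.074, "alpha": 0.0087},
}

results = {}
for name, r in capm_reported.items():
    t_beta_one = (r["beta"] - 1.0) / r["se"]  # t-statistic for H0: beta = 1
    alpha_annual = r["alpha"] * 12            # simple (non-compounded) annualisation
    results[name] = (t_beta_one, alpha_annual)
    print(f"{name}: t(beta = 1) = {t_beta_one:.2f}, annualised alpha = {alpha_annual:.3f}")
```

Both t-statistics are well below 2 in absolute value, so neither beta can be distinguished from 1 at conventional significance levels.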

# 6. Fama-French 5-Factor Model

We extend CAPM by including additional systematic risk factors:

- Market (Mkt-RF)
- Size (SMB)
- Value (HML)
- Profitability (RMW)
- Investment (CMA)

This tests whether abnormal performance under CAPM disappears under a richer factor specification.

In [None]:
# Merge all required factors into portfolio

ff5_factors = factors[[
    "year_month",
    "Mkt-RF",
    "SMB",
    "HML",
    "RMW",
    "CMA"
]]

portfolio = portfolio.merge(
    ff5_factors,
    on="year_month",
    how="left"
)

portfolio.head()

Unnamed: 0,year_month,EW_ret,VW_ret,RF,EW_excess,VW_excess,Mkt-RF_x,Mkt-RF_y,SMB,HML,RMW,CMA
0,2015-01,-0.07208,-0.102319,0.0,-0.07208,-0.102319,-0.0309,-0.0309,-0.0093,-0.0345,0.0158,-0.0164
1,2015-02,0.069038,0.077682,0.0,0.069038,0.077682,0.0614,0.0614,0.0036,-0.0179,-0.011,-0.0175
2,2015-03,-0.007718,-0.035991,0.0,-0.007718,-0.035991,-0.0109,-0.0109,0.0308,-0.0038,0.0007,-0.0062
3,2015-04,-0.021759,0.105985,0.0,-0.021759,0.105985,0.006,0.006,-0.0301,0.018,0.0005,-0.0062
4,2015-05,0.026515,-0.020701,0.0,0.026515,-0.020701,0.0138,0.0138,0.0082,-0.0111,-0.0176,-0.0083


In [None]:
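# The merge above duplicated Mkt-RF, which was already in `portfolio` from the CAPM step.
# Drop the old copy (suffix _x) ...
portfolio = portfolio.drop(columns=[col for col in portfolio.columns if col.endswith("_x")])

# ... and strip the _y suffix from the new copy. Only a trailing "_y" is removed,
# so other column names that merely contain "_y" are left untouched.
portfolio = portfolio.rename(columns=lambda c: c[:-2] if c.endswith("_y") else c)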

portfolio.head()

Unnamed: 0,year_month,EW_ret,VW_ret,RF,EW_excess,VW_excess,Mkt-RF,SMB,HML,RMW,CMA
0,2015-01,-0.07208,-0.102319,0.0,-0.07208,-0.102319,-0.0309,-0.0093,-0.0345,0.0158,-0.0164
1,2015-02,0.069038,0.077682,0.0,0.069038,0.077682,0.0614,0.0036,-0.0179,-0.011,-0.0175
2,2015-03,-0.007718,-0.035991,0.0,-0.007718,-0.035991,-0.0109,0.0308,-0.0038,0.0007,-0.0062
3,2015-04,-0.021759,0.105985,0.0,-0.021759,0.105985,0.006,-0.0301,0.018,0.0005,-0.0062
4,2015-05,0.026515,-0.020701,0.0,0.026515,-0.020701,0.0138,0.0082,-0.0111,-0.0176,-0.0083
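
The `Mkt-RF_x` / `Mkt-RF_y` columns above arise because `Mkt-RF` was already present before the second merge. One way to avoid the duplicate, sketched here on toy frames (`left`, `right`, and `new_cols` are hypothetical names), is to bring over only the columns that are not yet in the target frame.

```python
import pandas as pd

# Toy frames: `left` already carries Mkt-RF, as `portfolio` does after the CAPM step
left = pd.DataFrame({
    "year_month": pd.PeriodIndex(["2015-01", "2015-02"], freq="M"),
    "EW_excess": [-0.07208, 0.069038],
    "Mkt-RF": [-0.0309, 0.0614],
})
right = pd.DataFrame({
    "year_month": pd.PeriodIndex(["2015-01", "2015-02"], freq="M"),
    "Mkt-RF": [-0.0309, 0.0614],
    "SMB": [-0.0093, 0.0036],
    "HML": [-0.0345, -0.0179],
})

key = "year_month"

# Bring over only the columns that `left` does not already have
new_cols = [c for c in right.columns if c not in left.columns]
merged = left.merge(right[[key] + new_cols], on=key, how="left", validate="one_to_one")

print(merged.columns.tolist())
```

With this selection no suffixes are created, so no clean-up step is needed afterwards.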


In [None]:
# --- FF5 Regression: Equal-Weighted Portfolio ---

Y_ew = portfolio["EW_excess"]

X_ff5 = portfolio[["Mkt-RF", "SMB", "HML", "RMW", "CMA"]]
X_ff5 = sm.add_constant(X_ff5)

ff5_ew = sm.OLS(Y_ew, X_ff5).fit()

ff5_ew.summary()

0,1,2,3
Dep. Variable:,EW_excess,R-squared:,0.748
Model:,OLS,Adj. R-squared:,0.737
Method:,Least Squares,F-statistic:,67.73
Date:,"Fri, 20 Feb 2026",Prob (F-statistic):,1.6e-32
Time:,02:23:10,Log-Likelihood:,262.47
No. Observations:,120,AIC:,-512.9
Df Residuals:,114,BIC:,-496.2
Df Model:,5,,
Covariance Type:,nonrobust,,

0,1,2,3,4,5,6
,coef,std err,t,P>|t|,[0.025,0.975]
const,-0.0002,0.003,-0.080,0.937,-0.005,0.005
Mkt-RF,0.9052,0.062,14.629,0.000,0.783,1.028
SMB,0.3921,0.111,3.540,0.001,0.173,0.611
HML,0.0872,0.101,0.861,0.391,-0.113,0.288
RMW,0.1303,0.140,0.929,0.355,-0.147,0.408
CMA,-0.0046,0.152,-0.030,0.976,-0.305,0.296

0,1,2,3
Omnibus:,4.739,Durbin-Watson:,1.759
Prob(Omnibus):,0.094,Jarque-Bera (JB):,6.076
Skew:,0.125,Prob(JB):,0.0479
Kurtosis:,4.074,Cond. No.,69.3


### FF5 Interpretation — Equal-Weighted Portfolio

Under the Fama-French 5-factor model, alpha remains economically and statistically insignificant (-0.02% per month, p = 0.94), as it was under CAPM, suggesting no abnormal performance.

R-squared increases from 0.70 under CAPM to 0.75 (adjusted: 0.70 to 0.74), indicating a modest improvement in explanatory power.

The portfolio exhibits significant positive exposure to the size factor (SMB = 0.39, t = 3.54), consistent with equal-weighting's overweighting of smaller-cap stocks.

Other factors (HML, RMW, CMA) are not statistically significant in this sample.

In [None]:
# --- FF5 Regression: Value-Weighted Portfolio ---

Y_vw = portfolio["VW_excess"]

X_ff5 = portfolio[["Mkt-RF", "SMB", "HML", "RMW", "CMA"]]
X_ff5 = sm.add_constant(X_ff5)

ff5_vw = sm.OLS(Y_vw, X_ff5).fit()

ff5_vw.summary()

0,1,2,3
Dep. Variable:,VW_excess,R-squared:,0.718
Model:,OLS,Adj. R-squared:,0.705
Method:,Least Squares,F-statistic:,57.92
Date:,"Fri, 20 Feb 2026",Prob (F-statistic):,1.04e-29
Time:,02:23:10,Log-Likelihood:,250.04
No. Observations:,120,AIC:,-488.1
Df Residuals:,114,BIC:,-471.4
Df Model:,5,,
Covariance Type:,nonrobust,,

0,1,2,3,4,5,6
,coef,std err,t,P>|t|,[0.025,0.975]
const,0.0063,0.003,2.139,0.035,0.000,0.012
Mkt-RF,1.0564,0.069,15.393,0.000,0.920,1.192
SMB,-0.6033,0.123,-4.911,0.000,-0.847,-0.360
HML,-0.1556,0.112,-1.386,0.169,-0.378,0.067
RMW,-0.0274,0.155,-0.176,0.860,-0.335,0.281
CMA,-0.1461,0.168,-0.868,0.387,-0.480,0.187

0,1,2,3
Omnibus:,3.563,Durbin-Watson:,2.103
Prob(Omnibus):,0.168,Jarque-Bera (JB):,3.967
Skew:,0.084,Prob(JB):,0.138
Kurtosis:,3.875,Cond. No.,69.3


### FF5 Interpretation — Value-Weighted Portfolio

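The value-weighted portfolio retains a statistically significant alpha under the FF5 specification (0.63% per month, t = 2.14, p = 0.035), although it is smaller than under CAPM and the lower bound of its 95% confidence interval is close to zero.

The portfolio exhibits strong negative exposure to the size factor (SMB = -0.60, t = -4.91), consistent with large-cap dominance in value-weighting. Adding the extra factors raises R-squared from 0.58 to 0.72.

The persistence of alpha suggests that additional systematic factors, such as momentum, may be relevant.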

# 7. FF6 Model (Adding Momentum)

We extend the FF5 model by including the momentum factor (`Mom`).

This tests whether remaining abnormal returns are explained by momentum exposure.

In [None]:
# --- FF6 Regression: Equal-Weighted Portfolio ---

Y_ew = portfolio["EW_excess"]

# Merge the Momentum factor into the portfolio DataFrame
portfolio = portfolio.merge(
    factors[["year_month", "Mom"]],
    on="year_month",
    how="left"
)

X_ff6 = portfolio[[
    "Mkt-RF",
    "SMB",
    "HML",
    "RMW",
    "CMA",
    "Mom"
]]

X_ff6 = sm.add_constant(X_ff6)

ff6_ew = sm.OLS(Y_ew, X_ff6).fit()

ff6_ew.summary()

0,1,2,3
Dep. Variable:,EW_excess,R-squared:,0.754
Model:,OLS,Adj. R-squared:,0.741
Method:,Least Squares,F-statistic:,57.65
Date:,"Fri, 20 Feb 2026",Prob (F-statistic):,3.96e-32
Time:,02:26:01,Log-Likelihood:,263.82
No. Observations:,120,AIC:,-513.6
Df Residuals:,113,BIC:,-494.1
Df Model:,6,,
Covariance Type:,nonrobust,,

0,1,2,3,4,5,6
,coef,std err,t,P>|t|,[0.025,0.975]
const,0.0003,0.003,0.122,0.903,-0.005,0.006
Mkt-RF,0.8770,0.064,13.719,0.000,0.750,1.004
SMB,0.3414,0.114,2.983,0.004,0.115,0.568
HML,0.0499,0.103,0.483,0.630,-0.155,0.254
RMW,0.0968,0.141,0.688,0.493,-0.182,0.376
CMA,0.0325,0.153,0.213,0.832,-0.270,0.335
Mom,-0.1232,0.077,-1.602,0.112,-0.276,0.029

0,1,2,3
Omnibus:,4.845,Durbin-Watson:,1.685
Prob(Omnibus):,0.089,Jarque-Bera (JB):,5.284
Skew:,0.244,Prob(JB):,0.0712
Kurtosis:,3.905,Cond. No.,69.5


In [None]:
# --- FF6 Regression: Value-Weighted Portfolio ---

Y_vw = portfolio["VW_excess"]

X_ff6_vw = portfolio[[
    "Mkt-RF",
    "SMB",
    "HML",
    "RMW",
    "CMA",
    "Mom"
]]

X_ff6_vw = sm.add_constant(X_ff6_vw)

ff6_vw = sm.OLS(Y_vw, X_ff6_vw).fit()

ff6_vw.summary()

0,1,2,3
Dep. Variable:,VW_excess,R-squared:,0.719
Model:,OLS,Adj. R-squared:,0.704
Method:,Least Squares,F-statistic:,48.13
Date:,"Fri, 20 Feb 2026",Prob (F-statistic):,6.61e-29
Time:,02:27:11,Log-Likelihood:,250.29
No. Observations:,120,AIC:,-486.6
Df Residuals:,113,BIC:,-467.1
Df Model:,6,,
Covariance Type:,nonrobust,,

0,1,2,3,4,5,6
,coef,std err,t,P>|t|,[0.025,0.975]
const,0.0066,0.003,2.204,0.030,0.001,0.012
Mkt-RF,1.0427,0.072,14.573,0.000,0.901,1.185
SMB,-0.6279,0.128,-4.901,0.000,-0.882,-0.374
HML,-0.1736,0.115,-1.503,0.136,-0.402,0.055
RMW,-0.0436,0.158,-0.277,0.782,-0.356,0.269
CMA,-0.1282,0.171,-0.751,0.454,-0.466,0.210
Mom,-0.0596,0.086,-0.692,0.490,-0.230,0.111

0,1,2,3
Omnibus:,3.603,Durbin-Watson:,2.104
Prob(Omnibus):,0.165,Jarque-Bera (JB):,3.981
Skew:,0.1,Prob(JB):,0.137
Kurtosis:,3.87,Cond. No.,69.5
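
To make the comparison across specifications easier to read, the intercepts reported in the six regression summaries above can be collected into one table. This is a minimal sketch: the numbers are copied by hand from the printed output rather than read from the fitted models, and the names `rows` and `comparison` are illustrative.

```python
import pandas as pd

# Intercepts, t-statistics, p-values and R-squared copied from the six summaries above
rows = [
    ("CAPM", "EW", -0.0013, -0.480, 0.632, 0.699),
    ("CAPM", "VW",  0.0087,  2.517, 0.013, 0.579),
    ("FF5",  "EW", -0.0002, -0.080, 0.937, 0.748),
    ("FF5",  "VW",  0.0063,  2.139, 0.035, 0.718),
    ("FF6",  "EW",  0.0003,  0.122, 0.903, 0.754),
    ("FF6",  "VW",  0.0066,  2.204, 0.030, 0.719),
]
comparison = pd.DataFrame(
    rows, columns=["model", "portfolio", "alpha", "t_alpha", "p_alpha", "r_squared"]
)

# Flag intercepts that are significant at the 5% level
comparison["significant_5pct"] = comparison["p_alpha"] < 0.05

print(comparison.pivot(index="model", columns="portfolio", values="alpha"))
```

Read this way, the value-weighted alpha stays between 0.6% and 0.9% per month and is significant at the 5% level in all three specifications, while the equal-weighted alpha is never distinguishable from zero. The momentum loadings are insignificant for both portfolios (p = 0.112 and p = 0.490) and R-squared barely moves from FF5 to FF6, so adding `Mom` does not appear to explain the remaining value-weighted alpha.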


In [326]:
# --- Save outputs ---
# Reuse the project paths defined in the setup cell
OUT_T = OUTPUTS_DIR / "tables"
OUT_F = FIG_DIR

OUT_T.mkdir(parents=True, exist_ok=True)
OUT_F.mkdir(parents=True, exist_ok=True)

# Save performance summary
performance_summary.to_csv(OUT_T / "performance_summary.csv")

# Save regression summaries as plain text, one file per fitted model
fitted_models = {
    "capm_ew": capm_ew,
    "capm_vw": capm_vw,
    "ff5_ew": ff5_ew,
    "ff5_vw": ff5_vw,
    "ff6_ew": ff6_ew,
    "ff6_vw": ff6_vw,
}

for name, model in fitted_models.items():
    (OUT_T / f"{name}.txt").write_text(model.summary().as_text())