# Introduction

Alpha, the return a portfolio earns in excess of a market benchmark (Islam, 2025) or beyond what its risk would normally justify, has become increasingly important when assessing the performance of a portfolio or fund. Earlier models such as the Capital Asset Pricing Model (CAPM) offered a simple, one-factor view of expected returns, determined by an asset’s beta (its risk relative to the market) and the market risk premium. This view was challenged in 1992 by a “landmark” paper (Coumarianos, 2019) from Eugene Fama and Kenneth French, which showed that small and value stocks had consistently outperformed the market in the preceding decades, a pattern tied to two factors that CAPM does not capture. Because CAPM omitted these sources of systematic risk, it was unable to explain the abnormal returns (“alpha”) observed among small, value, and certain “growth” stocks.

The Fama–French Three-Factor model attempts to address this limitation by adding size (SMB) and value (HML) factors to the regression. By doing so, it reframes performance that looks abnormal under CAPM as exposure to systematic, priced sources of risk. This raises an important empirical question: are CAPM alphas genuine abnormal returns, or simply compensation for persistent tilts toward small and value firms?

**To what extent do the size (SMB) and value (HML) factors in the Fama–French Three-Factor Model reduce or eliminate the abnormal returns that appear under CAPM?**

To answer this question, we study six diversified portfolios sorted jointly by firm size and book-to-market ratio, using monthly data from the Kenneth R. French Data Library. For each portfolio, we compare CAPM with the Fama–French specification by regressing excess returns on market, size, and value factors, and examining how alpha and adjusted $R^2$ change once SMB and HML are included. If alphas that appear substantial under CAPM largely disappear under the three-factor model, this supports the interpretation that what seems like “skill” may instead reflect exposure to priced risk factors. Conversely, if large and persistent abnormal returns remain after controlling for market, size, and value, this points toward mispricing, behavioural anomalies, or limits to arbitrage.

Overview of Results: Our findings show that the Fama–French model reduces alphas for the small-firm portfolios, where CAPM alphas are noticeably larger in magnitude, while increasing adjusted $R^2$ for all six portfolios, indicating an improved fit. The three-factor model adds less for the large-firm portfolios, particularly large growth, where SMB and HML contribute the least additional explanatory power and alphas do not shrink. Overall, the evidence favours the view that much of CAPM’s “alpha” is explained once size and value exposures are accounted for.


In [1]:
## Load libraries
library(readxl)      # import Excel files
library(dplyr)       # data manipulation
library(tidyr)       # data cleaning
library(ggplot2)     # plotting (if needed)
library(broom)       # clean regression output
library(lubridate)   # work with dates
library(knitr)       # tables in Rmd / HTML
library(readr)       # read_csv()
library(car)         # vif()
library(lmtest)      # bgtest()
# Install sandwich only if it is missing, rather than on every run
if (!requireNamespace("sandwich", quietly = TRUE)) install.packages("sandwich")
library(sandwich)    # HAC (Newey-West) covariance estimators







# Data Description
## Data Sources:

Our two data sets come from the Kenneth R. French Data Library: 
https://mba.tuck.dartmouth.edu/pages/faculty/ken.french/data_library.html

- “6 Portfolios Formed on Size and Book-to-Market”: monthly returns (1929-2025) on six value-weighted portfolios sorted on firm size (market equity) and book-to-market ratio. https://mba.tuck.dartmouth.edu/pages/faculty/ken.french/ftp/6_Portfolios_2x3_CSV.zip
- “Fama–French Research Factors” (F–F Research Data Factors): the standard Fama–French three factors (Mkt−RF, SMB, HML) and the risk-free rate (RF), also at a monthly frequency (1929-2025). https://mba.tuck.dartmouth.edu/pages/faculty/ken.french/ftp/F-F_Research_Data_Factors_CSV.zip


Both files were downloaded in CSV form and imported into R as 6_Portfolios_2x3_Clean.csv and F-F_Research_Data_Factors_Clean.csv.


In [2]:
## Load Data (CSV)
port <- read_csv("6_Portfolios_2x3_Clean.csv")
ff   <- read_csv("F-F_Research_Data_Factors_Clean.csv")



Rows: 1190 Columns: 8
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (1): Month
dbl (7): Year, M1B1, M1B2, M1B3, M2B1, M2B2, M2B3

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Rows: 1190 Columns: 6
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (1): Month
dbl (5): Year, Mkt-RF, SMB, HML, RF

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.


## Data Structure

- The portfolio and factor datasets are monthly time series.
- In R, both are stored in wide format: each row corresponds to a Year–Month pair, and each portfolio or factor has its own column.
- In its raw form, the data provides returns and factors from 1929 to 2025.
- Portfolios formed on size and book-to-market use the following naming convention: M1 = Small, M2 = Big, B1 = Low B/M, B2 = Medium B/M, B3 = High B/M.


In [3]:
port <- port %>%
  mutate(
    Year = as.numeric(Year),
    Month = trimws(Month)
  )

ff <- ff %>%
  mutate(
    Year = as.numeric(Year),
    Month = trimws(Month)
  )


## Data Cleaning and Wrangling

Both files contain inputs to the CAPM and FF3 regressions, so for convenience and readability we merge 6_Portfolios_2x3_Clean.csv and F-F_Research_Data_Factors_Clean.csv into a single dataset, matching rows on Year and Month.


After merging, the dataset contains the following columns:

- Time identifiers: Year, Month 

- Six portfolio returns: M1B1 to M2B3, following the naming convention described above

- Four factor variables: Mkt_RF (renamed from Mkt-RF), SMB, HML, RF


In [4]:
data_raw <- inner_join(port, ff, by = c("Year", "Month"))
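
Because an inner join silently drops rows whose keys do not match, it can be worth confirming that nothing is lost in the merge. The sketch below uses made-up toy tibbles (`port_toy` and `ff_toy` are hypothetical stand-ins, not the project files) to show how `anti_join()` exposes unmatched keys, and why trimming whitespace in `Month` matters.

```r
library(dplyr)

# Toy stand-ins for the two cleaned files (all values are made up)
port_toy <- tibble(
  Year  = c(1990, 1990, 1990),
  Month = c("January", "February", "March"),
  M1B1  = c(1.2, -0.4, 0.8)
)
ff_toy <- tibble(
  Year  = c(1990, 1990, 1990),
  Month = c("January", "February ", "March"),  # note the trailing space
  RF    = c(0.57, 0.57, 0.64)
)

# Rows of port_toy with no partner in ff_toy would vanish in an inner join
unmatched_raw <- anti_join(port_toy, ff_toy, by = c("Year", "Month"))

# After trimming whitespace, every key finds its partner
ff_trim        <- mutate(ff_toy, Month = trimws(Month))
unmatched_trim <- anti_join(port_toy, ff_trim, by = c("Year", "Month"))

nrow(unmatched_raw)   # number of rows lost before trimming
nrow(unmatched_trim)  # number of rows lost after trimming
```

Running the same `anti_join()` on the real `port` and `ff` objects, in both directions, would show whether the merged dataset is missing any months.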



The scope of our project is to measure abnormal returns and model fit from 1990 to the present, so we drop all observations dated before 1990. This leaves portfolio returns and factor variables from January 1990 to October 2025 in the merged dataset. 

In [5]:
data <- data_raw %>%
  filter(Year > 1989) %>%       # keep observations from 1990 onward
  rename(Mkt_RF = `Mkt-RF`)     # syntactic name for use in formulas
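
One way to confirm the sample window after filtering is to turn the Year and Month labels into dates and compare the number of calendar months spanned with the number of rows kept. The following is a minimal sketch on a made-up `keys` data frame (hypothetical, with a deliberately missing month); it assumes `Month` holds full English month names, as in Table 1.

```r
# Toy Year/Month keys in the same layout as the merged data (made-up window)
keys <- data.frame(
  Year  = c(1989, 1989, 1990, 1990, 1990),
  Month = c("November", "December", "January", "February", "April")
)

# Build a Date (first day of the month) from the year and the month name
keys$Date <- as.Date(sprintf("%d-%02d-01", keys$Year, match(keys$Month, month.name)))

# Same filter as in the cell above
kept <- keys[keys$Year > 1989, ]
range(kept$Date)

# Calendar months spanned vs rows kept; a positive gap signals missing months
n_expected <- length(seq(min(kept$Date), max(kept$Date), by = "month"))
n_missing  <- n_expected - nrow(kept)
n_missing
```

Applied to `data`, a non-zero `n_missing` would indicate months that were dropped by the join or absent from the source files.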


Finally, to run the regressions we need the left-hand-side variable $Y$, the excess return of each portfolio, i.e. $R_{p,t} - R_{f,t}$. We first convert all percentage returns into decimal form and then add a column for each portfolio's excess return (e.g. M1B1_excess) to the merged dataset, for use in the CAPM and FF3 regressions. 

In [6]:

                        
# Portfolio and factor columns
port_cols   <- c("M1B1","M1B2","M1B3","M2B1","M2B2","M2B3")
factor_cols <- c("Mkt_RF","SMB","HML","RF")

# Convert portfolio and factor returns from percent to decimals
for (colname in c(port_cols, factor_cols)) {
  data[[colname]] <- data[[colname]] / 100
}


In [7]:
# Excess return = portfolio return minus the risk-free rate
for (base_name in port_cols) {
  new_name <- paste(base_name, "excess", sep = "_")  # e.g. "M1B1_excess"
  data[[new_name]] <- data[[base_name]] - data$RF
}

## Table 1: Variable Description

| Variable       | Type      | Description |
|----------------|-----------|-------------|
| Year           | numeric   | Calendar year of the observation (e.g., 1990, 1991). |
| Month          | character | Month of the observation as a text label (e.g., "January"). |
| M1B1           | numeric   | Return on the Small, Low book-to-market portfolio (Size=1, BM=1), decimal. |
| M1B2           | numeric   | Return on the Small, Medium book-to-market portfolio (Size=1, BM=2), decimal. |
| M1B3           | numeric   | Return on the Small, High book-to-market portfolio (Size=1, BM=3), decimal. |
| M2B1           | numeric   | Return on the Big, Low book-to-market portfolio (Size=2, BM=1), decimal. |
| M2B2           | numeric   | Return on the Big, Medium book-to-market portfolio (Size=2, BM=2), decimal. |
| M2B3           | numeric   | Return on the Big, High book-to-market portfolio (Size=2, BM=3), decimal. |
| Mkt_RF         | numeric   | Market excess return: market return minus risk-free rate, decimal. |
| SMB            | numeric   | SMB factor: Small Minus Big (size factor), decimal. |
| HML            | numeric   | HML factor: High Minus Low (value factor), decimal. |
| RF             | numeric   | Risk-free rate, decimal. |
| M1B1_excess    | numeric   | Excess return on M1B1: M1B1 − RF. |
| M1B2_excess    | numeric   | Excess return on M1B2: M1B2 − RF. |
| M1B3_excess    | numeric   | Excess return on M1B3: M1B3 − RF. |
| M2B1_excess    | numeric   | Excess return on M2B1: M2B1 − RF. |
| M2B2_excess    | numeric   | Excess return on M2B2: M2B2 − RF. |
| M2B3_excess    | numeric   | Excess return on M2B3: M2B3 − RF. |


# Summary Statistics

In [8]:
# Variables to summarize
vars <- c(
  "M1B1_excess","M1B2_excess","M1B3_excess",
  "M2B1_excess","M2B2_excess","M2B3_excess",
  "Mkt_RF","SMB","HML","RF"
)

# Initialize vectors
N    <- numeric(length(vars))
Mean <- numeric(length(vars))
SD   <- numeric(length(vars))
Min  <- numeric(length(vars))
Max  <- numeric(length(vars))

# Compute summary statistics
for (i in 1:length(vars)) {
  col <- data[[ vars[i]]]
  N[i]    <- length(col)
  Mean[i] <- mean(col)
  SD[i]   <- sd(col)
  Min[i]  <- min(col)
  Max[i]  <- max(col)
}

# Build summary table
summary_table <- data.frame(
  Variable = vars,
  N = N,
  Mean = Mean,
  SD = SD,
  Min = Min,
  Max = Max
)


## Table 2: Summary Statistics

In [9]:
summary_table

| Variable    | N   | Mean         | SD          | Min       | Max      |
|-------------|-----|--------------|-------------|-----------|----------|
| M1B1_excess | 428 | 0.0062546005 | 0.067529934 | -0.245582 | 0.285231 |
| M1B2_excess | 428 | 0.008761986  | 0.054125068 | -0.215808 | 0.176856 |
| M1B3_excess | 428 | 0.0092375841 | 0.058213944 | -0.279806 | 0.196277 |
| M2B1_excess | 428 | 0.0082013388 | 0.044453564 | -0.156145 | 0.143297 |
| M2B2_excess | 428 | 0.0066193621 | 0.043316664 | -0.192735 | 0.143832 |
| M2B3_excess | 428 | 0.0079989766 | 0.05381519  | -0.273898 | 0.181638 |
| Mkt_RF      | 428 | 0.0073397196 | 0.044067993 | -0.172    | 0.136    |
| SMB         | 428 | 0.0004799065 | 0.031257803 | -0.1741   | 0.2125   |
| HML         | 428 | 0.0013885514 | 0.032358832 | -0.1383   | 0.1286   |
| RF          | 428 | 0.0022233645 | 0.001879076 | 0.0       | 0.0069   |


# Methods



This paper evaluates whether the Fama–French Three-Factor Model provides a superior explanation of portfolio returns relative to the Capital Asset Pricing Model (CAPM). Our empirical strategy consists of three components: (1) constructing excess returns, (2) estimating two competing regression models for each portfolio, and (3) formally testing, with a nested F-test, whether the inclusion of size and value factors significantly improves model fit.


### Data Construction

For each portfolio $p$ and month $t$, we compute excess returns as:

$$
R_{p,t}^{\text{excess}} = R_{p,t} - R_{f,t},
$$

where $R_{p,t}$ is the raw return and $R_{f,t}$ is the risk-free rate.  
Factor returns (Market, SMB, HML) are converted into decimal form for consistency.


### Model Specifications

We estimate two regression specifications for each of the six portfolios.

### (1) CAPM Specification

CAPM specifies that the expected excess return of a portfolio is determined only by its exposure to the market factor. Following Sharpe (1964), the estimated regression is:

$$
R_{p,t}^{\text{excess}} = \alpha_p + \beta_{p} \, (R_{m,t} - R_{f,t}) + \varepsilon_{p,t},
$$

where  
- $\alpha_p$ measures abnormal returns unexplained by market risk,  
- $\beta_p$ captures sensitivity to market excess returns, and  
- $\varepsilon_{p,t}$ is the regression error term.

A statistically significant $\alpha_p$ indicates that CAPM fails to explain portfolio $p$’s returns completely.  
Our empirical question is whether these alphas shrink after incorporating additional factors, such as those specified by the Fama–French Three-Factor Model (FF3). 
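
To make "statistically significant alpha" concrete, the sketch below fits a CAPM-style regression on simulated data and pulls out the intercept's estimate, standard error, t-statistic and p-value from `summary()`. The series `mkt_rf` and `excess` and all parameter values are made up for illustration; they are not the project data.

```r
set.seed(11)
n <- 120

# Simulated market excess return and portfolio excess return (made-up parameters)
mkt_rf <- rnorm(n, mean = 0.006, sd = 0.045)
excess <- 0.002 + 1.1 * mkt_rf + rnorm(n, mean = 0, sd = 0.02)

capm_fit <- lm(excess ~ mkt_rf)

# Row of the coefficient table that belongs to alpha (the intercept)
alpha_row <- summary(capm_fit)$coefficients["(Intercept)", ]
alpha_row[c("Estimate", "Std. Error", "t value", "Pr(>|t|)")]
```

The same extraction applied to each fitted portfolio model would show which alphas are distinguishable from zero under the usual OLS standard errors.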

### (2) Fama–French Three-Factor Model (FF3)

The FF3 model extends CAPM by adding firm size (SMB) and value (HML), two characteristics shown by Fama and French (1992) to capture “substantial” variation in stock returns. Historically, small-cap firms have tended to outperform large-cap firms, and value stocks (high book-to-market) have tended to outperform growth stocks (low book-to-market). These are patterns that CAPM does not explain through its single factor.

$$
R_{p,t}^{\text{excess}} 
= \alpha^{FF}_p 
+ \beta^{MKT}_{p} (R_{m,t} - R_{f,t}) 
+ \beta^{SMB}_{p} \, SMB_t 
+ \beta^{HML}_{p} \, HML_t 
+ \varepsilon_{p,t}.
$$

This specification incorporates exposures to firm size (SMB) and book-to-market (HML) to assess whether CAPM alphas are attributable to omitted risk factors.


### Hypothesis Testing

To assess whether SMB and HML meaningfully increase explanatory power, we perform a **nested F-test** comparing CAPM (restricted model) to FF3 (unrestricted model). 

### Null and Alternative Hypotheses

For each portfolio $p$:

$$
H_0: \beta^{SMB}_p = 0 \quad \text{and} \quad \beta^{HML}_p = 0
$$

$$
H_A: \text{At least one of } \beta^{SMB}_p, \beta^{HML}_p \neq 0
$$

Under the null hypothesis, CAPM is sufficient and the additional factors do **not** improve model fit.  
The alternative hypothesis states that the Fama–French factors provide additional explanatory power beyond CAPM.

### Why Testing Additional Betas Through a Nested F-Test Bears on Alpha

The nested F-test evaluates whether the additional factor exposures in the FF3 model, specifically the size beta $\beta^{SMB}_{p}$ and the value beta $\beta^{HML}_{p}$, are jointly significant. The null hypothesis $H_0: \beta^{SMB}_{p} = \beta^{HML}_{p} = 0$ corresponds to the CAPM restriction that only market risk is priced. If the F-test rejects this null, it indicates that size and value effects help explain returns beyond what CAPM captures.

This has a direct consequence for the alpha term. In the restricted CAPM model, the part of the size and value premia that the market factor does not explain has nowhere to go but the intercept $\alpha_p$, which distorts the CAPM estimate of “abnormal return.” Once the omitted variables are included in the FF3 specification through $\beta^{SMB}_{p}$ and $\beta^{HML}_{p}$, this component is reassigned from the intercept to the size and value factors. When a portfolio's loadings and the factor premia share the same sign, the FF3 alpha is smaller in magnitude than the CAPM alpha and often becomes statistically insignificant.

A significant nested F-test therefore implies that the CAPM alpha was partly capturing the effects of omitted risk factors, and that incorporating $\beta^{SMB}_{p}$ and $\beta^{HML}_{p}$ into the model attributes this component to priced sources of systematic risk.
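
The link between the two intercepts can be checked numerically. In-sample, OLS satisfies the omitted-variable identity: the CAPM alpha equals the FF3 alpha plus each omitted factor's loading times that factor's own intercept from a regression on the market. The sketch below verifies this on simulated data; all series and parameter values are made up, and lower-case names (`mkt`, `smb`, `hml`) are used to keep it separate from the project data.

```r
set.seed(1)
n <- 240

# Simulated factors and portfolio excess return (made-up parameters)
mkt <- rnorm(n, 0.006, 0.045)
smb <- 0.002 + 0.2 * mkt + rnorm(n, 0, 0.03)
hml <- 0.003 - 0.1 * mkt + rnorm(n, 0, 0.03)
y   <- 0.001 + 1.0 * mkt + 0.8 * smb + 0.5 * hml + rnorm(n, 0, 0.01)

capm <- lm(y ~ mkt)              # restricted model
ff3  <- lm(y ~ mkt + smb + hml)  # unrestricted model

# Auxiliary regressions: each omitted factor on the included regressor
aux_smb <- lm(smb ~ mkt)
aux_hml <- lm(hml ~ mkt)

alpha_capm    <- coef(capm)[["(Intercept)"]]
alpha_rebuilt <- coef(ff3)[["(Intercept)"]] +
  coef(ff3)[["smb"]] * coef(aux_smb)[["(Intercept)"]] +
  coef(ff3)[["hml"]] * coef(aux_hml)[["(Intercept)"]]

all.equal(alpha_capm, alpha_rebuilt)
```

The identity also makes clear that the size and direction of the change in alpha depend on the signs of the loadings and of the factors' market-adjusted premia, so the change has to be read off the estimates portfolio by portfolio.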


### F-Statistic

Let $SSR_{CAPM}$ denote the sum of squared residuals from the restricted model, and $SSR_{FF3}$ the residual sum of squares from the unrestricted model.  
The nested F-statistic is:

$$
F =
\frac{(SSR_{CAPM} - SSR_{FF3}) / q}
     {SSR_{FF3} / (n - k)},
$$

where:

- $q = 2$ is the number of restrictions (SMB and HML),  
- $n$ is the number of observations, and  
- $k = 4$ is the number of parameters in the unrestricted model (the intercept plus three factor loadings).

A statistically significant $p$-value leads to rejection of the null hypothesis, indicating that SMB and HML jointly improve the model.
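
As a check on the formula, the F-statistic can be computed by hand from the two residual sums of squares and compared with what `anova()` reports for the same pair of nested models. This is a sketch on simulated data with made-up coefficients, not the project data.

```r
set.seed(42)
n <- 200

# Simulated regressors and response (made-up coefficients)
mkt <- rnorm(n)
smb <- rnorm(n)
hml <- rnorm(n)
y   <- 0.5 * mkt + 0.3 * smb - 0.2 * hml + rnorm(n)

restricted   <- lm(y ~ mkt)              # CAPM-style model
unrestricted <- lm(y ~ mkt + smb + hml)  # FF3-style model

ssr_r <- sum(resid(restricted)^2)
ssr_u <- sum(resid(unrestricted)^2)

q <- 2                           # restrictions: SMB and HML loadings are zero
k <- length(coef(unrestricted))  # 4 parameters, including the intercept

f_manual <- ((ssr_r - ssr_u) / q) / (ssr_u / (n - k))
p_manual <- pf(f_manual, df1 = q, df2 = n - k, lower.tail = FALSE)

# Same statistic from R's built-in comparison of nested models
f_anova <- anova(restricted, unrestricted)$F[2]
all.equal(f_manual, f_anova)
```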


### Interpretation

If the F-test rejects $H_0$, we conclude that:

- the Fama–French factors (SMB and HML) explain returns not captured by CAPM,  
- FF3 provides a materially better fit to the data, and  
- CAPM omits priced sources of systematic risk.

This joint significance test provides the primary statistical evidence for comparing the strength and validity of the two competing asset-pricing models.


### Expected Outcomes

Based on established findings in the asset-pricing literature, we expect:

1. **The Fama–French (FF3) model will exhibit higher adjusted $R^2$** than the CAPM for all portfolios, reflecting improved explanatory power when size and value risk factors are included.

2. **The SMB and HML factor loadings will be jointly significant**, indicating that firm size and book-to-market effects capture return variation not explained by the market factor alone.

3. **The nested F-test will reject the null hypothesis**  
   $$
   H_0: \beta^{SMB}_p = \beta^{HML}_p = 0,
   $$ 
   showing that the additional FF3 factors materially improve model fit relative to CAPM.

Collectively, these outcomes would imply that CAPM omits important priced sources of risk, while the FF3 model more accurately captures systematic variation in portfolio returns.

### Specification Assumptions

For the CAPM and Fama–French Three-Factor (FF3) regressions to be valid and informative, several key specification assumptions must hold. In this section, we outline the most important assumptions and briefly justify why they are reasonable in our setting.

##### 1. Linearity of the Model

Both CAPM and FF3 are specified as linear factor models. The conditional expectation of excess returns is assumed to be a linear function of the factors:

$$
E\left[ R^{\text{excess}}_{p,t} \mid X_t \right]
= \alpha_p + \beta^{MKT}_{p} (R_{m,t} - R_{f,t})
+ \beta^{SMB}_{p} \, SMB_t
+ \beta^{HML}_{p} \, HML_t,
$$

where $X_t$ collects the factor realizations at time $t$. This linear structure is standard in the asset–pricing literature and ensures that the estimated coefficients have a clear interpretation as marginal sensitivities of returns to factor movements.

##### 2. No Perfect Multicollinearity

A central specification requirement is that the regressors are not perfectly collinear. In the FF3 model, this means that market excess returns, $SMB_t$, and $HML_t$ cannot be exact linear combinations of one another. Formally, the design matrix must be full rank:

$$
\text{rank}(X'X) = k,
$$

where $k$ denotes the number of regressors in the FF3 specification. We assess this condition using variance inflation factors (VIFs) and the correlation matrix of the factors. VIF values below conventional thresholds indicate that the model is identifiable and that the factor loadings $\beta^{MKT}_{p}$, $\beta^{SMB}_{p}$, and $\beta^{HML}_{p}$ can be estimated reliably.

##### 3. Exogeneity of the Factors

OLS requires that the regression errors are mean–independent of the regressors:

$$
E\left[ \varepsilon_{p,t} \mid R_{m,t} - R_{f,t}, SMB_t, HML_t \right] = 0.
$$

In our context, the Fama–French factors are constructed from returns using publicly available information and do not depend on future realizations of portfolio returns. This makes it reasonable to assume that the factors are exogenous with respect to the error term, so that OLS provides unbiased estimates of the factor loadings and the intercept $\alpha_p$.

##### 4. Absence of Omitted Variable Bias in the Final Specification

A properly specified factor model should include all systematically important sources of priced risk. The CAPM specification omits size and value effects, so any return variation correlated with these characteristics is absorbed into the CAPM alpha, inflating $\alpha_p$ and creating the appearance of abnormal performance. The FF3 specification corrects this by explicitly including $SMB_t$ and $HML_t$:

$$
R^{\text{excess}}_{p,t} = \alpha_p
+ \beta^{MKT}_{p} (R_{m,t} - R_{f,t})
+ \beta^{SMB}_{p} \, SMB_t
+ \beta^{HML}_{p} \, HML_t
+ \varepsilon_{p,t}.
$$

If FF3 is correctly specified, the alpha term should shrink toward zero and lose statistical significance, indicating that previously “abnormal” returns are in fact compensation for additional risk exposures rather than mispricing.

##### 5. Weak Stationarity of Returns and Factors

Because we estimate time–series regressions, it is important that the return and factor series satisfy weak stationarity, meaning that their means and variances are stable over time and autocovariances depend only on the lag, not the calendar date. Monthly stock and factor returns are commonly modeled as weakly stationary in empirical finance, which supports the use of OLS in our setting. This assumption ensures that our parameter estimates are consistent and that sampling variability behaves in a standard way.


Of these assumptions, we directly check the absence of perfect multicollinearity, using VIFs, in the Specification Tests section, where we also test the residuals for serial correlation. 







# Model 

In [10]:
capm_M1B1 <- lm(M1B1_excess ~ Mkt_RF, data = data)
capm_M1B2 <- lm(M1B2_excess ~ Mkt_RF, data = data)
capm_M1B3 <- lm(M1B3_excess ~ Mkt_RF, data = data)
capm_M2B1 <- lm(M2B1_excess ~ Mkt_RF, data = data)
capm_M2B2 <- lm(M2B2_excess ~ Mkt_RF, data = data)
capm_M2B3 <- lm(M2B3_excess ~ Mkt_RF, data = data)



In [11]:
ff_M1B1 <- lm(M1B1_excess ~ Mkt_RF + SMB + HML, data = data)
ff_M1B2 <- lm(M1B2_excess ~ Mkt_RF + SMB + HML, data = data)
ff_M1B3 <- lm(M1B3_excess ~ Mkt_RF + SMB + HML, data = data)

ff_M2B1 <- lm(M2B1_excess ~ Mkt_RF + SMB + HML, data = data)
ff_M2B2 <- lm(M2B2_excess ~ Mkt_RF + SMB + HML, data = data)
ff_M2B3 <- lm(M2B3_excess ~ Mkt_RF + SMB + HML, data = data)


In [12]:
CAPM_Alpha <- c(
  coef(capm_M1B1)[1], coef(capm_M1B2)[1], coef(capm_M1B3)[1],
  coef(capm_M2B1)[1], coef(capm_M2B2)[1], coef(capm_M2B3)[1]
)

FF_Alpha <- c(
  coef(ff_M1B1)[1], coef(ff_M1B2)[1], coef(ff_M1B3)[1],
  coef(ff_M2B1)[1], coef(ff_M2B2)[1], coef(ff_M2B3)[1]
)


In [13]:
CAPM_R2 <- c(
  summary(capm_M1B1)$adj.r.squared, summary(capm_M1B2)$adj.r.squared,
  summary(capm_M1B3)$adj.r.squared, summary(capm_M2B1)$adj.r.squared,
  summary(capm_M2B2)$adj.r.squared, summary(capm_M2B3)$adj.r.squared
)

FF_R2 <- c(
  summary(ff_M1B1)$adj.r.squared, summary(ff_M1B2)$adj.r.squared,
  summary(ff_M1B3)$adj.r.squared, summary(ff_M2B1)$adj.r.squared,
  summary(ff_M2B2)$adj.r.squared, summary(ff_M2B3)$adj.r.squared
)


In [14]:
portfolio_names <- c("M1B1","M1B2","M1B3","M2B1","M2B2","M2B3")

comparison <- data.frame(
  Portfolio = portfolio_names,
  CAPM_Alpha = format(CAPM_Alpha, scientific = FALSE),
  FF_Alpha   = FF_Alpha,
  CAPM_R2    = CAPM_R2,
  FF_R2      = FF_R2
)


In [15]:
portfolios <- c("M1B1","M1B2","M1B3","M2B1","M2B2","M2B3")

f_test_results <- data.frame(
  Portfolio = portfolios,
  F_stat = NA,
  p_value = NA
)

for (i in 1:length(portfolios)) {
  p <- portfolios[i]
  
  capm <- lm(data[[paste0(p, "_excess")]] ~ data$Mkt_RF)
  ff   <- lm(data[[paste0(p, "_excess")]] ~ data$Mkt_RF + data$SMB + data$HML)

  test <- anova(capm, ff)

  f_test_results$F_stat[i]   <- test$F[2]
  f_test_results$p_value[i] <- test$`Pr(>F)`[2]
}



# Results

## Table 3: Alpha and Adjusted R-Squared comparison across CAPM and Fama-French

In [16]:
comparison

| Portfolio | CAPM_Alpha     | FF_Alpha      | CAPM_R2   | FF_R2     |
|-----------|----------------|---------------|-----------|-----------|
| M1B1      | -0.00327676256 | -0.0020400974 | 0.7174715 | 0.9696562 |
| M1B2      | 0.00097195301  | 0.0008649847  | 0.7461453 | 0.9738543 |
| M1B3      | 0.00134138055  | 0.0005287632  | 0.6624487 | 0.9911991 |
| M2B1      | 0.00100198485  | 0.0013607128  | 0.9453719 | 0.9820717 |
| M2B2      | 0.00001740949  | -0.0007956943 | 0.8369979 | 0.9205242 |
| M2B3      | 0.00041784285  | -0.0012055065 | 0.7147273 | 0.9496429 |


## Table 4: Nested F-test result

In [17]:
f_test_results

| Portfolio | F_stat    | p_value       |
|-----------|-----------|---------------|
| M1B1      | 1771.2259 | 1.381332e-206 |
| M1B2      | 1856.0702 | 1.920282e-210 |
| M1B3      | 7957.4327 | 0.0           |
| M2B1      | 437.017   | 9.658521e-104 |
| M2B2      | 224.8558  | 2.701041e-67  |
| M2B3      | 994.6445  | 7.733567e-161 |


# Discussion


This section interprets the empirical results presented in Tables 3 and 4, with the aim of determining whether the Fama–French Three-Factor Model (FF3) provides a superior explanation of portfolio excess returns compared to the Capital Asset Pricing Model (CAPM). The discussion connects the empirical findings to asset-pricing literature, including foundational work by Sharpe (1964) and the seminal multi-factor contributions of Fama and French (1993, 2004).

## Behaviour of Alpha Across Models
Table 3 highlights a central result: every CAPM alpha is non-zero, although all are small in absolute terms (at most about 0.33% per month). Once SMB and HML are introduced, the alphas of the three small-firm portfolios shrink in magnitude: M1B1 moves from −0.0033 to −0.0020, M1B2 from 0.00097 to 0.00086, and M1B3 from 0.0013 to 0.0005. For the three big-firm portfolios the FF3 alphas are instead somewhat larger in magnitude than their CAPM counterparts (M2B3, for example, moves from 0.0004 to −0.0012), although each stays below 0.15% per month.

The pattern for the small-firm portfolios is what Fama and French (1993) predict. In their framework, CAPM suffers from omitted-variable bias because it includes only market beta, ignoring risk associated with SMB and HML. CAPM’s intercept term ($\alpha_p$) absorbs the return premium associated with these omitted factors. When SMB and HML are added, this “missing” component is transferred to the factor loadings $\beta^{SMB}_{p}$ and $\beta^{HML}_{p}$, changing the intercept $\alpha_p$.

For the small-firm portfolios, our findings align with empirical studies showing that CAPM tends to produce non-zero alphas for portfolios sorted on size or book-to-market, whereas FF3 reduces these pricing errors considerably (Fama & French, 2004; Sattar, 2017). Economically, this suggests that what looks like abnormal performance under CAPM is at least partly compensation for bearing priced sources of systematic risk.

## Changes in Adjusted $R^2$ and Model Fit

Table 3 also shows a substantial improvement in adjusted $R^2$ when moving from CAPM to the FF3 model. Increases are most dramatic for the small-firm portfolios and the high book-to-market portfolios, consistent with the factor structure of Fama–French:

- M1B1: 0.72 → 0.97
- M1B3: 0.66 → 0.99
- M2B3: 0.71 → 0.95

Even portfolios where CAPM performs relatively well (such as M2B1 and M2B2) show noticeable improvements.

This provides strong evidence that market beta alone is insufficient, and that size and value factors capture meaningful time-series variation in returns. These patterns mirror findings from the original FF3 paper (Fama & French, 1993) and later reviews noting that multi-factor models do a better job of fitting return behaviour than the traditional CAPM (CFA Institute, 2022; Fama & French, 2015). The improvements in goodness-of-fit demonstrate that these systematic factors play an important role in explaining returns.

## Interpretation of the Nested F-Test

While changes in alpha and higher adjusted $R^2$ are informative, the nested F-test shown in Table 4 offers a formal test of whether SMB and HML are jointly significant. For every portfolio the F-statistic is extraordinarily large (for example, 1,771 for M1B1 and over 7,900 for M1B3), with $p$-values effectively equal to zero.

These results reject the null hypothesis introduced in the Hypothesis Testing section:

$$
H_0 : \beta^{SMB}_{p} = 0 \quad \text{and} \quad \beta^{HML}_{p} = 0.
$$

Statistically, this means that excluding SMB and HML (i.e., using CAPM) leads to a significantly worse model. Economically, the result indicates that the additional factors capture systematic behaviour in returns that cannot be explained by market exposure alone. This strongly supports the factor-based view of expected returns, as emphasized by Fama and French (1993, 2004) and echoed in more recent multi-factor evaluations (CFA Institute, 2022).

## Economic Interpretation
Combining the results from Tables 3 and 4 produces a strong set of conclusions:

- Alphas shrink under FF3 for the small-firm portfolios, indicating that SMB and HML explain return components that CAPM interprets as abnormal returns.
- Adjusted $R^2$ rises substantially, showing that FF3 explains far more time-series variation in portfolio excess returns.
- Nested F-tests reject CAPM, confirming that the size and value factors jointly add explanatory power to the model.
- The factor loadings behave as theory predicts: small-cap portfolios load positively on SMB, high book-to-market portfolios load positively on HML, and growth or large-cap portfolios load negatively or only weakly on these factors.

These results reinforce the idea that markets reward investors for taking on risk associated with company size and relative valuation.

# Specification Tests

## 1. Serial Correlation (Breusch-Godfrey) Test

We assess model adequacy by testing whether the residuals from each regression exhibit serial correlation using the Breusch–Godfrey test with 12 lags. The Breusch–Godfrey (BG) test is a general test for autocorrelation in regression residuals that is appropriate for time-series settings where error dependence may extend beyond first order.

The null hypothesis is that the residuals are not autocorrelated up to the specified lag length. Rejecting the null indicates that the model leaves systematic time-dependence in the errors, implying misspecification—typically due to omitted dynamics or omitted risk factors. Because asset-pricing regressions commonly feature persistent shocks, this diagnostic helps determine whether CAPM or FF3 provides a more complete description of return-generating processes.
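
The mechanics of the test can be sketched in base R: regress the OLS residuals on the original regressors plus their own lags, then compare $n R^2$ from that auxiliary regression with a chi-squared distribution whose degrees of freedom equal the number of lags. The example below uses simulated data with AR(1) errors (all values made up) and pads pre-sample residuals with zeros, which is our assumption about the convention; it illustrates the idea and is not a replacement for `bgtest()`.

```r
set.seed(7)
n <- 300

# Regression with AR(1) errors, so serial correlation is present by construction
x <- rnorm(n)
e <- as.numeric(arima.sim(list(ar = 0.5), n = n))
y <- 0.1 + 1.2 * x + e

fit <- lm(y ~ x)
u   <- unname(resid(fit))
p   <- 12  # number of lags, matching the lag length used in the text

# Matrix of lagged residuals; pre-sample values are padded with zeros
lags <- sapply(1:p, function(j) c(rep(0, j), u[1:(n - j)]))

# Auxiliary regression of residuals on the regressor and the lagged residuals
aux     <- lm(u ~ x + lags)
lm_stat <- n * summary(aux)$r.squared
p_value <- pchisq(lm_stat, df = p, lower.tail = FALSE)

c(LM = lm_stat, p = p_value)
```

A small p-value here reflects the autocorrelation built into the simulated errors; with serially uncorrelated errors the statistic would typically be unremarkable.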


In [18]:

models_capm <- list(capm_M1B1, capm_M1B2, capm_M1B3,
                    capm_M2B1, capm_M2B2, capm_M2B3)

models_ff3  <- list(ff_M1B1, ff_M1B2, ff_M1B3,
                    ff_M2B1, ff_M2B2, ff_M2B3)

portfolios <- c("M1B1","M1B2","M1B3","M2B1","M2B2","M2B3")

# store p-values for a 12-lag BG test
bg_results <- data.frame(
  Portfolio     = portfolios,
  BG_CAPM_p     = NA_real_,
  BG_FF3_p      = NA_real_
)

for (i in seq_along(portfolios)) {
  bg_results$BG_CAPM_p[i] <- bgtest(models_capm[[i]], order = 12)$p.value
  bg_results$BG_FF3_p[i]  <- bgtest(models_ff3[[i]],  order = 12)$p.value
}

bg_results


| Portfolio | BG_CAPM_p   | BG_FF3_p  |
|-----------|-------------|-----------|
| M1B1      | 0.585138885 | 0.8170776 |
| M1B2      | 0.022122922 | 0.1403307 |
| M1B3      | 0.008721928 | 0.3892102 |
| M2B1      | 0.048640018 | 0.1444177 |
| M2B2      | 0.030783548 | 0.5312234 |
| M2B3      | 0.448744524 | 0.4169538 |


At the 5% significance level, the CAPM specification rejects the null of no autocorrelation for four of the six portfolios (M1B2, M1B3, M2B1, M2B2), although the rejection for M2B1 is marginal ($p \approx 0.049$). This indicates serial correlation in the CAPM residuals and suggests that relevant dynamics or factors are omitted from the model. In contrast, the Fama–French Three-Factor Model passes the test for all six portfolios, with every p-value above 0.14. This result implies that the inclusion of the SMB and HML factors removes systematic patterns in the residuals and leads to a better-specified model. Overall, the specification check supports the conclusion that the FF3 model provides a more reliable representation of portfolio return dynamics than CAPM.

## 2. Multicollinearity

To assess whether our Fama–French Three-Factor (FF3) regressions suffer from multicollinearity, 
we compute **variance inflation factors (VIFs)** for the three explanatory variables in the FF3 model:

- $Mkt\_RF_t$ (market excess return),
- $SMB_t$ (size factor: small minus big), and
- $HML_t$ (value factor: high minus low).

For each factor $X_j$, the VIF is defined as
$$
VIF_j = \frac{1}{1 - R_j^2},
$$
where $R_j^2$ is the coefficient of determination from regressing $X_j$ on the remaining factors.
Because VIF depends only on the correlation structure among the regressors, and the factor series
$Mkt\_RF_t$, $SMB_t$, and $HML_t$ are identical across portfolios, the resulting VIF values
are the same for all six portfolio regressions.
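The definition can be illustrated with a short base-R sketch that computes each VIF from its auxiliary regression and cross-checks it against the diagonal of the inverse correlation matrix of the regressors. The three series below are simulated stand-ins for the factors, with correlations chosen arbitrarily for illustration, so the numbers will not match the notebook's results. No dependent variable appears anywhere in the calculation, which is why the VIFs cannot differ across portfolios.

```r
# Sketch only: VIFs by hand on simulated factor series (not the French library data).
set.seed(42)
n   <- 500
mkt <- rnorm(n)
smb <- 0.25 * mkt + rnorm(n)     # mild correlation with the market (assumed)
hml <- -0.15 * mkt + rnorm(n)    # mild negative correlation (assumed)
factors <- data.frame(Mkt_RF = mkt, SMB = smb, HML = hml)

# VIF_j = 1 / (1 - R_j^2), with R_j^2 from regressing factor j on the others
vif_by_hand <- sapply(names(factors), function(j) {
  others <- setdiff(names(factors), j)
  aux    <- lm(reformulate(others, response = j), data = factors)
  1 / (1 - summary(aux)$r.squared)
})

# Equivalent shortcut: diagonal of the inverse correlation matrix
vif_from_cor <- diag(solve(cor(factors)))

round(rbind(by_hand = vif_by_hand, from_cor = vif_from_cor), 4)
```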



In [19]:


# Create an empty results table (one row per portfolio)
vif_results <- data.frame(
  Portfolio  = portfolios,
  VIF_Mkt_RF = NA_real_,
  VIF_SMB    = NA_real_,
  VIF_HML    = NA_real_
)

# Loop through portfolios and compute VIFs
# (vif() comes from the car package, assumed loaded in an earlier cell)
for (i in seq_along(portfolios)) {

  p <- portfolios[i]

  # Fit the FF3 regression for portfolio p
  ff_model <- lm(
    data[[paste0(p, "_excess")]] ~ data$Mkt_RF + data$SMB + data$HML
  )

  # VIFs are returned in regressor order: Mkt_RF, SMB, HML
  v <- vif(ff_model)

  # Store results
  vif_results$VIF_Mkt_RF[i] <- v[1]
  vif_results$VIF_SMB[i]    <- v[2]
  vif_results$VIF_HML[i]    <- v[3]
}

# Display VIF table
vif_results


| Portfolio | VIF_Mkt_RF | VIF_SMB | VIF_HML |
|---|---|---|---|
| M1B1 | 1.070452 | 1.089576 | 1.037944 |
| M1B2 | 1.070452 | 1.089576 | 1.037944 |
| M1B3 | 1.070452 | 1.089576 | 1.037944 |
| M2B1 | 1.070452 | 1.089576 | 1.037944 |
| M2B2 | 1.070452 | 1.089576 | 1.037944 |
| M2B3 | 1.070452 | 1.089576 | 1.037944 |


The estimated VIFs are:

- $VIF_{Mkt\_RF} \approx 1.07$
- $VIF_{SMB} \approx 1.09$
- $VIF_{HML} \approx 1.04$

All values lie close to 1 and are far below conventional concern thresholds (e.g., 5 or 10), 
indicating that **multicollinearity among the FF3 factors is negligible**. 
Consequently, the FF3 slope coefficients are well identified, and multicollinearity does not 
distort our comparison of CAPM and FF3 in terms of factor loadings, alphas, or model fit.
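As a rough worked check on these magnitudes, the VIF definition can be inverted to recover the auxiliary $R_j^2$ implied by each reported value (rounded here to four decimals):

$$
R_{Mkt\_RF}^2 = 1 - \frac{1}{1.0705} \approx 0.066, \qquad
R_{SMB}^2 = 1 - \frac{1}{1.0896} \approx 0.082, \qquad
R_{HML}^2 = 1 - \frac{1}{1.0379} \approx 0.037
$$

Since the sampling variance of a slope coefficient scales with its VIF, the standard error scales with its square root:

$$
\sqrt{VIF_{Mkt\_RF}} \approx 1.035, \qquad
\sqrt{VIF_{SMB}} \approx 1.044, \qquad
\sqrt{VIF_{HML}} \approx 1.019
$$

In other words, the other two factors explain at most about 8% of the variation in any one factor, and the standard errors are inflated by less than 5% relative to the case of uncorrelated regressors.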


# Robustness Check

## Newey–West HAC Standard Errors

Financial return series often display heteroskedasticity and autocorrelation, violating key OLS assumptions. These issues can lead to biased standard errors and inflated t-statistics, potentially overstating the significance of estimated alphas.

To ensure that our conclusions are not driven by such misspecification, we re-estimate all CAPM and FF3 regressions using Newey–West heteroskedasticity and autocorrelation-consistent (HAC) standard errors. The Newey–West adjustment corrects inference while keeping coefficient estimates unchanged, allowing us to verify whether the relative performance of CAPM and FF3 remains robust under weaker statistical assumptions.
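The idea behind the adjustment can be sketched in base R. The example below builds a Bartlett-kernel HAC covariance matrix by hand for a simulated regression and compares the resulting standard errors with the conventional OLS ones. The data, the AR(1) coefficient of 0.5, and all variable names are assumptions made for illustration. The sketch applies no prewhitening or small-sample adjustment, so it should not be expected to reproduce the output of a packaged routine run with different settings.

```r
# Sketch only: Bartlett-kernel (Newey-West style) HAC covariance on simulated data.
set.seed(7)
n <- 240
x <- rnorm(n)                                   # stand-in for the market factor (assumed)
u <- as.numeric(arima.sim(list(ar = 0.5), n))   # autocorrelated errors (assumed)
y <- 0.2 + 1.1 * x + u

fit <- lm(y ~ x)
X   <- model.matrix(fit)
e   <- residuals(fit)
L   <- 12                                       # truncation lag

scores <- X * e                                 # row t holds x_t * e_t
S <- crossprod(scores)                          # lag-0 (heteroskedasticity) term
for (l in 1:L) {
  w <- 1 - l / (L + 1)                          # Bartlett weight
  G <- crossprod(scores[(l + 1):n, , drop = FALSE],
                 scores[1:(n - l), , drop = FALSE])
  S <- S + w * (G + t(G))                       # add weighted autocovariance terms
}

bread  <- solve(crossprod(X))
V_hac  <- bread %*% S %*% bread                 # sandwich covariance
se_hac <- sqrt(diag(V_hac))
se_ols <- sqrt(diag(vcov(fit)))

# Coefficients are untouched; only the standard errors differ
cbind(estimate = coef(fit), se_ols = se_ols, se_hac = se_hac)
```

The coefficient column is the plain OLS estimate in both cases, which is the sense in which the correction affects inference but not the point estimates.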

In [20]:

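nw_results <- data.frame(
  Portfolio  = portfolios,
  CAPM_alpha = NA_real_,
  CAPM_p     = NA_real_,
  FF3_alpha  = NA_real_,
  FF3_p      = NA_real_
)

# coeftest() and NeweyWest() come from the lmtest and sandwich packages
# (assumed loaded in an earlier cell). Note that NeweyWest() applies
# prewhitening by default (prewhite = TRUE).
for (i in seq_along(portfolios)) {
  p <- portfolios[i]
  
  # CAPM regression with 12-lag Newey–West standard errors
  capm <- lm(data[[paste0(p, "_excess")]] ~ data$Mkt_RF)
  capm_nw <- coeftest(capm, vcov = NeweyWest(capm, lag = 12))
  
  # FF3 regression with 12-lag Newey–West standard errors
  ff <- lm(data[[paste0(p, "_excess")]] ~ data$Mkt_RF + data$SMB + data$HML)
  ff_nw <- coeftest(ff, vcov = NeweyWest(ff, lag = 12))
  
  # Store the intercept (alpha) estimates and their HAC p-values
  nw_results$CAPM_alpha[i] <- capm_nw["(Intercept)", "Estimate"]
  nw_results$CAPM_p[i]     <- capm_nw["(Intercept)", "Pr(>|t|)"]
  
  nw_results$FF3_alpha[i]  <- ff_nw["(Intercept)", "Estimate"]
  nw_results$FF3_p[i]      <- ff_nw["(Intercept)", "Pr(>|t|)"]
}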



## Table 5: Re-estimating CAPM and Fama-French models using Newey–West (1987) heteroskedasticity and autocorrelation-consistent (HAC) standard errors

In [21]:
nw_results

| Portfolio | CAPM_alpha | CAPM_p | FF3_alpha | FF3_p |
|---|---|---|---|---|
| M1B1 | -0.003276763 | 0.04277761 | -0.0020400974 | 0.0029408438 |
| M1B2 | 0.000971953 | 0.57546901 | 0.0008649847 | 0.1029807665 |
| M1B3 | 0.001341381 | 0.56358273 | 0.0005287632 | 0.0748151081 |
| M2B1 | 0.001001985 | 0.11938296 | 0.0013607128 | 1.70653e-05 |
| M2B2 | 1.740949e-05 | 0.98877736 | -0.0007956943 | 0.2617739403 |
| M2B3 | 0.0004178428 | 0.83541532 | -0.0012055065 | 0.1227885286 |


Because the Newey–West adjustment changes only the standard errors, the CAPM and FF3 alpha estimates in Table 5 are identical to their OLS values by construction; what changes is the inference drawn from them. Under HAC standard errors, only one CAPM alpha is significant at the 5% level (M1B1, p ≈ 0.043), and the remaining five have p-values above 0.11. Under FF3, four of the six alphas are insignificant at the 5% level, but those of M1B1 (p ≈ 0.003) and M2B1 (p < 0.001) remain significant even though both are small in magnitude (about -0.0020 and 0.0014 per month). This pattern is consistent with the FF3 regressions producing much tighter standard errors, so that even small intercepts can be statistically distinguishable from zero. In absolute value, the FF3 alphas are smaller than the CAPM alphas for M1B1, M1B2, and M1B3, and larger for M2B1, M2B2, and M2B3. Overall, the Newey–West robustness check supports the view that the Fama–French model captures more of the systematic variation in portfolio returns than CAPM, while indicating that SMB and HML reduce rather than fully eliminate abnormal returns.
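The significance pattern in Table 5 can be tabulated directly from the printed values. The sketch below re-enters those values by hand (the data frame `table5` and the flag columns are illustrative names, not objects from the notebook) and marks, for each portfolio, whether each alpha is significant at the 5% level and whether the FF3 alpha is smaller in absolute value than the CAPM alpha.

```r
# Sketch: flag 5% significance and alpha shrinkage using the values printed in Table 5.
table5 <- data.frame(
  Portfolio  = c("M1B1", "M1B2", "M1B3", "M2B1", "M2B2", "M2B3"),
  CAPM_alpha = c(-0.003276763, 0.000971953, 0.001341381,
                 0.001001985, 1.740949e-05, 0.0004178428),
  CAPM_p     = c(0.04277761, 0.57546901, 0.56358273,
                 0.11938296, 0.98877736, 0.83541532),
  FF3_alpha  = c(-0.0020400974, 0.0008649847, 0.0005287632,
                 0.0013607128, -0.0007956943, -0.0012055065),
  FF3_p      = c(0.0029408438, 0.1029807665, 0.0748151081,
                 1.70653e-05, 0.2617739403, 0.1227885286),
  stringsAsFactors = FALSE
)

# Significance flags at the 5% level
table5$CAPM_sig_5pct <- table5$CAPM_p < 0.05
table5$FF3_sig_5pct  <- table5$FF3_p  < 0.05

# Does FF3 shrink the alpha in absolute value?
table5$FF3_smaller_abs_alpha <- abs(table5$FF3_alpha) < abs(table5$CAPM_alpha)

table5[, c("Portfolio", "CAPM_sig_5pct", "FF3_sig_5pct", "FF3_smaller_abs_alpha")]
```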

# Concluding Remarks

The research question asks:

To what extent do the size (SMB) and value (HML) factors in the Fama–French Three-Factor Model reduce or eliminate the abnormal returns that appear under CAPM?

Based on the empirical findings, the factors reduce CAPM's abnormal returns substantially but do not eliminate them entirely. The FF3 model yields stronger model fit, statistically significant improvements over CAPM, and residuals without detectable serial correlation, and it produces smaller alphas in absolute value for the three M1 portfolios. At the same time, the alphas of M1B1 and M2B1 remain statistically significant under FF3 with Newey–West standard errors (Table 5). The evidence suggests that CAPM's apparent mispricing is largely due to omitted variables (like SMB and HML), supporting the broader multi-factor asset-pricing framework advocated by Fama and French (1993, 2004).
Overall, the empirical evidence presented in this analysis leads to the conclusion that the Fama–French Three-Factor Model is materially superior to CAPM for explaining the excess returns of the six size–value portfolios in this sample (Fama & French, 1993, 2004; Liu et al., 2022).


# References

- Coumarianos, J. (2019, May 5). Why the good time for growth stocks may not be over. The Wall Street Journal. https://www.wsj.com/articles/why-the-good-time-for-growth-stocks-may-not-be-over-11557108181

- Fama, E. F., & French, K. R. (1993). Common risk factors in the returns on stocks and bonds. Journal of Financial Economics, 33(1), 3–56. https://doi.org/10.1016/0304-405X(93)90023-5

- Fama, E. F., & French, K. R. (2004). The capital asset pricing model: Theory and evidence. Journal of Economic Perspectives, 18(3), 25–46. https://mba.tuck.dartmouth.edu/bespeneckbo/default/AFA611-Eckbo%20web%20site/AFA611-S6B-FamaFrench-CAPM-JEP04.pdf

- French, K. R. (n.d.). Kenneth R. French – Data library. https://mba.tuck.dartmouth.edu/pages/faculty/ken.french/data_library.html

- Liu, Y., Horstmeyer, D., & Wilkins, A. (2022, January 14). Fama and French: The five-factor model revisited. CFA Institute Enterprising Investor. https://blogs.cfainstitute.org/investor/2022/01/10/fama-and-french-the-five-factor-model-revisited/

- OpenAI. (2025). ChatGPT (GPT-5) [Large language model]. https://chatgpt.com/

- Sattar, M. (2017, April 27). CAPM vs Fama–French three-factor model: An evaluation of effectiveness in explaining excess return in Dhaka Stock Exchange. International Journal of Business and Management. https://www.ccsenet.org/journal/index.php/ijbm/article/view/66464

- Sharpe, W. F. (n.d.). Capital asset prices: A theory of market equilibrium under conditions ... https://www.jstor.org/stable/2977928?seq=1