# Fama-French Three-Factor Model

The `Fama-French Three-Factor Model` is an extension of the Capital Asset Pricing Model (CAPM). \
It aims to describe stock returns through three factors: \
(1) market risk \
(2) the outperformance of small-cap companies relative to large-cap companies \
(3) the outperformance of high book-to-market companies relative to low book-to-market companies. \
The rationale behind the model is that value and small-cap companies tend to outperform the overall market on a regular basis. (CFI Team)
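
For reference, the three factors are commonly written as a time-series regression. The notation below is an assumption for illustration and is not taken from the source: $r_{i,t}$ is the asset return, $r_{f,t}$ the risk-free rate, $r_{m,t}$ the market return, and $SMB_t$ and $HML_t$ the size and value factors.

$$
r_{i,t} - r_{f,t} = \alpha_i + \beta_i^{mkt}\,(r_{m,t} - r_{f,t}) + \beta_i^{smb}\,SMB_t + \beta_i^{hml}\,HML_t + \varepsilon_{i,t}
$$

Under this reading, the formula `excess_rtn ~ mkt + smb + hml` used in step 8 estimates $\alpha_i$ as the intercept and the three $\beta$ coefficients as slopes, assuming the `mkt` column already holds the market return net of the risk-free rate.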

In [1]:
import pandas as pd
import yfinance as yf
import statsmodels.formula.api as smf

1- Define parameters

In [7]:
RISKY_ASSET = 'META'
START_DATE = '2013-12-31'
END_DATE = '2018-12-31'

2- Load data from the source CSV file and keep only the monthly data

In [3]:
# Load data from CSV
factor_df = pd.read_csv('./data/F-F_Research_Data_Factors.CSV', skiprows=3)

# Identify where the annual data starts
STR_TO_MATCH = ' Annual Factors: January-December '
indices = factor_df.iloc[:, 0] == STR_TO_MATCH
start_of_annual = factor_df[indices].index[0]

# Keep only monthly data (copy so later column assignments do not act on a slice)
factor_df = factor_df[factor_df.index < start_of_annual].copy()
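
The split between monthly and annual rows can be tried on a small in-memory sample without the real file. The sketch below is only an illustration: `SAMPLE`, `keep_monthly`, and the annual figures are hypothetical, and it matches the marker by substring rather than by exact equality, which is a swapped-in variation on the cell above.

```python
import io
import pandas as pd

# Hypothetical miniature of the factor file: two monthly rows, the marker
# line, a repeated header, and one annual row with made-up placeholder values.
SAMPLE = """,Mkt-RF,SMB,HML,RF
201401,-3.32,0.85,-2.09,0.00
201402,4.65,0.34,-0.40,0.00
 Annual Factors: January-December ,,,,
,Mkt-RF,SMB,HML,RF
2014,1.00,2.00,3.00,0.01
"""

def keep_monthly(df, marker='Annual Factors'):
    """Return the rows before the first row whose first column contains `marker`."""
    first_col = df.iloc[:, 0].astype(str)
    hits = first_col.str.contains(marker, regex=False, na=False)
    if not hits.any():
        return df.copy()              # no annual block found, keep everything
    cutoff = hits.to_numpy().argmax() # position of the first matching row
    return df.iloc[:cutoff].copy()

sample_df = pd.read_csv(io.StringIO(SAMPLE))
monthly_df = keep_monthly(sample_df)
print(monthly_df)
```

A substring match does not depend on the exact padding spaces around the marker text, which may make it a little more tolerant if the file layout changes slightly.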

3- Rename the columns of the DataFrame, set the date as the index, and filter by date

In [4]:
# Rename columns
factor_df.columns = ['date', 'mkt', 'smb', 'hml', 'rf']

# Parse the YYYYMM strings and reformat them as 'YYYY-MM' labels
factor_df['date'] = pd.to_datetime(factor_df['date'], format='%Y%m').dt.strftime("%Y-%m")

# Set index
factor_df = factor_df.set_index('date')

# Filter only required dates
factor_df = factor_df.loc[START_DATE:END_DATE]
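
Because the index holds `'YYYY-MM'` strings rather than timestamps, the date filter is a lexicographic label slice. The toy frame below (made-up values, hypothetical name `demo`) is a small sketch of how the bounds behave with the `START_DATE` and `END_DATE` used here.

```python
import pandas as pd

# Sorted string index in the same 'YYYY-MM' form as factor_df; values are made up.
idx = ['2013-11', '2013-12', '2014-01', '2018-12', '2019-01']
demo = pd.DataFrame({'mkt': [0.1, 0.2, 0.3, 0.4, 0.5]}, index=idx)

# Strings are compared character by character, so '2013-12' sorts before
# '2013-12-31' (excluded) and '2018-12' sorts before '2018-12-31' (included).
window = demo.loc['2013-12-31':'2018-12-31']
print(window.index.tolist())
```

The slice relies on the index being sorted; on an unsorted string index, slicing with labels that are not present would raise an error.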

4- Convert the values to numeric and divide by 100

In [5]:
factor_df = factor_df.apply(pd.to_numeric, errors='coerce').div(100)
factor_df.head()

date,mkt,smb,hml,rf
2014-01,-0.0332,0.0085,-0.0209,0.0
2014-02,0.0465,0.0034,-0.004,0.0
2014-03,0.0043,-0.0189,0.0509,0.0
2014-04,-0.0019,-0.0424,0.0114,0.0
2014-05,0.0206,-0.0186,-0.0027,0.0
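
With `errors='coerce'`, any cell that cannot be parsed becomes `NaN` without raising, so it can be useful to count the coerced cells afterwards. A minimal sketch on made-up strings (the frame `raw` is hypothetical):

```python
import pandas as pd

# Made-up raw strings, one of which is not a number.
raw = pd.DataFrame({'mkt': ['-3.32', '4.65', 'n/a'],
                    'rf': ['0.00', '0.01', '0.02']})

# Same conversion as above: parse to numbers, then percent -> decimal.
clean = raw.apply(pd.to_numeric, errors='coerce').div(100)

# Count the cells that could not be parsed.
n_bad = int(clean.isna().sum().sum())
print(clean)
print(f'{n_bad} value(s) could not be parsed')
```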


5- Download the prices of the risky asset

In [8]:
asset_df = yf.download(RISKY_ASSET, start=START_DATE, end=END_DATE)
print(f'Downloaded {asset_df.shape[0]} rows of data')

[*********************100%%**********************]  1 of 1 completed
Downloaded 1258 rows of data


6- Calculate monthly returns on the risky asset

In [9]:
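# Take the last adjusted close of each month, then the month-over-month change.
# Note: pandas 2.2+ deprecates the 'M' alias in favor of 'ME' (month end).
y = asset_df['Adj Close'].resample('M') \
                            .last() \
                            .pct_change() \
                            .dropna()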

y.index = y.index.strftime('%Y-%m')
y.name = 'rtn'
y.head()

Date
2014-01    0.144922
2014-02    0.094135
2014-03   -0.120070
2014-04   -0.007636
2014-05    0.058883
Name: rtn, dtype: float64
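
The arithmetic behind the monthly returns can be checked by hand on a few made-up prices. The sketch below groups by calendar month with `to_period('M')` instead of `resample`, which is a substitute technique chosen so the example does not depend on the resample frequency alias; `prices` and its values are hypothetical.

```python
import pandas as pd

# Made-up daily closes covering three months.
dates = pd.to_datetime(['2014-01-30', '2014-01-31',
                        '2014-02-27', '2014-02-28',
                        '2014-03-28', '2014-03-31'])
prices = pd.Series([100.0, 110.0, 120.0, 121.0, 100.0, 108.9], index=dates)

# Last available price in each calendar month.
month_end = prices.groupby(prices.index.to_period('M')).last()

# Simple return between consecutive month-end prices; the first month has
# no previous month-end price, so its return is NaN and is dropped.
monthly_rtn = month_end.pct_change().dropna()
monthly_rtn.index = monthly_rtn.index.strftime('%Y-%m')
print(monthly_rtn)
```

This also shows why `START_DATE` is set to the last day of 2013: the December 2013 close is needed to obtain a return for January 2014.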

7- Merge the datasets and calculate excess returns

In [10]:
ff_data = factor_df.join(y)
ff_data['excess_rtn'] = ff_data.rtn - ff_data.rf
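
`DataFrame.join` performs a left join on the index by default, so every factor month is kept and a month with no asset return ends up as `NaN`. A small sketch with made-up numbers (the names `factors`, `rtn`, and `merged` are hypothetical) shows one way to spot such gaps before estimating the model:

```python
import pandas as pd

# Made-up factor rows and asset returns; the asset has no value for 2014-02.
factors = pd.DataFrame({'mkt': [0.01, 0.02, 0.03],
                        'rf': [0.001, 0.001, 0.001]},
                       index=['2014-01', '2014-02', '2014-03'])
rtn = pd.Series([0.05, -0.02], index=['2014-01', '2014-03'], name='rtn')

merged = factors.join(rtn)                       # left join on the index
merged['excess_rtn'] = merged['rtn'] - merged['rf']

# Months present in the factor data but missing an asset return.
missing = merged.index[merged['rtn'].isna()].tolist()
print(merged)
print('Months without a return:', missing)
```

In the notebook itself, the regression summary reports 60 observations, which matches the 60 months from 2014-01 through 2018-12.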

8- Estimate the Three-Factor Model

In [11]:
# Define and fit the regression model
ff_model = smf.ols(formula='excess_rtn ~ mkt + smb + hml', data=ff_data).fit()

# Print results
print(ff_model.summary())

                            OLS Regression Results                            
Dep. Variable:             excess_rtn   R-squared:                       0.217
Model:                            OLS   Adj. R-squared:                  0.175
Method:                 Least Squares   F-statistic:                     5.175
Date:                Wed, 14 Feb 2024   Prob (F-statistic):            0.00316
Time:                        18:07:12   Log-Likelihood:                 88.392
No. Observations:                  60   AIC:                            -168.8
Df Residuals:                      56   BIC:                            -160.4
Df Model:                           3                                         
Covariance Type:            nonrobust                                         
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept      0.0105      0.008      1.373      0.1