## Fama-French regression in Python

$ r_{i} - r_{f} = \alpha_{i} + \beta_i (r_{M} - r_{f}) + s_i \cdot SMB + h_i \cdot HML + \epsilon_i$


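where

$r_{i} - r_{f}$ is the excess return of asset $i$ over the risk-free rate $r_{f}$,

$r_{M} - r_{f}$ is the excess return of the market,

SMB (Small Minus Big) captures the size effect: the return of small-cap stocks minus the return of large-cap stocks,

HML (High Minus Low) captures the value effect: the return of high book-to-market (value) stocks minus the return of low book-to-market (growth) stocks.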


In [2]:
import pandas_datareader.data as reader
import pandas as pd
import datetime as dt
import statsmodels.api as sm
 


In [12]:
import yfinance


end = dt.date(2024, 5, 30)
start = end - dt.timedelta(days=5 * 365)  # roughly five years of history
# FDGRX is a Fidelity growth fund
tickers = ['FDGRX']

data_df = yfinance.download(tickers, start, end)
data_df.head()



                 Open       High        Low      Close  Adj Close  Volume
Date
2019-06-03      18.01      18.01      18.01      18.01  12.973203       0
2019-06-04       18.5       18.5       18.5       18.5  13.326166       0
2019-06-05  18.690001  18.690001  18.690001  18.690001  13.463032       0
2019-06-06      18.75      18.75      18.75      18.75   13.50625       0
2019-06-07  19.059999  19.059999  19.059999  19.059999  13.729552       0


In [13]:
data_df = data_df['Adj Close']
data_df.head()

Date
2019-06-03    12.973203
2019-06-04    13.326166
2019-06-05    13.463032
2019-06-06    13.506250
2019-06-07    13.729552
Name: Adj Close, dtype: float64
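Selecting `'Adj Close'` works for the flat column layout shown above. Depending on the yfinance version and its `auto_adjust` setting, the download may instead come back with two-level columns or without an `Adj Close` column at all. The sketch below is one way to guard against that; `pick_price` is a hypothetical helper, and the column layouts it handles are assumptions rather than a documented contract.

```python
import pandas as pd

def pick_price(df: pd.DataFrame, ticker: str) -> pd.Series:
    """Return a price Series from a yfinance-style frame (hypothetical helper)."""
    if isinstance(df.columns, pd.MultiIndex):
        # Assumption: field names sit on level 0 and ticker symbols on level 1.
        df = df.xs(ticker, axis=1, level=1)
    # Assumption: when 'Adj Close' is absent, 'Close' already holds adjusted prices.
    column = "Adj Close" if "Adj Close" in df.columns else "Close"
    return df[column].rename(ticker)

# Flat columns, as in the output above.
flat = pd.DataFrame({"Close": [18.01, 18.5], "Adj Close": [12.973203, 13.326166]})

# Two-level columns without 'Adj Close' (made-up values).
nested = pd.DataFrame(
    [[12.97], [13.33]],
    columns=pd.MultiIndex.from_tuples([("Close", "FDGRX")]),
)

print(pick_price(flat, "FDGRX").tolist())
print(pick_price(nested, "FDGRX").tolist())
```

Either way the result is a single Series, so the `pct_change()` step that follows does not need to change.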

In [19]:
daily_returns = data_df.pct_change().dropna(axis = 0)
daily_returns.head()

Date
2019-06-04    0.027207
2019-06-05    0.010270
2019-06-06    0.003210
2019-06-07    0.016533
2019-06-10    0.003673
Name: Adj Close, dtype: float64

In [25]:
monthly_returns = daily_returns\
                    .resample('M')\
                    .agg(lambda x: (x+1).prod() - 1)
monthly_returns.shape

(60,)
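The aggregation `(x+1).prod() - 1` compounds the daily returns inside each month. As a sanity check, the compounded return of a full month should equal the ratio of that month's last price to the previous month's last price. The sketch below verifies this on made-up prices. It groups by monthly period instead of calling `resample('M')`, which is only a variation chosen here so the result does not depend on a frequency alias.

```python
import numpy as np
import pandas as pd

# Hypothetical daily prices over two months of business days (not real fund data).
dates = pd.bdate_range("2024-01-01", "2024-02-29")
prices = pd.Series(100.0 * 1.001 ** np.arange(len(dates)), index=dates)

daily = prices.pct_change().dropna()

# Compound the daily returns within each calendar month.
monthly = daily.groupby(daily.index.to_period("M")).agg(lambda x: (x + 1).prod() - 1)

# February's compounded return equals the ratio of the two month-end prices.
jan_end = prices.loc["2024-01"].iloc[-1]
feb_end = prices.loc["2024-02"].iloc[-1]
feb = pd.Period("2024-02", freq="M")
print(monthly.loc[feb], feb_end / jan_end - 1)
```

The first month is different: `pct_change()` drops the first observation, so that month's return is measured from the first available price rather than from the prior month-end. The same applies to June 2019 in the fund data, where the price series starts on 2019-06-03.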

In [27]:
monthly_returns.head()

Date
2019-06-30    0.088284
2019-07-31    0.016837
2019-08-31   -0.013046
2019-09-30   -0.016777
2019-10-31    0.038780
Freq: M, Name: Adj Close, dtype: float64

$ r_{i} - r_{f} = \alpha_{i} + \beta_i (r_{M} - r_{f}) + s_i \cdot SMB + h_i \cdot HML + \epsilon_i$

We need the market factor, SMB, and HML. We will use Kenneth French's data library.


In [38]:
# Note: monthly_returns are raw returns; the risk-free rate (RF) still has to be subtracted.
# RF comes with the factor data downloaded here.
# https://mba.tuck.dartmouth.edu/pages/faculty/ken.french/data_library.html
factors = reader.DataReader('F-F_Research_Data_Factors', 
                       'famafrench', 
                       start = monthly_returns.index[0],
                       end = monthly_returns.index[-1]
                       )



In [39]:
print(f"start={start} end={end}")
print(factors.keys())
print(factors[0].head())
print(factors[1].head())
print(factors['DESCR'])


start=2019-06-01 end=2024-05-30
dict_keys([0, 1, 'DESCR'])
         Mkt-RF   SMB   HML    RF
Date                             
2019-06    6.93  0.29 -0.71  0.18
2019-07    1.19 -1.93  0.48  0.19
2019-08   -2.58 -2.38 -4.78  0.16
2019-09    1.43 -0.96  6.75  0.18
2019-10    2.06  0.29 -1.91  0.16
      Mkt-RF    SMB    HML    RF
Date                            
2019   28.28  -6.14 -10.46  2.15
2020   23.66  13.18 -46.67  0.45
2021   23.56  -3.89  25.49  0.04
2022  -21.60  -6.95  25.81  1.43
2023   21.70  -3.24 -13.60  4.95
F-F Research Data Factors
-------------------------

This file was created by CMPT_ME_BEME_RETS using the 202404 CRSP database. The 1-month TBill return is from Ibbotson and Associates, Inc. Copyright 2024 Kenneth R. French

  0 : (59 rows x 4 cols)
  1 : Annual Factors: January-December (5 rows x 4 cols)


In [48]:
# factors[0] holds the monthly factors (factors[1] is annual); we want the monthly ones
print(factors[0].head())
print(factors[0].tail())




         Mkt-RF   SMB   HML    RF
Date                             
2019-06    6.93  0.29 -0.71  0.18
2019-07    1.19 -1.93  0.48  0.19
2019-08   -2.58 -2.38 -4.78  0.16
2019-09    1.43 -0.96  6.75  0.18
2019-10    2.06  0.29 -1.91  0.16
         Mkt-RF   SMB   HML    RF
Date                             
2023-12    4.87  6.34  4.93  0.43
2024-01    0.70 -5.09 -2.38  0.47
2024-02    5.06 -0.24 -3.49  0.42
2024-03    2.83 -2.49  4.19  0.43
2024-04   -4.67 -2.39 -0.51  0.47


In [47]:
# Let's see what we have in our series: it starts in June 2019, which matches the factors above,
# but it runs through May 2024, one month past the last factor row (April 2024)
print(monthly_returns.head())
print(monthly_returns.tail())

Date
2019-06-30    0.088284
2019-07-31    0.016837
2019-08-31   -0.013046
2019-09-30   -0.016777
2019-10-31    0.038780
Freq: M, Name: Adj Close, dtype: float64
Date
2024-01-31    0.030388
2024-02-29    0.090909
2024-03-31    0.026756
2024-04-30   -0.044517
2024-05-31    0.096023
Freq: M, Name: Adj Close, dtype: float64


In [45]:
monthly_returns.shape
factors[0].shape

(59, 4)

In [49]:
# The index of monthly_returns uses month-end dates (2019-06-30), while factors[0] uses monthly periods (2019-06).
# Convert the month-end dates to monthly periods so that the two indexes align.


In [56]:
monthly_returns.index = monthly_returns.index.to_period()
monthly_returns.head()

Date
2019-06    0.088284
2019-07    0.016837
2019-08   -0.013046
2019-09   -0.016777
2019-10    0.038780
Freq: M, Name: Adj Close, dtype: float64
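Calling `to_period()` with no argument works above because the resampled index already carries a monthly frequency. When an index has no frequency and pandas cannot infer one, the frequency has to be passed explicitly. A minimal sketch with made-up, irregular timestamps:

```python
import pandas as pd

# Hypothetical, irregularly spaced timestamps: no frequency can be inferred here.
stamps = pd.DatetimeIndex(["2019-06-30", "2019-07-15", "2019-10-31"])

# Passing "M" maps every timestamp to the calendar month that contains it.
periods = stamps.to_period("M")
print(periods)
```

Passing the frequency explicitly also documents the intent, so `monthly_returns.index.to_period("M")` is a slightly more defensive spelling of the cell above.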

In [52]:
factors[0].index

PeriodIndex(['2019-06', '2019-07', '2019-08', '2019-09', '2019-10', '2019-11',
             '2019-12', '2020-01', '2020-02', '2020-03', '2020-04', '2020-05',
             '2020-06', '2020-07', '2020-08', '2020-09', '2020-10', '2020-11',
             '2020-12', '2021-01', '2021-02', '2021-03', '2021-04', '2021-05',
             '2021-06', '2021-07', '2021-08', '2021-09', '2021-10', '2021-11',
             '2021-12', '2022-01', '2022-02', '2022-03', '2022-04', '2022-05',
             '2022-06', '2022-07', '2022-08', '2022-09', '2022-10', '2022-11',
             '2022-12', '2023-01', '2023-02', '2023-03', '2023-04', '2023-05',
             '2023-06', '2023-07', '2023-08', '2023-09', '2023-10', '2023-11',
             '2023-12', '2024-01', '2024-02', '2024-03', '2024-04'],
            dtype='period[M]', name='Date')

In [60]:
monthly_returns.to_frame().head()
factors[0]

         Mkt-RF    SMB     HML    RF
Date
2019-06    6.93   0.29   -0.71  0.18
2019-07    1.19  -1.93    0.48  0.19
2019-08   -2.58  -2.38   -4.78  0.16
2019-09    1.43  -0.96    6.75  0.18
2019-10    2.06   0.29   -1.91  0.16
2019-11    3.87   0.77   -2.02  0.12
2019-12    2.77   0.73    1.75  0.14
2020-01   -0.11  -3.11   -6.25  0.13
2020-02   -8.13   1.07   -3.81  0.12
2020-03  -13.39  -4.83  -13.87  0.13


In [63]:
merged_df = monthly_returns.to_frame().join(factors[0])
merged_df.head()

         Adj Close  Mkt-RF    SMB    HML    RF
Date
2019-06   0.088284    6.93   0.29  -0.71  0.18
2019-07   0.016837    1.19  -1.93   0.48  0.19
2019-08  -0.013046   -2.58  -2.38  -4.78  0.16
2019-09  -0.016777    1.43  -0.96   6.75  0.18
2019-10    0.03878    2.06   0.29  -1.91  0.16


In [65]:
# Drop months that have no factor data (the factor file ends in April 2024)
merged_df = merged_df.dropna(axis=0)

In [66]:
merged_df.tail()

         Adj Close  Mkt-RF    SMB    HML    RF
Date
2023-12   0.059996    4.87   6.34   4.93  0.43
2024-01   0.030388     0.7  -5.09  -2.38  0.47
2024-02   0.090909    5.06  -0.24  -3.49  0.42
2024-03   0.026756    2.83  -2.49   4.19  0.43
2024-04  -0.044517   -4.67  -2.39  -0.51  0.47
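The tail above ends at 2024-04 even though the fund returns run through 2024-05. `join` is a left join by default, so the month without factor data is kept with `NaN` factors and then removed by `dropna`. A small sketch of that behaviour; the return values are made up and only one factor column is included for brevity:

```python
import pandas as pd

# Hypothetical monthly returns for three months, indexed by monthly period.
returns = pd.Series(
    [0.01, 0.02, 0.03],
    index=pd.period_range("2024-03", periods=3, freq="M"),
    name="ret",
)

# Factor rows exist only for the first two months.
factors = pd.DataFrame(
    {"Mkt-RF": [2.83, -4.67]},
    index=pd.period_range("2024-03", periods=2, freq="M"),
)

joined = returns.to_frame().join(factors)  # left join: all three months are kept
complete = joined.dropna()                 # 2024-05 has no factor row, so it is dropped

print(joined)
print(complete)
```

This is why the regression further down runs on 59 observations rather than 60.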


In [68]:
# The factor data is in percent; convert it to decimals to match the fund returns
merged_df[['Mkt-RF', 'SMB', 'HML', 'RF']] = merged_df[['Mkt-RF', 'SMB', 'HML', 'RF']] / 100


In [69]:
# The fund's monthly returns are in "Adj Close"; subtract RF to get excess returns
merged_df['returns-rf'] = merged_df["Adj Close"] - merged_df.RF

In [71]:
merged_df

         Adj Close   Mkt-RF      SMB      HML      RF  returns-rf
Date
2019-06   0.088284   0.0693   0.0029  -0.0071  0.0018    0.086484
2019-07   0.016837   0.0119  -0.0193   0.0048  0.0019    0.014937
2019-08  -0.013046  -0.0258  -0.0238  -0.0478  0.0016   -0.014646
2019-09  -0.016777   0.0143  -0.0096   0.0675  0.0018   -0.018577
2019-10    0.03878   0.0206   0.0029  -0.0191  0.0016     0.03718
2019-11   0.072175   0.0387   0.0077  -0.0202  0.0012    0.070975
2019-12   0.029354   0.0277   0.0073   0.0175  0.0014    0.027954
2020-01   0.027622  -0.0011  -0.0311  -0.0625  0.0013    0.026322
2020-02  -0.044191  -0.0813   0.0107  -0.0381  0.0012   -0.045391
2020-03  -0.102479  -0.1339  -0.0483  -0.1387  0.0013   -0.103779
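One thing to watch in a notebook: the scaling cell overwrites the factor columns in place, so running it twice divides by 100 twice. A possible alternative, sketched below, keeps the raw table untouched and builds the scaled copy in a function. `to_decimal` is a hypothetical helper, and the data is just the first two rows of the merged table before scaling.

```python
import pandas as pd

FACTOR_COLS = ["Mkt-RF", "SMB", "HML", "RF"]

# First two rows of the merged table, with the factors still in percent.
raw = pd.DataFrame(
    {
        "Adj Close": [0.088284, 0.016837],
        "Mkt-RF": [6.93, 1.19],
        "SMB": [0.29, -1.93],
        "HML": [-0.71, 0.48],
        "RF": [0.18, 0.19],
    },
    index=pd.period_range("2019-06", periods=2, freq="M"),
)

def to_decimal(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with factors in decimals and an excess-return column (hypothetical helper)."""
    out = df.copy()
    out[FACTOR_COLS] = out[FACTOR_COLS] / 100
    out["returns-rf"] = out["Adj Close"] - out["RF"]
    return out

scaled = to_decimal(raw)  # safe to re-run: raw is never modified
print(scaled)
```

For June 2019 this gives 0.088284 - 0.0018 = 0.086484, which matches the first row of the table above.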


$ r_{i} - r_{f} = \alpha_{i} + \beta_i (r_{M} - r_{f}) + s_i \cdot SMB + h_i \cdot HML + \epsilon_i$

We can now run the regression.

In [72]:
## Regression
y = merged_df['returns-rf']
X = merged_df[["Mkt-RF", "SMB", "HML"]]

# sm.OLS does not add an intercept by default, so add a constant column for alpha
X_sm = sm.add_constant(X)
X_sm.head()


         const   Mkt-RF      SMB      HML
Date
2019-06    1.0   0.0693   0.0029  -0.0071
2019-07    1.0   0.0119  -0.0193   0.0048
2019-08    1.0  -0.0258  -0.0238  -0.0478
2019-09    1.0   0.0143  -0.0096   0.0675
2019-10    1.0   0.0206   0.0029  -0.0191


In [73]:
model = sm.OLS(y,X_sm)
results = model.fit()
results.summary()

Dep. Variable:          returns-rf   R-squared:             0.953
Model:                         OLS   Adj. R-squared:        0.951
Method:              Least Squares   F-statistic:           372.5
Date:             Mon, 03 Jun 2024   Prob (F-statistic): 1.68e-36
Time:                     16:43:02   Log-Likelihood:       165.34
No. Observations:               59   AIC:                  -322.7
Df Residuals:                   55   BIC:                  -314.4
Df Model:                        3
Covariance Type:         nonrobust

           coef  std err        t  P>|t|  [0.025  0.975]
const    0.0056    0.002    2.755  0.008   0.002   0.010
Mkt-RF   1.1470    0.038   29.881  0.000   1.070   1.224
SMB      0.1603    0.071    2.258  0.028   0.018   0.302
HML     -0.4284    0.042  -10.172  0.000  -0.513  -0.344

Omnibus:        19.567   Durbin-Watson:        2.356
Prob(Omnibus):     0.0   Jarque-Bera (JB):    36.217
Skew:           -1.041   Prob(JB):          1.37e-08
Kurtosis:        6.225   Cond. No.              36.6
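To see that this setup (a constant column plus the three factors) recovers the coefficients it is meant to recover, one can fit the same design on simulated data with known parameters. The sketch below uses NumPy's least-squares solver, which should give the same point estimates as `sm.OLS` up to floating-point error. The factor returns and the "true" parameters are made up for illustration and are not market data.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 600  # number of simulated months

# Hypothetical factor returns in decimals: columns stand in for Mkt-RF, SMB, HML.
factors = rng.normal(0.0, 0.04, size=(n, 3))

# Assumed "true" parameters: alpha, beta, s, h.
true_params = np.array([0.005, 1.15, 0.16, -0.43])

# The column of ones plays the role of sm.add_constant.
X = np.column_stack([np.ones(n), factors])
y = X @ true_params + rng.normal(0.0, 0.01, size=n)

# Ordinary least squares: minimise ||y - X b||^2.
est, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.round(est, 3))
```

Dropping the column of ones would force the fit through the origin and push any alpha into the factor loadings, which is why the constant is added before calling `sm.OLS`.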


The fitted model is

$ r_{i} - r_{f} = 0.0056 + 1.1470 \cdot (r_{M} - r_{f}) + 0.1603 \cdot SMB - 0.4284 \cdot HML + \epsilon_i$


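Note that $R^2$ is quite high at 0.953: the three factors explain about 95% of the variation in the fund's monthly excess returns.

$s_i = 0.1603$ is positive, which suggests that FDGRX tilts toward small stocks relative to big stocks (its excess return moves with the Small Minus Big factor). The loading is modest, with a p-value of 0.028.

$h_i = -0.4284$ is negative, which means FDGRX invests more in growth stocks than in value stocks (HML is High Minus Low). This is expected, as the fund's name is Fidelity Growth..


$\alpha_i = \text{const} = 0.0056$ is the return not explained by the three risk factors, about 0.56% per month since the regression uses monthly returns. It is positive and statistically significant at the 5% level (p-value 0.008), which is good.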
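Because the regression is run on monthly returns, the intercept is a monthly figure. A rough way to put it on an annual scale, assuming the same alpha is earned every month (a simplifying assumption, not something the regression establishes), is to compound it over 12 months:

```python
monthly_alpha = 0.0056  # intercept from the regression summary above

# Assumption: the same alpha is earned every month and compounds.
annual_alpha_compound = (1 + monthly_alpha) ** 12 - 1

# Simple (non-compounded) approximation for comparison.
annual_alpha_simple = 12 * monthly_alpha

print(f"compounded: {annual_alpha_compound:.4f}, simple: {annual_alpha_simple:.4f}")
```

Both versions come out close to 7% per year (roughly 6.9% compounded versus 6.7% simple). The 95% confidence interval for the monthly figure, 0.002 to 0.010, is wide, so the annualized number should be read as an order of magnitude only.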