This notebook illustrates the Fama–French three-factor model. Essentially, it is an exercise in linear regression. The notebook replicates the original repository (https://github.com/pranav0904/Fama-French-Three-Factor-Model), with some improvements and modifications.

I have added remarks and changed a few things as part of my own learning process.

# Fama–French three-factor model

In asset pricing and portfolio management the Fama–French three-factor model is a model designed to describe stock returns.

**The three factors are:**

1. Rm-Rf : Market risk premium, defined as the market return minus the risk-free return
2. SMB : Small [market capitalization] Minus Big
3. HML : High [book-to-market ratio] Minus Low

The return of a given asset is modeled as follows:

$$r = r_f + \beta_1(r_m-r_f) + \beta_2(SMB) + \beta_3(HML) + \alpha$$

where $r$ is the return of a given asset, $r_f$ is the risk-free return, $r_m$ is the market return, and $\alpha$ is the usual alpha from finance: the part of the return that the factors do not explain.

Recalling the CAPM, we see that the Fama–French model extends it with two additional terms. As a result, the $\beta_1$ from the Fama–French model will generally differ slightly from the $\beta$ of the CAPM.
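To see why the market beta can change once SMB and HML are added, here is a minimal sketch on simulated data. Everything in it is an assumption made for illustration: the series are random draws, the loadings (1.0, 0.6, -0.3) are made up, and SMB is deliberately constructed to be correlated with the market factor.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 2000  # number of simulated days (hypothetical)

# Simulated factor series; SMB is made correlated with the market on purpose
mkt = rng.normal(0.0, 1.0, T)
smb = 0.5 * mkt + rng.normal(0.0, 1.0, T)
hml = rng.normal(0.0, 1.0, T)

# Simulated excess return with known (made-up) loadings 1.0, 0.6, -0.3
excess = 1.0 * mkt + 0.6 * smb - 0.3 * hml + rng.normal(0.0, 0.5, T)


def ols(X, y):
    """Least-squares coefficients of y on the columns of X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


ones = np.ones(T)
capm_coef = ols(np.column_stack((ones, mkt)), excess)           # market factor only
ff3_coef = ols(np.column_stack((ones, mkt, smb, hml)), excess)  # all three factors

print("CAPM market beta:", capm_coef[1])
print("FF3 market beta: ", ff3_coef[1])
```

In this setup the one-factor regression absorbs part of the SMB effect into the market beta (it lands near 1.3), while the three-factor regression recovers a market beta near the simulated value of 1.0.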

To estimate the Fama–French three-factor model for a given asset, we can simply run a linear regression of its returns on the three factors.

In the following, we implement the linear regression in two ways: via projection (see The Elements of Statistical Learning for an explanation of linear regression and its geometric intuition) and with the statsmodels package.

In [121]:
import yfinance as yf
import numpy as np
import pandas as pd

import plotly.express as px

In [122]:
def market_return(List, Start, End):
  # Number of daily returns; 250 matches the one-year window used below.
  # A window with a different number of trading days needs a different T.
  T = 250
  N = len(List)
  PORTFOLIO = np.zeros((T, N))
  PRICE = np.zeros((T+1, N))

  for j, i in enumerate(List):
    stock_symbol = yf.Ticker(i)
    data = stock_symbol.history(start=Start, end=End)
    # We take the price to be the average of the open and close prices
    Price = (data['Open'] + data['Close'])/2
    # Log return: log(P_t / P_{t-1}); the first row has no predecessor and is dropped
    Return = np.log(Price.div(Price.shift(1)).dropna())

    PORTFOLIO[:, j] = Return.to_numpy()
    PRICE[:, j] = Price.to_numpy()

  return PORTFOLIO, PRICE
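As a small worked example of the log-return step, the sketch below uses four made-up prices (not market data). It shows that T+1 prices give T log returns, and that log returns add up over time to the log of the total price ratio.

```python
import numpy as np

# Hypothetical prices for four consecutive days
prices = np.array([100.0, 102.0, 101.0, 105.0])

# Log return for day t: log(P_t / P_{t-1}); the first day has no return
log_ret = np.log(prices[1:] / prices[:-1])

# Sum of daily log returns equals the log of the overall price ratio
total_log_ret = np.log(prices[-1] / prices[0])

print(log_ret)
print(log_ret.sum(), total_log_ret)
```

This is why the function above allocates `T+1` rows for prices and `T` rows for returns.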

In [123]:
def time_series_plot(data):
  fig = px.line(data)
  fig.show()

Here we load a few stocks for illustration purposes.

In [124]:
Stock_symbols = ['JILL', 'ELTK', 'ONVO', 'UAVS', 'AMZN', 'GOOG', 'ORCL', 'MSFT']

Portfolio_Return, Portfolio_Price = market_return(Stock_symbols, '2019-01-02', '2019-12-31')
[T, N] = Portfolio_Return.shape

Now we plot the price of our stocks

In [125]:
time_series_plot(Portfolio_Price)

Here we plot the return of our stocks

In [126]:
time_series_plot(Portfolio_Return)

In [127]:
FAMA_FRENCH_3 = pd.read_csv('/content/F-F_Research_Data_Factors_daily.CSV',
                            names=['Mkt-RF', 'SMB', 'HML', 'RF'], skiprows=24392, nrows=250)

In [128]:
FAMA_FRENCH_3.head()

          Mkt-RF   SMB   HML    RF
20190102    0.23  0.57  1.10  0.01
20190103   -2.45  0.40  1.26  0.01
20190104    3.55  0.43 -0.72  0.01
20190107    0.94  0.96 -0.78  0.01
20190108    1.01  0.54 -0.64  0.01
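The `skiprows` and `nrows` values above are tied to one particular version of the factor file. As an alternative, here is a hedged sketch that selects rows by date instead. The text in `sample` is a hypothetical excerpt that only mimics the file layout (free-text preamble, header row, data rows, trailing note); the 2018 row holds placeholder values, and the column name `Date` is my own choice. Selecting by date also makes it easier to check that the factor rows line up with the return dates, since the first log return belongs to the second trading day of the price window.

```python
import io
import pandas as pd

# Hypothetical excerpt mimicking the layout of the daily factor file
sample = """Some descriptive preamble line

,Mkt-RF,SMB,HML,RF
20181231,0.00,0.00,0.00,0.00
20190102,0.23,0.57,1.10,0.01
20190103,-2.45,0.40,1.26,0.01
20190104,3.55,0.43,-0.72,0.01

Trailing note line
"""

cols = ["Date", "Mkt-RF", "SMB", "HML", "RF"]

# Read everything as text first, so non-data lines do not break parsing
raw = pd.read_csv(io.StringIO(sample), names=cols, header=None, dtype=str)

# Keep only rows whose first field looks like a YYYYMMDD date
is_date = raw["Date"].str.fullmatch(r"\d{8}", na=False)
factors = raw[is_date].copy()
factors["Date"] = pd.to_datetime(factors["Date"], format="%Y%m%d")
factors = factors.set_index("Date").astype(float)

# Select the window by date rather than by row position
window = factors.loc["2019-01-02":"2019-12-30"]
print(window)
```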


In [129]:
# Convert the factor columns to NumPy arrays
MKT = FAMA_FRENCH_3['Mkt-RF'].to_numpy()
RF  = FAMA_FRENCH_3['RF'].to_numpy()

SMB = FAMA_FRENCH_3['SMB'].to_numpy()
HML = FAMA_FRENCH_3['HML'].to_numpy()


# Design matrix: a column of ones (intercept) followed by the three factors
F = np.column_stack((np.ones((T)), MKT, SMB, HML))
K = F.shape[1]

We have added a column of ones (for the intercept); the second, third, and fourth columns correspond to the Fama–French factors loaded above.

In [130]:
F[:10]

array([[ 1.  ,  0.23,  0.57,  1.1 ],
       [ 1.  , -2.45,  0.4 ,  1.26],
       [ 1.  ,  3.55,  0.43, -0.72],
       [ 1.  ,  0.94,  0.96, -0.78],
       [ 1.  ,  1.01,  0.54, -0.64],
       [ 1.  ,  0.56,  0.46,  0.06],
       [ 1.  ,  0.42,  0.03, -0.47],
       [ 1.  , -0.01,  0.13,  0.19],
       [ 1.  , -0.6 , -0.6 ,  0.94],
       [ 1.  ,  1.06,  0.  , -0.86]])

# Using projection to obtain Fama–French coefficients

We recall the Fama–French formula:

$$r = r_f + \beta_1(r_m-r_f) + \beta_2(SMB) + \beta_3(HML) + \alpha$$

The $\beta$'s are the coefficients of a standard linear regression model. Geometrically, the fitted values are the orthogonal projection of the asset return vector onto the space spanned by the columns of the factor matrix (the constant, $r_m - r_f$, $SMB$, and $HML$). As we recall from Chapter 3 of The Elements of Statistical Learning, for a regression model of the form $y=X\beta+\epsilon$, the estimated coefficient vector $\hat{\beta}$ is given by $(X^TX)^{-1}X^Ty$ (everything is written in vector form). In the following we use this formula to calculate the coefficients.
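The geometric picture can be checked numerically. The sketch below uses simulated data (the design matrix, coefficients, and noise level are all made up) and solves the normal equations with `np.linalg.solve` rather than an explicit inverse, which is a common choice but not the one used in this notebook. It then verifies the two properties that make the fit a projection: the residual is orthogonal to every column of $X$, and the hat matrix $H = X(X^TX)^{-1}X^T$ satisfies $H^2 = H$.

```python
import numpy as np

rng = np.random.default_rng(1)
T, K = 250, 4

# Simulated design matrix: intercept column plus three random "factors"
X = np.column_stack((np.ones(T), rng.normal(size=(T, K - 1))))
y = X @ np.array([0.1, 1.0, 0.5, -0.2]) + rng.normal(scale=0.3, size=T)

# Normal equations: (X^T X) beta = X^T y
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# Fitted values are the projection of y onto the column space of X
y_hat = X @ beta_hat
resid = y - y_hat

# Hat (projection) matrix H = X (X^T X)^{-1} X^T
H = X @ np.linalg.solve(X.T @ X, X.T)

print("max |X^T resid|:", np.abs(X.T @ resid).max())
print("H idempotent:   ", np.allclose(H @ H, H))
```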

In [131]:
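def get_beta(Portfolio_Return, RF, MKT):
    # RF and MKT are accepted but not used here; the design matrix F and
    # the sizes K and N are read from the notebook's global scope.

    # Initialize the K x N coefficient matrix with zeros (one column per stock)
    beta = np.zeros((K,N))
    for i in range(0,N):
        y = Portfolio_Return[:,i]
        x = F
        # Projection formula: beta_hat = (x^T x)^{-1} x^T y
        beta[:,i] = np.linalg.inv(x.T @ x) @ x.T @ y
    return beta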

We now obtain the betas as follows.

The beta matrix is $4\times8$, since we have 4 coefficients and 8 stocks. Each column holds the regression coefficients for one asset: the intercept, followed by the loadings on Mkt-RF, SMB, and HML.
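A note on units when reading these numbers: the factor file reports returns in percent (0.23 means 0.23%), while the log returns computed earlier are in decimal units. Regressing decimal returns on percent factors gives slope coefficients that are 100 times smaller than the conventional loadings, and leaves the intercept unchanged. The sketch below illustrates this with simulated data; the loadings 1.2, 0.4, -0.3 and the helper `fit` are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(2)
T = 250

# Simulated factor returns, once in percent and once in decimal units
factors_pct = rng.normal(0.0, 1.0, size=(T, 3))
factors_dec = factors_pct / 100.0

# Simulated asset returns in decimal units with made-up loadings
ret_dec = factors_dec @ np.array([1.2, 0.4, -0.3]) + rng.normal(0.0, 0.01, T)


def fit(factors, y):
    """OLS with an intercept; returns [const, b1, b2, b3]."""
    X = np.column_stack((np.ones(len(y)), factors))
    return np.linalg.lstsq(X, y, rcond=None)[0]


coef_pct = fit(factors_pct, ret_dec)  # factors in percent
coef_dec = fit(factors_dec, ret_dec)  # factors in decimal units

print("slopes (percent factors) x 100:", coef_pct[1:] * 100)
print("slopes (decimal factors):      ", coef_dec[1:])
```

Under this reading, a printed market coefficient of about 0.005 corresponds to a market beta of about 0.5 in the usual units.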

In [132]:
beta = get_beta(Portfolio_Return, RF, MKT)
beta

array([[-0.00563906,  0.0021249 , -0.00331409, -0.00140744,  0.00059597,
         0.00074099,  0.00058325,  0.00162206],
       [ 0.00495478, -0.00240032,  0.00128262,  0.00231723,  0.00255241,
         0.00250559,  0.00193669,  0.00203196],
       [ 0.00706307,  0.01735697,  0.01711495, -0.00317548,  0.0032251 ,
         0.00179751,  0.00032727,  0.00141692],
       [ 0.00467676, -0.0124355 ,  0.00610864, -0.00481799, -0.00157961,
        -0.00248362,  0.00096744, -0.00210312]])

# Calculate $\alpha$

Recall the Fama–French formula:

$$r = r_f + \beta_1(r_m-r_f) + \beta_2(SMB) + \beta_3(HML) + \alpha$$

Alpha is a measure of the performance of an investment as compared to a suitable benchmark index.
1. Positive Alpha : Outperformed the overall market
2. Negative Alpha : Underperformed the overall market

$\alpha$ can be calculated by simply rearranging the Fama–French formula:

$$\alpha = r - r_f - \beta_1(r_m-r_f) - \beta_2(SMB) - \beta_3(HML)$$

where all quantities are defined as above.
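The rearranged formula can be applied day by day once the betas are known. The following is a minimal sketch on simulated data, not the computation used in the next cell: the factor series, the risk-free rate, and the "true" alpha and loadings are all invented. It also shows a useful check: when the regression of excess returns includes an intercept, the time average of the daily alphas equals the fitted intercept.

```python
import numpy as np

rng = np.random.default_rng(3)
T = 250

rf = np.full(T, 0.0001)                      # hypothetical daily risk-free rate
factors = rng.normal(0.0, 0.01, size=(T, 3)) # columns: Mkt-RF, SMB, HML (decimal)

# Simulated asset return with a made-up alpha and made-up loadings
r = rf + 0.0005 + factors @ np.array([1.1, 0.3, -0.2]) + rng.normal(0.0, 0.005, T)

# Regress the excess return on a constant and the three factors
X = np.column_stack((np.ones(T), factors))
coef = np.linalg.lstsq(X, r - rf, rcond=None)[0]
intercept, betas = coef[0], coef[1:]

# Rearranged formula: alpha_t = r - r_f - beta_1 (r_m - r_f) - beta_2 SMB - beta_3 HML
alpha_t = r - rf - factors @ betas

print("mean daily alpha:", alpha_t.mean())
print("OLS intercept:   ", intercept)
```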

In [133]:
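# ACTIVE RETURNS
# For each day, regress the cross-section of returns on the estimated
# coefficients and keep the residual as that day's active return.

alpha = np.zeros((N,T))
sigma = np.zeros((N,N,T))

for i in range(0,T):
    y = Portfolio_Return[i,:]
    x = beta.T                     # N x K
    lmbd = np.linalg.inv(x.T @ x) @ x.T @ y
    alpha[:,i] = y - x @ lmbd      # (Nx1) - (NxK)(Kx1)
    # Outer product gives the N x N matrix; `@` on 1-D arrays would return a scalar
    sigma[:,:,i] = np.outer(alpha[:,i], alpha[:,i])

ALPHA = np.mean(alpha,axis=1) # N x 1
SIGMA = np.mean(sigma,axis=2) # N x N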
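The per-day matrix stored in `sigma` is meant to be $N \times N$, which for a residual vector $e$ is the outer product $e e^T$. Here is a small sketch, with a made-up vector, of how NumPy treats 1-D arrays in this situation: transposing a 1-D array does nothing, `@` between two 1-D arrays returns the scalar inner product, and `np.outer` returns the full matrix.

```python
import numpy as np

e = np.array([1.0, -2.0, 0.5])  # hypothetical residual vector

# Transposing a 1-D array leaves its shape unchanged
print(e.T.shape)

# `@` on two 1-D arrays is the inner product (a scalar)
inner = e @ e.T

# np.outer builds the N x N matrix e e^T
outer = np.outer(e, e)

print(inner)
print(outer)
```

Assigning the scalar to an `N x N` slice would broadcast the same number into every entry, so `np.outer` is the call that matches the intended shape.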

Now we plot our $\alpha$ for different assets

In [134]:
#Active Returns of the Portfolio
fig = px.line(ALPHA)
fig.show()

# Using the statsmodels package for linear regression

In this part, we use the statsmodels package to calculate the Fama–French coefficients and show that they match the ones obtained from the projection formula.

In [135]:
import matplotlib.pyplot as plt
import statsmodels.api as sm

In [136]:
FAMA_FRENCH_3 = FAMA_FRENCH_3.drop(['RF'], axis=1)

In [137]:
X = FAMA_FRENCH_3

X1 = sm.add_constant(X) # Add a constant to the independent value

for i in range(0, N):
    y = Portfolio_Return[:,i]

    # make regression model
    model = sm.OLS(y, X1)

    # fit model and print results
    results = model.fit()

    # Coefficient values (.iloc gives positional access to the params Series)
    print(f"{i+1}) Stock Symbol: ",Stock_symbols[i])
    print("\n Const: ",results.params.iloc[0])
    print(" Mkt-RF: ",results.params.iloc[1]," SMB: ",results.params.iloc[2]," HML: ",results.params.iloc[3])
    print("\n Summary: ")
    #print(results.summary(),"\n")

1) Stock Symbol:  JILL

 Const:  -0.005639060880318628
 Mkt-RF:  0.00495478434078936  SMB:  0.0070630746739093276  HML:  0.0046767550615601245

 Summary: 
2) Stock Symbol:  ELTK

 Const:  0.0021249040387892747
 Mkt-RF:  -0.0024003234807799096  SMB:  0.01735696728202961  HML:  -0.012435495705233751

 Summary: 
3) Stock Symbol:  ONVO

 Const:  -0.0033140875634543442
 Mkt-RF:  0.0012826192790499904  SMB:  0.01711494973211254  HML:  0.006108637259576584

 Summary: 
4) Stock Symbol:  UAVS

 Const:  -0.0014074411208938895
 Mkt-RF:  0.0023172297622286674  SMB:  -0.003175483330388466  HML:  -0.004817990273355817

 Summary: 
5) Stock Symbol:  AMZN

 Const:  0.0005959696198530178
 Mkt-RF:  0.0025524144366675747  SMB:  0.003225100865662163  HML:  -0.0015796080299700127

 Summary: 
6) Stock Symbol:  GOOG

 Const:  0.0007409945597867028
 Mkt-RF:  0.002505592457858133  SMB:  0.001797511412050734  HML:  -0.0024836150588598573

 Summary: 
7) Stock Symbol:  ORCL

 Const:  0.0005832507745064204
 Mkt-RF:





Comparing with the results above, the coefficients agree with those from the projection formula to the printed precision, so we conclude that the two methods are consistent.
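The agreement can be checked numerically from the printed outputs. The sketch below copies the values for three of the stocks from the two outputs above (the dictionary names are my own) and compares them with a tolerance of 1e-8, which is the rounding of the printed `beta` array.

```python
import numpy as np

# Columns of the printed `beta` array (projection), order: Const, Mkt-RF, SMB, HML
projection = {
    "JILL": [-0.00563906, 0.00495478, 0.00706307, 0.00467676],
    "ELTK": [0.0021249, -0.00240032, 0.01735697, -0.0124355],
    "AMZN": [0.00059597, 0.00255241, 0.0032251, -0.00157961],
}

# Values printed by the statsmodels loop, same order
statsmodels_fit = {
    "JILL": [-0.005639060880318628, 0.00495478434078936,
             0.0070630746739093276, 0.0046767550615601245],
    "ELTK": [0.0021249040387892747, -0.0024003234807799096,
             0.01735696728202961, -0.012435495705233751],
    "AMZN": [0.0005959696198530178, 0.0025524144366675747,
             0.003225100865662163, -0.0015796080299700127],
}

# Largest absolute difference per stock
max_diff = {
    s: float(np.max(np.abs(np.array(projection[s]) - np.array(statsmodels_fit[s]))))
    for s in projection
}

for s, d in max_diff.items():
    print(s, d, d < 1e-8)
```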