# Method 2
<br>
Here, I use a regression-based method built on the Fama-French five-factor model to decompose a stock's return into factor contributions and a residual (idiosyncratic) return.<br>
The five factors are:<br><br>
Mkt-RF - the market factor (beta): the return on the market minus the risk-free rate<br>
SMB (Small Minus Big) - the average return on the nine small stock portfolios minus the average return on the nine big stock portfolios<br>
HML (High Minus Low) - the average return on the two value portfolios minus the average return on the two growth portfolios<br>
RMW (Robust Minus Weak) - the average return on the two robust operating profitability portfolios minus the average return on the two weak operating profitability portfolios<br>
CMA (Conservative Minus Aggressive) - the average return on the two conservative investment portfolios minus the average return on the two aggressive investment portfolios<br><br>

[Details on how the factors are constructed can be found here](http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/Data_Library/f-f_5_factors_2x3.html).<br>
<br>I use AAPL as the example. First, I download adjusted daily closing prices from Yahoo Finance and convert them to daily returns. Each regression then uses a 4-week observation window of daily returns, and the window advances with a 1-week stride (one regression per week). After each regression, I calculate the contribution of each factor and the residual return component.<br>
<br>
All return figures in the regression results are daily returns averaged over the observation window.<br>
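
As a compact summary of the decomposition described above, the regression and the reported quantities can be sketched as below. The notation is my own assumption for illustration: $R_t$ is the stock's daily return, $F_{k,t}$ is the daily return of factor $k$, and a bar denotes the average over the days in one 4-week window.

```latex
\begin{align*}
% Regression fitted on the daily observations t inside one window
R_t &= \alpha + \sum_{k=1}^{5} \beta_k F_{k,t} + \varepsilon_t \\
% Contribution of factor k: estimated loading times the mean factor return
C_k &= \hat{\beta}_k \, \bar{F}_k \\
% Residual return reported for the window
\text{Residual} &= \bar{R} - \sum_{k=1}^{5} C_k
\end{align*}
```

Because the regression includes a constant, the fitted line passes through the sample means, so the residual defined this way equals the estimated intercept $\hat{\alpha}$.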

In [11]:
import pandas as pd
import numpy as np
import pandas_datareader.data as web
import datetime
import matplotlib.pyplot as plt
import math
import time
import warnings
import statsmodels.api as sm
warnings.filterwarnings("ignore")

In [2]:
factor_return = pd.read_csv('./data/factor.CSV',index_col = 0, parse_dates = True)
factor_return = factor_return.iloc[:,:5]/100
factor_return_weekly = factor_return.resample('W-FRI').sum()
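
The weekly factor series above is built by summing daily returns, which approximates the compounded weekly return. The following is a minimal sketch of how small the gap is for returns of typical daily size. The dates and return values are made up for illustration and are not taken from the factor file.

```python
import pandas as pd

# Hypothetical daily returns for two business weeks (not real factor data).
idx = pd.bdate_range("2017-12-04", periods=10)
daily = pd.Series(
    [0.01, -0.005, 0.002, 0.0, 0.004, -0.01, 0.003, 0.006, -0.002, 0.001],
    index=idx,
)

# Weekly return by simple summation, as done for factor_return_weekly.
weekly_sum = daily.resample("W-FRI").sum()

# Weekly return by compounding the daily returns within each week.
weekly_compound = daily.resample("W-FRI").apply(lambda r: (1 + r).prod() - 1)

# Absolute difference between the two aggregation choices, per week.
gap = (weekly_sum - weekly_compound).abs()
print(pd.DataFrame({"sum": weekly_sum, "compound": weekly_compound, "gap": gap}))
```

For daily returns of this magnitude the two methods differ by less than a basis point per week, so summation is a reasonable shortcut here.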

In [6]:
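def calculate(sym, n_week = 4):
    # Download daily prices, then compute daily and weekly returns
    start = datetime.datetime(2015, 1, 1)
    end = datetime.datetime(2017, 12, 31)
    stk = web.DataReader(sym, 'yahoo', start, end)
    stk_return = stk['Adj Close'].pct_change().fillna(0)
    stk_weekly_return = stk.resample('W-FRI').last()['Adj Close'].pct_change().fillna(0)
    # Week-ending Fridays; only these dates are used to define the windows
    week_index = stk_weekly_return.index
    cols = ['Mkt-RF','SMB','HML','RMW','CMA','Residual','Total return']
    result = pd.DataFrame(index = week_index[n_week:], columns = cols)
    for i in range(len(week_index) - n_week):
        # Window of n_week weeks; label-based slicing includes both end dates
        win_start = week_index[i]
        win_end = week_index[i + n_week]
        y = stk_return[win_start:win_end]
        x = factor_return[win_start:win_end]
        x_with_const = sm.add_constant(x)
        model = sm.OLS(y, x_with_const)
        results = model.fit()
        # Factor contribution = estimated loading * mean factor return in the window
        tmp = results.params[1:] * x.mean()
        # Part of the mean return not attributed to the factors (includes the intercept)
        tmp['Residual'] = y.mean() - tmp.sum()
        tmp['Total return'] = y.mean()
        result.loc[win_end] = tmp
    # Fill any gaps once, after the loop, and store the values as floats
    result = result.fillna(0).astype(float)
    return result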
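
To show what a single window of the function above computes without depending on Yahoo Finance, here is a minimal sketch on synthetic data. The factor returns, loadings, and noise levels are all invented for illustration, and `np.linalg.lstsq` stands in for `sm.OLS` (both solve the same least-squares problem).

```python
import numpy as np

rng = np.random.default_rng(0)
n_days, n_factors = 20, 5  # roughly one 4-week window of trading days

# Synthetic daily factor returns and a synthetic stock return (assumed values).
factors = rng.normal(0.0, 0.01, size=(n_days, n_factors))
true_beta = np.array([1.1, 0.2, -0.3, 0.1, 0.05])  # hypothetical loadings
stock = 0.0002 + factors @ true_beta + rng.normal(0.0, 0.005, size=n_days)

# OLS with an intercept: the first column of ones plays the role of add_constant.
X = np.column_stack([np.ones(n_days), factors])
coef, *_ = np.linalg.lstsq(X, stock, rcond=None)
alpha, beta = coef[0], coef[1:]

# Contribution of each factor = estimated loading * mean factor return.
contrib = beta * factors.mean(axis=0)
total = stock.mean()
residual = total - contrib.sum()

print("contributions:", contrib)
print("residual:", residual, "alpha:", alpha)
```

With an intercept in the regression, the fitted line passes through the sample means, so the `Residual` computed this way coincides with the estimated intercept, and the contributions plus the residual add up to the mean return exactly.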

In [7]:
AAPL_result = calculate('AAPL')
# May need to be rerun if an error is raised,
# because the Yahoo Finance API is unstable.

In [9]:
AAPL_result.tail(10)

| Date | Mkt-RF | SMB | HML | RMW | CMA | Residual | Total return |
|---|---|---|---|---|---|---|---|
| 2017-10-27 | 0.00185 | 0.000463 | -1.6e-05 | -0.000553 | 0.003681 | -0.002422 | 0.003005 |
| 2017-11-03 | 0.000542 | -0.001074 | 5.7e-05 | -0.00187 | 0.002555 | 0.004858 | 0.005068 |
| 2017-11-10 | 0.000404 | -0.000204 | 0.000352 | -0.001312 | 0.001361 | 0.00505 | 0.005652 |
| 2017-11-17 | 0.000419 | 2.6e-05 | 0.000868 | -0.001505 | 0.000663 | 0.003929 | 0.0044 |
| 2017-11-24 | 0.000814 | -3e-06 | 0.003416 | -0.001054 | 0.000365 | 0.002033 | 0.005569 |
| 2017-12-01 | 0.001576 | -1.1e-05 | 0.000171 | -0.001559 | 3e-06 | 0.000931 | 0.001112 |
| 2017-12-08 | 0.001653 | -1.7e-05 | -0.000517 | -0.001081 | 2.3e-05 | -0.001721 | -0.00166 |
| 2017-12-15 | 0.001545 | 1.4e-05 | -0.000405 | -0.000871 | -3.5e-05 | 0.000629 | 0.000876 |
| 2017-12-22 | 0.001712 | 1.5e-05 | -0.000533 | -0.000625 | -7.1e-05 | -0.000444 | 5.2e-05 |
| 2017-12-29 | 0.000735 | 9.4e-05 | 3e-06 | -9.1e-05 | -0.000864 | -0.000603 | -0.000725 |


For example, the last row covers the 4-week window from 2017-12-01 to 2017-12-29. Over that window, the average daily return for AAPL is -0.0725%. The market, SMB, HML, RMW and CMA factors contribute 0.0735%, 0.0094%, 0.0003%, -0.0091% and -0.0864% respectively, and the daily return that cannot be explained by the five factors is -0.0603%.<br>
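
As a quick sanity check on this reading of the last row, the factor contributions and the residual should add up to the total return. The sketch below uses the values printed in the 2017-12-29 row of the table; the variable names are my own.

```python
# Values copied from the 2017-12-29 row of the result table.
row = {
    "Mkt-RF": 0.000735,
    "SMB": 9.4e-05,
    "HML": 3e-06,
    "RMW": -9.1e-05,
    "CMA": -0.000864,
    "Residual": -0.000603,
}
total_reported = -0.000725

# Print each component as a percentage, matching the text above.
for name, value in row.items():
    print(f"{name:>8}: {value * 100:+.4f}%")

# Contributions plus residual should reproduce the total, up to rounding
# of the displayed values.
total_rebuilt = sum(row.values())
print(f"rebuilt total : {total_rebuilt * 100:+.4f}%")
print(f"reported total: {total_reported * 100:+.4f}%")
```

The rebuilt and reported totals differ only in the last displayed digit, which is consistent with rounding in the printed table.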


In [10]:
AAPL_result.iloc[:,:6].corr()

| | Mkt-RF | SMB | HML | RMW | CMA | Residual |
|---|---|---|---|---|---|---|
| Mkt-RF | 1.0 | -0.13514 | -0.125198 | -0.329728 | 0.279505 | 0.023929 |
| SMB | -0.13514 | 1.0 | 0.139149 | 0.014163 | 0.055167 | -0.287684 |
| HML | -0.125198 | 0.139149 | 1.0 | 0.030794 | -0.316666 | -0.271366 |
| RMW | -0.329728 | 0.014163 | 0.030794 | 1.0 | -0.58085 | -0.138424 |
| CMA | 0.279505 | 0.055167 | -0.316666 | -0.58085 | 1.0 | 0.154846 |
| Residual | 0.023929 | -0.287684 | -0.271366 | -0.138424 | 0.154846 | 1.0 |


Finally, I calculate the correlations between the different contributions. Most pairwise correlations are low in magnitude (the largest is -0.58, between RMW and CMA), which suggests that these factors capture largely different sources of stock return.<br>
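
To make the "low correlation" reading concrete, the sketch below scans the correlation matrix printed above for the off-diagonal entry with the largest magnitude. The matrix values are copied from that output; the variable names are my own.

```python
import numpy as np

labels = ["Mkt-RF", "SMB", "HML", "RMW", "CMA", "Residual"]

# Correlation matrix copied from the output of AAPL_result.iloc[:,:6].corr().
corr = np.array([
    [1.0, -0.13514, -0.125198, -0.329728, 0.279505, 0.023929],
    [-0.13514, 1.0, 0.139149, 0.014163, 0.055167, -0.287684],
    [-0.125198, 0.139149, 1.0, 0.030794, -0.316666, -0.271366],
    [-0.329728, 0.014163, 0.030794, 1.0, -0.58085, -0.138424],
    [0.279505, 0.055167, -0.316666, -0.58085, 1.0, 0.154846],
    [0.023929, -0.287684, -0.271366, -0.138424, 0.154846, 1.0],
])

# Subtracting the identity zeroes the diagonal, leaving only pairwise terms.
off_diag = np.abs(corr - np.eye(len(labels)))
i, j = np.unravel_index(np.argmax(off_diag), off_diag.shape)

print(f"largest |correlation|: {labels[i]} vs {labels[j]} = {corr[i, j]:.3f}")
```

The strongest relationship is between the RMW and CMA contributions; every other pair has a correlation below 0.33 in magnitude.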