# Factor Analysis

My goal here is to analyse the performance of five mutual funds using factor analysis, over the period from 2015 to the end of 2020.

Mutual funds to be analysed:
- Fidelity Advisor Small Cap Value Fund Class I (FCVIX)
- Vanguard S&P Small-Cap 600 Value Index Fund Institutional Shares (VSMVX)
- BlackRock Health Sciences Opportunities Portfolio Institutional Shares (SHSSX)
- Fidelity Small Cap Value Fund (FCPVX)
- Vanguard Dividend Growth Fund Investor Shares (VDIGX)

In [1]:
import pandas as pd
import numpy as np
import edhec_risk_kit_01 as erk
import yfinance as yf

In [2]:
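# Monthly five-factor file from the Kenneth French data library; values are in percent
factors = pd.read_csv('F-F_Research_Data_5_Factors_2x3.CSV', skiprows=2, index_col=0, parse_dates=True)
# Keep the first 694 rows (the monthly block); the annual block that follows is dropped
factors = factors[:694]
factors.index = pd.to_datetime(factors.index, format='%Y%m').to_period('M')
factors = factors['2015':'2020']
factors = factors.astype('float')
# Drop the risk-free rate so only the five factors remain
factors = factors.drop('RF', axis=1)
factors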

| | Mkt-RF | SMB | HML | RMW | CMA |
|---|---|---|---|---|---|
| 2015-01 | -3.11 | -0.86 | -3.56 | 1.68 | -1.65 |
| 2015-02 | 6.14 | 0.23 | -1.81 | -1.16 | -1.78 |
| 2015-03 | -1.12 | 3.04 | -0.41 | 0.10 | -0.51 |
| 2015-04 | 0.59 | -3.06 | 1.88 | -0.04 | -0.45 |
| 2015-05 | 1.36 | 0.80 | -1.10 | -1.82 | -0.74 |
| ... | ... | ... | ... | ... | ... |
| 2020-08 | 7.63 | -0.94 | -2.94 | 4.27 | -1.44 |
| 2020-09 | -3.63 | 0.07 | -2.51 | -1.15 | -1.77 |
| 2020-10 | -2.10 | 4.76 | 4.03 | -0.60 | -0.53 |
| 2020-11 | 12.47 | 6.75 | 2.11 | -2.78 | 1.05 |


In [3]:
tickers = ['FCVIX', 'VSMVX', 'SHSSX', 'FCPVX', 'VDIGX']
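# auto_adjust=False keeps the 'Adj Close' column in yfinance versions that adjust prices by default
funds = yf.download(tickers, start='2015-01-01', end='2020-12-31', auto_adjust=False)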
funds = funds['Adj Close']
funds

[*********************100%***********************]  5 of 5 completed


| Date | FCPVX | FCVIX | SHSSX | VDIGX | VSMVX |
|---|---|---|---|---|---|
| 2014-12-31 | 18.022139 | 18.018858 | 49.945477 | 20.623581 | 191.836548 |
| 2015-01-02 | 17.888847 | 17.885672 | 50.111404 | 20.569988 | 190.486313 |
| 2015-01-05 | 17.603235 | 17.600264 | 49.886913 | 20.275238 | 186.988251 |
| 2015-01-06 | 17.355709 | 17.343391 | 49.506245 | 20.141262 | 183.653442 |
| 2015-01-07 | 17.584196 | 17.571718 | 50.794655 | 20.400288 | 185.112473 |
| ... | ... | ... | ... | ... | ... |
| 2020-12-23 | 16.900000 | 16.900000 | 77.029999 | 33.007191 | 291.163940 |
| 2020-12-24 | 16.900000 | 16.900000 | 77.099998 | 33.135933 | 291.173889 |
| 2020-12-28 | 16.930000 | 16.930000 | 77.209999 | 33.343899 | 293.238098 |
| 2020-12-29 | 16.790001 | 16.790001 | 77.320000 | 32.914364 | 289.179474 |


In [4]:
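# Daily simple returns from the adjusted closing prices
funds_rets = funds.pct_change().dropna()
# Aggregate daily returns within each calendar month via erk.compound, then index by monthly period
funds_rets_m = funds_rets.resample('M').apply(erk.compound).to_period('M')
# Keep FCPVX as a one-column DataFrame for display
funds_rets_m_FCPVX = funds_rets_m[['FCPVX']]
funds_rets_m_FCPVX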

| Date | FCPVX |
|---|---|
| 2015-01 | -0.029583 |
| 2015-02 | 0.044638 |
| 2015-03 | 0.007816 |
| 2015-04 | -0.011375 |
| 2015-05 | 0.004707 |
| ... | ... |
| 2020-08 | 0.049012 |
| 2020-09 | -0.035908 |
| 2020-10 | 0.028169 |
| 2020-11 | 0.199391 |


We now have our explanatory variables (the factors) and our dependent variables (the mutual funds), so we can fit a factor model for each fund.
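
Before fitting, it helps to see how the units of the two inputs affect the coefficients. The factor file above reports returns in percent (for example -3.11), while the fund returns are decimals (for example -0.029583). The sketch below uses synthetic data, not the funds above, and plain NumPy least squares in place of statsmodels; the names `factors_pct`, `fund_dec` and `fit_ols` are hypothetical. It suggests that regressing decimal returns on percent factors shrinks each loading by a factor of 100 while leaving the intercept unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)
n_months = 72

# Hypothetical factor returns in percent (e.g. -3.11 means -3.11%)
factors_pct = rng.normal(loc=0.5, scale=4.0, size=(n_months, 2))
true_betas = np.array([0.9, 0.6])
true_alpha = 0.001

# Hypothetical fund returns in decimals, like the output of pct_change()
noise = rng.normal(0, 0.005, n_months)
fund_dec = true_alpha + (factors_pct / 100) @ true_betas + noise

def fit_ols(y, X):
    """Least-squares fit of y on X plus a constant; returns (loadings, alpha)."""
    design = np.column_stack([X, np.ones(len(X))])
    params, *_ = np.linalg.lstsq(design, y, rcond=None)
    return params[:-1], params[-1]

# Mixed units: percent factors, decimal fund returns
betas_mixed, alpha_mixed = fit_ols(fund_dec, factors_pct)
# Consistent units: both in decimals
betas_same, alpha_same = fit_ols(fund_dec, factors_pct / 100)

print("loadings on percent factors:", betas_mixed)
print("loadings on decimal factors:", betas_same)
```

Under these assumptions, a loading of about 0.009 on a percent-scaled factor reads as a beta of roughly 0.9 once both series are on the same scale.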

In [5]:
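import statsmodels.api as sm
# Add a constant column so the regression also estimates an intercept (alpha)
exp_var = factors.copy()
exp_var['Alpha'] = 1
# Regress FCPVX monthly returns on the five factors plus the constant
coeffs = sm.OLS(funds_rets_m['FCPVX'], exp_var).fit()
coeffs.summary()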

| | | | |
|---|---|---|---|
| Dep. Variable: | FCPVX | R-squared: | 0.814 |
| Model: | OLS | Adj. R-squared: | 0.800 |
| Method: | Least Squares | F-statistic: | 57.92 |
| Date: | Thu, 10 Jun 2021 | Prob (F-statistic): | 8.16e-23 |
| Time: | 23:34:52 | Log-Likelihood: | 158.2 |
| No. Observations: | 72 | AIC: | -304.4 |
| Df Residuals: | 66 | BIC: | -290.7 |
| Df Model: | 5 | | |
| Covariance Type: | nonrobust | | |

| | coef | std err | t | P>\|t\| | [0.025 | 0.975] |
|---|---|---|---|---|---|---|
| Mkt-RF | 0.0091 | 0.001 | 10.555 | 0.000 | 0.007 | 0.011 |
| SMB | 0.0062 | 0.002 | 4.044 | 0.000 | 0.003 | 0.009 |
| HML | 0.0040 | 0.001 | 2.834 | 0.006 | 0.001 | 0.007 |
| RMW | 0.0013 | 0.002 | 0.550 | 0.584 | -0.003 | 0.006 |
| CMA | -0.0036 | 0.003 | -1.400 | 0.166 | -0.009 | 0.002 |
| Alpha | -0.0064 | 0.004 | -1.791 | 0.078 | -0.013 | 0.001 |

| | | | |
|---|---|---|---|
| Omnibus: | 82.561 | Durbin-Watson: | 2.097 |
| Prob(Omnibus): | 0.000 | Jarque-Bera (JB): | 1028.587 |
| Skew: | -3.344 | Prob(JB): | 4.42e-224 |
| Kurtosis: | 20.266 | Cond. No. | 5.34 |


In [6]:
coeffs = pd.DataFrame()
tvalues = pd.DataFrame()
for fund in funds_rets_m.columns:
    # Fit once per fund and keep both the loadings and their t-statistics
    fit = sm.OLS(funds_rets_m[fund], exp_var).fit()
    coeffs[fund] = fit.params
    tvalues[fund] = fit.tvalues
# Join on the shared factor index instead of mixing `on=` with `left_index`/`right_index`
summary = coeffs.join(tvalues, lsuffix=' Coeffs', rsuffix=' TValues')
summary

| | FCPVX Coeffs | FCVIX Coeffs | SHSSX Coeffs | VDIGX Coeffs | VSMVX Coeffs | FCPVX TValues | FCVIX TValues | SHSSX TValues | VDIGX TValues | VSMVX TValues |
|---|---|---|---|---|---|---|---|---|---|---|
| Mkt-RF | 0.009097 | 0.009121 | 0.008482 | 0.008762 | 0.009871 | 10.554614 | 10.574628 | 10.238457 | 19.48382 | 37.695829 |
| SMB | 0.006154 | 0.006126 | 0.001409 | -0.002206 | 0.008143 | 4.04445 | 4.022921 | 0.963666 | -2.778641 | 17.615624 |
| HML | 0.004006 | 0.004002 | -0.005117 | -0.00076 | 0.003767 | 2.834284 | 2.829634 | -3.766394 | -1.030469 | 8.773482 |
| RMW | 0.00133 | 0.001298 | -0.004465 | -0.000333 | 0.001599 | 0.550144 | 0.536447 | -1.921441 | -0.263877 | 2.176789 |
| CMA | -0.003572 | -0.003563 | 0.001655 | 0.002059 | 0.001202 | -1.400191 | -1.395422 | 0.674795 | 1.546794 | 1.551178 |
| Alpha | -0.006366 | -0.006386 | -0.00492 | -0.002291 | 0.000881 | -1.791009 | -1.795109 | -1.439946 | -1.235031 | 0.815322 |


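The issue with this method is that the coefficients are not easily interpretable. (They are also small here because the factor returns are in percent while the fund returns are decimals: the Mkt-RF coefficient of 0.0091 for FCPVX corresponds to a market beta of about 0.91.) Sharpe style analysis instead adds the constraint that the coefficients sum to 1, so they can be read as weights on the factors.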
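
To make the constraint concrete, here is a minimal sketch of a style analysis as a constrained least-squares problem, solved with `scipy.optimize.minimize`. It is an assumption about what `erk.style_analysis` does, since that function's body is not shown here; the data are synthetic and the names `style_weights`, `factor_rets` and `fund_rets` are hypothetical. Weights are bounded to [0, 1] and forced to sum to 1, and the objective is the sum of squared differences between the fund and the weighted factor mix.

```python
import numpy as np
from scipy.optimize import minimize

def style_weights(fund_rets, factor_rets):
    """Weights in [0, 1] summing to 1 that minimise squared tracking error."""
    n = factor_rets.shape[1]

    def squared_error(w):
        return np.sum((fund_rets - factor_rets @ w) ** 2)

    result = minimize(
        squared_error,
        x0=np.repeat(1 / n, n),          # start from equal weights
        method="SLSQP",
        bounds=[(0.0, 1.0)] * n,         # no negative weights
        constraints=[{"type": "eq", "fun": lambda w: np.sum(w) - 1}],
        options={"ftol": 1e-12},         # tight tolerance: returns are small numbers
    )
    return result.x

# Synthetic example: a fund that is a 60/30/10 mix of three factors plus noise
rng = np.random.default_rng(1)
factor_rets = rng.normal(0.005, 0.04, size=(72, 3))
true_weights = np.array([0.6, 0.3, 0.1])
fund_rets = factor_rets @ true_weights + rng.normal(0, 0.002, 72)

weights = style_weights(fund_rets, factor_rets)
print(weights.round(2))
```

Because the weights sum to 1, each one can be read as the share of the fund's behaviour attributed to that factor. That reading presumably only holds when the fund and factor returns are expressed in the same units.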

In [7]:
weights = pd.DataFrame()
for fund in funds_rets_m:
    weights[fund] = erk.style_analysis(funds_rets_m[fund], factors)
weights = weights.round(2)
weights

| | FCPVX | FCVIX | SHSSX | VDIGX | VSMVX |
|---|---|---|---|---|---|
| Mkt-RF | 0.02 | 0.02 | 0.02 | 0.02 | 0.02 |
| SMB | 0.18 | 0.18 | 0.17 | 0.17 | 0.18 |
| HML | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| RMW | 0.48 | 0.48 | 0.49 | 0.49 | 0.48 |
| CMA | 0.32 | 0.32 | 0.32 | 0.32 | 0.32 |


As a recap, here is what each fund is supposed to be:
- Fidelity Advisor Small Cap Value Fund Class I (FCVIX)
- Vanguard S&P Small-Cap 600 Value Index Fund Institutional Shares (VSMVX)
- BlackRock Health Sciences Opportunities Portfolio Institutional Shares (SHSSX)
- Fidelity Small Cap Value Fund (FCPVX)
- Vanguard Dividend Growth Fund Investor Shares (VDIGX)

