# Asset Pricing: Empirical Analysis #2

## Implementation of the APT approach proposed by Ross

Goal: Estimate the multi-beta relationship for a global stock index and a sectoral sub-index


APT rests on the idea that arbitrage opportunities cannot persist. If asset A is as risky as asset B but offers a higher return, demand for A rises rapidly and pushes its price up until its expected return equals that of B, which eliminates the arbitrage opportunity.

The other basic assumption of APT is that the expected return on a stock can be modelled as a linear function of various macroeconomic or sector-specific factors, each weighted by a beta coefficient that measures the stock's sensitivity to that factor.

These factors are diverse and can range from oil prices to US GDP, from European key rates to the exchange rate of a currency pair. These are all factors likely to influence the price of the asset under study.

The model proposed by Ross is therefore a multi-factor model: the return on an asset is related to several macroeconomic factors, such as inflation rates, interest rates and other economic indicators, and its expected return is a linear function of its exposures to these factors.
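
In symbols, the two assumptions above are usually written as follows. This is the textbook statement of the model rather than something specific to this notebook; $K$, $F_k$, $\beta_{ik}$, $\lambda_k$ and $\varepsilon_i$ are notation introduced here for illustration.

$$
R_i = E[R_i] + \sum_{k=1}^{K} \beta_{ik} F_k + \varepsilon_i,
\qquad
E[R_i] = \lambda_0 + \sum_{k=1}^{K} \beta_{ik} \lambda_k
$$

Here $F_k$ is the unexpected movement in factor $k$, $\beta_{ik}$ is the sensitivity of asset $i$ to that factor, $\lambda_k$ is the risk premium attached to factor $k$, and $\lambda_0$ is the return on an asset with zero exposure to every factor.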

The study period runs from 2014 to 2019.

**Factors are:**
* I - Inflation: log relative of the US Consumer Price Index
* TB - Treasury bill rate: end-of-period return on 1-month bills
* LGB - Long-term government bonds: return on long-term government bonds
* IP - Industrial production: industrial production during the month
* Baa - Low-grade bonds
* EWNY - Return on an equally weighted portfolio of NYSE-listed stocks
* VWNY - Return on a value-weighted portfolio of NYSE-listed stocks
* CG - Growth rate in real per capita consumption
* OG - Log relative of the producer price index / crude petroleum series

**Derived factors:**
* Monthly growth of IP: $MP(t) = \log(IP(t)/IP(t-1))$
* Annual growth of IP: $YP(t) = \log(IP(t)/IP(t-12))$
* Expected inflation: $E[I(t) \mid t-1]$
* Unexpected inflation: $UI(t) = I(t) - E[I(t) \mid t-1]$
* Ex post real interest rate: $RHO(t) = TB(t-1) - I(t)$
* Change in expected inflation: $DEI(t) = E[I(t+1) \mid t] - E[I(t) \mid t-1]$
* Risk premium: $UPR(t) = Baa(t) - LGB(t)$
* Term structure: $UTS(t) = LGB(t) - TB(t-1)$

To collect the data, we use the FRED API (through `fredapi`) and the Yahoo Finance API (through `yfinance`).

In [26]:
# basic libs
import pandas as pd
import numpy as np
from datetime import datetime

# stats libs
import statsmodels.api as sm
import statsmodels.regression.linear_model as lm

# import yahoo finance to collect stocks data
import yfinance as yf

In [2]:
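import os

from fredapi import Fred

# Get data from FRED.
# Assumption: the API key is stored in an environment variable named
# FRED_API_KEY instead of being hard-coded in the notebook.
def get_FRED_series(ticker):
    fred = Fred(api_key=os.environ["FRED_API_KEY"])
    data = fred.get_series(ticker)
    data = data.dropna()
    # get_series returns a Series, so the DataFrame has a single column named 0
    data = pd.DataFrame(data)
    data.index = pd.to_datetime(data.index)
    return data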

### Import the data

#### Inflation Factor (I)

Frequency: annual (forward-filled to monthly below)

Inflation, consumer prices for the United States (FPCPITOTLZGUSA)

In [3]:
I_factor = get_FRED_series("FPCPITOTLZGUSA")["2013":"2019"]
I_factor = I_factor.resample('M').ffill()
I_factor.index = I_factor.index + pd.DateOffset(days=1)
I_factor = I_factor.rename(columns={0: 'I_factor'})
I_factor

Unnamed: 0,I_factor
2013-02-01,1.464833
2013-03-01,1.464833
2013-04-01,1.464833
2013-05-01,1.464833
2013-06-01,1.464833
...,...
2018-10-01,2.442583
2018-11-01,2.442583
2018-12-01,2.442583
2019-01-01,2.442583
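
The factor list defines I as the log relative of the Consumer Price Index, whereas the series above is an annual inflation rate carried forward month by month. A minimal sketch of the log-relative version, assuming a monthly CPI level series is available (for example FRED's `CPIAUCSL`, which is not used in the original notebook). The index values below are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical monthly CPI levels (illustrative numbers, not real data)
cpi = pd.Series(
    [100.0, 100.2, 100.5, 100.4],
    index=pd.date_range("2014-01-01", periods=4, freq="MS"),
    name="CPI",
)

# Log relative: I(t) = log(CPI(t) / CPI(t-1)); the first month has no predecessor
I_log_relative = np.log(cpi / cpi.shift(1)).dropna().rename("I_factor")
print(I_log_relative)
```

With real data, `cpi` would come from a call such as `get_FRED_series("CPIAUCSL")`, and the result would already be monthly, so no forward-filling would be needed.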


#### Treasury bill rate factor (TB)

Frequency : daily 
Let's resample to monthly frequency

4-Week Treasury Bill Secondary Market Rate, Discount Basis (DTB4WK)

In [4]:
TB_factor = get_FRED_series("DTB4WK")["2013":"2019"].interpolate()
TB_factor = TB_factor.resample("M").mean()
TB_factor.index = TB_factor.index + pd.DateOffset(days=1)
TB_factor = TB_factor.rename(columns={0: 'TB_factor'})
TB_factor

Unnamed: 0,TB_factor
2013-02-01,0.051429
2013-03-01,0.076842
2013-04-01,0.077500
2013-05-01,0.049091
2013-06-01,0.020455
...,...
2019-09-01,2.031364
2019-10-01,1.952000
2019-11-01,1.696818
2019-12-01,1.553684
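
The resampling pattern used here (and for the other daily series) averages each calendar month, labels it with the month end, then adds one day. The January average therefore carries the label of 1 February. The sketch below, on synthetic data, contrasts that convention with labelling each average by the first day of its own month. Which alignment is intended is a modelling choice the notebook does not spell out. `"MS"` (month start) is used instead of `"M"` so that the snippet also runs on recent pandas versions, where `"M"` is deprecated.

```python
import numpy as np
import pandas as pd

# Synthetic daily series standing in for a FRED download (values are made up)
days = pd.date_range("2013-01-01", "2013-03-31", freq="D")
daily = pd.DataFrame({0: np.arange(len(days), dtype=float)}, index=days)

# Average of each calendar month, labelled with the first day of that month
own_month = daily.resample("MS").mean()

# Notebook convention: the same averages, labelled with the first day of the
# following month (equivalent to month end + 1 day)
next_month = own_month.copy()
next_month.index = next_month.index + pd.DateOffset(months=1)

print(own_month.index[0].date(), next_month.index[0].date())
```

Whichever convention is chosen, it should be the same for every series, since FRED's monthly series (such as `INDPRO`) are already labelled with the first day of the month they describe.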


#### Long-term government bonds factor (LGB)

Frequency : Daily

Let's resample to monthly frequency

Market Yield on U.S. Treasury Securities at 30-Year Constant Maturity, Quoted on an Investment Basis (DGS30)

In [5]:
LGB_factor = get_FRED_series("DGS30")["2013":"2019"].interpolate()
LGB_factor =  LGB_factor.resample("M").mean()
LGB_factor.index = LGB_factor.index + pd.DateOffset(days=1)
LGB_factor = LGB_factor.rename(columns={0: 'LGB_factor'})
LGB_factor

Unnamed: 0,LGB_factor
2013-02-01,3.080476
2013-03-01,3.165263
2013-04-01,3.162500
2013-05-01,2.932727
2013-06-01,3.112727
...,...
2019-09-01,2.119091
2019-10-01,2.158000
2019-11-01,2.190455
2019-12-01,2.280526


#### Industrial Production factor (IP)

Frequency : Monthly

Industrial Production: Total Index

The industrial production (IP) index measures the real output of all relevant establishments located in the United States, regardless of their ownership, but not those located in U.S. territories.

In [6]:
IP_factor = get_FRED_series("INDPRO")["2014":"2020"]
IP_factor = IP_factor.rename(columns={0: 'IP_factor'})
IP_factor

Unnamed: 0,IP_factor
2014-01-01,99.9990
2014-02-01,100.7583
2014-03-01,101.7767
2014-04-01,101.8425
2014-05-01,102.2594
...,...
2020-08-01,95.8881
2020-09-01,95.8444
2020-10-01,96.4292
2020-11-01,96.8564


#### Low grade bonds factor (Baa)

Frequency : Monthly

Moody's Seasoned Baa Corporate Bond Yield
Financial instruments are based on bonds with maturities 20 years and above.

In [7]:
Baa_factor = get_FRED_series("BAA")["2014":"2020"]
Baa_factor = Baa_factor.rename(columns={0: 'Baa_factor'})
Baa_factor

Unnamed: 0,Baa_factor
2014-01-01,5.19
2014-02-01,5.10
2014-03-01,5.06
2014-04-01,4.90
2014-05-01,4.76
...,...
2020-08-01,3.27
2020-09-01,3.36
2020-10-01,3.44
2020-11-01,3.30


#### Equally weighted equities factor (EWNY)

Frequency : Daily
Let's resample to monthly frequency

Invesco Equally-Wtd S&P 500 A 

In [8]:
EWNY_factor = yf.download("VADAX", start = "2012-01-01", end = "2019-12-31")['Adj Close']
EWNY_factor = pd.DataFrame(EWNY_factor)
EWNY_factor =  EWNY_factor.resample("M").mean()
EWNY_factor = EWNY_factor.loc["2010":"2020"]
EWNY_factor.index = EWNY_factor.index + pd.DateOffset(days=1)
EWNY_factor = EWNY_factor.rename(columns={'Adj Close': 'EWNY_factor'})
EWNY_factor


Date,EWNY_factor
2012-02-01,18.732402
2012-03-01,19.662437
2012-04-01,20.029885
2012-05-01,19.866583
2012-06-01,19.158003
...,...
2019-09-01,44.961273
2019-10-01,46.626998
2019-11-01,46.395535
2019-12-01,48.360168


#### Value weighted equities factor (VWNY)

Frequency : Daily
Let's resample to monthly frequency

S&P 500 (SP500), a market-capitalization-weighted index

In [9]:
VWNY_factor = get_FRED_series("SP500")["2013":"2019"]
VWNY_factor = VWNY_factor.resample("M").mean()
VWNY_factor.index = VWNY_factor.index + pd.DateOffset(days=1)
VWNY_factor = VWNY_factor.rename(columns={0: 'VWNY_factor'})
VWNY_factor

Unnamed: 0,VWNY_factor
2013-12-01,1797.783000
2014-01-01,1807.775238
2014-02-01,1822.356667
2014-03-01,1817.034737
2014-04-01,1863.523333
...,...
2019-09-01,2897.498182
2019-10-01,2982.156000
2019-11-01,2977.675217
2019-12-01,3104.904500


#### Consumption factor (CG)

Frequency: quarterly (forward-filled to monthly below)

Real personal consumption expenditures per capita (A794RX0Q048SBEA)

In [10]:
CG_factor = get_FRED_series("A794RX0Q048SBEA")["2013":"2020"].pct_change().dropna()
CG_factor = CG_factor.resample("M").ffill()
CG_factor.index = CG_factor.index + pd.DateOffset(days=1)
CG_factor = CG_factor.rename(columns={0: 'CG_factor'})
CG_factor

Unnamed: 0,CG_factor
2013-05-01,0.001151
2013-06-01,0.001151
2013-07-01,0.001151
2013-08-01,0.001978
2013-09-01,0.001978
...,...
2020-07-01,-0.086576
2020-08-01,0.088262
2020-09-01,0.088262
2020-10-01,0.088262


#### Oil price factor (OG)

Frequency: Monthly

Spot Crude Oil Price: West Texas Intermediate (WTI) (WTISPLC)

In [11]:
OG_factor = get_FRED_series("WTISPLC")["2010":"2020"].pct_change().dropna()
OG_factor = OG_factor.rename(columns={0: 'OG_factor'})
OG_factor

Unnamed: 0,OG_factor
2010-02-01,-0.023012
2010-03-01,0.063072
2010-04-01,0.039882
2010-05-01,-0.125947
2010-06-01,0.020450
...,...
2020-08-01,0.040039
2020-09-01,-0.064006
2020-10-01,-0.005804
2020-11-01,0.039086


#### EI factor (Expected inflation)

Frequency: Monthly

Median expected price change next 12 months, Surveys of Consumers. The most recent value is not shown due to an agreement with the source.

Source : University of Michigan: Inflation Expectation (MICH)

In [12]:
EI_factor = get_FRED_series("MICH")["2014":"2020"]
EI_factor = EI_factor.rename(columns={0: 'EI_factor'})
EI_factor

Unnamed: 0,EI_factor
2014-01-01,3.1
2014-02-01,3.2
2014-03-01,3.2
2014-04-01,3.2
2014-05-01,3.3
...,...
2020-08-01,3.1
2020-09-01,2.6
2020-10-01,2.6
2020-11-01,2.8


In [16]:
list_df = [I_factor,TB_factor,LGB_factor,IP_factor,Baa_factor,EWNY_factor,VWNY_factor,CG_factor,OG_factor,EI_factor]

merged_df = pd.merge(I_factor, TB_factor, left_index=True, right_index=True)
merged_df = pd.merge(merged_df, LGB_factor, left_index=True, right_index=True)
merged_df = pd.merge(merged_df, IP_factor, left_index=True, right_index=True)
merged_df = pd.merge(merged_df, Baa_factor, left_index=True, right_index=True)
merged_df = pd.merge(merged_df, EWNY_factor, left_index=True, right_index=True)
merged_df = pd.merge(merged_df, VWNY_factor, left_index=True, right_index=True)
merged_df = pd.merge(merged_df, CG_factor, left_index=True, right_index=True)
merged_df = pd.merge(merged_df, OG_factor, left_index=True, right_index=True)
merged_df = pd.merge(merged_df, EI_factor, left_index=True, right_index=True)
merged_df

Unnamed: 0,I_factor,TB_factor,LGB_factor,IP_factor,Baa_factor,EWNY_factor,VWNY_factor,CG_factor,OG_factor,EI_factor
2014-01-01,1.464833,0.017143,3.889048,99.9990,5.19,27.764893,1807.775238,0.006670,-0.030831,3.1
2014-02-01,1.622223,0.016667,3.769048,100.7583,5.10,28.103847,1822.356667,0.001776,0.065525,3.2
2014-03-01,1.622223,0.046842,3.662632,101.7767,5.06,28.257299,1817.034737,0.001776,-0.000198,3.2
2014-04-01,1.622223,0.051429,3.620952,101.8425,4.90,29.104840,1863.523333,0.001776,0.012599,3.2
2014-05-01,1.622223,0.023333,3.517619,102.2594,4.76,29.115611,1864.263333,0.007778,0.001078,3.3
...,...,...,...,...,...,...,...,...,...,...
2018-10-01,2.442583,2.000526,3.151053,103.9397,5.07,45.449986,2901.500526,0.003037,0.007404,2.9
2018-11-01,2.442583,2.139091,3.339545,104.0007,5.22,43.194252,2785.464783,0.001658,-0.194912,2.8
2018-12-01,2.442583,2.192500,3.361000,103.9946,5.13,42.759029,2723.229524,0.001658,-0.130618,2.7
2019-01-01,2.442583,2.323684,3.095789,103.3730,5.12,40.273489,2567.307368,0.001658,0.037561,2.7
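
The chain of `pd.merge` calls above performs an inner join on the date index, so only dates present in every factor table survive. The same operation can be sketched more compactly by folding over a list of frames. The three small frames below are made-up stand-ins for the factor tables.

```python
from functools import reduce

import pandas as pd

idx = pd.date_range("2014-01-01", periods=4, freq="MS")
a = pd.DataFrame({"A_factor": [1.0, 2.0, 3.0, 4.0]}, index=idx)
b = pd.DataFrame({"B_factor": [10.0, 20.0, 30.0]}, index=idx[1:])  # starts one month later
c = pd.DataFrame({"C_factor": [0.1, 0.2, 0.3]}, index=idx[:3])     # ends one month earlier

frames = [a, b, c]

# Inner join on the index, applied pairwise from left to right
merged = reduce(
    lambda left, right: pd.merge(left, right, left_index=True, right_index=True),
    frames,
)
print(merged)
```

In the notebook, `frames` would be the existing `list_df`. The shortest series sets the sample: here only the two months common to all three frames remain.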


### Add Fama Factors

In [17]:
df_fama_5 = pd.read_csv('data/fama_french_5_factors.csv', skiprows=3, header=0, names=['Date', 'Mkt-RF', 'SMB', 'HML', 'RMW', 'CMA', 'RF'])

# Convert the 'Date' column to datetime and set it as the index
df_fama_5['Date'] = pd.to_datetime(df_fama_5['Date'], format='%Y%m%d')
df_fama_5.set_index('Date', inplace=True)

df_fama_5

Date,Mkt-RF,SMB,HML,RMW,CMA,RF
1963-07-01,-0.67,0.02,-0.35,0.03,0.13,0.012
1963-07-02,0.79,-0.28,0.28,-0.08,-0.21,0.012
1963-07-03,0.63,-0.18,-0.10,0.13,-0.25,0.012
1963-07-05,0.40,0.09,-0.28,0.07,-0.30,0.012
1963-07-08,-0.63,0.07,-0.20,-0.27,0.06,0.012
...,...,...,...,...,...,...
2023-07-25,0.25,-0.23,-0.79,0.47,-0.41,0.022
2023-07-26,0.02,0.87,1.03,-0.35,0.65,0.022
2023-07-27,-0.74,-0.80,0.27,0.38,0.14,0.022
2023-07-28,1.14,0.41,-0.33,-0.75,-0.40,0.022


In [18]:
fama_factors = df_fama_5[["Mkt-RF", "SMB", "HML", "RMW", "CMA"]]
fama_factors

Date,Mkt-RF,SMB,HML,RMW,CMA
1963-07-01,-0.67,0.02,-0.35,0.03,0.13
1963-07-02,0.79,-0.28,0.28,-0.08,-0.21
1963-07-03,0.63,-0.18,-0.10,0.13,-0.25
1963-07-05,0.40,0.09,-0.28,0.07,-0.30
1963-07-08,-0.63,0.07,-0.20,-0.27,0.06
...,...,...,...,...,...
2023-07-25,0.25,-0.23,-0.79,0.47,-0.41
2023-07-26,0.02,0.87,1.03,-0.35,0.65
2023-07-27,-0.74,-0.80,0.27,0.38,0.14
2023-07-28,1.14,0.41,-0.33,-0.75,-0.40


In [19]:
merged_df = pd.merge(merged_df, fama_factors, left_index=True, right_index=True)
merged_df

Unnamed: 0,I_factor,TB_factor,LGB_factor,IP_factor,Baa_factor,EWNY_factor,VWNY_factor,CG_factor,OG_factor,EI_factor,Mkt-RF,SMB,HML,RMW,CMA
2014-04-01,1.622223,0.051429,3.620952,101.8425,4.9,29.10484,1863.523333,0.001776,0.012599,3.2,0.87,0.65,-0.37,-0.18,-0.29
2014-05-01,1.622223,0.023333,3.517619,102.2594,4.76,29.115611,1864.263333,0.007778,0.001078,3.3,0.04,-0.18,-0.16,-0.5,-0.16
2014-07-01,1.622223,0.024286,3.42,102.8163,4.73,30.493862,1947.087619,0.007778,-0.020796,3.3,0.74,0.43,-0.39,0.07,-0.1
2014-08-01,1.622223,0.024091,3.331818,102.6562,4.69,30.818152,1973.1,0.007692,-0.068057,3.2,-0.32,-0.23,-0.05,0.09,0.16
2014-10-01,1.622223,0.011429,3.26,102.9892,4.69,31.084982,1993.22619,0.007692,-0.094518,2.9,-1.39,-0.14,0.3,0.2,-0.07
2014-12-01,1.622223,0.041667,3.038333,103.6345,4.74,31.917171,2044.572105,0.009327,-0.217707,2.8,-0.89,-0.89,0.62,0.1,0.48
2015-04-01,0.118627,0.022727,2.626364,101.244,4.48,32.897358,2079.990455,0.005937,0.138645,2.6,-0.38,0.34,0.44,-0.17,0.24
2015-05-01,0.118627,0.015909,2.585909,100.783,4.89,33.284009,2094.862857,0.00508,0.088522,2.8,1.01,-0.31,-0.6,0.25,-0.1
2015-06-01,0.118627,0.013,2.955,100.4781,5.13,33.391661,2111.9435,0.00508,0.00928,2.7,0.17,-0.06,-0.22,0.24,-0.34
2015-07-01,0.118627,0.004091,3.111818,101.1052,5.2,33.126581,2099.283636,0.00508,-0.149114,2.8,0.61,-0.76,-0.03,0.21,0.08
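
Note that the Fama-French file loaded above is daily. The inner join therefore keeps only the months whose first calendar day appears in the daily file (June 2014, for instance, is missing from the table) and pairs each monthly macro value with a single day's factor return. One possible remedy, sketched here on made-up numbers, is to aggregate the daily factors to monthly values before merging. Compounding daily percentage returns is an assumption of this sketch; for long-short factors it is only an approximation, and a monthly version of the factor file would be another option.

```python
import pandas as pd

# Made-up daily factor returns in percent, mimicking the layout of df_fama_5
dates = pd.to_datetime(["2014-05-30", "2014-06-02", "2014-06-03", "2014-07-01"])
daily = pd.DataFrame(
    {"Mkt-RF": [0.5, 1.0, 2.0, -1.0], "SMB": [0.1, -0.5, 0.5, 0.2]},
    index=dates,
)

# Compound the daily returns within each month and label with the month start
monthly = ((1 + daily / 100).resample("MS").prod() - 1) * 100
print(monthly.round(4))
```

Every calendar month now has exactly one row, so merging `monthly` with the macro factors no longer drops months for calendar reasons.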


### Compute derived factors

* Monthly growth of IP: $MP(t) = \log(IP(t)/IP(t-1))$
* Annual growth of IP: $YP(t) = \log(IP(t)/IP(t-12))$
* Expected inflation: $E[I(t) \mid t-1]$
* Unexpected inflation: $UI(t) = I(t) - E[I(t) \mid t-1]$
* Ex post real interest rate: $RHO(t) = TB(t-1) - I(t)$
* Change in expected inflation: $DEI(t) = E[I(t+1) \mid t] - E[I(t) \mid t-1]$
* Risk premium: $UPR(t) = Baa(t) - LGB(t)$
* Term structure: $UTS(t) = LGB(t) - TB(t-1)$

In [20]:
merged_df["MP_derived_factor"] = np.log(merged_df["IP_factor"] / merged_df["IP_factor"].shift(1))
merged_df

Unnamed: 0,I_factor,TB_factor,LGB_factor,IP_factor,Baa_factor,EWNY_factor,VWNY_factor,CG_factor,OG_factor,EI_factor,Mkt-RF,SMB,HML,RMW,CMA,MP_derived_factor
2014-04-01,1.622223,0.051429,3.620952,101.8425,4.9,29.10484,1863.523333,0.001776,0.012599,3.2,0.87,0.65,-0.37,-0.18,-0.29,
2014-05-01,1.622223,0.023333,3.517619,102.2594,4.76,29.115611,1864.263333,0.007778,0.001078,3.3,0.04,-0.18,-0.16,-0.5,-0.16,0.004085
2014-07-01,1.622223,0.024286,3.42,102.8163,4.73,30.493862,1947.087619,0.007778,-0.020796,3.3,0.74,0.43,-0.39,0.07,-0.1,0.005431
2014-08-01,1.622223,0.024091,3.331818,102.6562,4.69,30.818152,1973.1,0.007692,-0.068057,3.2,-0.32,-0.23,-0.05,0.09,0.16,-0.001558
2014-10-01,1.622223,0.011429,3.26,102.9892,4.69,31.084982,1993.22619,0.007692,-0.094518,2.9,-1.39,-0.14,0.3,0.2,-0.07,0.003239
2014-12-01,1.622223,0.041667,3.038333,103.6345,4.74,31.917171,2044.572105,0.009327,-0.217707,2.8,-0.89,-0.89,0.62,0.1,0.48,0.006246
2015-04-01,0.118627,0.022727,2.626364,101.244,4.48,32.897358,2079.990455,0.005937,0.138645,2.6,-0.38,0.34,0.44,-0.17,0.24,-0.023337
2015-05-01,0.118627,0.015909,2.585909,100.783,4.89,33.284009,2094.862857,0.00508,0.088522,2.8,1.01,-0.31,-0.6,0.25,-0.1,-0.004564
2015-06-01,0.118627,0.013,2.955,100.4781,5.13,33.391661,2111.9435,0.00508,0.00928,2.7,0.17,-0.06,-0.22,0.24,-0.34,-0.00303
2015-07-01,0.118627,0.004091,3.111818,101.1052,5.2,33.126581,2099.283636,0.00508,-0.149114,2.8,0.61,-0.76,-0.03,0.21,0.08,0.006222


In [22]:
merged_df["MP_derived_factor"] = np.log(merged_df["IP_factor"] / merged_df["IP_factor"].shift(1))

merged_df["YP_derived_factor"] = np.log(merged_df["IP_factor"]/merged_df["IP_factor"].shift(12))

merged_df["UI_derived_factor"] = merged_df["I_factor"] - merged_df["EI_factor"].shift(1)

merged_df["RHO_derived_factor"] = merged_df["TB_factor"].shift(1)-merged_df["I_factor"]

merged_df["DEI_derived_factor"] = merged_df["EI_factor"] - merged_df["EI_factor"].shift(1)

merged_df["UPR_derived_factor"] = merged_df["Baa_factor"] - merged_df["LGB_factor"]

merged_df["UTS_derived_factor"] = merged_df["LGB_factor"] - merged_df["TB_factor"].shift(1)

merged_df.dropna(inplace=True)
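
`shift(1)` and `shift(12)` move the data by rows, not by calendar months. After the merge with the Fama factors the index has gaps (the table goes from 2014-05-01 straight to 2014-07-01), so `MP` and `YP` can compare months that are not one or twelve months apart. A minimal sketch of one way to guard against this, assuming the index is meant to hold month-start dates: reindex to a complete monthly range before shifting, so that a missing month produces `NaN` instead of a silently different growth rate. The IP values are copied from the table above.

```python
import numpy as np
import pandas as pd

# IP values for three non-consecutive months (2014-06 is missing)
ip = pd.Series(
    [101.8425, 102.2594, 102.8163],
    index=pd.to_datetime(["2014-04-01", "2014-05-01", "2014-07-01"]),
    name="IP_factor",
)

# Row-based shift: the July value is compared with May
mp_rows = np.log(ip / ip.shift(1))

# Calendar-based shift: insert the missing month as NaN first
ip_full = ip.asfreq("MS")
mp_calendar = np.log(ip_full / ip_full.shift(1))

print(mp_rows)
print(mp_calendar)
```

The row-based value for July reproduces the `MP_derived_factor` shown in the table, which is in fact a two-month growth rate; the calendar-based version marks it as missing.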

In [23]:
merged_df

Unnamed: 0,I_factor,TB_factor,LGB_factor,IP_factor,Baa_factor,EWNY_factor,VWNY_factor,CG_factor,OG_factor,EI_factor,...,HML,RMW,CMA,MP_derived_factor,YP_derived_factor,UI_derived_factor,RHO_derived_factor,DEI_derived_factor,UPR_derived_factor,UTS_derived_factor
2015-12-01,0.118627,0.068947,3.03,98.939,5.46,32.149798,2080.6165,0.001956,-0.123704,2.6,...,0.25,-0.08,-0.11,-0.012536,-0.028924,-2.581373,-0.119103,-0.1,2.43,3.030476
2016-02-01,1.261583,0.224211,2.858421,98.9136,5.34,29.352912,1918.597895,0.005756,-0.042929,2.5,...,-1.0,0.06,-0.35,-0.000257,-0.033266,-1.338417,-1.192636,-0.1,2.481579,2.789474
2016-03-01,1.261583,0.251,2.623,98.1907,5.13,29.295104,1904.4185,0.005756,0.238456,2.7,...,0.39,-0.58,-0.61,-0.007335,-0.046032,-1.238417,-1.037373,0.2,2.507,2.398789
2016-04-01,1.261583,0.250909,2.684545,98.4669,4.79,31.709133,2021.954091,0.005756,0.08522,2.8,...,-0.62,-0.39,-0.1,0.002809,-0.041665,-1.438417,-1.010583,0.1,2.105455,2.433545
2016-06-01,1.261583,0.221429,2.627619,98.7275,4.53,32.607006,2065.550952,0.003151,0.043888,2.6,...,-0.2,-0.29,0.02,0.002643,-0.042261,-1.538417,-1.010674,-0.2,1.902381,2.37671
2016-07-01,1.261583,0.218636,2.452273,98.836,4.22,33.052938,2083.891364,0.003151,-0.08429,2.7,...,-0.43,-0.07,0.27,0.001098,-0.047408,-1.338417,-1.040155,0.1,1.767727,2.230844
2016-08-01,1.261583,0.2585,2.227,98.7554,4.24,34.094876,2148.902,0.004926,0.001568,2.5,...,-0.9,0.47,-0.74,-0.000816,-0.024887,-1.438417,-1.042947,-0.2,2.013,2.008364
2016-09-01,1.261583,0.257826,2.261739,98.6596,4.31,34.637547,2177.482174,0.004926,0.010286,2.4,...,-0.5,0.1,-0.14,-0.000971,-0.021294,-1.238417,-1.003083,-0.1,2.048261,2.003239
2016-11-01,1.261583,0.236,2.5005,98.3452,4.71,34.071153,2143.020952,0.003201,-0.082764,2.4,...,0.2,-0.5,-0.06,-0.003192,-0.021456,-1.138417,-1.003757,0.0,2.2095,2.242674
2016-12-01,1.261583,0.2895,2.862,99.0314,4.83,34.725016,2164.985714,0.003201,0.138195,2.2,...,2.03,0.31,0.62,0.006953,-0.020725,-1.138417,-1.025583,-0.2,1.968,2.626


In [24]:
derived_factor = ["MP_derived_factor","YP_derived_factor","UI_derived_factor","RHO_derived_factor","DEI_derived_factor","UPR_derived_factor","UTS_derived_factor","EI_factor","Mkt-RF","SMB","HML"]
df_derived_factor = merged_df[derived_factor]
df_derived_factor

Unnamed: 0,MP_derived_factor,YP_derived_factor,UI_derived_factor,RHO_derived_factor,DEI_derived_factor,UPR_derived_factor,UTS_derived_factor,EI_factor,Mkt-RF,SMB,HML
2015-12-01,-0.012536,-0.028924,-2.581373,-0.119103,-0.1,2.43,3.030476,2.6,0.97,-0.63,0.25
2016-02-01,-0.000257,-0.033266,-1.338417,-1.192636,-0.1,2.481579,2.789474,2.5,-0.04,-0.3,-1.0
2016-03-01,-0.007335,-0.046032,-1.238417,-1.037373,0.2,2.507,2.398789,2.7,2.34,-0.65,0.39
2016-04-01,0.002809,-0.041665,-1.438417,-1.010583,0.1,2.105455,2.433545,2.8,0.64,-0.32,-0.62
2016-06-01,0.002643,-0.042261,-1.538417,-1.010674,-0.2,1.902381,2.37671,2.6,0.2,0.63,-0.2
2016-07-01,0.001098,-0.047408,-1.338417,-1.040155,0.1,1.767727,2.230844,2.7,0.24,0.47,-0.43
2016-08-01,-0.000816,-0.024887,-1.438417,-1.042947,-0.2,2.013,2.008364,2.5,-0.16,0.06,-0.9
2016-09-01,-0.000971,-0.021294,-1.238417,-1.003083,-0.1,2.048261,2.003239,2.4,0.03,0.07,-0.5
2016-11-01,-0.003192,-0.021456,-1.138417,-1.003757,0.0,2.2095,2.242674,2.4,-0.68,-0.38,0.2
2016-12-01,0.006953,-0.020725,-1.138417,-1.025583,-0.2,1.968,2.626,2.2,-0.36,-0.39,2.03


#### Fit the following regression

$VWNY(t) = \beta_1 MP(t) + \beta_2 YP(t) + \beta_3 UI(t) + \beta_4 RHO(t) + \beta_5 DEI(t) + \beta_6 UPR(t) + \beta_7 UTS(t) + \beta_8 EI(t) + \beta_9 (Mkt-RF)(t) + \beta_{10} SMB(t) + \beta_{11} HML(t) + \varepsilon(t)$

In [28]:
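# Dependent variable: monthly average level of the S&P 500 (VWNY proxy)
y = merged_df["VWNY_factor"]
# Regressors: derived macro factors plus Fama-French factors (no intercept is added)
X = df_derived_factor
apt = lm.OLS(y, X).fit()
apt.summary()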

Dep. Variable:,VWNY_factor,R-squared (uncentered):,0.999
Model:,OLS,Adj. R-squared (uncentered):,0.998
Method:,Least Squares,F-statistic:,1427.0
Date:,"Wed, 15 Nov 2023",Prob (F-statistic):,1.45e-20
Time:,23:35:02,Log-Likelihood:,-148.91
No. Observations:,26,AIC:,319.8
Df Residuals:,15,BIC:,333.7
Df Model:,11,,
Covariance Type:,nonrobust,,

,coef,std err,t,P>|t|,[0.025,0.975]
MP_derived_factor,-5777.2154,4150.184,-1.392,0.184,-1.46e+04,3068.693
YP_derived_factor,3840.7767,1819.132,2.111,0.052,-36.612,7718.166
UI_derived_factor,231.1141,142.253,1.625,0.125,-72.091,534.319
RHO_derived_factor,49.0936,143.035,0.343,0.736,-255.778,353.965
DEI_derived_factor,-731.9672,189.680,-3.859,0.002,-1136.260,-327.674
UPR_derived_factor,-12.9581,108.607,-0.119,0.907,-244.447,218.531
UTS_derived_factor,74.0988,106.424,0.696,0.497,-152.739,300.937
EI_factor,948.3188,92.073,10.300,0.000,752.069,1144.568
Mkt-RF,-67.4256,29.380,-2.295,0.037,-130.048,-4.803

Omnibus:,0.851,Durbin-Watson:,1.574
Prob(Omnibus):,0.653,Jarque-Bera (JB):,0.856
Skew:,0.273,Prob(JB):,0.652
Kurtosis:,2.299,Cond. No.,885.0
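
The fit above regresses the index level on the factors without an intercept, which is why statsmodels reports an uncentered R-squared. With a dependent variable in the thousands and no constant, a value close to 1 says little about explanatory power. Since the factor list defines VWNY as a return, one variant worth comparing is a regression of log returns with an intercept. The sketch below illustrates that setup on simulated data using plain NumPy least squares. The number of factors, the sample size and the coefficients are invented for the example; in the notebook the analogous call would be `sm.OLS(y, sm.add_constant(X))`.

```python
import numpy as np

rng = np.random.default_rng(0)
n_obs = 60

# Simulated factor surprises (two hypothetical factors) and index levels
factors = rng.normal(size=(n_obs, 2))
true_alpha, true_betas = 0.005, np.array([0.02, -0.01])
returns = true_alpha + factors @ true_betas + rng.normal(scale=0.001, size=n_obs)
levels = 2000.0 * np.exp(np.cumsum(returns))

# Dependent variable: log return of the index, r(t) = log(P(t) / P(t-1))
log_ret = np.diff(np.log(levels))

# Design matrix with an explicit intercept column, aligned with the returns
X = np.column_stack([np.ones(n_obs - 1), factors[1:]])

# Ordinary least squares: coef[0] is the intercept, coef[1:] are the betas
coef, *_ = np.linalg.lstsq(X, log_ret, rcond=None)
print(coef)
```

The first log return is lost to differencing, so the factor rows are trimmed by one to stay aligned with the returns.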
