Quantitative Finance Project
By: Varun Gopal, Tyler Dixon, and Abhinav Kakumanu

The objective of this project is to apply the methods learned in class to market data.

Our project examines economic data and builds a predictive model for industry returns.

In [2]:
# Import the core libraries

import numpy as np
import pandas as pd
import matplotlib as mpl
import matplotlib.pyplot as plt

import pandas_datareader as pdr
import pandas_datareader.famafrench

In [3]:
# List the available Fama-French datasets
pandas_datareader.famafrench.get_available_datasets()

['F-F_Research_Data_Factors',
 'F-F_Research_Data_Factors_weekly',
 'F-F_Research_Data_Factors_daily',
 'F-F_Research_Data_5_Factors_2x3',
 'F-F_Research_Data_5_Factors_2x3_daily',
 'Portfolios_Formed_on_ME',
 'Portfolios_Formed_on_ME_Wout_Div',
 'Portfolios_Formed_on_ME_Daily',
 'Portfolios_Formed_on_BE-ME',
 'Portfolios_Formed_on_BE-ME_Wout_Div',
 'Portfolios_Formed_on_BE-ME_Daily',
 'Portfolios_Formed_on_OP',
 'Portfolios_Formed_on_OP_Wout_Div',
 'Portfolios_Formed_on_OP_Daily',
 'Portfolios_Formed_on_INV',
 'Portfolios_Formed_on_INV_Wout_Div',
 'Portfolios_Formed_on_INV_Daily',
 '6_Portfolios_2x3',
 '6_Portfolios_2x3_Wout_Div',
 '6_Portfolios_2x3_weekly',
 '6_Portfolios_2x3_daily',
 '25_Portfolios_5x5',
 '25_Portfolios_5x5_Wout_Div',
 '25_Portfolios_5x5_Daily',
 '100_Portfolios_10x10',
 '100_Portfolios_10x10_Wout_Div',
 '100_Portfolios_10x10_Daily',
 '6_Portfolios_ME_OP_2x3',
 '6_Portfolios_ME_OP_2x3_Wout_Div',
 '6_Portfolios_ME_OP_2x3_daily',
 '25_Portfolios_ME_OP_5x5',
 '25_Portf

In [4]:
ff = pdr.get_data_famafrench('10_Industry_Portfolios', 1926)

In [22]:
type(ff)

dict

In [24]:
ff.keys()

dict_keys([0, 1, 2, 3, 4, 5, 6, 7, 'DESCR'])

In [29]:
# Table 0 holds the monthly industry returns (in percent)
ff[0]

Date,NoDur,Durbl,Manuf,Enrgy,HiTec,Telcm,Shops,Hlth,Utils,Other
1926-07,1.45,15.55,4.69,-1.18,2.90,0.83,0.11,1.77,7.04,2.13
1926-08,3.97,3.68,2.81,3.47,2.66,2.17,-0.71,4.25,-1.69,4.35
1926-09,1.14,4.80,1.15,-3.39,-0.38,2.41,0.21,0.69,2.04,0.29
1926-10,-1.24,-8.23,-3.63,-0.78,-4.58,-0.11,-2.29,-0.57,-2.63,-2.84
1926-11,5.20,-0.19,4.10,0.01,4.71,1.63,6.43,5.42,3.71,2.11
...,...,...,...,...,...,...,...,...,...,...
2022-10,9.94,-6.27,12.22,23.60,4.85,10.94,4.30,8.84,3.50,11.60
2022-11,5.27,-7.57,9.02,0.97,5.24,2.32,3.86,5.46,6.82,6.00
2022-12,-2.67,-27.47,-2.36,-4.16,-7.88,-6.76,-7.96,-1.73,-1.15,-5.27
2023-01,-0.28,28.35,5.95,2.87,9.81,13.45,9.74,-1.02,-1.26,7.08


In [34]:
industry_name = [i for i in ff[0].columns]
industry_name

['NoDur',
 'Durbl',
 'Manuf',
 'Enrgy',
 'HiTec',
 'Telcm',
 'Shops',
 'Hlth ',
 'Utils',
 'Other']
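
The list above shows 'Hlth ' with a trailing space, which makes lookups such as ff[0]['Hlth'] fail. One possible fix is to strip whitespace from the column labels before using them. The sketch below is a minimal illustration, assuming a one-row stand-in (`sample`, a hypothetical name) built from the 1926-07 row shown earlier instead of the downloaded table.

```python
import pandas as pd

# Column labels copied from the output above; note the trailing space in 'Hlth '.
raw_columns = ['NoDur', 'Durbl', 'Manuf', 'Enrgy', 'HiTec',
               'Telcm', 'Shops', 'Hlth ', 'Utils', 'Other']

# One-row stand-in for ff[0] (assumption: values taken from the 1926-07 row above).
sample = pd.DataFrame(
    [[1.45, 15.55, 4.69, -1.18, 2.90, 0.83, 0.11, 1.77, 7.04, 2.13]],
    index=pd.PeriodIndex(['1926-07'], freq='M', name='Date'),
    columns=raw_columns,
)

# Strip leading and trailing whitespace from every column label.
sample.columns = sample.columns.str.strip()
industry_name = list(sample.columns)
print(industry_name)
```

Applying the same `.columns.str.strip()` call to ff[0] should give clean labels for the rest of the notebook.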

We now have monthly returns from 1926 onward for ten industry portfolios. The industries are:
1. Consumer Nondurables -- Food, Tobacco, Textiles, Apparel, Leather, Toys
2. Consumer Durables -- Cars, TVs, Furniture, Household Appliances
3. Manufacturing -- Machinery, Trucks, Planes, Chemicals, Off Furn, Paper, Com Printing
4. Oil, Gas, and Coal Extraction and Products
5. Business Equipment -- Computers, Software, and Electronic Equipment
6. Telephone and Television Transmission
7. Wholesale, Retail, and Some Services (Laundries, Repair Shops)
8. Healthcare, Medical Equipment, and Drugs
9. Utilities
10. Other -- Mines, Constr, BldMt, Trans, Hotels, Bus Serv, Entertainment, Finance

The next step is to gather key economic data from the Federal Reserve and line it up with the industry returns.
The first question is: which variables do we want to include?

We will do our best to choose variables that affect the economy broadly rather than one specific industry. Some data may still skew towards a particular industry; where that happens, we will highlight the bias.

According to https://groww.in/blog/macroeconomic-factors-that-influence-us-stock-markets, the main macroeconomic factors that influence US stock markets are:
1. Gross Domestic Product (GDP)
2. Inflation
3. Unemployment Rate (Payrolls)
4. Retail Sales
5. Industrial Output 

We will also get market data on interest rates, corporate profits, and corporate debt.


We get the data from Federal Reserve Economic Data (FRED).

Another hypothesis is that markets react quickly to economic indicators. Since indicators such as GDP are lagging, we will need market-based measures (such as spreads) to help understand how stocks react to the prospect of economic events.

In [7]:
# Real GDP (GDPC1); we use real rather than nominal GDP because inflation is examined separately
rgdp = pdr.get_data_fred('GDPC1', 1950)

In [8]:
rgdp.pct_change().dropna()

DATE,GDPC1
1950-04-01,0.030498
1950-07-01,0.038644
1950-10-01,0.019148
1951-01-01,0.013582
1951-04-01,0.017327
...,...
2021-10-01,0.016957
2022-01-01,-0.004103
2022-04-01,-0.001446
2022-07-01,0.008012
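
Real GDP is quarterly while the industry returns are monthly, so the two need a common frequency before they can be compared. One option is to compound the monthly returns within each calendar quarter. The sketch below is an illustration under stated assumptions: `monthly` is a hypothetical three-row stand-in built from the 2022-10 to 2022-12 rows of the industry table shown earlier, not the full download.

```python
import pandas as pd

# Monthly returns in percent, copied from the industry table shown earlier.
monthly = pd.DataFrame(
    {"NoDur": [9.94, 5.27, -2.67], "Enrgy": [23.60, 0.97, -4.16]},
    index=pd.period_range("2022-10", periods=3, freq="M", name="Date"),
)

# Convert percent returns to gross returns (1 + r).
gross = 1 + monthly / 100

# Map each month to its calendar quarter, then compound within the quarter.
quarter = gross.index.asfreq("Q")
quarterly = (gross.groupby(quarter).prod() - 1) * 100  # back to percent

print(quarterly.round(2))
```

FRED labels each quarterly observation with the first day of the quarter, so converting that index with `.to_period("Q")` would likely be needed before joining it to `quarterly`.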


In [9]:
# Annual consumer price inflation (percent change in CPI) to capture inflation
cpi = pdr.get_data_fred('FPCPITOTLZGUSA', 1960)

In [10]:
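# Dummy variable for the inflation state: 1 when annual inflation is above 2%, else 0.
# Compare the inflation column only, so the cell still works if it is re-run.
cpi['High'] = (cpi['FPCPITOTLZGUSA'] > 2).astype(int)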

In [11]:
cpi

DATE,FPCPITOTLZGUSA,High
1960-01-01,1.457976,0
1961-01-01,1.070724,0
1962-01-01,1.198773,0
1963-01-01,1.239669,0
1964-01-01,1.278912,0
...,...,...
2017-01-01,2.130110,1
2018-01-01,2.442583,1
2019-01-01,1.812210,0
2020-01-01,1.233584,0
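
The high-inflation dummy can be built from the inflation column alone, which keeps the step safe to run more than once (comparing the whole frame would also compare the High column once it exists). The sketch below is a minimal illustration, assuming a hypothetical four-row stand-in (`cpi_sample`) built from the 2017 to 2020 values shown above.

```python
import pandas as pd

# Annual inflation values (percent) copied from the output above.
cpi_sample = pd.DataFrame(
    {"FPCPITOTLZGUSA": [2.130110, 2.442583, 1.812210, 1.233584]},
    index=pd.to_datetime(["2017-01-01", "2018-01-01", "2019-01-01", "2020-01-01"]),
)
cpi_sample.index.name = "DATE"

THRESHOLD = 2.0  # percent; the same cutoff used in the notebook

# Run the assignment twice to show that re-running does not change the result.
for _ in range(2):
    cpi_sample["High"] = (cpi_sample["FPCPITOTLZGUSA"] > THRESHOLD).astype(int)

print(cpi_sample)
```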


In [12]:
# Corporate Profits
corp_profit = pdr.get_data_fred('CP', 1947)

In [13]:
corp_profit.pct_change().dropna()

DATE,CP
1947-04-01,-0.053801
1947-07-01,-0.010775
1947-10-01,0.091762
1948-01-01,0.057369
1948-04-01,0.050634
...,...
2021-10-01,-0.023466
2022-01-01,0.026317
2022-04-01,0.074003
2022-07-01,-0.050048


In [14]:
# Unemployment Rate
unemp = pdr.get_data_fred('UNRATE', 1948)

In [15]:
unemp

DATE,UNRATE
1948-01-01,3.4
1948-02-01,3.8
1948-03-01,4.0
1948-04-01,3.9
1948-05-01,3.5
...,...
2022-11-01,3.6
2022-12-01,3.5
2023-01-01,3.4
2023-02-01,3.6


In [16]:
# Interest rates (effective federal funds rate, FEDFUNDS; not the 10-year Treasury rate)
irate = pdr.get_data_fred('FEDFUNDS', 1953)

In [18]:
irate.pct_change().dropna()

DATE,FEDFUNDS
1954-08-01,0.525000
1954-09-01,-0.122951
1954-10-01,-0.205607
1954-11-01,-0.023529
1954-12-01,0.542169
...,...
2022-11-01,0.227273
2022-12-01,0.084656
2023-01-01,0.056098
2023-02-01,0.055427
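
Because the series is already a rate in percent, pct_change() gives a relative change that can look very large when the rate is near zero (for example, the 0.525 in the first row). A change in percentage points, computed with diff(), may be easier to interpret. The sketch below compares the two; the three levels in `rate` are an assumption, chosen so that they reproduce the first two relative changes shown above, and stand in for the downloaded series.

```python
import pandas as pd

# Illustrative rate levels in percent (assumption: chosen to reproduce the
# first two pct_change values in the output above, 0.525 and -0.122951).
rate = pd.Series(
    [0.80, 1.22, 1.07],
    index=pd.to_datetime(["1954-07-01", "1954-08-01", "1954-09-01"]),
    name="FEDFUNDS",
)

# Relative change: (new - old) / old, unit-free.
rel_change = rate.pct_change().dropna()

# Absolute change in percentage points: new - old.
pp_change = rate.diff().dropna()

print(pd.DataFrame({"relative": rel_change, "pct_points": pp_change}))
```

Which form suits the predictive model better is a modeling choice; the sketch only shows that the two measures scale very differently at low rate levels.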
