# TABLE OF CONTENTS

* [1. INTRODUCTION](#section-one)
* [2. SETUP](#section-two)
    - [2.1 Draw Packages](#subsection-two-one)
    - [2.2 Import Data](#subsection-two-two)
    - [2.3 Wrangle Data](#subsection-two-three)
* [3. WHAT DO DECADES OF ASSET RETURNS TELL US ABOUT INVESTING?](#section-three)
    - [3.1 Question 1: What is the risk/return profile for different asset classes?](#subsection-three-one)
    - [3.2 Question 2: Which asset class is a good hedge against recession?](#subsection-three-two)
    - [3.3 Question 3: Does volatility foretell future return?](#subsection-three-three)
* [4. CONCLUSION](#section-four)
* [5. REFERENCES](#section-five)

<a id='section-one'></a>
# 1. INTRODUCTION

The Covid-19 pandemic has driven economies around the world into deep recessions. The various forms of mandated lockdowns have caused both supply-side and demand-side shocks, and the scale of the economic downturn is unprecedented. In this notebook, we first explore the risk/return profile of different asset classes (e.g., S&P 500, bonds, gold, oil, Bitcoin). We then ask which asset class is a good hedge against recession: is the conventional wisdom that gold is a good hedge true? Finally, the pandemic has heightened the volatility of asset returns, so we examine the relationship between volatility and returns: does volatility historically foretell future returns?

<a id='section-two'></a>
# 2. SETUP


<a id='subsection-two-one'></a>
## 2.1 Draw Packages

In [None]:
# for numerical analysis
import numpy as np

# to store and process data in dataframe
import pandas as pd

# to interface with operating system
import os

# for offline plotting
import matplotlib.pyplot as plt

# interactive visualization
import plotly.express as px
import seaborn as sns

sns.set()

from plotly.offline import plot, iplot, init_notebook_mode

init_notebook_mode(connected=True)

import plotly.graph_objs as go
import plotly.figure_factory as ff
from plotly.subplots import make_subplots

# for trendlines
import statsmodels

<a id='subsection-two-two'></a>
## 2.2 Import Data

In [None]:
# A. Kaggle

# Create an empty list
files = []

# Fill the list with the file names of the CSV files in the Kaggle folder
for dirname, _, filenames in os.walk('../input/econfin-test'):
    for filename in filenames:
        files.append(os.path.join(dirname, filename))

# Sort the file names
files = sorted(files)

# Output the list of sorted file names
files

'''
# B. Jupyterlab/Jupyter Notebook
local_path = 'C:/jupyter_workspace/Fintech/Codes/Lab 3 Diving into The History of Finance/'

# Create an empty list
files = []

# Fill the list with the file names of the CSV files in the local data folder
for dirname, _, filenames in os.walk(local_path + 'data/'):
    for filename in filenames:
        files.append(os.path.join(dirname, filename))

# Sort the file names
files = sorted(files)

# Output the list of sorted file names
files
'''

In [None]:
# Read the CSV files through a list comprehension, which can be broken into three parts
# 1. OUTPUT EXPRESSION [pd.read_csv(f, na_values=['.'])] (Note: na_values=['.'] turns '.' entries into missing values)
# 2. LOOP VARIABLE AND INPUT SEQUENCE [for f in files]
# 3. CONDITION (OPTIONAL) [if ...], not used here
dataframe = [pd.read_csv(f, na_values=['.']) for f in files]

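# Define the dataframe names, which become the dictionary keys
# NOTE: zip pairs names and dataframes by position, so this order must match the sorted file list above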
dataframe_name = ['btc','cpi','gold','snp','high_yield_bond','inv_grade_bond','moderna','employment','tesla_robinhood','trea_20y_bond','trea_10y_yield','tesla','fed_bs','wti']

# dataframe name = dictionary key, dataframe = dictionary value
dataframe_dict = dict(zip(dataframe_name, dataframe))

Details on list comprehension [HERE](https://towardsdatascience.com/comprehending-the-concept-of-comprehensions-in-python-c9dafce5111).   
Details on dictionary data structure [HERE](https://realpython.com/python-dicts/).
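
As a small illustration of the two ideas linked above, the sketch below builds a list with a comprehension (including the optional condition) and zips it into a dictionary. The file names are hypothetical stand-ins, not the notebook's dataset, and deriving the keys from the file names is only one possible way to avoid depending on the order of the list.

```python
import os

# Hypothetical file names standing in for the sorted list `files`
example_files = ['data/btc.csv', 'data/cpi.csv', 'data/notes.txt', 'data/gold.csv']

# List comprehension: output expression, loop variable + input sequence, optional condition
csv_files = [f for f in example_files if f.endswith('.csv')]

# Derive each dictionary key from the file name itself instead of relying on list order
keys = [os.path.splitext(os.path.basename(f))[0] for f in csv_files]

# dict(zip(...)) pairs the i-th key with the i-th value
example_dict = dict(zip(keys, csv_files))
print(example_dict)
```

In the notebook itself the values would be dataframes rather than paths, but the pairing logic is the same.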

<a id='subsection-two-three'></a>
## 2.3 Wrangle Data

In [None]:
# 1. S&P 500
snp = dataframe_dict['snp']
snp['Date'] = pd.to_datetime(snp['Date'])
snp.rename(columns={'Adj Close':'snp'}, inplace=True)
snp['snp_return'] = snp['snp'].pct_change()
snp['snp_volatility_1m'] = (snp['snp_return'].rolling(20).std())*(20)**(1/2) # Scale daily standard deviation to a 20-trading-day (about one month) horizon
snp['snp_volatility_1y'] = (snp['snp_return'].rolling(252).std())*(252)**(1/2) # Annualize: 252 trading days per year
snp = snp[['Date','snp','snp_return','snp_volatility_1m','snp_volatility_1y']].copy() # copy to avoid SettingWithCopyWarning below
# Approximate the 1-month forward cumulative return as the sum of daily returns over the current and next 19 trading days
snp['one_month_forward_snp_return'] = snp['snp_return'][::-1].rolling(window=20, min_periods=1).sum()[::-1]
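
The cell above scales a daily standard deviation by the square root of the window length and approximates the forward return by summing daily returns. The following sketch uses synthetic prices (an assumption, not the notebook's data) to show the square-root-of-time scaling and one possible alternative for the forward return: a compounded return that starts on the next trading day. The helper name `forward_compound_return` is hypothetical.

```python
import numpy as np
import pandas as pd

# Synthetic daily prices (illustrative only)
rng = np.random.default_rng(0)
prices = pd.Series(100 * np.cumprod(1 + rng.normal(0, 0.01, 300)))
returns = prices.pct_change()

# Square-root-of-time rule: daily std -> 20-day horizon and annual horizon
daily_std = returns.rolling(20).std()
monthly_vol = daily_std * np.sqrt(20)
annual_vol = daily_std * np.sqrt(252)

def forward_compound_return(price, window=20):
    """Compounded return from today's close to the close `window` rows ahead."""
    return price.shift(-window) / price - 1

fwd = forward_compound_return(prices, window=20)
print(fwd.head())
```

Because the forward window here covers rows t+1 to t+20, it does not overlap with the trailing window used for the volatility at row t. The last 20 rows have no forward return and are left as missing values.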

In [None]:
# 2. Bitcoin
btc = dataframe_dict['btc']
btc['Date'] = pd.to_datetime(btc['Date'])
btc.rename(columns={'Adj Close':'btc'}, inplace=True)
btc['btc_return'] = btc['btc'].pct_change()
btc['btc_volatility_1m'] = (btc['btc_return'].rolling(20).std())*(20)**(1/2) 
btc['btc_volatility_1y'] = (btc['btc_return'].rolling(252).std())*(252)**(1/2) 
btc = btc[['Date','btc','btc_return','btc_volatility_1m','btc_volatility_1y']].copy()
btc['one_month_forward_btc_return'] = btc['btc_return'][::-1].rolling(window=20, min_periods=1).sum()[::-1]

In [None]:
# 3. Gold
gold = dataframe_dict['gold']
gold['Date'] = pd.to_datetime(gold['DATE'])
gold.rename(columns={'GOLDPMGBD228NLBM':'gold'}, inplace=True)
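# Fill a missing price with the previous day's price, or the price two days earlier if that is also missing
gold['gold_lag1'] = gold['gold'].shift(1)
gold['gold_lag2'] = gold['gold'].shift(2)
gold['gold'] = gold['gold'].fillna(gold['gold_lag1'])
gold['gold'] = gold['gold'].fillna(gold['gold_lag2'])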
gold['gold'] = gold['gold'].astype('float64')
gold['gold_return'] = gold['gold'].pct_change()
gold['gold_volatility_1m'] = (gold['gold_return'].rolling(20).std())*(20)**(1/2) 
gold['gold_volatility_1y'] = (gold['gold_return'].rolling(252).std())*(252)**(1/2) 
gold = gold[['Date','gold','gold_return','gold_volatility_1m','gold_volatility_1y']].copy()
gold['one_month_forward_gold_return'] = gold['gold_return'][::-1].rolling(window=20, min_periods=1).sum()[::-1]

In [None]:
# 4. High Yield Bond
high_yield_bond = dataframe_dict['high_yield_bond']
high_yield_bond['Date'] = pd.to_datetime(high_yield_bond['Date'])
high_yield_bond.rename(columns={'Adj Close':'high_yield_bond'}, inplace=True)
high_yield_bond['high_yield_bond_return'] = high_yield_bond['high_yield_bond'].pct_change()
high_yield_bond['high_yield_bond_volatility_1m'] = (high_yield_bond['high_yield_bond_return'].rolling(20).std())*(20)**(1/2)
high_yield_bond['high_yield_bond_volatility_1y'] = (high_yield_bond['high_yield_bond_return'].rolling(252).std())*(252)**(1/2)
high_yield_bond = high_yield_bond[['Date','high_yield_bond','high_yield_bond_return','high_yield_bond_volatility_1m',
                                   'high_yield_bond_volatility_1y']].copy()
high_yield_bond['one_month_forward_high_yield_bond_return'] = high_yield_bond['high_yield_bond_return'][::-1].rolling(window=20, min_periods=1).sum()[::-1]

In [None]:
# 5. Investment Grade Bond
inv_grade_bond = dataframe_dict['inv_grade_bond']
inv_grade_bond['Date'] = pd.to_datetime(inv_grade_bond['Date'])
inv_grade_bond.rename(columns={'Adj Close':'inv_grade_bond'}, inplace=True)
inv_grade_bond['inv_grade_bond_return'] = inv_grade_bond['inv_grade_bond'].pct_change()
inv_grade_bond['inv_grade_bond_volatility_1m'] = (inv_grade_bond['inv_grade_bond_return'].rolling(20).std())*(20)**(1/2)
inv_grade_bond['inv_grade_bond_volatility_1y'] = (inv_grade_bond['inv_grade_bond_return'].rolling(252).std())*(252)**(1/2)
inv_grade_bond = inv_grade_bond[['Date','inv_grade_bond','inv_grade_bond_return','inv_grade_bond_volatility_1m',
                                 'inv_grade_bond_volatility_1y']].copy()
inv_grade_bond['one_month_forward_inv_grade_bond_return'] = inv_grade_bond['inv_grade_bond_return'][::-1].rolling(window=20, min_periods=1).sum()[::-1]

In [None]:
# 6. Crude Oil WTI
wti = dataframe_dict['wti']
wti['Date'] = pd.to_datetime(wti['DATE'])
wti.rename(columns={'WTISPLC':'wti'}, inplace=True)
wti['wti_return'] = wti['wti'].pct_change()
wti['wti_volatility_1m'] = wti['wti_return'].rolling(20).std()*(20)**(1/2)
wti['wti_volatility_1y'] = wti['wti_return'].rolling(252).std()*(252)**(1/2)
wti = wti[['Date','wti','wti_return','wti_volatility_1m','wti_volatility_1y']].copy()
wti['one_month_forward_wti_return'] = wti['wti_return'][::-1].rolling(window=20, min_periods=1).sum()[::-1]

In [None]:
#7. Inflation
cpi = dataframe_dict['cpi']
cpi['Date'] = pd.to_datetime(cpi['DATE'])
cpi.rename(columns={'CUUR0000SEHE':'cpi'}, inplace=True)

# Forward fill missing values with the last available observation
cpi = cpi.ffill()

cpi = cpi[['Date','cpi']]

In [None]:
#8. Employment
employment = dataframe_dict['employment']
employment['Date'] = pd.to_datetime(employment['DATE'])
employment.rename(columns={'PAYEMS_CHG':'employment'}, inplace=True)
employment = employment[['Date','employment']]

In [None]:
#9. US Fed's Balance Sheet
fed_bs = dataframe_dict['fed_bs']
fed_bs['Date'] = pd.to_datetime(fed_bs['DATE'])
fed_bs.rename(columns={'WALCL':'fed_bs'}, inplace=True)
fed_bs = fed_bs[['Date','fed_bs']]

In [None]:
# Import datasets with Pandas method read_csv
#nber_recession_indicator_day = pd.read_csv(local_path + 'USRECD.csv')
nber_recession_indicator_day = pd.read_csv('../input/nber-based-recession-indicators-united-states/USRECD.csv')

# Convert data types
nber_recession_indicator_day['Date'] = pd.to_datetime(nber_recession_indicator_day['date'])
nber_recession_indicator_day['recession'] = nber_recession_indicator_day['value'].astype('bool')

# Subset data columns
nber_recession_indicator_day = nber_recession_indicator_day[['Date','recession']]

In [None]:
# Merge datasets together
asset_classes = [btc,cpi,gold,high_yield_bond,inv_grade_bond,employment,fed_bs,wti]

baseline = pd.merge(snp,nber_recession_indicator_day,how='left',on='Date')

for asset_class in asset_classes:
    baseline = pd.merge(baseline,asset_class,how='left',on='Date')

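# Flag every date from 2020-03-01 onward as a recession (overriding the NBER indicator),
# then treat any remaining missing values as non-recession
baseline.loc[baseline.Date >= '2020-03-01', 'recession'] = True
baseline['recession'] = baseline['recession'].fillna(0).astype(bool)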

baseline.info()

Details on merge, join, and concat [HERE](https://realpython.com/pandas-merge-join-and-concat/).
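
To make the behaviour of the left merges above concrete, here is a minimal sketch on two toy dataframes (the values are made up). It uses the `validate` and `indicator` options of `pd.merge`, which the notebook does not use, to check that dates are unique on both sides and to show which rows found no match.

```python
import pandas as pd

# Toy data: the right table is missing one of the dates in the left table
left = pd.DataFrame({'Date': pd.to_datetime(['2020-01-02', '2020-01-03', '2020-01-06']),
                     'snp': [100.0, 101.0, 102.0]})
right = pd.DataFrame({'Date': pd.to_datetime(['2020-01-02', '2020-01-06']),
                      'gold': [1500.0, 1550.0]})

# how='left' keeps every row of `left`; unmatched dates get NaN in the right-hand columns
merged = pd.merge(left, right, how='left', on='Date',
                  validate='one_to_one',  # raise an error if a date is duplicated on either side
                  indicator=True)         # add a `_merge` column: 'both' or 'left_only'
print(merged)
```

The row for 2020-01-03 is kept with a missing `gold` value, which is why the merged `baseline` contains missing values for assets with shorter or sparser histories.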

<a id='section-three'></a>
# 3. WHAT DO DECADES OF ASSET RETURNS TELL US ABOUT INVESTING?

<a id='subsection-three-one'></a>
## 3.1 Question 1: What is the risk/return profile for different asset classes?

In [None]:
baseline.tail()

In [None]:
# Index Date
baseline.set_index('Date', inplace=True)
baseline.tail()

In [None]:
# Re-sample the dataset every year and calculate the sum of daily returns
# NOTE: dropna() keeps only the dates on which every asset class has a return
baseline_yearly_return = baseline[['snp_return', 'btc_return', 'gold_return', 'high_yield_bond_return',  
                            'inv_grade_bond_return', 'wti_return']].dropna().resample('Y').sum().reset_index()

print(baseline_yearly_return['Date'].min()) # 2010-12-31
baseline_yearly_return.head()

Details on method resample [HERE](https://www.geeksforgeeks.org/python-pandas-dataframe-resample/).
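
Summing daily simple returns, as in the cell above, approximates the yearly return but ignores compounding. The sketch below compares the sum with a compounded yearly return on synthetic daily returns (an assumption for illustration). It groups by calendar year with `groupby` instead of `resample`, purely to keep the example independent of frequency aliases.

```python
import numpy as np
import pandas as pd

# Synthetic daily returns on business days (illustrative only)
dates = pd.date_range('2019-01-01', '2020-12-31', freq='B')
rng = np.random.default_rng(1)
daily = pd.Series(rng.normal(0.0003, 0.01, len(dates)), index=dates, name='ret')

# Yearly return as a plain sum of daily returns (the notebook's approach)
yearly_sum = daily.groupby(daily.index.year).sum()

# Yearly return with compounding: product of (1 + r) minus 1
yearly_compound = (1 + daily).groupby(daily.index.year).prod() - 1

comparison = pd.DataFrame({'sum': yearly_sum, 'compound': yearly_compound})
print(comparison)
```

The gap between the two columns tends to grow with volatility, so it is likely to matter more for an asset such as Bitcoin than for investment grade bonds.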

In [None]:
# Re-sample the dataset every year and calculate the mean of 1-year volatility
baseline_yearly_volatility_1y = baseline[['snp_volatility_1y', 'btc_volatility_1y', 'gold_volatility_1y', 
                                          'high_yield_bond_volatility_1y', 'inv_grade_bond_volatility_1y', 
                                          'wti_volatility_1y']].dropna().resample('Y').mean().reset_index()

baseline_yearly = baseline_yearly_return.merge(baseline_yearly_volatility_1y, left_on='Date', right_on='Date')

baseline_yearly.head()

In [None]:
# Reshape dataset wide to tall with method melt
baseline_yearly_reshaped = baseline_yearly.melt(id_vars='Date', var_name='key', value_name='value')
baseline_yearly_reshaped.head()

Details on method melt [HERE](https://www.geeksforgeeks.org/python-pandas-melt/).
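
The sketch below shows the wide-to-tall reshape on a toy table (made-up values) and one way to split the resulting `key` column into an asset class and a metric with a regular expression. The column names mirror the notebook's naming pattern; the regular expression is an assumption that only covers the `_return` and `_volatility_1y` suffixes.

```python
import pandas as pd

# Toy wide table: one row per year, one column per asset/metric pair
wide = pd.DataFrame({
    'Date': pd.to_datetime(['2019-12-31', '2020-12-31']),
    'snp_return': [0.25, 0.15],
    'high_yield_bond_return': [0.10, 0.02],
    'snp_volatility_1y': [0.12, 0.30],
    'high_yield_bond_volatility_1y': [0.05, 0.11],
})

# Wide to tall: every (Date, column) pair becomes one row
tall = wide.melt(id_vars='Date', var_name='key', value_name='value')

# Split 'high_yield_bond_volatility_1y' into ('high_yield_bond', 'volatility')
parts = tall['key'].str.extract(r'^(?P<asset_class>.+)_(?P<metric>return|volatility)(?:_1y)?$')
tall = pd.concat([tall[['Date', 'value']], parts], axis=1)
print(tall)
```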

In [None]:
# Split each key into a metric ('return' or 'volatility') and an asset class name
baseline_yearly_reshaped['metric'] = np.where(baseline_yearly_reshaped['key'].str.contains(pat = 'return'), 'return', 'volatility')
# Strip the metric suffix to recover the full asset class name (e.g., 'high_yield_bond'), not just its first three characters
baseline_yearly_reshaped['asset_class'] = baseline_yearly_reshaped['key'].str.replace(r'_(return|volatility_1y)$', '', regex=True)
baseline_yearly_reshaped = baseline_yearly_reshaped[['Date','metric','asset_class','value']]
baseline_yearly_reshaped.head()

In [None]:
# Display mean yearly return and volatility for each asset class
# Select the numeric 'value' column so that mean() is not applied to the text column 'metric'
print(baseline_yearly_reshaped[baseline_yearly_reshaped['metric'] == 'return'].groupby('asset_class')['value'].mean())
print(baseline_yearly_reshaped[baseline_yearly_reshaped['metric'] == 'volatility'].groupby('asset_class')['value'].mean())

In [None]:
baseline.tail()

In [None]:
# Reset index
baseline.reset_index(inplace=True)
baseline.tail()

In [None]:
# Output summary statistics
baseline[['snp_return', 'snp_volatility_1y', 'btc_return', 'btc_volatility_1y', 'gold_return', 'gold_volatility_1y', 
                  'high_yield_bond_return', 'high_yield_bond_volatility_1y', 'inv_grade_bond_return', 
                  'inv_grade_bond_volatility_1y', 'wti_return', 'wti_volatility_1y']].describe()

<a id='subsection-three-two'></a>
## 3.2 Question 2: Which asset class is a good hedge against recession?

The conventional wisdom: Gold is a good hedge against recession. Is it true?

In [None]:
# Plot a jointplot with a regression line
sns.jointplot(x = 'gold_return', y = 'snp_return', data = baseline, kind='reg')

S&P and gold returns seem uncorrelated.
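
A hedge matters most during downturns, so it can help to put a number on the relationship both overall and within each regime. The sketch below does this on synthetic, independent returns (an assumption for illustration, so the correlations are close to zero by construction); the column names mirror those in `baseline`.

```python
import numpy as np
import pandas as pd

# Synthetic daily returns with a random recession flag (illustrative only)
rng = np.random.default_rng(2)
n = 500
demo = pd.DataFrame({
    'snp_return': rng.normal(0, 0.01, n),
    'gold_return': rng.normal(0, 0.008, n),
    'recession': rng.random(n) < 0.2,
})

# Pearson correlation over the whole sample
overall_corr = demo['gold_return'].corr(demo['snp_return'])

# Pearson correlation within each regime (recession vs. not)
corr_by_regime = {regime: grp['gold_return'].corr(grp['snp_return'])
                  for regime, grp in demo.groupby('recession')}

print(overall_corr, corr_by_regime)
```

Applied to `baseline`, a clearly negative correlation in the recession regime would support the hedge claim, while a value near zero would not.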

In [None]:
def plot_chart(column):
    # Scatter plot of one column of `baseline` over time, coloured by the recession flag
    fig = px.scatter(baseline[baseline[column].notnull()], x='Date', y=column, color='recession', color_discrete_sequence=['#636EFA', '#FFA15A'], width=1200)
    fig.update_traces(mode='markers', marker_size=4)
    fig.update_layout(title=column, xaxis_title='', yaxis_title='')
    fig.show()

In [None]:
plot_chart('snp')

The S&P 500 mostly declines during recessions, which is not surprising. Compare the 1929-1933 Great Depression with the 2020 "Great Compression": what is driving the strong rebound this time?

In [None]:
plot_chart('gold')

Gold does not seem to be a good hedge against recession.

In [None]:
plot_chart('btc')

Some claim that Bitcoin serves as a good hedge against recessions and market crashes, e.g. [HERE](https://medium.com/@sanneh.si/bitcoin-confirmed-as-a-hedge-against-the-stock-market-crash-71390a55d4c3). 2020 is the first time the Bitcoin-as-a-hedge-against-market-crash hypothesis has been tested. Unfortunately, Bitcoin seems to move somewhat in tandem with the S&P 500. Time will tell. Perhaps this time is different.

In [None]:
# Plot pairplot
baseline_returns = baseline[['snp_return', 'btc_return', 'gold_return', 'high_yield_bond_return', 'inv_grade_bond_return', 'wti_return', 'recession']]

sns.pairplot(baseline_returns, hue='recession')

The pairplot compares the empirical distributions in recession and non-recession periods. Distributions are mostly more spread out during recessions. Which pairs move together? Positive relationships appear for S&P 500 and high yield bond returns, and for investment grade and high yield bond returns.
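
The visual impression that returns are more spread out during recessions can be checked with a grouped summary. The sketch below uses synthetic returns that are deliberately given a larger standard deviation in the recession regime (an assumption for illustration); on the notebook's data the same `groupby` call would be applied to `baseline_returns`.

```python
import numpy as np
import pandas as pd

# Synthetic returns: volatility is set three times higher when `recession` is True
rng = np.random.default_rng(4)
n = 1000
recession = rng.random(n) < 0.2
scale = np.where(recession, 0.03, 0.01)
demo = pd.DataFrame({
    'snp_return': rng.normal(0, 1, n) * scale,
    'gold_return': rng.normal(0, 1, n) * scale,
    'recession': recession,
})

# Mean and standard deviation of each return series, by regime
summary = demo.groupby('recession')[['snp_return', 'gold_return']].agg(['mean', 'std'])
print(summary)
```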

<a id='subsection-three-three'></a>
## 3.3 Question 3: Does volatility foretell future return?    

In [None]:
def plot_chart_vol_ret(asset):
    # Scatter plot of 1-month volatility vs one-month forward return, with an OLS trendline
    fig = px.scatter(baseline[baseline[asset + '_return'].notnull()], x=asset + '_volatility_1m', 
                     y='one_month_forward_' + asset + '_return', width=800,
                     trendline = 'ols')
    fig.update_layout(title=asset + ' volatility vs one-month forward return', xaxis_title='', yaxis_title='')
    fig.show()
    
def plot_chart_vol_ret_by_recession(asset):
    # Same plot, with a separate colour and OLS trendline for recession and non-recession periods
    fig = px.scatter(baseline[baseline[asset + '_return'].notnull()], x=asset + '_volatility_1m', 
                     color='recession', y='one_month_forward_' + asset + '_return', 
                     color_discrete_sequence=['#636EFA', '#FFA15A'], width=800,
                     trendline = 'ols')
    fig.update_layout(title=asset + ' volatility vs one-month forward return', xaxis_title='', yaxis_title='')
    fig.show()

In [None]:
plot_chart_vol_ret('snp')

The relationship is at best weak.
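
One way to quantify "weak" is the slope and R² of the fitted line. Plotly Express fits the `ols` trendline with statsmodels; the sketch below instead uses NumPy to compute the same kind of least-squares line on synthetic data (an assumption for illustration, generated with a small true slope and a lot of noise).

```python
import numpy as np

# Synthetic data: forward return depends only weakly on volatility
rng = np.random.default_rng(3)
vol = rng.uniform(0.02, 0.10, 400)                       # stand-in for 1-month volatility
fwd_ret = 0.01 + 0.05 * vol + rng.normal(0, 0.04, 400)   # stand-in for forward return

# Least-squares line: fwd_ret ~ intercept + slope * vol
slope, intercept = np.polyfit(vol, fwd_ret, deg=1)

# For a one-variable regression, R^2 is the squared Pearson correlation
r_squared = np.corrcoef(vol, fwd_ret)[0, 1] ** 2

print(f'slope={slope:.3f}, intercept={intercept:.3f}, R^2={r_squared:.3f}')
```

An R² close to zero means that volatility explains almost none of the variation in the forward return, even if the fitted slope is not exactly zero.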

In [None]:
plot_chart_vol_ret_by_recession('snp')

The relationship between S&P 500 volatility and one-month forward return is stronger when there is no recession.

<a id='section-four'></a>
# 4. CONCLUSION

In [None]:
# Plot heatmap of the relationships across different asset classes
baseline_corr = baseline[['snp_return', 'snp_volatility_1y', 'btc_return', 'btc_volatility_1y',
                         'gold_return', 'gold_volatility_1y', 'high_yield_bond_return', 'high_yield_bond_volatility_1y',
                         'inv_grade_bond_return', 'inv_grade_bond_volatility_1y', 'wti_return', 'wti_volatility_1y',
                         'recession']].dropna().corr()

fig, ax = plt.subplots(figsize=(20,10)) 
sns.heatmap(baseline_corr, annot=True, ax = ax)

Two pairs to read off the heatmap are `snp_return` vs. `gold_return` and `snp_return` vs. `btc_return`. Past volatility correlates negatively with returns.

<a id='section-five'></a>
# 5. REFERENCES

[Can Volatility Predict Returns?](https://www.firstandmainfinancial.com/sites/default/files/users/erikwolfersnew/PDFs/Can%20Volatility%20Predict%20Returns_%20%281%29.pdf), 2016, Dimensional Fund Advisors.

[Risk and volatility: econometric models and financial practice](https://www.nobelprize.org/uploads/2018/06/engle-lecture.pdf), 2003, Nobel Lecture.

Datasets are from [Yahoo Finance](https://finance.yahoo.com/) and the [Federal Reserve Bank of St. Louis](https://fred.stlouisfed.org/).
