## Checking in on the data as we head into the election

I thought it might be interesting to take the betting odds from predictit.org for Biden and Trump and treat them as if they were an investment alongside the US Sectors and Industries, looking for clues about what the market expects to do poorly or well under each potential outcome.

Data sources: predictit.org for odds on the election outcome and Tiingo for security price data.

# Part 1: Data Gathering

In [66]:
import pandas as pd
import requests
from pandas_datareader import data as web
from dateutil.relativedelta import relativedelta
from datetime import datetime
import seaborn as sns
%matplotlib inline


In [67]:
import os
os.environ['TIINGO_API_KEY'] = "put your api key here"

In [68]:
sectors = ["XLY","XLP","XLE","XLF","XLV","XLI","XLB","XLRE","XLK","XLC","XLU","SPY"]

In [69]:
industries = ["MOO","IBB","PBW","GDX","ITB","KIE","FDN","AMLP","XME","XOP","OIH","VNQ","KRE","XRT","SMH","IGV","PHO"]

In [117]:
names_key = {"MOO": "agriculture","IBB": "biotech","PBW": "clean_energy","GDX": "gold_miners",
             "ITB": "homebuilders","KIE": "insurance","FDN": "internet","AMLP": "mlp","XME": "metals_mining",
             "XOP": "oil_and_gas_ep","OIH": "oil_services","VNQ": "reits","KRE": "regional_banks","XRT": "retail",
             "SMH": "semi_conductors","IGV":"software","PHO": "water_resources",
             "XLY": "consumer_discretionary","XLP": "consumer_staples","XLE": "energy","XLF": "financials",
             "XLV": "health_care","XLI": "industrials","XLB": "materials","XLRE": "real_estate",
             "XLK": "technology","XLC": "communications","XLU": "utilities", "SPY": "sp500", "Biden": "biden_odds", "Trump": "trump_odds"
            }

In [71]:
tickers = sectors + industries

In [72]:
adjusted_close = pd.DataFrame(columns=tickers)
null_tickers = []

for ticker in tickers:
    try:
        # Pull the adjusted close for one ticker from Tiingo
        data_panel = web.DataReader([ticker], "tiingo").loc[ticker]['adjClose'].to_frame()
        data_panel.columns = [ticker]
        data_panel.index = pd.to_datetime(data_panel.index)
        # Skip tickers whose most recent price is more than 5 days old
        if data_panel.index.max().tz_localize(None) < datetime.today() - relativedelta(days=5):
            print("{} most recent date is {}".format(ticker, str(data_panel.index.max())))
        else:
            adjusted_close[ticker] = data_panel[ticker]
    except Exception:
        # Keep track of tickers that could not be downloaded
        null_tickers.append(ticker)
        print("{} not found".format(ticker))

In [73]:
adjusted_close.index = adjusted_close.index.tz_localize(None)

In [74]:
election_data = pd.read_csv('election_data.csv')

In [75]:
election_data =  election_data.set_index(['date', 'name'])

In [76]:
election_data = election_data.unstack(level=-1)

In [77]:
election_data = election_data.close_price.rename_axis([None], axis=1).reset_index()
election_data = election_data.set_index('date')

In [78]:
election_data.index = pd.to_datetime(election_data.index)

In [79]:
election_data = election_data.sort_index(ascending=True)
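The cells above turn the odds file from one row per (date, candidate) into one column per candidate. Here is a minimal sketch of that same reshape on made-up numbers. The column names `date`, `name` and `close_price` come from the cells above; the rows in `long_odds` are hypothetical and are not the predictit.org data.

```python
import pandas as pd

# Hypothetical long-format rows: one row per (date, candidate) pair.
long_odds = pd.DataFrame({
    "date": ["2020-10-02", "2020-10-01", "2020-10-01", "2020-10-02"],
    "name": ["Biden", "Biden", "Trump", "Trump"],
    "close_price": [0.64, 0.62, 0.42, 0.40],
})

# Move the candidate names into columns so each candidate gets its own series.
wide_odds = long_odds.set_index(["date", "name"])["close_price"].unstack("name")
wide_odds.columns.name = None                       # drop the leftover "name" label
wide_odds.index = pd.to_datetime(wide_odds.index)   # real dates instead of strings
wide_odds = wide_odds.sort_index()                  # oldest date first

print(wide_odds)
```

The result has a date index and a `Biden` and a `Trump` column, which is the shape needed to line the odds up next to the price data.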

In [80]:
all_data = pd.merge(election_data, adjusted_close, left_index=True, right_index=True)
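One thing worth keeping in mind about the merge above: `pd.merge` defaults to an inner join, so only dates that appear in both frames survive. The sketch below uses made-up values (the `odds` and `prices` frames are hypothetical) to show what that does when one series has weekend rows and the other does not.

```python
import pandas as pd

# Hypothetical odds with a row for every calendar day, weekend included.
odds = pd.DataFrame(
    {"Biden": [0.60, 0.61, 0.62, 0.63]},
    index=pd.to_datetime(["2020-10-02", "2020-10-03", "2020-10-04", "2020-10-05"]),
)

# Hypothetical prices with rows for trading days only (Friday and Monday).
prices = pd.DataFrame(
    {"SPY": [333.0, 338.0]},
    index=pd.to_datetime(["2020-10-02", "2020-10-05"]),
)

# Inner join on the index: the weekend rows from the odds frame are dropped.
merged = pd.merge(odds, prices, left_index=True, right_index=True)
print(merged)
```

If the odds file does contain weekend rows, then after the merge a "daily" change in the odds on a Monday is really the change since the previous Friday, the same as it is for the ETFs.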

In [81]:
daily_changes = all_data.pct_change()

In [82]:
thirty_day_corr = daily_changes.iloc[-21:].corr()

In [83]:
sixty_day_corr = daily_changes.iloc[-42:].corr()

In [84]:
two_week_corr = daily_changes.iloc[-10:].corr()

# Part 2: Analysis and Exploring the Data for Clues

I might lose some people here, but what I am trying to do is examine how the daily changes in the betting odds for Trump and Biden have correlated with the daily changes in US Sectors and Industries.
This is the same analysis you would do to see how correlated a basket of securities, or the holdings in your portfolio, are with one another. We take the daily prices and compute the percentage change from one day to the next to see how they move with or against each other.
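To make the idea concrete, here is a small sketch on invented numbers. The `levels` values are hypothetical and only the column labels are borrowed from this notebook; the steps are the same `pct_change` and `corr` calls used above, just on a four-row window.

```python
import pandas as pd

# Hypothetical closing levels for one candidate's odds and two ETFs.
levels = pd.DataFrame({
    "Biden": [0.60, 0.62, 0.61, 0.64, 0.66],
    "XLE": [30.0, 29.5, 29.8, 29.0, 28.6],
    "ITB": [55.0, 55.6, 55.4, 56.3, 56.9],
})

# Percentage change from one row to the next (the first row has no prior value).
daily_changes = levels.pct_change()

# Pairwise correlation of the daily changes over the last 4 rows.
window_corr = daily_changes.iloc[-4:].corr()

# Keep only the correlations against the odds, dropping the odds' own row.
biden_corr = window_corr["Biden"].drop("Biden")
print(biden_corr.sort_values())
```

In this toy data XLE falls on the days the odds rise, so its correlation comes out negative, while ITB rises with the odds and comes out positive.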

First, we will look at the last 21 trading days (roughly 30 calendar days) to see how the sectors and industries have moved with the changes in the betting odds of each candidate.

In [85]:
thirty_day_corr = thirty_day_corr[['Biden', 'Trump']]

In [86]:
thirty_day_corr = thirty_day_corr.drop(['Trump', 'Biden'], axis=0)

In [87]:
thirty_day_corr['name'] = thirty_day_corr.index.to_series().map(names_key)

Inverse correlation with Biden probability of win (top 5)

In [89]:
thirty_day_corr.sort_values(by='Biden').head(5)

| | Biden | Trump | name |
|---|---|---|---|
| AMLP | -0.358687 | 0.222781 | mlp |
| XOP | -0.354613 | 0.257035 | oil_and_gas_ep |
| XLE | -0.337303 | 0.288056 | energy |
| XLI | -0.3293 | 0.308651 | industrials |
| OIH | -0.293264 | 0.184944 | oil_services |


Nothing surprising here: the results are relatively intuitive and in line with what you would expect. Four of the five sectors/industries with the most negative correlation to Biden's odds are related to oil and energy production/consumption.

Positive correlation with Biden probability of win (bottom 5)

In [90]:
thirty_day_corr.sort_values(by='Biden').tail(5)

| | Biden | Trump | name |
|---|---|---|---|
| SMH | 0.191788 | 0.315877 | semi_conductors |
| XRT | 0.199777 | 0.03955 | retail |
| XLP | 0.201265 | 0.29347 | consumer_staples |
| IGV | 0.208014 | 0.240449 | software |
| ITB | 0.244726 | -0.201353 | homebuilders |


The most correlated sectors (meaning when Biden's odds go up, these sectors tend to go up in price, and vice versa) are Homebuilders, Software, Consumer Staples, Retail and Semi-Conductors. Kind of random overall, and certainly not as intuitive as the negative correlation group.

Inverse correlation with Trump probability of win (top 5)

In [91]:
thirty_day_corr.sort_values(by='Trump').head(5)

| | Biden | Trump | name |
|---|---|---|---|
| ITB | 0.244726 | -0.201353 | homebuilders |
| VNQ | 0.09878 | -0.145101 | reits |
| XLRE | 0.072001 | -0.105775 | real_estate |
| PBW | 0.121667 | -0.069509 | clean_energy |
| XLU | -0.105744 | 0.012653 | utilities |


The lowest correlations with Trump's odds of winning make sense for the most part. REITs and Real Estate will generally be unaffected by changes in the tax code because of the way they are structured, and Clean Energy and Utilities are favored under a Biden win (relatively speaking).

Positive correlation with Trump probability of win (top 5)

In [92]:
thirty_day_corr.sort_values(by='Trump').tail(5)

| | Biden | Trump | name |
|---|---|---|---|
| XLV | -0.107735 | 0.423726 | health_care |
| SPY | -0.032973 | 0.428318 | sp500 |
| XLK | 0.043714 | 0.465146 | technology |
| XLC | -0.072248 | 0.512056 | communications |
| MOO | -0.230897 | 0.590906 | agriculture |


The highest correlations with Trump's election odds are Healthcare, the S&P 500, Technology, Communications and Agriculture. The fact that healthcare has such a positive correlation with Trump's odds, but is not in the top five for negative correlation with Biden's odds, may show that healthcare is already priced for a Biden win at this point and may offer some relative value should Trump end up winning.

Now what about the last two months? This may help weed out some randomness from the last month (or not).

In [93]:
sixty_day_corr = sixty_day_corr[['Biden', 'Trump']]

In [94]:
sixty_day_corr = sixty_day_corr.drop(['Trump', 'Biden'], axis=0)

In [95]:
sixty_day_corr['name'] = sixty_day_corr.index.to_series().map(names_key)

Inverse correlation with Biden probability of win (top 5)

In [97]:
sixty_day_corr.sort_values(by='Biden').head(5)

| | Biden | Trump | name |
|---|---|---|---|
| XOP | -0.18491 | 0.146874 | oil_and_gas_ep |
| XLE | -0.154462 | 0.138974 | energy |
| XLB | -0.151697 | 0.112168 | materials |
| MOO | -0.141861 | 0.208704 | agriculture |
| XLI | -0.131766 | 0.05644 | industrials |


The two most negatively correlated sectors/industries are again energy related, which is no surprise.

Positive correlation with Biden probability of win (bottom 5)

In [98]:
sixty_day_corr.sort_values(by='Biden').tail(5)

| | Biden | Trump | name |
|---|---|---|---|
| IBB | 0.099513 | 0.073609 | biotech |
| XLRE | 0.102936 | -0.156825 | real_estate |
| VNQ | 0.105734 | -0.166976 | reits |
| XLP | 0.163886 | 0.034102 | consumer_staples |
| XLU | 0.186581 | -0.199489 | utilities |


Biotech, Real Estate, REITs, Consumer Staples and Utilities make up the top 5, but Biotech is very close to 0. Outside of Utilities and Consumer Staples these look more random/uncorrelated than anything else. Not much here.

Inverse correlation with Trump probability of win (top 5)

In [99]:
sixty_day_corr.sort_values(by='Trump').head(5)

| | Biden | Trump | name |
|---|---|---|---|
| XLU | 0.186581 | -0.199489 | utilities |
| VNQ | 0.105734 | -0.166976 | reits |
| XLRE | 0.102936 | -0.156825 | real_estate |
| ITB | 0.087515 | -0.106483 | homebuilders |
| PBW | 0.008528 | -0.017871 | clean_energy |


Utilities appear to be fairly correlated with both Biden (positive) and Trump (negative), so the market is clearly tit for tat when it comes to the predicted impact of either candidate. The same appears to be true for REITs and Real Estate. This may be where the money has been going as a hedge against a Trump loss and against the impact of higher corporate taxes.

Positive correlation with Trump probability of win (top 5)

In [100]:
sixty_day_corr.sort_values(by='Trump').tail(5)

| | Biden | Trump | name |
|---|---|---|---|
| FDN | -0.087476 | 0.176426 | internet |
| IGV | -0.115033 | 0.192522 | software |
| MOO | -0.141861 | 0.208704 | agriculture |
| XLC | -0.097236 | 0.21592 | communications |
| XLK | -0.050322 | 0.25373 | technology |


Finally, the most correlated sectors with Trump's betting odds mostly involve technology. I am wondering if the market is pricing in the potential for unfavorable treatment towards the technology giants or a potential break up of the megacap tech companies?

In [111]:
last_month_performance = (all_data.loc['2020-10-23']-all_data.loc['2020-09-25'])/all_data.loc['2020-09-25']

In [114]:
last_month_performance = last_month_performance.sort_values().to_frame()

In [118]:
last_month_performance['name'] = last_month_performance.index.to_series().map(names_key)

In [120]:
last_month_performance.columns = ['last_30_day_performance', 'name']
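The cells above compute a simple return between two fixed dates for every column at once. A minimal sketch of the same calculation wrapped in a helper follows; `period_return` is a hypothetical name of mine and the values in `levels` are made up for illustration.

```python
import pandas as pd

def period_return(levels: pd.DataFrame, start: str, end: str) -> pd.Series:
    """Simple return of every column between two dates that exist in the index."""
    start_row = levels.loc[pd.Timestamp(start)]
    end_row = levels.loc[pd.Timestamp(end)]
    # end / start - 1 is the same quantity as (end - start) / start
    return (end_row / start_row - 1).sort_values()

# Hypothetical levels on the two dates used in the cells above.
levels = pd.DataFrame(
    {"Trump": [0.50, 0.45], "XLE": [100.0, 101.0]},
    index=pd.to_datetime(["2020-09-25", "2020-10-23"]),
)

returns = period_return(levels, "2020-09-25", "2020-10-23")
print(returns)
```

Note that for the odds columns this is a relative change, not a change in percentage points: in the toy data, odds going from 0.50 to 0.45 is a 10% decline even though the odds only moved 5 points.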

I thought it would be interesting to look at the performance across the US Sectors and Industries from September 25 to today. This was the date that the odds really took off for a Biden win. It looks as though his odds jumped around 12.3% while Trump's declined around 10.87%. What has moved the most/least from that point to today?

In [121]:
last_month_performance

| | last_30_day_performance | name |
|---|---|---|
| Trump | -0.108696 | trump_odds |
| XLE | 0.006958 | energy |
| GDX | 0.010411 | gold_miners |
| IBB | 0.022627 | biotech |
| XLRE | 0.02769 | real_estate |
| VNQ | 0.034234 | reits |
| XLP | 0.037278 | consumer_staples |
| XLK | 0.039494 | technology |
| MOO | 0.040534 | agriculture |
| ITB | 0.04355 | homebuilders |


What is interesting is that every single sector and industry has a positive return from that day to today. Regional banks are up over 23%, followed by Clean Energy at 19.9% and MLPs at around 15.85%. This performance is very odd and sends a fairly mixed message overall.

By now, almost everyone who follows the market has heard the statistic that the market's performance in the 3 months leading up to the election is a fairly accurate predictor of who wins. Positive performance typically indicates the incumbent party (Trump) wins, while negative performance indicates the challenger (Biden) wins. Let's see what that signal is telling us as of Friday's close.

In [122]:
three_month_rule_performance = (all_data.loc['2020-10-23']-all_data.loc['2020-08-05'])/all_data.loc['2020-08-05']

In [123]:
three_month_rule_performance = three_month_rule_performance.sort_values().to_frame()

In [124]:
three_month_rule_performance['name'] = three_month_rule_performance.index.to_series().map(names_key)

In [125]:
three_month_rule_performance.columns = ['three_months_to_election_performance', 'name']

In [126]:
three_month_rule_performance

| | three_months_to_election_performance | name |
|---|---|---|
| OIH | -0.255269 | oil_services |
| XLE | -0.17428 | energy |
| XOP | -0.161826 | oil_and_gas_ep |
| GDX | -0.128228 | gold_miners |
| AMLP | -0.06677 | mlp |
| Trump | -0.02381 | trump_odds |
| IBB | -0.005279 | biotech |
| XLRE | -0.002495 | real_estate |
| VNQ | -0.001599 | reits |
| XLV | 0.015165 | health_care |


As of Friday 10/23/2020 the S&P 500 was up a little over 4.5% from the start of the 3-month window before the election (8/5/2020). However, as I am writing this, the S&P 500 is down a little over 2%, which is about halfway back to 0% performance. If you believe the betting odds and think Biden is going to win, and also believe in the 3-month rule, then you should expect a greater than 2.5% loss for the S&P 500 over the next week and a half!
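The 2.5% figure above comes from simply subtracting the two percentages. As a quick check, here is the same arithmetic with compounding, using the rounded figures from the paragraph above as inputs (so treat the output as indicative, not exact).

```python
# Rounded inputs taken from the text: "a little over 4.5%" and "a little over 2%".
gain_through_friday = 0.045   # S&P 500 return from 8/5/2020 to 10/23/2020
drop_since_friday = -0.02     # move since Friday's close at the time of writing

# Returns compound, so chain the two moves instead of adding them.
cumulative = (1 + gain_through_friday) * (1 + drop_since_friday) - 1

# Further move that would bring the cumulative return back to exactly 0%.
further_move_to_flat = 1 / (1 + cumulative) - 1

print(f"cumulative since 8/5/2020: {cumulative:.2%}")
print(f"further move needed to reach 0%: {further_move_to_flat:.2%}")
```

With these rounded inputs the compounded threshold comes out slightly smaller than the additive 2.5%, at roughly a 2.35% further decline, so the conclusion does not really change.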

If you believe the odds are worthless, but still believe in the 3-month rule, depending on who you believe will win, the market could be flat (Trump win), go down (Biden win), or go up a lot (Everyone wins)!