# PS 88 - Lab 2.1 - Varieties of Accountability

In class I showed a graph that plotted GDP growth during a president's term against how well the incumbent party did in the next election. This is often viewed as important evidence that voters reward or punish politicians based on how the economy performs under their control, which could put more competent leaders in office and give politicians an incentive to work hard to deliver good outcomes for voters.

Let's first replicate the code to make that graph here.

In [None]:
# Importing libraries
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline

In [None]:
# Loading data
pv = pd.read_stata("data/presvote.dta")
# Subsetting to elections from 1940 onward
# There isn't much good GDP data before then, and the Great Depression and rebound is an unusual period
pv = pv[pv['year'] >= 1940]
# Making the plot and labeling axes
sns.scatterplot(x='gdpchange', y='incvote', data=pv)
plt.xlabel('GDP Growth')
plt.ylabel('Incumbent Vote Share')

Recall that GDP growth could be a misleading indicator of whether individual financial situations are improving. Some have argued that "real disposable income" (RDI) is a better measure of this (see <a href="https://www.bea.gov/resources/learning-center/what-to-know-income-saving">here</a> for a comparison of some related variables). Fortunately, our data table has this information too. In particular, the `RDIg_term` column contains the real disposable income growth over the four years preceding the election.

Theoretically, it makes sense to focus on these four years since we'd like to know how things went under the control of the incumbent (or the incumbent party).

**Question 1.1: Modify the code below to change the x axis to real disposable income growth over the four years preceding the election.**

In [None]:
# Code for 1.1: Change something here to plot real disposable income growth
sns.scatterplot(x=..., y='incvote', data=pv)
# Change something here to label the axis properly
plt.xlabel(...)
plt.ylabel('Incumbent Vote Share')

Another common argument is that people don't necessarily think carefully about how their economic situation changed over the entire time the incumbent was in office, and instead focus mostly on the recent past. One way we can test this is by looking at RDI growth over the year leading up to the election. This is captured by the variable `RDIyrgrowth`.

**Question 1.2. Write code to make a scatterplot with RDI growth over the year leading up to the election on the x axis and the incumbent vote share on the y axis (feel free to copy from your answer to 1.1!)**

In [None]:
# Code for 1.2 here

**Question 1.3. Compare the results of the three graphs we have made so far. What might they say about the applicability of our model of political accountability? (Note: there are lots of potential answers here!)**

*Answer for 1.3*

To preview something we will learn later in class, we can also produce a similar graph but add a *line of best fit*, which describes the average trend in the data.

(We do this with a function called `regplot` in the Seaborn library, which we imported as `sns`.)

In [None]:
# Creating a scatterplot with a line of best fit. 
# The ci=None option removes confidence intervals
sns.regplot(x='RDIyrgrowth', y='incvote', data=pv, ci=None)
plt.xlabel('RDI Growth')
plt.ylabel('Incumbent Vote Share')

One way to think about this line is saying "given a level of growth, what is our best prediction about the incumbent vote share?" 

There are lots of cool things we can do with this (again, more to come!) but one that is interesting in light of our accountability model is that we can think of elections that are far from this line as ones where the outcome is different than we would predict based on how the economy was doing.
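To make "far from the line" concrete, here is a minimal sketch of how one could compute the predicted vote share and the gap between actual and predicted (often called a *residual*) for each election. It assumes the line is an ordinary least-squares fit, which is what `regplot` draws by default, and it uses `np.polyfit` from NumPy instead of Seaborn. The `toy` table and its numbers are made up purely for illustration; they are not the `presvote` data.

```python
import numpy as np
import pandas as pd

# Toy, made-up numbers for illustration only (NOT the real presvote data)
toy = pd.DataFrame({
    'election': ['A', 'B', 'C', 'D', 'E'],
    'growth': [0.0, 1.0, 2.0, 3.0, 4.0],
    'incvote': [46.0, 50.0, 49.0, 55.0, 55.0],
})

# Fit a straight line: incvote is approximately intercept + slope * growth
slope, intercept = np.polyfit(toy['growth'], toy['incvote'], deg=1)

# "Given a level of growth, what is our best prediction?"
toy['predicted'] = intercept + slope * toy['growth']

# Residual = actual minus predicted.
# Negative means the incumbent did worse than the line predicts.
toy['residual'] = toy['incvote'] - toy['predicted']

# Sort so the biggest underperformers come first
print(toy.sort_values('residual'))
```

Elections with large positive or negative residuals are the ones where the outcome differs most from what growth alone would predict.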

To see in which years the incumbent did better or worse than expected, we can add labels to the points (really, don't sweat the details of the code here).

In [None]:
# Removing NA values to avoid annoying errors
pvtoplot = pv[['RDIyrgrowth', 'incvote', 'year']].dropna()
pvtoplot['year'] = pvtoplot['year'].astype(int)

# The ci=None option removes confidence intervals
sns.regplot(x='RDIyrgrowth', y='incvote', data=pv, ci=None)
plt.xlabel('RDI election year growth')
plt.ylabel('Incumbent Vote Share')

# Looping through the points to label each one with its election year
for x, y, z in zip(pvtoplot['RDIyrgrowth'], pvtoplot['incvote'], pvtoplot['year']):
    # Nudge each label slightly right of and below its point so the two don't overlap
    plt.text(x=x + .025,  # x-coordinate of the label
             y=y - .01,   # y-coordinate of the label
             s=str(z))    # label text: the year (converted to an integer above)

**Question 1.4. Note that 2020 is a year where the incumbent did much worse than the best fit line predicts. Why might that be? (There are multiple good ways to answer this!)**

*Answer for 1.4*

**Question 1.5 [OPTIONAL]. Redo the first three graphs but with the `regplot` function rather than the `scatterplot` function. In which graphs is the line relatively steep, and in which is it relatively flat? How does that relate to your answer to 1.3?**