# Voter Turnout Demographics

The data come from the Census Bureau's Current Population Survey (CPS). The CPS provides a comprehensive snapshot of voter turnout among various demographic groups.
Data source: http://www.electproject.org/home/voter-turnout/demographics.

In [179]:
#import libraries
import pandas as pd
from matplotlib import pyplot as plt
import datetime as dt
import matplotlib.ticker as mtick
#%matplotlib notebook

Import the data stored in the "CPS Turnout Rates - Race and Ethnicity.csv" file.

In [180]:
data= pd.read_csv("C:\\Users\\arzum\\Downloads\\CPS Turnout Rates - Race and Ethnicity.csv", nrows=4, index_col='Turnout Rate').transpose()
#check for imported data
data.describe()
data.columns

Index(['Non-Hispanic White', 'Non-Hispanic Black', 'Hispanic', 'Other'], dtype='object', name='Turnout Rate')
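
To see why the frame is transposed after loading, here is a minimal sketch that reads a small in-memory CSV. It assumes the file is laid out the way the call above implies: one row per group, one column per election year, and the group names under a `Turnout Rate` header. The `raw_csv` string and this layout are assumptions for illustration; the values are taken from the cleaned outputs shown further down in this notebook.

```python
import io
import pandas as pd

# Hypothetical excerpt of the CSV: groups in rows, years in columns (assumed layout).
raw_csv = io.StringIO(
    "Turnout Rate,2020,2018,2016\n"
    "Non-Hispanic White,72.6%,55.2%,64.7%\n"
    "Non-Hispanic Black,65.6%,51.3%,59.9%\n"
    "Hispanic,52.5%,36.9%,44.9%\n"
)

# index_col makes the group names the row index; transpose() then turns
# the years into rows and the groups into columns.
sample = pd.read_csv(raw_csv, index_col='Turnout Rate').transpose()

print(sample.columns.tolist())  # group names
print(sample.index.tolist())    # year labels, still strings at this point
```

After the transpose each group is a column, which is the shape the plotting cells expect.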

Convert the year labels in the index to integer years so that the series can be plotted against time.

In [181]:
data.index=pd.to_datetime(data.index, format='%Y').year
data.columns

Index(['Non-Hispanic White', 'Non-Hispanic Black', 'Hispanic', 'Other'], dtype='object', name='Turnout Rate')
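
The conversion above happens in two steps, which the following sketch separates. It assumes the year labels arrive as strings, as they would from CSV column headers; the names `year_labels`, `as_datetime`, and `as_year` are illustrative only.

```python
import pandas as pd

# Year labels as strings (assumed input form, as read from CSV headers).
year_labels = pd.Index(['2020', '2018', '2016'])

# Step 1: parse the labels into timestamps (January 1 of each year).
as_datetime = pd.to_datetime(year_labels, format='%Y')

# Step 2: keep only the year component, which gives plain integer years.
as_year = as_datetime.year

print(type(as_datetime).__name__)
print(list(as_year))
```

Note that the final index holds integers rather than datetime objects; matplotlib plots either form on the x-axis.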

Remove the $\%$ character from the values and convert them to type float.

In [182]:
data['Non-Hispanic White'] = data['Non-Hispanic White'].str.replace('%', '', regex=False).astype(float)
data['Non-Hispanic White']

2020    72.6
2018    55.2
2016    64.7
2014    40.8
2012    61.8
2010    45.0
2008    65.2
2006    44.7
2004    64.3
2002    42.7
2000    57.6
1998    40.6
1996    54.5
1994    44.2
1992    61.6
1990    40.9
1988    55.7
1986    39.8
Name: Non-Hispanic White, dtype: float64

Repeat the same steps for the remaining columns.

In [183]:
data['Non-Hispanic Black'] = data['Non-Hispanic Black'].str.replace('%', '', regex=False).astype(float)
data['Non-Hispanic Black']

2020    65.6
2018    51.3
2016    59.9
2014    36.4
2012    67.4
2010    41.6
2008    69.1
2006    36.6
2004    61.4
2002    37.7
2000    52.9
1998    36.0
1996    48.1
1994    33.2
1992    50.6
1990    33.0
1988    46.8
1986    35.8
Name: Non-Hispanic Black, dtype: float64

In [184]:
data['Hispanic'] = data['Hispanic'].str.replace('%', '', regex=False).astype(float)
data['Hispanic']

2020    52.5
2018    36.9
2016    44.9
2014    21.1
2012    43.1
2010    26.6
2008    46.5
2006    25.5
2004    42.9
2002    25.5
2000    38.9
1998    26.5
1996    37.9
1994    27.3
1992    41.5
1990    26.0
1988    38.5
1986    28.2
Name: Hispanic, dtype: float64

In [185]:
data['Other'] = data['Other'].str.replace('%', '', regex=False).astype(float)
data['Other'] 
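
Since every column gets the same treatment, the cleaning could also be done in one pass. The sketch below is one possible way, not the notebook's own method. It uses a small stand-in frame; the numbers come from the outputs above, while the trailing `%` in the raw strings is an assumption about the file.

```python
import pandas as pd

# Stand-in frame with percent strings (assumed raw form of the CSV values).
sample = pd.DataFrame(
    {'Non-Hispanic White': ['72.6%', '55.2%'],
     'Hispanic': ['52.5%', '36.9%']},
    index=[2020, 2018],
)

# Strip the literal '%' from every column and convert to float in one pass.
# regex=False treats '%' as plain text, so no escaping is needed.
cleaned = sample.apply(
    lambda col: col.str.replace('%', '', regex=False).astype(float)
)

print(cleaned.dtypes)
```

Applying the function column by column keeps the original frame untouched and returns a new, fully numeric one.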

Plot voter turnout in the midterm and presidential elections from 1986 to 2020, with one line for each race and ethnicity group.

In [186]:
plt.style.use('default')  # set the style before the figure is created so it applies
fig, ax = plt.subplots()
ax.plot(data.index, data['Non-Hispanic White'], label='Non-Hispanic White', marker='o')
ax.plot(data.index, data['Non-Hispanic Black'], label='Non-Hispanic Black', marker='o')
ax.plot(data.index, data['Hispanic'], label='Hispanic', marker='o')
ax.plot(data.index, data['Other'], label='Other', marker='o')
fmt = '%.0f%%'  # tick label format, e.g. '40%'
yticks = mtick.FormatStrFormatter(fmt)
ax.yaxis.set_major_formatter(yticks)
ax.grid(True)
plt.xlabel("Year")
plt.ylabel("Percentage")
plt.title("Turnout Rates: Race and Ethnicity")
plt.legend()
plt.show()
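
The series alternates between high and low values because the data mix presidential and midterm years. If only one kind of election is of interest, the years can be split with a simple rule, sketched here on four values from the Non-Hispanic White output above. The variable names are illustrative.

```python
import pandas as pd

# Stand-in series using four values from the Non-Hispanic White output above.
white = pd.Series([72.6, 55.2, 64.7, 40.8],
                  index=[2020, 2018, 2016, 2014],
                  name='Non-Hispanic White')

# U.S. presidential elections fall in years divisible by 4;
# the remaining even years in this data are midterms.
is_presidential = white.index % 4 == 0
presidential = white[is_presidential]
midterm = white[~is_presidential]

print(midterm)
print(presidential)
```

The same boolean mask can be applied to a full DataFrame to plot midterm and presidential turnout separately.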


## Turnout Rates by Age Categories
Import and clean the data.

In [158]:
data_age = pd.read_csv("C:\\Users\\arzum\\Downloads\\CPS Turnout Rates - Age.csv", nrows=4, index_col='Turnout Rate').transpose()
# convert the year labels to integer years
data_age.index = pd.to_datetime(data_age.index, format='%Y').year
# strip the percent sign from every column and convert to float
for col in data_age.columns:
    data_age[col] = data_age[col].str.replace('%', '', regex=False).astype(float)
#Summary statistics
data_age.describe()

Turnout Rate      18-29      30-44      45-59        60+
count         18.000000  18.000000  18.000000  18.000000
mean          31.283333  47.300000  58.566667  63.411111
std           12.142742  11.025852   9.290096   7.410953
min           16.300000  30.100000  42.600000  53.700000
25%           20.300000  36.850000  50.175000  57.125000
50%           32.850000  48.150000  59.350000  65.100000
75%           42.250000  56.725000  66.150000  69.275000
max           52.500000  64.400000  72.900000  78.000000


Make the plots. 

In [177]:
plt.style.use('default')  # set the style before the figure is created so it applies
fig, ax = plt.subplots()
ax.plot(data_age.index, data_age['18-29'], label='18-29', marker='o')
ax.plot(data_age.index, data_age['30-44'], label='30-44', marker='o')
ax.plot(data_age.index, data_age['45-59'], label='45-59', marker='o')
ax.plot(data_age.index, data_age['60+'], label='60+', marker='o')
fmt = '%.0f%%'  # tick label format, e.g. '40%'
yticks = mtick.FormatStrFormatter(fmt)
ax.yaxis.set_major_formatter(yticks)
ax.grid(True)
plt.xlabel("Year")
plt.ylabel("Turnout Rates")
plt.title("Citizen Voting-Age Population: Turnout Rates by Age")
plt.legend()
plt.show()
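
The plotting cells in this notebook repeat the same steps for each table, so they could be folded into one function. The helper below, `plot_turnout`, is a hypothetical name and a sketch only; it assumes a frame with years in the index and one numeric column per group. The stand-in values are taken from the race and ethnicity outputs earlier in the notebook.

```python
import pandas as pd
from matplotlib import pyplot as plt
import matplotlib.ticker as mtick


def plot_turnout(frame, title, ylabel="Turnout Rates"):
    """Plot one line per column of `frame` against its index (hypothetical helper)."""
    fig, ax = plt.subplots()
    for column in frame.columns:
        ax.plot(frame.index, frame[column], label=column, marker='o')
    # Show y-axis ticks as whole percentages, e.g. '40%'.
    ax.yaxis.set_major_formatter(mtick.FormatStrFormatter('%.0f%%'))
    ax.grid(True)
    ax.set_xlabel("Year")
    ax.set_ylabel(ylabel)
    ax.set_title(title)
    ax.legend()
    return ax


# Stand-in frame with three election years per group.
sample = pd.DataFrame(
    {'Non-Hispanic White': [72.6, 55.2, 64.7],
     'Hispanic': [52.5, 36.9, 44.9]},
    index=[2020, 2018, 2016],
)

ax = plot_turnout(sample, "Turnout Rates: Race and Ethnicity", ylabel="Percentage")
```

Calling `plt.show()` afterwards displays the figure. Returning the axes lets the caller adjust labels or limits before showing it.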


## Turnout Rates by Education Level 
Import and clean the data.

In [167]:
data_edu = pd.read_csv("C:\\Users\\arzum\\Downloads\\CPS Turnout Rates - Education.csv", nrows=4, index_col='Turnout Rate').transpose()
# convert the year labels to integer years
data_edu.index = pd.to_datetime(data_edu.index, format='%Y').year
# strip the percent sign from every column and convert to float
for col in data_edu.columns:
    data_edu[col] = data_edu[col].str.replace('%', '', regex=False).astype(float)
#Basic summary statistics
data_edu.describe()

Turnout Rate  Less Than High School  High School Grad  Some College to College Grad  Post-Graduate
count                     18.000000         18.000000                     18.000000      18.000000
mean                      28.816667         42.966667                     58.733333      76.472222
std                        6.467134          8.451801                     11.589752      10.473148
min                       17.600000         29.000000                     41.400000      59.900000
25%                       23.075000         34.575000                     47.750000      67.025000
50%                       31.250000         45.750000                     60.500000      79.450000
75%                       34.050000         49.175000                     68.375000      84.975000
max                       36.900000         54.200000                     76.900000      90.400000


Make the plot. 

In [172]:
plt.style.use('default')  # set the style before the figure is created so it applies
fig, ax = plt.subplots()
ax.plot(data_edu.index, data_edu['Less Than High School'], label='Less Than High School', marker='o')
ax.plot(data_edu.index, data_edu['High School Grad'], label='High School Grad', marker='o')
ax.plot(data_edu.index, data_edu['Some College to College Grad'], label='Some College to College Grad', marker='o')
ax.plot(data_edu.index, data_edu['Post-Graduate'], label='Post-Graduate', marker='o')
fmt = '%.0f%%'  # tick label format, e.g. '40%'
yticks = mtick.FormatStrFormatter(fmt)
ax.yaxis.set_major_formatter(yticks)
ax.grid(True)
plt.xlabel("Year")
plt.ylabel("Turnout Rates")
plt.title("Turnout Rates by Education")
plt.legend()
plt.show()
