# Does the Electoral College Just Disenfranchise Democrats?

With the election behind us, the Electoral College, that forever silliest feature of American presidential elections, is once again bubbling up in the conversation. The topic usually pits Democrats against Republicans, with Democrats favoring an overhaul of the system and Republicans favoring keeping it in place. It's pretty apparent that the Electoral College disadvantages Democrats, who are concentrated in more urban states, but I was curious whether it in fact disenfranchises Republicans as well.

The strongest argument I hear in favor of the Electoral College is that it gives a bump to rural populations who would otherwise never have their concerns met. It's typically Republican voices that raise this concern, and I don't fault them for advocating for their constituents. But I'm not sure that addressing the concern at the state level really makes much sense. Our typical idea of a rural voter might be someone living in Kansas or another Midwestern state, but I think we underestimate how rural and Republican huge swaths of even very populous states can be.

So I decided to pull some state-level data on population, number of registered voters, and partisan composition to see whether the partisan composition of the more populous states differs from the average composition.

I wrote all of this in a Jupyter notebook; you can pull the notebook and the associated data [here](https://github.com/coreyclip/coreyclip.github.io/tree/master/jupyternotebooks).

#### Sources for the data 
* Partisan breakdown of states, where available: 
https://en.wikipedia.org/wiki/Political_party_strength_in_U.S._states

* Registered voters by state: 
https://worldpopulationreview.com/state-rankings/number-of-registered-voters-by-state
World Population Review in turn cites the individual state agencies that report the number of registered voters in each state.


In [10]:
import pandas as pd
import matplotlib.pyplot as plt
import os

## Data Overview

Here are the first five rows of the dataset. I calculated the number of registered Democrats and Republicans per state myself from the original datasets, by multiplying each state's total registered voters by its partisan share.

In [25]:
df = pd.read_csv('data/number_of_registered_voters_by_state.csv')
# Estimate party counts from each state's registered total and partisan share
df['Democrats'] = df['totalRegistered'] * df['DemPercentage']
df['Republicans'] = df['totalRegistered'] * df['RepubPercentage']
# Remaining share: neither Democratic nor Republican
df['IndPercentage'] = 1 - (df['DemPercentage'] + df['RepubPercentage'])
df.set_index('State', inplace=True)
df.head()

State,totalRegistered,Pop,registeredPerc,asOf,DemPercentage,RepubPercentage,Democrats,Republicans,IndPercentage
Alabama,3708804,4908620,0.7556,11/4/2020,0.35,0.52,1298081.4,1928578.08,0.13
Alaska,597319,734002,0.8138,11/3/2020,0.13,0.24,77651.47,143356.56,0.63
Arizona,4281152,7378490,0.5802,11/4/2020,0.33,0.35,1412780.16,1498403.2,0.32
Arkansas,1755775,3039000,0.5777,6/3/2020,0.35,0.48,614521.25,842772.0,0.17
California,22047448,39937500,0.552,10/19/2020,0.45,0.24,9921351.6,5291387.52,0.31
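
Since the party counts are derived rather than reported, a quick sanity check on them is cheap. The sketch below rebuilds the five rows shown above by hand (so it runs without the CSV) and checks that every share is a valid fraction and that the two party counts never exceed the registered total. The names `sample`, `shares_valid`, and `counts_valid` are my own and not part of the original notebook.

```python
import pandas as pd

# Rows copied from the df.head() output above, so no CSV is needed here.
sample = pd.DataFrame(
    {
        "State": ["Alabama", "Alaska", "Arizona", "Arkansas", "California"],
        "totalRegistered": [3708804, 597319, 4281152, 1755775, 22047448],
        "DemPercentage": [0.35, 0.13, 0.33, 0.35, 0.45],
        "RepubPercentage": [0.52, 0.24, 0.35, 0.48, 0.24],
    }
).set_index("State")

# Same derivation as in the notebook cell.
sample["Democrats"] = sample["totalRegistered"] * sample["DemPercentage"]
sample["Republicans"] = sample["totalRegistered"] * sample["RepubPercentage"]
sample["IndPercentage"] = 1 - (sample["DemPercentage"] + sample["RepubPercentage"])

# Check 1: every share is a valid fraction between 0 and 1.
share_cols = ["DemPercentage", "RepubPercentage", "IndPercentage"]
shares_valid = bool(sample[share_cols].apply(lambda s: s.between(0, 1)).all().all())

# Check 2: the two derived counts never exceed the registered total.
counts_valid = bool(
    ((sample["Democrats"] + sample["Republicans"]) <= sample["totalRegistered"]).all()
)

print("shares valid:", shares_valid)
print("counts valid:", counts_valid)
```

The same two checks should work unchanged on the full `df`, assuming every state has both percentages filled in.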


First I looked at the dataset sorted by the number of registered Republicans. Texas, a solidly red state up until recently, contains the most Republicans, but in second place comes solidly blue California. It's worth noting that three states traditionally thought of as swing states, Florida, Ohio, and Pennsylvania, come up as the next three states with the most Republican voters.

In [16]:
df.sort_values('Republicans', ascending=False).head(5)

State,totalRegistered,Pop,registeredPerc,asOf,DemPercentage,RepubPercentage,Democrats,Republicans
Texas,16211198,29472300,0.55,3/1/2020,0.39,0.42,6322367.22,6808703.16
California,22047448,39937500,0.552,10/19/2020,0.45,0.24,9921351.6,5291387.52
Florida,14065627,21993000,0.6396,8/31/2020,0.37,0.35,5204281.99,4922969.45
Ohio,7774767,11747700,0.6618,3/17/2020,0.41,0.45,3187654.47,3498645.15
Pennsylvania,9091371,12820900,0.7091,11/2/2020,0.48,0.38,4363858.08,3454720.98


Looking at the states with the most Democratic voters, we basically have the same states, but with New York swapped in for Ohio.

In [21]:
df.sort_values('Democrats', ascending=False).head(5).reset_index()

,State,totalRegistered,Pop,registeredPerc,asOf,DemPercentage,RepubPercentage,Democrats,Republicans
0,California,22047448,39937500,0.552,10/19/2020,0.45,0.24,9921351.6,5291387.52
1,New York,13555547,19440500,0.6973,11/1/2020,0.51,0.22,6913328.97,2982220.34
2,Texas,16211198,29472300,0.55,3/1/2020,0.39,0.42,6322367.22,6808703.16
3,Florida,14065627,21993000,0.6396,8/31/2020,0.37,0.35,5204281.99,4922969.45
4,Pennsylvania,9091371,12820900,0.7091,11/2/2020,0.48,0.38,4363858.08,3454720.98


Looking at the five states with the most registered voters, we can see that this is the same set of states that have the most Democratic voters.

In [20]:
df.sort_values('totalRegistered', ascending=False).head(5)

State,totalRegistered,Pop,registeredPerc,asOf,DemPercentage,RepubPercentage,Democrats,Republicans
California,22047448,39937500,0.552,10/19/2020,0.45,0.24,9921351.6,5291387.52
Texas,16211198,29472300,0.55,3/1/2020,0.39,0.42,6322367.22,6808703.16
Florida,14065627,21993000,0.6396,8/31/2020,0.37,0.35,5204281.99,4922969.45
New York,13555547,19440500,0.6973,11/1/2020,0.51,0.22,6913328.97,2982220.34
Pennsylvania,9091371,12820900,0.7091,11/2/2020,0.48,0.38,4363858.08,3454720.98
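
Rather than eyeballing the three tables, the overlap can be checked with plain set operations. This is a small sketch that hard-codes the state names from the three top-five tables above; the helper `compare` is my own. On the full frame, something like `set(df.nlargest(5, 'Republicans').index)` should produce the same lists.

```python
# State lists copied from the three top-five tables above.
top_republican = ["Texas", "California", "Florida", "Ohio", "Pennsylvania"]
top_democratic = ["California", "New York", "Texas", "Florida", "Pennsylvania"]
top_registered = ["California", "Texas", "Florida", "New York", "Pennsylvania"]


def compare(a, b):
    """Return (shared, only_in_a, only_in_b) as sorted lists."""
    set_a, set_b = set(a), set(b)
    return sorted(set_a & set_b), sorted(set_a - set_b), sorted(set_b - set_a)


shared, only_rep, only_dem = compare(top_republican, top_democratic)
same_as_registered = set(top_democratic) == set(top_registered)

print("in both lists:", shared)
print("Republican list only:", only_rep)
print("Democratic list only:", only_dem)
print("Democratic list matches most-registered list:", same_as_registered)
```

Four of the five states appear on both partisan lists, which is the point of the comparison: the states with the most Republicans are mostly the states with the most Democrats.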


In [23]:
limit = 5
# Top states by estimated party registration counts
top_rep = df.sort_values('Republicans', ascending=False).head(limit)
top_dem = df.sort_values('Democrats', ascending=False).head(limit)
print(f"percent dem in top {limit} republican states: " + str(top_rep['DemPercentage'].mean().round(2) * 100) + '%')
print(f"percent dem in top {limit} democratic states: " + str(top_dem['DemPercentage'].mean().round(2) * 100) + '%')
print(f"percent republican in top {limit} democratic states: " + str(top_dem['RepubPercentage'].mean().round(2) * 100) + "%")
print(f"percent republican in top {limit} republican states: " + str(top_rep['RepubPercentage'].mean().round(2) * 100) + '%')
# Baseline: simple mean over every state in the dataset (no head(limit) applied)
print("average percent republican across all states: " + str(df['RepubPercentage'].mean().round(2) * 100) + "%")
print("average percent democrat across all states: " + str(df['DemPercentage'].mean().round(2) * 100) + "%")

percent dem in top 5 republican states: 42.0%
percent dem in top 5 democratic states: 44.0%
percent republican in top 5 democratic states: 32.0%
percent republican in top 5 republican states: 37.0%
average percent republican across all states: 37.0%
average percent democrat across all states: 38.0%
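
One caveat about the figures above: each is a simple mean of state percentages, so Alaska counts exactly as much as California. A voter-weighted share answers a slightly different question, namely what fraction of all registered voters in a group of states belongs to each party. The sketch below is only an illustration on the five `df.head()` rows from earlier, not a result for the full dataset, and `weighted_share` is a helper name I made up.

```python
import pandas as pd

# The five rows shown by df.head() earlier in the post (illustrative subset).
sample = pd.DataFrame(
    {
        "State": ["Alabama", "Alaska", "Arizona", "Arkansas", "California"],
        "totalRegistered": [3708804, 597319, 4281152, 1755775, 22047448],
        "DemPercentage": [0.35, 0.13, 0.33, 0.35, 0.45],
        "RepubPercentage": [0.52, 0.24, 0.35, 0.48, 0.24],
    }
).set_index("State")


def weighted_share(frame, pct_col, weight_col="totalRegistered"):
    """Party share of all registered voters in `frame`, weighting by state size."""
    weights = frame[weight_col]
    return float((frame[pct_col] * weights).sum() / weights.sum())


results = {}
for col in ["DemPercentage", "RepubPercentage"]:
    simple = float(sample[col].mean())      # every state counts equally
    weighted = weighted_share(sample, col)  # every registered voter counts equally
    results[col] = (simple, weighted)
    print(f"{col}: simple mean {simple:.1%}, voter-weighted {weighted:.1%}")
```

In this subset California dominates the weighted figure, pulling the Democratic share up and the Republican share down relative to the simple mean. Calling `weighted_share(df, 'RepubPercentage')` on the full frame would show whether the same thing happens nationally.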
