# Electoral College Analysis

The number of electoral votes is based largely on the number of seats in Congress.  Most people understand how the size of the US Senate is determined: 2 senators per state × 50 states = 100 senators.

The size of the House of Representatives needs a bit more explaining.

## The House of Representatives

Each state's delegation in the House of Representatives is meant to be proportional in size to the state's population.

At the start of the United States, there were only 65 seats in the House of Representatives.

George Washington considered the ideal ratio of representatives to voters to be close to 1:30,000.

As of 2016, the *average* ratio is closer to 1:700,000!

### The Reapportionment Act of 1929
The Reapportionment Act of 1929 is a combined census and reapportionment bill that sets the method for apportioning seats in the U.S. House of Representatives after each census such that there are always 435 seats.

A truly proportional division of seats would produce fractional seats, which doesn't work well for voting scenarios.
Since the stakes are high, a formal process needs to be in place.  The process, called "the method of equal proportions", is:

* Every state gets 1 seat to start.  This leaves 435 - 50 = 385 seats.
* The *priority number* for each state is computed.
* The state with the highest priority number is awarded a seat from those remaining.
* The priority number for each state is recomputed, and the process repeats until all the seats have been awarded.

The priority number for a state is computed as follows:
$$A_n = \frac{P}{\sqrt{n(n+1)}}$$

Where $A_n$ is the priority number for a state that has $n$ seats, and $P$ is the population of the state.  The starting priority for each state is therefore:
$$A_1 = \frac{P}{\sqrt{2}}$$
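
As a rough illustration of the formula, the sketch below computes the first few priority numbers for California and Wyoming, using the 2010 populations that appear later in this notebook. The function name `priority_number` is introduced here for illustration only.

```python
import math


def priority_number(population, n):
    """Priority A_n for a state that currently holds n seats."""
    return population / math.sqrt(n * (n + 1))


# 2010 census populations, as reported later in this notebook
populations = {"California": 37253956, "Wyoming": 563626}

for state, pop in populations.items():
    # Priority numbers when the state holds 1, 2, and 3 seats
    values = [priority_number(pop, n) for n in (1, 2, 3)]
    print(state, ["{:,.0f}".format(v) for v in values])
```

Because $n(n+1)$ grows with $n$, each seat a state wins lowers its priority for the next one.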


## Electoral College Votes per State

* Each state gets 1 EC vote for each member of Congress.
* 100 senators + 435 representatives + 3 for the District of Columbia (see the 23rd Amendment) = 538 electoral votes

In [122]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib notebook

In [123]:
pd.options.display.float_format = '{:,.5f}'.format

# The 2010 Census

Data taken from:  https://www.census.gov/popest/data/datasets.html

Section "2010 Census Modified Race Data Summary File"

In [124]:
# Load both county-level files and total the resident population per state
df = pd.read_csv("stco-mr2010_al_mo.csv", encoding="latin-1")
df = df[["STNAME", "RESPOP"]]
df = df.groupby("STNAME").sum()

df2 = pd.read_csv("stco-mr2010_mt_wy.csv", encoding="latin-1")
df2 = df2[["STNAME", "RESPOP"]]
df2 = df2.groupby("STNAME").sum()

# DataFrame.append was removed in pandas 2.0; pd.concat is the replacement
pops = pd.concat([df, df2])
del df
del df2
print(pops.head())
print(pops.tail())

              RESPOP
STNAME              
Alabama      4779736
Alaska        710231
Arizona      6392017
Arkansas     2915918
California  37253956
                RESPOP
STNAME                
Virginia       8001024
Washington     6724540
West Virginia  1852994
Wisconsin      5686986
Wyoming         563626


In [125]:
total_pop = pops.sum()["RESPOP"]
dc_pop = pops.at["District of Columbia", "RESPOP"]
total_state_pop = total_pop - dc_pop
print("Total US Population: {:,}".format(total_pop))

Total US Population: 308,745,538


If House seats were divided exactly proportionally to state populations ...

In [126]:
pops["Fractional Reps"] = (pops["RESPOP"] / total_state_pop) * 435.0
pops.head()

| STNAME | RESPOP | Fractional Reps |
|---|---|---|
| Alabama | 4779736 | 6.74745 |
| Alaska | 710231 | 1.00262 |
| Arizona | 6392017 | 9.02347 |
| Arkansas | 2915918 | 4.11634 |
| California | 37253956 | 52.59061 |


## Implementation of the method of equal proportions

In [127]:
import math

# Every state except the District of Columbia takes part in House apportionment
states = [s for s in pops.index if s != "District of Columbia"]

# Every state starts with 1 seat
state_seats = {state: 1 for state in states}
seats_remaining = 435 - len(states)

while seats_remaining > 0:
    # Find the state with the highest priority number for its next seat
    max_priority = -1
    winning_state = None
    for state in states:
        pop = pops.at[state, "RESPOP"]
        n = state_seats[state]
        priority = pop / math.sqrt(n * (n + 1))
        if priority > max_priority:
            max_priority = priority
            winning_state = state
    # Award the seat; the winner's priority drops on the next pass
    state_seats[winning_state] += 1
    seats_remaining -= 1
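
The loop above rescans every state for each seat. As an alternative sketch (not the approach used in this notebook), the same method can be written with a priority queue from the standard library's `heapq` module. The function name `apportion` and the toy populations are hypothetical and only serve to illustrate the idea.

```python
import heapq
import math


def apportion(populations, total_seats):
    """Assign seats by the method of equal proportions using a heap."""
    seats = {state: 1 for state in populations}  # every state starts with 1 seat
    # heapq is a min-heap, so priorities are stored negated
    heap = [(-pop / math.sqrt(2), state) for state, pop in populations.items()]
    heapq.heapify(heap)
    for _ in range(total_seats - len(populations)):
        _, state = heapq.heappop(heap)  # state with the highest priority
        seats[state] += 1
        n = seats[state]
        # Push the state back with its reduced priority for the next seat
        heapq.heappush(heap, (-populations[state] / math.sqrt(n * (n + 1)), state))
    return seats


# Hypothetical populations, for illustration only
toy = {"A": 6_000_000, "B": 3_000_000, "C": 1_000_000}
toy_seats = apportion(toy, 10)
print(toy_seats)
```

With the notebook's data, something like `apportion(pops["RESPOP"].drop("District of Columbia").to_dict(), 435)` should produce the same allocation, assuming no two states ever tie on priority (this sketch breaks ties by state name, while the loop above keeps the first state encountered).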

### Convert the results from above into a DataFrame

In [128]:
d = dict([(state, [state_seats[state]]) for state in states])
awarded = pd.DataFrame.from_dict(d, orient='index')
awarded.rename(columns={0: "Seats"}, inplace=True)
awarded.head()

| State | Seats |
|---|---|
| Florida | 27 |
| Minnesota | 8 |
| Kansas | 4 |
| New Hampshire | 2 |
| Missouri | 8 |


### Join the results to the main DataFrame

In [129]:
df = pd.merge(pops, awarded, left_index=True, right_index=True)
df.head()

| State | RESPOP | Fractional Reps | Seats |
|---|---|---|---|
| Florida | 18801310 | 26.54141 | 27 |
| Minnesota | 5303925 | 7.48744 | 8 |
| Kansas | 2853118 | 4.02769 | 4 |
| New Hampshire | 1316470 | 1.85843 | 2 |
| Missouri | 5988927 | 8.45444 | 8 |


### A quick check to make sure the seats and fractions both add up to 435 ...

In [130]:
df.sum()

RESPOP            308,143,815.00000
Fractional Reps           435.00000
Seats                     435.00000
dtype: float64

### Add the senate seats to get the EC votes per state

In [131]:
df["EC Votes"] = df["Seats"] + 2

### Add a row for Washington D.C. and give it 3 Electoral Votes

In [132]:
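s = pd.Series({
            "RESPOP": dc_pop,
            "EC Votes": 3
        }, name="District of Columbia")
# DataFrame.append was removed in pandas 2.0; add the row with pd.concat instead
df = pd.concat([df, s.to_frame().T])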

### Compute the fraction of EC votes for each entry, and its fraction of the total population.

In [133]:
df["Fraction EC Votes"] = df["EC Votes"] / 538.0
df["Fraction Total Pop"] = df["RESPOP"] / total_pop


### Another tally so we can make sure the fractions add up.


In [134]:
df.sum()



RESPOP               308,745,538.00000
Fractional Reps              435.00000
Seats                        435.00000
EC Votes                     538.00000
Fraction EC Votes              1.00000
Fraction Total Pop             1.00000
dtype: float64

## Analysis

The chart below is arranged from the state with the highest population to the state with the least population (the District of Columbia is also included).  State population is based on the 2010 census.

The *RESPOP* column is the resident population of the state.

*Fractional Reps* is the number of seats in the House of Representatives a state would receive if the seats could be fractional.  *Seats* shows the actual number of seats awarded, which never appears to be more than 1 away from the fraction that would be awarded.  For some states, this means the gain of a seat, while for others it means a loss, but it is relatively close to the actual proportion.
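
The "never more than 1 away" observation can be spot-checked with a few lines of Python. This is only a sketch over the ten most populous states, with values copied from the table in this section; the names `rows` and `max_gap` are introduced here for illustration.

```python
# (state, fractional seats, awarded seats) for the ten most populous states
rows = [
    ("California", 52.59061, 53),
    ("Texas", 35.49745, 36),
    ("New York", 27.35565, 27),
    ("Florida", 26.54141, 27),
    ("Illinois", 18.11273, 18),
    ("Pennsylvania", 17.93168, 18),
    ("Ohio", 16.28583, 16),
    ("Michigan", 13.95252, 14),
    ("Georgia", 13.67585, 14),
    ("North Carolina", 13.46104, 13),
]

# Largest absolute difference between awarded and fractional seats
max_gap = max(abs(seats - frac) for _, frac, seats in rows)
print("Largest gap: {:.5f}".format(max_gap))
```

On the full DataFrame, the equivalent check would be `(df["Seats"] - df["Fractional Reps"]).abs().max()`.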

*EC Votes* shows the number of electoral votes each state is awarded.  This is equal to the number of seats it has in the House, plus 2 seats in the Senate.  The exception is the District of Columbia, which just gets 3 EC votes flat out, with no regard to its population.

*Fraction EC Votes* shows the fraction of EC votes the state has with respect to the total 538 votes.  This shows that California has a little over 10% of all the EC votes.

*Fraction Total Pop* shows the fraction of the state's population with respect to the total US population (considering only the 50 states and the District of Columbia).  This shows that California has a little over 12% of the total US population.

One of the arguments I often hear in favor of the EC is that it prevents populous states from dominating the election.  Looking at the last 2 columns of this table, I'd argue that this does not appear to be the case.  

I'm going to make the assumption that a state's population is directly proportional to its voting population, which may not be entirely accurate.  I might try to get eligible voter data and refine my results a bit.

The fact that every state gets 2 EC votes for its senators does skew the votes a little in favor of states with small populations.  Wyoming gets 0.6% of the vote in the EC instead of 0.2% of the vote in a popular election.  However, it still has a tiny fraction of the overall vote compared to California, which gets 10% of the EC vote or 12% of the popular vote.  This is due to the fact that the EC votes a state gets are largely dominated by its seats in the House, which are based on resident population.
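
To put a number on that skew, the sketch below divides each state's share of EC votes by its share of the population. The helper `weight_ratio` is a name made up for this example, and Wyoming's 3 EC votes are an assumption (1 House seat plus 2 Senate seats), consistent with the 0.6% figure above.

```python
TOTAL_EC_VOTES = 538
TOTAL_POP = 308_745_538  # total 2010 population printed earlier in the notebook

# (population, EC votes); Wyoming's 3 votes assume 1 House seat + 2 Senate seats
examples = {
    "California": (37_253_956, 55),
    "Wyoming": (563_626, 3),
}


def weight_ratio(pop, ec_votes):
    """EC vote share divided by population share (1.0 means exactly proportional)."""
    return (ec_votes / TOTAL_EC_VOTES) / (pop / TOTAL_POP)


ratios = {state: weight_ratio(pop, ec) for state, (pop, ec) in examples.items()}
for state, ratio in ratios.items():
    print("{}: {:.2f}".format(state, ratio))
```

A ratio above 1 means the state carries more weight in the EC than its population alone would give it; a ratio below 1 means less.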

Going back to my earlier assumption, if the population of a state were incredibly skewed toward citizens too young to vote (i.e. most of the population was under 18), then the EC would actually benefit that state, as House seats are awarded based on population, not just *voter* population.


In [135]:
df.sort_values("Fraction Total Pop", ascending=False)

| State | RESPOP | Fractional Reps | Seats | EC Votes | Fraction EC Votes | Fraction Total Pop |
|---|---|---|---|---|---|---|
| California | 37253956.0 | 52.59061 | 53.0 | 55.0 | 0.10223 | 0.12066 |
| Texas | 25145561.0 | 35.49745 | 36.0 | 38.0 | 0.07063 | 0.08144 |
| New York | 19378102.0 | 27.35565 | 27.0 | 29.0 | 0.0539 | 0.06276 |
| Florida | 18801310.0 | 26.54141 | 27.0 | 29.0 | 0.0539 | 0.0609 |
| Illinois | 12830632.0 | 18.11273 | 18.0 | 20.0 | 0.03717 | 0.04156 |
| Pennsylvania | 12702379.0 | 17.93168 | 18.0 | 20.0 | 0.03717 | 0.04114 |
| Ohio | 11536504.0 | 16.28583 | 16.0 | 18.0 | 0.03346 | 0.03737 |
| Michigan | 9883640.0 | 13.95252 | 14.0 | 16.0 | 0.02974 | 0.03201 |
| Georgia | 9687653.0 | 13.67585 | 14.0 | 16.0 | 0.02974 | 0.03138 |
| North Carolina | 9535483.0 | 13.46104 | 13.0 | 15.0 | 0.02788 | 0.03088 |
