# Electoral College Voting Power Analysis

This project analyzes the voting power of each state in the United States based on the Electoral College system. The goal is to highlight the discrepancies in voting power depending on the state in which a voter resides, demonstrating how this system contradicts the philosophy of "one person, one vote."

## Objectives
- Quantify the voting power of each state's citizens based on their electoral votes and population.
- Visualize the differences in voting power among states.
- Discuss the implications of these discrepancies on democratic principles and election outcomes.


## Importing Required Libraries
Let's start by importing the necessary libraries for data collection, analysis, and visualization.

In [None]:
# Import required packages
from bs4 import BeautifulSoup
import pandas as pd
import requests
import geopandas as gpd
import plotly.express as px
import json
from shapely.geometry import MultiPolygon, Polygon
from shapely.affinity import scale, translate

pd.set_option('display.max_columns', None)

## Data Collection

### Electoral Votes
We collect the number of electoral votes for each state from the National Archives using BeautifulSoup. The scraper locates the table on the page and extracts the text of each cell, which has the form "Alabama - 9 votes". Each string is split on the dash delimiter into state and votes columns, and both are stripped of whitespace. A simple regex pattern then extracts the digits from the votes string and converts them to an integer.

In [None]:
electoral_college_url = "https://www.archives.gov/electoral-college/allocation"
r = requests.get(electoral_college_url)
html_doc = r.text
soup = BeautifulSoup(html_doc, 'html.parser')

# Find the table body
table = soup.find('tbody')

# Initialize a list to store the rows of data
data = []

# Find table rows and table data within the table
for tr in table.find_all('tr'):
    row = []
    for td in tr.find_all('td'):
        row.append(td.text.strip())
    data.append(row)

# Flatten the list of rows into a single list of cell strings
# (avoid naming the loop variable `list`, which would shadow the built-in)
state_votes = [element for row in data for element in row]

# Convert list elements into a df
ec_df = pd.DataFrame(state_votes)

# Split string 'Alabama - 9 votes' into 'Alabama' and '9 votes'
split_states = ec_df[0].str.split("-", expand=True)
ec_df = ec_df.assign(state=split_states[0].str.strip(), votes=split_states[1].str.strip())

# Remove votes from string and convert to integer
pattern = r'(\d+)'
ec_df['votes'] = ec_df['votes'].str.extract(pattern).astype(int)

# Drop original column, keeping only state and vote columns, and save as electoral votes df
electoral_votes = ec_df.drop(columns=[0])
#print(electoral_votes)

A sample of electoral_votes looks like this:
| state        | votes |
|--------------|:-----:|
| Alabama      | 9     |
| Kentucky     | 8     |
| North Dakota | 3     |
| Alaska       | 3     |
| Louisiana    | 8     |
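
The split-and-extract step can be tried offline on a few hand-written strings. The sketch below is an illustration, not the notebook's exact code: the `sample` strings are made up to mirror the "Alabama - 9 votes" pattern, and the `n=1` and `expand=False` arguments are additions assumed here so that only the first dash acts as the delimiter and the extracted digits come back as a Series.

```python
import pandas as pd

# Hypothetical sample strings in the same "State - N votes" shape as the scraped cells
sample = pd.Series(["Alabama - 9 votes", "North Dakota - 3 votes", "Alaska - 3 votes"])

# Split on the first dash only: column 0 holds the state, column 1 the votes text
parts = sample.str.split("-", n=1, expand=True)

parsed = pd.DataFrame({
    "state": parts[0].str.strip(),
    # Pull the first run of digits out of "9 votes" and convert it to an integer
    "votes": parts[1].str.extract(r"(\d+)", expand=False).astype(int),
})

print(parsed)
```

Limiting the split to the first dash is a defensive choice: it keeps a state name and its vote count in exactly two columns even if a later part of the string contained another dash.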

### Census Population
Next, we will gather the latest population estimates for each state from the United States Census Bureau. This data is already in good form and needs few manipulations. We also merge the datasets together, drop unnecessary columns, and scale the population into millions.

In [None]:
## Gather Census Population ##
census_url = 'https://www2.census.gov/programs-surveys/popest/datasets/2020-2023/state/totals/NST-EST2023-ALLDATA.csv'
census_df = pd.read_csv(census_url)

# Retain only relevant columns
census_df = census_df[['NAME', 'POPESTIMATE2023']]

# Filter to only states present in the electoral college dataset
states_df = census_df[census_df['NAME'].isin(electoral_votes['state'])]

# Merge into one df based on state name
merged_df = electoral_votes.merge(states_df, left_on='state', right_on='NAME')

# Drop unnecessary column and rename population column for consistency
df = merged_df.drop(columns=['NAME']).rename(columns={'POPESTIMATE2023': 'population_2023'})

# Scale the population into millions for readability
df['pop_mlns'] = df['population_2023']/1000000
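
Because the merge keeps only names that match exactly in both datasets, a spelling difference would silently drop a row. One way to check for this, sketched here with toy data, is a left merge with `indicator=True`. The frames `votes` and `census` are stand-ins for `electoral_votes` and `census_df`, and the "D.C." row with its population is a made-up mismatch for illustration only.

```python
import pandas as pd

# Toy stand-ins for electoral_votes and census_df; the third row is a made-up mismatch
votes = pd.DataFrame({"state": ["Alabama", "Alaska", "D.C."], "votes": [9, 3, 3]})
census = pd.DataFrame({
    "NAME": ["Alabama", "Alaska", "District of Columbia"],
    "POPESTIMATE2023": [5108468, 733406, 700000],
})

# A left merge keeps every electoral-vote row; the _merge column records whether it matched
check = votes.merge(census, left_on="state", right_on="NAME", how="left", indicator=True)

# Rows marked "left_only" have no population and would vanish in an inner merge
unmatched = check.loc[check["_merge"] == "left_only", "state"].tolist()
print(unmatched)
```

An empty `unmatched` list would suggest that every state in the electoral votes table found a population figure.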

## Calculating Voting Power
We now calculate each state's voting power by dividing its electoral votes by its population in millions, which gives electoral votes per million people. This metric describes the voting power of an individual in each state, but we also want to compare states with one another and use the result for color scaling. To do so, we take the unweighted mean of every state's votes per million people as the baseline and express each state as a percentage above or below it. As of 2023, this mean is 2.17 electoral votes per million people.

In [None]:
# Calculate the electoral votes per million people
df['votes_per_mln_people'] = round(df['votes']/df['pop_mlns'], 2)

# Determine a baseline of average electoral votes per million people and calculate the % of each state over or under
avg_vote = df['votes_per_mln_people'].mean()
df['voting_power_over_avg'] = round((df['votes_per_mln_people'] - avg_vote)/avg_vote * 100, 1)

#print(df)

A sample of df looks like this:
| state        | votes | population_2023 | pop_mlns | votes_per_mln_people | voting_power_over_avg |
|--------------|:-----:|:---------------:|:--------:|:--------------------:|:---------------------:|
| Alabama      | 9     | 5108468         | 5.108468 | 1.76                 | -18.9                 |
| Kentucky     | 8     | 4526154         | 4.526154 | 1.77                 | -18.4                 |
| North Dakota | 3     | 783926          | 0.783926 | 3.83                 | 76.6                  |
| Alaska       | 3     | 733406          | 0.733406 | 4.09                 | 88.6                  |
| Louisiana    | 8     | 4573749         | 4.573749 | 1.75                 | -19.3                 |
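
As a worked check of the arithmetic, the sketch below recomputes two rows of the sample table in plain Python. It assumes the rounded baseline of 2.17 quoted in the text, whereas the notebook uses the unrounded mean, so the percentage can differ from the table in the last digit.

```python
# Rows copied from the sample table above: (state, electoral votes, 2023 population)
rows = [
    ("Alabama", 9, 5108468),
    ("Alaska", 3, 733406),
]

# Assumption: the rounded baseline quoted in the text, not the notebook's unrounded mean
AVG_VOTES_PER_MLN = 2.17

results = {}
for state, votes, population in rows:
    # Electoral votes per million residents
    votes_per_mln = round(votes / (population / 1_000_000), 2)
    # Percentage above (positive) or below (negative) the baseline
    pct_over_avg = round((votes_per_mln - AVG_VOTES_PER_MLN) / AVG_VOTES_PER_MLN * 100, 1)
    results[state] = (votes_per_mln, pct_over_avg)

print(results)
```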

## Visualization
We will create visualizations to show the voting power distribution across the United States using geopandas and a shapefile from the US Census.

In [None]:
## Load US Shapefile ##
us_url = 'https://www2.census.gov/geo/tiger/GENZ2018/shp/cb_2018_us_state_20m.zip'
gdf = gpd.read_file(us_url)
#print(gdf)

# Merge shapefile with calculated df
merged = gdf.merge(df, left_on='NAME', right_on='state')

To better visualize Alaska and Hawaii, we transform their coordinates to bring them closer to the continental US and rescale them. The labels plotted on top of the states use their own coordinates, so the label positions are recalculated from the transformed geometries for later use.

In [None]:
# Custom function to transform the coordinates of Alaska and Hawaii
def transform_state_geometry(geometry, scale_factor, x_offset, y_offset):
    if isinstance(geometry, MultiPolygon):
        return MultiPolygon([translate(scale(part, xfact=scale_factor, yfact=scale_factor, origin='center'), xoff=x_offset, yoff=y_offset) for part in geometry.geoms])
    elif isinstance(geometry, Polygon):
        return translate(scale(geometry, xfact=scale_factor, yfact=scale_factor, origin='center'), xoff=x_offset, yoff=y_offset)
    else:
        raise TypeError(f"Geometry type {type(geometry)} is not supported for transformation.")

# Apply transformations to Alaska and Hawaii to move them closer to the continent
for idx, row in merged.iterrows():
    if row['STUSPS'] == 'AK':
        merged.at[idx, 'geometry'] = transform_state_geometry(row['geometry'], 0.35, 20, -10)
    elif row['STUSPS'] == 'HI':
        merged.at[idx, 'geometry'] = transform_state_geometry(row['geometry'], 1.2, 30, 10)

# Recalculate the representative points for label placement
merged['rep_point'] = merged['geometry'].apply(lambda geom: geom.representative_point().coords[:][0])
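
To see what the scale-then-translate step does to coordinates, here is a plain-Python sketch of the same arithmetic on a simple square. The helper `scale_and_shift` is hypothetical and is not part of shapely; it assumes that scaling about `'center'` means scaling about the center of the shape's bounding box.

```python
def scale_and_shift(coords, scale_factor, x_offset, y_offset):
    """Scale a ring of (x, y) points about its bounding-box center, then shift it."""
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    # Center of the bounding box (assumed equivalent to origin='center')
    cx = (min(xs) + max(xs)) / 2
    cy = (min(ys) + max(ys)) / 2
    return [
        (cx + (x - cx) * scale_factor + x_offset,
         cy + (y - cy) * scale_factor + y_offset)
        for x, y in coords
    ]

# A 2x2 square shrunk to half size, then moved 10 units right and 5 units down
square = [(0, 0), (2, 0), (2, 2), (0, 2)]
moved = scale_and_shift(square, 0.5, 10, -5)
print(moved)
# [(10.5, -4.5), (11.5, -4.5), (11.5, -3.5), (10.5, -3.5)]
```

Note that the notebook's function scales each part of a MultiPolygon about that part's own center, so separate islands shrink in place rather than moving toward one another.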

We now create a choropleth map to visualize the voting power distribution. Since a few jurisdictions, such as Wyoming, Vermont, and DC, are severe outliers, the color range is clipped to the 5th and 95th percentiles. Otherwise, most states would appear near the average (white). A custom color scale is also needed so that the color spectrum is centered at 0 even though the clipped range is not symmetric around it.

In [None]:
# Calculate the 5th and 95th percentiles for color scaling, dealing with outliers
voting_power_percentiles = merged['voting_power_over_avg'].quantile([0.05, 0.95])
vmin, vmax = voting_power_percentiles[0.05], voting_power_percentiles[0.95]

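# Define custom color scale with white at 0.
# Scale positions are normalized to [0, 1] over [vmin, vmax],
# so the value 0 sits at -vmin / (vmax - vmin). This assumes vmin < 0 < vmax.
color_scale = [
    [0, 'red'],
    [-vmin / (vmax - vmin), 'white'],
    [1, 'blue']
]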

merged_geojson = json.loads(merged.to_json())
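
The position of a value on a continuous color scale follows from normalizing the color range to the interval from 0 to 1. The sketch below shows that mapping with clipping at the percentile bounds. The function name `normalized_position` and the example bounds of -30 and 90 are assumptions for illustration, not values from the dataset.

```python
def normalized_position(value, vmin, vmax):
    """Map a value to its position in [0, 1] on a color scale spanning [vmin, vmax]."""
    if not vmin < vmax:
        raise ValueError("expected vmin < vmax")
    # Values outside the range are clipped, so outliers take the end colors
    clipped = max(vmin, min(value, vmax))
    return (clipped - vmin) / (vmax - vmin)

# Hypothetical percentile bounds for the percentage difference from average
example_vmin, example_vmax = -30.0, 90.0

print(normalized_position(0, example_vmin, example_vmax))    # 0.25
print(normalized_position(150, example_vmin, example_vmax))  # 1.0
print(normalized_position(-50, example_vmin, example_vmax))  # 0.0
```

With these bounds the value 0 falls a quarter of the way along the scale, which is where the white color stop would need to go for states at the average to appear white.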

A few changes improve the readability of the map. Each state is labeled with its electoral votes per million people, the hover pane shows extra details, the custom color scale is applied, and the map is zoomed in and centered using latitude and longitude. We also add a title, a subtitle showing the average votes used as the baseline, and annotations beneath the title listing the top 5 and bottom 5 states by voting power.

In [None]:
# Plot figure
fig = px.choropleth(
    merged,
    geojson=merged_geojson,
    locations=merged.index,
    color='voting_power_over_avg',
    hover_name='NAME',
    hover_data={'population_2023': True, 'votes': True, 'voting_power_over_avg': True, 'STUSPS': False},
    color_continuous_scale=color_scale,
    range_color=(vmin, vmax),
    labels={
        'population_2023': 'Population (2023)',
        'votes': 'Electoral Votes',
        'voting_power_over_avg': 'Percentage Difference from Average (%)'
    }
)

# Add state labels
for i, row in merged.iterrows():
    fig.add_scattergeo(
        locationmode='USA-states',
        lon=[row['rep_point'][0]],
        lat=[row['rep_point'][1]],
        text=row["votes_per_mln_people"],
        mode='text',
        textfont=dict(size=10, color='black', family="Arial", weight="bold"),
        showlegend=False,
        hovertext=f'{row["votes_per_mln_people"]} Electoral Votes per Million People',
        hoverinfo='text'
    )

# Adjust centering and scaling of map
fig.update_geos(
    projection_scale=5,
    center=dict(lat=40, lon=-100),
    visible=False
)

fig.update_layout(
    title={
        'text': f"Voting Power per Million People by State<br><sup>Average Electoral Votes per Million People: {avg_vote:.2f}</sup>",
        'x':0.5
    }
)

# Identify the top 5 and bottom 5 states based on their voting power
top_5_states = merged.nlargest(5, 'votes_per_mln_people')
bottom_5_states = merged.nsmallest(5, 'votes_per_mln_people')

# Add annotation of the top/bottom 5 states for the plot
top_5_text = "<br>".join([f"{row['STUSPS']}: {row['votes_per_mln_people']} votes/million | {row['voting_power_over_avg']}%" for idx, row in top_5_states.iterrows()])
bottom_5_text = "<br>".join([f"{row['STUSPS']}: {row['votes_per_mln_people']} votes/million | {row['voting_power_over_avg']}%" for idx, row in bottom_5_states.iterrows()])

annotations = [
    dict(
        x=0.45,
        y=0.95,
        xref='paper',
        yref='paper',
        text=f"<b>Top 5 States</b><br>{top_5_text}",
        showarrow=False,
        align='center'
    ),
    dict(
        x=0.75,
        y=0.95,
        xref='paper',
        yref='paper',
        text=f"<b>Bottom 5 States</b><br>{bottom_5_text}",
        showarrow=False,
        align='center'
    )
]

fig.update_layout(annotations=annotations)

fig.show()

The resulting map shows each state shaded by voting power: deeper red for the fewest electoral votes per person (least voting power), deeper blue for the most (greatest voting power), and closer to white when near the cross-state average of 2.17 electoral votes per million people. Each state carries a label showing its electoral votes per million people, and hovering over the label displays a short explanation of that number. Hovering over the state itself (not the label) displays the state's population, its number of electoral votes, and its percentage above or below the average.

As a result, we see that some states give each individual significantly more voting power than others. For example, Wyoming provides 5.14 electoral votes per million people (137% above the average), whereas Texas provides 1.31 electoral votes per million people (39.6% below the average). There is an observable pattern in which state population and voting power are inversely related. California, Texas, Florida, and New York are among the most populous states and have some of the lowest voting power. The relationship does not hold for every state and is most apparent where a state's population is far above or below the average.

## Final Note
The Electoral College was originally established to ensure that presidential candidates represented the interests of states with smaller populations, preventing them from being overshadowed by more populous states. This system allocates a fixed number of electoral votes to each state, based on its representation in Congress (the sum of its Senators and Representatives). However, this arrangement has significant implications for the principle of "one person, one vote."

In recent presidential elections, including those of 2000 and 2016, the candidate who won the popular vote did not win the presidency because of the distribution of electoral votes. This discrepancy highlights how the voting power of individuals can vary greatly depending on the state in which they reside. For example, a vote in a smaller state like Wyoming carries more weight in the Electoral College than a vote in a larger state like California. This uneven distribution of voting power raises questions about the fairness and democratic nature of the Electoral College system, as it can lead to election outcomes that do not reflect the popular will of the nation.

