# In What Electoral System Would Donald Trump Have Won The 2020 US Election?

## Introduction

The Electoral College has been a fixed part of US presidential elections since the founding of the country. In this system, the citizens of each state vote for electors, who in turn cast their votes for the presidential candidates. The number of electors of each state is the sum of its number of Senators and its number of Representatives. The Electoral College consists of 538 electors in total, and each elector's vote counts as one vote. The candidate who wins a majority, which means at least 270 votes, wins the election. The system was created with the good intention of balancing the power of small and large states. However, criticism has grown louder after several elections in which the winner of the popular vote lost because of the outcome of the Electoral College vote. This has led to questions about whether the system accurately represents the will of US citizens and whether there might be a better way to elect the President.

Joe Biden won the 2020 election. The goal of this paper is to find out whether any electoral system exists under which Donald Trump would have won the 2020 election based on the original votes.

The electoral systems proposed in this paper all share one assumption: each voting unit (county or state) casts all of its votes for a single candidate. This concept, known as the "winner-take-all" principle, is similar to the existing voting system in 48 of the 50 states and Washington, D.C.

Because data analysis shows that Republican voters are disproportionately located in rural areas and small cities, the four electoral systems proposed in this paper put more weight on the number of voting units (counties or states) than on the population of each voting unit. As a result, three of the four electoral systems would have made Donald Trump the winner of the 2020 election.

### Variables
- Input $X_1$: Raw votes of DEM in counties
- Input $X_2$: Raw votes of REP in counties
- Input $X_3$: Raw total votes in counties
  - $X_1$, $X_2$ and $X_3$ are vectors. Their components $X_{1,i}$, $X_{2,i}$ and $X_{3,i}$, where $i \in [1, n]$, are the raw votes of DEM, the raw votes of REP and the raw total votes in county $i$, and $n$ is the number of counties.
- Output $Y_1$: Fabricated Votes of DEM in US
- Output $Y_2$: Fabricated Votes of REP in US
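
The inputs above can be derived from a long-format table that has one row per county and candidate. The following is a minimal sketch under stated assumptions: the column names mirror `president_county_candidate.csv`, but the rows are made-up toy numbers, and the helper name `county_vectors` is hypothetical.

```python
import pandas as pd

# Toy long-format data (made-up numbers), one row per county and party.
toy = pd.DataFrame({
    "state": ["A", "A", "A", "A", "A", "A"],
    "county": ["c1", "c1", "c1", "c2", "c2", "c2"],
    "party": ["DEM", "REP", "LIB", "DEM", "REP", "LIB"],
    "total_votes": [60, 30, 10, 20, 70, 10],
})

def county_vectors(long_df):
    """Return one row per county with X1 (DEM), X2 (REP) and X3 (all parties)."""
    wide = long_df.pivot_table(index=["state", "county"], columns="party",
                               values="total_votes", aggfunc="sum", fill_value=0)
    out = pd.DataFrame({
        "X1": wide.get("DEM", 0),   # 0 if the party never appears
        "X2": wide.get("REP", 0),
        "X3": wide.sum(axis=1),     # raw total votes across every party
    })
    return out.reset_index()

vectors = county_vectors(toy)
print(vectors)
```

Each row of `vectors` corresponds to one index $i$, so the three columns play the roles of $X_{1,i}$, $X_{2,i}$ and $X_{3,i}$.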

* **Electoral System I**
Each county has the same number of votes as its voters and casts all of its votes for the party (candidate) that won the majority of votes.
According to Electoral System I,

$$ Y_1 = \sum \limits _{i=1} ^{|Counties|} (if (X_{1,i} > X_{2,i}), X_{3,i}, 0) $$

$$ Y_2 = \sum \limits _{i=1} ^{|Counties|} (if (X_{1,i} < X_{2,i}), X_{3,i}, 0) $$

* **Electoral System II**
Each county has one vote and casts it for the party (candidate) that won the majority of votes.
According to Electoral System II,

$$ Y_1 = \sum \limits _{i=1} ^{|Counties|} (if (X_{1,i} > X_{2,i}), 1, 0) $$

$$ Y_2 = \sum \limits _{i=1} ^{|Counties|} (if (X_{1,i} < X_{2,i}), 1, 0) $$

* **Electoral System III**
Each state has the same number of votes as its voters and casts all of its votes for the party (candidate) that won the majority of counties.
According to Electoral System III,

$$ Y_{1} = \sum \limits _{i=1} ^{|States|} (if (DEMC_{i} > REPC_{i}), TVofState_{i}, 0) $$

$$ Y_{2} = \sum \limits _{i=1} ^{|States|} (if (DEMC_{i} < REPC_{i}), TVofState_{i}, 0) $$

$$ DEMC_{i} = \sum \limits _{j=1} ^{|Counties-in-a-state_i|} (if (X_{1,j} > X_{2,j}), 1, 0) $$

$$ REPC_{i} = \sum \limits _{j=1} ^{|Counties-in-a-state_i|} (if (X_{1,j} < X_{2,j}), 1, 0) $$

$$ TVofState_{i} = \sum \limits _{j=1} ^{|Counties-in-a-state_i|} X_{3,j} $$

* **Electoral System IV**
Each state has one vote and casts it for the party (candidate) that won the majority of counties.
According to Electoral System IV,

$$ Y_{1} = \sum \limits _{i=1} ^{|States|} (if (DEMC_{i} > REPC_{i}), 1, 0) $$

$$ Y_{2} = \sum \limits _{i=1} ^{|States|} (if (DEMC_{i} < REPC_{i}), 1, 0) $$

$$ DEMC_{i} = \sum \limits _{j=1} ^{|Counties-in-a-state_i|} (if (X_{1,j} > X_{2,j}), 1, 0) $$

$$ REPC_{i} = \sum \limits _{j=1} ^{|Counties-in-a-state_i|} (if (X_{1,j} < X_{2,j}), 1, 0) $$
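
The four definitions above can be sketched in a few lines of pandas. This is an illustration on made-up toy data, not the notebook's own pipeline; the function name `tally_all_systems` and the column names `X1`, `X2`, `X3` are assumptions chosen to match the symbols in the formulas. Because the formulas use strict inequalities, a tied county or state contributes to neither $Y_1$ nor $Y_2$.

```python
import pandas as pd

# Toy county table (made-up numbers): X1 = DEM votes, X2 = REP votes, X3 = total votes.
counties = pd.DataFrame({
    "state": ["A", "A", "A", "B", "B"],
    "X1": [900, 10, 20, 300, 40],
    "X2": [100, 60, 50, 200, 50],
    "X3": [1000, 70, 70, 500, 90],
})

def tally_all_systems(counties):
    """Return Y1 (DEM) and Y2 (REP) under Electoral Systems I to IV."""
    dem_county = counties["X1"] > counties["X2"]
    rep_county = counties["X1"] < counties["X2"]
    # Per-state county counts (DEMC, REPC) and total votes (TVofState).
    per_state = pd.DataFrame({
        "DEMC": dem_county.groupby(counties["state"]).sum(),
        "REPC": rep_county.groupby(counties["state"]).sum(),
        "TV": counties.groupby("state")["X3"].sum(),
    })
    dem_state = per_state["DEMC"] > per_state["REPC"]
    rep_state = per_state["DEMC"] < per_state["REPC"]
    rows = {
        "I": (counties.loc[dem_county, "X3"].sum(), counties.loc[rep_county, "X3"].sum()),
        "II": (dem_county.sum(), rep_county.sum()),
        "III": (per_state.loc[dem_state, "TV"].sum(), per_state.loc[rep_state, "TV"].sum()),
        "IV": (dem_state.sum(), rep_state.sum()),
    }
    return pd.DataFrame.from_dict(rows, orient="index", columns=["Y1", "Y2"])

result = tally_all_systems(counties)
print(result)
```

In this toy example one large county gives DEM the lead under System I, while the larger number of small REP counties decides Systems II to IV. State B is tied at one county each, so it adds nothing to either side under Systems III and IV.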

## Data Cleaning/Loading

### Datasets used in this article
1. Kaggle: US Election 2020
    The information includes states, counties, candidates, parties, and their respective votes.
2. SimpleMaps: US cities except for District of Columbia
    The information includes the counties' latitude and longitude.
3. Census Reporter: District of Columbia
    The information includes the latitude and longitude of the 8 Wards of the District of Columbia.
    Note: This information was looked up manually and entered by hand into the second dataset.

### Data Loading

In [518]:
import pandas as pd
import numpy as np; np.random.seed(42)
import matplotlib.pyplot as plt
import seaborn as sns

Note: the record for Maine, Glenwood Plt., Donald Trump in president_county_candidate.csv was wrong; it has been corrected based on
https://www.nytimes.com/interactive/2020/11/03/us/elections/results-maine.html

In [519]:
df = pd.read_csv('/Users/jiaxincui/Documents/GitHub/git-exercise-AllysonCui/Data/Kaggle/president_county_candidate.csv')
df.head()

Unnamed: 0,state,county,candidate,party,total_votes,won
0,Delaware,Kent County,Joe Biden,DEM,44552,True
1,Delaware,Kent County,Donald Trump,REP,41009,False
2,Delaware,Kent County,Jo Jorgensen,LIB,1044,False
3,Delaware,Kent County,Howie Hawkins,GRN,420,False
4,Delaware,New Castle County,Joe Biden,DEM,195034,True


In [520]:
df_pc = pd.read_csv('/Users/jiaxincui/Documents/GitHub/git-exercise-AllysonCui/Data/Kaggle/president_county.csv')
df_pc.head()

Unnamed: 0,state,county,current_votes,total_votes,percent
0,Delaware,Kent County,87025,87025,100
1,Delaware,New Castle County,287633,287633,100
2,Delaware,Sussex County,129352,129352,100
3,District of Columbia,Ward 1,41681,41681,100
4,District of Columbia,Ward 2,32881,32881,100


In [521]:
df_geo = pd.read_csv('/Users/jiaxincui/Documents/GitHub/git-exercise-AllysonCui/Data/Other/uscities.csv')
df_geo.head()

Unnamed: 0,city,city_ascii,state_id,state_name,county_fips,county_name,lat,lng,population,density,source,military,incorporated,timezone,ranking,zips,id
0,New York,New York,NY,New York,36081,Queens,40.6943,-73.9249,18680025,10768.0,shape,False,True,America/New_York,1,11229 11228 11226 11225 11224 11222 11221 1122...,1840034016
1,Los Angeles,Los Angeles,CA,California,6037,Los Angeles,34.1141,-118.4068,12531334,3267.0,shape,False,True,America/Los_Angeles,1,91367 90291 90293 90292 91316 91311 90035 9003...,1840020491
2,Chicago,Chicago,IL,Illinois,17031,Cook,41.8375,-87.6866,8586888,4576.0,shape,False,True,America/Chicago,1,60018 60649 60641 60640 60643 60642 60645 6064...,1840000494
3,Miami,Miami,FL,Florida,12086,Miami-Dade,25.784,-80.2101,6076316,4945.0,shape,False,True,America/New_York,1,33128 33129 33125 33126 33127 33149 33144 3314...,1840015149
4,Dallas,Dallas,TX,Texas,48113,Dallas,32.7935,-96.7667,5910669,1522.0,shape,False,True,America/Chicago,1,75098 75287 75230 75231 75236 75237 75235 7525...,1840019440


In [522]:
pd.set_option('display.max_columns', None)

### Data Cleaning

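1. Strip the " County" suffix from the county names so that they match the county names in the geographic dataset. The rows of US Election 2020 where the candidate did not win are removed in the next cell, after the merge.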

In [523]:
df['county'] = df['county'].str.replace(' County', '')
df_pc['county'] = df_pc['county'].str.replace(' County', '')
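
A plain `str.replace(' County', '')` removes the substring wherever it occurs in a name. A slightly stricter variant, sketched below on toy names, anchors the pattern at the end of the string so that only a trailing suffix is stripped. "Orange County Heights" is a made-up name used only to show the difference.

```python
import pandas as pd

# Toy names; the third one is hypothetical and contains " County" in the middle.
names = pd.Series(["Kent County", "New Castle County", "Orange County Heights", "Ward 1"])

# The trailing "$" restricts the match to the end of each name.
cleaned = names.str.replace(r" County$", "", regex=True)
print(cleaned.tolist())
```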

2. Match the geographical coordinates with votes.



In [524]:
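# Work on copies so that the raw dataframes stay untouched
df_copy = df.copy()
df_pc_copy = df_pc.copy()
df_geo_copy = df_geo.copy()
# Keep only the state, county, lat and lng columns of the geographic data
df_geo_copy.drop(['city', 'city_ascii', 'state_id', 'county_fips', 'source', 'military', 'incorporated', 'timezone', 'ranking', 'zips', 'id', 'population', 'density'], axis = 1, inplace = True)
df_geo_copy.rename(columns={"state_name": "state", "county_name": "county"}, inplace = True)
# Several cities share a county; keep the coordinates of the first city listed
df_geo_copy.drop_duplicates(subset = ['state', 'county'], keep = 'first', inplace = True)
# Attach coordinates to every (county, candidate) row
df_county_statistics = pd.merge(df_copy, df_geo_copy, on=['county', 'state'], how = 'left')
df_pc_copy.drop(['total_votes', 'percent'], axis = 1, inplace = True)
df_county = df_county_statistics.copy()
# Keep only the winning candidate of each county (the left merge keeps df_copy's row order and index)
df_county.drop(df_copy[df_copy['won'] == False].index, inplace = True)
df_county.reset_index(inplace=True)
# Replace the winner's own votes by the county's total votes.
# This relies on df_county and df_pc_copy listing the counties in the same order.
df_county['total_votes'] = df_pc_copy['current_votes']
df_county.drop(['index'], axis = 1, inplace = True)
df_county.head()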

Unnamed: 0,state,county,candidate,party,total_votes,won,lat,lng
0,Delaware,Kent,Joe Biden,DEM,87025,True,39.161,-75.5202
1,Delaware,New Castle,Joe Biden,DEM,287633,True,39.7415,-75.5416
2,Delaware,Sussex,Donald Trump,REP,129352,True,38.9091,-75.4227
3,District of Columbia,Ward 1,Joe Biden,DEM,41681,True,38.9072,-77.0369
4,District of Columbia,Ward 2,Joe Biden,DEM,32881,True,38.9063,-77.034
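
The cell above attaches the county totals by row position. As an alternative sketch, the totals can be joined on the `state` and `county` keys, which does not depend on row order. The frames below are small stand-ins built from the Delaware rows shown earlier; `validate="one_to_one"` asks pandas to raise an error if a key is duplicated on either side.

```python
import pandas as pd

# Stand-in for the winners table: one row per county.
winners = pd.DataFrame({
    "state": ["Delaware", "Delaware"],
    "county": ["Kent", "Sussex"],
    "party": ["DEM", "REP"],
})
# Stand-in for president_county.csv, deliberately listed in a different order.
totals = pd.DataFrame({
    "state": ["Delaware", "Delaware", "Delaware"],
    "county": ["Sussex", "New Castle", "Kent"],
    "current_votes": [129352, 287633, 87025],
})

merged = winners.merge(totals, on=["state", "county"], how="left", validate="one_to_one")
merged = merged.rename(columns={"current_votes": "total_votes"})
print(merged)
```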


Calculate the data for the summary statistics tables:

In [525]:
df_county_won = df_county.copy()
df_county_won.drop(df_county_won[df_county_won['candidate'] != 'Donald Trump'].index, inplace = True)
df_county_won.drop(['lat', 'lng'], axis = 1, inplace = True)
df_county_won.rename(columns={"total_votes": "county_total_votes"}, inplace = True)
df_county_won = pd.merge(df_county_won, df_copy, on=['county', 'state', 'candidate', 'party', 'won'], how='left')
df_county_won.rename(columns={"total_votes": "won_votes"}, inplace = True)
df_county_won = df_county_won[['state', 'county', 'candidate', 'party', 'won', 'won_votes', 'county_total_votes']]
df_county_won['share of votes in the winning counties'] = \
    df_county_won['won_votes'] / df_county_won['county_total_votes']
#df_county_won['share of votes in the winning counties'] = \
#    df_county_won['share of votes in the winning counties'].apply(lambda x:round(x,2))
df_county_won.head()

Unnamed: 0,state,county,candidate,party,won,won_votes,county_total_votes,share of votes in the winning counties
0,Delaware,Sussex,Donald Trump,REP,True,71230,129352,0.550668
1,Florida,Baker,Donald Trump,REP,True,11911,14059,0.847215
2,Florida,Bay,Donald Trump,REP,True,66097,93024,0.710537
3,Florida,Bradford,Donald Trump,REP,True,10334,13632,0.758069
4,Florida,Brevard,Donald Trump,REP,True,207883,360764,0.57623


In [526]:
df_county_lost = df_county_statistics.copy()
df_county_lost.drop(df_copy[df_copy['won'] == True].index, inplace = True)
df_county_lost.drop(df_county_lost[df_county_lost['candidate'] != 'Donald Trump'].index, inplace = True)
df_county_lost.drop(['lat', 'lng'], axis = 1, inplace = True)
df_county_lost.reset_index(inplace=True)
df_county_lost.rename(columns={"total_votes": "won_votes"}, inplace = True)
df_county_lost = pd.merge(df_county_lost, df_pc_copy, on = ['state', 'county'], how = 'left')
df_county_lost.rename(columns={"current_votes": "county_total_votes"}, inplace = True)
df_county_lost.drop(['index'], axis = 1, inplace = True)
df_county_lost = df_county_lost[['state', 'county', 'candidate', 'party', 'won', 'won_votes', 'county_total_votes']]
df_county_lost['share of votes in the losing counties'] =\
    df_county_lost['won_votes'] / df_county_lost['county_total_votes']
#df_county_lost['share of votes in the losing counties'] = \
#    df_county_lost['share of votes in the losing counties'].apply(lambda x:round(x,2))
df_county_lost.head()

Unnamed: 0,state,county,candidate,party,won,won_votes,county_total_votes,share of votes in the losing counties
0,Delaware,Kent,Donald Trump,REP,False,41009,87025,0.471232
1,Delaware,New Castle,Donald Trump,REP,False,88364,287633,0.307211
2,District of Columbia,Ward 1,Donald Trump,REP,False,1725,41681,0.041386
3,District of Columbia,Ward 2,Donald Trump,REP,False,2918,32881,0.088744
4,District of Columbia,Ward 3,Donald Trump,REP,False,3705,44231,0.083765


For Electoral System I and II, in which the county is the vote casting level, data cleaning is done. They share the same dataframe for plotting purposes.

For Electoral System III and IV, in which the state is the vote casting level, continue with the following steps:

3. Calculate the number of counties won by each candidate in each state.

In [527]:
df_state = df_copy.groupby(["state", "candidate", "party"]).sum(numeric_only = True)
df_state.rename(columns={"won": "# of counties won"}, inplace = True)
#add 3 level of MultiIndex
df_state.index = [df_state.index.get_level_values(0),
            df_state.index.get_level_values(1),
                  df_state.index.get_level_values(2)]
df_state = df_state.reset_index() \
       .sort_values(['state','# of counties won'], ascending=[True,False]) \
       .set_index(['state','party'])
df_state.drop(['total_votes'], axis = 1, inplace = True)
df_state.head()

state,party,candidate,# of counties won
Alabama,REP,Donald Trump,54
Alabama,DEM,Joe Biden,13
Alabama,WRI,Write-ins,0
Alabama,LIB,Jo Jorgensen,0
Alaska,REP,Donald Trump,20


4. Keep only the candidate who won the most counties in each state.

In [528]:
df_state1 = df_state.copy()
df_state1.reset_index(inplace=True)
df_state1.drop_duplicates(
  subset = ['state'],
  keep = 'first', inplace = True)
df_state1 = df_state1.reset_index(drop=True)
df_state1.head()

Unnamed: 0,state,party,candidate,# of counties won
0,Alabama,REP,Donald Trump,54
1,Alaska,REP,Donald Trump,20
2,Arizona,REP,Donald Trump,10
3,Arkansas,REP,Donald Trump,67
4,California,DEM,Joe Biden,35
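
Sorting and keeping the first row per state picks one candidate even when two candidates win the same number of counties. The sketch below shows one possible way to report such ties separately. The Alabama counts come from the table above; "Tieland" and the helper `state_winners` are hypothetical and exist only to illustrate a tie.

```python
import pandas as pd

counts = pd.DataFrame({
    "state": ["Alabama", "Alabama", "Tieland", "Tieland"],
    "party": ["REP", "DEM", "REP", "DEM"],
    "counties_won": [54, 13, 5, 5],   # "Tieland" is a made-up tied state
})

def state_winners(counts):
    """Return (clear winners, sorted list of tied states)."""
    top = counts.groupby("state")["counties_won"].transform("max")
    leaders = counts[counts["counties_won"] == top]
    # A state that appears more than once among the leaders is tied.
    tied = leaders["state"].duplicated(keep=False)
    winners = leaders[~tied].reset_index(drop=True)
    return winners, sorted(leaders.loc[tied, "state"].unique())

winners, ties = state_winners(counts)
print(winners)
print(ties)
```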



5. Replace the candidate and party names in every row with those of the state's winner, because all the votes of a state now belong to the winner.

In [529]:
df_state2 = df_county.copy()
df_state2.drop(['candidate'], axis = 1, inplace = True)
df_state2.drop(['party'], axis = 1, inplace = True)
df_state2 = pd.merge(df_state2, df_state1, on = 'state', how = 'left')
df_state2.head()

*To be improved:*
*Some counties' latitude and longitude are still missing; other datasets will be used to fill the gaps.*
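
To see which counties still lack coordinates, the rows with a missing `lat` or `lng` can be listed directly. This is a minimal sketch on a toy frame; "Example Plt." is a hypothetical county and does not claim anything about the real data.

```python
import numpy as np
import pandas as pd

toy = pd.DataFrame({
    "state": ["Delaware", "Delaware", "Maine"],
    "county": ["Kent", "Sussex", "Example Plt."],   # last row is hypothetical
    "lat": [39.161, 38.9091, np.nan],
    "lng": [-75.5202, -75.4227, np.nan],
})

# Rows where at least one of the two coordinates is missing.
missing = toy.loc[toy[["lat", "lng"]].isna().any(axis=1), ["state", "county"]]
missing_per_state = missing.groupby("state").size()
print(missing)
print(missing_per_state)
```

Applied to `df_state2` or `df_county`, the same filter would give the list of counties that need to be filled from another dataset.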

Calculate the data for the summary statistics table of Electoral Systems III and IV:

In [530]:
df_state_statistics = df_state1.copy()
df_state_total = df_state.groupby(level=[0]).sum(numeric_only = True)
df_state_total.rename(columns={"# of counties won": "# of total counties"}, inplace = True)
df_state_statistics = pd.merge(df_state_statistics, df_state_total, on = 'state', how = 'left')
df_state_statistics.head()
#df_state_won.drop(df_copy[df_copy['candidate'] == 'Donald Trump'].index, inplace = True)

In [531]:
df_state_won = df_state_statistics.copy()
df_state_won.drop(df_state_won[df_state_won['candidate'] != 'Donald Trump'].index, inplace = True)
df_state_won['share of counties in the winning states'] = \
    df_state_won['# of counties won'] / df_state_won['# of total counties']
df_state_won['won'] = True
df_state_won = df_state_won[['state', 'candidate', 'party', 'won', '# of counties won', '# of total counties', 'share of counties in the winning states']]
#df_state_won['share of counties in the winning states'] = \
#    df_state_won['share of counties in the winning states'].apply(lambda x:round(x,2))
df_state_won.head()

In [532]:
df_state_lost = df_state1.copy()
df_state_lost.drop(df_state_lost[df_state_lost['candidate'] == 'Donald Trump'].index, inplace = True)
df_state_lost.drop(['party', 'candidate', '# of counties won'], axis = 1, inplace = True)
df_state_lost = pd.merge(df_state_lost, df_state, on = ['state'], how = 'left')
df_state_lost.drop(df_state_lost[df_state_lost['candidate'] != 'Donald Trump'].index, inplace = True)
df_state_lost = pd.merge(df_state_lost, df_state_total, on = ['state'], how = 'left')
df_state_lost['share of counties in the losing states'] = \
    df_state_lost['# of counties won'] / df_state_lost['# of total counties']
df_state_lost['won'] = False
df_state_lost['party'] = 'REP'
df_state_lost = df_state_lost[['state', 'candidate', 'party', 'won', '# of counties won', '# of total counties', 'share of counties in the losing states']]
df_state_lost

## Summary Statistics Tables

In [533]:
df_I_II_statistics = pd.DataFrame({
    "share of votes in the winning counties" : df_county_won['share of votes in the winning counties'].describe(),
    "share of votes in the losing counties" : df_county_lost['share of votes in the losing counties'].describe(),
})
df_I_II_statistics = df_I_II_statistics.round(decimals = 1)
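fig = plt.figure(figsize = (8, 0.3))
ax = fig.add_subplot(111)
ax.table(cellText = df_I_II_statistics.values, colLabels = df_I_II_statistics.columns,
         rowLabels = df_I_II_statistics.index, cellLoc = 'center')
# The table describes Trump's vote share per county, so the title names that quantity
# (under Electoral System I the winner is Joe Biden, not Donald Trump)
ax.set_title("Electoral System I and II: Donald Trump's Share of Votes by County")
ax.axis('off')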

The summary statistics table above presents basic information about Electoral Systems I and II. Recall the definitions of the two systems. Under Electoral System I, each county has the same number of votes as its voters and casts all of its votes for the party (candidate) that won the majority of votes. Under Electoral System II, each county has one vote and casts it for the party (candidate) that won the majority of votes.



In [534]:
df_III_IV_statistics = pd.DataFrame({
    "share of counties in the winning states": df_state_won['share of counties in the winning states'].describe(),
    "share of counties in the losing states": df_state_lost['share of counties in the losing states'].describe(),
})
# Shares lie between 0 and 1, so keep two decimals; rounding to 0 decimals would collapse them to 0 or 1
df_III_IV_statistics = df_III_IV_statistics.round(decimals = 2)
fig2 = plt.figure(figsize = (8, 0.3))
ax2 = fig2.add_subplot(111)
ax2.table(cellText = df_III_IV_statistics.values, colLabels = df_III_IV_statistics.columns,
         rowLabels = df_III_IV_statistics.index, cellLoc = 'center')
ax2.set_title("In Electoral System III and IV, Donald Trump Would Have Won")
ax2.axis('off')

In [535]:
df_stat = df_state2.copy()
df_stat.drop(['lat', 'lng'], axis = 1, inplace = True)
df_stat.describe()

The table above describes the dataframe that is used by Electoral Systems III and IV. Recall the definitions of the two systems. Under Electoral System III, each state has the same number of votes as its voters and casts all of its votes for the party (candidate) that won the majority of counties. Under Electoral System IV, each state has one vote and casts it for the party (candidate) that won the majority of counties.

In the table above, total_votes is the total number of votes in a county, and # of counties won is the number of counties that the winning candidate won within the state.
The data show that the number and the size of counties vary enormously within a state. This leaves room for the two strongest candidates, Joe Biden and Donald Trump, to fare differently under different systems, so that the systems produce different election results. Donald Trump drew more of his support from voters in smaller counties, whereas Joe Biden was favored more by voters in large counties.

## Plots, Histograms, Figures



In [553]:
#plt.figure(figsize = (20,10))
#fig, axes = plt.subplots(1, 2, figsize=(8, 6), sharey=False)

df_boxplot1 = pd.DataFrame({
    'share of votes in the winning counties': df_county_won['share of votes in the winning counties'],
})
#pd.DataFrame(data = df_county_won, columns = ['share of votes in the winning counties'])
#df_boxplot1['Electoral Systems'] = 'share of votes in the winning counties'
#df_boxplot1.rename(columns={'share of votes in the winning counties': 'Value'}, inplace = True)
#df_boxplot2 = pd.DataFrame(data = df_county_lost, columns = ['share of votes in the losing counties'])
#df_boxplot2['Electoral Systems'] = 'share of votes in the losing counties'
#df_boxplot2.rename(columns={'share of votes in the losing counties': 'Value'}, inplace = True)
#df_boxplot1.append(df_boxplot2)
#df_boxplot1['Group'] = 'I & II'
df_boxplot1
#df_county_won
#df_boxplot3 = pd.DataFrame(data = df_state_won, columns = ['share of counties in the winning states'])
#df_boxplot3['Electoral Systems'] = 'share of counties in the winning states'
#df_boxplot3.rename(columns={'share of votes in the winning counties': 'Value'}, inplace = True)
#df_boxplot4 = pd.DataFrame(data = df_state_lost, columns = ['share of counties in the losing states'])
#df_boxplot4['Electoral Systems'] = 'share of counties in the losing states'
#df_boxplot4.rename(columns={'share of counties in the losing states': 'Value'}, inplace = True)
#df_boxplot3.append(df_boxplot4)
#df_boxplot3['Group'] = 'III & IV'

#df_boxplot = df_boxplot1.append(df_boxplot3)


#df_boxplot.groupby('Group').boxplot(by='Electoral Systems', ax=axes)

#plt.show()

#df_boxplot3["share of counties in the winning states"] = df_state_won["share of counties in the winning states"]
#df_boxplot['share of counties in the losing states'] = df_state_lost['share of counties in the losing states']
#sns.boxplot(x="variable", y="value", data=pd.melt(df_boxplot))
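
The commented-out draft above relies on `DataFrame.append`, which was removed in pandas 2.0. One possible way to finish the box plot is to stack the share columns into long form with `pd.concat` and pass the result to `sns.boxplot`. The sketch below uses the first few shares printed earlier as stand-in data; the helper names `to_long` and `plot_shares` are hypothetical.

```python
import pandas as pd

def to_long(named_series):
    """Stack Series of different lengths into (variable, value) rows."""
    parts = [pd.DataFrame({"variable": name, "value": s.to_numpy()})
             for name, s in named_series.items()]
    return pd.concat(parts, ignore_index=True)

def plot_shares(long_df):
    """Draw one box per variable; plotting libraries are imported only when called."""
    import matplotlib.pyplot as plt
    import seaborn as sns
    ax = sns.boxplot(data=long_df, x="variable", y="value")
    ax.set_ylim(0, 1)
    plt.show()

# Stand-in data: the first shares shown in the two county tables above.
long_df = to_long({
    "share of votes in the winning counties": pd.Series([0.550668, 0.847215, 0.710537]),
    "share of votes in the losing counties": pd.Series([0.471232, 0.307211]),
})
print(long_df)
```

In the notebook, the same helper could be fed `df_county_won['share of votes in the winning counties']` and the three other share columns, followed by `plot_shares(long_df)`.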

For both dataframes, remove Hawaii and Alaska to make the mainland graph clearer.

In [537]:
df_county_mainland = df_county[~df_county["state"].isin(["Alaska", "Hawaii"])]
df_state_mainland = df_state2[~df_state2["state"].isin(["Alaska", "Hawaii"])]

### Electoral System I

In [538]:
plt.figure(figsize = (14,10))
sns.scatterplot(data = df_county_mainland, x = "lng", y = "lat", hue = "party", size = "total_votes",
                sizes = (2, 200), palette = ['blue', 'red', 'yellow'])
plt.title("Electoral System I");

In [539]:
# Votes controlled by each party under System I; a party that won no county gets 0
votes_by_party = df_county.groupby('party')['total_votes'].sum()
y1 = votes_by_party.get('DEM', 0)
y2 = votes_by_party.get('REP', 0)
y3 = votes_by_party.get('LIB', 0)
y4 = votes_by_party.get('WRI', 0)
result_1 = [['DEM', 'Joe Biden', y1, max(y1, y2, y3, y4) == y1],
            ['REP', 'Donald Trump', y2, max(y1, y2, y3, y4) == y2],
            ['LIB', 'Jo Jorgensen', y3, max(y1, y2, y3, y4) == y3],
            ['WRI', 'Write-ins', y4, max(y1, y2, y3, y4) == y4]
            ]
pd.DataFrame(result_1, columns = ['Party', 'Candidate', 'Fabricated Votes', 'Won'])

Under this system, each county has the same number of votes as its number of voters and casts all of its votes for the party (candidate) that won the majority of votes. In this scenario, Joe Biden emerged as the winner. This can be attributed to the fact that Biden secured the majority of votes in a number of megacities and thereby gained a large number of votes. Although Trump won a large number of counties, their combined population does not add up to that of the megacities, and therefore he lost.

### Electoral System II

In [540]:
plt.figure(figsize = (14,10))
sns.scatterplot(data = df_county_mainland, x = "lng", y = "lat", hue = "party", size = "total_votes",
                sizes = (20, 20), palette = ['blue', 'red', 'yellow'])
plt.title("Electoral System II");

In [541]:
# Number of counties won by each party under System II; a party that won no county gets 0
counties_by_party = df_county['party'].value_counts()
y1 = counties_by_party.get('DEM', 0)
y2 = counties_by_party.get('REP', 0)
y3 = counties_by_party.get('LIB', 0)
y4 = counties_by_party.get('WRI', 0)
result_2 = [['DEM', 'Joe Biden', y1, max(y1, y2, y3, y4) == y1],
            ['REP', 'Donald Trump', y2, max(y1, y2, y3, y4) == y2],
            ['LIB', 'Jo Jorgensen', y3, max(y1, y2, y3, y4) == y3],
            ['WRI', 'Write-ins', y4, max(y1, y2, y3, y4) == y4]
            ]
pd.DataFrame(result_2, columns = ['Party', 'Candidate', 'Fabricated Votes', 'Won'])

Under this system, each county has only one vote and casts it for the party (candidate) that won the majority of votes. Donald Trump emerged as the winner under this system. Since all counties are weighted equally, Trump's popularity in the many small counties enabled him to secure a majority. His performance in states that traditionally vote Republican also contributed to his win.

### Electoral System III

In [542]:
plt.figure(figsize = (14,10))
sns.scatterplot(data = df_state_mainland, x = "lng", y = "lat", hue = "party", size = "total_votes",
                sizes = (2, 200), palette = ['blue', 'red'])
plt.title("Electoral System III");

In [543]:
y1 = df_state2.groupby('party')['total_votes'].sum()['DEM']
y2 = df_state2.groupby('party')['total_votes'].sum()['REP']
try:
    y3 = df_state2.groupby('party')['total_votes'].sum()['LIB']
except KeyError:
    y3 = 0
try:
    y4 = df_state2.groupby('party')['total_votes'].sum()['WRI']
except KeyError:
    y4 = 0
result_3 = [['DEM', 'Joe Biden', y1, max(y1, y2, y3, y4) == y1],
            ['REP', 'Donald Trump', y2, max(y1, y2, y3, y4) == y2],
            ['LIB', 'Jo Jorgensen', y3, max(y1, y2, y3, y4) == y3],
            ['WRI', 'Write-ins', y4, max(y1, y2, y3, y4) == y4]
            ]
pd.DataFrame(result_3, columns = ['Party', 'Candidate', 'Fabricated Votes', 'Won'])

Under this system, each state has the same number of votes as its number of voters and casts all of its votes for the party (candidate) that won the majority of counties. Donald Trump emerged as the winner under this system, because the number of counties where Trump is popular surpasses the number of cities where Biden is popular in almost every state, except for certain states that form the Democrats' traditional base. In general, the higher number of rural counties in most states, where Trump performed well, played a crucial role in his win.

### Electoral System IV

In [544]:
plt.figure(figsize = (14,10))
sns.scatterplot(data = df_state_mainland, x = "lng", y = "lat", hue = "party", size = "total_votes",
                sizes = (20, 20), palette = ['blue', 'red'])
plt.title("Electoral System IV");

In [545]:
import geopandas as gpd

In [546]:
usa = gpd.read_file('/Users/jiaxincui/Documents/GitHub/git-exercise-AllysonCui/Data/Other/cb_2021_us_state_20m/cb_2021_us_state_20m.shp')
gpd_state = usa[~usa["NAME"].isin(["Alaska", "Hawaii", "Puerto Rico"])]
gpd_state = gpd_state.copy()
gpd_state.sort_values(by=['NAME'], ascending=True, axis=0, inplace =True)
df_state1.rename(columns={"state": "NAME"}, inplace = True)
gpd_state = pd.merge(gpd_state, df_state1, on="NAME", how="left")
gpd_state

In [547]:
fig, gax = plt.subplots(figsize=(14, 10))

#gpd_state.query("NAME == 'Wisconsin'").plot(ax=gax, edgecolor="black", color="white")
gpd_state.plot(ax=gax, edgecolor="black", color="white")

plt.show()

In [548]:
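# Trump's share of counties per state, matched by state name. gpd_state omits Alaska and
# Hawaii, so assigning from df_state_statistics by row position would shift the values.
trump_counties = df_state.xs('REP', level='party')['# of counties won']
gpd_state['trump_share'] = gpd_state['NAME'].map(trump_counties / df_state_total['# of total counties'])
gpd_state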
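
Assigning a column from another dataframe aligns rows on the index, not on the state name, so a frame that omits some states can receive shifted values. The toy example below, with made-up shares, contrasts positional assignment with a lookup by name through `Series.map`.

```python
import pandas as pd

# Map table without Alaska, as in the mainland map.
states_on_map = pd.DataFrame({"NAME": ["Alabama", "Arizona", "Arkansas"]})
# Share table that still contains Alaska (made-up numbers).
share_table = pd.DataFrame({
    "state": ["Alabama", "Alaska", "Arizona", "Arkansas"],
    "share": [0.8, 1.0, 0.6, 0.9],
})

# Index-based assignment: row 1 (Arizona) silently receives Alaska's value.
states_on_map["by_position"] = share_table["share"]
# Name-based lookup: each state receives its own value.
states_on_map["by_name"] = states_on_map["NAME"].map(share_table.set_index("state")["share"])
print(states_on_map)
```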

In [549]:
fig, gax = plt.subplots(figsize = (14,8))

# Plot the state
#state_df[state_df['NAME'] == 'Wisconsin'].plot(ax = gax, edgecolor='black',color='white')
#gpd_state.plot(ax=gax, edgecolor="black", color="white")
# Plot the counties and pass 'rel_trump_share' as the data to color
#gpd_state.apply(lambda x: ax.annotate(text=x.STUSPS, xy=x.geometry.centroid.coords[0], ha='center', fontsize=14), axis=1)
gpd_state.plot(
    ax=gax, edgecolor='black', column='trump_share', legend=True, cmap='RdBu_r',
    vmin=0.0, vmax=1 #range of your column value for the color legend
)

# Add text to let people know what we are plotting
#gax.annotate('Trump vote share',xy=(0.76, 0.06),  xycoords='figure fraction')

# I don't want the axis with long and lat
plt.axis('off')

plt.show()

In [550]:
fig, gax = plt.subplots(1, figsize = (14,10))
# Draw the labels and the map on the same axes; an extra fig.add_subplot() would overlay a second axes
gpd_state.apply(lambda x: gax.annotate(text=x.STUSPS, xy=x.geometry.centroid.coords[0], ha='center', fontsize=14)
                , axis=1);
gpd_state.plot(
    ax=gax, edgecolor='black', column='party', legend=True, cmap='bwr', categorical=True, vmin=-1, vmax=2)

# The map is colored by the winning party, so the caption names that quantity
gax.annotate('Winning party by state',xy=(0.76, 0.06),  xycoords='figure fraction')
gax.axis('off')
plt.show()

In [551]:
y1 = df_state1['party'].value_counts()['DEM']
y2 = df_state1['party'].value_counts()['REP']
try:
    y3 = df_state1['party'].value_counts()['LIB']
except KeyError:
    y3 = 0
try:
    y4 = df_state1['party'].value_counts()['WRI']
except KeyError:
    y4 = 0
result_4 = [['DEM', 'Joe Biden', y1, max(y1, y2, y3, y4) == y1],
            ['REP', 'Donald Trump', y2, max(y1, y2, y3, y4) == y2],
            ['LIB', 'Jo Jorgensen', y3, max(y1, y2, y3, y4) == y3],
            ['WRI', 'Write-ins', y4, max(y1, y2, y3, y4) == y4]
            ]
pd.DataFrame(result_4, columns = ['Party', 'Candidate', 'Fabricated Votes', 'Won'])

Under this system, each state has only one vote and casts it for the party (candidate) that won the majority of counties. Donald Trump emerged as the winner under this system as well. This can be attributed to his strong performance in a number of states, where he secured the majority of counties. Compared with System III, the ratio between Biden and Trump is slightly lower, because under System III the large vote totals of states with major cities still enter the count, whereas here every state carries the same weight.

## Conclusion
In conclusion, the results of the four different electoral systems applied to the same election data reveal the significance of the electoral process in determining the outcome of an election. The outcome can vary greatly depending on the specific system being used. In this case, Joe Biden was the winner under Electoral System I, where each county was allotted the same number of votes as its number of voters and all votes were cast for the candidate who won the majority of votes in that county. On the other hand, Donald Trump was the winner in the remaining three systems: Electoral Systems II, III, and IV.

In [552]:
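# Use a separate name so that the System IV result table in result_4 is not overwritten
result_summary = [['I', 'Joe Biden'],
                  ['II', 'Donald Trump'],
                  ['III', 'Donald Trump'],
                  ['IV', 'Donald Trump']]
pd.DataFrame(result_summary, columns = ['Electoral System', 'Winner'])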
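
Rather than typing the winners by hand, the summary could also be derived from the per-system result tables. The sketch below assumes result tables with the `Candidate` and `Fabricated Votes` columns used earlier; the helper `winner_of` is hypothetical and the vote counts are toy values, not the notebook's output.

```python
import pandas as pd

def winner_of(result):
    """Return the candidate with the most fabricated votes in one result table."""
    return result.loc[result["Fabricated Votes"].idxmax(), "Candidate"]

cols = ["Party", "Candidate", "Fabricated Votes"]
# Toy result tables (made-up vote counts), keyed by electoral system.
toy_results = {
    "I": pd.DataFrame([["DEM", "Joe Biden", 1500], ["REP", "Donald Trump", 230]], columns=cols),
    "II": pd.DataFrame([["DEM", "Joe Biden", 2], ["REP", "Donald Trump", 3]], columns=cols),
}

summary = pd.DataFrame(
    [[system, winner_of(table)] for system, table in toy_results.items()],
    columns=["Electoral System", "Winner"],
)
print(summary)
```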

This highlights the importance of choosing an appropriate electoral system that aligns with the values and goals of a given society. It also demonstrates how different electoral systems can have a significant impact on the representation of different regions, communities, and individuals. The analysis shows that the design of an electoral system can greatly influence the outcome of an election and should be carefully considered.