# From 2016 to 2020: How Politics Has Changed America

### Adit Gupta and Brandon Kim

In [1]:
print("The year 2016 marked a new age for America: the presidency of Barack Obama was ending and Donald Trump was elected to office. Just four years later, this dramatic change has occurred again, as Joe Biden \
was elected the next President of the United States in November 2020. From President Obama to President Trump to soon-to-be President \
Biden, the political affiliation of the country's leader has changed from Democrat to Republican and back to Democrat. More interesting than this change in political affiliation was the voter turnout in 2020, as more ballots were cast than in any \
previous election in the United States of America's history. In addition, many prominent figures called this presidential election the most important election \
that the country has ever had, which could be one reason why voter turnout increased so much from 2016.")
print("") # Empty Line
print("Because Joe Biden, a Democrat, won the election, some states that voted Republican in 2016 had to flip for him to win. Of the states that flipped, a few \
large states played an important role in Biden's win as President-elect. The states that my partner Adit and I chose to research were Arizona, Georgia, Michigan, Pennsylvania, and Wisconsin. When looking at these states, we first wanted to see which counties in these \
states had the highest percentage of the state's total votes. After seeing which counties were the highest, we would then dive deeper into the distribution of votes in these counties, checking whether there was a large gap between the leading party and its opponent in these \
states, and whether anything changed from 2016 to 2020. We will visualize this in a variety of ways, which you will see below.")

The year 2016 marked a new age for America: the presidency of Barack Obama was ending and Donald Trump was elected to office. Just four years later, this dramatic change has occurred again, as Joe Biden was elected the next President of the United States in November 2020. From President Obama to President Trump to soon-to-be President Biden, the political affiliation of the country's leader has changed from Democrat to Republican and back to Democrat. More interesting than this change in political affiliation was the voter turnout in 2020, as more ballots were cast than in any previous election in the United States of America's history. In addition, many prominent figures called this presidential election the most important election that the country has ever had, which could be one reason why voter turnout increased so much from 2016.

Because Joe Biden, a Democrat, won the election, some states that voted Republican in 2016 had to flip for him to win. Of the states that flipped, a few large states played an important role in Biden's win as President-elect. The states that my partner Adit and I chose to research were Arizona, Georgia, Michigan, Pennsylvania, and Wisconsin. When looking at these states, we first wanted to see which counties in these states had the highest percentage of the state's total votes. After seeing which counties were the highest, we would then dive deeper into the distribution of votes in these counties, checking whether there was a large gap between the leading party and its opponent in these states, and whether anything changed from 2016 to 2020. We will visualize this in a variety of ways, which you will see below.

## Data Scraping

In [2]:
print("To begin our analysis and comparison of the 2016 and 2020 election data, we will first take a look at the data for voter turnout and demographics from the 2016 election. To do this, we are going to pull our \
election data from the link provided below. I pulled the 2016 data and created a dataframe out of it. We will be looking at the voter data to see which counties in our selected states had the largest share of their state's total vote count.")
print("")
print("The 2016 county-level election results were scraped from results published by Townhall.com. Their formatted table for the 2016 presidential election makes it easy for a web scraper like BeautifulSoup to capture results. This data was converted into a CSV and added \
to a GitHub repository by the user 'tonmcg', and this is where I obtained the data.")

To begin our analysis and comparison of the 2016 and 2020 election data, we will first take a look at the data for voter turnout and demographics from the 2016 election. To do this, we are going to pull our election data from the link provided below. I pulled the 2016 data and created a dataframe out of it. We will be looking at the voter data to see which counties in our selected states had the largest share of their state's total vote count.

The 2016 county-level election results were scraped from results published by Townhall.com. Their formatted table for the 2016 presidential election makes it easy for a web scraper like BeautifulSoup to capture results. This data was converted into a CSV and added to a GitHub repository by the user 'tonmcg', and this is where I obtained the data.


For more information on the 2016 dataset, visit [GitHub 2016 Election Data](https://github.com/tonmcg/US_County_Level_Election_Results_08-20/blob/master/2016_US_County_Level_Presidential_Results.csv)

In [None]:
from bs4 import BeautifulSoup
import matplotlib.pyplot as plt
import matplotlib
import pandas as pd
import numpy as np 
import seaborn

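# Load the county-level 2016 results (CSV saved locally from the repository linked above)
data = pd.read_csv("2016_US_County_Level_Presidential_Results.csv")
data = data.drop(['Unnamed: 0'], axis=1) # Getting rid of extra index column
data['percent_votes'] = 0.0 # Float placeholder for each county's share of its state's votes
data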
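
The `Unnamed: 0` column shows up because the CSV stores a saved row index as an unnamed first column. As a minimal sketch of an alternative, assuming a file shaped like the one above, `index_col=0` tells `pd.read_csv` to treat that column as the index so no `drop` is needed. The three sample rows and their vote counts are made up for illustration and are not real results.

```python
import io
import pandas as pd

# Hypothetical stand-in for the real CSV: the first header field is empty,
# which pandas would otherwise name 'Unnamed: 0'.
sample_csv = io.StringIO(
    ",total_votes,state_abbr,county_name\n"
    "0,1000,AZ,Example County A\n"
    "1,3000,AZ,Example County B\n"
    "2,500,GA,Example County C\n"
)

# index_col=0 uses the first column as the row index instead of as data.
sample = pd.read_csv(sample_csv, index_col=0)

# Start the share column as a float so later division results fit its dtype.
sample["percent_votes"] = 0.0

print(sample.columns.tolist())
```

Either approach gives the same table; this one just skips the extra cleanup step.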

In [None]:
print("After extracting the 2016 election data into a Pandas dataframe, it is now time to pull out the rows for \
the specific states that we would like to research.")
print("")
print("Filtering the data down to the 5 states we are interested in gives:")

### 2016 Election Arizona Data

In [None]:
AZ_data = data[data['state_abbr'] == "AZ"].copy() # .copy() so later column updates act on an independent table
AZ_data

### 2016 Election Georgia Data

In [None]:
GA_data = data[data['state_abbr'] == "GA"].copy() # .copy() so later column updates act on an independent table
GA_data

### 2016 Election Michigan Data

In [None]:
MI_data = data[data['state_abbr'] == "MI"].copy() # .copy() so later column updates act on an independent table
MI_data

### 2016 Election Pennsylvania Data

In [None]:
PA_data = data[data['state_abbr'] == "PA"].copy() # .copy() so later column updates act on an independent table
PA_data

### 2016 Election Wisconsin Data

In [None]:
WI_data = data[data['state_abbr'] == 'WI'].copy() # .copy() so later column updates act on an independent table
WI_data
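
The five filters above follow the same pattern, so they could also be written once. This is only a sketch under assumptions: `toy` is a made-up table with the same `state_abbr` column, and `state_frames` is a hypothetical name, not something used elsewhere in this notebook.

```python
import pandas as pd

# Made-up rows standing in for the full county table (not real results).
toy = pd.DataFrame({
    "state_abbr": ["AZ", "AZ", "GA", "MI", "PA", "WI", "TX"],
    "county_name": ["A", "B", "C", "D", "E", "F", "G"],
    "total_votes": [100, 200, 300, 400, 500, 600, 700],
})

states_of_interest = ["AZ", "GA", "MI", "PA", "WI"]

# One independent DataFrame per state, keyed by abbreviation.
# .copy() avoids SettingWithCopyWarning when columns are assigned later.
state_frames = {
    abbr: toy[toy["state_abbr"] == abbr].copy()
    for abbr in states_of_interest
}

print({abbr: len(frame) for abbr, frame in state_frames.items()})
```

With a dictionary like this, the later per-state steps could loop over `state_frames.items()` instead of repeating each cell five times.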

## Adding Percentage of the Total Vote to the Dataset & Creating a Visualization

In [None]:
print("Now that we have created datasets for the five states of interest, it is time to fill in the 'percent_votes' column in each table. By doing this, we will be able to clearly \
see what share of the state's total vote count each county contributed. To do this, we first find the sum of the total votes across the state, and then we simply divide \
each county's vote count by that state total to get the percentage.")
print("")
print("After inserting this information into each state's dataframe, we can visualize the data by creating pie charts that show each county's share of the votes in its state.")
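
The calculation described here (county votes divided by the state total) can also be done for every state in one step. The sketch below uses made-up numbers and a hypothetical `toy` table; it assumes only the `state_abbr` and `total_votes` columns seen earlier.

```python
import pandas as pd

# Made-up vote counts for illustration only.
toy = pd.DataFrame({
    "state_abbr": ["AZ", "AZ", "AZ", "GA", "GA"],
    "county_name": ["County A", "County B", "County C", "County D", "County E"],
    "total_votes": [600, 300, 100, 750, 250],
})

# transform("sum") returns the state total repeated on every county row,
# so the division lines up row by row.
state_totals = toy.groupby("state_abbr")["total_votes"].transform("sum")
toy["percent_votes"] = toy["total_votes"] / state_totals

print(toy)
```

Each state's `percent_votes` values add up to 1, which is a quick way to check that the shares were computed against the right total.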

## Arizona

### 2016 Election Arizona Percent Votes Column

In [None]:
total_AZ_vote = AZ_data['total_votes'].sum()
print("The state of Arizona had a total of " + str(total_AZ_vote) + " votes during the 2016 Election.") 
print("")
print("After finding this total, we can now update the 'percent_votes' value for each county, calculating the share of the state's votes that the county \
accounts for.")
print("")
print("After inserting the data into the table, we can then use matplotlib to create a pie chart of each county's percentage of the state's total votes. Underneath the pie chart, I also included a legend, naming every single county and its \
percentage of votes. This percentage was rounded to 2 decimal places for easier viewing.")

In [None]:
# Each county's share of the state total, computed for the whole column at once (no loop needed)
AZ_data['percent_votes'] = AZ_data['total_votes'] / total_AZ_vote
AZ_data

### 2016 Election Arizona Percent Votes Visualization

In [None]:
# Visualizing the Percent Voter Column 
plt.figure(figsize=(10,10))

plt.pie(AZ_data['percent_votes'], labels=AZ_data['county_name'], autopct='%1.2f%%')

plt.title('Arizona 2016: Percent of Votes per County in Relation to the State')
plt.axis('equal')
plt.show()

for index, row in AZ_data.iterrows():
    print(row['county_name'] + ": " + str(round(row['percent_votes']*100, 2)) + "%")

In [None]:
# Quick Explanation of Arizona Data
print("When looking at the 2016 Arizona counties, we can clearly see that Maricopa County had the highest proportion of votes, accounting for \
58.27% of the state's total votes. Pima County had the second-highest percentage but was far behind, with 17.49% of the state's votes. After these two counties, not a single county in the state \
has more than 5% of the total votes, although Pinal County (4.55%) and Yavapai County (4.50%) come close.")

## Georgia

### 2016 Election Georgia Percent Votes Column

In [None]:
# Total state vote, print, explanation
total_GA_vote = GA_data['total_votes'].sum()
print("The state of Georgia had a total of " + str(total_GA_vote) + " votes during the 2016 Election.") 
print("")
print("After finding this total, we can now update the 'percent_votes' value for each county, calculating the share of the state's votes that the county \
accounts for.")
print("")
print("After inserting the data into the table, we can then use matplotlib to create a pie chart of each county's percentage of the state's total votes. Underneath the pie chart, I also included a legend, naming every single county and its \
percentage of votes. This percentage was rounded to 2 decimal places for easier viewing.")

In [None]:
# Fill the column: each county's share of the state total, computed all at once (no loop needed)
GA_data['percent_votes'] = GA_data['total_votes'] / total_GA_vote
GA_data

### 2016 Election Georgia Percent Votes Visualization

In [None]:
# Creating plot and printing percent
plt.figure(figsize=(10,10))
plt.pie(GA_data['percent_votes'], labels=GA_data['county_name'], autopct='%1.2f%%')
plt.title('Georgia 2016: Percent of Votes per County in Relation to the State')
plt.axis('equal')
plt.show()
for index, row in GA_data.iterrows():
    print(row['county_name'] + ": " + str(round(row['percent_votes']*100, 2)) + "%")

In [None]:
# Mini Analysis
print("Looking at Georgia's 2016 election pie chart for county vote percentage, we can clearly see four counties that stand \
out as holding a higher percentage than the rest of the counties in the state. Fulton County (10.10%), Cobb County (8.10%), Gwinnett County (8.04%), and DeKalb County (7.34%) were the four \
largest counties in the state. Otherwise, most counties account for minuscule shares of the state as a whole. One thing to note is that \
because of the large number of counties in the state, the distribution is a little flatter than Arizona's. If there were not as many counties with such \
small values, the shares of these four large counties would go up.")
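
With this many small counties, the pie chart labels tend to overlap. One possible way to make the large counties easier to read is to merge every county below a cutoff into a single "Other" slice. The helper `collapse_small_shares` and the 5% cutoff are assumptions made for this sketch, and the shares are made up.

```python
import pandas as pd

def collapse_small_shares(shares: pd.Series, threshold: float = 0.02) -> pd.Series:
    """Keep shares at or above `threshold`; sum the rest into one 'Other' entry."""
    large = shares[shares >= threshold]
    small_total = shares[shares < threshold].sum()
    if small_total > 0:
        large = pd.concat([large, pd.Series({"Other": small_total})])
    return large

# Made-up county shares (index = county name, values = fraction of state votes).
toy_shares = pd.Series({
    "County A": 0.50,
    "County B": 0.30,
    "County C": 0.15,
    "County D": 0.03,
    "County E": 0.02,
})

grouped = collapse_small_shares(toy_shares, threshold=0.05)
print(grouped)
```

In this notebook the input could be built with `GA_data.set_index('county_name')['percent_votes']`, and the result passed to `plt.pie(grouped, labels=grouped.index, autopct='%1.2f%%')`. The slices still sum to 100%, so the large counties keep their true percentages.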

## Michigan

### 2016 Election Michigan Percent Votes Column

In [None]:
# Total state vote, print, explanation
total_MI_vote = MI_data['total_votes'].sum()
print("The state of Michigan had a total of " + str(total_MI_vote) + " votes during the 2016 Election.") 
print("")
print("After finding this total, we can now update the 'percent_votes' value for each county, calculating the share of the state's votes that the county \
accounts for.")
print("")
print("After inserting the data into the table, we can then use matplotlib to create a pie chart of each county's percentage of the state's total votes. Underneath the pie chart, I also included a legend, naming every single county and its \
percentage of votes. This percentage was rounded to 2 decimal places for easier viewing.")

In [None]:
# Fill the column: each county's share of the state total, computed all at once (no loop needed)
MI_data['percent_votes'] = MI_data['total_votes'] / total_MI_vote
MI_data

### 2016 Election Michigan Percent Votes Visualization

In [None]:
# Creating plot and printing percent
plt.figure(figsize=(10,10))
plt.pie(MI_data['percent_votes'], labels=MI_data['county_name'], autopct='%1.2f%%')
plt.title('Michigan 2016: Percent of Votes per County in Relation to the State')
plt.axis('equal')
plt.show()
for index, row in MI_data.iterrows():
    print(row['county_name'] + ": " + str(round(row['percent_votes']*100, 2)) + "%")

In [None]:
# Mini analysis
print("When observing the 2016 Michigan county vote distribution, we see that, as in the Georgia data, four main counties catch \
the eye immediately in the pie chart: Wayne County (16.19%), Oakland County (13.84%), Macomb County (8.74%), and \
Kent County (6.39%). A few counties have around 4% of the vote (Washtenaw and Genesee Counties), but that \
percentage is small in comparison to the 'big four' counties in Michigan.")

## Pennsylvania

### 2016 Election Pennsylvania Percent Votes Column

In [None]:
# Total state vote, print, explanation
total_PA_vote = PA_data['total_votes'].sum()
print("The state of Pennsylvania had a total of " + str(total_PA_vote) + " votes during the 2016 Election.") 
print("")
print("After finding this total, we can now update the 'percent_votes' value for each county, calculating the share of the state's votes that the county \
accounts for.")
print("")
print("After inserting the data into the table, we can then use matplotlib to create a pie chart of each county's percentage of the state's total votes. Underneath the pie chart, I also included a legend, naming every single county and its \
percentage of votes. This percentage was rounded to 2 decimal places for easier viewing.")

In [None]:
# Fill the column: each county's share of the state total, computed all at once (no loop needed)
PA_data['percent_votes'] = PA_data['total_votes'] / total_PA_vote
PA_data

### 2016 Election Pennsylvania Percent Votes Visualization

In [None]:
# Creating plot and printing percent 
plt.figure(figsize=(10,10))
plt.pie(PA_data['percent_votes'], labels=PA_data['county_name'], autopct='%1.2f%%')
plt.title('Pennsylvania 2016: Percent of Votes per County in Relation to the State')
plt.axis('equal')
plt.show()
for index, row in PA_data.iterrows():
    print(row['county_name'] + ": " + str(round(row['percent_votes']*100, 2)) + "%")

In [None]:
# Mini analysis
print("Taking a look at the 2016 Pennsylvania election pie chart, the top four counties by share of the state's \
votes were Philadelphia County (11.39%), Allegheny County (10.77%), Montgomery County (7.16%), and \
Bucks County (5.72%). Several counties in Pennsylvania had around 4% of the total vote, including Delaware County, Lancaster \
County, and York County. When viewing the pie chart, one could say that the population distribution across the state is more balanced than that of \
Georgia and Michigan.")

## Wisconsin

### 2016 Election Wisconsin Percent Votes Column

In [None]:
# Total state vote, print, explanation
total_WI_vote = WI_data['total_votes'].sum()
print("The state of Wisconsin had a total of " + str(total_WI_vote) + " votes during the 2016 Election.") 
print("")
print("After finding this total, we can now update the 'percent_votes' value for each county, calculating the share of the state's votes that the county \
accounts for.")
print("")
print("After inserting the data into the table, we can then use matplotlib to create a pie chart of each county's percentage of the state's total votes. Underneath the pie chart, I also included a legend, naming every single county and its \
percentage of votes. This percentage was rounded to 2 decimal places for easier viewing.")

In [None]:
# Fill the column: each county's share of the state total, computed all at once (no loop needed)
WI_data['percent_votes'] = WI_data['total_votes'] / total_WI_vote
WI_data

### 2016 Election Wisconsin Percent Votes Visualization

In [None]:
# Creating plot and printing percent
plt.figure(figsize=(10,10))
plt.pie(WI_data['percent_votes'], labels=WI_data['county_name'], autopct='%1.2f%%')
plt.title('Wisconsin 2016: Percent of Votes per County in Relation to the State')
plt.axis('equal')
plt.show()
for index, row in WI_data.iterrows():
    print(row['county_name'] + ": " + str(round(row['percent_votes']*100, 2)) + "%")

In [None]:
# Mini analysis
print("Looking at Wisconsin's 2016 election voter percentage data, we can see that only three \
main counties jump out at the viewer in the pie chart. These three counties are Milwaukee County (14.81%), Dane \
County (10.37%), and Waukesha County (7.94%). Beyond these, the other counties each hold around 4% of the state's total vote \
or less. From this visualization, we can conclude that these three counties hold the largest populations of voters, as they contribute \
the most to the state's total vote count.")
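
The "largest counties" named in each mini analysis were read off the pie charts. As a hedged sketch, the same ranking could be produced directly from the tables. The rows below are made up, and picking the top two per state is an arbitrary choice for the example.

```python
import pandas as pd

# Made-up shares for illustration only.
toy = pd.DataFrame({
    "state_abbr": ["AZ", "AZ", "AZ", "WI", "WI", "WI"],
    "county_name": ["County A", "County B", "County C",
                    "County D", "County E", "County F"],
    "percent_votes": [0.55, 0.35, 0.10, 0.25, 0.45, 0.30],
})

# Sort by share, keep the first two rows of each state, then tidy the order.
top_two = (
    toy.sort_values("percent_votes", ascending=False)
       .groupby("state_abbr")
       .head(2)
       .sort_values(["state_abbr", "percent_votes"], ascending=[True, False])
)

print(top_two)
```

For a single state table such as `WI_data`, `WI_data.nlargest(3, 'percent_votes')` would give the same kind of list without the groupby.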

## State Map Visualization

In [None]:
# Look at link to map the states

Visit [County Choropleth Maps In Python](https://plotly.com/python/county-choropleth/) to learn more.
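
The guide linked above matches table rows to county shapes using five-digit FIPS codes stored as strings. If the codes are read from a CSV as integers, the leading zero is lost for states such as Arizona (state code 04). Below is a small sketch of restoring it, assuming the county code sits in a numeric column; the column name `combined_fips` is an assumption here, and `fips` is a hypothetical new column.

```python
import pandas as pd

# County FIPS codes as they would look after being parsed as integers.
toy = pd.DataFrame({
    "county_name": ["Maricopa County", "Pima County", "Fulton County"],
    "combined_fips": [4013, 4019, 13121],  # assumed column name
})

# Convert to text and left-pad with zeros to the standard five characters.
toy["fips"] = toy["combined_fips"].astype(str).str.zfill(5)

print(toy["fips"].tolist())
```

The padded `fips` column is the kind of value a county choropleth would take as its location key.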

## Using County Population to...