<center><h1> USA ELECTION 2020 </h1></center>
<img src="https://ichef.bbci.co.uk/news/1024/cpsprodpb/15B1C/production/_114606888_index_promo_simple_guide_976_v7.png" width="600px">




# 1. Introduction

This notebook aims to analyse and visualise the data from the 2020 US elections. Plotly has been used for the data visualisation, as it allows you to create aesthetically pleasing, interactive plots.


# 2. Importing required libraries

In [None]:
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python Docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import plotly.express as px


# Input data files are available in the read-only "../input/" directory
# For example, running this (by clicking run or pressing Shift+Enter) will list all files under the input directory

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

# You can write up to 20GB to the current directory (/kaggle/working/) that gets preserved as output when you create a version using "Save & Run All" 
# You can also write temporary files to /kaggle/temp/, but they won't be saved outside of the current session

# 3. Reading CSV Data

In [None]:
path_president_candidate = '/kaggle/input/us-election-2020/president_county_candidate.csv'

usa_election = pd.read_csv(path_president_candidate)

usa_election

In [None]:
usa_election.info()
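
Beyond `info()`, a few quick sanity checks can catch problems before any plotting. The sketch below is illustrative only: it runs on a small invented frame that mirrors the columns used later in this notebook (`state`, `county`, `candidate`, `party`, `total_votes`, `won`), and the county names and vote counts are made up.

```python
import pandas as pd

# Toy stand-in for usa_election; all names and numbers are invented.
toy = pd.DataFrame({
    'state': ['State A', 'State A', 'State A', 'State A'],
    'county': ['Alpha County', 'Alpha County', 'Beta County', 'Beta County'],
    'candidate': ['Joe Biden', 'Donald Trump', 'Joe Biden', 'Donald Trump'],
    'party': ['DEM', 'REP', 'DEM', 'REP'],
    'total_votes': [100, 90, 80, 120],
    'won': [True, False, False, True],
})

# Number of missing values in each column
missing_per_column = toy.isna().sum()

# Rows that repeat the same (state, county, candidate) combination
duplicate_rows = toy.duplicated(subset=['state', 'county', 'candidate']).sum()

# Each county should have exactly one row flagged as the winner
winners_per_county = toy.groupby(['state', 'county'])['won'].sum()

print(missing_per_column)
print(f'Duplicate rows: {duplicate_rows}')
print(winners_per_county)
```

The same three checks could be run on `usa_election` directly, assuming it has the columns listed above.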

# 3. Create a function to visualise the candidates

This function plots the total number of votes each candidate received in a given county.

In [None]:
def county_results(county, state):
    """Plot a bar chart of the total votes per candidate in one county."""
    # Keep only the rows for the requested county and state
    data = usa_election[(usa_election['county'] == county) & (usa_election['state'] == state)]
    fig = px.bar(data, x='candidate', y='total_votes')
    fig.update_layout(
        title={
            'text': f'Election results in {county} ({state})',
            'y': 0.95,
            'x': 0.5,
        },
        xaxis_title='Candidates',
        yaxis_title='Total votes',
    )
    fig.show()


Some examples of the county_results function:

In [None]:
county_results('Kent County', 'Delaware')

In [None]:
county_results('Palm Beach County', 'Florida')
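
If the county or state name is misspelt, the filter in `county_results` returns no rows and the chart is simply empty. A minimal sketch of a guard for that case follows. The helper `select_county` is hypothetical (it is not part of the notebook), and the toy data is invented.

```python
import pandas as pd

# Toy stand-in for usa_election; all names and numbers are invented.
toy = pd.DataFrame({
    'state': ['State A', 'State A', 'State A', 'State A'],
    'county': ['Alpha County', 'Alpha County', 'Beta County', 'Beta County'],
    'candidate': ['Donald Trump', 'Joe Biden', 'Joe Biden', 'Donald Trump'],
    'total_votes': [90, 100, 80, 120],
})

def select_county(df, county, state):
    """Return the rows for one county, sorted by votes (hypothetical helper)."""
    mask = (df['county'] == county) & (df['state'] == state)
    data = df[mask]
    if data.empty:
        # Fail loudly instead of drawing an empty chart
        raise ValueError(f'No rows found for {county!r} in {state!r}')
    return data.sort_values('total_votes', ascending=False)

print(select_county(toy, 'Alpha County', 'State A'))
```

The returned frame could then be passed to `px.bar` in the same way as `data` in `county_results`.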

# 4. Joe Biden or Donald Trump

We select only the two major parties, DEM (Joe Biden) and REP (Donald Trump), and group the votes by state.

In [None]:
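DEM_election = usa_election[usa_election['party']=='DEM']
# Sum only the numeric/boolean columns; summing the text columns would concatenate strings
DEM_election = DEM_election.groupby('state')[['total_votes', 'won']].sum()
DEM_election.rename(columns={"total_votes": "DEM_votes", "won": "DEM_won_counties"}, inplace=True)
DEM_election.head()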

In [None]:
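REP_election = usa_election[usa_election['party']=='REP']
# Sum only the numeric/boolean columns; summing the text columns would concatenate strings
REP_election = REP_election.groupby('state')[['total_votes', 'won']].sum()
REP_election.rename(columns={"total_votes": "REP_votes", "won": "REP_won_counties"}, inplace=True)
REP_election.head()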
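
As an alternative to filtering and grouping each party separately, both state-level totals can be produced in one step with `pivot_table`. This is only a sketch on invented data (the states, counties, and counts are made up), not the approach the notebook itself uses.

```python
import pandas as pd

# Toy stand-in for usa_election; all names and numbers are invented.
toy = pd.DataFrame({
    'state': ['State A', 'State A', 'State A', 'State A', 'State A', 'State B', 'State B'],
    'county': ['Alpha', 'Alpha', 'Alpha', 'Beta', 'Beta', 'Gamma', 'Gamma'],
    'party': ['DEM', 'REP', 'LIB', 'DEM', 'REP', 'DEM', 'REP'],
    'total_votes': [100, 90, 5, 80, 120, 300, 250],
})

# Keep the two major parties, then sum votes per state with one column per party
major = toy[toy['party'].isin(['DEM', 'REP'])]
votes_by_state = major.pivot_table(
    index='state', columns='party', values='total_votes', aggfunc='sum'
)
votes_by_state = votes_by_state.rename(columns={'DEM': 'DEM_votes', 'REP': 'REP_votes'})

print(votes_by_state)
```

The result has one row per state and the same `DEM_votes` and `REP_votes` column names used in the rest of this notebook.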

We can create a function to compare the votes of the two major parties.

In [None]:
def most_voted_candidate():
    total_votes_DEM = DEM_election['DEM_votes'].sum()
    total_votes_REP = REP_election['REP_votes'].sum()
    total_votes = usa_election['total_votes'].sum()
    
    print (f'Total votes for Joe Biden (DEM): {total_votes_DEM}')
    print (f'Total votes for Donald Trump (REP): {total_votes_REP}')
    
    # Total votes plot
    fig = px.bar(x=['DEM', 'REP'], y=[total_votes_DEM, total_votes_REP])
    fig.update_layout(
        title={
            'text': f'Total votes for DEM and REP',
            'y':0.95,
            'x':0.5,
            },
        xaxis_title='Parties',
        yaxis_title='Total votes',
        )
    fig.show()
    
    # Percentage of votes plot (share of all votes cast, including other parties)
    pct_DEM = 100 * total_votes_DEM / total_votes
    pct_REP = 100 * total_votes_REP / total_votes
    fig = px.bar(x=['DEM', 'REP'], y=[pct_DEM, pct_REP])
    fig.update_layout(
        title={
            'text': f'Percentage of votes for DEM and REP',
            'y':0.95,
            'x':0.5,
            },
        xaxis_title='Parties',
        yaxis_title='% of votes',
        )
    fig.show()
    

In [None]:
most_voted_candidate()
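
The percentage plot divides each party's votes by the votes cast for all parties, so the two bars do not add up to 100%. The sketch below contrasts that with a two-party share, using invented totals purely to show the arithmetic.

```python
import pandas as pd

# Invented national totals, for illustration only
votes = pd.Series({'DEM': 480, 'REP': 460, 'OTHER': 5})

# Share of all votes cast (what the plot above is based on)
pct_of_all = 100 * votes / votes.sum()

# Share of the votes cast for the two major parties only
two_party = votes[['DEM', 'REP']]
pct_two_party = 100 * two_party / two_party.sum()

print(pct_of_all.round(2))
print(pct_two_party.round(2))
```

Which denominator to use depends on the question being asked; the two-party share is the one that sums to 100%.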

# 5. Map visualisation 

Now we concatenate the two DataFrames so that the results can be shown on a map. We compare the number of votes in each state and create a new column with the winner.

In [None]:
president_election = pd.concat([REP_election, DEM_election], axis=1)
president_election.head()

In [None]:
president_election['winner'] = np.where(president_election['REP_votes'] > president_election['DEM_votes'], 'Donald Trump', 'Joe Biden')
president_election
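
Note that `np.where` with a single condition assigns 'Joe Biden' to any state where REP votes are not strictly greater, which includes exact ties. A hedged sketch of a variant that labels ties explicitly and adds a vote margin is shown below. The `margin` column and the 'Tie' label are illustrative additions, and the numbers are invented.

```python
import numpy as np
import pandas as pd

# Invented state-level totals, for illustration only
toy = pd.DataFrame(
    {'REP_votes': [210, 250, 100], 'DEM_votes': [180, 300, 100]},
    index=pd.Index(['State A', 'State B', 'State C'], name='state'),
)

# np.select checks the conditions in order and falls back to the default
toy['winner'] = np.select(
    [toy['REP_votes'] > toy['DEM_votes'], toy['DEM_votes'] > toy['REP_votes']],
    ['Donald Trump', 'Joe Biden'],
    default='Tie',
)

# Absolute difference in votes between the two parties
toy['margin'] = (toy['REP_votes'] - toy['DEM_votes']).abs()

print(toy)
```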

For the map visualisation, we import a CSV file containing each state's code, latitude and longitude.

In [None]:
path_states_map = '/kaggle/input/usa-states-latitude-longitude/statelatlong.csv'
states_lat_long = pd.read_csv(path_states_map, index_col='City')
states_lat_long

We add these columns to our DataFrame.

In [None]:
president_election = pd.concat([president_election,states_lat_long], axis=1) 
president_election
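
`pd.concat` with `axis=1` aligns rows on the index, so a state whose name is spelt differently in the two tables ends up with missing values rather than raising an error. A minimal sketch of how to list such rows is given below, using invented data. The `Latitude` and `Longitude` column names are assumptions about the CSV, and the state names, codes, and coordinates are placeholders.

```python
import pandas as pd

# Invented election results indexed by state name
results = pd.DataFrame(
    {'winner': ['Joe Biden', 'Donald Trump']},
    index=['State A', 'State B'],
)

# Invented coordinates table; 'State C' has no election results
coords = pd.DataFrame(
    {'State': ['SA', 'SC'], 'Latitude': [10.0, 20.0], 'Longitude': [-100.0, -90.0]},
    index=['State A', 'State C'],
)

combined = pd.concat([results, coords], axis=1)

# Rows that did not match on both sides have NaN in one of these columns
unmatched = combined[combined['winner'].isna() | combined['State'].isna()]

print(unmatched)
```

An empty `unmatched` frame would indicate that every state name lined up in both tables.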

The choropleth plot is used to visualise the map with the election results.

In [None]:
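fig = px.choropleth(president_election,
                    locations='State',
                    color="winner",
                    # Map each winner to a colour explicitly instead of relying on row order
                    color_discrete_map={"Donald Trump": "red", "Joe Biden": "blue"},
                    locationmode='USA-states',
                    scope="north america",
                    title='USA Presidential Election 2020: Winner by State'
                    )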

fig.show()