# Election 2016
___
2018 | Bernard Kung
___

A fun exploration of 2016 election turnout data and building some choropleths!

### Initializing Workspace
___

In [14]:
import pandas as pd
import numpy as np
# plotly.plotly was moved out of plotly into chart_studio in plotly 4 and is unused here
import plotly.graph_objs as go
from plotly.offline import download_plotlyjs, init_notebook_mode, plot, iplot

In [15]:
init_notebook_mode(connected=True) 

In [2]:
# change float display format
# pd.options.display.float_format = '{:,.0f}'.format

### Loading and Cleaning Data
___

Data for this project is from the United States Elections Project [[1](#Sources)]. 

When reading in the data:

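* The file has a two-row header; only the second row is needed, so _header=1_ is passed.
* Only the first 52 data rows are read (_nrows=52_): the national total, the 50 states, and the District of Columbia.
* Commas in numeric columns are stripped by the _thousands_ argument of _read\_csv()_.
* The state column has no name in the second header row, so it is renamed from _Unnamed: 0_ to _State_.
* Whitespace is removed from column names.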
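
To make the points above concrete, here is a minimal sketch that runs the same _read\_csv()_ arguments against a tiny in-memory file. The CSV text is a made-up miniature that only imitates the layout described above (a grouping row, then the real header row); it is not the actual source file.

```python
import io
import pandas as pd

# Hypothetical miniature of the source layout: a grouping row, then the real header row.
raw = (
    ',Turnout Rates,Numerators,Denominators\n'
    ',VEP Highest Office,Highest Office,Voting-Eligible Population (VEP)\n'
    'Alabama,59.0%,"2,123,372","3,601,361"\n'
    'Alaska,61.3%,"318,608","519,849"\n'
    'A trailing note row that should not be read,,,\n'
)

# header=1 uses the second row as column names; nrows stops before the note row;
# thousands=',' turns "2,123,372" into the integer 2123372.
sample = pd.read_csv(io.StringIO(raw), header=1, nrows=2, thousands=',')

# The unnamed first column comes back as 'Unnamed: 0', so name it explicitly.
sample = sample.rename(columns={'Unnamed: 0': 'State'})
# Strip spaces from every column name.
sample = sample.rename(columns=lambda name: name.replace(' ', ''))

print(sample.dtypes)
```

The count columns come back as integers, while the percentage column is still text because of the trailing % sign.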

In [48]:
election_data = pd.read_csv(r'..\data\2016_November_General_Election_Turnout_Rates.csv',
                            header= 1, nrows= 52, thousands=r',')
election_data.rename(columns={'Unnamed: 0':'State'}, inplace= True)
election_data.rename(columns=lambda x: x.replace(' ',''), inplace= True)

In [62]:
election_data

Unnamed: 0,State,StateResultsWebsite,Status,VEPTotalBallotsCounted,VEPHighestOffice,VAPHighestOffice,TotalBallotsCounted(Estimate),HighestOffice,Voting-EligiblePopulation(VEP),Voting-AgePopulation(VAP),%Non-citizen,Prison,Probation,Parole,TotalIneligibleFelon,OverseasEligible,StateAbv
0,United States,,,60.2,59.3,54.7,138846571.0,136700729,230585915,250055734,8.4,1456032,2254727,508576,3249802,4739596.0,
1,Alabama,http://www.alabamavotes.gov/downloads/election...,Official,59.3,59.0,56.3,2134061.0,2123372,3601361,3770142,2.6,30627,56700,8138,71084,,AL
2,Alaska,http://www.elections.alaska.gov/results/16GENR/,Official,61.8,61.3,57.4,321271.0,318608,519849,555367,4.3,5338,7077,2210,11582,,AK
3,Arizona,http://apps.azsos.gov/election/2016/General/Of...,Official,56.2,55.0,48.9,2661497.0,2604657,4734313,5331034,9.5,38068,76005,7379,88770,,AZ
4,Arkansas,http://results.enr.clarityelections.com/AR/639...,Official,53.1,52.8,49.4,1137772.0,1130635,2142571,2286625,3.8,17405,28900,23093,56971,,AR
5,California,http://www.sos.ca.gov/elections/prior-election...,Official,58.4,56.7,47.0,14610509.0,14181595,25017408,30201571,16.7,129593,0,0,129593,,CA
6,Colorado,http://results.enr.clarityelections.com/CO/637...,Official,72.1,70.1,64.6,2859216.0,2780247,3966297,4305728,7.2,18708,0,10269,28977,,CO
7,Connecticut,http://ctemspublic.pcctg.net/#/home,Official,65.4,64.2,58.3,1675955.0,1644920,2561555,2821935,8.6,15247,0,2939,18186,,CT
8,Delaware,http://elections.delaware.gov/results/html/ele...,Official,64.6,64.4,59.2,445228.0,443814,689125,749872,6.0,6329,15646,425,15672,,DE
9,District of Columbia,https://www.dcboee.org/election_info/election_...,Official,61.1,60.9,55.4,312575.0,311268,511463,562329,9.0,0,0,0,0,,DC


Several columns store percentages as text with a trailing % sign, so pandas cannot treat them as numbers. Removing the sign is complicated by NaN values, which have no string methods.

My strategy to do so involves:

1. Select the columns needed into a dataframe to improve legibility. 
2. Use _.notnull()_ to avoid NaN entries.
3. Use _.apply()_ to apply _.replace()_ to replace % signs. 
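
The steps above can be sketched on a small toy frame. The values below are made up for illustration, and the vectorized variant at the end is an alternative I am adding for comparison, not the approach used in this notebook.

```python
import numpy as np
import pandas as pd

# Toy data (hypothetical values) with percent strings and missing entries.
toy = pd.DataFrame({
    'VEPHighestOffice': ['59.0%', np.nan, '61.3%'],
    '%Non-citizen': ['2.6%', '4.3%', np.nan],
})

# Steps 1-3: work on an explicit copy, skip NaN, strip the % sign.
cleaned = toy.copy()
for col in cleaned.columns:
    mask = cleaned[col].notnull()
    cleaned.loc[mask, col] = cleaned.loc[mask, col].apply(lambda x: x.replace('%', ''))
cleaned = cleaned.astype(np.float64)

# Alternative: the .str accessor passes NaN through, so no mask is needed.
vectorized = toy.apply(lambda s: s.str.replace('%', '', regex=False)).astype(np.float64)

print(cleaned)
```

Both routes should give the same float columns, with NaN left in place.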

In [50]:
# .copy() makes percent_data an independent frame, which avoids SettingWithCopyWarning below
percent_data = election_data[['VEPTotalBallotsCounted','VEPHighestOffice','VAPHighestOffice','%Non-citizen']].copy()

In [51]:
for col in percent_data.columns:
    # only touch non-null entries; NaN has no .replace() method
    mask = percent_data[col].notnull()
    percent_data.loc[mask, col] = percent_data.loc[mask, col].apply(lambda x: x.replace('%',''))

In [52]:
percent_data = percent_data.astype(np.float64)

In [53]:
election_data[percent_data.columns] = percent_data

In [54]:
election_data.dtypes

State                              object
StateResultsWebsite                object
Status                             object
VEPTotalBallotsCounted            float64
VEPHighestOffice                  float64
VAPHighestOffice                  float64
TotalBallotsCounted(Estimate)     float64
HighestOffice                       int64
Voting-EligiblePopulation(VEP)      int64
Voting-AgePopulation(VAP)           int64
%Non-citizen                      float64
Prison                              int64
Probation                           int64
Parole                              int64
TotalIneligibleFelon                int64
OverseasEligible                  float64
StateAbv                           object
dtype: object

In [61]:
# Drops the national total and DC rows. Optional: the map below uses election_data as-is.
election_data2 = election_data[~election_data['State'].isin(['United States', 'District of Columbia'])]

In [106]:
# z drives the color; text is shown on hover. Both use turnout for highest office as a % of VEP.
data = dict(type='choropleth',
            colorscale='Blues',
            locations=election_data['StateAbv'],
            locationmode='USA-states',
            text=election_data['VEPHighestOffice'],
            z=election_data['VEPHighestOffice'],
            reversescale=True)
layout = dict(geo={'scope': 'usa'},
              title='2016 US Election Turnout (% of Voting-Eligible Population)')
choromap = go.Figure(data=[data], layout=layout)

In [107]:
iplot(choromap)

### Sources 
___
1. McDonald, Michael P. "2016 November General Election Turnout Rates." United States Elections Project. http://www.electproject.org/2016g



### Archive Code
___

The original CSV has a partial two-row header. The dictionary below preserves the top-level grouping for reference, since it explains what each column means.

In [13]:
column_key = {'Turnout Rates':['VEPTotalBallotsCounted','VEPHighestOffice','VAPHighestOffice'],
              'Numerators':['TotalBallotsCounted(Estimate)','HighestOffice'],
              'Denominators':['Voting-EligiblePopulation(VEP)','Voting-AgePopulation(VAP)'],
              'VEPComponents':['%Non-citizen','Prison','Probation','Parole','TotalIneligibleFelon','OverseasEligible']}
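
The grouping suggests how the turnout rates relate to the other columns. As a rough check, the sketch below assumes each rate is a numerator divided by a denominator, times 100, and applies that to the United States row from the table shown earlier. The formula and the helper name `turnout_rate` are my assumptions, not something stated by the data source.

```python
# Figures copied from the "United States" row of the table above.
us = {
    'TotalBallotsCounted(Estimate)': 138846571,
    'HighestOffice': 136700729,
    'Voting-EligiblePopulation(VEP)': 230585915,
    'Voting-AgePopulation(VAP)': 250055734,
}

def turnout_rate(numerator, denominator):
    """Turnout as a percentage rounded to one decimal (assumed formula)."""
    return round(100 * numerator / denominator, 1)

rates = {
    'VEPTotalBallotsCounted': turnout_rate(us['TotalBallotsCounted(Estimate)'],
                                           us['Voting-EligiblePopulation(VEP)']),
    'VEPHighestOffice': turnout_rate(us['HighestOffice'],
                                     us['Voting-EligiblePopulation(VEP)']),
    'VAPHighestOffice': turnout_rate(us['HighestOffice'],
                                     us['Voting-AgePopulation(VAP)']),
}

print(rates)
# {'VEPTotalBallotsCounted': 60.2, 'VEPHighestOffice': 59.3, 'VAPHighestOffice': 54.7}
```

The results agree with the 60.2, 59.3, and 54.7 reported in the United States row, which is consistent with the assumed formula.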