![](../additional_materials/logos/darden_rice_logo_SM.png)

### 2017 Municipal Election Day Doc Processing

This notebook processes and formats data to match Adrienne Bogen's [E Day Doc](https://docs.google.com/spreadsheets/d/1M6EKaDWyVTHzpNTi2cdLXDYZfKgGVtChcbCmEbIla4k/edit#gid=0), a Google Sheets spreadsheet for the 2017 Pinellas County general municipal election.

Data sources: 
* [Pinellas County SOE](https://www.votepinellas.com/Election-Results) (specifically, the [Pinellas County SOE 2017 Municipal Primary Reports](https://enr.votepinellas.com/FL/Pinellas/71078/188313/en/reports.html))
* NGP VAN

---
---

In [1]:
import pandas as pd
from pandas.tseries.offsets import BDay
pd.set_option('display.max_columns', None)

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
import seaborn as sns

import datetime

In [2]:
vbm_df = pd.read_csv('../data/raw_eday_2017/VAN/2017_municipal_turnout_vbm.csv')
polls_df = pd.read_csv('../data/raw_eday_2017/VAN/2017_municipal_turnout_polls.csv')

In [3]:
vbm_df.head(3)

Unnamed: 0,Precinct,Democrats,Independent,Republicans,Unknown,Totals
0,101,483,74,91,0.0,648
1,102,281,31,52,0.0,364
2,103,106,42,58,0.0,206


In [4]:
vbm_df.tail(3)

Unnamed: 0,Precinct,Democrats,Independent,Republicans,Unknown,Totals
89,275,66.0,26.0,61.0,0.0,153.0
90,Total People,18260.0,5526.0,11290.0,0.0,35076.0
91,,,,,,


In [5]:
polls_df.head(3)

Unnamed: 0,Precinct,Democrats,Independent,Republicans,Unknown,Totals
0,101,356,46,71,0.0,473
1,102,204,22,43,0.0,269
2,103,77,19,47,0.0,143


In [6]:
polls_df.tail(3)

Unnamed: 0,Precinct,Democrats,Independent,Republicans,Unknown,Totals
89,275,31.0,13.0,31.0,0.0,75.0
90,Total People,11933.0,3272.0,6735.0,0.0,21940.0
91,,,,,,


In [7]:
# Drop the last 2 rows of each df (the "Total People" row and the trailing empty row)
vbm_df.drop(vbm_df.tail(2).index, inplace=True)
polls_df.drop(polls_df.tail(2).index, inplace=True)
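
Dropping by position assumes every export ends with exactly one `Total People` row followed by one empty row. A minimal sketch of a content-based alternative is below. It runs on a toy frame that mirrors the tail rows shown above, and `drop_non_precinct_rows` is a hypothetical helper, not part of the original notebook.

```python
import pandas as pd

def drop_non_precinct_rows(df: pd.DataFrame) -> pd.DataFrame:
    """Keep only rows whose Precinct value parses as a number (hypothetical helper)."""
    is_precinct = pd.to_numeric(df['Precinct'], errors='coerce').notna()
    return df[is_precinct].copy()

# Toy stand-in for the tail of the VAN export: one precinct, the totals row, an empty row
sample = pd.DataFrame({
    'Precinct': ['275', 'Total People', None],
    'Totals': [153.0, 35076.0, None],
})

cleaned = drop_non_precinct_rows(sample)
print(cleaned)
```

Filtering on the `Precinct` value keeps working if the export gains or loses a footer row, at the cost of silently discarding any precinct label that is not numeric.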

In [8]:
# Merge VBM and polls dfs on precinct
van_df = vbm_df.merge(polls_df, how='left', on='Precinct', suffixes=('_vbm', '_polls'))
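
A left merge keeps every VBM precinct even when the polls file has no match, and it duplicates rows if a precinct appears twice on the right. The sketch below shows one way to guard against both, using the `validate` and `indicator` parameters of `DataFrame.merge`. The toy frames hold the totals of the first two precincts shown above; this check is a suggestion, not something the original notebook runs.

```python
import pandas as pd

# Toy frames built from the first two precincts shown above
vbm = pd.DataFrame({'Precinct': ['101', '102'], 'Totals': [648, 364]})
polls = pd.DataFrame({'Precinct': ['101', '102'], 'Totals': [473, 269]})

merged = vbm.merge(
    polls,
    how='left',
    on='Precinct',
    suffixes=('_vbm', '_polls'),
    validate='one_to_one',  # raises MergeError if a precinct repeats on either side
    indicator=True,         # adds a _merge column: 'both', 'left_only' or 'right_only'
)

# Precincts present in the VBM file but missing from the polls file
unmatched = merged[merged['_merge'] != 'both']
print(len(unmatched))
```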

In [9]:
print(vbm_df.shape)
print(polls_df.shape)
print(van_df.shape)

(90, 6)
(90, 6)
(90, 11)


In [10]:
van_df

Unnamed: 0,Precinct,Democrats_vbm,Independent_vbm,Republicans_vbm,Unknown_vbm,Totals_vbm,Democrats_polls,Independent_polls,Republicans_polls,Unknown_polls,Totals_polls
0,101,483,74,91,0.0,648,356,46,71,0.0,473
1,102,281,31,52,0.0,364,204,22,43,0.0,269
2,103,106,42,58,0.0,206,77,19,47,0.0,143
3,104,255,41,39,0.0,335,247,37,31,0.0,315
4,105,558,91,93,0.0,742,396,43,51,0.0,490
...,...,...,...,...,...,...,...,...,...,...,...
85,237,172,55,172,0.0,399,52,11,14,0.0,77
86,239,259,103,184,0.0,546,187,71,176,0.0,434
87,240,118,59,155,0.0,332,95,33,103,0.0,231
88,241,202,87,190,0.0,479,126,50,121,0.0,297


In [11]:
van_df.isnull().sum()

Precinct             0
Democrats_vbm        0
Independent_vbm      0
Republicans_vbm      0
Unknown_vbm          0
Totals_vbm           0
Democrats_polls      0
Independent_polls    0
Republicans_polls    0
Unknown_polls        0
Totals_polls         0
dtype: int64

In [12]:
# # Impute 0 voters for null precincts
# van_df.fillna(0, inplace=True)

In [13]:
van_df.dtypes

Precinct              object
Democrats_vbm         object
Independent_vbm       object
Republicans_vbm       object
Unknown_vbm          float64
Totals_vbm            object
Democrats_polls       object
Independent_polls     object
Republicans_polls     object
Unknown_polls        float64
Totals_polls          object
dtype: object

In [14]:
# Remove commas from numbers >= 1000 and cast as integers
van_df = van_df.replace(',', '', regex=True).astype(int)
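
An alternative is to let `pd.read_csv` strip the thousands separators at load time through its `thousands` parameter. The sketch below is illustrative only: the CSV text is made up (precinct `999` and its counts are not real data), and it assumes the totals row and the empty row are absent, since those rows would still force non-integer dtypes.

```python
import io
import pandas as pd

# Made-up CSV text; the quoted "1,204" imitates a count exported with a thousands separator
raw = io.StringIO(
    'Precinct,Democrats,Totals\n'
    '101,483,648\n'
    '999,"1,204","2,310"\n'
)

# thousands=',' tells the parser to read "1,204" as the integer 1204
df = pd.read_csv(raw, thousands=',')
print(df.dtypes)
```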

In [15]:
van_df.dtypes

Precinct             int64
Democrats_vbm        int64
Independent_vbm      int64
Republicans_vbm      int64
Unknown_vbm          int64
Totals_vbm           int64
Democrats_polls      int64
Independent_polls    int64
Republicans_polls    int64
Unknown_polls        int64
Totals_polls         int64
dtype: object

In [16]:
# Sum VBM and polls totals for total 2017 turnout (TO), all parties
van_df['Total TO'] = van_df['Totals_vbm'] + van_df['Totals_polls']
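
Because `Total TO` relies on the exported `Totals` columns, it can help to confirm that each one equals the sum of its party columns before the party columns are dropped. The sketch below runs that check on the VBM values of the first three precincts shown above. The names `check_df` and `mismatch` are illustrative.

```python
import pandas as pd

# VBM counts for the first three precincts of the merged frame shown above
check_df = pd.DataFrame({
    'Precinct': [101, 102, 103],
    'Democrats_vbm': [483, 281, 106],
    'Independent_vbm': [74, 31, 42],
    'Republicans_vbm': [91, 52, 58],
    'Unknown_vbm': [0, 0, 0],
    'Totals_vbm': [648, 364, 206],
})

party_cols = ['Democrats_vbm', 'Independent_vbm', 'Republicans_vbm', 'Unknown_vbm']

# Rows where the party counts do not add up to the exported total
mismatch = check_df[check_df[party_cols].sum(axis=1) != check_df['Totals_vbm']]
print(mismatch.empty)
```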

In [17]:
# Drop parties other than Dem and Rep
van_df.drop(['Independent_vbm', 'Unknown_vbm', 'Independent_polls', 'Unknown_polls'], axis=1, inplace=True)

In [18]:
# Sum Dem and Rep turnout across VBM and polls
van_df['Total Dem & Rep TO'] = (
    van_df['Democrats_vbm'] + van_df['Democrats_polls']
    + van_df['Republicans_vbm'] + van_df['Republicans_polls']
)

# Drop individual totals
van_df.drop(columns=['Totals_vbm', 'Totals_polls'], inplace=True)

# Reorder columns to match e day spreadsheet (copy so later column assignments act on an independent frame)
van_df = van_df[['Precinct', 'Democrats_vbm', 'Republicans_vbm', 'Democrats_polls', 'Republicans_polls', 'Total Dem & Rep TO', 'Total TO']].copy()

In [19]:
# Dem + Rep share of total turnout, stored as a fraction (0-1) rounded to 4 decimal places
van_df['pct_of_total'] = (van_df['Total Dem & Rep TO'] / van_df['Total TO']).round(4)
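
As a worked check of this ratio, the sketch below recomputes it for precincts 101 to 103 from the counts in the merged table above. For precinct 101, `Total TO` is 648 + 473 = 1121, Dem and Rep turnout is 483 + 91 + 356 + 71 = 1001, and 1001 / 1121 rounds to 0.8930. The frame name `toy` is illustrative.

```python
import pandas as pd

# Totals for precincts 101-103, derived from the merged counts shown above
toy = pd.DataFrame({
    'Precinct': [101, 102, 103],
    'Total Dem & Rep TO': [1001, 580, 288],
    'Total TO': [1121, 633, 349],
})

# Same ratio as the notebook: Dem + Rep turnout as a fraction of all turnout
toy['pct_of_total'] = (toy['Total Dem & Rep TO'] / toy['Total TO']).round(4)
print(toy)
```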

In [20]:
van_df

Unnamed: 0,Precinct,Democrats_vbm,Republicans_vbm,Democrats_polls,Republicans_polls,Total Dem & Rep TO,Total TO,pct_of_total
0,101,483,91,356,71,1001,1121,0.8930
1,102,281,52,204,43,580,633,0.9163
2,103,106,58,77,47,288,349,0.8252
3,104,255,39,247,31,572,650,0.8800
4,105,558,93,396,51,1098,1232,0.8912
...,...,...,...,...,...,...,...,...
85,237,172,172,52,14,410,476,0.8613
86,239,259,184,187,176,806,980,0.8224
87,240,118,155,95,103,471,563,0.8366
88,241,202,190,126,121,639,776,0.8235


In [21]:
van_df.to_csv('../data/processed_eday_2017/2017_municipal_turnout.csv', index=False)

---
---

#### SOE Vote Breakdown

This breakdown is based on each candidate's party and does not necessarily mean that a voter is registered as a Democrat or Republican. Only the Democratic and Republican candidates are included, as they received 96.59% of the votes cast in the 2017 Municipal Primary election.

Candidate party affiliation for the 2017 Municipal Primary Mayoral race is as follows:
* **Rick Baker:** Republican
* **Rick Kriseman:** Democrat

In [22]:
# soe_df = pd.read_csv('../data/raw_eday_2017/SOE/2017_primary_detail.csv')

In [23]:
# soe_df

In [24]:
# # Drop last row of soe_df that represents totals
# soe_df.drop(index=soe_df.tail(1).index, inplace=True)

In [25]:
# soe_df.tail(3)