In [1]:
import maup # mggg's library for proration, see documentation here: https://github.com/mggg/maup
import pandas as pd # standard python data library
import geopandas as gp # the geo-version of pandas
import numpy as np 
import os
import fiona
from statistics import mean, median
from pandas import read_csv
gp.io.file.fiona.drvsupport.supported_drivers['KML'] = 'rw' #To load KML files

# VEST PA Validation

In [2]:
vest_pa_18 = gp.read_file("./raw-from-source/VEST/pa_2018/pa_2018.shp")

Election results from the Pennsylvania Secretary of State's office via OpenElections (https://github.com/openelections/openelections-data-pa/). Precinct data was corrected with canvass reports for the following counties: Berks, Blair, Bradford, Cambria, Carbon, Crawford, Elk, Forest, Franklin, Lawrence, Lycoming, Mifflin, Montgomery, Montour, Northumberland, Susquehanna. The candidate totals for Berks, Blair, Crawford, and Mifflin differ from the county totals reported by the state and therefore the statewide totals differ from the official results accordingly.

Precinct shapefiles primarily from the U.S. Census Bureau's 2020 Redistricting Data Program Phase 2 release. The shapefiles from Delaware County and the City of Pittsburgh are from the respective jurisdictions instead. Precinct numbers were corrected to match the voter file in the following locales: Allegheny (Elizabeth, Pittsburgh W12), Blair (Greenfield), Bradford (Athens), Greene (Monongahela), Monroe (Smithfield), Montgomery (Hatfield), Northampton (Bethlehem Twp), Perry (Toboyne), Washington (New Eagle, Somerset), York (Fairview).

Precinct boundaries throughout the state were edited to match voter assignments in the PA Secretary of State voter file from the 2018 election cycle. While some edits reflect official updates to wards or divisions, the great majority involve voters incorrectly assigned to voting districts by the counties. As such, the VEST shapefile endeavors to reflect the de facto precinct boundaries, and these often differ from the official voting district boundaries, in some cases quite drastically. Wherever possible, edits were made using census boundaries or, alternatively, using the parcel shapefiles from the respective counties.

In certain areas voter assignments appear so erratic that it is impractical to place all voters within their assigned precinct. These areas were edited so as to place as many voters as possible within their assigned precinct without displacing a greater number from their assigned precinct. In general, municipal boundaries were retained except where significant numbers of voters were misassigned to the wrong municipality. In cases where the odd/even split was incorrectly reversed for precinct boundary streets, the official boundary was retained. All such cases involved nearly equal numbers of voters swapped between voting districts.

The following revisions were made to the base shapefiles to match the de facto 2018 precinct boundaries consistent with the voter file. Individual precincts are noted in cases of splits or merges. Due to the sheer number of edits boundary adjustments are noted at the borough/township level. There may be as many as two dozen individual precincts that were revised within a given municipality.

Adams: Adjust Cumberland, Franklin
Allegheny: Merge CD splits for S Fayette 3/5; Split Pittsburgh W5 11/17; Merge Pittsburgh W16 9/11/12, Align McCandless with municipal boundary; Adjust Avalon, Baldwin, Bethel Park, Braddock, Brentwood, Castle Shannon, Clairton, Collier, Coraopolis, Crescent, Dormont, Dravosburg, Duquesne, E Deer, E McKeesport, E Pittsburgh, Elizabeth, Emsworth, Forward, Glassport, Hampton, Harmar, Ingram, Jefferson Hills, Kennedy, Leet, Liberty, Marshall, McCandless, McKees Rocks, McKeesport, Monroeville, Moon, Mount Lebanon, Munhall, N Fayette, N Versailles, O'Hara, Oakdale, Penn Hills, Pine, Pittsburgh (nearly all wards), Pleasant Hills, Reserve, Richland, Ross, Scott, Sewickley, Shaler, S Fayette, S Park, Stowe, Swissvale, Upper St. Clair, W Deer, W Homestead, W Mifflin, W View, Whitaker, Whitehall, Wilkins, Wilkinsburg
Armstrong: Align Dayton, Elderton, Ford City, Kittanning, N Apollo with municipal boundaries; Adjust Ford City, Gilpin, Kiskiminetas, Kittanning, Manor, N Buffalo, Parks, Parker City, S Buffalo
Beaver: Adjust Aliquippa, Ambridge, Baden, Beaver, Brighton, Center, Chippewa, Conway, Economy, Franklin, Hanover, Harmony, Hopewell, Midland, Monaca, N Sewickley
Bedford: Adjust Bedford Boro, Bedford Twp
Berks: Adjust Cumru, Douglass, Oley, Maxatawny, Robeson, Sinking Spring, Spring, Union
Blair: Merge Tunnelhill/Allegheny Twp 4; Align Altoona, Bellwood, Duncansville, Hollidaysburg, Newry, Roaring Spring, Tyrone, Williamsburg with municipal boundaries; Adjust Allegheny, Altoona, Antis, Frankstown, Freedom, Greenfield, Huston, Juniata, N Woodbury, Logan, Snyder, Tyrone Boro, Tyrone Twp
Bucks: Align Sellersville, Tullytown with municipal boundaries; Adjust Bensalem, Bristol Boro, Bristol Twp, Buckingham, Doylestown Twp, Falls, Hilltown, Lower Makefield N, Lower Southampton E, Middletown, Milford, Morrisville, Newtown Twp, Northampton, Solebury Lower, Solebury, Springfield, Tinicum, Upper Makefield, Upper Southampton E, Warminster, Warrington, W Rockhill
Butler: Merge CD splits for Cranberry E 2, 3, Cranberry W 1, 2, Jefferson 1, 2; Align Butler Twp, Valencia with municipal boundaries; Adjust Adams, Buffalo, Butler Boro, Butler Twp, Center, Cranberry E, Cranberry W, Jackson, Jefferson, Zelienople
Cambria: Align Daisytown, Sankertown, W Taylor, Wilmore with municipal boundaries; Adjust Cambria, Conemaugh, Croyle, E Taylor, Ebensburg, E Carroll, Geistown, Jackson, Johnstown W8, W17, W20, Lower Yoder, Northern Cambria, Portage Boro, Portage Twp, Richland, Southmont, Stonycreek, Summerhill, Susquehanna, Upper Yoder, W Carroll, Westmont
Cameron: Adjust Emporium, Shippen
Carbon: Adjust Jim Thorpe, Kidder, Mahoning, New Mahoning, Summit Hill
Centre: Merge CD splits for Halfmoon E Central/Proper; Merge Ferguson Northeast 1 A/B; Adjust Benner, College, Ferguson, Patton
Chester: Merge CD/LD splits for Birmingham 2, Phoenixville M 1; Adjust Birmingham, E Bradford S, E Fallowfield, E Goshen, E Marlborough, Easttown, N Coventry, Spring City, Tredyffrin M, Uwchlan, W Bradford, W Caln, W Goshen N, W Goshen S, Westtown
Clarion: Merge Emlenton/Richland; Adjust Clarion, Highland, Farmington, Knox
Clearfield: Adjust Bradford, Cooper, Decatur, Golden Rod, Lawrence Glen Richie, Morris, Plympton, Woodward
Columbia: Merge Ashland/Conyngham; Adjust Orange, Scott West
Crawford: Align Mead, Woodcock with municipal boundaries
Cumberland: Merge CD splits for N Middleton 1, 3; Split Lower Allen 1/Annex; Align Carlisle, E Pennsboro, Hampton, Lemoyne, Lower Allen, Mechanicsburg, Middlesex, Mount Holly Springs, N Middleton, Shiremanstown, Silver Spring, W Pennsboro, Wormleysburg with municipal boundaries
Dauphin: Align Middletown with municipal boundary; Adjust Derry, Harrisburg W1, W7, W8, W9, Hummelstown, Lower Paxton, Lykens, Middletown
Delaware: Adjust Chester, Concord, Darby Boro, Darby Twp, Haverford, Marple, Nether Providence, Newtown, Radnor, Ridley, Sharon Hill, Thornbury, Tinicum, Trainer, Upper Chichester, Upper Darby, Upper Providence
Elk: Split N/S Horton; Adjust Johnsonburg, Ridgway Boro, Ridgway Twp, St. Marys
Erie: Adjust Erie W1, W4, W5, W6, Greene, Lawrence Park, McKean, Millcreek, North East
Fayette: Align Dunbar with municipal boundary; Adjust Brownsville, Bullskin, Dunbar, Georges, German, Luzerne, N Union, Redstone
Franklin: Align Mercersburg with municipal boundary; Adjust Antrim, Fannett, Greene, Guilford, Hamilton, Metal, Peters, Quincy, St. Thomas, Southampton, Washington
Fulton: Align McConnellsburg with municipal boundary
Greene: Align Carmichaels with municipal boundary; Adjust Cumberland, Dunkard, Franklin, Jefferson, Lipencott, Mather, Morgan Chart, Monongahela, Nemacolin
Huntingdon: Merge CD splits for Penn; Adjust Huntingdon, Mount Union
Jefferson: Align Reynoldsville with municipal boundary; Adjust Punxsutawney
Lackawanna: Adjust Archbald, Blakely, Carbondale, Clarks Summit, Dickson City, Dunmore, Fell, Jermyn, Jessup, Mayfield, Moosic, Old Forge, Olyphant, Scranton W1, W2, W3, W6, W7, W10, W12, W13, W14, W15, W16, W19, W20, W23, S Abington, Taylor, Throop
Lancaster: Split Lancaster 7-8 CV/LS; Adjust Brecknock, Columbia, E Hempfield, E Lampeter, E Petersburg, Elizabethtown, Ephrata, Lancaster W4, W8, Lititz, Manheim, Manor, Millersville, Mt Joy Boro, Mt Joy Twp, New Holland, Penn, Providence, Rapho, Warwick, W Cocalico, W Donegal, W Hempfield
Lawrence: Adjust Neshannock
Lebanon: Adjust Jackson, Lickdale, S Lebanon, Union Green Pt
Lehigh: Adjust Lower Macungie, Salisbury
Luzerne: Merge CD splits for Hazle 1; Align Avoca, Pittston with municipal boundaries; Adjust Butler, Dallas, Exeter, Foster, Freeland, Hanover, Hazle, Jenkins, Kingston Boro, Kingston Twp, Larksville, Lehman, Nanticoke, Newport, Plains, Salem, Swoyersville, W Wyoming, Wilkes-Barre
Lycoming: Align Williamsport with municipal boundary; Adjust Jersey Shore
McKean: Adjust Bradford City, Bradford Twp, Foster, Keating, Otto
Mercer: Adjust Delaware, Fredonia, Greenville, Hempfield, Hermitage, Sharon, Sharpsville, S Pymatuning, W Salem
Mifflin: Split Brown Reedsville/Church Hill
Monroe: Align E Stroudsburg with municipal boundary; Adjust E Stroudsburg, Smithfield
Montgomery: Add CD special election splits for Horsham 2-2, Perkiomen 1, Plymouth 2-3; Adjust Abington, Lower Merion, Pottstown, Springfield, Upper Moreland, Upper Merion, Upper Providence
Northampton: Align Glendon, Walnutport with municipal boundaries; Adjust Bangor, Bethlehem W2, W3, W4, W7, W9, W14, W15, Bethlehem Twp, Bushkill, Easton, Forks, Hanover, Hellertown, Lehigh, Lower Mt Bethel, Lower Saucon, Moore, Nazareth, Palmer, Plainfield, Upper Mt Bethel, Washington, Williams
Northumberland: Align Northumberland with municipal boundary; Adjust Coal, Milton, Mount Carmel W, Natalie-Strong, Northumberland, Point, Ralpho, Shamokin, Sunbury, Upper Augusta
Philadelphia: Adjust 1-19/21, 5-3/19, 7-2/3/17, 7-6/7, 9-5/6, 15-7/10, 17-20/26, 20-5/10, 21-1/15, 21-40/41, 22-21/26, 23-11/12, 25-9/17, 25-4/7/12, 25-10/12, 26-1/2, 27-7/8, 27-18/20/21, 28-1/8, 29-9/11, 29-10/17, 30-14/15, 31-5/6, 38-11/17, 38-13/20, 38-15/19, 40-12/18/19, 40-17/19, 42-3/4/7, 44-8/14, 50-3/12, 50-11/27, 52-2/6/9, 52-3/8, 57-6/7, 57-10/27, 57-17/28, 58-6/12, 62-5/19, 65-4/7, 65-11/16, 66-22/34
Pike: Adjust Matamoras
Potter: Adjust Galeton, Sharon
Schuylkill: Adjust Coaldale, N Manheim, Norwegian, Porter, Pottsville
Somerset: Align New Centerville with municipal boundary; Adjust Conemaugh, Jefferson, Middlecreek, Paint, Somerset Boro
Susquehanna: Adjust Montrose, Lanesboro, Susquehanna Depot
Tioga: Adjust Delmar, Wellsboro
Union: Adjust Buffalo, White Deer
Venango: Adjust Franklin, Sugarcreek, Cornplanter, Oil City
Warren: Adjust Conewango
Washington: Align Allenport, Beallsville, Burgettstown, Canonsburg, Carroll, Charleroi, Claysville, Elco, Finleyville, Houston, Long Branch, McDonald, Monongahela, Speers, Twilight with municipal boundaries; Adjust Amwell, Bentleyville, California, Canonsburg, Canton, Cecil, Centerville, Chartiers, Donegal, Donora, Fallowfield, Hanover, Independence, Mount Pleasant, N Franklin, N Strabane, Peters, Robinson, Smith, Somerset, S Franklin, S Strabane, Union Washington, W Brownsville
Wayne: Adjust Honesdale
Westmoreland: Merge CD splits for Unity Pleasant Unity; Align Greensburg with municipal boundary; Adjust Allegheny, Arnold, Bell, Derry, E Huntingdon, Fairfield, Greensburg W1-W8, Hempfield, Jeannette, Latrobe, Ligonier, Lower Burrell, Monessen, Mount Pleasant, Murrysville, New Kensington, N Belle Vernon, N Huntingdon, Penn, Rostraver, St. Clair, Scottdale, Sewickley, S Greensburg, S Huntingdon, Trafford, Upper Burrell, Unity, Vandergrift, Washington, Youngwood
Wyoming: Adjust Falls
York: Merge CD splits for York Twp 5-3; Align E Prospect, Goldsboro, Jefferson, Manchester, Monaghan, Wellsville, York with municipal boundaries; Adjust Chanceford, Codorus, Conewago, Dover, Fairview, Hanover, Jackson, Lower Windsor, New Freedom, Newberry, N Codorus, Penn, Red Lion, Shrewsbury, Spring Garden, Springettsbury, W Manchester, Windsor Boro, Windsor Twp, Wrightsville, York Twp, York W5, W6, W15

In [3]:
print(vest_pa_18.head())
print(vest_pa_18.columns)

col_list = ['G18USSDCAS', 'G18USSRBAR','G18USSLKER', 'G18USSGGAL', 'G18GOVDWOL', 'G18GOVRWAG', 'G18GOVLKRA','G18GOVGGLO']
print("")
print("Here are the vote totals:")
for i in col_list:
    print(i + ": "+str(sum(vest_pa_18[i])))

  STATEFP COUNTYFP   VTDST          NAME  G18USSDCAS  G18USSRBAR  G18USSLKER  \
0      42      001  000010   ABBOTTSTOWN         120         183           5   
1      42      001  000020  ARENDTSVILLE         151         178           6   
2      42      001  000030  BENDERSVILLE          74         103           1   
3      42      001  000040       BERWICK         289         575          14   
4      42      001  000050   BIGLERVILLE         152         231           3   

   G18USSGGAL  G18GOVDWOL  G18GOVRWAG  G18GOVLKRA  G18GOVGGLO  \
0           2         120         185           2           2   
1           3         160         172           4           2   
2           2          76          98           3           2   
3           5         318         554           9           5   
4           7         168         215           5           2   

                                            geometry  
0  POLYGON Z ((-76.99801 39.88359 0.00000, -76.99...  
1  POLYGON Z ((-77

In [4]:
fips_file = pd.read_csv("./raw-from-source/FIPS/US_FIPS_Codes.csv")
fips_file = fips_file[fips_file["State"]=="Pennsylvania"]
fips_file["FIPS County"]=fips_file["FIPS County"].astype(str)
fips_file["FIPS County"]=fips_file["FIPS County"].str.zfill(3)
fips_file["unique_ID"] =  "42" + fips_file["FIPS County"]
fips_codes = fips_file["unique_ID"].tolist()
print(fips_file["County Name"].unique())
pa_fips_dict = dict(zip(fips_file["County Name"],fips_file["FIPS County"]))

['Adams' 'Allegheny' 'Armstrong' 'Beaver' 'Bedford' 'Berks' 'Blair'
 'Bradford' 'Bucks' 'Butler' 'Cambria' 'Cameron' 'Carbon' 'Centre'
 'Chester' 'Clarion' 'Clearfield' 'Clinton' 'Columbia' 'Crawford'
 'Cumberland' 'Dauphin' 'Delaware' 'Elk' 'Erie' 'Fayette' 'Forest'
 'Franklin' 'Fulton' 'Greene' 'Huntingdon' 'Indiana' 'Jefferson' 'Juniata'
 'Lackawanna' 'Lancaster' 'Lawrence' 'Lebanon' 'Lehigh' 'Luzerne'
 'Lycoming' 'McKean' 'Mercer' 'Mifflin' 'Monroe' 'Montgomery' 'Montour'
 'Northampton' 'Northumberland' 'Perry' 'Philadelphia' 'Pike' 'Potter'
 'Schuylkill' 'Snyder' 'Somerset' 'Sullivan' 'Susquehanna' 'Tioga' 'Union'
 'Venango' 'Warren' 'Washington' 'Wayne' 'Westmoreland' 'Wyoming' 'York']
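The lookup built above is later applied with `.map(...).fillna(...)`, which quietly keeps any county name that fails to match. The following is a minimal sketch, on toy data rather than the real FIPS file, of how unmatched names could be listed before that fallback hides them. The names `name_to_fips`, `counties`, and `unmatched` are illustrative and do not appear elsewhere in this notebook.

```python
import pandas as pd

# Toy stand-in for the FIPS file (assumption: same two columns as above).
fips = pd.DataFrame({
    "County Name": ["Adams", "Allegheny", "York"],
    "FIPS County": [1, 3, 133],
})

# Zero-pad to the three-digit form used by COUNTYFP in the shapefile.
fips["FIPS County"] = fips["FIPS County"].astype(str).str.zfill(3)
name_to_fips = dict(zip(fips["County Name"], fips["FIPS County"]))

# Hypothetical election rows, one with a trailing space in the county name.
counties = pd.Series(["Adams", "Washington ", "York"])
mapped = counties.map(name_to_fips)

# Names that failed to map come back as NaN and can be listed for review.
unmatched = sorted(counties[mapped.isna()].unique())
print(unmatched)
```

Checking `unmatched` before applying `fillna` would surface entries such as a county name with stray whitespace.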


In [131]:
pa_election = pd.read_csv("./raw-from-source/Election_Results/openelections-data-pa-master/2018/20181106__pa__general__precinct.csv")



In [132]:
office_list = ["U.S. Senate", 'Governor','Straight Party']
# .copy() gives an independent DataFrame, so the column assignments below do not raise SettingWithCopyWarning
filtered_pa_election = pa_election[pa_election["office"].isin(office_list)].copy()
# Fix the stray trailing space in "Washington " so the name matches the FIPS lookup
county_changes_dict = {"Washington ":"Washington"}
filtered_pa_election["county"] = filtered_pa_election["county"].map(county_changes_dict).fillna(filtered_pa_election["county"])
# Map county names to three-digit FIPS codes; names with no match are left unchanged
filtered_pa_election["County_FIPS"]=filtered_pa_election["county"].map(pa_fips_dict).fillna(filtered_pa_election["county"])


In [133]:
filtered_pa_election["pivot_col"]=filtered_pa_election["County_FIPS"]+filtered_pa_election["precinct"]
filtered_pa_election["candidate"]=filtered_pa_election["candidate"].str.upper()
filtered_pa_election["candidate"] = filtered_pa_election["candidate"].str.strip()
filtered_pa_election["party"] = filtered_pa_election["party"].str.upper()


print(filtered_pa_election["party"].unique())

party_changes_dict = {"DEMOCRATIC":"DEM","REPUBLICAN":"REP","LIBERTARIAN":"LIB","GREEN":"GRN",
                     "GR":"GRN","GRE":"GRN","DEMOCRAT":"DEM"}

filtered_pa_election["party"] = filtered_pa_election["party"].map(party_changes_dict).fillna(filtered_pa_election["party"])


['DEM' 'REP' 'GRN' 'LIB' nan 'IND' 'GREEN' 'GR' 'GRE' 'DEMOCRATIC'
 'REPUBLICAN' 'LIBERTARIAN' 'DEMOCRAT' 'NAF' 'NOA']
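The party cleanup in this cell relies on the `map(...).fillna(...)` idiom: `map` returns NaN for any label missing from the dictionary, and `fillna` then restores the original label. A minimal sketch on a toy Series (the values are a subset of the labels printed above; `changes` and `normalized` are illustrative names):

```python
import pandas as pd

# A few of the party labels seen in the raw file, including a missing value.
party = pd.Series(["DEM", "DEMOCRATIC", "GRE", None, "NAF"])

# Only the variants that need rewriting are listed in the dictionary.
changes = {"DEMOCRATIC": "DEM", "GRE": "GRN"}

# map() yields NaN for unlisted labels; fillna() puts the original label back.
normalized = party.map(changes).fillna(party)
print(normalized.tolist())
```

Labels that are genuinely missing stay missing, which is why the notebook later fills `party` from `candidate` for the straight-party rows.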



In [135]:
filtered_pa_election = filtered_pa_election[~(filtered_pa_election["candidate"].str[-3:]=="(W)")]

G18USSDCAS - Robert P. Casey Jr. (Democratic Party)  
G18USSRBAR - Louis J. Barletta (Republican Party)  
G18USSLKER - Dale R. Kerns (Libertarian Party)  
G18USSGGAL - Neal Taylor Gale (Green Party)  
  
G18GOVDWOL - Thomas W. Wolf (Democratic Party)  
G18GOVRWAG - Scott R. Wagner (Republican Party)  
G18GOVLKRA - Kenneth V. Krawchuk (Libertarian Party)  
G18GOVGGLO - Paul Glover (Green Party)  
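The column names in this key appear to follow a fixed layout: a leading `G`, a two-digit year, a three-letter office code, a one-letter party code, and the first three letters of the surname. That layout is inferred from the eight names listed above, not taken from a specification, and `vest_column` is a hypothetical helper used only for illustration.

```python
def vest_column(office: str, party: str, surname: str, year: int = 18) -> str:
    """Build a column name such as G18USSDCAS (layout assumed from the key above)."""
    return f"G{year:02d}{office}{party}{surname[:3].upper()}"


# Rebuild the eight column names from (office, party, surname) triples.
contests = [
    ("USS", "D", "Casey"), ("USS", "R", "Barletta"),
    ("USS", "L", "Kerns"), ("USS", "G", "Gale"),
    ("GOV", "D", "Wolf"), ("GOV", "R", "Wagner"),
    ("GOV", "L", "Krawchuk"), ("GOV", "G", "Glover"),
]
rebuilt = [vest_column(office, party, surname) for office, party, surname in contests]
print(rebuilt)
```

A helper like this could be used to rename the pivoted election columns so they line up with `col_list` from the VEST file.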

In [136]:
#Things to look into

party_cand_list = [
    'DEMOCRATIC',
    'REPUBLICAN',
    'GREEN',
    'INDEPENDENT',
    'LIBERTARIAN',
]

In [137]:
candidate_name_changes = {
   'DEMOCRATIC':'DEM', 
 'REPUBLICAN':"REP",
 'GREEN':"GRN", 
 'INDEPENDENT':"IND", 
 'LIBERTARIAN':"LIB",
    
    
    'LOU BARLETTA':'BARLETTA',
 'LOU  BARLETTA':'BARLETTA',
 'LOU BARLETTA JR':'BARLETTA',
 'BARLETTA, LOU':'BARLETTA',

    'KEN V KRAWCHUK, GOVERNOR':'KRAWCHUK',
    'KEN V. KRAWCHUK/K.S. SMITH':'KRAWCHUK',
    'KRAWCHUK /SMITH':'KRAWCHUK',
    'KEN V. KRAWCHUK KATHLEEN S. SMITH':'KRAWCHUK',
    'KRAWCHUK\\SMITH':'KRAWCHUK',
     'KEN V KRAWCHUK':'KRAWCHUK', 
 'KRAWCHUK / SMITH':'KRAWCHUK',
 'KRAWCHUK/SMITH':'KRAWCHUK',
 'KEN V. KRAWCHUK/K. S. SMITH':'KRAWCHUK',
 'KEN KRAWCHUK':'KRAWCHUK',
 'KRAWCHUK/ SMITH':'KRAWCHUK',
 'KEN V. KRAWCHUK':'KRAWCHUK',
 'KRAWCHUK, KEN V.':'KRAWCHUK',
 'KEN V. KRAWCHUK / K. S. SMITH':'KRAWCHUK',
    
    'GLOVER / BOSTICK':'GLOVER',
    'PAUL GLOVER, GOVERNOR':'GLOVER',
    'GLOVER / BOWSER BOSTICK':'GLOVER',
    'PAUL GLOVER/J. BOWSER-BOSTICK':'GLOVER',
    'GLOVER/BOSTICK':'GLOVER',
    'GLOVER/BOWSER-BOSTIC':'GLOVER',
    'PAUL GLOVER JOCOLYN BOWSER-BOSTICK':'GLOVER',
    'GLOVER/BOWSERBOS':'GLOVER',
    'GLOVER\\BOWSERBOSTICK':'GLOVER',
     'GLOVER / BOWSER-BOSTICK':'GLOVER', 
 'GLOVER/BOWSER-BOSTICK':'GLOVER',
 'GLOVER/BOWSER-BOS':'GLOVER', 
 'PAUL GLOVER/JOCOLYN BOWER-BOSTICK':'GLOVER', 
 'PAUL  GLOVER':'GLOVER',
 'GLOVER / BOWSER-BOS':'GLOVER',
 'PAUL GLOVER':'GLOVER',
 'GLOVER, PAUL':'GLOVER',
 'PAUL GLOVER / J. BOWSER BOSTICK':'GLOVER',
    
    'SCOTT R WAGNER, GOVERNOR':'WAGNER',
    'SCOTT R. WAGNER JEFF BARTOS':'WAGNER',
    'WAGNER\\BARTOS':'WAGNER',
     'SCOTT R WAGNER':'WAGNER', 
    'WAGNER/BARTOS':'WAGNER',
 'WAGNER / BARTOS':'WAGNER',
  'SCOTT R. WAGNER/JEFF BARTOS':'WAGNER',
 'WAGNER/ BARTOS':'WAGNER',
 'SCOTT R WAGNER AND JEFF BARTOS':'WAGNER',
 'SCOTT WAGNER':'WAGNER',
 'SCOTT R. WAGNER':'WAGNER',
 'WAGNER, SCOTT R.':'WAGNER',
 'SCOTT R. WAGNER / JEFF BARTOS':'WAGNER',
    
    'TOM WOLF, GOVERNOR':'WOLF',
    'TOM WOLF JOHN FETTERMAN':'WOLF',
    'WOLF\\FETTERMAN':'WOLF',
     'WOLF / FETTERMAN':'WOLF',
 'WOLF/FETTERMAN':'WOLF',
 'TOM WOLF/JOHN FETTERMAN':'WOLF', 
 'TOM  WOLF':'WOLF',
 'TOM WOLF AND JOHN FETTERMAN':'WOLF',
 'TOM WOLF':'WOLF',
 'WOLF, TOM':'WOLF',
 'TOM WOLF / JOHN FETTERMAN':'WOLF',
    
    'DALE KERNS':"KERNS",
    'DALE R KEARNS, JR':"KERNS",
 'DALE R KERNS, JR':"KERNS",
 'DALE R. KERNS JR.':"KERNS",
  'DALE KERNS JR':"KERNS", 
 'DALE R. KERNS, JR':"KERNS",
 'DALE R. KERNS, JR.':"KERNS",
 'DALE R. KERNS JR':"KERNS", 
 'DALE R KERNS JR':"KERNS",
    'KERNS, JR., DALE R.':"KERNS",
    
    'ROBERT CASEY JR.':"CASEY",
     'BOB CASEY, JR':"CASEY",
 'BOB CASEY JR.':"CASEY",
 'BOB  CASEY, JR.':"CASEY",
 'BOB CASEY':"CASEY",
 'CASEY, JR., BOB':"CASEY",
 'BOB CASEY, JR.':"CASEY", 
 'BOB CASEY JR':"CASEY", 
    
    'NEAL GALE':"GALE",
 'NEAL  GALE':"GALE",
 'GALE, NEAL':"GALE",
 'NEALE GALE':"GALE"}

filtered_pa_election["candidate"] = filtered_pa_election["candidate"].map(candidate_name_changes).fillna(filtered_pa_election["candidate"])

In [138]:
candidates_to_remove = ["NO AFFILIATION",'WRITE - IN','BLANK VOTES',
                      'WRITE-INS','WRITE IN','CAST VOTES','OVER VOTES',
                     'UNDER VOTES','WRITE IN VOTES','WRITE-IN VOTES']

parties_to_remove = ["NAF","IND"]

In [139]:
filtered_pa_election = filtered_pa_election[~(filtered_pa_election["candidate"].isin(candidates_to_remove))]
filtered_pa_election = filtered_pa_election[~(filtered_pa_election["party"].isin(parties_to_remove))]
filtered_pa_election["party"] = filtered_pa_election["party"].fillna(filtered_pa_election["candidate"])
filtered_pa_election["candidate"] = filtered_pa_election["candidate"].fillna(filtered_pa_election["party"])
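After the renaming and filtering steps, it can help to confirm that no unexpected candidate spellings remain. A minimal sketch, assuming the cleaned `candidate` column should contain only the eight surnames and the four party codes used for straight-party rows. The Series here is toy data, and `'BOB CASEY III'` is a made-up variant included to show what a leftover looks like.

```python
import pandas as pd

# Labels expected after cleaning (assumption based on the dictionaries above).
expected = {
    "CASEY", "BARLETTA", "KERNS", "GALE",
    "WOLF", "WAGNER", "KRAWCHUK", "GLOVER",
    "DEM", "REP", "LIB", "GRN",
}

# Toy candidate column with one hypothetical unmapped spelling.
candidates = pd.Series(["CASEY", "BOB CASEY III", "WOLF", "GRN"])

# Anything outside the expected set still needs an entry in the rename dictionary.
leftover = sorted(set(candidates.dropna()) - expected)
print(leftover)
```

Running the same check against `filtered_pa_election["candidate"]` would show which raw spellings still need to be added to `candidate_name_changes`.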

In [147]:
filtered_pa_election["cand_col"]=filtered_pa_election["office"]+filtered_pa_election["candidate"]

In [152]:
print(filtered_pa_election[filtered_pa_election["pivot_col"].isna()])

Empty DataFrame
Columns: [county, precinct, office, district, candidate, party, votes, absentee, election_day, County_FIPS, pivot_col, cand_col, cand_col_2]
Index: []


In [149]:
pivoted_2018 = pd.pivot_table(filtered_pa_election, values=["votes"], index=["pivot_col"],columns=["cand_col"],aggfunc="sum")

In [150]:
print(pivoted_2018.head())

cand_col              GovernorGLOVER GovernorKRAWCHUK GovernorWAGNER  \
pivot_col                                                              
001Abbottstown  votes              2                2            185   
001Arendtsville votes              2                4            172   
001Bendersville votes              2                3             98   
001Berwick      votes              5                9            554   
001Biglerville  votes              2                5            215   

cand_col              GovernorWOLF Straight PartyDEM Straight PartyGRN  \
pivot_col                                                                
001Abbottstown  votes          120                57                 1   
001Arendtsville votes          160                64                 0   
001Bendersville votes           76                37                 0   
001Berwick      votes          318               144                 0   
001Biglerville  votes          168                6
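The pivot above turns one row per precinct and candidate into one row per precinct with a column per candidate. A minimal sketch of the same reshaping on four rows, using the Wolf and Wagner counts for the first two Adams County precincts from the printed output. Passing `values` as a plain string instead of a list is a choice made here to get flat column names, not what the cell above does.

```python
import pandas as pd

# Long-format rows: one per precinct and candidate.
rows = pd.DataFrame({
    "pivot_col": ["001Abbottstown", "001Abbottstown",
                  "001Arendtsville", "001Arendtsville"],
    "cand_col": ["GovernorWOLF", "GovernorWAGNER",
                 "GovernorWOLF", "GovernorWAGNER"],
    "votes": [120, 185, 160, 172],
})

# A scalar `values` gives single-level columns; fill_value=0 covers missing pairs.
pivoted = pd.pivot_table(rows, values="votes", index="pivot_col",
                         columns="cand_col", aggfunc="sum", fill_value=0)
pivoted.columns.name = None
pivoted = pivoted.reset_index()

# Column sums give totals that can be compared with the VEST totals.
totals = pivoted[["GovernorWOLF", "GovernorWAGNER"]].sum()
print(pivoted)
print(totals)
```

With flat columns, each candidate column can be renamed to its VEST counterpart and summed for a direct comparison against the totals printed from the shapefile.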

In [None]:
#Combine all the data from separate files into one
li = []
for i in fips_codes:
    ref = "./raw-from-source/Census/partnership_shapefiles_19v2_"
    file_ref = ref+i+"/PVS_19_v2_vtd_"+i+".shp"
    file_prev = gp.read_file(file_ref)
    #print(file_prev.shape)
    li.append(file_prev)
shapefiles_census = pd.concat(li, axis=0, ignore_index=True)

In [None]:
print(len(shapefiles_census["COUNTYFP"].unique()))

In [None]:
print(vest_pa_18.head())