# Verification Example Notebook

This notebook shows how to use the verification script through a few examples in order of increasing difficulty. Before you start, I highly recommend reading [README.md](https://github.com/OpenPrecincts/verification/blob/master/README.md).

The following cells show how to use both verification methods at your disposal. The first section shows how to use `verify.verify_state_2016(...)`. The second section shows how to use `verify.verify_state(...)`, which is applicable to any election year. This may seem confusing at first, but fret not: both functions return the same kind of output. The `verify.verify_state_2016(...)` function just supplies 2016 presets to `verify.verify_state(...)`, thereby reducing the number of arguments you need to supply for 2016 states. Since it requires fewer arguments, it's easier to understand. For that reason, we will start with the 2016-specific usage before moving on to `verify.verify_state(...)`. Even if you are looking to verify a non-2016 election, you might find it helpful to review the 2016 examples.

In [1]:
import geopandas as gpd
import pandas as pd
import verify
import matplotlib.pyplot as plt
from reference_data import state_fip_to_county_to_geoid, geoid_to_county_name

## Examples for 2016 Precinct-Level Election Shapefiles

The `verify.verify_state_2016(...)` function calls `verify.verify_state(...)` and automatically applies 2016-specific defaults:

* Uses Official County Results from the 2016 Presidential Election already in this repository
* Sets year to '2016'
* Sets office to 'President'

Using this function for a 2016 Precinct-Level Election Shapefile has the benefit of standardizing 2016 reports. Moreover, it saves you the time of finding official county-level results and conforming the data to the expected input schema.
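
The relationship between the two functions can be pictured as a thin wrapper that fills in fixed arguments. The sketch below only illustrates that pattern: `verify_state_sketch`, `verify_state_2016_sketch`, and `OFFICIAL_2016_COUNTY_RESULTS` are hypothetical stand-ins rather than the real functions in `verify`, and the argument order is assumed.

```python
# Hypothetical stand-in for the official 2016 county results bundled with the repository.
OFFICIAL_2016_COUNTY_RESULTS = "official 2016 county results (placeholder)"


def verify_state_sketch(gdf, state_abbreviation, source, year,
                        county_level_results, office,
                        d_col=None, r_col=None, path=None):
    """Stand-in for the general function: it just echoes the settings it received."""
    return {
        "state": state_abbreviation,
        "source": source,
        "year": year,
        "office": office,
        "county_level_results": county_level_results,
        "d_col": d_col,
        "r_col": r_col,
        "path": path,
    }


def verify_state_2016_sketch(gdf, state_abbreviation, source,
                             d_col=None, r_col=None, path=None):
    """Stand-in for the 2016 helper: forwards to the general function with presets."""
    return verify_state_sketch(
        gdf, state_abbreviation, source,
        year="2016",
        county_level_results=OFFICIAL_2016_COUNTY_RESULTS,
        office="President",
        d_col=d_col, r_col=r_col, path=path,
    )


settings = verify_state_2016_sketch(gdf=None, state_abbreviation="MD", source="OP")
print(settings["year"], settings["office"])
```

The caller of the 2016 helper never has to supply the year, the office, or the county results, which is the whole point of the preset version.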

### Example #1 - The best case scenario

In [2]:
gdf = gpd.read_file('example-election-shapefiles/open-precincts-md-2016')
gdf.head(2)

Unnamed: 0,JURIS,NAME,NUMBER,preid,G16RPRS,G16DPRS,G16PRELJoh,G16PREGSte,G16PREOth,G16USSRSze,...,G16H07RVau,G16H07DCum,G16H07GHoe,G16H07Oth,G16H08RCox,G16H08DRas,G16H08LWun,G16H08GWal,G16H08Oth,geometry
0,ALLE,ALLEGANY PRECINCT 01-000,01-000,ALLE-01-000,420,63,7,2,8,367,...,0,0,0,0,0,0,0,0,0,"POLYGON ((279387.444 229180.231, 279432.538 22..."
1,ALLE,ALLEGANY PRECINCT 02-000,02-000,ALLE-02-000,457,78,18,9,2,400,...,0,0,0,0,0,0,0,0,0,"POLYGON ((262891.921 216881.387, 263013.601 21..."


Great - our election shapefile looks good because it has:
* election results at a precinct level with vote counts 
    * for Clinton (G16DPRS)
    * and Trump (G16RPRS)
* AND geometries for each precinct (geometry)

We need all the bullets above in order to use the verification script. Next we will apply `verify.verify_state_2016`. Its docstring is as follows:

```
returns a complete (StateReport) object and a ((CountyReport) list) for the state.

:state_prec_gdf: (GeoDataFrame) containing precinct geometries and election results
:state_abbreviation: (str) e.g. 'MA' for Massachusetts
:source: (str) person or organization that made the 'state_prec_gdf' e.g 'VEST'
:year: (str) 'YYYY' indicating the year the election took place e.g. '2016'
:d_col: (str) denotes the column for Hillary Clinton vote counts in each precinct
:r_col: (str) denotes the column for Donald Trump vote counts in each precinct
:path: (str) filepath to which the report should be saved (if None it won't be saved)

d_col, r_col are optional - if they are not provided, 'get_party_cols' will be used
to guess based on comparing each column in state_prec_gdf to the expected results.
```

Pro tip: If you want to view a docstring in Jupyter Notebooks, just press `shift-tab` after the name of the function whose docstring you want to see.

In [3]:
state_report, county_report_lst = verify.verify_state_2016(gdf, 'MD', 'Princeton Gerrymandering Project', '2016')

Starting verification process for:  MD OP 2016
Candidate vote count columns are being assigned automatically
Choose d_col as:  G16DPRS
Choose r_col as:  G16RPRS
Verification will now begin with this GeoDataFrame: 

   d_col  r_col                                           geometry
0     63    420  POLYGON ((279387.444 229180.231, 279432.538 22...
1     78    457  POLYGON ((262891.921 216881.387, 263013.601 21...
2     81    444  POLYGON ((271950.900 229307.142, 272012.454 22...
3    126    200  POLYGON ((249101.155 221530.847, 249098.897 22...
4    209    366  POLYGON ((248919.702 220365.957, 248928.023 22...
Starting Vote Verification
Starting Topology Verification
Starting County Verification
Missing GEOID Column - attempting automatic assignment
GEOID assignment successful
All done!



It's normal for the cell above to take a while: usually a few minutes, but even hours in extreme cases. It depends on the complexity of the state shapefile.

Now that it's finished, let's inspect the reports it returned.

In [4]:
vars(state_report)

{'abbreviation': 'MD',
 'name': 'Maryland',
 'fips': '24',
 'n_votes_democrat_expected': 1677928.0,
 'n_votes_republican_expected': 943169.0,
 'n_two_party_votes_expected': 2621097.0,
 'n_votes_democrat_observed': 1677928,
 'n_votes_republican_observed': 943169,
 'n_two_party_votes_observed': 2621097,
 'vote_score': 1.0,
 'county_vote_score_dispersion': 0.0,
 'worst_county_vote_score': 1.0,
 'median_county_area_difference_score': 0.030068442816528422,
 'worst_county_area_difference_score': 0.22995055024742478,
 'year': 2016,
 'source': 'OP',
 'office': 'President',
 'all_precincts_have_a_geometry': False,
 'can_use_maup': False,
 'can_use_gerrychain': False}

In [5]:
vars(county_report_lst[0])

{'n_votes_democrat_expected': 7875.0,
 'n_votes_republican_expected': 21270.0,
 'n_two_party_votes_expected': 29145.0,
 'n_votes_democrat_observed': 7875,
 'n_votes_republican_observed': 21270,
 'n_two_party_votes_observed': 29145,
 'vote_score': 1.0,
 'geoid': '24001',
 'name': 'Allegany County',
 'area_difference_score': 0.003126333615735013}

In [6]:
len(county_report_lst)

24
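
The `*_expected` and `*_observed` fields in the reports above can also be compared programmatically. The snippet below is a minimal sketch, not part of `verify`: `expected_observed_gaps` is a hypothetical helper, and the dictionary is copied from the `vars(state_report)` output rather than read from a live report object.

```python
# Values copied from the vars(state_report) output above.
report = {
    "n_votes_democrat_expected": 1677928.0,
    "n_votes_republican_expected": 943169.0,
    "n_two_party_votes_expected": 2621097.0,
    "n_votes_democrat_observed": 1677928,
    "n_votes_republican_observed": 943169,
    "n_two_party_votes_observed": 2621097,
}


def expected_observed_gaps(report_vars):
    """Return {field prefix: observed - expected} for every *_expected/*_observed pair."""
    gaps = {}
    for key, expected in report_vars.items():
        if key.endswith("_expected"):
            prefix = key[: -len("_expected")]
            observed = report_vars[prefix + "_observed"]
            gaps[prefix] = observed - expected
    return gaps


gaps = expected_observed_gaps(report)
print(gaps)
```

In the notebook itself you would pass `vars(state_report)` or `vars(county_report_lst[0])` instead of the hand-copied dictionary. A gap of zero for every field matches what the Maryland report shows.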

Great - now let's use these report objects to render a markdown file. You can also do this with `verify.verify_state` by providing the optional argument `path`.

In [7]:
report_file_path = 'open-precincts-maryland-2016'
verify.make_report(report_file_path, state_report, county_report_lst)

[Maryland's Report](https://github.com/OpenPrecincts/verification/blob/master/reports/open-precincts-maryland-2016.md)

### Example #2 - Manual GEOID Assignment

If the [GEOID column](https://github.com/OpenPrecincts/verification#geoid-county-assignment-for-each-precinct) is missing, the script will attempt to create it using the [MAUP package](https://github.com/mggg/maup#assigning-precincts-to-districts) to assign each precinct to the county that contains it. The election shapefile in this example runs into trouble with MAUP.

In [9]:
gdf = gpd.read_file('example-election-shapefiles/vest-nh-2016')
gdf.head(2)

Unnamed: 0,STATEFP,COUNTYFP,VTDST,NAMELSAD,NAME,G16PRERTRU,G16PREDCLI,G16PRELJOH,G16PREGSTE,G16PREOFUE,...,G16USSRAYO,G16USSDHAS,G16USSLCHA,G16USSIDAY,G16USSOWRI,G16GOVRSUN,G16GOVDVAN,G16GOVLABR,G16GOVOWRI,geometry
0,33,1,ALTO01,TOWN OF ALTON Voting District,TOWN OF ALTON,2201,1152,115,24,2,...,2204,1192,49,69,0,2166,1163,135,14,"POLYGON Z ((-71.34362 43.62879 0.00000, -71.34..."
1,33,1,BARN01,TOWN OF BARNSTEAD Voting District,TOWN OF BARNSTEAD,1520,924,125,20,0,...,1454,1033,52,62,0,1454,1017,116,0,"POLYGON Z ((-71.34905 43.34658 0.00000, -71.34..."


In [10]:
state_report, county_report_lst = verify.verify_state_2016(gdf, 'NH', 'VEST', '2016')

Starting verification process for:  NH VEST 2016
Candidate vote count columns are being assigned automatically
Choose d_col as:  G16PREDCLI
Choose r_col as:  G16PRERTRU
Verification will now begin with this GeoDataFrame: 

   d_col  r_col                                           geometry
0   1152   2201  POLYGON Z ((-71.34362 43.62879 0.00000, -71.34...
1    924   1520  POLYGON Z ((-71.34905 43.34658 0.00000, -71.34...
2   1272   2192  POLYGON Z ((-71.54931 43.45244 0.00000, -71.54...
3    324    357  POLYGON Z ((-71.58140 43.69195 0.00000, -71.58...
4   1973   2504  POLYGON Z ((-71.45809 43.53618 0.00000, -71.45...
Starting Vote Verification
Starting Topology Verification
Starting County Verification
Missing GEOID Column - attempting automatic assignment


AssertionError: 

This assertion error tells us that the GEOID column is missing and the script was unable to assign it automatically. Luckily, this NH GeoDataFrame already has the two constituents of a GEOID:
* STATEFP
* COUNTYFP

So we can create a GEOID column manually like so:

In [11]:
gdf['GEOID'] = gdf['STATEFP'].map(str) + gdf['COUNTYFP'].map(str)
gdf.GEOID.head(5)

0    33001
1    33001
2    33001
3    33001
4    33001
Name: GEOID, dtype: object
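
A county GEOID is the 2-digit state FIPS code followed by the 3-digit county FIPS code, so plain concatenation only works when both columns were read as strings that kept their leading zeros. The following is a small sketch of a more defensive version; `make_county_geoid` is a hypothetical helper, not something provided by `verify`.

```python
def make_county_geoid(state_fips, county_fips):
    """Build a 5-character county GEOID: 2-digit state FIPS + 3-digit county FIPS."""
    return str(state_fips).zfill(2) + str(county_fips).zfill(3)


print(make_county_geoid("33", "001"))  # strings that already carry leading zeros
print(make_county_geoid(33, 1))        # integers that lost their leading zeros
print(make_county_geoid(1, 1))         # a state whose FIPS code starts with zero
```

With a GeoDataFrame, the same padding could be applied column-wise, for example `gdf['STATEFP'].map(str).str.zfill(2) + gdf['COUNTYFP'].map(str).str.zfill(3)`.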

In [12]:
report_file_path = 'vest-new-hampshire-2016'
state_report, county_report_lst = verify.verify_state_2016(gdf, 'NH', 'VEST', '2016',path=report_file_path)

Starting verification process for:  NH VEST 2016
Candidate vote count columns are being assigned automatically
Choose d_col as:  G16PREDCLI
Choose r_col as:  G16PRERTRU
Verification will now begin with this GeoDataFrame: 

   d_col  r_col                                           geometry  GEOID
0   1152   2201  POLYGON Z ((-71.34362 43.62879 0.00000, -71.34...  33001
1    924   1520  POLYGON Z ((-71.34905 43.34658 0.00000, -71.34...  33001
2   1272   2192  POLYGON Z ((-71.54931 43.45244 0.00000, -71.54...  33001
3    324    357  POLYGON Z ((-71.58140 43.69195 0.00000, -71.58...  33001
4   1973   2504  POLYGON Z ((-71.45809 43.53618 0.00000, -71.45...  33001
Starting Vote Verification
Starting Topology Verification
Starting County Verification
Using the GEOID Column in the original shapefile.
All done!



[New Hampshire's Report](https://github.com/OpenPrecincts/verification/blob/master/reports/vest-new-hampshire-2016.md)

In a less trivial case, you may have the county names but not their FIPS codes. Let's consider VEST's Washington 2016:

In [15]:
gdf = gpd.read_file('example-election-shapefiles/vest-wa-2016')
gdf.head(2)

Unnamed: 0,LEGDIST,CONGDIST,CCDIST,COUNTY,COUNTYCODE,PRECCODE,PRECNAME,ST_CODE,G16PREDCLI,G16PRERTRU,...,G16TRERDAV,G16AUDDMCC,G16AUDRMIL,G16ATGDFER,G16ATGRTRU,G16LNDDFRA,G16LNDRMCL,G16INSDKRE,G16INSRSCH,geometry
0,9,4,1,Adams,AD,111,Ritzville Ward 1,AD00000111,24,109,...,81,45,92,57,66,27,107,40,89,"POLYGON ((2169598.057 664568.324, 2169604.801 ..."
1,9,4,1,Adams,AD,112,Ritzville Ward 2,AD00000112,26,96,...,65,35,87,66,51,30,92,40,84,"POLYGON ((2170310.521 663414.822, 2170266.610 ..."


In [16]:
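from reference_data import state_fip_to_county_to_geoid
washington_state_fips_code = 53
# Despite its name, this dict maps county name -> GEOID (see the output below).
geoid_to_county_name = state_fip_to_county_to_geoid[washington_state_fips_code]
geoid_to_county_name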

{'Adams County': '53001',
 'Asotin County': '53003',
 'Benton County': '53005',
 'Chelan County': '53007',
 'Clallam County': '53009',
 'Clark County': '53011',
 'Columbia County': '53013',
 'Cowlitz County': '53015',
 'Douglas County': '53017',
 'Ferry County': '53019',
 'Franklin County': '53021',
 'Garfield County': '53023',
 'Grant County': '53025',
 'Grays Harbor County': '53027',
 'Island County': '53029',
 'Jefferson County': '53031',
 'King County': '53033',
 'Kitsap County': '53035',
 'Kittitas County': '53037',
 'Klickitat County': '53039',
 'Lewis County': '53041',
 'Lincoln County': '53043',
 'Mason County': '53045',
 'Okanogan County': '53047',
 'Pacific County': '53049',
 'Pend Oreille County': '53051',
 'Pierce County': '53053',
 'San Juan County': '53055',
 'Skagit County': '53057',
 'Skamania County': '53059',
 'Snohomish County': '53061',
 'Spokane County': '53063',
 'Stevens County': '53065',
 'Thurston County': '53067',
 'Wahkiakum County': '53069',
 'Walla Walla Co

In [17]:
gdf['GEOID'] = gdf['COUNTY'].apply(lambda x: geoid_to_county_name[x + " County"])
print(gdf.GEOID.unique())
gdf.head(2)

['53001' '53003' '53005' '53007' '53009' '53011' '53013' '53015' '53017'
 '53019' '53021' '53023' '53025' '53027' '53029' '53031' '53033' '53035'
 '53041' '53043' '53045' '53047' '53051' '53055' '53057' '53059' '53061'
 '53063' '53065' '53067' '53069' '53071' '53075' '53077' '53053' '53073'
 '53049' '53039' '53037']


Unnamed: 0,LEGDIST,CONGDIST,CCDIST,COUNTY,COUNTYCODE,PRECCODE,PRECNAME,ST_CODE,G16PREDCLI,G16PRERTRU,...,G16AUDDMCC,G16AUDRMIL,G16ATGDFER,G16ATGRTRU,G16LNDDFRA,G16LNDRMCL,G16INSDKRE,G16INSRSCH,geometry,GEOID
0,9,4,1,Adams,AD,111,Ritzville Ward 1,AD00000111,24,109,...,45,92,57,66,27,107,40,89,"POLYGON ((2169598.057 664568.324, 2169604.801 ...",53001
1,9,4,1,Adams,AD,112,Ritzville Ward 2,AD00000112,26,96,...,35,87,66,51,30,92,40,84,"POLYGON ((2170310.521 663414.822, 2170266.610 ...",53001


Now Washington has a GEOID column and can be run through the verification script.
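
The lambda used above raises a `KeyError` on the first county name that is missing from the lookup, which can be hard to debug when several names differ. The sketch below collects every unmatched name instead. `assign_geoids` is a hypothetical helper, the lookup holds only a few entries copied from the Washington output, and "Greys Harbor" is a deliberately misspelled made-up input.

```python
# A few entries copied from the Washington lookup shown above.
county_name_to_geoid = {
    "Adams County": "53001",
    "Asotin County": "53003",
    "Grays Harbor County": "53027",
}


def assign_geoids(county_names, lookup):
    """Return (geoids, unmatched); geoids holds None where a name has no match."""
    geoids, unmatched = [], set()
    for name in county_names:
        geoid = lookup.get(name.strip() + " County")
        if geoid is None:
            unmatched.add(name)
        geoids.append(geoid)
    return geoids, sorted(unmatched)


geoids, unmatched = assign_geoids(["Adams", "Asotin ", "Greys Harbor"], county_name_to_geoid)
print(geoids)
print(unmatched)
```

Fixing the names listed in `unmatched` before assigning the column avoids ending up with precincts that have no GEOID.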

### Example #3 - Manual Candidate Column Selection
The script needs to know which column contains votes for Clinton and which column contains votes for Trump. They can be manually entered as arguments:

* `d_col` denotes the column for Hillary Clinton vote counts in each precinct
* `r_col` denotes the column for Donald Trump vote counts in each precinct

Without those arguments, the script will guess based on the expected number of votes for each candidate.
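
To make the idea of guessing concrete, here is one way such a guess could work: pick the column whose total is closest to the expected statewide total, and give up when nothing is close. This is an illustrative sketch only, not the actual `get_party_cols` logic; `guess_vote_column`, the `tolerance` parameter, and the column totals are all assumptions made for the example.

```python
def guess_vote_column(column_totals, expected_total, tolerance=0.05):
    """Pick the column whose total is closest to expected_total.

    Returns None when even the best candidate is off by more than `tolerance`
    (expressed as a fraction of expected_total).
    """
    best = min(column_totals, key=lambda col: abs(column_totals[col] - expected_total))
    relative_error = abs(column_totals[best] - expected_total) / expected_total
    return best if relative_error <= tolerance else None


# Made-up column totals; only the column names come from the Maryland file.
totals = {"G16DPRS": 1_000, "G16RPRS": 600, "G16PREOth": 40}
close_match = guess_vote_column(totals, expected_total=990)
no_match = guess_vote_column(totals, expected_total=5_000)
print(close_match, no_match)
```

When no column is close enough, supplying `d_col` and `r_col` by hand is the safer route.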

In [19]:
gdf = gpd.read_file('example-election-shapefiles/mggg-vt-2016')
gdf.head(2)

Unnamed: 0,STATEFP10,COUNTYFP10,COUSUBFP10,GEOID10,NAME10,NAMELSAD10,ALAND10,AWATER10,INTPTLAT10,INTPTLON10,...,TOTV14,PRES12D,PRES12R,PRES12L,TOTV12,SEN12B,SEN12R,USH12D,USH12R,geometry
0,50,5,62200,5000562200,St. Johnsbury,St. Johnsbury town,94250348,917562,44.4603077,-72.0049436,...,1960,1789,1081,29,2968,1909,583,1918,814,"POLYGON ((-71.99365 44.49649, -71.99262 44.496..."
1,50,5,64075,5000564075,Sheffield,Sheffield town,84217719,667553,44.6416305,-72.110075,...,182,182,93,4,289,200,66,190,62,"POLYGON ((-72.15832 44.60817, -72.15881 44.608..."


In [21]:
report_file_path = 'mggg-vermont-2016'
state_report, county_report_lst = verify.verify_state_2016(gdf, 'VT', 'MGGG', '2016',path=report_file_path)

Starting verification process for:  VT MGGG 2016
Candidate vote count columns are being assigned automatically
Please manually select the Democrat candidate votes column by index: 
[0] PRES16B e.g. 164
[1] PRES16L e.g. 125
[2] PRES16G e.g. 64
[3] N/A (no suitable match for Democrat candidate votes)


Select the column (by index):  3


Exception: Unable to find a suitable column

None of those looked right, so let's take a look at the column names and pick them manually. You may want to consult the README for this state if one was provided.

In [22]:
gdf.columns

Index(['STATEFP10', 'COUNTYFP10', 'COUSUBFP10', 'GEOID10', 'NAME10',
       'NAMELSAD10', 'ALAND10', 'AWATER10', 'INTPTLAT10', 'INTPTLON10',
       'TOTPOP', 'WHITE', 'BLACK', 'AMIN', 'ASIAN', 'NHPI', 'OTHER', '2MORE',
       'VAP', 'HISP', 'SENDIST', 'DISTNAME', 'PRES16D', 'PRES16R', 'PRES16B',
       'PRES16L', 'PRES16G', 'TOTV16', 'SEN16D', 'SEN16R', 'USH14D', 'USH14R',
       'TOTV14', 'PRES12D', 'PRES12R', 'PRES12L', 'TOTV12', 'SEN12B', 'SEN12R',
       'USH12D', 'USH12R', 'geometry'],
      dtype='object')

In [None]:
verify.verify_state_2016

In [25]:
d_col = 'PRES16D'
r_col = 'PRES16R'
state_report, county_report_lst = verify.verify_state_2016(gdf, 'VT', 'MGGG', d_col=d_col, r_col=r_col, path=report_file_path)

Starting verification process for:  VT MGGG 2016
Candidate vote count columns are being assigned manually
Choose d_col as:  PRES16D
Choose r_col as:  PRES16R
Verification will now begin with this GeoDataFrame: 

   d_col  r_col                                           geometry
0  1,469  1,110  POLYGON ((-71.99365 44.49649, -71.99262 44.496...
1    110    131  POLYGON ((-72.15832 44.60817, -72.15881 44.608...
2     58     22  POLYGON ((-72.18129 44.51200, -72.19186 44.515...
3    168    220  POLYGON ((-72.06885 44.63188, -72.07023 44.635...
4    246    168  POLYGON ((-72.22942 44.42690, -72.23075 44.427...
Starting Vote Verification
Starting Topology Verification
Starting County Verification
Missing GEOID Column - attempting automatic assignment
GEOID assignment successful
All done!



[Vermont Report](https://github.com/OpenPrecincts/verification/blob/master/reports/mggg-vermont-2016.md)

That's it! You may need to combine the methods used in Example #2 and Example #3 in some cases, but hopefully most states will work like Example #1. Happy Verifying :)

## Examples for Non-2016 Precinct-Level Election Shapefiles (General Usage)

This example uses the **Precinct-Level Election Shapefile** for Pennsylvania's 2018 general election, but it should be applicable to any state and any election year. That being said, if your **Precinct-Level Election Shapefile** is for the 2016 election, I recommend checking out the 2016 examples above to potentially save some time.

### Conform the Precinct-Level Election Shapefile to the Expected Schema 

GEOID is optional for `state_prec_gdf`, but strongly recommended. [Learn more...](https://github.com/OpenPrecincts/verification#geoid-county-assignment-for-each-precinct)

In [27]:
'''
`state_prec_gdf`:
| Column Name | dtype    | example                                           |
|-------------|----------|---------------------------------------------------|
| `d_col`     | int      | 5936                                              |
| `r_col`     | int      | 6395                                              |
| geometry    | geometry | POLYGON ((-71.99365 44.49649, -71.99262 44.496... |
| GEOID       | object   | '01001'                                           |
'''

gdf = gpd.read_file('example-election-shapefiles/pgp-pa-2018/')
# conform to the required input schema
gdf.rename(columns={'COUNTYFP':'GEOID'}, inplace=True)
gdf.head(2)

Unnamed: 0,"loc, prec",county_id,oldPrecNm,GEOID,editedPrec,VTDST,G18DemSen,G18RepSen,G18LibSen,G18GreSen,...,G18RepGov,G18LibGov,G18GreGov,G18IndGov,G18DemHOR,G18RepHOR,G18LibHOR,G18GreHOR,G18IndHOR,geometry
0,"Adams County, abbottstown",Adams County,ABBOTTSTOWN,42001,abbottstown,10,120.0,183.0,5.0,2.0,...,185.0,2.0,2.0,0.0,108.0,201.0,0.0,0.0,0.0,"POLYGON Z ((-76.99801 39.88359 0.00000, -76.99..."
1,"Adams County, arendtsville",Adams County,ARENDTSVILLE,42001,arendtsville,20,151.0,178.0,6.0,3.0,...,172.0,4.0,2.0,0.0,132.0,204.0,0.0,0.0,1.0,"POLYGON Z ((-77.31141 39.92625 0.00000, -77.30..."
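
Before running the script, it can help to check a few records against the schema above. The sketch below works on plain dictionaries rather than a GeoDataFrame; `check_precinct_record` is a hypothetical helper, and accepting whole-number floats such as `120.0` is an assumption based on the Pennsylvania preview, not a documented rule.

```python
def check_precinct_record(record, d_col, r_col):
    """Return a list of problems found in one precinct record (empty list means OK)."""
    problems = []
    for col in (d_col, r_col):
        value = record.get(col)
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        # Accept ints and whole-number floats; reject strings such as '1,469'.
        if not (is_number and float(value).is_integer()):
            problems.append(f"{col} is not a whole number: {value!r}")
    geoid = record.get("GEOID")  # GEOID is optional, so None is allowed
    if geoid is not None and not (isinstance(geoid, str) and len(geoid) == 5 and geoid.isdigit()):
        problems.append(f"GEOID is not a 5-digit string: {geoid!r}")
    if record.get("geometry") is None:
        problems.append("geometry is missing")
    return problems


good = {"G18DemSen": 120.0, "G18RepSen": 183.0, "GEOID": "42001", "geometry": "POLYGON (...)"}
bad = {"G18DemSen": "1,469", "G18RepSen": 183, "GEOID": 42001, "geometry": None}
good_problems = check_precinct_record(good, "G18DemSen", "G18RepSen")
bad_problems = check_precinct_record(bad, "G18DemSen", "G18RepSen")
print(good_problems)
print(bad_problems)
```

With a real GeoDataFrame, the records could come from something like `gdf.head().to_dict('records')`.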


### Acquire Official County-Level Election Results

In [31]:
import requests
import json

In [32]:
def get_race_df(race_name, county_to_candidate_dict):
    """Stack each county's list of candidate records into one DataFrame for a race."""
    cnty_lst = [pd.DataFrame.from_dict(candidate_lst)
                for candidate_lst in county_to_candidate_dict.values()]
    df = pd.concat(cnty_lst)
    df['office'] = race_name
    return df

In [33]:
url_senate_race = 'https://electionreturns.pa.gov/api/ElectionReturn/GetCountyBreak?officeId=2&districtId=1&methodName=GetCountyBreak&electionid=63&electiontype=G&isactive=0'
page_senate_race = requests.get(url_senate_race)
# The nested json.loads suggests the response body is a JSON string that itself contains JSON.
json_senate_race = json.loads(json.loads(page_senate_race.content))
county_to_senate_race_candidate_dict = json_senate_race['Election']['Statewide'][0]
assert len(county_to_senate_race_candidate_dict) == 67
df = get_race_df('U.S. Senate', county_to_senate_race_candidate_dict)
df = df.astype({'Votes':'int'}).rename(columns={'Votes':'votes'})
df.head()

Unnamed: 0,ID,ElectionYear,CountyName,PartyName,CandidateName,votes,YesVotes,NoVotes,Percentage,YesVotesPercent,NoVotesPercent,RunningMateName,Level,RankOrder,IsRetention,office
0,1.0,2018,ADAMS,DEM,"CASEY, ROBERT P JR",14880,0,0,38.05,0.0,0.0,,ADAMS,1,0,U.S. Senate
1,3.0,2018,ADAMS,REP,"BARLETTA, LOUIS J.",23419,0,0,59.89,0.0,0.0,,ADAMS,5,0,U.S. Senate
2,,2018,ADAMS,GRN,"GALE, NEAL TAYLOR",292,0,0,0.75,0.0,0.0,,ADAMS,14,0,U.S. Senate
3,,2018,ADAMS,LIB,"KERNS, DALE R JR",511,0,0,1.31,0.0,0.0,,ADAMS,17,0,U.S. Senate
0,1.0,2018,ALLEGHENY,DEM,"CASEY, ROBERT P JR",355907,0,0,65.7,0.0,0.0,,ALLEGHENY,1,0,U.S. Senate
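
The nested `json.loads` call and the flattening done by `get_race_df` can be reproduced on a tiny payload using only the standard library. This is a sketch under assumptions: the nesting mirrors the keys the cell above indexes into, the two Adams County rows reuse values from the preview, and storing `Votes` as a string is a guess based on the later `astype` call.

```python
import json

# Illustrative payload: Election -> Statewide -> [ {county: [candidate records]} ].
inner = {
    "Election": {
        "Statewide": [
            {
                "ADAMS": [
                    {"CountyName": "ADAMS", "PartyName": "DEM", "Votes": "14880"},
                    {"CountyName": "ADAMS", "PartyName": "REP", "Votes": "23419"},
                ]
            }
        ]
    }
}
body = json.dumps(json.dumps(inner))  # a JSON document wrapped in a JSON string

once = json.loads(body)   # first decode yields a str, not a dict
twice = json.loads(once)  # second decode yields the nested dict

county_to_candidates = twice["Election"]["Statewide"][0]
# Flatten county -> [records] into one list of records, as get_race_df does with DataFrames.
rows = [row for candidates in county_to_candidates.values() for row in candidates]
print(type(once).__name__, len(rows))
```

If a single `json.loads` already returns a dictionary for some other endpoint, the second call is unnecessary and would raise a `TypeError`.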


### Conform the County-Level Election Results to the Expected Schema

In [36]:
'''
`county_level_results_df`:
| Column Name | dtype  | example                    |
|-------------|--------|----------------------------|
| county      | object | 'Essex County'             |
| GEOID       | object | '01001'                    |
| party       | object | 'democrat' or 'republican' |
| votes       | int    | 5936                       |
'''

df['county'] = df['CountyName'].apply(lambda county_name: ' '.join([county_name.title(), 'County']))
df.loc[df['county']=='Mckean County','county'] = 'McKean County'
df['GEOID'] = df['county'].map(state_fip_to_county_to_geoid[42])
df['party'] = df['PartyName'].map({'DEM':'democrat','REP':'republican'})
df.head()

Unnamed: 0,ID,ElectionYear,CountyName,PartyName,CandidateName,votes,YesVotes,NoVotes,Percentage,YesVotesPercent,NoVotesPercent,RunningMateName,Level,RankOrder,IsRetention,office,county,GEOID,party
0,1.0,2018,ADAMS,DEM,"CASEY, ROBERT P JR",14880,0,0,38.05,0.0,0.0,,ADAMS,1,0,U.S. Senate,Adams County,42001,democrat
1,3.0,2018,ADAMS,REP,"BARLETTA, LOUIS J.",23419,0,0,59.89,0.0,0.0,,ADAMS,5,0,U.S. Senate,Adams County,42001,republican
2,,2018,ADAMS,GRN,"GALE, NEAL TAYLOR",292,0,0,0.75,0.0,0.0,,ADAMS,14,0,U.S. Senate,Adams County,42001,
3,,2018,ADAMS,LIB,"KERNS, DALE R JR",511,0,0,1.31,0.0,0.0,,ADAMS,17,0,U.S. Senate,Adams County,42001,
0,1.0,2018,ALLEGHENY,DEM,"CASEY, ROBERT P JR",355907,0,0,65.7,0.0,0.0,,ALLEGHENY,1,0,U.S. Senate,Allegheny County,42003,democrat
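
The McKean fix in the cell above is needed because `str.title()` lowercases every letter after the first in each word. The sketch below shows the same normalization in plain Python; `normalize_county`, `normalize_party`, and the two lookup dictionaries are hypothetical names used only for illustration.

```python
# Names that str.title() capitalizes incorrectly (taken from the cell above).
TITLE_CASE_FIXES = {"Mckean County": "McKean County"}
PARTY_LABELS = {"DEM": "democrat", "REP": "republican"}


def normalize_county(raw_name):
    """'ADAMS' -> 'Adams County', with manual fixes for names like McKean."""
    name = f"{raw_name.title()} County"
    return TITLE_CASE_FIXES.get(name, name)


def normalize_party(raw_party):
    """Map 'DEM'/'REP' to the schema labels; any other party becomes None."""
    return PARTY_LABELS.get(raw_party)


print(normalize_county("ADAMS"), "|", normalize_county("MCKEAN"))
print(normalize_party("DEM"), normalize_party("GRN"))
```

Other states may need additional entries in the fixes dictionary, since any name with internal capitals or particles is affected in the same way.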


Choose the office for which you want to verify the election results. The `county_level_results_df` DataFrame should only contain results for the `office` that's passed as an input.

In [37]:
OFFICE = 'U.S. Senate'
county_level_results_df = df[df.office == OFFICE][['county','GEOID','party', 'votes']].reset_index()
county_level_results_df.head()

Unnamed: 0,index,county,GEOID,party,votes
0,0,Adams County,42001,democrat,14880
1,1,Adams County,42001,republican,23419
2,2,Adams County,42001,,292
3,3,Adams County,42001,,511
4,0,Allegheny County,42003,democrat,355907
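
The preview above still contains rows for other parties, whose `party` value is missing. As a rough sanity check, the two-party total per county can be computed from the labelled rows alone. This sketch uses rows copied from the preview and assumes, based on report fields such as `n_two_party_votes_expected`, that only democrat and republican votes matter for that total.

```python
from collections import defaultdict

# Rows copied from the county_level_results_df preview above (None = other parties).
rows = [
    {"GEOID": "42001", "party": "democrat", "votes": 14880},
    {"GEOID": "42001", "party": "republican", "votes": 23419},
    {"GEOID": "42001", "party": None, "votes": 292},
    {"GEOID": "42001", "party": None, "votes": 511},
    {"GEOID": "42003", "party": "democrat", "votes": 355907},
]

two_party_totals = defaultdict(int)
for row in rows:
    # Skip third-party rows so they do not inflate the two-party total.
    if row["party"] in ("democrat", "republican"):
        two_party_totals[row["GEOID"]] += row["votes"]

print(dict(two_party_totals))
```

The Allegheny figure here is incomplete because only its first row appears in the preview.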


### Run the script!

In [40]:
state_report, county_report_lst = verify.verify_state(
    gdf,
    'PA',
    'Princeton Gerrymandering Project',
    '2018',
    county_level_results_df,
    OFFICE,
    d_col='G18DemSen',
    r_col='G18RepSen',
    path='open-precincts-pennsylvannia-2018.md',
)

Starting verification process for:  PA PGP 2018
Candidate vote count columns are being assigned manually
Choose d_col as:  G18DemSen
Choose r_col as:  G18RepSen
Verification will now begin with this GeoDataFrame: 

   d_col  r_col                                           geometry  GEOID
0  120.0  183.0  POLYGON Z ((-76.99801 39.88359 0.00000, -76.99...  42001
1  151.0  178.0  POLYGON Z ((-77.31141 39.92625 0.00000, -77.30...  42001
2   74.0  103.0  POLYGON Z ((-77.25596 39.98075 0.00000, -77.25...  42001
3  289.0  575.0  MULTIPOLYGON Z (((-77.02734 39.87105 0.00000, ...  42001
4  152.0  231.0  POLYGON Z ((-77.25594 39.93043 0.00000, -77.25...  42001
Starting Vote Verification
Starting Topology Verification
Starting County Verification
Using the GEOID Column in the original shapefile.
All done!

