## Alabama 2022 Primary Runoff Election Returns and Boundaries

### Sections
- <a href="#join">Read in Input Files</a><br>
- <a href="#shp">Modify Precinct Boundaries and Names</a><br>
- <a href="#maup">Join with Election Returns</a><br>
- <a href="#splits">Split Precincts</a><br>
- <a href="#check">Vote Total Checks</a><br>
- <a href="#exp">Export Cleaned Precinct-Level Datasets</a><br>

#### Sources
Note: This notebook was created after the 2022 Alabama precinct-level primary runoff election results file. That file is used as an input to this notebook, and its processing code can be found on the [RDH GitHub](https://github.com/nonpartisan-redistricting-datahub/pber_collection/tree/main/AL/2022/al_22_primary_runoff).
- [RDH Alabama 2022 Primary Runoff Election Results, Precinct Level](https://redistrictingdatahub.org/dataset/alabama-2022-primary-run-off-election-precinct-level-results/)
- [RDH Alabama 2022 General Election Results and Boundaries, Precinct Level](https://redistrictingdatahub.org/dataset/alabama-2022-general-election-precinct-level-results-and-boundaries/)
- [2021 Alabama Congressional Districts](https://redistrictingdatahub.org/dataset/2021-alabama-congressional-districts-adopted-plan/)
- [2021 Adopted State Senate Plan](https://redistrictingdatahub.org/dataset/2021-alabama-state-senate-adopted-plan/)
- [2021 Adopted State House Plan](https://redistrictingdatahub.org/dataset/2021-alabama-state-house-adopted-plan/)

In [1]:
import pandas as pd
import geopandas as gp
import os
import numpy as np
import re
from collections import Counter
from helper_functions_AL22 import *
pd.set_option('display.max_rows', 1000)
pd.set_option('display.max_columns', 1000)

In [2]:
# Temporarily ignore warning messages for splits function
import warnings

warnings.filterwarnings("ignore")

<p><a name="join"></a></p>

### Read in input files

In [3]:
# Shapefiles
# AL Congressional plan (adopted 2021)
al_cong = gp.read_file("./raw-from-source/al_cong_2021/2021 Alabama Congressional Plan_shape file.shp")
# AL State Senate plan (adopted 2021)
al_sldu = gp.read_file("./raw-from-source/al_sldu_2021/2021 Alabama Senate Plan_shape file.shp")
# AL State House plan (adopted 2021)
al_sldl = gp.read_file("./raw-from-source/al_sldl_2021/2021 Alabama House Plan_shape file.shp")

In [4]:
# Read in Primary Runoff Returns
al_prim_er_ro = pd.read_csv('./raw-from-source/al_2022_prim_runoff_prec/al_2022_prim_runoff_prec.csv')

In [5]:
#Set unique ID to include County name
al_prim_er_ro['UNIQUE_ID'] = al_prim_er_ro['county'] +'-:-'+ al_prim_er_ro['precinct']

In [6]:
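# Make one change to precincts in Cullman County
# combine FAIRVIEW FIRE DEPT and FAIRVIEW TOWN HALL into one precinct
# .sum() adds the vote columns but concatenates the text columns (county, precinct, UNIQUE_ID),
# so the text fields are reset below and UNIQUE_ID is rebuilt after the row is replaced
test = al_prim_er_ro[al_prim_er_ro['precinct'].isin(['FAIRVIEW FIRE DEPT_ A-K', 'FAIRVIEW TOWN HALL L-Z'])].sum()
# adjust county FIPS, county name, precinct name
test['COUNTYFP'] = '043'
test['county'] = 'Cullman'
test['precinct'] = 'FAIRVIEW FIRE DEPT TOWN HALL'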

In [7]:
replace_index = al_prim_er_ro[al_prim_er_ro['precinct'].isin(['FAIRVIEW FIRE DEPT_ A-K', 'FAIRVIEW TOWN HALL L-Z' ])].index[0]
drop_index = al_prim_er_ro[al_prim_er_ro['precinct'].isin(['FAIRVIEW FIRE DEPT_ A-K', 'FAIRVIEW TOWN HALL L-Z' ])].index[1]

In [8]:
# Replace the row
al_prim_er_ro.loc[replace_index] = test
# drop the extra row
al_prim_er_ro = al_prim_er_ro.drop(drop_index)
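
The cells above combine two rows by summing them, overwriting the text fields, and then replacing one row and dropping the other. As an alternative sketch (not the method used in this notebook), the same result can be reached by renaming both precincts to the combined name and grouping. The DataFrame, precinct names, and column names below are invented for illustration.

```python
import pandas as pd

# Toy data, not taken from the Alabama returns
df = pd.DataFrame({
    "county": ["Cullman", "Cullman", "Bibb"],
    "precinct": ["P1 A-K", "P1 L-Z", "P2"],
    "votes_a": [10, 5, 7],
    "votes_b": [3, 4, 1],
})

# Map every precinct that should be merged to its combined name
merge_map = {"P1 A-K": "P1", "P1 L-Z": "P1"}
df["precinct"] = df["precinct"].replace(merge_map)

# Rows that now share (county, precinct) collapse into one; vote columns are summed
combined = df.groupby(["county", "precinct"], as_index=False, sort=False).sum()
```

Because only the grouping keys are text, no string concatenation occurs and no fields need to be reset afterwards.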

In [9]:
#redo unique ID to account for name changes
al_prim_er_ro['UNIQUE_ID'] = al_prim_er_ro['county'] +'-:-'+ al_prim_er_ro['precinct']

In [10]:
al_prim_er_ro['UNIQUE_ID'].nunique() == len(al_prim_er_ro['UNIQUE_ID'])

True
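
The check above only reports whether the IDs are unique. If it ever returned `False`, the offending IDs would need to be located. A minimal sketch, assuming a toy DataFrame with the same `county` and `precinct` columns (the rows are invented for illustration):

```python
import pandas as pd

# Toy data, not taken from the Alabama returns
df = pd.DataFrame({
    "county": ["Autauga", "Autauga", "Bibb", "Bibb"],
    "precinct": ["A", "B", "A", "A"],
})
df["UNIQUE_ID"] = df["county"] + "-:-" + df["precinct"]

# Equivalent to nunique() == len() when the column has no nulls
ids_are_unique = df["UNIQUE_ID"].is_unique

# keep=False flags every row in a duplicated group, not just the repeats
dup_mask = df["UNIQUE_ID"].duplicated(keep=False)
duplicate_ids = sorted(df.loc[dup_mask, "UNIQUE_ID"].unique())
```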

<p><a name="shp"></a></p>

### Modify Precinct Boundaries and Names

In [11]:
# Read in shp from AL 2022 General election, from website
al_gen_shp = gp.read_file('./raw-from-source/al_gen_22_prec/al_gen_22_no_splits_prec.shp')
# subset to just columns of interest
al_gen_shp = al_gen_shp[['UNIQUE_ID', 'County', 'Precinct', 'geometry']]

In [12]:
#Check number of rows/precincts
print('Number of precincts in AL Gen:', al_gen_shp['UNIQUE_ID'].nunique())

Number of precincts in AL Gen: 1939


In [13]:
al_prim_er_ro[al_prim_er_ro['county'] == 'Mobile']['UNIQUE_ID'].to_csv('ermob.csv')

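Create the rename dictionary from [this Google Sheet](https://docs.google.com/spreadsheets/d/18m4T2KdcXOSXtWtYIOp0nCKrAgQCl9K1CvZAMEs1Z9Q/edit#gid=2102357482), which maps shapefile precinct IDs (`SHP`) to the IDs used in the runoff election returns (`ER_RO`).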

In [14]:
rename_df = pd.read_csv('./raw-from-source/al_precinct_counts_names - name_anoms.csv')

In [15]:
rename_df.head()

Unnamed: 0,SHP,ER_RO,Unnamed: 2,Unnamed: 3
0,Autauga-:-60 MARBURY MIDDLE SCHOOL,Autauga-:-60 MARBURY MIDDLE SCH,,
1,Baldwin-:-BROMLEY CROSSROADS VFD,Baldwin-:-BROMLEY CROOSROADS VFD,,
2,Baldwin-:-PT_CLEAR ST_ FRANCIS,Baldwin-:-PT_CLEAR ST FRANCIS,,
3,Baldwin-:-ST_ PAUL'S EPISCOPAL,Baldwin-:-ST_ PAUL'S EPISCOPAL CH,,
4,Bibb-:-EOLINE FIRE DEPT_,Bibb-:-EOLINE FIRE DEPT,,


In [16]:
#Change precinct names in shp to correctly match election returns. 
prec_dict = dict(zip(rename_df['SHP'], rename_df['ER_RO']))

In [17]:
al_gen_shp['UNIQUE_ID'] = al_gen_shp['UNIQUE_ID'].replace(prec_dict)

In [18]:
#Adjust precinct name field to account for name changes
al_gen_shp['Precinct'] = al_gen_shp['UNIQUE_ID'].apply(lambda x: x.split('-:-')[1])

In [19]:
#check before changing df
len(al_gen_shp.dissolve(by= 'UNIQUE_ID'))

1935

In [20]:
al_prim_runoff_shp = al_gen_shp.dissolve(by= 'UNIQUE_ID').reset_index()

In [21]:
al_prim_runoff_shp.head(2)

Unnamed: 0,UNIQUE_ID,geometry,County,Precinct
0,Autauga-:-10 JONES COMM_ CTR_,"POLYGON Z ((-86.92124 32.65708 0.00000, -86.92...",Autauga,10 JONES COMM_ CTR_
1,Autauga-:-100 TRINITY METHODIST,"POLYGON Z ((-86.45394 32.49318 0.00000, -86.45...",Autauga,100 TRINITY METHODIST


<p><a name="maup"></a></p>

### Join with Election Returns

In [22]:
# check that all unique_ids match up
set(al_prim_er_ro['UNIQUE_ID']) == set(al_prim_runoff_shp['UNIQUE_ID'])

False

In [23]:
# Look at further name anomalies
shp_names = sorted(list(set(al_prim_runoff_shp['UNIQUE_ID']) - set(al_prim_er_ro['UNIQUE_ID'])))
er_names = sorted(list(set(al_prim_er_ro['UNIQUE_ID']) - set(al_prim_runoff_shp['UNIQUE_ID'])))
# Remove shape with no votes
shp_names.remove('Coffee-:-Unassigned')
# Create Data Frame
name_anomalies = pd.DataFrame()
name_anomalies['shp'] = shp_names
name_anomalies['er'] = er_names

In [24]:
#visually inspect
name_anomalies

Unnamed: 0,shp,er


In [25]:
# List shapefile IDs that have no match in the election returns
sorted(set(al_prim_runoff_shp['UNIQUE_ID']) - set(al_prim_er_ro['UNIQUE_ID']))

['Coffee-:-Unassigned']

One geometry in the shapefile, `Coffee-:-Unassigned`, has no votes associated with it.
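
The comparison above is a pair of set differences, one in each direction. A minimal sketch using only the standard library (the function name `compare_ids` and the input lists are hypothetical, chosen for illustration):

```python
def compare_ids(shp_ids, er_ids):
    """Return (IDs only in the shapefile, IDs only in the election returns)."""
    shp_ids, er_ids = set(shp_ids), set(er_ids)
    return sorted(shp_ids - er_ids), sorted(er_ids - shp_ids)

# Toy inputs for illustration only
shp_only, er_only = compare_ids(
    ["Coffee-:-Unassigned", "Bibb-:-EOLINE FIRE DEPT"],
    ["Bibb-:-EOLINE FIRE DEPT"],
)
```

Keeping the two results as separate lists avoids the need for them to have equal length, which placing them side by side as DataFrame columns would require.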

In [26]:
# merge
al_prim_pber = al_prim_runoff_shp[['UNIQUE_ID', 'geometry']].merge(al_prim_er_ro, on='UNIQUE_ID', how='outer', indicator=True)

In [27]:
#check indicator to see if merge was successful
al_prim_pber._merge.value_counts()

both          1934
left_only        1
right_only       0
Name: _merge, dtype: int64
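
The `_merge` counts show how many rows matched on both sides and how many came from one side only. A minimal sketch with toy frames (values invented for illustration) shows how the unmatched rows can be pulled out by name. The `validate` argument is an optional extra not used in this notebook; it makes pandas raise an error if either side has repeated keys.

```python
import pandas as pd

# Toy stand-ins for the shapefile and the election returns
shapes = pd.DataFrame({"UNIQUE_ID": ["a", "b", "c"], "geometry": ["g1", "g2", "g3"]})
returns = pd.DataFrame({"UNIQUE_ID": ["a", "b"], "votes": [10, 20]})

merged = shapes.merge(
    returns,
    on="UNIQUE_ID",
    how="outer",
    indicator=True,         # adds the _merge column
    validate="one_to_one",  # raises MergeError on duplicate keys
)

counts = merged["_merge"].value_counts()
unmatched = merged.loc[merged["_merge"] != "both", "UNIQUE_ID"].tolist()
```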

In [28]:
al_prim_pber.head(2)

Unnamed: 0,UNIQUE_ID,geometry,COUNTYFP,county,precinct,R22USSRBRI,R22USSRBRO,R22GOVDFLO,R22GOVDFOR,R22SOSRALL,R22SOSRZEI,R22AUDRSOR,R22AUDRCOO,R22PS1RODE,R22PS1RWOO,R22PS2RBEE,R22PS2RMCC,RCON05RSTR,RCON05RWAR,RSU12RKEL,RSU12RDRA,RSU23DSTE,RSU23DSAN,RSL002RHAR,RSL002RBLA,RSL004RMOO,RSL004RJOH,RSL014RWAD,RSL014RFRE,RSL020RLOM,RSL020RTAY,RSL040RROB,RSL040RBOR,RSL055DPLU,RSL055DSCO,RSL056DTIL,RSL056DHUF,RSL057DSEL,RSL057DWIN,RSL100RSHI,RSL100RKUP,_merge
0,Autauga-:-10 JONES COMM_ CTR_,"POLYGON Z ((-86.92124 32.65708 0.00000, -86.92...",1,Autauga,10 JONES COMM_ CTR_,49.0,35.0,5.0,26.0,46.0,37.0,50.0,28.0,42.0,31.0,36.0,38.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,both
1,Autauga-:-100 TRINITY METHODIST,"POLYGON Z ((-86.45394 32.49318 0.00000, -86.45...",1,Autauga,100 TRINITY METHODIST,422.0,266.0,4.0,3.0,428.0,243.0,420.0,204.0,296.0,277.0,318.0,261.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,both


In [29]:
al_prim_pber.drop(labels = ['_merge'], axis = 1, inplace = True)

In [30]:
# rearrange columns
al_prim_pber = al_prim_pber[['UNIQUE_ID'] + al_prim_pber.columns[2:].to_list() + ['geometry']]

In [31]:
#Fill Nulls
al_prim_pber = al_prim_pber.fillna(0)

In [32]:
al_prim_pber['geometry'] = al_prim_pber.geometry.buffer(0)

<p><a name="splits"></a></p>

### Split Precincts

In [33]:
#Assign columns to datasets
unsplit_col_names = al_prim_pber.columns[4:-1].to_list()
cong_cols = [col for col in unsplit_col_names if col.startswith('RCON')]
sldu_cols = [col for col in unsplit_col_names if col.startswith('RSU')]
sldl_cols = [col for col in unsplit_col_names if col.startswith('RSL')]
st_cols = [col for col in unsplit_col_names if col not in cong_cols+sldu_cols+sldl_cols]

In [34]:
# set columns with votes as integer type
for item in unsplit_col_names:
    al_prim_pber[item] = al_prim_pber[item].astype(int)

In [35]:
#Check
len(unsplit_col_names) == len(cong_cols+sldu_cols+sldl_cols+st_cols)

True
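
The column assignment relies on name prefixes: `RCON` for congressional contests, `RSU` for State Senate, `RSL` for State House, and everything else statewide. As a sketch of the same idea in a single pass (the helper name `group_columns` and the level labels are assumptions, not part of `helper_functions_AL22`):

```python
# Prefix-to-level mapping taken from the column naming used in this notebook
PREFIX_TO_LEVEL = {"RCON": "cong", "RSU": "sldu", "RSL": "sldl"}

def group_columns(columns):
    """Place each vote column in exactly one group based on its prefix."""
    groups = {"cong": [], "sldu": [], "sldl": [], "st": []}
    for col in columns:
        # Fall back to the statewide group when no district prefix matches
        level = next(
            (lvl for prefix, lvl in PREFIX_TO_LEVEL.items() if col.startswith(prefix)),
            "st",
        )
        groups[level].append(col)
    return groups

groups = group_columns(["R22USSRBRI", "RCON05RSTR", "RSU12RKEL", "RSL002RHAR"])
```

Since every column is appended to exactly one list, the group sizes always sum to the number of input columns, which is the property the check above verifies.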

In [36]:
#Create datasets for splits
cong = al_prim_pber[['UNIQUE_ID', 'COUNTYFP', 'county', 'precinct'] + cong_cols + ['geometry']]
sldu = al_prim_pber[['UNIQUE_ID', 'COUNTYFP', 'county', 'precinct'] + sldu_cols + ['geometry']]
sldl = al_prim_pber[['UNIQUE_ID', 'COUNTYFP', 'county', 'precinct'] + sldl_cols + ['geometry']]
st = al_prim_pber[['UNIQUE_ID', 'COUNTYFP', 'county', 'precinct'] + st_cols + ['geometry']]

#### Identify split precincts

In [37]:
precinct_mapping_dict = {}
split_precincts_list = {}
for index,row in al_prim_pber.iterrows():
    precinct_list = []
    for contest in unsplit_col_names:
        if(row[contest]!=0) and ("RCON" in contest or "RSL" in contest or "RSU" in contest):
            precinct_info = get_level_dist(contest)
            if precinct_info not in precinct_list:
                precinct_list.append(precinct_info)
    is_split = is_split_precinct(precinct_list)
    if (is_split):
        split_precincts_list[row["UNIQUE_ID"]]=is_split
    precinct_mapping_dict[row["UNIQUE_ID"]]=precinct_list
    
cong_check_list = {i:contains_cong(precinct_mapping_dict[i]) for i in precinct_mapping_dict.keys()}
sldu_check_list = {i:contains_sldu(precinct_mapping_dict[i]) for i in precinct_mapping_dict.keys()}
sldl_check_list = {i:contains_sldl(precinct_mapping_dict[i]) for i in precinct_mapping_dict.keys()}

In [38]:
print(split_precincts_list)

{}
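
The helpers `get_level_dist` and `is_split_precinct` are imported from `helper_functions_AL22` and are not shown here. The sketch below is a guess at the kind of logic involved, not their actual implementation: it assumes the district number directly follows the `RCON`, `RSU`, or `RSL` prefix, and that a precinct counts as split when it has votes in more than one district at the same level. Both function names are hypothetical.

```python
import re
from collections import defaultdict

# R + level code + district digits, e.g. RSL002RHAR -> ("SL", "002")
CONTEST_RE = re.compile(r"^R(CON|SU|SL)(\d+)")

def parse_level_dist(contest):
    """Return (level, district) for a district contest column, else None."""
    match = CONTEST_RE.match(contest)
    return (match.group(1), match.group(2)) if match else None

def has_multiple_districts(level_dists):
    """True if any single level appears with more than one district."""
    districts = defaultdict(set)
    for level, dist in level_dists:
        districts[level].add(dist)
    return any(len(found) > 1 for found in districts.values())
```

For example, `parse_level_dist("RCON05RSTR")` gives `("CON", "05")`, while a statewide column such as `R22USSRBRI` gives `None`.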


<p><a name="check"></a></p>

### Vote Total Checks

In [39]:
#Statewide GDF
county_totals_check(al_prim_pber, "RDH raw", st, "join", st_cols, "COUNTYFP")

***Countywide Totals Check***

R22USSRBRI is equal across all counties
R22USSRBRO is equal across all counties
R22GOVDFLO is equal across all counties
R22GOVDFOR is equal across all counties
R22SOSRALL is equal across all counties
R22SOSRZEI is equal across all counties
R22AUDRSOR is equal across all counties
R22AUDRCOO is equal across all counties
R22PS1RODE is equal across all counties
R22PS1RWOO is equal across all counties
R22PS2RBEE is equal across all counties
R22PS2RMCC is equal across all counties


In [40]:
#congressional splits
county_totals_check(al_prim_pber, "RDH raw", cong, "join", cong_cols, "COUNTYFP")

***Countywide Totals Check***

RCON05RSTR is equal across all counties
RCON05RWAR is equal across all counties


In [41]:
#State house splits
county_totals_check(al_prim_pber, "RDH raw", sldl, "join", sldl_cols, "county")

***Countywide Totals Check***

RSL002RHAR is equal across all counties
RSL002RBLA is equal across all counties
RSL004RMOO is equal across all counties
RSL004RJOH is equal across all counties
RSL014RWAD is equal across all counties
RSL014RFRE is equal across all counties
RSL020RLOM is equal across all counties
RSL020RTAY is equal across all counties
RSL040RROB is equal across all counties
RSL040RBOR is equal across all counties
RSL055DPLU is equal across all counties
RSL055DSCO is equal across all counties
RSL056DTIL is equal across all counties
RSL056DHUF is equal across all counties
RSL057DSEL is equal across all counties
RSL057DWIN is equal across all counties
RSL100RSHI is equal across all counties
RSL100RKUP is equal across all counties


In [42]:
#State Senate Splits
county_totals_check(al_prim_pber, "RDH raw", sldu, "join", sldu_cols, "COUNTYFP")

***Countywide Totals Check***

RSU12RKEL is equal across all counties
RSU12RDRA is equal across all counties
RSU23DSTE is equal across all counties
RSU23DSAN is equal across all counties
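
`county_totals_check` also comes from `helper_functions_AL22`, so its internals are not visible in this notebook. The sketch below shows one way such a check could work, assuming it compares per-county sums of each vote column between two frames. The function name `mismatched_columns` and the toy data are hypothetical.

```python
import pandas as pd

def mismatched_columns(source, joined, vote_cols, county_col):
    """Return the vote columns whose county-level sums differ between two frames."""
    src = source.groupby(county_col)[vote_cols].sum()
    new = joined.groupby(county_col)[vote_cols].sum()
    # Align on the union of counties so a county missing from one side counts as 0
    src, new = src.align(new, join="outer", fill_value=0)
    return [col for col in vote_cols if not (src[col] == new[col]).all()]

# Toy frames for illustration only
raw = pd.DataFrame({"COUNTYFP": [1, 1, 3], "X": [1, 2, 3], "Y": [4, 5, 6]})
altered = raw.copy()
altered.loc[2, "Y"] = 7  # introduce a discrepancy in county 3

same = mismatched_columns(raw, raw, ["X", "Y"], "COUNTYFP")
diff = mismatched_columns(raw, altered, ["X", "Y"], "COUNTYFP")
```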


<p><a name="exp"></a></p>

### Export Cleaned Precinct-Level Datasets

Export joined datasets

In [44]:
# Create each output folder (and the parent folder) if it does not already exist
out_root = "./al_2022_prim_runoff_pber"
for suffix in ["no_splits", "st", "cong", "sldu", "sldl"]:
    os.makedirs(f"{out_root}/al_prim_runoff_22_{suffix}_prec", exist_ok=True)

al_prim_pber.to_file("./al_2022_prim_runoff_pber/al_prim_runoff_22_no_splits_prec/al_prim_runoff_22_no_splits_prec.shp")
st.to_file("./al_2022_prim_runoff_pber/al_prim_runoff_22_st_prec/al_prim_runoff_22_st_prec.shp")
cong.to_file("./al_2022_prim_runoff_pber/al_prim_runoff_22_cong_prec/al_prim_runoff_22_cong_prec.shp")
sldu.to_file("./al_2022_prim_runoff_pber/al_prim_runoff_22_sldu_prec/al_prim_runoff_22_sldu_prec.shp")
sldl.to_file("./al_2022_prim_runoff_pber/al_prim_runoff_22_sldl_prec/al_prim_runoff_22_sldl_prec.shp")
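
The shapefile format stores attributes in a dBASE table, which limits field names to 10 characters. The vote columns in this notebook are exactly 10 characters long, so they fit. A small pre-export check along these lines could catch longer names before they are altered on write. The helper name `overlong_fields` is hypothetical.

```python
MAX_FIELD_LEN = 10  # dBASE field-name limit that applies to shapefile attributes

def overlong_fields(columns, geometry_col="geometry"):
    """Return attribute column names that exceed the shapefile field-name limit."""
    return [c for c in columns if c != geometry_col and len(c) > MAX_FIELD_LEN]

# Column names taken from this notebook, plus one invented overlong name
ok = overlong_fields(["UNIQUE_ID", "COUNTYFP", "R22USSRBRI", "RSL002RHAR", "geometry"])
bad = overlong_fields(["UNIQUE_ID", "A_VERY_LONG_NAME", "geometry"])
```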