## Alabama 2022 Primary Election Returns and Boundaries

### Sections
- <a href="#join">Read in Input Files</a><br>
- <a href="#shp">Modify Precinct Boundaries and Names</a><br>
- <a href="#maup">Join with Election Returns</a><br>
- <a href="#splits">Split Precincts</a><br>
- <a href="#check">Vote Total Checks</a><br>
- <a href="#exp">Export Cleaned Precinct-Level Datasets</a><br>

#### Sources
Note: This notebook was created after the 2022 Alabama precinct-level election results file. That file is used as an input to this notebook, and its processing code can be found on the [RDH GitHub](https://github.com/nonpartisan-redistricting-datahub/pber_collection/tree/main/AL/2022/al_22_primary).
- [Precinct Boundary File Source](https://www.elections.alaska.gov/research/district-maps/)
- [RDH Alabama 2022 Primary Election Results, Precinct Level](https://redistrictingdatahub.org/dataset/alabama-2022-primary-election-precinct-level-results/)
- [RDH Alabama 2022 General Election Results and Boundaries, Precinct Level](https://redistrictingdatahub.org/dataset/alabama-2022-general-election-precinct-level-results-and-boundaries/)
- [2021 Alabama Congressional Districts](https://redistrictingdatahub.org/dataset/2021-alabama-congressional-districts-adopted-plan/)
- [2021 Adopted State Senate Plan](https://redistrictingdatahub.org/dataset/2021-alabama-state-senate-adopted-plan/)
- [2021 Adopted State House Plan](https://redistrictingdatahub.org/dataset/2021-alabama-state-house-adopted-plan/)

In [1]:
import pandas as pd
import geopandas as gp
import os
import numpy as np
import re
from collections import Counter
from helper_functions_AL22 import *
pd.set_option('display.max_rows', 1000)
pd.set_option('display.max_columns', 1000)

In [2]:
# Temporarily ignore warning messages for splits function
import warnings

warnings.filterwarnings("ignore")

<p><a name="join"></a></p>

### Read in input files

In [3]:
# Shapefiles
# 2021 adopted AL Congressional plan
al_cong = gp.read_file("./raw-from-source/al_cong_2021/2021 Alabama Congressional Plan_shape file.shp")
# 2021 adopted AL State Senate plan
al_sldu = gp.read_file("./raw-from-source/al_sldu_2021/2021 Alabama Senate Plan_shape file.shp")
# 2021 adopted AL State House plan
al_sldl = gp.read_file("./raw-from-source/al_sldl_2021/2021 Alabama House Plan_shape file.shp")

In [4]:
# Read in Election Returns
al_prim_er = pd.read_csv('./raw-from-source/al_2022_prim_prec/al_2022_prim_prec.csv')

In [5]:
#Set unique ID to include County name
al_prim_er['UNIQUE_ID'] = al_prim_er['county'] +'-:-'+ al_prim_er['precinct']

In [6]:
# Make one change to precincts in Cullman County
# combine FAIRVIEW FIRE DEPT and FAIRVIEW TOWN HALL into one precinct
test = al_prim_er[al_prim_er['precinct'].isin(['FAIRVIEW FIRE DEPT_ A-K', 'FAIRVIEW TOWN HALL L-Z' ])].sum()
# adjust county name, precinct name
test['COUNTYFP'] = '043'
test['county'] = 'Cullman'
test['precinct'] = 'FAIRVIEW FIRE DEPT TOWN HALL'

In [7]:
replace_index = al_prim_er[al_prim_er['precinct'].isin(['FAIRVIEW FIRE DEPT_ A-K', 'FAIRVIEW TOWN HALL L-Z' ])].index[0]
drop_index = al_prim_er[al_prim_er['precinct'].isin(['FAIRVIEW FIRE DEPT_ A-K', 'FAIRVIEW TOWN HALL L-Z' ])].index[1]

In [8]:
# Replace the row
al_prim_er.loc[replace_index] = test
# drop the extra row
al_prim_er = al_prim_er.drop(drop_index)
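
The cells above sum the two Fairview rows and then overwrite the identifying fields by hand. A minimal sketch of the same idea on plain dictionaries, summing only the vote columns; the vote counts and the `combine_rows` helper are hypothetical and not part of the notebook:

```python
def combine_rows(rows, vote_cols, new_fields):
    """Sum the vote columns of several precinct rows into one combined row."""
    combined = {col: sum(row[col] for row in rows) for col in vote_cols}
    # Identifying text fields are set explicitly rather than summed
    combined.update(new_fields)
    return combined

# Hypothetical vote counts, for illustration only
rows = [
    {"county": "Cullman", "precinct": "FAIRVIEW FIRE DEPT_ A-K", "P22GOVRIVE": 120, "P22GOVDFLO": 15},
    {"county": "Cullman", "precinct": "FAIRVIEW TOWN HALL L-Z", "P22GOVRIVE": 98, "P22GOVDFLO": 11},
]
merged = combine_rows(
    rows,
    vote_cols=["P22GOVRIVE", "P22GOVDFLO"],
    new_fields={"county": "Cullman", "precinct": "FAIRVIEW FIRE DEPT TOWN HALL"},
)
print(merged)
```

Restricting the sum to the vote columns avoids having to repair the text fields afterwards.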

In [9]:
#redo unique ID to account for name changes
al_prim_er['UNIQUE_ID'] = al_prim_er['county'] +'-:-'+ al_prim_er['precinct']

In [10]:
al_prim_er['UNIQUE_ID'].nunique() == len(al_prim_er['UNIQUE_ID'])

True
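
If this check ever returns `False`, it helps to see which IDs are repeated. A small sketch using `Counter` (already imported at the top of the notebook); the `duplicated_ids` helper and the sample records are hypothetical:

```python
from collections import Counter

def duplicated_ids(ids):
    """Return the IDs that occur more than once, with their counts."""
    return {uid: n for uid, n in Counter(ids).items() if n > 1}

# Build IDs the same way as the notebook: county + '-:-' + precinct
records = [
    ("Autauga", "10 JONES COMM_ CTR_"),
    ("Autauga", "100 TRINITY METHODIST"),
    ("Bibb", "EOLINE FIRE DEPT"),
    ("Autauga", "10 JONES COMM_ CTR_"),  # deliberate duplicate for the example
]
ids = [county + "-:-" + precinct for county, precinct in records]
print(duplicated_ids(ids))
```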

<p><a name="shp"></a></p>

### Modify Precinct Boundaries and Names

In [11]:
# Read in shp from the AL 2022 General election (see Sources)
al_gen_shp = gp.read_file('./raw-from-source/al_gen_22_prec/al_gen_22_no_splits_prec.shp')
# subset to just columns of interest
al_gen_shp = al_gen_shp[['UNIQUE_ID', 'County', 'Precinct', 'geometry']]

In [12]:
#Check number of rows/precincts
print('Number of precincts in AL Gen:', al_gen_shp['UNIQUE_ID'].nunique())

Number of precincts in AL Gen: 1939


Create a rename dictionary from [this Google Sheet](https://docs.google.com/spreadsheets/d/18m4T2KdcXOSXtWtYIOp0nCKrAgQCl9K1CvZAMEs1Z9Q/edit#gid=1454399295).

In [13]:
rename_df = pd.read_csv('./raw-from-source/al_precinct_counts_names - rename_dict.csv')

In [14]:
# Adjust format to match 'UNIQUE_ID' field
rename_df['al_gen'] = rename_df['county'] +'-:-'+ rename_df['al_gen']
rename_df['al_prim'] = rename_df['county'] +'-:-'+ rename_df['al_prim']
rename_df.head(3)

Unnamed: 0,county,al_gen,al_prim
0,Jefferson,Jefferson-:-PREC 2090 - HOLY FAMILY CRISTO,Jefferson-:-PREC 2090 - SIXTH AVENUE BAPTI
1,Jefferson,Jefferson-:-PREC 3010 - HOOVER MET BASEBAL,Jefferson-:-PREC 3010/3015 - HUNTER STREET
2,Jefferson,Jefferson-:-PREC 3015 - HOOVER MET SPORTS,Jefferson-:-PREC 3010/3015 - HUNTER STREET


In [15]:
#Change precinct names in shp to correctly match election returns. 
prec_dict = dict(zip(rename_df['al_gen'], rename_df['al_prim']))

In [16]:
al_gen_shp['UNIQUE_ID'] = al_gen_shp['UNIQUE_ID'].replace(prec_dict)

In [17]:
#Adjust precinct name field to account for name changes
al_gen_shp['Precinct'] = al_gen_shp['UNIQUE_ID'].apply(lambda x: x.split('-:-')[1])
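
Because several general-election IDs can map to the same primary ID (as with PREC 3010 and PREC 3015 in the `rename_df` preview), the number of distinct IDs drops after the rename. A small sketch of the rename and name-recovery steps on plain strings, using only values shown in that preview plus one unmapped ID:

```python
# Mapping taken from the first rows of rename_df shown above
prec_dict = {
    "Jefferson-:-PREC 3010 - HOOVER MET BASEBAL": "Jefferson-:-PREC 3010/3015 - HUNTER STREET",
    "Jefferson-:-PREC 3015 - HOOVER MET SPORTS": "Jefferson-:-PREC 3010/3015 - HUNTER STREET",
}

gen_ids = [
    "Jefferson-:-PREC 3010 - HOOVER MET BASEBAL",
    "Jefferson-:-PREC 3015 - HOOVER MET SPORTS",
    "Autauga-:-100 TRINITY METHODIST",  # not in the mapping, so left unchanged
]

# Whole-value replacement: unmapped IDs pass through untouched
prim_ids = [prec_dict.get(uid, uid) for uid in gen_ids]

# Recover the precinct name from the ID, as in the cell above
precincts = [uid.split("-:-")[1] for uid in prim_ids]

print(len(set(gen_ids)), "->", len(set(prim_ids)))
print(precincts)
```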

In [18]:
#check before changing df
len(al_gen_shp.dissolve(by= 'UNIQUE_ID'))

1935

In [19]:
al_prim_shp = al_gen_shp.dissolve(by= 'UNIQUE_ID').reset_index()

In [20]:
al_prim_shp.head(2)

Unnamed: 0,UNIQUE_ID,geometry,County,Precinct
0,Autauga-:-10 JONES COMM_ CTR_,"POLYGON Z ((-86.92124 32.65708 0.00000, -86.92...",Autauga,10 JONES COMM_ CTR_
1,Autauga-:-100 TRINITY METHODIST,"POLYGON Z ((-86.45394 32.49318 0.00000, -86.45...",Autauga,100 TRINITY METHODIST


<p><a name="maup"></a></p>

### Join with Election Results

In [21]:
# check that all unique_ids match up
set(al_prim_er['UNIQUE_ID']) == set(al_prim_shp['UNIQUE_ID'])

False

In [22]:
# Look at further name anomalies
shp_names = sorted(list(set(al_prim_shp['UNIQUE_ID']) - set(al_prim_er['UNIQUE_ID'])))
er_names = sorted(list(set(al_prim_er['UNIQUE_ID']) - set(al_prim_shp['UNIQUE_ID'])))
# Remove shape with no votes
shp_names.remove('Coffee-:-Unassigned')
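# Create Data Frame
# Pairs the two sorted lists by position, so both must have the same length and sort into the same order
name_anomalies = pd.DataFrame()
name_anomalies['shp'] = shp_names
name_anomalies['er'] = er_names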

In [23]:
#visually inspect
name_anomalies

Unnamed: 0,shp,er
0,Autauga-:-60 MARBURY MIDDLE SCHOOL,Autauga-:-60 MARBURY MIDDLE SCH
1,Baldwin-:-BROMLEY CROSSROADS VFD,Baldwin-:-BROMLEY CROOSROADS VFD
2,Baldwin-:-PT_CLEAR ST_ FRANCIS,Baldwin-:-PT_CLEAR ST FRANCIS
3,Baldwin-:-ST_ PAUL'S EPISCOPAL,Baldwin-:-ST_ PAUL'S EPISCOPAL CH
4,Bibb-:-EOLINE FIRE DEPT_,Bibb-:-EOLINE FIRE DEPT
5,Bibb-:-GREENPOND FIRE DEPT_,Bibb-:-GREENPOND FIRE DEPT
6,Bibb-:-LAWLEY COMM_ CTR_,Bibb-:-LAWLEY COMM CTR
7,Bibb-:-SIX MILE COMM_ CTR_,Bibb-:-SIX MILE COMM CTR
8,Blount-:-STRAIGHT MOUNTAIN,Blount-:-ST_ MOUNTAIN
9,Bullock-:-PEROTE VOTING BUILDING,Bullock-:-PEROTE VOTING BLDG_


The precinct names are not meaningfully different, so match the shapefile names to the precinct names in the election returns.

In [24]:
# create second rename dict
uid_dict = dict(zip(name_anomalies['shp'], name_anomalies['er']))

In [25]:
# apply second rename dict
al_prim_shp['UNIQUE_ID'] = al_prim_shp['UNIQUE_ID'].replace(uid_dict)

#Adjust precinct name field to account for name changes
al_prim_shp['Precinct'] = al_prim_shp['UNIQUE_ID'].apply(lambda x: x.split('-:-')[1])
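
The second rename dictionary pairs the two sorted lists by position, which only works while both lists sort into the same order. As a hedged alternative sketch (not what the notebook does), names can be paired by similarity within each county using the standard library's `difflib`; the `pair_names` helper and the `cutoff` value are assumptions, and the sample IDs are taken from the anomaly table above:

```python
import difflib

def pair_names(shp_names, er_names, cutoff=0.6):
    """Pair each shapefile ID with the closest election-returns ID in the same county."""
    pairs = {}
    for shp in shp_names:
        county = shp.split("-:-")[0]
        candidates = [er for er in er_names if er.split("-:-")[0] == county]
        match = difflib.get_close_matches(shp, candidates, n=1, cutoff=cutoff)
        # Leave the ID unpaired (None) when nothing is similar enough, so it can be reviewed by hand
        pairs[shp] = match[0] if match else None
    return pairs

shp_names = ["Bibb-:-EOLINE FIRE DEPT_", "Bibb-:-LAWLEY COMM_ CTR_", "Blount-:-STRAIGHT MOUNTAIN"]
er_names = ["Bibb-:-LAWLEY COMM CTR", "Bibb-:-EOLINE FIRE DEPT", "Blount-:-ST_ MOUNTAIN"]
for shp, er in pair_names(shp_names, er_names).items():
    print(shp, "->", er)
```

Any pairing produced this way should still be inspected visually before it is applied.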

In [26]:
# check that all unique_ids match up

#set(al_prim_er['UNIQUE_ID']) == set(al_prim_shp['UNIQUE_ID'])
#sorted(list(set(al_prim_er['UNIQUE_ID']) - set(al_prim_shp['UNIQUE_ID'])))
sorted((list(set(al_prim_shp['UNIQUE_ID']) - set(al_prim_er['UNIQUE_ID']))))

['Coffee-:-Unassigned']

The shapefile contains one geometry (`Coffee-:-Unassigned`) with no associated votes.

In [27]:
# merge
al_prim_pber = al_prim_shp[['UNIQUE_ID', 'geometry']].merge(al_prim_er, on='UNIQUE_ID', how='outer', indicator=True)

In [28]:
#check indicator to see if merge was successful
al_prim_pber._merge.value_counts()

both          1934
left_only        1
right_only       0
Name: _merge, dtype: int64
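
When the join key is unique on both sides, the `_merge` counts can be cross-checked directly from the two sets of IDs: `both` is the intersection, `left_only` is what only the shapefile has, and `right_only` is what only the election returns have. A minimal sketch, with a hypothetical `merge_counts` helper and a three-row sample:

```python
def merge_counts(left_ids, right_ids):
    """Count the categories an outer merge with indicator=True would report for unique keys."""
    left, right = set(left_ids), set(right_ids)
    return {
        "both": len(left & right),
        "left_only": len(left - right),
        "right_only": len(right - left),
    }

shp_ids = ["Autauga-:-10 JONES COMM_ CTR_", "Autauga-:-100 TRINITY METHODIST", "Coffee-:-Unassigned"]
er_ids = ["Autauga-:-10 JONES COMM_ CTR_", "Autauga-:-100 TRINITY METHODIST"]
print(merge_counts(shp_ids, er_ids))
```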

In [29]:
al_prim_pber.head(2)

Unnamed: 0,UNIQUE_ID,geometry,COUNTYFP,county,precinct,P22USSDBOY,P22USSRBOD,P22USSDDEA,P22USSRBRI,P22USSDJAC,P22USSRBRO,P22USSRDUP,P22USSRDUR,P22USSRSCH,PCON02DHAR,PCON02DPAT,P22GOVDFLO,P22GOVRBLA,P22GOVDFOR,P22GOVRBUR,P22GOVDJAM,P22GOVRGEO,P22GOVDKEN,P22GOVRIVE,P22GOVDMAR,P22GOVRJAM,P22GOVDSMI,P22GOVRJON,P22GOVRODL,P22GOVRTHO,P22GOVRYOU,P22ATGRMAR,P22ATGRSTI,PSSC5RCOO,PSSC5RJON,P22SOSRALL,P22SOSRHOR,P22SOSRPAC,P22SOSRZEI,P22AUDRCOO,P22AUDRGLO,P22AUDRSOR,P22PS1RHAM,P22PS1RMCL,P22PS1RODE,P22PS1RWOO,P22PS2RBEE,P22PS2RLIT,P22PS2RMCC,PSL088RDIS,PSL088RSTA,G22A01YES,G22A01NO,PSU22RALB,PSU22RSEX,PSL064RFER,PSL064RGIV,PSL065RCAM,PSL065REAS,PSL094RFAU,PSL094RFID,PSL095RHOL,PSL095RLUD,PSL095RPUL,PSL096RDUG,PSL096RSIM,PBOE02RBAL,PBOE02RWES,PSU28DBEA,PSU28DLEE,PSL049RBED,PSL049RHAR,PSL072DHOW,PSL072DTRA,PBOE06RMAN,PBOE06RYOT,PSU17RDUN,PSU17RSHE,PSU23DMEL,PSU23DSAN,PSU23DSPE,PSU23DSTE,PCON03RJOI,PCON03RROG,PSU12RDRA,PSU12RKEL,PSU12RWIL,PSL029RGID,PSL029RGRA,PSL040RBLA,PSL040RBOR,PSL040REXU,PSL040RLES,PSL040RMCA,PSL040RROB,PSL040RWIL,PSU13RCOK,PSU13RPRI,PSL038RMES,PSL038RWOO,PSL039RRHO,PSL039RSHA,PSU31RCAR,PSU31RHOR,PSU31RJON,PSL091RHOG,PSL091RMAR,PSL092RHAM,PSL092RWHI,PCON04DGOR,PCON04DNEI,PSL003DBEN,PSL003RJOL,PSL003DTHO,PSL003RUND,PSL007RROB,PSL007RYAR,PSL014RFRA,PSL014RFRE,PSL014RWAD,PSL067DCHE,PSL067DPET,PBOE08RDAV,PBOE08RREY,PSL024RLED,PSL024RSTO,PSL031RSMI,PSL031RSTU,PSL028RBUT,PSL028RISB,PSL087RJOH,PSL087RSOR,PCON05DTHO,PCON05RBLA,PCON05DWAR,PCON05RROB,PCON05RSAN,PCON05RSTR,PCON05RWAR,PCON05RWRI,PSL023RHAN,PSL023RKIR,PSU15RCHR,PSU15RROB,PSU19DALE,PSU19DCOL,PSU20DCOL,PSU20DHUN,PSL015RHUL,PSL015RTOM,PSL045RDRA,PSL045RDUB,PSL047DCOL,PSL047DTOO,PSL048RCAR,PSL048RWEN,PSL052DMIL,PSL052DROG,PSL054DBLA,PSL054DMAD,PSL054DRAF,PSL055DHEN,PSL055DODE,PSL055DPLU,PSL055DSCO,PSL055DWOM,PSL056DHUF,PSL056DKIN,PSL056DMAT,PSL056DTIL,PSL057DDUN,PSL057DSEL,PSL057DWIN,PSL060DGIV,PSL060DTAY,PSU01RMEL,PSU01RSUT,PSL001RMCC,PSL001RPET,PSL002RBLA,PSL002RBUT,PSL002RHAR,PSL002RIRE,PSU27RHOV,PSU27RWHA,PSL082DJOH,PSL082DWAR,PSU02RBUT,PSU02RHOL,PSL004RBAN,PSL004RJOH,PSL004RMOO,PSL025RCLE,PSL025RRIG,PSL020RBRO,PSL020RLOM,PSL020RMCC,PSL020RTAY,PSL026RCOL,PSL026RHOL,PSL026RMIT,PSL099DJON,PSL099DWRI,PSL100RKUP,PSL100RPIG,PSL100RSHI,PSL074DCAL,PSL074DENS,PSL061RBOL,PSL061RMAD,PSU11RBEL,PSU11RWRI,PSL013RBAR,PSL013RDAV,PSL013RDOZ,PSL013RWAI,PSL013RWOO,_merge
0,Autauga-:-10 JONES COMM_ CTR_,"POLYGON Z ((-86.92124 32.65708 0.00000, -86.92...",1,Autauga,10 JONES COMM_ CTR_,29.0,1.0,6.0,35.0,14.0,31.0,4.0,53.0,1.0,46.0,3.0,7.0,25.0,40.0,4.0,2.0,0.0,3.0,66.0,1.0,29.0,2.0,1.0,3.0,0.0,0.0,110.0,7.0,66.0,47.0,46.0,12.0,5.0,56.0,41.0,23.0,48.0,19.0,26.0,28.0,29.0,46.0,22.0,34.0,0.0,0.0,118.0,55.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,both
1,Autauga-:-100 TRINITY METHODIST,"POLYGON Z ((-86.45394 32.49318 0.00000, -86.45...",1,Autauga,100 TRINITY METHODIST,27.0,18.0,11.0,390.0,9.0,286.0,13.0,287.0,9.0,36.0,11.0,22.0,187.0,6.0,34.0,5.0,2.0,6.0,567.0,5.0,190.0,3.0,7.0,25.0,8.0,2.0,865.0,68.0,429.0,457.0,344.0,78.0,96.0,365.0,206.0,260.0,366.0,244.0,79.0,251.0,166.0,330.0,188.0,231.0,349.0,640.0,749.0,248.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,both


In [30]:
al_prim_pber.drop(labels = ['_merge'], axis = 1, inplace = True)

In [31]:
# rearrange columns
al_prim_pber = al_prim_pber[['UNIQUE_ID'] + al_prim_pber.columns[2:].to_list() + ['geometry']]

In [32]:
#Fill Nulls
al_prim_pber = al_prim_pber.fillna(0)

In [33]:
al_prim_pber['geometry'] = al_prim_pber.geometry.buffer(0)

<p><a name="splits"></a></p>

### Split Precincts

In [34]:
#Assign columns to datasets
unsplit_col_names = al_prim_pber.columns[4:-1].to_list()
cong_cols = [col for col in unsplit_col_names if col.startswith('PCON')]
sldu_cols = [col for col in unsplit_col_names if col.startswith('PSU')]
sldl_cols = [col for col in unsplit_col_names if col.startswith('PSL')]
st_cols = [col for col in unsplit_col_names if col not in cong_cols+sldu_cols+sldl_cols]

In [35]:
# set columns with votes as integer type
for item in unsplit_col_names:
    al_prim_pber[item] = al_prim_pber[item].astype(int)

In [36]:
#Check
len(unsplit_col_names) == len(cong_cols+sldu_cols+sldl_cols+st_cols)

True
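
The grouping above depends only on each column's prefix. A short sketch of the same rule on a handful of column names from the merged table; the `group_columns` helper is hypothetical. Note that `PSSC5RCOO` starts with `PS` but not with `PSU` or `PSL`, so it falls into the statewide group:

```python
def group_columns(cols):
    """Split vote columns into congressional, state senate, state house, and statewide groups."""
    groups = {"cong": [], "sldu": [], "sldl": [], "st": []}
    for col in cols:
        if col.startswith("PCON"):
            groups["cong"].append(col)
        elif col.startswith("PSU"):
            groups["sldu"].append(col)
        elif col.startswith("PSL"):
            groups["sldl"].append(col)
        else:
            groups["st"].append(col)
    return groups

cols = ["P22USSDBOY", "PCON02DHAR", "PSSC5RCOO", "PSU22RALB", "PSL088RDIS", "G22A01YES"]
groups = group_columns(cols)
print(groups)
```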

In [37]:
#Create datasets for splits (.copy() so columns added later do not write into a slice of al_prim_pber)
cong = al_prim_pber[['UNIQUE_ID', 'COUNTYFP', 'county', 'precinct'] + cong_cols + ['geometry']].copy()
sldu = al_prim_pber[['UNIQUE_ID', 'COUNTYFP', 'county', 'precinct'] + sldu_cols + ['geometry']].copy()
sldl = al_prim_pber[['UNIQUE_ID', 'COUNTYFP', 'county', 'precinct'] + sldl_cols + ['geometry']].copy()
st = al_prim_pber[['UNIQUE_ID', 'COUNTYFP', 'county', 'precinct'] + st_cols + ['geometry']].copy()

#### Identify split precincts

#### Congressional Districts

In [38]:
races = [i for i in list(cong.columns) if "PCON" in i]
precinct_mapping_dict = {}
split_precincts_list = {}
for index,row in cong.iterrows():
    precinct_list = []
    for contest in races:
        if(row[contest]!=0):
            precinct_info = get_level_dist(contest)
            if precinct_info not in precinct_list:
                precinct_list.append(get_level_dist(contest))
    is_split = is_split_precinct(precinct_list)
    if (is_split):
        split_precincts_list[row["UNIQUE_ID"]]=is_split
    precinct_mapping_dict[row["UNIQUE_ID"]]=precinct_list
    
#Standardize district format with name different from PBER to use later and distinguish when splits occur
al_cong['HOUSE_DIST']=al_cong["DISTRICT"].astype(str).str.zfill(2)
cong["CONG_DIST"] = cong.apply(lambda row: return_cd(row, races), axis = 1)

split_precincts_list

{'Lauderdale-:-CHRIST CHAPEL CH': {'CON': ['04', '05']},
 'Lauderdale-:-CLOVERDALE COMM_ CTR_': {'CON': ['04', '05']},
 'Lauderdale-:-FLORENCE BLVD CH OF CHRIST': {'CON': ['04', '05']},
 'Lauderdale-:-HIGHLAND BAPT_ CH': {'CON': ['04', '05']},
 'Lauderdale-:-KILLEN CH OF CHRIST': {'CON': ['04', '05']}}
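
The helpers `get_level_dist` and `is_split_precinct` come from `helper_functions_AL22` and are not shown in this notebook. As a rough sketch of the idea only (not the helpers' actual implementation), a precinct can be flagged as split when it records nonzero votes in contests for more than one district. The regular expression and function names below are assumptions based on column names such as `PCON04DGOR`, and the vote counts are made up:

```python
import re

# Assumed column layout: 'P' + level (CON/SU/SL) + district digits + party letter + candidate code
CONTEST_RE = re.compile(r"^P(CON|SU|SL)(\d+)([A-Z])([A-Z]+)$")

def parse_district(col):
    """Return (level, district) for a district-contest column, or None for other columns."""
    m = CONTEST_RE.match(col)
    return (m.group(1), m.group(2)) if m else None

def districts_with_votes(row):
    """List the distinct (level, district) pairs in which the precinct recorded votes."""
    found = []
    for col, votes in row.items():
        info = parse_district(col)
        if info and votes != 0 and info not in found:
            found.append(info)
    return found

# Hypothetical vote counts for one precinct
row = {"PCON04DGOR": 12, "PCON04DNEI": 7, "PCON05RWRI": 31, "PCON02DHAR": 0}
dists = districts_with_votes(row)
print(dists, "split" if len(dists) > 1 else "not split")
```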

In [39]:
#Create split GDF
cong_w_splits = cong.copy()
for val in cong["UNIQUE_ID"]:
    cd_list = []
    if val in split_precincts_list.keys():
        cong_w_splits = district_splits_mod(split_precincts_list[val],"CONG",val, cong_w_splits, al_cong, "UNIQUE_ID", "HOUSE_DIST", races, "CONG_DIST")

In [40]:
#check
set(cong_w_splits['UNIQUE_ID']) - set(cong['UNIQUE_ID'])

{'Lauderdale-:-CHRIST CHAPEL CH-(CONG-04)',
 'Lauderdale-:-CHRIST CHAPEL CH-(CONG-05)',
 'Lauderdale-:-CLOVERDALE COMM_ CTR_-(CONG-04)',
 'Lauderdale-:-CLOVERDALE COMM_ CTR_-(CONG-05)',
 'Lauderdale-:-FLORENCE BLVD CH OF CHRIST-(CONG-04)',
 'Lauderdale-:-FLORENCE BLVD CH OF CHRIST-(CONG-05)',
 'Lauderdale-:-HIGHLAND BAPT_ CH-(CONG-04)',
 'Lauderdale-:-HIGHLAND BAPT_ CH-(CONG-05)',
 'Lauderdale-:-KILLEN CH OF CHRIST-(CONG-04)',
 'Lauderdale-:-KILLEN CH OF CHRIST-(CONG-05)'}
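
Each new ID appends the plan and district to the original precinct ID. `district_splits_mod` is part of the helper module, and its vote-allocation logic is not shown here. The sketch below illustrates the naming pattern plus one plausible allocation, in which each piece keeps only the contests of its own district. That allocation rule, the `split_piece` helper, and the vote counts are all assumptions:

```python
def split_piece(unique_id, level, district, row, contest_district):
    """Build one split piece: a suffixed ID plus only the votes cast in that district's contests."""
    piece = {"UNIQUE_ID": f"{unique_id}-({level}-{district})"}
    for col, votes in row.items():
        # Assumption: contests from other districts are zeroed out on this piece
        piece[col] = votes if contest_district[col] == district else 0
    return piece

# Hypothetical vote counts; contest_district maps each column to its district code
row = {"PCON04DGOR": 12, "PCON05RWRI": 31}
contest_district = {"PCON04DGOR": "04", "PCON05RWRI": "05"}
pieces = [
    split_piece("Lauderdale-:-KILLEN CH OF CHRIST", "CONG", d, row, contest_district)
    for d in ["04", "05"]
]
for p in pieces:
    print(p)
```

Under this rule the pieces' votes add back up to the original precinct totals, which is the property the vote total checks later rely on.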

In [41]:
#fillna
cong_w_splits = cong_w_splits.fillna(0)
#Adjust precinct name field to account for name changes
cong_w_splits['Precinct'] = cong_w_splits['UNIQUE_ID'].apply(lambda x: x.split('-:-')[1])
#reorder df
cong_w_splits = cong_w_splits[['UNIQUE_ID', 'COUNTYFP', 'county', 'precinct', 'CONG_DIST'] + cong_cols + ['geometry']]
#check
cong_w_splits[cong_w_splits["CONG_DIST"].isna()]

Unnamed: 0,UNIQUE_ID,COUNTYFP,county,precinct,CONG_DIST,PCON02DHAR,PCON02DPAT,PCON03RJOI,PCON03RROG,PCON04DGOR,PCON04DNEI,PCON05DTHO,PCON05RBLA,PCON05DWAR,PCON05RROB,PCON05RSAN,PCON05RSTR,PCON05RWAR,PCON05RWRI,geometry


#### State Senate Districts

In [42]:
races = [i for i in list(sldu.columns) if "PSU" in i]
precinct_mapping_dict = {}
split_precincts_list = {}
for index,row in sldu.iterrows():
    precinct_list = []
    for contest in races:
        if(row[contest]!=0):
            precinct_info = get_level_dist(contest)
            if precinct_info not in precinct_list:
                precinct_list.append(get_level_dist(contest))
    is_split = is_split_precinct(precinct_list)
    if (is_split):
        split_precincts_list[row["UNIQUE_ID"]]=is_split
    precinct_mapping_dict[row["UNIQUE_ID"]]=precinct_list
    
#Standardize district format with name different from PBER to use later and distinguish when splits occur
al_sldu['SS_DIST']=al_sldu["DISTRICT"].astype(str).str.zfill(2)
sldu["SLDU_DIST"] = sldu.apply(lambda row: return_sl(row, races), axis = 1)

split_precincts_list

{'Jefferson-:-PREC 3280 - BROOKSIDE COMMUNIT': {'SU': ['19', '20']},
 'Jefferson-:-PREC 4005 - GARDENDALE FIRST B': {'SU': ['17', '19', '20']},
 'Jefferson-:-PREC 4020 - TRUSSVILLE FIRST B': {'SU': ['17', '15']},
 'Jefferson-:-PREC 4040 - TRUSSVILLE CITY HA': {'SU': ['17', '15']},
 'Jefferson-:-PREC 4045 - FAITH COMMUNITY FE': {'SU': ['17', '15']},
 'Jefferson-:-PREC 4200 - FULTONDALE SENIOR': {'SU': ['17', '20']},
 'Lee-:-EAMC HEALTH RESOURCE': {'SU': ['13', '27']},
 'Lee-:-OPELIKA LEARNING CTR': {'SU': ['13', '27']},
 'Lee-:-OPELIKA SPORTSPLEX': {'SU': ['13', '27']},
 'Limestone-:-BETHEL CH OF CHRIST': {'SU': ['01', '02']},
 'Limestone-:-COPELAND PRESBYTERIAN CH': {'SU': ['01', '02']},
 'Russell-:-AUSTIN SUMBRY PARK': {'SU': ['28', '27']},
 'Russell-:-CENTRAL ACTIVITES BLDG': {'SU': ['28', '27']},
 'Russell-:-DIXIE VFD': {'SU': ['28', '27']},
 'Russell-:-LADONIA VFD': {'SU': ['28', '27']},
 'Russell-:-OLD SEALE C_H_': {'SU': ['28', '27']},
 'Russell-:-RUSSELL COUNTY C_H_': {'SU': ['2

In [43]:
#Create Split GDF
sldu_w_splits = sldu.copy()
for val in sldu["UNIQUE_ID"]:
    sl_list = []
    if val in split_precincts_list.keys():
        sldu_w_splits = district_splits_mod(split_precincts_list[val],"SLDU",val, sldu_w_splits, al_sldu, "UNIQUE_ID", "SS_DIST", races, "SLDU_DIST")

In [44]:
#check
set(sldu_w_splits['UNIQUE_ID']) - set(sldu['UNIQUE_ID'])

{'Jefferson-:-PREC 3280 - BROOKSIDE COMMUNIT-(SLDU-17)',
 'Jefferson-:-PREC 3280 - BROOKSIDE COMMUNIT-(SLDU-19)',
 'Jefferson-:-PREC 3280 - BROOKSIDE COMMUNIT-(SLDU-20)',
 'Jefferson-:-PREC 4005 - GARDENDALE FIRST B-(SLDU-17)',
 'Jefferson-:-PREC 4005 - GARDENDALE FIRST B-(SLDU-19)',
 'Jefferson-:-PREC 4005 - GARDENDALE FIRST B-(SLDU-20)',
 'Jefferson-:-PREC 4020 - TRUSSVILLE FIRST B-(SLDU-15)',
 'Jefferson-:-PREC 4020 - TRUSSVILLE FIRST B-(SLDU-17)',
 'Jefferson-:-PREC 4020 - TRUSSVILLE FIRST B-(SLDU-20)',
 'Jefferson-:-PREC 4040 - TRUSSVILLE CITY HA-(SLDU-11)',
 'Jefferson-:-PREC 4040 - TRUSSVILLE CITY HA-(SLDU-15)',
 'Jefferson-:-PREC 4040 - TRUSSVILLE CITY HA-(SLDU-17)',
 'Jefferson-:-PREC 4040 - TRUSSVILLE CITY HA-(SLDU-20)',
 'Jefferson-:-PREC 4045 - FAITH COMMUNITY FE-(SLDU-15)',
 'Jefferson-:-PREC 4045 - FAITH COMMUNITY FE-(SLDU-17)',
 'Jefferson-:-PREC 4200 - FULTONDALE SENIOR-(SLDU-17)',
 'Jefferson-:-PREC 4200 - FULTONDALE SENIOR-(SLDU-20)',
 'Lee-:-EAMC HEALTH RESOURCE-(SLD

In [45]:
#fillna
sldu_w_splits = sldu_w_splits.fillna(0)
#Adjust precinct name field to account for name changes
sldu_w_splits['Precinct'] = sldu_w_splits['UNIQUE_ID'].apply(lambda x: x.split('-:-')[1])
#reorder df
sldu_w_splits = sldu_w_splits[['UNIQUE_ID', 'COUNTYFP', 'county', 'precinct', 'SLDU_DIST'] + sldu_cols + ['geometry']]
#check
sldu_w_splits[sldu_w_splits["SLDU_DIST"].isna()]

Unnamed: 0,UNIQUE_ID,COUNTYFP,county,precinct,SLDU_DIST,PSU22RALB,PSU22RSEX,PSU28DBEA,PSU28DLEE,PSU17RDUN,PSU17RSHE,PSU23DMEL,PSU23DSAN,PSU23DSPE,PSU23DSTE,PSU12RDRA,PSU12RKEL,PSU12RWIL,PSU13RCOK,PSU13RPRI,PSU31RCAR,PSU31RHOR,PSU31RJON,PSU15RCHR,PSU15RROB,PSU19DALE,PSU19DCOL,PSU20DCOL,PSU20DHUN,PSU01RMEL,PSU01RSUT,PSU27RHOV,PSU27RWHA,PSU02RBUT,PSU02RHOL,PSU11RBEL,PSU11RWRI,geometry


#### State House Districts

In [46]:
races = [i for i in list(sldl.columns) if "PSL" in i]
precinct_mapping_dict = {}
split_precincts_list = {}
for index,row in sldl.iterrows():
    precinct_list = []
    for contest in races:
        if(row[contest]!=0):
            precinct_info = get_level_dist(contest)
            if precinct_info not in precinct_list:
                precinct_list.append(get_level_dist(contest))
    is_split = is_split_precinct(precinct_list)
    if (is_split):
        split_precincts_list[row["UNIQUE_ID"]]=is_split
    precinct_mapping_dict[row["UNIQUE_ID"]]=precinct_list
    
#Standardize district format with name different from PBER to use later and distinguish when splits occur
al_sldl['SH_DIST']=al_sldl["DISTRICT"].astype(str).str.zfill(2)
sldl["SLDL_DIST"] = sldl.apply(lambda row: return_sl(row, races), axis = 1)

split_precincts_list

{'Baldwin-:-FAIRHOPE 3 CIRCLE CH': {'SL': ['06', '09']},
 'Baldwin-:-NEW LIFE ASSEMBLY OF GOD': {'SL': ['06', '09']},
 'Baldwin-:-ROBERTSDALE PZK HALL': {'SL': ['06', '09']},
 'Baldwin-:-SUMMERDALE COMM_ CTR': {'SL': ['06', '09']},
 'Calhoun-:-PIEDMONT FIRE STATION': {'SL': ['02', '04']},
 'DeKalb-:-COPELAND BRIDGE LIBERTY': {'SL': ['03', '02']},
 'DeKalb-:-MCKESTES COMM CTR': {'SL': ['03', '02']},
 'DeKalb-:-PORTERSVILLE BAPT CH': {'SL': ['03', '02']},
 'DeKalb-:-SKIRUM COMM CTR': {'SL': ['03', '02']},
 'DeKalb-:-TENBROECK COMM CTR': {'SL': ['03', '02']},
 'Jefferson-:-PREC 2300 - BUSH HILLS ACADEMY': {'SL': ['05', '06']},
 'Jefferson-:-PREC 2420 - BELL PHYSICAL EDUC': {'SL': ['05', '06']},
 'Jefferson-:-PREC 5010 - HOOVER PARK & RECR': {'SL': ['01', '04']},
 'Jefferson-:-PREC 5040 - METROPOLITAN CHURC': {'SL': ['01', '04', '05']},
 'Jefferson-:-PREC 5080 - BROOKWOOD BAPTIST': {'SL': ['04', '05']},
 'Jefferson-:-PREC 5090 - LEEDS FIRST UNITED': {'SL': ['01', '04', '05']},
 'Jefferson-

In [47]:
sldl_w_splits = sldl.copy()
for val in sldl["UNIQUE_ID"]:
    sl_list = []
    if val in split_precincts_list.keys():
        sldl_w_splits = district_splits_mod(split_precincts_list[val],"SLDL",val, sldl_w_splits, al_sldl, "UNIQUE_ID", "SH_DIST", races, "SLDL_DIST")

In [48]:
#check
set(sldl_w_splits['UNIQUE_ID']) - set(sldl['UNIQUE_ID'])

{'Baldwin-:-FAIRHOPE 3 CIRCLE CH-(SLDL-64)',
 'Baldwin-:-FAIRHOPE 3 CIRCLE CH-(SLDL-94)',
 'Baldwin-:-FAIRHOPE 3 CIRCLE CH-(SLDL-96)',
 'Baldwin-:-NEW LIFE ASSEMBLY OF GOD-(SLDL-64)',
 'Baldwin-:-NEW LIFE ASSEMBLY OF GOD-(SLDL-96)',
 'Baldwin-:-ROBERTSDALE PZK HALL-(SLDL-64)',
 'Baldwin-:-ROBERTSDALE PZK HALL-(SLDL-66)',
 'Baldwin-:-SUMMERDALE COMM_ CTR-(SLDL-64)',
 'Baldwin-:-SUMMERDALE COMM_ CTR-(SLDL-66)',
 'Baldwin-:-SUMMERDALE COMM_ CTR-(SLDL-94)',
 'Baldwin-:-SUMMERDALE COMM_ CTR-(SLDL-95)',
 'Calhoun-:-PIEDMONT FIRE STATION-(SLDL-29)',
 'Calhoun-:-PIEDMONT FIRE STATION-(SLDL-39)',
 'Calhoun-:-PIEDMONT FIRE STATION-(SLDL-40)',
 'DeKalb-:-COPELAND BRIDGE LIBERTY-(SLDL-24)',
 'DeKalb-:-COPELAND BRIDGE LIBERTY-(SLDL-39)',
 'DeKalb-:-MCKESTES COMM CTR-(SLDL-24)',
 'DeKalb-:-MCKESTES COMM CTR-(SLDL-39)',
 'DeKalb-:-PORTERSVILLE BAPT CH-(SLDL-24)',
 'DeKalb-:-PORTERSVILLE BAPT CH-(SLDL-39)',
 'DeKalb-:-SKIRUM COMM CTR-(SLDL-24)',
 'DeKalb-:-SKIRUM COMM CTR-(SLDL-39)',
 'DeKalb-:-TENBRO

In [49]:
#fillna
sldl_w_splits = sldl_w_splits.fillna(0)
#Adjust precinct name field to account for name changes
sldl_w_splits['Precinct'] = sldl_w_splits['UNIQUE_ID'].apply(lambda x: x.split('-:-')[1])
#reorder df
sldl_w_splits = sldl_w_splits[['UNIQUE_ID', 'COUNTYFP', 'county', 'precinct', 'SLDL_DIST'] + sldl_cols + ['geometry']]
#check
sldl_w_splits[sldl_w_splits["SLDL_DIST"].isna()]

Unnamed: 0,UNIQUE_ID,COUNTYFP,county,precinct,SLDL_DIST,PSL088RDIS,PSL088RSTA,PSL064RFER,PSL064RGIV,PSL065RCAM,PSL065REAS,PSL094RFAU,PSL094RFID,PSL095RHOL,PSL095RLUD,PSL095RPUL,PSL096RDUG,PSL096RSIM,PSL049RBED,PSL049RHAR,PSL072DHOW,PSL072DTRA,PSL029RGID,PSL029RGRA,PSL040RBLA,PSL040RBOR,PSL040REXU,PSL040RLES,PSL040RMCA,PSL040RROB,PSL040RWIL,PSL038RMES,PSL038RWOO,PSL039RRHO,PSL039RSHA,PSL091RHOG,PSL091RMAR,PSL092RHAM,PSL092RWHI,PSL003DBEN,PSL003RJOL,PSL003DTHO,PSL003RUND,PSL007RROB,PSL007RYAR,PSL014RFRA,PSL014RFRE,PSL014RWAD,PSL067DCHE,PSL067DPET,PSL024RLED,PSL024RSTO,PSL031RSMI,PSL031RSTU,PSL028RBUT,PSL028RISB,PSL087RJOH,PSL087RSOR,PSL023RHAN,PSL023RKIR,PSL015RHUL,PSL015RTOM,PSL045RDRA,PSL045RDUB,PSL047DCOL,PSL047DTOO,PSL048RCAR,PSL048RWEN,PSL052DMIL,PSL052DROG,PSL054DBLA,PSL054DMAD,PSL054DRAF,PSL055DHEN,PSL055DODE,PSL055DPLU,PSL055DSCO,PSL055DWOM,PSL056DHUF,PSL056DKIN,PSL056DMAT,PSL056DTIL,PSL057DDUN,PSL057DSEL,PSL057DWIN,PSL060DGIV,PSL060DTAY,PSL001RMCC,PSL001RPET,PSL002RBLA,PSL002RBUT,PSL002RHAR,PSL002RIRE,PSL082DJOH,PSL082DWAR,PSL004RBAN,PSL004RJOH,PSL004RMOO,PSL025RCLE,PSL025RRIG,PSL020RBRO,PSL020RLOM,PSL020RMCC,PSL020RTAY,PSL026RCOL,PSL026RHOL,PSL026RMIT,PSL099DJON,PSL099DWRI,PSL100RKUP,PSL100RPIG,PSL100RSHI,PSL074DCAL,PSL074DENS,PSL061RBOL,PSL061RMAD,PSL013RBAR,PSL013RDAV,PSL013RDOZ,PSL013RWAI,PSL013RWOO,geometry


<p><a name="check"></a></p>

### Vote Total Checks

In [50]:
#Statewide GDF
county_totals_check(al_prim_pber, "RDH raw", st, "join", st_cols, "COUNTYFP")

***Countywide Totals Check***

P22USSDBOY is equal across all counties
P22USSRBOD is equal across all counties
P22USSDDEA is equal across all counties
P22USSRBRI is equal across all counties
P22USSDJAC is equal across all counties
P22USSRBRO is equal across all counties
P22USSRDUP is equal across all counties
P22USSRDUR is equal across all counties
P22USSRSCH is equal across all counties
P22GOVDFLO is equal across all counties
P22GOVRBLA is equal across all counties
P22GOVDFOR is equal across all counties
P22GOVRBUR is equal across all counties
P22GOVDJAM is equal across all counties
P22GOVRGEO is equal across all counties
P22GOVDKEN is equal across all counties
P22GOVRIVE is equal across all counties
P22GOVDMAR is equal across all counties
P22GOVRJAM is equal across all counties
P22GOVDSMI is equal across all counties
P22GOVRJON is equal across all counties
P22GOVRODL is equal across all counties
P22GOVRTHO is equal across all counties
P22GOVRYOU is equal across all counties
P22ATGRMA

In [51]:
#congressional splits
county_totals_check(al_prim_pber, "RDH raw", cong_w_splits, "join", cong_cols, "COUNTYFP")

***Countywide Totals Check***

PCON02DHAR is equal across all counties
PCON02DPAT is equal across all counties
PCON03RJOI is equal across all counties
PCON03RROG is equal across all counties
PCON04DGOR is equal across all counties
PCON04DNEI is equal across all counties
PCON05DTHO is equal across all counties
PCON05RBLA is equal across all counties
PCON05DWAR is equal across all counties
PCON05RROB is equal across all counties
PCON05RSAN is equal across all counties
PCON05RSTR is equal across all counties
PCON05RWAR is equal across all counties
PCON05RWRI is equal across all counties


In [52]:
#State House (checks the unsplit `sldl` dataset, grouped by county name)
county_totals_check(al_prim_pber, "RDH raw", sldl, "join", sldl_cols, "county")

***Countywide Totals Check***

PSL088RDIS is equal across all counties
PSL088RSTA is equal across all counties
PSL064RFER is equal across all counties
PSL064RGIV is equal across all counties
PSL065RCAM is equal across all counties
PSL065REAS is equal across all counties
PSL094RFAU is equal across all counties
PSL094RFID is equal across all counties
PSL095RHOL is equal across all counties
PSL095RLUD is equal across all counties
PSL095RPUL is equal across all counties
PSL096RDUG is equal across all counties
PSL096RSIM is equal across all counties
PSL049RBED is equal across all counties
PSL049RHAR is equal across all counties
PSL072DHOW is equal across all counties
PSL072DTRA is equal across all counties
PSL029RGID is equal across all counties
PSL029RGRA is equal across all counties
PSL040RBLA is equal across all counties
PSL040RBOR is equal across all counties
PSL040REXU is equal across all counties
PSL040RLES is equal across all counties
PSL040RMCA is equal across all counties
PSL040RRO

In [53]:
#State Senate (checks the unsplit `sldu` dataset)
county_totals_check(al_prim_pber, "RDH raw", sldu, "join", sldu_cols, "COUNTYFP")

***Countywide Totals Check***

PSU22RALB is equal across all counties
PSU22RSEX is equal across all counties
PSU28DBEA is equal across all counties
PSU28DLEE is equal across all counties
PSU17RDUN is equal across all counties
PSU17RSHE is equal across all counties
PSU23DMEL is equal across all counties
PSU23DSAN is equal across all counties
PSU23DSPE is equal across all counties
PSU23DSTE is equal across all counties
PSU12RDRA is equal across all counties
PSU12RKEL is equal across all counties
PSU12RWIL is equal across all counties
PSU13RCOK is equal across all counties
PSU13RPRI is equal across all counties
PSU31RCAR is equal across all counties
PSU31RHOR is equal across all counties
PSU31RJON is equal across all counties
PSU15RCHR is equal across all counties
PSU15RROB is equal across all counties
PSU19DALE is equal across all counties
PSU19DCOL is equal across all counties
PSU20DCOL is equal across all counties
PSU20DHUN is equal across all counties
PSU01RMEL is equal across all cou

<p><a name="exp"></a></p>

### Export Cleaned Precinct-Level Datasets

Export joined datasets

In [54]:
# Create each output folder if it does not exist (makedirs also creates the parent folder)
for folder in ["al_prim_22_no_splits_prec", "al_prim_22_st_prec", "al_prim_22_cong_prec",
               "al_prim_22_sldu_prec", "al_prim_22_sldl_prec"]:
    os.makedirs(os.path.join("./al_2022_prim_pber", folder), exist_ok=True)

al_prim_pber.to_file("./al_2022_prim_pber/al_prim_22_no_splits_prec/al_prim_22_no_splits_prec.shp")
st.to_file("./al_2022_prim_pber/al_prim_22_st_prec/al_prim_22_st_prec.shp")
cong_w_splits.to_file("./al_2022_prim_pber/al_prim_22_cong_prec/al_prim_22_cong_prec.shp")
sldu_w_splits.to_file("./al_2022_prim_pber/al_prim_22_sldu_prec/al_prim_22_sldu_prec.shp")
sldl_w_splits.to_file("./al_2022_prim_pber/al_prim_22_sldl_prec/al_prim_22_sldl_prec.shp")
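
The ESRI Shapefile format stores attributes in a dBASE table whose field names are limited to 10 characters, so longer column names are truncated on export. A small sketch that flags such columns before calling `to_file`; the `long_field_names` helper and the extra long column name are hypothetical:

```python
def long_field_names(columns, limit=10):
    """Return the column names that exceed the shapefile (DBF) field-name length limit."""
    return [col for col in columns if col != "geometry" and len(col) > limit]

cols = ["UNIQUE_ID", "COUNTYFP", "county", "precinct", "CONG_DIST", "PCON02DHAR", "geometry"]
print(long_field_names(cols))
print(long_field_names(cols + ["CONGRESSIONAL_DISTRICT"]))
```

The column names used in this notebook are all 10 characters or fewer, so the check above returns an empty list for them.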