## Alabama 2022 Primary Election Returns

### Sections
- <a href="#ETL">Cleaning Precinct-Level Election Results</a><br>
- <a href="#check">Vote Totals Checks</a><br>
- <a href="#discrep">Addressing Vote Discrepancies</a><br>
- <a href="#allocate">Allocate Absentee and Provisional Votes at Precinct Level</a><br>
- <a href="#readme">Creating README</a><br>
- <a href="#exp">Exporting Cleaned Precinct-Level Dataset</a><br>

#### Sources

- [Alabama Primary Election Results, Precint Level](https://www.sos.alabama.gov/sites/default/files/election-data/2022-06/2022%20Primary%20Precinct%20Results.zip)
- [Secretary of State Certified Results, Democratic Party](https://www.sos.alabama.gov/sites/default/files/election-2022/AL%20Democratic%20Party%202022%20Primary%20Results.xlsx)
- [Secretary of State Certified Results, Republican Party](https://www.sos.alabama.gov/sites/default/files/election-2022/AL%20Republican%20Party%202022%20Primary%20Results%20Official.xlsx)
- [Secretary of State Certified Ballot Measure Results](https://www.sos.alabama.gov/sites/default/files/election-data/2022-11/Final%20Canvass%20of%20Results%20%28canvassed%20by%20state%20canvassing%20board%2011-28-2022%29.pdf)

In [1]:
import pandas as pd
import os
import numpy as np
import re
from collections import Counter
import AL_22_helper_functions as hlp
pd.set_option('display.max_rows', 1000)

In [2]:
# Temporarily ignore warning messages for vote allocation function
import warnings

warnings.filterwarnings("ignore")

<p><a name="ETL"></a></p>

### Cleaning Precinct Level Election Returns

Load-In + Clean Election Results

In [3]:
county_list = []
ph_county_list = []
contest_list = []
clean_index = []
index_issue = [] #use to 
files = os.listdir('./raw-from-source/2022 Primary Precinct Results/')

#list of words to filter for contests with statewide reach
contest_keywords = ['Senator', 'Representative', 'Governor', 'Attorney General', 'Secretary of State', 'Treasurer', 'Auditor', 'State Board of Education', 'BOE', 'Agriculture', 'Insurance', 'PSC', 'Public Service', 'Supreme Court', 'Court of Appeals', 'Amendment']

for idx, file in enumerate(files):
    #Load county files
    temp = pd.read_excel('./raw-from-source/2022 Primary Precinct Results/' + file)
    
    # Filter to contests of interest
    contest_keywords = set(keyword.upper() for keyword in contest_keywords)
    contest_titles_set = set(temp['Contest Title'])
    comb_list = [title for title in contest_titles_set if any(keyword in title for keyword in contest_keywords)]
    temp_statewide = temp[temp['Contest Title'].isin(comb_list)].copy()
    
    #Add to contest_list
    contest_list += list(temp_statewide['Contest Title'].unique())
    
    # Get the county name, clean "StClair" to match pattern
    county_name = file.split("-")[-1][0:-4]
    if county_name == "StClair":
        county_name = "St. Clair"
    county_list.append(county_name)
        
    # Clean the party name
    temp_statewide["Party"] = temp_statewide["Party"].str.strip()
    temp_statewide["Party"] = temp_statewide["Party"].fillna("")
    
    # Create a column to pivot on
    temp_statewide["pivot_col"] = temp_statewide["Contest Title"].str.strip()+"-:-"+temp_statewide["Candidate"].str.strip()
    temp_statewide["pivot_col"] = np.where(temp_statewide["Party"]=="",temp_statewide["pivot_col"],temp_statewide["pivot_col"]+"-:-"+temp_statewide["Party"].str.strip())
    
    # Drop columns that are no longer needed
    temp_statewide.drop(["Contest Title", "Party", "Candidate"], axis = 1, inplace = True)
    
    # Add the county name to the precinct
    rename_dict = {i:i+"-:-"+county_name for i in temp_statewide.columns if i != "pivot_col"}
    temp_statewide.rename(columns = rename_dict, inplace = True)
    
    # Transpose the dataframe
    temp_transpose = temp_statewide.set_index("pivot_col").T
    temp_transpose.reset_index(inplace = True, drop = False)
    
    # Make sure cols and indexes unique
    if temp_transpose.columns.nunique() == len(temp_transpose.columns) and temp_transpose.index.is_unique:
        clean_index.append(county_name)
    else:
        index_issue.append(str(county_name) + ' ' + str(idx))        

    # Add to the list of counties
    ph_county_list.append(temp_transpose)

In [4]:
# Check for number of unique contests
len(set(contest_list))

# # Visually inspect contest types
# set(contest_list)

75

In [5]:
#check for index issues
len(index_issue)

0

In [6]:
# Concatenate into one file
comb = pd.concat(ph_county_list, axis = 0)

In [7]:
# Remove the under votes and the over votes
comb_list = [i for i in comb.columns.to_list() if "Under Votes" not in i and "Over Votes" not in i]
al_prim = comb[comb_list]

#### Rename Columns

In [8]:
# Create dictionaries for renaming pivot col to VEST
# use helper functions, found in helper module
exclude_columns = ['UNIQUE_ID', 'county', 'COUNTYFP', 'precinct', 'index']
contest_updates_dict, contest_updates_reversed, clean_dups =hlp.create_column_rename_dicts(al_prim, exclude_columns)

In [9]:
# Check all dict values under 10 characters
for item in contest_updates_dict.values():
    if len(item) > 10 or len(item) < 7:
        print(item)
        print(contest_updates_reversed[item])

In [10]:
# Check 
len(contest_updates_dict) == al_prim.shape[1] - 1

True

In [11]:
# apply rename dictionary to df
al_prim.rename(columns = contest_updates_dict, inplace = True)
al_prim.reset_index(inplace = True, drop = True)

# Fillna
al_prim = al_prim.fillna(0)

# set columns with votes as integer type
for item in contest_updates_dict.values():
    al_prim[item] = al_prim[item].astype(int)

In [12]:
# Create DF of field names for later use in README creation
fieldnames = list(contest_updates_dict.values())
fieldnames.sort()

sorted_dict = dict(sorted(contest_updates_dict.items(), key=lambda x:x[1]))

export_dict = {i:key for key, i in sorted_dict.items()}
rm_df = pd.DataFrame(export_dict.items())

#### Add County, FIPS, Unique ID columns

In [13]:
# Define county, precinct columns
al_prim['county'] = al_prim['index'].apply(lambda x: x.split("-:-")[1])
al_prim['precinct'] = al_prim['index'].apply(lambda x: x.split("-:-")[0])

In [14]:
# Add FIPS col to precinct df
al_prim = hlp.create_fips_col("./raw-from-source/FIPS/US_FIPS_Codes.csv", 'Alabama', al_prim, 'county')

# Check
al_prim['COUNTYFP'].isnull().any()

False

In [15]:
# Create UNIQUE_ID col
al_prim['UNIQUE_ID'] = al_prim['COUNTYFP'] + '-' +al_prim['precinct']
# Check
al_prim['UNIQUE_ID'].nunique()

2070

In [16]:
# Filter down to needed columns
al_prim = al_prim[['UNIQUE_ID', 'COUNTYFP', 'county', 'precinct']+list(contest_updates_dict.values())]

In [17]:
al_prim.head(2)

pivot_col,UNIQUE_ID,COUNTYFP,county,precinct,P22USSDBOY,P22USSRBOD,P22USSDDEA,P22USSRBRI,P22USSDJAC,P22USSRBRO,...,PSL074DENS,PSL061RBOL,PSL061RMAD,PSU11RBEL,PSU11RWRI,PSL013RBAR,PSL013RDAV,PSL013RDOZ,PSL013RWAI,PSL013RWOO
0,001-10 JONES COMM_ CTR_,1,Autauga,10 JONES COMM_ CTR_,28,1,6,34,14,30,...,0,0,0,0,0,0,0,0,0,0
1,001-100 TRINITY METHODIST,1,Autauga,100 TRINITY METHODIST,26,17,11,377,9,277,...,0,0,0,0,0,0,0,0,0,0


<p><a name="allocate"></a></p>

### Allocate Absentee and Provisional Votes at the Precinct Level

In [18]:
absentee_precs = al_prim[(al_prim["UNIQUE_ID"].str.contains("ABSENTEE"))|(al_prim["UNIQUE_ID"].str.contains("PROVISIONAL"))|(al_prim["UNIQUE_ID"].str.contains("PROVISONAL"))]
nonabsentee_precs = al_prim[~((al_prim["UNIQUE_ID"].str.contains("ABSENTEE"))|(al_prim["UNIQUE_ID"].str.contains("PROVISIONAL"))|(al_prim["UNIQUE_ID"].str.contains("PROVISONAL")))]

print(absentee_precs.shape)
print(nonabsentee_precs.shape)

(135, 216)
(1935, 216)


In [19]:
# #Check for anomalies, every county should have 2 entries
# absentee_precs['county'].value_counts()

Lamar county has a typo in their 'provisional' votes precinct name.
Jefferson county has two categories of absentee votes

#### Jefferson county vote allocation

Jefferson county vote allocation will be done separately because provisional votes are reported at the county level, but absentee votes are reported in two categories of Birmingham precincts and Bessemer precincts.
information on categorizing precincts in this county was found on the [county website](https://www.jccal.org/Default.asp?ID=1936&pg=Maps)

In [20]:
# Create a list of all precincts in jefferson county
jco = al_prim[al_prim['county'] == 'Jefferson']['UNIQUE_ID'].to_list()
jco = jco[2:]

In [21]:
# use list to subset to those precincts indicated as belonging to bessemer
bessemer_precs = [jco[4], jco[8], jco[22], jco[25], jco[29],
                  jco[36], jco[38], jco[40], jco[46], jco[52],
                  jco[53], jco[55], jco[56], jco[62], jco[63],
                  jco[66], jco[67], jco[77], jco[78], jco[79],
                  jco[80], jco[81], jco[83], jco[84], jco[85],
                  jco[89], jco[90], jco[91], jco[92], jco[93],
                 jco[94], jco[95], jco[98], jco[99], jco[101],
                 jco[103], jco[104], jco[105], jco[106], jco[109],
                 jco[110], jco[111], jco[112], jco[113], jco[116],
                 jco[117], jco[118], jco[119], jco[120], jco[148],
                 jco[150], jco[158]]

In [22]:
# Check that all precincts are allocated to bessemer or birmingham
birmingham_precs = list(set(jco) - set(bessemer_precs))
print('Number of Birmingham Precincts: ', len(birmingham_precs))
print('Number of Bessemer Precincts: ',len(bessemer_precs),)
print('Total Precincts: ', len(jco))

Number of Birmingham Precincts:  123
Number of Bessemer Precincts:  52
Total Precincts:  175


In [23]:
# Visually check precinct names against state PDFS
#bessemer_precs
#birmingham_precs

In [24]:
# Filter out the absentee precincts related to Birmingham or Bessemer
jefferson_exceptions = absentee_precs[absentee_precs["UNIQUE_ID"].str.contains("BIRMINGHAM") | absentee_precs["UNIQUE_ID"].str.contains("BESSEMER")]

# Make a list of the remaining absentee precincts
remaining_absentee =  absentee_precs[~(absentee_precs["UNIQUE_ID"].str.contains("BIRMINGHAM") | absentee_precs["UNIQUE_ID"].str.contains("BESSEMER"))]

In [25]:
# Going to use the names "BESSEMER" and "BIRMINGAHM" as the keys to allocate on and not the county names
jefferson_exceptions["Spec_Alloc"] = jefferson_exceptions["precinct"].apply(lambda x: x.split( )[0])

birmingham_precincts = nonabsentee_precs[(~nonabsentee_precs["UNIQUE_ID"].isin(bessemer_precs)) & (nonabsentee_precs["county"]=="Jefferson")]
bessemer_precincts = nonabsentee_precs[nonabsentee_precs["UNIQUE_ID"].isin(bessemer_precs)]

birmingham_precincts["Spec_Alloc"] = "BIRMINGHAM"
bessemer_precincts["Spec_Alloc"] = "BESSEMER"

In [26]:
# combine into one df
receiving_jefferson = pd.concat([birmingham_precincts, bessemer_precincts])

In [27]:
# Get the precincts that need votes in the rest of the state
non_jefferson_non_absentee = nonabsentee_precs[~nonabsentee_precs["UNIQUE_ID"].isin(list(receiving_jefferson["UNIQUE_ID"]))]

In [28]:
# columns that need allocation
vote_cols = al_prim.columns[4:]

In [29]:
# Run vote allocation for jefferson county
received_jefferson = hlp.allocate_absentee(receiving_jefferson, jefferson_exceptions, list(vote_cols), "Spec_Alloc", allocating_to_all_empty_precs=False)

Special allocation used for [['BIRMINGHAM', 'PSL015RHUL'], ['BIRMINGHAM', 'PSL015RTOM'], ['BIRMINGHAM', 'PSL056DHUF'], ['BIRMINGHAM', 'PSL056DTIL']]


In [30]:
received_jefferson.drop("Spec_Alloc", axis = 1, inplace = True)

#### Absentee Allocation for non Jefferson counties

In [31]:
round_two = pd.concat([received_jefferson, non_jefferson_non_absentee])

In [32]:
round_two.reset_index(inplace = True, drop = True)
remaining_absentee.reset_index(inplace = True, drop = True)

In [33]:
# run final vote allocation for 66 non jefferson counties
final_allocation = hlp.allocate_absentee(round_two, remaining_absentee, list(vote_cols), "county", allocating_to_all_empty_precs=False)

<p><a name="check"></a></p>

### Vote Totals Check

#### Republican Election Results from SOS

In [34]:
# Read in edited summary sheet
df = pd.read_excel('./raw-from-source/AL Republican Party 2022 Primary Results Official.xlsx', sheet_name='summary_edited')
# Extract columns with vote total information
df1 = df[df.columns[:3]]
df2 = df[df.columns[4:7]]
df3 = df[df.columns[8:11]]
# create column names
df1.columns = df2.columns = df3.columns = ['contest', 'choice', 'num_votes']
# concatenate into one df
sos_combined = pd.concat([df1, df2, df3], axis=0)

In [35]:
#check
df1.shape[0] + df2.shape[0] + df3.shape[0] == sos_combined.shape[0]

True

In [36]:
#filter out null rows
sos_r = sos_combined[~sos_combined['contest'].isnull()]
#set num_votes column as integer
sos_r.num_votes = sos_r.num_votes.astype(int)
sos_r.num_votes.dtype

dtype('int32')

In [37]:
#account for anomalies in contest names
sos_r['contest'] = sos_r['contest'].replace('Govenor', 'Governor')
sos_r['party'] = 'REP'

In [39]:
sos_r_p = hlp.create_pivot_col(sos_r, 'choice', 'contest', 'party', 'pivot').copy()

In [40]:
# Create rename dict to read in county-level results
sos_r_dict = dict(zip(sos_r_p['choice'], sos_r_p['contest']))

In [41]:
# Create placeholder dataframe
sos_r_county = pd.DataFrame(columns=['choice', 'num_votes', 'county', 'contest'])

In [42]:
# extract data fom each county sheet, process and add to new dataframe
counties_worked = []
county_list_subset = county_list[:1]
sos_r_county = pd.DataFrame()
for county in county_list:
    df = pd.read_excel('./raw-from-source/AL Republican Party 2022 Primary Results Official.xlsx', sheet_name= str(county))
    # Extract columns with vote total information
    df1 = df[df.columns[:2]]
    df2 = df[df.columns[4:6]]
    # create column names
    df1.columns = df2.columns = ['choice', 'num_votes']
    # concatenate into one df
    county_single = pd.concat([df1, df2], axis=0)
    # drop rows that are empty in choice and vote columns
    county_single_filt = county_single.dropna(subset=['choice', 'num_votes'], how = 'all')
    # remove '.', ',' to match to sos_r_dict
    county_single_filt['choice'] = county_single_filt['choice'].apply(lambda x: x.replace('.', ''))
    county_single_filt['choice'] = county_single_filt['choice'].apply(lambda x: x.replace(',', ''))
    # use sos_r_dict to match candidate name to contest
    county_single_filt['contest'] = county_single_filt['choice'].map(sos_r_dict)
    # drop rows without contest
    county_single_filt = county_single_filt.dropna(subset=['contest'])
    # add county name column
    county_single_filt['county'] = str(county)
    # add county name to a list of counties
    counties_worked.append(county)
    # add dataframe to receiving df
    sos_r_county = pd.concat([sos_r_county, county_single_filt], ignore_index=True)

In [43]:
# check that all counties worked
len(counties_worked)
# Check number of counties in new df
sos_r_county['county'].nunique()

67

In [44]:
# add in party column, fill null votes with 0
sos_r_county['num_votes'].fillna(0, inplace=True)
sos_r_county['party'] = 'REP'

In [45]:
sos_r_county.head(2)

Unnamed: 0,choice,num_votes,contest,county,party
0,Lindy Blanchard,1864.0,Governor,Autauga,REP
1,Lew Burdette,302.0,Governor,Autauga,REP


#### Democratic Election Results from SOS

In [46]:
# Read in results
df = pd.read_excel('./raw-from-source/AL Democratic Party 2022 Primary Results.xlsx')

In [47]:
sos_d = df[['Office', 'Ballot Name', 'Votes', 'County Name']]
sos_d['party'] = 'DEM'
sos_d.columns = ['contest', 'choice', 'num_votes', 'county', 'party']

In [48]:
# Filter to contests with statewide reach
contest_keywords = set(keyword.upper() for keyword in contest_keywords)
contest_titles_set = set(sos_d['contest'])
comb_list = [title for title in contest_titles_set if any(keyword in title for keyword in contest_keywords)]
sos_d_filt = sos_d[sos_d['contest'].isin(comb_list)].copy()

In [49]:
sos_d_county = sos_d_filt[['choice', 'num_votes', 'contest', 'county', 'party']]

#### Combine Republican and Democratic Results

In [50]:
sos_total = pd.concat([sos_r_county, sos_d_county], ignore_index = True)

In [51]:
#check
sos_r_county.shape[0] + sos_d_county.shape[0] == sos_total.shape[0]

True

In [52]:
# add 'pivot' col
# use helper function
sos_tot_pvt = hlp.create_pivot_col(sos_total, 'choice', 'contest', 'party', 'pivot')

In [53]:
# use pivot col to add column with VEST names
sos_tot_pvt['VEST'] = sos_tot_pvt['pivot'].apply(lambda x: hlp.get_VEST(str(x).strip()))

In [54]:
# Check to see if the number of contests is the same
print('# of contests in SOS df:', sos_tot_pvt['VEST'].nunique())
print('# of contests in RDH df:', len(vote_cols))

# of contests in SOS df: 210
# of contests in RDH df: 212


In [55]:
# #Check which contests missing in SOS combined df
set(vote_cols) == set(sos_tot_pvt['VEST'])
set(vote_cols) - set(sos_tot_pvt['VEST'])

{'G22A01NO', 'G22A01YES'}

In [56]:
#pivot SOS county df
sos_tot_pvt =pd.pivot_table(sos_tot_pvt,index=['county'],columns=['VEST'],values=['num_votes'],aggfunc=sum)
sos_tot_pvt = sos_tot_pvt.fillna(0)
sos_tot_pvt.columns = sos_tot_pvt.columns.droplevel(0)
sos_tot_pvt.reset_index(inplace = True)

In [57]:
sos_tot_pvt.head()

VEST,county,P22ATGRMAR,P22ATGRSTI,P22AUDRCOO,P22AUDRGLO,P22AUDRSOR,P22GOVDFLO,P22GOVDFOR,P22GOVDJAM,P22GOVDKEN,...,PSU23DSAN,PSU23DSPE,PSU23DSTE,PSU27RHOV,PSU27RWHA,PSU28DBEA,PSU28DLEE,PSU31RCAR,PSU31RHOR,PSU31RJON
0,Autauga,8316.0,790.0,2358.0,2391.0,3503.0,482.0,353.0,147.0,119.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,Baldwin,25898.0,4797.0,5992.0,13045.0,10727.0,564.0,517.0,374.0,236.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,Barbour,1780.0,180.0,519.0,748.0,504.0,474.0,565.0,133.0,191.0,...,0.0,0.0,0.0,0.0,0.0,1344.0,535.0,0.0,0.0,0.0
3,Bibb,3006.0,233.0,1280.0,764.0,979.0,217.0,75.0,32.0,16.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,Blount,8223.0,599.0,3676.0,1660.0,2771.0,63.0,28.0,33.0,21.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


### Statewide Check

In [58]:
# Check statewide totals for each contest
statewide_check_list = []
doesnt_check = []
for item in vote_cols:
    if item not in sos_tot_pvt.columns:
        doesnt_check.append(item)
#         print(item)
#         print(contest_updates_reversed[item])
    else:
        official = sos_tot_pvt[item].sum().astype(int)
    rdh = final_allocation[item].sum()
    if official != rdh:
        statewide_check_list.append(item)
        print(contest_updates_reversed[item])
        print('')
        print(f"{item}\n\tOfficial: {official}\n\tRDH: {rdh}")

UNITED STATES SENATOR-:-Will Boyd-:-DEM

P22USSDBOY
	Official: 107588
	RDH: 107594
UNITED STATES SENATOR-:-Brandaun Dean-:-DEM

P22USSDDEA
	Official: 32863
	RDH: 32865
UNITED STATES SENATOR-:-Katie Britt-:-REP

P22USSRBRI
	Official: 289425
	RDH: 289423
UNITED STATES SENATOR-:-Lanny Jackson-:-DEM

P22USSDJAC
	Official: 28402
	RDH: 28403
UNITED STATES SENATOR-:-Mo Brooks-:-REP

P22USSRBRO
	Official: 188539
	RDH: 188537
UNITED STATES SENATOR-:-Mike Durant-:-REP

P22USSRDUR
	Official: 150817
	RDH: 150811
UNITED STATES REPRESENTATIVE, 2ND CONGRESSIONAL DISTRICT-:-Phyllis Harvey-Hall-:-DEM

PCON02DHAR
	Official: 16884
	RDH: 16889
UNITED STATES REPRESENTATIVE, 2ND CONGRESSIONAL DISTRICT-:-Vimal Patel-:-DEM

PCON02DPAT
	Official: 7667
	RDH: 7668
GOVERNOR-:-Yolanda Rochelle Flowers-:-DEM

P22GOVDFLO
	Official: 56991
	RDH: 56993
GOVERNOR-:-Lindy Blanchard-:-REP

P22GOVRBLA
	Official: 126202
	RDH: 126205
GOVERNOR-:-Malika Sanders Fortier-:-DEM

P22GOVDFOR
	Official: 54699
	RDH: 54700
GOVERNOR-:-L

In [59]:
len(statewide_check_list)

50

### Countywide Totals Check

In [60]:
#county totals check
rdh = al_prim
sos = sos_tot_pvt
partner_name = 'SOS'
source_name = 'RDH'
county_col = 'county'
hlp.county_totals_check(sos,partner_name, rdh, source_name, sos_tot_pvt.columns[1:], county_col,full_print=False, method='county')

***Countywide Totals Check***

Baldwin contains differences in these races:
	P22GOVRGEO has a difference of 1.0 vote(s)
		SOS: 147.0 vote(s)
		RDH: 146 vote(s)
Barbour contains differences in these races:
	P22PS2RLIT has a difference of 10.0 vote(s)
		SOS: 233.0 vote(s)
		RDH: 223 vote(s)
Blount contains differences in these races:
	P22PS2RMCC has a difference of 1.0 vote(s)
		SOS: 2432.0 vote(s)
		RDH: 2431 vote(s)
Conecuh contains differences in these races:
	P22GOVDFOR has a difference of -1.0 vote(s)
		SOS: 350.0 vote(s)
		RDH: 351 vote(s)
	P22GOVDKEN has a difference of -1.0 vote(s)
		SOS: 122.0 vote(s)
		RDH: 123 vote(s)
	P22GOVDSMI has a difference of -2.0 vote(s)
		SOS: 184.0 vote(s)
		RDH: 186 vote(s)
	P22GOVRBLA has a difference of -2.0 vote(s)
		SOS: 216.0 vote(s)
		RDH: 218 vote(s)
	P22USSDBOY has a difference of -4.0 vote(s)
		SOS: 1024.0 vote(s)
		RDH: 1028 vote(s)
	P22USSDDEA has a difference of -2.0 vote(s)
		SOS: 230.0 vote(s)
		RDH: 232 vote(s)
	PCON02DHAR has a diffe

In [61]:
issue_counties = ['Baldwin', 'Barbour', 'Blount', 'Conecuh', 'Covington', 'Crenshaw', 'Elmore', 'Escambia', 'Houston', 'Lauderdale', 'Lawrence', 'Marshall', 'St. Clair', 'Walker']

#### Checking statewide ballot amendment against SOS pdf

- [Secretary of State Certified Ballot Measure Results](https://www.sos.alabama.gov/sites/default/files/election-data/2022-11/Final%20Canvass%20of%20Results%20%28canvassed%20by%20state%20canvassing%20board%2011-28-2022%29.pdf)

In [62]:
# Display counts for statewide amendment 1
al_prim.groupby(['county'])['G22A01YES', 'G22A01NO'].agg(['sum']).reset_index()

Unnamed: 0_level_0,county,G22A01YES,G22A01NO
Unnamed: 0_level_1,Unnamed: 1_level_1,sum,sum
0,Autauga,7735,2822
1,Baldwin,28008,8014
2,Barbour,2898,803
3,Bibb,2549,1143
4,Blount,6718,2493
5,Bullock,1801,327
6,Butler,2871,816
7,Calhoun,12785,3437
8,Chambers,3722,1252
9,Cherokee,3586,1120


Anomalies in statewide ballot amendment 1

Russell County


    YES - RDH : 3974
          SOS : 3970
    
    NO - RDH : 869
         SOS : 868

In [63]:
statewide_check_list.append('G22A01YES')
statewide_check_list.append('G22A01NO')
issue_counties.append('Russell')

In [64]:
len(issue_counties)

15

<p><a name="discrep"></a></p>

### Addressing Discrepancies in Vote Total Checks

#### Summary

Counties w/ vote total discrepancies accounted for and corrected : Concecuh, Covington, Elmore, Lawrence

Counties w/ unaccounted for vote total discrepancies, likely typos : Baldwin, Barbour, Blount, Crenshaw, Marshall, Russell, St. Clair, Walker, Escambia, Houston, Lauderdale

###### Accounted for discrepancies:
- In the following counties, we were able to account for the vote total discrepancies between the SOS totals and RDH totals
 1. Conecuh County
        - The following Democratic candidates are missing provisional ballots in the SOS totals
                - P22GOVDFOR RDH has +1 vote
                - P22GOVDKEN RDH has +1 vote
                - P22GOVDSMI RDH has +2 vote
                - P22USSDBOY RDH has +4 vote
                - P22USSDDEA RDH has +2 vote
                - PCON02DHAR RDH has +5 vote
                - PCON02DPAT RDH has +1 vote
                - PSU23DMEL RDH has +2 vote
                - PSU23DSAN RDH has +1 vote
        - The following Republican candidate is missing absentee ballots in the SOS totals
                - P22GOVRBLA RDH has +2 votes
            
 2. Covington County
          - Covington County is missing provisional ballots in the precinct-level returns for the following candidates.
                - P22ATGRMAR RDH has -19 votes
                - P22ATGRSTI RDH has -2 votes
                - P22AUDRCOO RDH has -6 votes
                - P22AUDRGLO RDH has -4 votes
                - P22AUDRSOR RDH has -9 votes
                - P22GOVRBLA RDH has -3 votes
                - P22GOVRBUR RDH has -1 votes
                - P22GOVRIVE RDH has -16 votes
                - P22GOVRJAM RDH has -3 votes
                - P22GOVRJON RDH has -1 votes
                - P22GOVRODL RDH has -2 votes
                - P22PS1RHAM RDH has -10 votes
                - P22PS1RODE RDH has -3 votes
                - P22PS1RWOO RDH has -5 votes
                - P22PS2RBEE RDH has -10 votes
                - P22PS2RLIT RDH has -1 votes
                - P22PS2RMCC RDH has -7 votes
                - P22SOSRALL RDH has -9 votes
                - P22SOSRHOR RDH has -2 votes
                - P22SOSRPAC RDH has -2 votes
                - P22SOSRZEI RDH has -9 votes
                - P22USSRBRI RDH has -10 votes
                - P22USSRBRO RDH has -7 votes
                - P22USSRDUR RDH has -7 votes
                - PSL092RHAM RDH has -12 votes
                - PSL092RWHI RDH has -13 votes
                - PSSC5RCOO RDH has -14 votes
                - PSSC5RJON RDH has -8 votes
                - PSU31RCAR RDH has -10 votes
                - PSU31RHOR RDH has -2 votes
                - PSU31RJON RDH has -14 votes     
                
 3. Elmore County
        - SOS missing provisional ballots for this candidate
                - P22GOVDFLO RDH has +2 votes
        - SOS missing 40 ballots for this candidate
                - P22SOSRZEI RDH has +40 votes
 4. Lawrence County
         - SOS is missing provisional ballots for the following candidates
                - P22ATGRMAR RDH has +11 votes
                - P22AUDRCOO RDH has +4 votes
                - P22AUDRGLO RDH has +1 votes
                - P22AUDRSOR RDH has +3 votes
                - P22GOVRBLA RDH has +4 votes
                - P22GOVRBUR RDH has +1 vote
                - P22GOVRIVE RDH has +6 votes
                - P22GOVRJAM RDH has +2 votes
                - P22GOVRODL RDH has +1 vote
                - P22PS1RHAM RDH has +2 votes
                - P22PS1RODE RDH has +2 votes
                - P22PS1RWOO RDH has +5 votes
                - P22PS2RBEE RDH has +4 votes
                - P22PS2RLIT RDH has +1 vote
                - P22PS2RMCC RDH has +3 votes
                - P22SOSRALL RDH has +6 votes
                - P22SOSRPAC RDH has +1 vote
                - P22SOSRZEI RDH has +5 votes
                - P22USSRBRI RDH has +8 votes
                - P22USSRBRO RDH has +5 votes
                - P22USSRDUR RDH has +1 vote
                - PSL007RROB RDH has +4 votes
                - PSL007RYAR RDH has +8 votes
                - PSSC5RCOO RDH has +6 votes
                - PSSC5RJON RDH has +5 votes
                
                
                
###### Unaccounted for discrepancies
- After reviewing the data, we are unable to determine the origin of the discrepancy, or believe the discrepancy is likely due to a data entry error or typo. Since we are unable to determine if each typo happened at the county or state level, we have chosen to leave the precinct-level results unaltered in the following cases.
    1. Baldwin County
            - P22GOVRGEO RDH has -1 vote
    2. Barbour County
            - P22PS2RLIT RDH has -10 votes
    3. Blount County
            - P22PS2RMCC RDH has -1 vote
    4. Crenshaw County
            - P22USSDBOY RDH has +1 vote
    5. Marshall County
            - P22GOVDSMI RDH has +1 vote
    6. Russell County
            - G22A01YES  RDH has +4 votes
            - G22A01NO   RDH has +1 vote
    7. St. Clair County
            - P22GOVRBUR RDH has -10 votes
    8. Walker County
                - P22GOVDJAM RDH has -1 vote
                - P22GOVDSMI RDH has -1 vote
                - P22USSDBOY RDH has -1 vote
                - P22USSDJAC RDH has -1 vote
                - PCON04DGOR RDH has -1 vote
                - PCON04DNEI RDH has -1 vote
    9. Escambia County
        - SOS results missing 310 votes, we believe this is an unintentional error or typo, since SOS has 6 votes recorded for this candidate.
                - P22ATGRSTI RDH has +310 votes
    10. Houston County
        - SOS has an additional 70 ballots for this candidate, this is likely a typo
                - PBOE02RBAL RDH has -70 votes

    11. Lauderdale County
        - SOS has an additional 10 ballots for this candidate
                - PCON04DNEI RDH has -10 votes


##### Changes to precint-level results

In [66]:
# create DF for provisional votes in Covington county
covington_prov = remaining_absentee[remaining_absentee['UNIQUE_ID'] == '039-PROVISIONAL'].copy()

In [67]:
# Covington Changes to RDH precint-level returns
Covington = '''P22ATGRMAR RDH has -19 votes
       -: P22ATGRSTI RDH has -2 votes
       -: P22AUDRCOO RDH has -6 votes
       -: P22AUDRGLO RDH has -4 votes
       -: P22AUDRSOR RDH has -9 votes
       -: P22GOVRBLA RDH has -3 votes
       -: P22GOVRBUR RDH has -1 votes
       -: P22GOVRIVE RDH has -16 votes
       -: P22GOVRJAM RDH has -3 votes
       -: P22GOVRJON RDH has -1 votes
       -: P22GOVRODL RDH has -2 votes
       -: P22PS1RHAM RDH has -10 votes
       -: P22PS1RODE RDH has -3 votes
       -: P22PS1RWOO RDH has -5 votes
       -: P22PS2RBEE RDH has -10 votes
       -: P22PS2RLIT RDH has -1 votes
       -: P22PS2RMCC RDH has -7 votes
       -: P22SOSRALL RDH has -9 votes
       -: P22SOSRHOR RDH has -2 votes
       -: P22SOSRPAC RDH has -2 votes
       -: P22SOSRZEI RDH has -9 votes
       -: P22USSRBRI RDH has -10 votes
       -: P22USSRBRO RDH has -7 votes
       -: P22USSRDUR RDH has -7 votes
       -: PSL092RHAM RDH has -12 votes
       -: PSL092RWHI RDH has -13 votes
       -: PSSC5RCOO RDH has -14 votes
       -: PSSC5RJON RDH has -8 votes
       -: PSU31RCAR RDH has -10 votes
       -: PSU31RHOR RDH has -2 votes
       -: PSU31RJON RDH has -14 votes '''

In [68]:
# create dictionary of changes
covington_dict = {}
for item in Covington.split('-:'):
    item = item.strip()
    choice = item.split( )[0]
    votechange = abs(int(item.split( )[-2]))
    covington_dict[choice] = votechange

In [69]:
# apply changes to covington_prov DF
for item in covington_dict.keys():
    old = covington_prov[item]
    covington_prov[item] += covington_dict[item]
    #print(item, int(covington_prov[item]))

In [70]:
# check
covington_prov[['P22ATGRMAR']]

pivot_col,P22ATGRMAR
39,19


In [71]:
# Re allocate provisional votes for Covington county
# run final vote allocation for 66 non jefferson counties
final_allocation_cov = hlp.allocate_absentee(final_allocation, covington_prov, list(covington_dict.keys()), "county", allocating_to_all_empty_precs=False)

##### Discrepancies in SOS totals

In [72]:
# create function to make changes to SOS totals df
def sos_change(countyname, countychangedict):
    county_index = sos_tot_pvt[sos_tot_pvt['county'] == countyname].index[0]
    print(countyname)
    for item in countychangedict.keys():
        old = sos_tot_pvt.at[county_index, item]
        sos_tot_pvt.at[county_index, item] += countychangedict[item]
        #print(item, int(old), int(sos_tot_pvt.at[county_index, item]))

In [73]:
#Conecuh County Changes
# copy conecuh discrepancies into string
Conecuh = '''  P22GOVDFOR RDH has +1 vote
  - P22GOVDKEN RDH has +1 vote
  - P22GOVDSMI RDH has +2 vote
  - P22USSDBOY RDH has +4 vote
  - P22USSDDEA RDH has +2 vote
  - PCON02DHAR RDH has +5 vote
  - PCON02DPAT RDH has +1 vote
  - PSU23DMEL RDH has +2 vote
  - PSU23DSAN RDH has +1 vote
  - P22GOVRBLA RDH has +2 votes'''
# split into a list
conecuh_changes = Conecuh.split(' -')
# create dictionary of changes
conecuh_dict = {}
for item in conecuh_changes:
    choice = item.split( )[0]
    votechange = int(item.split( )[-2])
    conecuh_dict[choice] = votechange

In [74]:
# Elmore County Changes
elmore_dict ={}
elmore_dict['P22GOVDFLO'] = 2
elmore_dict['P22SOSRZEI'] = 40
# Escambia County Changees
escambia_dict = {}
escambia_dict['P22ATGRSTI'] = 310
# Baldwin County Changes
baldwin_dict = {}
baldwin_dict['P22GOVRGEO'] = -1
# Barbour County Changes
barbour_dict = {}
barbour_dict['P22PS2RLIT'] = -10
# Blount County Changes
blount_dict = {}
blount_dict['P22PS2RMCC'] = -1
# Crenshaw County Changes
crenshaw_dict = {}
crenshaw_dict['P22USSDBOY'] = +1
# Marshall County Changes
marshall_dict = {}
marshall_dict['P22GOVDSMI'] = +1
# Russell County Changes
russell_dict = {}
russell_dict['G22A01YES'] = +4
russell_dict['G22A01NO'] = +1
# St Clair County Changes
stclair_dict = {}
stclair_dict['P22GOVRBUR'] = -10
# Walker County
walker_dict = {}
walker_dict['P22GOVDJAM'] = -1
walker_dict['P22GOVDSMI'] = -1
walker_dict['P22USSDBOY'] = -1
walker_dict['P22USSDJAC'] = -1
walker_dict['PCON04DGOR'] = -1
walker_dict['PCON04DNEI'] = -1

In [75]:
# Lawrence County Changes
Lawrence = '''        P22ATGRMAR RDH has +11 votes
       - P22AUDRCOO RDH has +4 votes
       - P22AUDRGLO RDH has +1 votes
       - P22AUDRSOR RDH has +3 votes
       - P22GOVRBLA RDH has +4 votes
       - P22GOVRBUR RDH has +1 vote
       - P22GOVRIVE RDH has +6 votes
       - P22GOVRJAM RDH has +2 votes
       - P22GOVRODL RDH has +1 vote
       - P22PS1RHAM RDH has +2 votes
       - P22PS1RODE RDH has +2 votes
       - P22PS1RWOO RDH has +5 votes
       - P22PS2RBEE RDH has +4 votes
       - P22PS2RLIT RDH has +1 vote
       - P22PS2RMCC RDH has +3 votes
       - P22SOSRALL RDH has +6 votes
       - P22SOSRPAC RDH has +1 vote
       - P22SOSRZEI RDH has +5 votes
       - P22USSRBRI RDH has +8 votes
       - P22USSRBRO RDH has +5 votes
       - P22USSRDUR RDH has +1 vote
       - PSL007RROB RDH has +4 votes
       - PSL007RYAR RDH has +8 votes
       - PSSC5RCOO RDH has +6 votes
       - PSSC5RJON RDH has +5 votes'''
# split into a list
lawrence_changes = Lawrence.split(' -')
# create dictionary of changes
lawrence_dict = {}
for item in lawrence_changes:
    choice = item.split( )[0]
    votechange = int(item.split( )[-2])
    lawrence_dict[choice] = votechange

In [76]:
# Changing the counties with accounted for discrepancies only
sos_change('Conecuh', conecuh_dict)
sos_change('Elmore', elmore_dict)
sos_change('Lawrence', lawrence_dict)
sos_change('Escambia', escambia_dict)

Conecuh
Elmore
Lawrence
Escambia


### Rechecking Vote Totals 

In [77]:
#county totals re- check
rdh = final_allocation_cov
sos = sos_tot_pvt
partner_name = 'SOS'
source_name = 'RDH'
county_col = 'county'
hlp.county_totals_check(sos,partner_name, rdh, source_name, sos_tot_pvt.columns[1:], county_col,full_print=False, method='county')

***Countywide Totals Check***

Baldwin contains differences in these races:
	P22GOVRGEO has a difference of 1.0 vote(s)
		SOS: 147.0 vote(s)
		RDH: 146 vote(s)
Barbour contains differences in these races:
	P22PS2RLIT has a difference of 10.0 vote(s)
		SOS: 233.0 vote(s)
		RDH: 223 vote(s)
Blount contains differences in these races:
	P22PS2RMCC has a difference of 1.0 vote(s)
		SOS: 2432.0 vote(s)
		RDH: 2431 vote(s)
Crenshaw contains differences in these races:
	P22USSDBOY has a difference of -1.0 vote(s)
		SOS: 206.0 vote(s)
		RDH: 207 vote(s)
Houston contains differences in these races:
	PBOE02RBAL has a difference of 70.0 vote(s)
		SOS: 4299.0 vote(s)
		RDH: 4229 vote(s)
Lauderdale contains differences in these races:
	PCON04DNEI has a difference of 10.0 vote(s)
		SOS: 518.0 vote(s)
		RDH: 508 vote(s)
Marshall contains differences in these races:
	P22GOVDSMI has a difference of -1.0 vote(s)
		SOS: 61.0 vote(s)
		RDH: 62 vote(s)
St. Clair contains differences in these races:
	P22GOVR

<p><a name="readme"></a></p>

### Creating README

In [78]:
rm_df = pd.read_csv('./field_names.csv')
rm_df.columns = ['VEST', 'PIVOT']

In [79]:
rm_df.head()

Unnamed: 0,VEST,PIVOT
0,G22A01NO,PROPOSED STATEWIDE AMENDMENT NUMBER ONE (1)-:-No
1,G22A01YES,PROPOSED STATEWIDE AMENDMENT NUMBER ONE (1)-:-Yes
2,P22ATGRMAR,ATTORNEY GENERAL-:-Steve Marshall-:-REP
3,P22ATGRSTI,ATTORNEY GENERAL-:-Harry Bartlett Still III-:-REP
4,P22AUDRCOO,STATE AUDITOR-:-Stan Cooke-:-REP


In [80]:
rm_df['contest'] = rm_df['PIVOT'].apply(lambda x : x.split('-:-')[0])
rm_df['name'] = rm_df['PIVOT'].apply(lambda x: x.split('-:-')[1])
rm_df['party'] = rm_df['PIVOT'].apply(lambda x : x.split('-:-')[-1])
rm_df['contest_stnd'] = rm_df['contest'] + ' - ' + rm_df['party']
rm_df['rm_name'] =rm_df['name'].apply(lambda x: x.split()[-1] + ', ' + ' '.join(x.split()[:-1]))

In [81]:
contests_unord = rm_df['contest'].unique()
#contests_unord

In [82]:
undist = ['PROPOSED STATEWIDE AMENDMENT NUMBER ONE (1)',
          'UNITED STATES SENATOR',
          'GOVERNOR',
          'ATTORNEY GENERAL',
          'SECRETARY OF STATE',
          'STATE AUDITOR']
boe = list(contests_unord[8:11])
psc = list(contests_unord[4:6])
ssc = ['ASSOCIATE JUSTICE OF THE SUPREME COURT, PLACE 5']
us_cong = list(contests_unord[11:16])
al_senate = list(contests_unord[-14:])
al_house = list(contests_unord[16:60])

In [83]:
contests_ord = undist + boe + psc + ssc + us_cong + al_senate + al_house

In [84]:
# Create list of ordered stnd contests for all contest types
contests_order =[]
for i in contests_ord:
    temp_df = rm_df.loc[rm_df['contest'] == i]
    contests_sorted = sorted(temp_df['contest_stnd'].unique().tolist())
    contests_order += contests_sorted

In [85]:
rm_df['vote_tot'] = rm_df['VEST'].apply(lambda x: sum(final_allocation[x]))

In [86]:
# create list to order README and final dataset by contest, and vote share
rm_order =[]
for i in contests_order:
    temp = rm_df.loc[rm_df['contest_stnd'] == i].sort_values('vote_tot', ascending = False)
    rm_order += temp['VEST'].to_list()
#check to make sure all contests included
set(rm_order) == set(vote_cols)

True

In [87]:
#Create order column, mapping ordering list to VEST names
rm_df['Order'] = rm_df['VEST'].map(lambda x: rm_order.index(x))
#order the readme DF
rm_df = rm_df.sort_values('Order')
#create field name column
rm_df['description'] = rm_df['rm_name'] + ' - ' + rm_df['contest_stnd']

In [88]:
rm_df.head()

Unnamed: 0,VEST,PIVOT,contest,name,party,contest_stnd,rm_name,vote_tot,Order,description
0,G22A01NO,PROPOSED STATEWIDE AMENDMENT NUMBER ONE (1)-:-No,PROPOSED STATEWIDE AMENDMENT NUMBER ONE (1),No,No,PROPOSED STATEWIDE AMENDMENT NUMBER ONE (1) - No,"No,",181139,0,"No, - PROPOSED STATEWIDE AMENDMENT NUMBER ONE..."
1,G22A01YES,PROPOSED STATEWIDE AMENDMENT NUMBER ONE (1)-:-Yes,PROPOSED STATEWIDE AMENDMENT NUMBER ONE (1),Yes,Yes,PROPOSED STATEWIDE AMENDMENT NUMBER ONE (1) - Yes,"Yes,",605333,1,"Yes, - PROPOSED STATEWIDE AMENDMENT NUMBER ON..."
33,P22USSDBOY,UNITED STATES SENATOR-:-Will Boyd-:-DEM,UNITED STATES SENATOR,Will Boyd,DEM,UNITED STATES SENATOR - DEM,"Boyd, Will",107594,2,"Boyd, Will - UNITED STATES SENATOR - DEM"
34,P22USSDDEA,UNITED STATES SENATOR-:-Brandaun Dean-:-DEM,UNITED STATES SENATOR,Brandaun Dean,DEM,UNITED STATES SENATOR - DEM,"Dean, Brandaun",32865,3,"Dean, Brandaun - UNITED STATES SENATOR - DEM"
35,P22USSDJAC,UNITED STATES SENATOR-:-Lanny Jackson-:-DEM,UNITED STATES SENATOR,Lanny Jackson,DEM,UNITED STATES SENATOR - DEM,"Jackson, Lanny",28403,4,"Jackson, Lanny - UNITED STATES SENATOR - DEM"


In [89]:
#create fields_dict
fields_dict = dict(zip(rm_df['VEST'], rm_df['description']))

In [90]:
### Create README

fields_dict['UNIQUE_ID']='Unique ID for each precinct'
fields_dict['COUNTYFP']='County FIP identifier'
fields_dict['county']='County Name'
fields_dict['precinct']='Precinct Name'

title = "Alabama 2022 Primary Election Precinct-Level Results"
retrieval_date = "10/05/23"
fields_dict = fields_dict
github_link = "https://github.com/nonpartisan-redistricting-datahub/pber_collection/tree/main/AL"
file_folder = "./"
source = "Alabama Secretary of State"

In [91]:
def full_readme_text(title, retrieval_date, source, fields_dict, github_link):

#First section of README
    readme_p1 = '''{title}\n
## RDH Date Retrieval
{retrieval_date}

## Sources
{source}

## Notes on Field Names (adapted from VEST):
Columns reporting votes generally follow the pattern: 
One example is:
G16PREDCLI
The first character is G for a general election, P for a primary, S for a special, and R for a runoff.
Characters 2 and 3 are the year of the election.*
Characters 4-6 represent the office type (see list below).
Character 7 represents the party of the candidate.
Characters 8-10 are the first three letters of the candidate's last name.

*To fit within the GIS 10 character limit for field names, the naming convention is slightly different for the State Legislature, Public Service Commissioners and US House of Representatives. All fields are listed below with definitions.

Office Codes Used:
A## - Proposed Statewide Amendement
AGR - Commissioner of Agriculture
ATG - Attorney General
AUD - State Auditor
BOE## - State Board of Education
GOV - Governor
PS# - Public Service Commissioner
SOS - Secretary Of State
USS - United States Senator
CON## - U.S. Congress
SL###  - State Legislative Lower
SU##  - State Legislative Upper
SSC - Associate Justice State Supreme Court

## Fields:
'''.format(title = title, source = source, retrieval_date = retrieval_date)

#Second section of README
    fields_table = pd.DataFrame.from_dict(fields_dict.items())
    fields_table.columns = ["Field Name", "Description"]
    readme_p2 = fields_table.to_string(formatters={'Description':'{{:<{}s}}'.format(fields_table['Description'].str.len().max()).format, 'Field Name':'{{:<{}s}}'.format(fields_table['Field Name'].str.len().max()).format}, index=False, justify = "left")

#Third section of README
    readme_p3 = '''\n
## Processing Steps
Visit the RDH GitHub and the processing script for this code [here]({github_link})

## Additional Notes

Files were checked against republican and democratic offician election results files also available from the Alabama Secretary of State. Results matches exactly except for in the instances below.


In the following counties, we were able to account for the vote total discrepancies between the SOS totals and RDH totals.

SOS totals were missing provisional ballots for the following candidates in the counties below:
Conecuh County - P22GOVDFOR, P22GOVDKEN, P22GOVDSMI, P22USSDBOY, P22USSDDEA, PCON02DHAR, PCON02DPAT, PSU23DMEL, PSU23DSAN, P22GOVRBLA
Elmore County - P22GOVFLO
Lawrence County - P22ATGRMAR, P22AUDRCOO, P22AUDRGLO, P22AUDRSOR, P22GOVRBLA, P22GOVRBUR, P22GOVRIVE, P22GOVRJAM, P22GOVRODL, P22PS1RHAM, P22PS1RODE, P22PS1RWOO, P22PS2RBEE, P22PS2RLIT, P22PS2RMCC, P22SOSRALL, P22SOSRPAC, P22SOSRZEI, P22USSRBRI, P22USSRBRO, P22USSRDUR, PSL007RROB, PSL007RYAR, PSSC5RCOO, PSSC5RJON

Precinct-level returns for Covington County had no provisional votes reported. We believe the vote total discrepancies in this county were likely due to the omission of provisional ballots in the precint-level results.
The dataset reflects the addition of the following votes as provisional.
Covington County
       - P22ATGRMAR  +19 votes
       - P22ATGRSTI +2 votes
       - P22AUDRCOO +6 votes
       - P22AUDRGLO +4 votes
       - P22AUDRSOR +9 votes
       - P22GOVRBLA +3 votes
       - P22GOVRBUR +1 votes
       - P22GOVRIVE +16 votes
       - P22GOVRJAM +3 votes
       - P22GOVRJON +1 votes
       - P22GOVRODL +2 votes
       - P22PS1RHAM +10 votes
       - P22PS1RODE +3 votes
       - P22PS1RWOO +5 votes
       - P22PS2RBEE +10 votes
       - P22PS2RLIT +1 votes
       - P22PS2RMCC +7 votes
       - P22SOSRALL +9 votes
       - P22SOSRHOR +2 votes
       - P22SOSRPAC +2 votes
       - P22SOSRZEI +9 votes
       - P22USSRBRI +10 votes
       - P22USSRBRO +7 votes
       - P22USSRDUR +7 votes
       - PSL092RHAM +12 votes
       - PSL092RWHI +13 votes
       - PSSC5RCOO +14 votes
       - PSSC5RJON +8 votes
       - PSU31RCAR +10 votes
       - PSU31RHOR +2 votes
       - PSU31RJON +14 votes   
       

For the following counties and candidates, we were unable to account for vote total discrepancies between the precinct-level results and the state-level results.
In some instances, we believe the discrepancy is likely due to a data entry error or typo. Since we are unable to determine if each typo happened at the county or state level, we have chosen to leave the precinct-level results unaltered in the following cases.

Baldwin County - P22GOVRGEO RDH has -1 vote
Barbour County - P22PS2RLIT RDH has -10 votes
Blount County - P22PS2RMCC RDH has -1 vote
Crenshaw County - P22USSDBOY RDH has +1 vote
Elmore County - P22SOSRZEI RDH has +40 votes
Marshall County - P22GOVDSMI RDH has +1 vote
Russell County - G22A01YES  RDH has +4 votes, G22A01NO   RDH has +1 vote
St. Clair County - P22GOVRBUR RDH has -10 votes
Walker County- RDH has one fewer vote for each of the following candidates - P22GOVDJAM, P22GOVDSMI, P22USSDBOY, P22USSDJAC, PCON04DGOR, PCON04DNEI
Escambia County - P22ATGRSTI RDH has +310 votes, we believe this is an unintentional error or typo, since SOS has 6 votes recorded for this candidate in total.
Houston County - PBOE02RBAL RDH has -70 votes
Lauderdale County- PCON04DNEI RDH has -10 votes

Please direct questions related to processing this dataset to info@redistrictingdatahub.org.
'''.format(github_link=github_link)
    
    full_readme = str(readme_p1)+str(readme_p2)+str(readme_p3)
    return full_readme

In [92]:
if not os.path.exists(file_folder):
    os.mkdir(file_folder)

with open(file_folder+"README.txt", 'w') as tf:
        tf.write(full_readme_text(title, retrieval_date, source, fields_dict, github_link))

<p><a name="exp"></a></p>

### Exporting Cleaned Precinct-Level Dataset

In [93]:
rm_order = ['UNIQUE_ID', 'COUNTYFP', 'county', 'precinct'] + rm_order

In [94]:
#checks
len(rm_order) == len(final_allocation_cov.columns)
set(rm_order) == set(final_allocation_cov.columns)

True

In [95]:
#reorder df
final_allocation_cov = final_allocation_cov[rm_order]

In [96]:
if not os.path.exists("./al_2022_prim_prec/"):
    os.mkdir("./al_2022_prim_prec/")

final_allocation.to_csv("./al_2022_prim_prec/al_2022_prim_prec.csv", index = False)