# Discrepancy between voting-age adults and voters currently registered

## Datasets

- "MnRegisteredVoterCountsByPrecinctJune2021.csv"
    - Number of registered voters by precinct for all precincts in MN as of June 1, 2021
    - From MN SOS: https://www.sos.state.mn.us/election-administration-campaigns/data-maps/voter-registration-counts/
    - Will be updated June 1, 2022
    - Note: a precinct can have more than one entry, since precincts are also split by school district, etc.
    
    
- "18PlusVotersByVotingDistrictInDodgeOlmsteadCounties.csv"
    - Dataset of people 18+ by voting district (precinct) for all precincts in Olmsted and Dodge counties.
    - From Census.gov: https://data.census.gov/cedsci/table?t=Populations%20and%20People%3AVoting%20and%20Registration&g=0500000US27039,27039%247000000,27109,27109%247000000&tid=DECENNIALPL2020.P3&tp=true
    - Also has breakdowns by race if we want to look into that
    

- "18PlusVotersByVotingDistrictInDodgeOlmsteadCounties_AddedPrecinctCodes_Trimmed.csv"
    - Dataset of people 18+ by voting district (precinct) for all precincts in State Senate District 24.
    - Same dataset as above, edited to add a column with the corresponding precinct codes (taken from the MN SOS dataset)
    - Also trimmed to include only precincts in Senate District 24
        - (Ideally we would pull the census.gov dataset of people 18+ by voting district (precinct) for Senate District 24 directly, but the census grouping of voting districts within senate districts does not seem to have been updated for redistricting [https://data.census.gov/cedsci/table?t=Populations%20and%20People%3AVoting%20and%20Registration&g=0500000US27039%247000000_610XX00US27024&tid=DECENNIALPL2020.P3&tp=true], so we take the voting districts in the two counties and manually remove the ones we don't need.)

## Part 1: Number of people 18+ by precinct

In [1]:
import pandas as pd
import numpy as np

In [2]:
# Read the modified data file taken from the census.gov website

eighteenPlusDF = pd.read_csv('18PlusVotersByVotingDistrictInDodgeOlmsteadCounties_AddedPrecinctCodes_Trimmed.csv')

In [3]:
# Clean up data - We only want the precinct name (includes county), precinct code, and count of 18+ voters

eighteenPlusTotalsDF = eighteenPlusDF[['Label (Grouping)','Precinct Code','Total:']] # keeps only the label, precinct code, and total count columns
eighteenPlusTotalsCleanDF = eighteenPlusTotalsDF[eighteenPlusTotalsDF['Total:'].notna()] # gets only the rows that have values in the "Total:" column
eighteenPlusTotalsCleanDF = eighteenPlusTotalsCleanDF[eighteenPlusTotalsCleanDF['Precinct Code'].notna()] # gets only the rows that have values in the "Precinct Code" column
eighteenPlusTotalsCleanDF = eighteenPlusTotalsCleanDF.rename(columns = {'Total:':'Number of People 18+'})
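
The two `notna()` filters above could also be written as a single `dropna` call. The sketch below is only an illustration: `toyDF` and `toyCleanDF` are hypothetical names, and the rows are made up rather than taken from the census file. It reuses the real column names to show the equivalent cleanup.

```python
import pandas as pd

# Toy stand-in for the census table; these rows are made up for illustration.
toyDF = pd.DataFrame({
    'Label (Grouping)': ['Dodge County, Minnesota',
                         'Ashland Twp, Dodge County, Minnesota',
                         'Total:'],
    'Precinct Code': [None, 5.0, None],
    'Total:': [None, '312', None],
})

# Keep the three columns of interest, drop rows missing either value, then rename.
toyCleanDF = (
    toyDF[['Label (Grouping)', 'Precinct Code', 'Total:']]
    .dropna(subset=['Total:', 'Precinct Code'])
    .rename(columns={'Total:': 'Number of People 18+'})
)
print(toyCleanDF)
```

Only the row that has both a precinct code and a total survives, which is the same result the two-step filter gives.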

### Now we have the data table of the number of eligible voters (people 18+) for each precinct in State Senate District 24

In [4]:
print(eighteenPlusTotalsCleanDF)

                                      Label (Grouping)  Precinct Code  \
5                 Ashland Twp, Dodge County, Minnesota            5.0   
7                Canisteo Twp, Dodge County, Minnesota           10.0   
9                   Claremont, Dodge County, Minnesota           15.0   
11              Claremont Twp, Dodge County, Minnesota           20.0   
13                Concord Twp, Dodge County, Minnesota           25.0   
15           Dodge Center P-1, Dodge County, Minnesota           30.0   
17           Dodge Center P-2, Dodge County, Minnesota           32.0   
19              Ellington Twp, Dodge County, Minnesota           35.0   
21                   Hayfield, Dodge County, Minnesota           40.0   
23               Hayfield Twp, Dodge County, Minnesota           45.0   
25                 Kasson P-1, Dodge County, Minnesota           50.0   
27                 Kasson P-2, Dodge County, Minnesota           51.0   
29                 Kasson P-3, Dodge County, Minnes

## Part 2: Number of registered voters by precinct

In [5]:
# Read the data file taken from the MN SOS website:

sen24RegVotersDF = pd.read_csv('MnRegisteredVoterCountsByPrecinctJune2021.csv')

In [6]:
# Clean up data - We only want county name, precinct code, precinct name, number of registered voters

sen24RegVotersCleanDF = sen24RegVotersDF[['County Name','Precinct Name','Precinct Code','Number of Registered Voters']]

### Now we have the data table of the number of registered voters for each precinct in Minnesota:

In [7]:
print(sen24RegVotersCleanDF)

          County Name                 Precinct Name  Precinct Code  \
0              Aitkin                        AITKIN            5.0   
1              Aitkin                    AITKIN TWP           10.0   
2              Aitkin                BALL BLUFF TWP           15.0   
3              Aitkin                BALL BLUFF TWP           15.0   
4              Aitkin                    BALSAM TWP           20.0   
5              Aitkin                    BEAVER TWP           25.0   
6              Aitkin                     CLARK TWP           30.0   
7              Aitkin                   CORNISH TWP           35.0   
8              Aitkin                   CORNISH TWP           35.0   
9              Aitkin  UNORG 47-24 (DAVIDSON UNORG)           38.0   
10             Aitkin               FARM ISLAND TWP           40.0   
11             Aitkin               FARM ISLAND TWP           40.0   
12             Aitkin                   FLEMING TWP           45.0   
13             Aitki
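
The output above lists some precincts more than once (for example BALL BLUFF TWP), which matches the note that precincts are split by school district and similar boundaries. As a hedged sketch, the split rows could be collapsed with a `groupby` on county and precinct code. The frame `toySosDF` and its counts are hypothetical and only mimic the SOS layout.

```python
import pandas as pd

# Toy rows mimicking the SOS layout; the counts are made up for illustration.
toySosDF = pd.DataFrame({
    'County Name': ['Aitkin', 'Aitkin', 'Aitkin', 'Dodge'],
    'Precinct Name': ['AITKIN', 'BALL BLUFF TWP', 'BALL BLUFF TWP', 'ASHLAND TWP'],
    'Precinct Code': [5.0, 15.0, 15.0, 5.0],
    'Number of Registered Voters': [100, 40, 60, 200],
})

# Sum the split entries so each (county, precinct code) pair appears once.
# Grouping on the county as well matters because precinct codes repeat across counties.
toyTotalsDF = (
    toySosDF
    .groupby(['County Name', 'Precinct Code'], as_index=False)['Number of Registered Voters']
    .sum()
)
print(toyTotalsDF)
```

Grouping by precinct code alone would wrongly merge Aitkin's precinct 5 with Dodge's precinct 5, so the county is part of the key.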

## Part 3: Compare the number of registered voters and the number of eligible voters by precinct, side by side in the same table

In [8]:
# Create an empty NumPy array to store the number of registered voters for each unique precinct

totalRegVotersByPrecinct = np.empty(len(eighteenPlusTotalsCleanDF))

In [9]:
# Go through unique precinct codes, add up reg voter counts for each precinct in SOS data. 
# Make sure counties are the same for precinct codes being compared 
# (ex: state sen dist 24 has 2 Precinct 5s, one in Dodge and one in Olmsted)

sen24UniquePNames = eighteenPlusTotalsCleanDF['Label (Grouping)'].tolist()
sen24UniquePCodes = eighteenPlusTotalsCleanDF['Precinct Code'].tolist()
num18PlusByPrecinct = eighteenPlusTotalsCleanDF['Number of People 18+'].tolist()

sosDataCounties = sen24RegVotersCleanDF['County Name'].tolist()
sosDataPcodes = sen24RegVotersCleanDF['Precinct Code'].tolist()
sosDataNumVoters = sen24RegVotersCleanDF['Number of Registered Voters'].tolist()

for i, pcode in enumerate(sen24UniquePCodes): # goes through each precinct code in sen 24
    totalRegVotersByPrecinct[i] = 0 # each precinct's count starts at 0
    matchingPcodeIndices = [j for j, val in enumerate(sosDataPcodes) if val == pcode] # indices of the SOS rows where the current precinct code appears
    for indx in matchingPcodeIndices:
        if sosDataCounties[indx] in sen24UniquePNames[i]: # makes sure the SOS row is in the same county as the current precinct
            totalRegVotersByPrecinct[i] += sosDataNumVoters[indx] # same precinct code and same county: add the registered voter count to this precinct's total
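
The loop above matches rows on precinct code and then checks the county. A possible alternative, sketched here under assumptions, is to extract the county from the census label and let `merge` do the matching. All frames and counts below (`censusDF`, `sosDF`, `sosTotals`, `mergedDF`) are hypothetical toy data. The regular expression assumes every label ends in `", <County> County, Minnesota"`, and it requires an exact county match where the loop uses a substring test.

```python
import pandas as pd

# Toy census-style and SOS-style frames; values are made up for illustration.
censusDF = pd.DataFrame({
    'Label (Grouping)': ['Ashland Twp, Dodge County, Minnesota',
                         'Rochester W-1 P-1, Olmsted County, Minnesota'],
    'Precinct Code': [5.0, 96.0],
})
sosDF = pd.DataFrame({
    'County Name': ['Dodge', 'Dodge', 'Olmsted', 'Aitkin'],
    'Precinct Code': [5.0, 5.0, 96.0, 5.0],
    'Number of Registered Voters': [120, 30, 900, 75],
})

# Pull the county name out of the label (assumes the ", <County> County," pattern).
censusDF['County Name'] = censusDF['Label (Grouping)'].str.extract(
    r',\s*([^,]+) County,', expand=False)

# Collapse split SOS rows to one total per (county, precinct code).
sosTotals = sosDF.groupby(
    ['County Name', 'Precinct Code'], as_index=False)['Number of Registered Voters'].sum()

# Left merge keeps every census precinct; unmatched precincts get 0, like the loop.
mergedDF = censusDF.merge(sosTotals, on=['County Name', 'Precinct Code'], how='left')
mergedDF['Number of Registered Voters'] = mergedDF['Number of Registered Voters'].fillna(0)
print(mergedDF)
```

The Aitkin row with precinct code 5 is ignored because its county does not match, which is the same guard the loop applies.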

In [10]:
# add the array of registered voter counts to the data table with the number of eligible voters

eighteenPlusTotalsCleanDF['Number of Registered Voters'] = totalRegVotersByPrecinct 

In [11]:
# Reindex data table

eighteenPlusTotalsCleanDF = eighteenPlusTotalsCleanDF.reset_index(drop=True) # resets the indices to start from 0; drop=True discards the old indices instead of saving them in an 'index' column

### Now we can see the number of eligible voters (18+) and number of registered voters in each precinct side by side:

In [12]:
print(eighteenPlusTotalsCleanDF)

                                     Label (Grouping)  Precinct Code  \
0                Ashland Twp, Dodge County, Minnesota            5.0   
1               Canisteo Twp, Dodge County, Minnesota           10.0   
2                  Claremont, Dodge County, Minnesota           15.0   
3              Claremont Twp, Dodge County, Minnesota           20.0   
4                Concord Twp, Dodge County, Minnesota           25.0   
5           Dodge Center P-1, Dodge County, Minnesota           30.0   
6           Dodge Center P-2, Dodge County, Minnesota           32.0   
7              Ellington Twp, Dodge County, Minnesota           35.0   
8                   Hayfield, Dodge County, Minnesota           40.0   
9               Hayfield Twp, Dodge County, Minnesota           45.0   
10                Kasson P-1, Dodge County, Minnesota           50.0   
11                Kasson P-2, Dodge County, Minnesota           51.0   
12                Kasson P-3, Dodge County, Minnesota           

In [13]:
# create a list of zeros to store the percent of eligible voters who are registered

percentVotersReg = [0] * len(eighteenPlusTotalsCleanDF)

In [14]:
# calculate the percents of registered voters for each precinct

numVotersReg = eighteenPlusTotalsCleanDF['Number of Registered Voters'].tolist()
numVotersEligible = eighteenPlusTotalsCleanDF['Number of People 18+'].tolist()

for n in range(len(numVotersReg)):
    numVotersEligible[n] = numVotersEligible[n].replace(',', '') # removes the thousands separator from the string (ex: 1,556 becomes 1556)
    percentVotersReg[n] = round((numVotersReg[n]/(int(numVotersEligible[n])))*100) # converts the eligible count to int, divides registered by eligible, and rounds to the nearest whole percent
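
The same calculation can be done column-wise, without a Python loop. This is a minimal sketch on a hypothetical frame `toyDF`: the first row is made up, and it assumes the 18+ counts are strings that may contain thousands separators, as in the loop above.

```python
import pandas as pd

# Toy frame: the 18+ counts are strings, some with thousands separators.
toyDF = pd.DataFrame({
    'Number of People 18+': ['1,556', '392', '2,544'],
    'Number of Registered Voters': [1167.0, 261.0, 1668.0],
})

# Strip the commas and convert the whole column to integers in one step.
eligible = toyDF['Number of People 18+'].str.replace(',', '', regex=False).astype(int)

# Divide registered by eligible, scale to percent, and round to a whole number.
toyDF['% of Registered Voters'] = (
    toyDF['Number of Registered Voters'] / eligible * 100
).round().astype(int)
print(toyDF)
```

A precinct with an 18+ count of 0 would divide by zero and fail at the final integer conversion, so such rows would need handling first.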

In [15]:
# add percent of registered voters by precinct to data table

eighteenPlusTotalsCleanDF['% of Registered Voters'] = percentVotersReg 

### Now we can see the percent of registered voters in each precinct

In [16]:
print(eighteenPlusTotalsCleanDF)

                                     Label (Grouping)  Precinct Code  \
0                Ashland Twp, Dodge County, Minnesota            5.0   
1               Canisteo Twp, Dodge County, Minnesota           10.0   
2                  Claremont, Dodge County, Minnesota           15.0   
3              Claremont Twp, Dodge County, Minnesota           20.0   
4                Concord Twp, Dodge County, Minnesota           25.0   
5           Dodge Center P-1, Dodge County, Minnesota           30.0   
6           Dodge Center P-2, Dodge County, Minnesota           32.0   
7              Ellington Twp, Dodge County, Minnesota           35.0   
8                   Hayfield, Dodge County, Minnesota           40.0   
9               Hayfield Twp, Dodge County, Minnesota           45.0   
10                Kasson P-1, Dodge County, Minnesota           50.0   
11                Kasson P-2, Dodge County, Minnesota           51.0   
12                Kasson P-3, Dodge County, Minnesota           

In [17]:
### Precincts in order of percentage of registered voters:

In [18]:
# rename precinct name column 
eighteenPlusTotalsCleanDF = eighteenPlusTotalsCleanDF.rename(columns = {'Label (Grouping)':'Precinct Name'})


# sorts by percentage of registered voters
eighteenPlusTotalsCleanDF.sort_values('% of Registered Voters')

| | Precinct Name | Precinct Code | Number of People 18+ | Number of Registered Voters | % of Registered Voters |
|---|---|---|---|---|---|
| 44 | Rochester W-4 P-6, Olmsted County, Minnesota | 164.0 | 828 | 513.0 | 62 |
| 26 | Rochester W-1 P-1, Olmsted County, Minnesota | 96.0 | 1500 | 928.0 | 62 |
| 45 | Rochester W-4 P-7, Olmsted County, Minnesota | 166.0 | 2544 | 1668.0 | 66 |
| 50 | Rochester Twp P-3, Olmsted County, Minnesota | 212.0 | 6 | 4.0 | 67 |
| 2 | Claremont, Dodge County, Minnesota | 15.0 | 392 | 261.0 | 67 |
| 6 | Dodge Center P-2, Dodge County, Minnesota | 32.0 | 488 | 336.0 | 69 |
| 12 | Kasson P-3, Dodge County, Minnesota | 52.0 | 1180 | 843.0 | 71 |
| 31 | Rochester W-1 P-6, Olmsted County, Minnesota | 106.0 | 645 | 462.0 | 72 |
| 38 | Rochester W-2 P-3, Olmsted County, Minnesota | 126.0 | 2567 | 1838.0 | 72 |
| 35 | Rochester W-1 P-10, Olmsted County, Minnesota | 114.0 | 1484 | 1110.0 | 75 |
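
Note that `sort_values` returns a new, sorted frame and leaves the original untouched. The call above therefore only changes what the cell displays, and `eighteenPlusTotalsCleanDF` keeps its original row order for any later step. The sketch below shows this on a hypothetical three-row frame (`toyDF` and its values are made up).

```python
import pandas as pd

# Toy frame with made-up precinct names and percentages.
toyDF = pd.DataFrame({
    'Precinct Name': ['A', 'B', 'C'],
    '% of Registered Voters': [75, 62, 67],
})

# sort_values returns a sorted copy; toyDF itself is not modified.
sortedDF = toyDF.sort_values('% of Registered Voters')

# Assign the result (and optionally renumber the rows) to keep the sorted order.
sortedDF = sortedDF.reset_index(drop=True)

print(toyDF['Precinct Name'].tolist())     # original order is unchanged
print(sortedDF['Precinct Name'].tolist())  # lowest percentage first
```

If the sorted order is wanted downstream, the sorted result has to be assigned to a variable and that variable used from then on.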


In [19]:
# export the df to a csv file 

eighteenPlusTotalsCleanDF.to_csv('percentOfRegisteredVotersInSen24.csv', index=False, encoding='utf-8')