In [1]:
import os
import geopandas as gp
import pandas as pd
wd = os.getcwd()

# Butler_County_Turnout_09_23_2022

## Background:
- We received a request from Get Out The Vote for "Unregistered voters butler county ohio", with the aim to get "people registered to vote".
- RDH does not have information on individuals who are not registered to vote, but we can provide counts of how many people are registered at various geographies.
- The user was also interested in turnout statistics from the 2020 general election, as well as other demographic statistics.
- We used data at the block level, as it is the most granular data we can provide.

## Approach:
- Use three RDH datasets, all for Ohio: Disaggregated 2020 CVAP to the Block Level, 2020 Voter Turnout at the 2020 Block Level, and Public Law 94-171 redistricting data.
- Filter each dataset to Butler County and join them on the block GEOID to produce a single block-level dataset.
- Create unregistered voter estimates at the block level by subtracting total registered voters from Citizen Voting Age Population (CVAP) and, separately, from Voting Age Population (VAP). We provide both estimates because some users prefer data that is published directly at the block level (VAP), while others prefer the CVAP estimate for voting-related work because it accounts for citizenship.
- Please note the unregistered voter estimates are *estimates* based on the two methods described above.
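
As a minimal sketch of the subtraction described above, the following uses made-up values for three hypothetical blocks (not real Butler County figures). The column names mirror those used later in this notebook; `toy` and `negative_blocks` are illustrative names only.

```python
import pandas as pd

# Toy values for three hypothetical blocks (illustrative only, not real data).
toy = pd.DataFrame({
    "GEOID20": ["390170001001000", "390170001001001", "390170001001002"],
    "CVAP_TOT20": [120.5, 40.0, 10.0],   # disaggregated CVAP can be fractional
    "VAP_TOT20": [130, 42, 9],
    "total_reg": [100, 35, 12],
})

# Estimate = population measure minus registered voters, one per method.
toy["UNREG_COUNT_EST_CVAP"] = (toy["CVAP_TOT20"] - toy["total_reg"]).round(2)
toy["UNREG_COUNT_EST_VAP"] = (toy["VAP_TOT20"] - toy["total_reg"]).round(2)

# Blocks where the registered count exceeds the population measure
# produce a negative estimate; list them so they can be reviewed.
negative_blocks = toy[
    (toy["UNREG_COUNT_EST_CVAP"] < 0) | (toy["UNREG_COUNT_EST_VAP"] < 0)
]
print(negative_blocks[["GEOID20", "UNREG_COUNT_EST_CVAP", "UNREG_COUNT_EST_VAP"]])
```

Either estimate can come out negative for a block whenever its registered count is higher than its CVAP or VAP value, so it may be worth checking for such blocks before using the estimates.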

## Links to datasets used:
- 2020 OH L2 Voterfile Elections Turnout Statistics Aggregated to Census Blocks: https://redistrictingdatahub.org/dataset/2020-oh-l2-voterfile-elections-turnout-statistics-aggregated-to-census-blocks/
- Ohio CVAP Data Disaggregated to the 2020 Block Level (2020): https://redistrictingdatahub.org/dataset/ohio-cvap-data-disaggregated-to-the-2020-block-level-2020/
- Ohio block PL 94-171 2020 (by table): https://redistrictingdatahub.org/dataset/ohio-block-pl-94171-2020-by-table/


You can find the final file on the Redistricting Data Hub website here: https://redistrictingdatahub.org/dataset/butler-county-ohio-2020-voter-statistics-and-unregistered-voter-estimates-on-2020-census-blocks/

*Please note that in order to run this notebook, you need the files listed above downloaded to your working directory.*

Read in voter file and query out for Butler County (FIPS 017)

In [2]:
vf = pd.read_csv(os.path.join(wd,'OH_l2_turnout_2020blockAgg','OH_l2_turnout_stats_block20.csv'))
# geoid20 may be read as a number; keep only the digits before any decimal point
vf['GEOID20']=vf['geoid20'].apply(lambda x: str(x).split('.')[0])
# Characters 3-5 of the block GEOID are the county FIPS code
vf['COUNTY']=vf['GEOID20'].str[2:5]
vf_bc = vf[vf['COUNTY']=='017'].copy()
# Round the percentage fields on the Butler County subset
for col in vf_bc.columns:
    if 'pct' in col:
        vf_bc[col] = vf_bc[col].round(2)
vf_bc.drop(columns='geoid20',inplace=True)
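
The cell above splits `geoid20` on `'.'`, which suggests the field may be read as a float. As a rough sketch of how a 15-digit census block GEOID breaks down (2-digit state, 3-digit county, 6-digit tract, 4-digit block), the hypothetical helper below, `parse_block_geoid`, handles string, integer, and float inputs. It is not part of the original workflow.

```python
def parse_block_geoid(raw):
    """Split a census block GEOID into state, county, tract, and block.

    `raw` may be a string, an int, or a float such as 390170001001000.0.
    """
    # Drop any decimal part, then restore a leading zero lost to numeric parsing.
    text = str(raw).split(".")[0].zfill(15)
    if len(text) != 15 or not text.isdigit():
        raise ValueError(f"not a 15-digit block GEOID: {raw!r}")
    return {
        "state": text[0:2],
        "county": text[2:5],
        "tract": text[5:11],
        "block": text[11:15],
    }


parts = parse_block_geoid(390170001001000.0)
print(parts["county"])  # '017' is Butler County, Ohio
```

An alternative that avoids the float conversion altogether is to pass `dtype={'geoid20': str}` to `pd.read_csv`, assuming the CSV stores the identifier without a decimal part.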

Read in CVAP data and query out for Butler County

In [3]:
cvap = gp.read_file(os.path.join(os.path.join(wd,'oh_cvap_2020_2020_b'),'oh_cvap_2020_2020_b.shp'))
cvap_bc = cvap[cvap['COUNTYFP20']==17].copy()
cvap_bc['GEOID20']=cvap_bc['GEOID20'].astype(str)

Read in PL data and query out for Butler County

In [4]:
pl = gp.read_file(os.path.join(os.path.join(wd,'oh_pl2020_b'),'oh_pl2020_p4_b.shp'))
pl_bc = pl[pl['COUNTY']=='017']

Merge all files together

In [5]:
joined = pd.merge(cvap_bc,vf_bc,on='GEOID20',how='outer',indicator=False)
pl_bc = pl_bc[['GEOID20','P0040001']]
joined = pd.merge(joined,pl_bc,on='GEOID20',how='outer',indicator=False)
joined = joined.fillna(0)
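
Because the merges above use `how='outer'` and then fill missing values with 0, blocks that are absent from one source end up with zeros rather than being flagged. A small sketch of checking join coverage before filling, using `indicator=True` on toy frames (the GEOID values and the names `left`, `right`, and `checked` are made up for illustration):

```python
import pandas as pd

# Toy stand-ins for two block-level tables keyed on GEOID20.
left = pd.DataFrame({"GEOID20": ["A", "B", "C"], "CVAP_TOT20": [10.0, 20.0, 30.0]})
right = pd.DataFrame({"GEOID20": ["B", "C", "D"], "total_reg": [15, 25, 5]})

# indicator=True adds a `_merge` column marking where each row came from.
checked = pd.merge(left, right, on="GEOID20", how="outer", indicator=True)

# Count rows found in both tables versus only one of them.
counts = checked["_merge"].value_counts().to_dict()
print(counts)

# Keys present on only one side; these are the rows fillna(0) would zero-fill.
unmatched = checked.loc[checked["_merge"] != "both", "GEOID20"].tolist()
print(unmatched)
```

For a check like this to be meaningful, the key column needs the same dtype on both sides (for example, all strings), since otherwise no rows would match.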

Create unregistered voter count estimates

In [6]:
joined['UNREG_COUNT_EST_CVAP'] = round(joined['CVAP_TOT20'] - joined['total_reg'],2)
joined['UNREG_COUNT_EST_VAP'] = round(joined['P0040001'] - joined['total_reg'],2)
joined = joined.fillna(0)

Clean file and organize fields into "short" and "long" files

In [7]:
joined.rename(columns = {'P0040001':'VAP_TOT20'},inplace=True)
first_cols = ['GEOID20','COUNTYFP20','CVAP_TOT20','VAP_TOT20','total_reg','UNREG_COUNT_EST_CVAP','UNREG_COUNT_EST_VAP','g20201103_voted_all']
# Fields kept out of the remaining-column list: the leading fields above, plus COUNTY and geometry
excluded = set(first_cols) | {'COUNTY','geometry'}
joined_order = [c for c in joined.columns if c not in excluded]
all_cols = first_cols + joined_order

joined_long = joined[all_cols]
joined_short = joined[first_cols]

Export data

In [8]:
joined_short.to_csv('butler_county_request_short.csv',index=False)
joined_long.to_csv('butler_county_request_long.csv',index=False)