# North Carolina Voter Registration Analysis

### Requirements
- Download the North Carolina voter registration database, available here: https://www.ncsbe.gov/results-data/voter-registration-data
- Use the BISG implementation available here: https://surgeo.readthedocs.io/en/dev/
- Use the “weighted estimator” as described in this paper: https://arxiv.org/pdf/1811.11154

### Task

Write code (preferably in Python) to approximate the racial composition of each political party (DEM, REP, LIB, IND) using the weighted estimator, with the BISG implementation as your proxy predictor. Do this for a county of your choosing. Also choose an appropriate visualization to show the error of your estimates relative to the true race proportions.
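One way to read the task is sketched below. The linked paper presents the weighted estimator in terms of outcome rates per group; applying the same probability-weighting idea to party composition is an assumption made here, not something the brief spells out. Each voter contributes their proxy probability of belonging to each race, and these fractional memberships are summed within a party and normalized. The function name `weighted_composition`, the race labels, and the toy numbers are all hypothetical.

```python
import pandas as pd


def weighted_composition(probs: pd.DataFrame, party: pd.Series) -> pd.DataFrame:
    """Estimate the race shares within each party from proxy probabilities.

    probs: one row per voter, one column per race, values are P(race | proxies).
    party: party label for each voter, aligned with probs by index.
    """
    # Sum the fractional race memberships of all voters in each party
    totals = probs.groupby(party).sum()
    # Normalize every party's row so that its race shares sum to 1
    return totals.div(totals.sum(axis=1), axis=0)


# Toy example with made-up probabilities for four voters
probs = pd.DataFrame(
    {"white": [0.9, 0.2, 0.7, 0.5], "black": [0.1, 0.8, 0.3, 0.5]}
)
party = pd.Series(["DEM", "DEM", "REP", "REP"])
est = weighted_composition(probs, party)
print(est)
```

The result has one row per party and one column per race, so it can be compared cell by cell against the true proportions computed from the self-reported race field.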

### Some things to keep in mind
- You will need to do a little data processing on the North Carolina voter registration dataset. Make sure the code you write for this is well documented and easy to follow.
- I recommend wrapping the BISG library in a custom class, since we will be implementing many other methods for prediction by proxy. Try writing a “ProxyPredictor” interface that contains an “inference” method.
- Your subclass’s implementation of the “inference” method should take a pandas data frame as input and output a pandas data frame with race predictions.
  Note: this method will not be complicated for this example; it should just connect the functionality of Surgeo (the BISG library) to the codebase that you are developing.
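A minimal sketch of what such an interface could look like follows. `ProxyPredictor` and `inference` come from the notes above; everything else is an assumption. In particular, `BISGPredictor` assumes Surgeo exposes `surgeo.SurgeoModel().get_probabilities(surnames, zctas)` as shown in its documentation, and that the processed table has `last_name` and `zip_code` columns holding clean surnames and 5-digit ZCTA strings. `UniformPredictor` is a hypothetical baseline included only so the interface can be exercised without Surgeo installed.

```python
from abc import ABC, abstractmethod

import pandas as pd


class ProxyPredictor(ABC):
    """Common interface for all proxy-based race predictors."""

    @abstractmethod
    def inference(self, df: pd.DataFrame) -> pd.DataFrame:
        """Return one row of race predictions per input row."""


class BISGPredictor(ProxyPredictor):
    """Thin wrapper around Surgeo's BISG model (assumed API, see text above)."""

    def __init__(self, surname_col: str = "last_name", zcta_col: str = "zip_code"):
        # Column names are assumptions about the processed voter table
        self.surname_col = surname_col
        self.zcta_col = zcta_col

    def inference(self, df: pd.DataFrame) -> pd.DataFrame:
        import surgeo  # imported lazily so the interface works without Surgeo

        model = surgeo.SurgeoModel()
        return model.get_probabilities(df[self.surname_col], df[self.zcta_col])


class UniformPredictor(ProxyPredictor):
    """Hypothetical baseline: assigns equal probability to every race."""

    def __init__(self, races):
        self.races = list(races)

    def inference(self, df: pd.DataFrame) -> pd.DataFrame:
        value = 1.0 / len(self.races)
        return pd.DataFrame(value, index=df.index, columns=self.races)


voters = pd.DataFrame({"last_name": ["AABEL", "AARON"]})
preds = UniformPredictor(["white", "black"]).inference(voters)
print(preds)
```

Keeping the Surgeo call behind `inference` means later proxy methods only need to provide a new subclass; the estimation and plotting code can depend on `ProxyPredictor` alone.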

### Download Dataset:


In [2]:
import os

# Path where the extracted statewide voter file is expected
datapath = "data/ncvoter_Statewide.txt"
os.makedirs("data", exist_ok=True)

# Download and extract the statewide file only if it is not already present
if not os.path.isfile(datapath):
    !wget -O data.zip "https://s3.amazonaws.com/dl.ncsbe.gov/data/ncvoter_Statewide.zip"
    !unzip data.zip -d data

In [5]:
import pandas as pd

# Read the voter file as tab-delimited, Latin-1 encoded text
voter_data = pd.read_csv(datapath, sep='\t', encoding="latin1")
voter_data.head()


| | county_id | county_desc | voter_reg_num | ncid | last_name | first_name | middle_name | name_suffix_lbl | status_cd | voter_status_desc | ... | sanit_dist_abbrv | sanit_dist_desc | rescue_dist_abbrv | rescue_dist_desc | munic_dist_abbrv | munic_dist_desc | dist_1_abbrv | dist_1_desc | vtd_abbrv | vtd_desc |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | ALAMANCE | 9005990 | AA56273 | AABEL | RUTH | EVELYN | | R | REMOVED | ... | | | | | | | | | | |
| 1 | 1 | ALAMANCE | 9178574 | AA201627 | AARDEN | JONI | AUTUMN | | R | REMOVED | ... | | | | | | | | | | |
| 2 | 1 | ALAMANCE | 9205561 | AA216996 | AARMSTRONG | TIMOTHY | DUANE | | A | ACTIVE | ... | | | | | | | 17.0 | PROSECUTORIAL DISTRICT 17 | 103 | 103 |
| 3 | 1 | ALAMANCE | 9048723 | AA98377 | AARON | CHRISTINA | CASTAGNA | | A | ACTIVE | ... | | | | | BUR | BURLINGTON | 17.0 | PROSECUTORIAL DISTRICT 17 | 03S | 03S |
| 4 | 1 | ALAMANCE | 9019674 | AA69747 | AARON | CLAUDIA | HAYDEN | | A | ACTIVE | ... | | | | | BUR | BURLINGTON | 17.0 | PROSECUTORIAL DISTRICT 17 | 124 | 124 |
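The preview shows that the file mixes counties and registration statuses, so a first processing step could narrow it to one county. The sketch below uses the `county_desc` and `voter_status_desc` columns visible in the preview; the helper name `select_county` and the choice to keep only `ACTIVE` registrations are assumptions, not requirements from the brief. The toy frame reuses a few columns from the five preview rows.

```python
import pandas as pd


def select_county(
    voters: pd.DataFrame, county: str, status: str = "ACTIVE"
) -> pd.DataFrame:
    """Keep only voters registered in `county` with the given status."""
    in_county = voters["county_desc"] == county
    has_status = voters["voter_status_desc"] == status
    return voters.loc[in_county & has_status].reset_index(drop=True)


# Toy frame built from the preview rows (subset of columns)
preview = pd.DataFrame(
    {
        "county_desc": ["ALAMANCE"] * 5,
        "last_name": ["AABEL", "AARDEN", "AARMSTRONG", "AARON", "AARON"],
        "voter_status_desc": ["REMOVED", "REMOVED", "ACTIVE", "ACTIVE", "ACTIVE"],
    }
)
active = select_county(preview, "ALAMANCE")
print(active["last_name"].tolist())
```

On the full dataset the same call would be `select_county(voter_data, "ALAMANCE")`, with any county name that appears in `county_desc`.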


