# How to Prepare Data for ZRP Predictions
This notebook illustrates how to use `ZRP_Prepare`, a module that prepares user input data for generating predictions, models, and analysis.

In [1]:
%load_ext autoreload
%autoreload 2
%config Completer.use_jedi=False

In [2]:
from os.path import join, expanduser, dirname
import pandas as pd
import sys
import os
import re
import warnings

In [3]:
warnings.filterwarnings(action='ignore')
home = expanduser('~')

src_path = '{}/zrp'.format(home)
sys.path.append(src_path)

In [4]:
from zrp.prepare.prepare import ZRP_Prepare
from zrp.prepare.utils import load_file



## Load sample data for prediction
Load processed list of New Jersey Mayors downloaded from https://www.nj.gov/dca/home/2022mayors.csv 

In [24]:
os.listdir()

['acs_mapper.py',
 'acs_lookup.py',
 '__init__.py',
 'geo_lookup.py',
 'geo_geocoder.py',
 'utils.py',
 'preprocessing.py',
 '.ipynb_checkpoints',
 'preparing_the_data.ipynb',
 'base.py',
 'prepare.py']

In [25]:
nj_mayors = load_file("/Users/jay/Documents/zrp/examples/2022-nj-mayors-sample.csv")
nj_mayors.shape

(462, 9)

In [26]:
nj_mayors

Unnamed: 0,first_name,middle_name,last_name,house_number,street_address,city,state,zip_code,ZEST_KEY
0,Gabe,,Plumer,782,Frenchtown Road,Milford,NJ,08848,2
1,Ari,,Bernstein,500,West Crescent Avenue,Allendale,NJ,07401,4
2,David,J.,Mclaughlin,125,Corlies Avenue,Allenhurst,NJ,07711-1049,5
3,Thomas,C.,Fritts,8,North Main Street,Allentown,NJ,08501-1607,6
4,P.,,McCkelvey,49,South Greenwich Street,Alloway,NJ,08001-0425,7
...,...,...,...,...,...,...,...,...,...
457,William,,Degroff,3943,Route,Chatsworth,NJ,08019,558
458,Joseph,,Chukwueke,200,Cooper Avenue,Woodlynne,NJ,08107-2108,559
459,Paul,,Sarlo,85,Humboldt Street,Wood-Ridge,NJ,07075-2344,560
460,Craig,,Frederick,120,Village Green Drive,Woolwich Township,NJ,08085-3180,562


#### ZRP Prepare  
To prepare the data we will use `ZRP_Prepare` 

Input to the prediction/modeling pipeline is tabular data with the following columns: first name, middle name, last name, house number, street address (street name), city, state, zip code, and zest key. The `ZEST_KEY` must be specified to establish correspondence between inputs and outputs; it is effectively used as the index of the data table.
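
As a quick pre-flight check before calling `ZRP_Prepare`, the column requirements above can be verified by hand. The sketch below is dependency-free and illustrative only: `REQUIRED_COLUMNS` and `check_input` are hypothetical names, not part of the ZRP API, and the column names are taken from the sample table shown earlier.

```python
# Hypothetical pre-flight check; not part of the zrp package.
REQUIRED_COLUMNS = [
    "first_name", "middle_name", "last_name", "house_number",
    "street_address", "city", "state", "zip_code", "ZEST_KEY",
]


def check_input(rows):
    """Return (missing_columns, key_is_unique) for a list of row dicts."""
    present = set()
    for row in rows:
        present.update(row.keys())
    missing = [col for col in REQUIRED_COLUMNS if col not in present]
    keys = [row.get("ZEST_KEY") for row in rows]
    return missing, len(keys) == len(set(keys))


# Two rows copied from the sample data above.
rows = [
    {"first_name": "Gabe", "middle_name": "", "last_name": "Plumer",
     "house_number": "782", "street_address": "Frenchtown Road",
     "city": "Milford", "state": "NJ", "zip_code": "08848", "ZEST_KEY": "2"},
    {"first_name": "Ari", "middle_name": "", "last_name": "Bernstein",
     "house_number": "500", "street_address": "West Crescent Avenue",
     "city": "Allendale", "state": "NJ", "zip_code": "07401", "ZEST_KEY": "4"},
]

missing, key_is_unique = check_input(rows)
print(missing, key_is_unique)
```

A non-empty `missing` list or a `False` uniqueness flag would be worth resolving before running `fit`.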

`ZRP_Prepare` processes this input data into the feature vectors required for prediction. When called, `.transform()` geocodes the data (converting addresses to block groups or census tracts) and matches the geocoded data against American Community Survey (ACS) lookup tables. This links each input record to demographic data for the individual's geography, so the input data gains additional features that are then used for prediction.

In [27]:
nj_mayors.head(2)

Unnamed: 0,first_name,middle_name,last_name,house_number,street_address,city,state,zip_code,ZEST_KEY
0,Gabe,,Plumer,782,Frenchtown Road,Milford,NJ,8848,2
1,Ari,,Bernstein,500,West Crescent Avenue,Allendale,NJ,7401,4


In [28]:
%%time
prepare = ZRP_Prepare()
# fit only checks the data quality
prepare.fit(nj_mayors)

# zrp_output = prepare.transform(nj_mayors)

CPU times: user 48 µs, sys: 22 µs, total: 70 µs
Wall time: 75.1 µs


In [39]:
from os.path import dirname, join, expanduser
from zrp.validate import ValidateGeo
from preprocessing import *
from base import BaseZRP
from utils import *
import pandas as pd
import numpy as np
import statistics
import json
import sys
import os
import re

In [115]:
curpath = '/Users/jay/Documents/zrp/zrp/prepare'

In [116]:
data_path = join(curpath, '../data/processed')
lookup_tables_config = load_json(join(data_path, "lookup_tables_config.json"))

In [119]:
print(data_path)

/Users/jay/Documents/zrp/zrp/prepare/../data/processed


In [117]:
lookup_tables_config

{'acs_year': '2019', 'acs_span': '5yr', 'geo_year': '2019'}

In [53]:
geo_folder = os.path.join(data_path, "geo", lookup_tables_config['geo_year'])
acs_folder = os.path.join(data_path, 'acs', lookup_tables_config['acs_year'], lookup_tables_config['acs_span'])

In [54]:
geo_folder 

'/Users/jay/Documents/zrp/zrp/prepare/../data/processed/geo/2019'

In [55]:
acs_folder

'/Users/jay/Documents/zrp/zrp/prepare/../data/processed/acs/2019/5yr'
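
The two folder paths above are simply the year and span from `lookup_tables_config` appended to `data_path`. A small standalone sketch of the same construction follows, using `posixpath` so the result is identical on every platform. The normalization step at the end is an addition for readability, not something the notebook does.

```python
import posixpath

lookup_tables_config = {"acs_year": "2019", "acs_span": "5yr", "geo_year": "2019"}
data_path = "/Users/jay/Documents/zrp/zrp/prepare/../data/processed"

geo_folder = posixpath.join(data_path, "geo", lookup_tables_config["geo_year"])
acs_folder = posixpath.join(
    data_path, "acs", lookup_tables_config["acs_year"], lookup_tables_config["acs_span"]
)

# Optional: collapse the "prepare/.." segment so the path is easier to read.
geo_folder_clean = posixpath.normpath(geo_folder)
print(geo_folder_clean)
```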

In [59]:
gen_process = ProcessStrings(file_path=prepare.file_path)

In [61]:
gen_process.fit(data)

   [Start] Validating input data
     Number of observations: 462
     Is key unique: True
   [Completed] Validating input data



<preprocessing.ProcessStrings at 0x7fa409e00fa0>

In [62]:
data = gen_process.transform(data)

   Formatting P1
   Formatting P2
   reduce whitespace


In [63]:
data.head()

ZEST_KEY,first_name,middle_name,last_name,house_number,street_address,city,state,zip_code,house_number_LEFT,house_number_RIGHT
2,GABE,,PLUMER,782,FRENCHTOWN ROAD,MILFORD,NJ,8848,,782
4,ARI,,BERNSTEIN,500,WEST CRESCENT AVENUE,ALLENDALE,NJ,7401,,500
5,DAVID,J,MCLAUGHLIN,125,CORLIES AVENUE,ALLENHURST,NJ,77111049,,125
6,THOMAS,C,FRITTS,8,NORTH MAIN STREET,ALLENTOWN,NJ,85011607,,8
7,P,,MCCKELVEY,49,SOUTH GREENWICH STREET,ALLOWAY,NJ,80010425,,49
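
Judging from the output above, string processing upper-cases the text fields, drops punctuation (`J.` becomes `J`), and reduces whitespace. A rough approximation of that kind of cleanup is sketched below. `normalize_text` is a hypothetical helper written for illustration; it is not the `ProcessStrings` implementation, and it only handles ASCII letters and digits.

```python
import re


def normalize_text(value):
    """Upper-case, strip punctuation, and collapse runs of whitespace."""
    if value is None:
        return ""
    text = str(value).upper()
    text = re.sub(r"[^A-Z0-9\s]", "", text)  # drop punctuation such as "." and "-"
    text = re.sub(r"\s+", " ", text)         # reduce whitespace
    return text.strip()


print(normalize_text("J."))
print(normalize_text("  West  Crescent Avenue "))
print(normalize_text("07711-1049"))
```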


In [64]:
inv_state_map = load_json(join(data_path, "inv_state_mapping.json"))

In [66]:
gen_process.state

'state'

In [67]:
# Replace the state abbreviations with state FIPS codes and store them in the zest_in_state_fips column
data['zest_in_state_fips'] = data[gen_process.state].replace(inv_state_map)

In [68]:
data.head()

ZEST_KEY,first_name,middle_name,last_name,house_number,street_address,city,state,zip_code,house_number_LEFT,house_number_RIGHT,zest_in_state_fips
2,GABE,,PLUMER,782,FRENCHTOWN ROAD,MILFORD,NJ,8848,,782,34
4,ARI,,BERNSTEIN,500,WEST CRESCENT AVENUE,ALLENDALE,NJ,7401,,500,34
5,DAVID,J,MCLAUGHLIN,125,CORLIES AVENUE,ALLENHURST,NJ,77111049,,125,34
6,THOMAS,C,FRITTS,8,NORTH MAIN STREET,ALLENTOWN,NJ,85011607,,8,34
7,P,,MCCKELVEY,49,SOUTH GREENWICH STREET,ALLOWAY,NJ,80010425,,49,34
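
The `replace` call above is a dictionary lookup from the two-letter state abbreviation to its FIPS code. A plain-Python sketch of the same idea follows. The mapping is a two-entry stand-in for `inv_state_mapping.json`: the `'NJ': '34'` pair matches the output above, while storing California as the unpadded string `'6'` is an assumption made only to show why a `zfill(2)` step is useful.

```python
# Stand-in for inv_state_mapping.json (assumed excerpt, see the note above).
inv_state_map = {"NJ": "34", "CA": "6"}


def state_to_fips(state):
    """Return the 2-digit state FIPS code, or None for an unknown abbreviation."""
    code = inv_state_map.get(state.strip().upper())
    return None if code is None else code.zfill(2)


print(state_to_fips("NJ"))
print(state_to_fips("ca"))
print(state_to_fips("ZZ"))
```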


In [69]:
gen_process.file_path

In [107]:
from geo_geocoder import *
geocode = ZGeo(file_path=gen_process.file_path)

In [108]:
geocode_out = [] 
geo_grps = data.groupby([gen_process.state])

In [109]:
geo_dict = {}
for s, g in geo_grps:
    geo_dict[s] = g
gdkys = list(geo_dict.keys())
print("  The following states are included in the data:", gdkys)

  The following states are included in the data: ['NJ']
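
Grouping by state means each state's rows can be geocoded together, presumably against that state's own lookup table. The same bucketing can be sketched without pandas; the records and the `group_by_state` helper below are made up for illustration.

```python
from collections import defaultdict


def group_by_state(records, state_field="state"):
    """Bucket row dicts by their state value, preserving input order."""
    groups = defaultdict(list)
    for record in records:
        groups[record[state_field]].append(record)
    return dict(groups)


records = [
    {"ZEST_KEY": "2", "state": "NJ"},
    {"ZEST_KEY": "4", "state": "NJ"},
    {"ZEST_KEY": "900", "state": "NY"},  # hypothetical extra row
]
geo_dict = group_by_state(records)
print(list(geo_dict.keys()))
```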


In [110]:
if not set(gdkys) <= set(inv_state_map.keys()):
    raise ValueError("Provided unrecognizable state codes. Please use the standard 2-letter abbreviation to indicate states to geocode, e.g. 'CA' for California")


In [111]:
inv_state_map.keys()

dict_keys(['AL', 'AK', 'AZ', 'AR', 'CA', 'CO', 'CT', 'DE', 'DC', 'FL', 'GA', 'HI', 'ID', 'IL', 'IN', 'IA', 'KS', 'KY', 'LA', 'ME', 'MD', 'MA', 'MI', 'MN', 'MS', 'MO', 'MT', 'NE', 'NV', 'NH', 'NJ', 'NM', 'NY', 'NC', 'ND', 'OH', 'OK', 'OR', 'PA', 'RI', 'SC', 'SD', 'TN', 'TX', 'UT', 'VT', 'VA', 'WA', 'WV', 'WI', 'WY', 'AS', 'FM', 'GU', 'MH', 'MP', 'PW', 'PR', 'UM', 'VI'])
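
The subset test in the validation cell can be made more informative by reporting exactly which codes were not recognized. A sketch, with `find_unknown_states` as a hypothetical helper and a shortened list of known codes:

```python
def find_unknown_states(states, known_states):
    """Return the sorted state codes that are not in the known set."""
    return sorted(set(states) - set(known_states))


known_states = ["NJ", "NY", "CA"]  # shortened stand-in for inv_state_map.keys()

unknown = find_unknown_states(["NJ", "New Jersey", "XX"], known_states)
if unknown:
    message = f"Unrecognized state codes: {unknown}. Use 2-letter abbreviations, e.g. 'CA'."
    print(message)
```

Listing the offending values makes it easier to spot inputs that spell out the state name instead of using the abbreviation.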

In [122]:
geo_out = [] 
for s in tqdm(gdkys):
    print("   ... on state:", str(s))
    geo = inv_state_map[s].zfill(2)
    output = geocode.transform(geo_dict[s], geo, processed = True, replicate = True, save_table = True)
    geocode_out.append(output)
    break

  0%|                                                                              | 0/1 [00:00<?, ?it/s]
  0%|                                                                            | 0/462 [00:00<?, ?it/s]
[Parallel(n_jobs=-1)]: Using backend ThreadingBackend with 8 concurrent workers.
[Parallel(n_jobs=-1)]: Done  34 tasks      | elapsed:    0.0s
[Parallel(n_jobs=-1)]: Done 184 tasks      | elapsed:    0.0s
[Parallel(n_jobs=-1)]: Done 434 tasks      | elapsed:    0.0s
100%|███████████████████████████████████████████████████████████████| 462/462 [00:00<00:00, 32162.13it/s]
[Parallel(n_jobs=-1)]: Done 462 out of 462 | elapsed:    0.0s finished


   ... on state: NJ

   Data is loaded
   [Start] Processing geo data
      ...address cleaning
      ...replicating address
         ...Base
         ...Number processing...
         House number dataframe expansion is complete! (n=462)
         ...Base
         ...Map street suffixes...
         ...Mapped & split by street suffixes...
         ...Number processing...

         Address dataframe expansion is complete! (n=900)
      ...formatting
   [Completed] Processing geo data
   [Start] Mapping geo data
      ...merge user input & lookup table
      ...mapping


  0%|                                                                              | 0/1 [00:02<?, ?it/s]

   [Completed] Validating input geo data
Directory already exists
...Output saved
   [Completed] Mapping geo data





In [124]:
pd.set_option('display.max_columns',None)

In [126]:
output.head(2)

ZEST_KEY,first_name,middle_name,last_name,house_number,street_address,city,state,zip_code,house_number_LEFT,house_number_RIGHT,zest_in_state_fips,ZEST_KEY_COL,small,big,ZIP_Match_1,ZIP_Match_2,FROMHN_numeric,TOHN_numeric,house_numer_numeric,GEOID_CT,GEOID_BG,GEOID_ZIP,GEOID
10,JOHN,M,MORGAN,137,MAIN STREET,ANDOVER,NJ,7821,,137,34,10,,,,,,,,34037373500,340373735003,7821,
100,MICHAEL,L,TEMPLETON,770,COOPERTOWN ROAD,DELANCO,NJ,8075,,770,34,100,,,,,,,,34005700800,340057008003,8075,


### Zest geo data used here for lookup

In [129]:
pd.read_parquet('/Users/jay/Documents/zrp/zrp/data/processed/geo/2019/Zest_Geo_Lookup_2019_State_34.parquet').head()

Unnamed: 0,STATEFP,COUNTYFP,TRACTCE,BLKGRPCE,ZEST_FULLNAME,FROMHN,TOHN,ZEST_ZIP,ZCTA5CE,ZCTA5CE10,FROMHN_LEFT,FROMHN_RIGHT,TOHN_LEFT,TOHN_RIGHT,PARITY
0,34,1,10900,3,BATCHELOR LN,100,135,8037,8037,8037,,100,,135,B
1,34,1,10900,2,CARA LN,1,32,8037,8037,8037,,1,,32,B
2,34,1,10900,1,CENTENNIAL DR,1,81,8037,8037,8037,,1,,81,B
3,34,1,10900,1,CENTENNIAL DR,83,90,8037,8037,8037,,83,,90,B
4,34,1,10900,1,JAMESTOWN BLVD,200,211,8037,8037,8037,,200,,211,B
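
Each lookup row describes a range of house numbers on a street segment together with the census geography it falls in. The sketch below shows how such a row could be matched and turned into a 12-digit block group GEOID (2-digit state, 3-digit county, 6-digit tract, 1-digit block group). It is a simplified illustration, not the `ZGeo` implementation: the `geocode`, `parity_ok`, and `block_group_geoid` helpers are hypothetical, and reading `PARITY` as `B` (both sides), `E` (even), `O` (odd) is an assumption based on the TIGER convention.

```python
lookup = [
    # Rows copied from the lookup table above.
    {"STATEFP": 34, "COUNTYFP": 1, "TRACTCE": 10900, "BLKGRPCE": 3,
     "ZEST_FULLNAME": "BATCHELOR LN", "FROMHN": 100, "TOHN": 135, "PARITY": "B"},
    {"STATEFP": 34, "COUNTYFP": 1, "TRACTCE": 10900, "BLKGRPCE": 1,
     "ZEST_FULLNAME": "CENTENNIAL DR", "FROMHN": 1, "TOHN": 81, "PARITY": "B"},
    {"STATEFP": 34, "COUNTYFP": 1, "TRACTCE": 10900, "BLKGRPCE": 1,
     "ZEST_FULLNAME": "CENTENNIAL DR", "FROMHN": 83, "TOHN": 90, "PARITY": "B"},
]


def block_group_geoid(row):
    """Zero-pad and concatenate state, county, tract, and block group codes."""
    return f"{row['STATEFP']:02d}{row['COUNTYFP']:03d}{row['TRACTCE']:06d}{row['BLKGRPCE']:d}"


def parity_ok(number, parity):
    """Assumed TIGER-style parity: B = both, E = even only, O = odd only."""
    if parity == "E":
        return number % 2 == 0
    if parity == "O":
        return number % 2 == 1
    return True


def geocode(street, house_number):
    """Return the block group GEOID of the first matching range, else None."""
    for row in lookup:
        low, high = sorted((row["FROMHN"], row["TOHN"]))
        if (row["ZEST_FULLNAME"] == street
                and low <= house_number <= high
                and parity_ok(house_number, row["PARITY"])):
            return block_group_geoid(row)
    return None


print(geocode("BATCHELOR LN", 120))
print(geocode("CENTENNIAL DR", 85))
print(geocode("CENTENNIAL DR", 500))
```

The census tract GEOID is the same string without the final block group digit, which matches the relationship between `GEOID_CT` and `GEOID_BG` in the geocoded output.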


In [132]:
# geo_out = [] 
# for s in tqdm(gdkys):
#     print("   ... on state:", str(s))
#     geo = inv_state_map[s].zfill(2)
#     output = geocode.transform(geo_dict[s], geo, processed = True, replicate = True, save_table = True)
#     geocode_out.append(output)
# if len(geocode_out) > 0:
#     geo_coded = pd.concat(geocode_out)
geo_coded_keys = list(geo_coded.ZEST_KEY_COL.values)
data_not_geo_coded = data[~data.index.isin(geo_coded_keys)]
geo_coded = pd.concat([geo_coded, data_not_geo_coded])

In [135]:
if self.block_group is not None and self.census_tract is not None:
    geo_coded = geo_coded.drop([self.block_group, self.census_tract], axis=1)
    geo_coded = geo_coded.merge(data[[self.block_group, self.census_tract]], right_index=True, left_index=True, how='left')
    geo_coded['GEOID_BG'] = np.where(
        (geo_coded[self.block_group].isna()) | (geo_coded[self.block_group].str.contains("None") | (geo_coded[self.block_group] == '')),
        geo_coded['GEOID_BG'],
        geo_coded[self.block_group])
    geo_coded['GEOID_CT'] = np.where(
        (geo_coded[self.census_tract].isna()) | (geo_coded[self.census_tract].str.contains("None") | (geo_coded[self.census_tract] == '')),
        geo_coded['GEOID_CT'],
        geo_coded[self.census_tract])
    geo_coded = geo_coded.drop([self.block_group, self.census_tract], axis=1)

924
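
The two `np.where` calls implement a simple precedence rule: a block group or census tract supplied in the input data wins, and the geocoded value is used only when the supplied value is missing, empty, or contains the text `None`. The rule for a single value can be sketched as follows; `resolve_geoid` is a hypothetical helper, not a ZRP function.

```python
def resolve_geoid(user_value, geocoded_value):
    """Prefer the user-supplied GEOID; fall back to the geocoded one."""
    missing = (
        user_value is None
        or user_value != user_value   # NaN is the only value not equal to itself
        or str(user_value) == ""
        or "None" in str(user_value)
    )
    return geocoded_value if missing else user_value


print(resolve_geoid(None, "340373735003"))
print(resolve_geoid("", "340373735003"))
print(resolve_geoid("340057008003", "340373735003"))
```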

In [139]:
print("[Completed] Preparing geo data")
print("")

[Completed] Preparing geo data



In [141]:
print("[Start] Preparing ACS data")  
print("   [Start] Validating ACS input data")

[Start] Preparing ACS data
   [Start] Validating ACS input data


In [143]:
validate = ValidateGeocoded()
validate.fit()
acs_validator = validate.transform(geo_coded)

     Number of observations: 924
     Is key unique: False



In [136]:
gen_process.block_group

In [138]:
gen_process.census_tract

In [148]:
# amp = ACSModelPrep(gen_process.params_dict)
data.head(2)

ZEST_KEY,first_name,middle_name,last_name,house_number,street_address,city,state,zip_code,house_number_LEFT,house_number_RIGHT,zest_in_state_fips
2,GABE,,PLUMER,782,FRENCHTOWN ROAD,MILFORD,NJ,8848,,782,34
4,ARI,,BERNSTEIN,500,WEST CRESCENT AVENUE,ALLENDALE,NJ,7401,,500,34


In [150]:
geo_coded.head(2)

ZEST_KEY,first_name,middle_name,last_name,house_number,street_address,city,state,zip_code,house_number_LEFT,house_number_RIGHT,zest_in_state_fips,ZEST_KEY_COL,small,big,ZIP_Match_1,ZIP_Match_2,FROMHN_numeric,TOHN_numeric,house_numer_numeric,GEOID_CT,GEOID_BG,GEOID_ZIP,GEOID
10,JOHN,M,MORGAN,137,MAIN STREET,ANDOVER,NJ,7821,,137,34,10,,,,,,,,34037373500,340373735003,7821,
100,MICHAEL,L,TEMPLETON,770,COOPERTOWN ROAD,DELANCO,NJ,8075,,770,34,100,,,,,,,,34005700800,340057008003,8075,


In [182]:
def acs_search(year, span):
    """
    Searches for processed ACS data
    Parameters:
    -----------
    year: str
        Release year of ACS data
    span: str
        Span of ACS data in years (i.e. '1' or '5')
    Returns:
    --------
    tuple of lists
        Paths of the processed ZIP code, census tract, and block group files
    """
    file_list_z = []
    file_list_c = []
    file_list_b = []
    curpath = dirname('/Users/jay/opt/anaconda3/envs/zrp_test/lib/python3.9/site-packages/zrp/')
    data_path = join(curpath, f'data/processed/acs/{year}/{span}yr')
    print(data_path)
    for root, dirs, files in os.walk(data_path):
        for file in files:
            # Only processed files are considered; the geographic level is read from the file name
            if "_zip" in file and "processed" in file:
                file_list_z.append(os.path.join(root, file))
            if "tract" in file and "processed" in file:
                file_list_c.append(os.path.join(root, file))
            if "blockgroup" in file and "processed" in file:
                file_list_b.append(os.path.join(root, file))
    return (file_list_z, file_list_c, file_list_b)

In [183]:
self = gen_process
# from acs_mapper import *
file_list_z, file_list_c, file_list_b = acs_search(self.year, self.span)

/Users/jay/opt/anaconda3/envs/zrp_test/lib/python3.9/site-packages/zrp/data/processed/acs/2019/5yr
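
`acs_search` sorts the processed ACS files into three lists by looking for `_zip`, `tract`, and `blockgroup` in each file name. The classification step can be tried in isolation on a temporary directory. The `classify_acs_files` helper is hypothetical, and the file names below are invented for the demonstration and may not match the names ZRP actually ships.

```python
import os
import tempfile


def classify_acs_files(data_path):
    """Walk data_path and bucket 'processed' files by geographic level."""
    found = {"zip": [], "tract": [], "blockgroup": []}
    for root, _dirs, files in os.walk(data_path):
        for name in sorted(files):
            if "processed" not in name:
                continue
            if "_zip" in name:
                found["zip"].append(os.path.join(root, name))
            if "tract" in name:
                found["tract"].append(os.path.join(root, name))
            if "blockgroup" in name:
                found["blockgroup"].append(os.path.join(root, name))
    return found


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical file names, created empty just to exercise the matching.
    for name in ["processed_acs_zip.parquet", "processed_acs_tract.parquet",
                 "processed_acs_blockgroup.parquet", "raw_acs_tract.csv"]:
        open(os.path.join(tmp, name), "w").close()
    found = classify_acs_files(tmp)
    counts = {level: len(paths) for level, paths in found.items()}

print(counts)
```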


In [190]:
pd.read_parquet(file_list_z[0]).head()

Unnamed: 0,GEOID,GEO_NAME,EXT_GEOID,B01003_001,B04006_006,B04006_031,B04006_035,B04006_038,B04006_073,B04006_090,B04006_094,B04007_002,B04007_005,B03001_001,B03001_002,B03001_003,B03001_006,B03001_008,B03001_016,B05011_002,B05011_003,B05011_004,B05011_005,B05011_006,B05011_007,B05011_008,B05011_009,B05011_010,B05012_001,B05012_002,B05012_003,B04004_001,B04004_006,B04004_035,B04004_038,B04004_073,B04004_094,B06009_001,B06009_002,B06009_003,B06009_004,B06009_005,B06009_006,B06009_007,B06009_020,B02001_001,B02001_002,B02001_003,B02001_004,B02001_005,B02001_006,B02001_007,B02001_008,B02001_009,B02001_010,B08301_002,B08301_003,B08301_004,B08301_011,B08301_012,B08301_013,B08301_016,B08301_018,B08301_019,B08301_020,B08301_021,B10051A_005,B10051B_003,B10051B_004,B10051B_005,B10051D_001,B10051D_005,B10051D_007,B10051I_001,B10051I_005,B10051I_007,C16001_001,C16001_002,C16001_003,C16001_006,C16001_009,C16001_012,C16001_015,C16001_018,C16001_021,C16001_024,C16001_029,C16001_030,C16001_033,C16001_036,B19001_001,B19001_002,B19001_003,B19001_004,B19001_005,B19001_006,B19001_007,B19001_008,B19001_009,B19001_010,B19001_011,B19001_012,B19001_013,B19001_014,B19001_015,B19001_016,B19001_017,B19001B_001,B19001B_002,B19001B_003,B19001B_004,B19001B_005,B19001B_009,B19001B_013,B19001D_001,B19001D_012,B19001D_013,B19001D_014,B19001D_015,B19001D_016,B19001H_010,B19001I_001,B19001I_012,B19001I_013,B19001I_014,B23020_001,B23020_002,B23020_003,B25004_004,B25075_001,B25075_002,B25075_003,B25075_004,B25075_005,B25075_006,B25075_007,B25075_008,B25075_009,B25075_010,B25075_011,B25075_012,B25075_013,B25075_014,B25075_015,B25075_016,B25075_017,B25075_018,B25075_019,B25075_020,B25075_021,B25075_022,B25075_023,B25075_024,B25075_025,B99021_001,B99021_002,B99021_003,B99162_007
0,1001,ZCTA5 01001,86000US01001,17312,26,63,130,293,23,13,113,14772,2540,17312,16414,898,42,19,0,396,1298,28,150,285,285,134,152,264,17312,15618,1694,8349,0,100,143,23,113,13291,1016,4018,3969,2640,1648,8816,51,17312,16030,450,6,478,0,39,309,0,309,7980,7666,314,77,0,0,19,0,57,137,446,87,0,0,0,0,0,0,0,0,0,16356,13520,484,394,0,1043,574,59,74,0,9,87,0,93,7413,393,320,238,342,216,287,381,203,263,784,833,732,870,562,650,339,212,0,32,0,0,0,0,152,0,14,46,8,11,263,176,15,14,0,,,,0,5408,8,0,40,15,18,0,15,0,0,8,58,95,19,97,563,602,755,1312,778,794,131,43,57,0,17312,177,17135,2418
1,1002,ZCTA5 01002,86000US01002,30014,153,100,586,759,1224,14,145,23386,6628,30014,27754,2260,59,322,258,3430,2050,198,619,107,399,385,46,296,30014,24534,5480,12894,60,381,448,996,80,14069,597,1904,2226,3385,5957,4256,41,30014,22651,1687,137,3367,119,457,1596,224,1372,9500,8332,1168,1452,11,41,17,566,1954,213,1182,175,17,19,0,41,41,9,11,7,7,29142,23227,1322,423,412,179,1083,297,1005,97,0,399,53,599,9798,938,465,423,322,479,426,694,177,387,556,562,1228,857,432,822,1030,347,14,42,11,23,0,28,973,111,142,86,29,58,203,564,39,81,14,,,,33,4802,19,0,32,0,15,0,0,0,0,0,0,0,0,48,62,126,67,475,739,1474,821,732,135,53,30014,2312,27702,5020
2,1003,ZCTA5 01003,86000US01003,11357,138,7,56,97,161,31,125,6270,5087,11357,10579,778,40,77,218,1100,539,28,330,58,38,81,0,4,11357,9718,1639,3101,39,31,35,92,67,105,25,19,44,13,4,67,0,11357,8329,486,40,2066,0,77,359,26,333,524,420,104,121,0,0,8,0,2122,0,507,0,0,0,0,0,0,0,0,0,0,11357,8466,537,115,79,72,621,121,722,253,0,205,59,107,42,24,14,0,0,0,0,0,0,0,0,0,4,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,,,,0,4,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,4,0,0,11357,4926,6431,1735
3,1005,ZCTA5 01005,86000US01005,5128,15,0,0,0,0,0,1,4185,943,5128,5025,103,18,0,0,57,130,0,0,30,22,0,42,36,5128,4941,187,2198,0,0,0,0,1,3612,166,1377,885,572,612,2932,0,5128,4862,122,0,33,30,0,81,45,36,2386,2042,344,0,0,0,0,0,0,0,422,61,0,0,0,0,0,0,0,0,0,5001,4870,60,20,0,13,22,0,0,0,0,0,0,0,1944,78,34,17,91,16,36,73,70,142,290,221,154,164,181,209,168,67,0,0,0,0,25,0,0,0,0,0,0,0,100,18,0,0,18,,,,41,1674,25,0,0,0,0,0,0,16,44,0,0,0,0,83,150,115,144,357,298,300,46,17,21,20,5128,196,4932,101
4,1007,ZCTA5 01007,86000US01007,15005,179,120,69,148,72,0,0,12859,2146,15005,14791,214,12,32,39,278,631,26,203,8,147,75,22,150,15005,14096,909,6615,165,15,148,58,0,10131,485,2339,2384,2559,2364,6458,0,15005,14007,83,0,654,0,39,222,0,222,7964,7082,882,60,0,0,0,23,138,12,336,227,0,0,0,0,0,0,0,0,0,14126,13013,145,143,69,50,185,25,308,0,14,174,0,0,5563,190,105,124,65,170,273,128,83,186,265,618,862,693,501,838,462,83,0,0,0,0,0,58,200,43,11,23,0,0,186,50,17,0,0,,,,0,4531,0,0,16,13,0,0,31,32,0,48,32,81,0,37,42,155,173,1056,1020,1216,389,142,30,0,15005,231,14774,1080


In [192]:
pd.read_parquet(file_list_c[0]).head(2)

Unnamed: 0,GEOID,GEO_NAME,EXT_GEOID,B01003_001,B02001_001,B02001_002,B02001_003,B02001_004,B02001_005,B02001_006,B02001_007,B02001_008,B02001_009,B02001_010,B03001_001,B03001_002,B03001_003,B03001_006,B03001_008,B03001_012,B03001_016,B04004_001,B04004_006,B04004_035,B04004_038,B04004_073,B04004_094,B04006_006,B04006_035,B04006_038,B04006_049,B04006_073,B04006_094,B04007_002,B04007_005,B05011_001,B05011_002,B05011_003,B05011_004,B05011_005,B05011_006,B05011_007,B05011_008,B05011_009,B05011_010,B05012_001,B05012_002,B05012_003,B06009_001,B06009_002,B06009_003,B06009_004,B06009_005,B06009_006,B06009_014,B06009_025,B08301_002,B08301_003,B08301_004,B08301_011,B08301_012,B08301_013,B08301_016,B08301_018,B08301_019,B08301_020,B08301_021,B10051B_002,C16001_001,C16001_002,C16001_003,C16001_006,C16001_009,C16001_012,C16001_015,C16001_018,C16001_021,C16001_024,C16001_029,C16001_030,C16001_033,C16001_036,B19001_001,B19001_002,B19001_003,B19001_004,B19001_005,B19001_006,B19001_007,B19001_008,B19001_009,B19001_010,B19001_011,B19001_012,B19001_013,B19001_014,B19001_015,B19001_016,B19001_017,B19001A_001,B19001B_001,B19001B_002,B19001B_003,B19001B_013,B19001C_001,B19001D_001,B19001I_001,B19001I_002,B23020_001,B23020_002,B23020_003,B25075_001,B25075_002,B25075_003,B25075_004,B25075_005,B25075_006,B25075_007,B25075_008,B25075_009,B25075_010,B25075_011,B25075_012,B25075_013,B25075_014,B25075_015,B25075_016,B25075_017,B25075_018,B25075_019,B25075_020,B25075_021,B25075_022,B25075_023,B25075_024,B25075_025,B99021_001,B99021_002,B99021_003,B99162_003,B99162_007
0,1001020100,"Census Tract 201, Autauga County, Alabama",14000US01001020100,1993,1993,1685,152,0,2,0,0,154,0,154,1993,1967,26,0,0,0,0,960,0,0,9,28,0,0,0,26,162,28,0,1303,690,14,8,6,6,0,0,0,0,0,0,1993.0,1979.0,14.0,1323.0,166.0,463.0,335.0,205.0,154.0,23.0,14.0,918,836,82,0,0,0,0,0,0,0,25,0,1878,1814,31,6,0,25,0,0,0,0,0,0,0,0,709,26,55,100,28,12,7,34,16,18,56,78,98,24,46,73,38,608,83,4,3,3,0,0,20,0,,,,541,54,0,26,0,0,4,0,54,0,0,33,40,29,19,26,46,19,87,37,34,30,0,0,3,1993,8,1985,64,58
1,1001020200,"Census Tract 202, Autauga County, Alabama",14000US01001020200,1959,1959,759,1117,0,0,21,6,56,0,56,1959,1929,30,0,0,0,0,1162,0,0,22,26,0,0,0,40,178,26,0,1344,615,6,0,6,0,0,0,0,0,0,6,1959.0,1953.0,6.0,1403.0,208.0,646.0,309.0,184.0,56.0,18.0,6.0,724,690,34,0,0,0,0,0,0,0,9,35,1860,1846,0,0,14,0,0,0,0,0,0,0,0,0,688,64,83,65,3,21,52,25,51,30,72,35,90,45,36,16,0,247,431,59,71,24,0,0,0,0,,,,431,3,12,6,0,0,15,0,0,17,39,45,77,33,40,37,42,17,22,8,12,0,6,0,0,1959,37,1922,14,14


In [193]:
pd.read_parquet(file_list_b[0]).head(2)

Unnamed: 0,GEOID,GEO_NAME,EXT_GEOID,B01003_001,B02001_001,B02001_002,B02001_003,B02001_004,B02001_005,B02001_006,B02001_007,B02001_008,B02001_009,B02001_010,B08301_002,B08301_003,B08301_004,B08301_010,B08301_011,B08301_012,B08301_013,B08301_016,B08301_018,B08301_019,B08301_020,B08301_021,C16001_001,C16001_002,C16001_003,C16001_006,C16001_009,C16001_012,C16001_015,C16001_018,C16001_021,C16001_024,C16001_029,C16001_030,C16001_033,C16001_036,B19001_001,B19001_002,B19001_003,B19001_004,B19001_005,B19001_006,B19001_007,B19001_008,B19001_009,B19001_010,B19001_011,B19001_012,B19001_013,B19001_014,B19001_015,B19001_016,B19001_017,B25004_006,B25004_008,B25075_001,B25075_002,B25075_003,B25075_004,B25075_005,B25075_006,B25075_007,B25075_008,B25075_009,B25075_010,B25075_011,B25075_012,B25075_013,B25075_014,B25075_015,B25075_016,B25075_017,B25075_018,B25075_019,B25075_020,B25075_021,B25075_022,B25075_023,B25075_024,B25075_025,B99021_001,B99021_002,B99021_003,B99162_003,B99162_004,B99162_005,B99162_007
0,10010201001,"Block Group 1, Census Tract 201, Autauga Count...",15000US010010201001,730,730,613,60,0,0,0,0,57,0,57,307,254,53,0,0,0,0,0,0,0,0,0,,,,,,,,,,,,,,,315,15,52,62,10,7,7,32,9,8,4,36,19,5,7,26,16,0,0,227,39,0,26,0,0,4,0,28,0,0,0,33,0,2,4,25,0,40,13,13,0,0,0,0,730,0,730,25,0,0,25
1,10010201002,"Block Group 2, Census Tract 201, Autauga Count...",15000US010010201002,1263,1263,1072,92,0,2,0,0,97,0,97,611,582,29,0,0,0,0,0,0,0,0,25,,,,,,,,,,,,,,,394,11,3,38,18,5,0,2,7,10,52,42,79,19,39,47,22,0,16,314,15,0,0,0,0,0,0,26,0,0,33,7,29,17,22,21,19,47,24,21,30,0,0,3,1263,8,1255,39,6,0,33


In [202]:
print("   ...loading ACS lookup tables")
acs_bg = load_file(file_list_b[0])
acs_ct = load_file(file_list_c[0])
acs_zip = load_file(file_list_z[0])

print("   ... combining ACS & user input data")
# data_out = acs_combine(data,
#                             acs_bg,
#                             acs_ct,
#                             acs_zip)

   ... combining ACS & user input data
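
The commented-out `acs_combine` call joins the geocoded rows to the ACS tables. The `acs_source` column in the final output (value `BG` in the rows shown in this notebook) suggests that each row is matched at a particular geographic level. One plausible reading, and it is only an assumption, is a most-granular-first fallback: block group, then census tract, then ZIP code. The sketch below illustrates that idea with tiny in-memory tables. It is not the ZRP implementation; `attach_acs`, the `CT` and `ZIP` labels, and the tract population are made up.

```python
acs_bg = {"340373735003": {"B01003_001": 589}}   # value from the output in this notebook
acs_ct = {"34005700800": {"B01003_001": 4000}}   # made-up value
acs_zip = {}


def attach_acs(row):
    """Attach features from the most granular ACS table that has a match."""
    levels = [
        ("BG", "GEOID_BG", acs_bg),
        ("CT", "GEOID_CT", acs_ct),
        ("ZIP", "GEOID_ZIP", acs_zip),
    ]
    for source, column, table in levels:
        features = table.get(row.get(column))
        if features is not None:
            return {**row, **features, "acs_source": source}
    return {**row, "acs_source": None}


matched_bg = attach_acs({"GEOID_BG": "340373735003", "GEOID_CT": "34037373500", "GEOID_ZIP": "7821"})
matched_ct = attach_acs({"GEOID_BG": None, "GEOID_CT": "34005700800", "GEOID_ZIP": "8075"})
print(matched_bg["acs_source"], matched_ct["acs_source"])
```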


In [22]:
pd.set_option('display.max_columns', None)
zrp_output.head(2)

ZEST_KEY,B01003_001,B02001_001,B02001_002,B02001_003,B02001_004,B02001_005,B02001_006,B02001_007,B02001_008,B02001_009,B02001_010,B03001_001,B03001_002,B03001_003,B03001_006,B03001_008,B03001_012,B03001_016,B04004_001,B04004_006,B04004_035,B04004_038,B04004_073,B04004_094,B04006_006,B04006_031,B04006_035,B04006_038,B04006_049,B04006_073,B04006_090,B04006_094,B04007_002,B04007_005,B05011_001,B05011_002,B05011_003,B05011_004,B05011_005,B05011_006,B05011_007,B05011_008,B05011_009,B05011_010,B05012_001,B05012_002,B05012_003,B06009_001,B06009_002,B06009_003,B06009_004,B06009_005,B06009_006,B06009_007,B06009_014,B06009_020,B06009_025,B08301_002,B08301_003,B08301_004,B08301_010,B08301_011,B08301_012,B08301_013,B08301_016,B08301_018,B08301_019,B08301_020,B08301_021,B10051A_005,B10051B_002,B10051B_003,B10051B_004,B10051B_005,B10051D_001,B10051D_005,B10051D_007,B10051I_001,B10051I_005,B10051I_007,B19001A_001,B19001B_001,B19001B_002,B19001B_003,B19001B_004,B19001B_005,B19001B_009,B19001B_013,B19001C_001,B19001D_001,B19001D_012,B19001D_013,B19001D_014,B19001D_015,B19001D_016,B19001H_010,B19001I_001,B19001I_002,B19001I_012,B19001I_013,B19001I_014,B19001_001,B19001_002,B19001_003,B19001_004,B19001_005,B19001_006,B19001_007,B19001_008,B19001_009,B19001_010,B19001_011,B19001_012,B19001_013,B19001_014,B19001_015,B19001_016,B19001_017,B23020_001,B23020_002,B23020_003,B25004_004,B25004_006,B25004_008,B25075_001,B25075_002,B25075_003,B25075_004,B25075_005,B25075_006,B25075_007,B25075_008,B25075_009,B25075_010,B25075_011,B25075_012,B25075_013,B25075_014,B25075_015,B25075_016,B25075_017,B25075_018,B25075_019,B25075_020,B25075_021,B25075_022,B25075_023,B25075_024,B25075_025,B99021_001,B99021_002,B99021_003,B99162_003,B99162_004,B99162_005,B99162_007,C16001_001,C16001_002,C16001_003,C16001_006,C16001_009,C16001_012,C16001_015,C16001_018,C16001_021,C16001_024,C16001_029,C16001_030,C16001_033,C16001_036,EXT_GEOID,FROMHN_numeric,GEOID,GEOID_BG,GEOID_CT,GEOID_ZIP,GEOID_x,GEOID_y,GEO_NAME,TOHN_numeric,ZEST_KEY_COL,ZIP_Match_1,ZIP_Match_2,acs_source,big,city,first_name,house_number,house_number_LEFT,house_number_RIGHT,house_numer_numeric,last_name,middle_name,small,state,street_address,zest_in_state_fips,zip_code
10,589,589,534,8,0,8,0,0,39,0,39,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,283,277,6,15,0,0,15,0,0,13,0,20,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,245,12,0,17,3,3,0,7,4,8,53,39,32,22,18,13,14,,,,,0,16,146,0,0,0,0,0,0,0,1,0,0,0,0,0,6,0,11,16,12,42,44,12,2,0,0,589,7,582,52,4,4,48,,,,,,,,,,,,,,,15000US340373735003,,,340373735003,34037373500,7821,,340373735003,"Block Group 3, Census Tract 3735, Sussex Count...",,10,,,BG,,ANDOVER,JOHN,137,,137,,MORGAN,M,,NJ,MAIN STREET,34,7821
100,1266,1266,999,233,0,0,0,0,34,0,34,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,416,308,108,27,0,0,27,0,0,0,0,46,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,695,32,48,22,78,0,56,9,47,42,36,123,72,21,37,49,23,,,,,0,0,450,9,0,13,0,0,0,0,0,0,7,0,0,0,13,31,3,16,109,153,87,0,9,0,0,1266,10,1256,171,22,0,149,,,,,,,,,,,,,,,15000US340057008003,,,340057008003,34005700800,8075,,340057008003,"Block Group 3, Census Tract 7008, Burlington C...",,100,,,BG,,DELANCO,MICHAEL,770,,770,,TEMPLETON,L,,NJ,COOPERTOWN ROAD,34,8075


In [9]:
zrp_output.head()

ZEST_KEY,B01003_001,B02001_001,B02001_002,B02001_003,B02001_004,B02001_005,B02001_006,B02001_007,B02001_008,B02001_009,...,house_number_LEFT,house_number_RIGHT,house_numer_numeric,last_name,middle_name,small,state,street_address,zest_in_state_fips,zip_code
10,589,589,534,8,0,8,0,0,39,0,...,,137,,MORGAN,M,,NJ,MAIN STREET,34,7821
100,1266,1266,999,233,0,0,0,0,34,0,...,,770,,TEMPLETON,L,,NJ,COOPERTOWN ROAD,34,8075
106,1722,1722,1447,44,0,108,0,50,73,0,...,,1011,,MEDANY,,,NJ,COOPER STREET,34,8096
107,1071,1071,755,55,0,107,0,137,17,0,...,,37,,BLACKMAN,,,NJ,NORTH SUSSEX STREET,34,7801
108,667,667,578,4,67,3,0,0,15,0,...,,288,,CAMPBELL,,,NJ,MAIN STREET,34,8345
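
The prepared table mixes ACS feature columns (names such as `B01003_001` or `C16001_036`) with the cleaned input and geography columns. The sketch below is one hedged way to separate the two groups by name. The regular expression is an assumption inferred from the column names visible in this output, and `split_columns` is a hypothetical helper, not part of `zrp`.

```python
import re

# Assumed pattern: 'B' or 'C', five digits, an optional suffix letter
# (as in B10051A_005), an underscore, then three digits.
ACS_COLUMN = re.compile(r"^[BC]\d{5}[A-I]?_\d{3}$")


def split_columns(columns):
    """Split column names into (ACS-style names, everything else)."""
    acs = [c for c in columns if ACS_COLUMN.match(c)]
    other = [c for c in columns if not ACS_COLUMN.match(c)]
    return acs, other


# A handful of names copied from the output above.
acs_cols, other_cols = split_columns(
    ["B01003_001", "B10051A_005", "C16001_036", "GEOID_BG", "last_name", "zip_code"]
)
print(acs_cols)    # ['B01003_001', 'B10051A_005', 'C16001_036']
print(other_cols)  # ['GEOID_BG', 'last_name', 'zip_code']
```

On the real table the same helper could be called as `split_columns(zrp_output.columns)`, assuming all column names are strings.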


`ZRP_Prepare` generates multiple artifacts that are automatically saved:
- DataFrame with address-to-GEOID mappings
    - `Zest_Geocoded_test_{year}__{state_fips}.parquet`
- Validation dictionary for input data
    - `input_validator.json`
- Validation dictionary for geographic data
    - `input_geo_validator.json`
- Validation dictionary for American Community Survey data
    - `input_acs_validator.json`
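
The three validation dictionaries are JSON files, so they can be inspected with the standard library. The sketch below is a minimal example under one assumption: you know the directory the artifacts were written to. The `artifact_dir` argument and the `load_validators` helper are hypothetical, not part of `zrp`. It loads whichever of the three files are present and skips the rest.

```python
import json
from pathlib import Path

# File names taken from the artifact list above.
VALIDATOR_FILES = (
    "input_validator.json",
    "input_geo_validator.json",
    "input_acs_validator.json",
)


def load_validators(artifact_dir):
    """Return {file name: parsed JSON} for each validator file found.

    `artifact_dir` is a hypothetical location; point it at wherever
    ZRP_Prepare saved its artifacts on your machine.
    """
    artifact_dir = Path(artifact_dir)
    loaded = {}
    for name in VALIDATOR_FILES:
        path = artifact_dir / name
        if path.is_file():  # skip artifacts that were not generated
            with path.open() as fh:
                loaded[name] = json.load(fh)
    return loaded
```

Calling `load_validators(...)` on a directory that contains none of the files returns an empty dictionary rather than raising, which makes it easy to check which validation steps actually ran.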


In [203]:
zrp_output = prepare.transform(nj_mayors)

  0%|                                                                              | 0/1 [00:00<?, ?it/s]
  0%|                                                                            | 0/462 [00:00<?, ?it/s]
[Parallel(n_jobs=-1)]: Using backend ThreadingBackend with 8 concurrent workers.
[Parallel(n_jobs=-1)]: Done  34 tasks      | elapsed:    0.0s
[Parallel(n_jobs=-1)]: Done 184 tasks      | elapsed:    0.0s
[Parallel(n_jobs=-1)]: Done 434 tasks      | elapsed:    0.0s
100%|███████████████████████████████████████████████████████████████| 462/462 [00:00<00:00, 28435.96it/s]

Data is loaded
   [Start] Validating input data
     Number of observations: 462
     Is key unique: True
Directory already exists
   [Completed] Validating input data

   Formatting P1
   Formatting P2
   reduce whitespace

[Start] Preparing geo data

  The following states are included in the data: ['NJ']
   ... on state: NJ

   Data is loaded
   [Start] Processing geo data
      ...address cleaning
      ...replicating address
         ...Base
         ...Number processing...
         House number dataframe expansion is complete! (n=462)
         ...Base
         ...Map street suffixes...
         ...Mapped & split by street suffixes...
         ...Number processing...




[Parallel(n_jobs=-1)]: Done 462 out of 462 | elapsed:    0.0s finished


         Address dataframe expansion is complete! (n=900)
      ...formatting
   [Completed] Processing geo data
   [Start] Mapping geo data
      ...merge user input & lookup table
      ...mapping


100%|██████████████████████████████████████████████████████████████████████| 1/1 [00:02<00:00,  2.91s/it]

   [Completed] Validating input geo data
Directory already exists
...Output saved
   [Completed] Mapping geo data

[Completed] Preparing geo data

[Start] Preparing ACS data
   [Start] Validating ACS input data
     Number of observations: 462
     Is key unique: True

   [Completed] Validating ACS input data

   ...loading ACS lookup tables





   ... combining ACS & user input data
 ...Copy dataframes
 ...Block group
 ...Census tract
 ...Zip code
 ...No match
 ...Merge
 ...Merging complete
[Complete] Preparing ACS data
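
The `Block group`, `Census tract`, `Zip code`, and `No match` steps in the log suggest a fallback from the finest geography to the coarsest. The sketch below illustrates that general idea with pandas on toy data. It is an assumption about the approach, not the `zrp` implementation: the lookup tables, the `population` values, the `CT` and `ZIP` labels, and the `attach_by_fallback` helper are all invented for illustration (only the `BG` label appears in the `acs_source` column of the real output).

```python
import pandas as pd


def attach_by_fallback(records, lookups):
    """Join each record to the first lookup table that contains its key.

    `lookups` is an ordered list of (label, key_column, table) tuples,
    finest geography first. Records matched at one level are not retried
    at coarser levels; unmatched records are kept with missing values.
    """
    remaining = records.copy()
    matched = []
    for label, key, table in lookups:
        hit = remaining[key].isin(table[key])
        part = remaining[hit].merge(table, on=key, how="left")
        part["acs_source"] = label  # remember which level matched
        matched.append(part)
        remaining = remaining[~hit]
    matched.append(remaining.assign(acs_source=None))  # the "no match" group
    return pd.concat(matched, ignore_index=True)


# Toy inputs: the keys resemble the output above, the values are made up.
records = pd.DataFrame({
    "ZEST_KEY": ["10", "100", "106", "107"],
    "GEOID_BG": ["340373735003", None, None, None],
    "GEOID_CT": ["34037373500", "34005700800", None, None],
    "GEOID_ZIP": ["07821", "08075", "08096", "07801"],
})
bg = pd.DataFrame({"GEOID_BG": ["340373735003"], "population": [589]})
ct = pd.DataFrame({"GEOID_CT": ["34005700800"], "population": [4000]})
zp = pd.DataFrame({"GEOID_ZIP": ["08096"], "population": [30000]})

result = attach_by_fallback(
    records,
    [("BG", "GEOID_BG", bg), ("CT", "GEOID_CT", ct), ("ZIP", "GEOID_ZIP", zp)],
)
print(result[["ZEST_KEY", "acs_source", "population"]])
```

In this toy run the four records are matched at the block group, census tract, and ZIP code levels respectively, and the last one is left unmatched. On real output, `zrp_output["acs_source"].value_counts(dropna=False)` gives a quick summary of how many records were matched at each level.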



In [204]:
zrp_output.head(2)

ZEST_KEY,B01003_001,B02001_001,B02001_002,B02001_003,B02001_004,B02001_005,B02001_006,B02001_007,B02001_008,B02001_009,B02001_010,B03001_001,B03001_002,B03001_003,B03001_006,B03001_008,B03001_012,B03001_016,B04004_001,B04004_006,B04004_035,B04004_038,B04004_073,B04004_094,B04006_006,B04006_031,B04006_035,B04006_038,B04006_049,B04006_073,B04006_090,B04006_094,B04007_002,B04007_005,B05011_001,B05011_002,B05011_003,B05011_004,B05011_005,B05011_006,B05011_007,B05011_008,B05011_009,B05011_010,B05012_001,B05012_002,B05012_003,B06009_001,B06009_002,B06009_003,B06009_004,B06009_005,B06009_006,B06009_007,B06009_014,B06009_020,B06009_025,B08301_002,B08301_003,B08301_004,B08301_010,B08301_011,B08301_012,B08301_013,B08301_016,B08301_018,B08301_019,B08301_020,B08301_021,B10051A_005,B10051B_002,B10051B_003,B10051B_004,B10051B_005,B10051D_001,B10051D_005,B10051D_007,B10051I_001,B10051I_005,B10051I_007,B19001A_001,B19001B_001,B19001B_002,B19001B_003,B19001B_004,B19001B_005,B19001B_009,B19001B_013,B19001C_001,B19001D_001,B19001D_012,B19001D_013,B19001D_014,B19001D_015,B19001D_016,B19001H_010,B19001I_001,B19001I_002,B19001I_012,B19001I_013,B19001I_014,B19001_001,B19001_002,B19001_003,B19001_004,B19001_005,B19001_006,B19001_007,B19001_008,B19001_009,B19001_010,B19001_011,B19001_012,B19001_013,B19001_014,B19001_015,B19001_016,B19001_017,B23020_001,B23020_002,B23020_003,B25004_004,B25004_006,B25004_008,B25075_001,B25075_002,B25075_003,B25075_004,B25075_005,B25075_006,B25075_007,B25075_008,B25075_009,B25075_010,B25075_011,B25075_012,B25075_013,B25075_014,B25075_015,B25075_016,B25075_017,B25075_018,B25075_019,B25075_020,B25075_021,B25075_022,B25075_023,B25075_024,B25075_025,B99021_001,B99021_002,B99021_003,B99162_003,B99162_004,B99162_005,B99162_007,C16001_001,C16001_002,C16001_003,C16001_006,C16001_009,C16001_012,C16001_015,C16001_018,C16001_021,C16001_024,C16001_029,C16001_030,C16001_033,C16001_036,EXT_GEOID,FROMHN_numeric,GEOID,GEOID_BG,GEOID_CT,GEOID_ZIP,GEOID_x,GEOID_y,GEO_NAME,TOHN_numeric,ZEST_KEY_COL,ZIP_Match_1,ZIP_Match_2,acs_source,big,city,first_name,house_number,house_number_LEFT,house_number_RIGHT,house_numer_numeric,last_name,middle_name,small,state,street_address,zest_in_state_fips,zip_code
10,589,589,534,8,0,8,0,0,39,0,39,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,283,277,6,15,0,0,15,0,0,13,0,20,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,245,12,0,17,3,3,0,7,4,8,53,39,32,22,18,13,14,,,,,0,16,146,0,0,0,0,0,0,0,1,0,0,0,0,0,6,0,11,16,12,42,44,12,2,0,0,589,7,582,52,4,4,48,,,,,,,,,,,,,,,15000US340373735003,,,340373735003,34037373500,7821,,340373735003,"Block Group 3, Census Tract 3735, Sussex Count...",,10,,,BG,,ANDOVER,JOHN,137,,137,,MORGAN,M,,NJ,MAIN STREET,34,7821
100,1266,1266,999,233,0,0,0,0,34,0,34,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,416,308,108,27,0,0,27,0,0,0,0,46,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,695,32,48,22,78,0,56,9,47,42,36,123,72,21,37,49,23,,,,,0,0,450,9,0,13,0,0,0,0,0,0,7,0,0,0,13,31,3,16,109,153,87,0,9,0,0,1266,10,1256,171,22,0,149,,,,,,,,,,,,,,,15000US340057008003,,,340057008003,34005700800,8075,,340057008003,"Block Group 3, Census Tract 7008, Burlington C...",,100,,,BG,,DELANCO,MICHAEL,770,,770,,TEMPLETON,L,,NJ,COOPERTOWN ROAD,34,8075
