## Match LODES Joblist Data to Housing Unit Allocation Data

The LODES Joblist (now linked to the school building) needs to be linked to the Housing Unit Inventory (which is linked to residential buildings).

The current version of the Lumberton Housing Unit Inventory includes details on Race and Ethnicity and Tenure by Census Block.

The current version of the joblist includes information on Race and Ethnicity and Earnings by coarsened Census Geography (Tract, County, State).

Future versions of the Housing Unit Inventory will provide details on household income and individual population data that will make the link between the joblist and population data more robust.

This notebook demonstrates an initial attempt to link the housing unit inventory with the joblist.

The results of this notebook show how many staff at one elementary school in Lumberton, NC can be connected to the IN-CORE data structure and model environment.

The goal is to accurately assign job (labor) information to home (residential) buildings.

The basic workflow:
1. Randomly sort the Housing Unit Inventory data.
2. Randomly sort the joblist data.
3. Add a random-sort counter to both files based on Census Tract, Race, and Ethnicity.
4. Link each job to the housing unit inventory via the random sort.
5. Check how many jobs are linked to the housing unit inventory and how many jobs are linked to the building inventory.
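
The workflow above can be sketched on toy data. This is a minimal illustration, not the notebook's actual code: the column names (`tract`, `counter`), the helper `add_counter`, and all values are hypothetical and chosen only to show how a seeded random sort plus a within-group counter lets jobs and housing units be merged one-to-one.

```python
import numpy as np
import pandas as pd

# Toy data: column names and values are illustrative only
homes = pd.DataFrame({"huid": ["H1", "H2", "H3", "H4"],
                      "tract": ["A", "A", "A", "B"],
                      "race": [1, 1, 2, 1]})
jobs = pd.DataFrame({"jobid": ["J1", "J2", "J3"],
                     "tract": ["A", "A", "C"],
                     "race": [1, 2, 1]})

def add_counter(df, id_col, seed):
    """Shuffle rows reproducibly, then number them within each tract/race group."""
    rng = np.random.RandomState(seed)
    out = df.sort_values(id_col).copy()
    out["random_order"] = rng.uniform(0, 1, len(out))
    out = out.sort_values(["tract", "race", "random_order"])
    out["counter"] = out.groupby(["tract", "race"]).cumcount() + 1
    return out

homes = add_counter(homes, "huid", seed=1)
jobs = add_counter(jobs, "jobid", seed=1)

# Each job takes the housing unit with the same tract, race, and counter
linked = jobs.merge(homes, on=["tract", "race", "counter"],
                    how="left", indicator=True)
print(linked[["jobid", "huid", "_merge"]])
```

In this toy case job `J3` lives in a tract with no housing units in the inventory, so it stays unlinked. The same situation arises for jobs whose home tract lies outside the study area.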


## Description of Program
- program:    IN-CORE_2dv1_MatchJoblisttoHUA_2021-08-19
- task:       Match LODES joblist point data to Housing Unit Inventory
- Version:    2021-08-19
- project:    Interdependent Networked Community Resilience Modeling Environment (IN-CORE) Subtask 5.2 - Social Institutions
- funding:	  NIST Financial Assistance Award Numbers: 70NANB15H044 and 70NANB20H008 
- author:     Nathanael Rosenheim

- Suggested Citation:
Rosenheim, N. (2021) “Obtain, Clean, and Explore Labor Market Allocation Methods”. 
Archived on Github and ICPSR.

In [1]:
%matplotlib inline

import pandas as pd
import geopandas as gpd
import numpy as np  # group by aggregation
import folium as fm # folium has more dynamic maps - but requires internet connection

In [2]:
# Display versions being used - important information for replication
import os
import sys
print("Python Version     ", sys.version)
print("geopandas version: ", gpd.__version__)
print("pandas version:    ", pd.__version__)
print("numpy version:     ", np.__version__)
print("folium version:    ", fm.__version__)

Python Version      3.7.10 | packaged by conda-forge | (default, Feb 19 2021, 15:37:01) [MSC v.1916 64 bit (AMD64)]
geopandas version:  0.9.0
pandas version:     1.3.1
numpy version:      1.21.1
folium version:     unknown


In [3]:
import os # For saving output to path
# Store Program Name for output files to have the same name
programname = "IN-CORE_2dv1_MatchJoblisttoHUA_2021-08-19"

# Save Outputfolder - due to long folder name paths output saved to folder with shorter name
# files from this program will be saved with the program name - this helps to follow the overall workflow
outputfolder = "workflow_output"
# Make directory to save output (no error if it already exists)
os.makedirs(outputfolder, exist_ok=True)

# Setup access to IN-CORE
https://incore.ncsa.illinois.edu/

In [4]:
from pyincore import IncoreClient, Dataset, FragilityService, MappingSet, DataService
from pyincore_viz.geoutil import GeoUtil as viz

In [5]:
#client = IncoreClient()
# IN-CORE caches files on the local machine; it might be necessary to clear the cache
#client.clear_cache()

In [6]:
# create data_service object for loading files
#data_service = DataService(client)

### Set up notebook environment to access cloned GitHub package
This notebook uses packages that are in development. The packages are available at:

https://github.com/npr99/Labor_Market_Allocation

To replicate this notebook, clone the GitHub package to a folder that is a sibling of this notebook.

To access the sibling package, append the parent directory ('..') to the system path list.

In [7]:
# append the path of the
# parent directory
sys.path.append("..")
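
A relative entry such as `".."` is resolved against the current working directory each time an import runs. A slightly more defensive variant is sketched below. It assumes the notebook is started from its own folder, and the variable name `parent_dir` is only illustrative.

```python
import sys
from pathlib import Path

# Resolve the parent of the current working directory to an absolute path
# (assumes the notebook is run from the folder it is stored in)
parent_dir = str(Path.cwd().parent)

# Avoid adding the same entry again when the cell is re-run
if parent_dir not in sys.path:
    sys.path.append(parent_dir)
```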

### IN-CORE addons
This program uses code that is being developed as potential add-ons to pyincore. These functions are in a folder called pyincore_addons - this folder is located in the same directory as this notebook.
The add-on functions are organized to mirror the folder structure of https://github.com/IN-CORE/pyincore

Each add on function attempts to follow the structure of existing pyincore functions and includes some help information.

In [8]:
# To reload submodules need to use this magic command to set autoreload on
%load_ext autoreload
%autoreload 2

# open, read, and execute python program with reusable commands
from pyincore_addons.geoutil_20210618 import df2gdf_WKTgeometry
from pyincore_viz_addons.viz_pop_results import pop_results_table, add_race_ethnicity_to_hua_df

## Read in Housing Unit Inventory Data


In [9]:
sourceprogram = "IN_CORE_CommunityDescription_Lumberton_2021-08-19"
filename = "IN-CORE_Lumberton_Testbed_2021-08-19"+"/"+sourceprogram+".csv"
housingunit_df = pd.read_csv(filename)



In [10]:
# Check columns in housing unit inventory after housing unit allocation
#for col in housingunit_df:
#    print(col)

In [11]:
housingunit_df[['addrptid','strctid','guid','huid','BLOCKID10','tractid','race','hispan']].head()

Unnamed: 0,addrptid,strctid,guid,huid,BLOCKID10,tractid,race,hispan
0,CB0000000000000000000371559601011002.0AP000000,CB0000000000000000000371559601011002.0,,H371559601011002001,371559601011002,37155960000.0,2.0,0.0
1,CB0000000000000000000371559601011003.0AP000002,CB0000000000000000000371559601011003.0,,H371559601011003001,371559601011003,37155960000.0,1.0,0.0
2,CB0000000000000000000371559601011003.0AP000001,CB0000000000000000000371559601011003.0,,H371559601011003003,371559601011003,37155960000.0,2.0,0.0
3,CB0000000000000000000371559601011003.0AP000000,CB0000000000000000000371559601011003.0,,H371559601011003002,371559601011003,37155960000.0,1.0,0.0
4,CB0000000000000000000371559601011005.0AP000002,CB0000000000000000000371559601011005.0,,H371559601011005003,371559601011005,37155960000.0,1.0,0.0


In [12]:
housingunit_df[['addrptid','strctid','guid','huid','BLOCKID10','tractid','race','hispan']].astype(str).describe()

Unnamed: 0,addrptid,strctid,guid,huid,BLOCKID10,tractid,race,hispan
count,61505,61505,61505.0,61505.0,61505,61505.0,61505.0,61505.0
unique,61505,22842,20092.0,52802.0,4323,32.0,8.0,3.0
top,CB0000000000000000000371559601011002.0AP000000,CB0000000000000000000371559602013030.0,,,371559613011091,,3.0,0.0
freq,1,184,33135.0,8704.0,278,8704.0,17905.0,44592.0


In [13]:
# Tract Merge ID - based on BLOCKID 
geocodevar = 'BLOCKID10'
housingunit_df.loc[:,geocodevar+'_str'] = housingunit_df[geocodevar].apply(lambda x : str(int(x)).zfill(15))
housingunit_df.loc[:,'h_geocode_tractid'] = housingunit_df[geocodevar+'_str'].str[0:11]
housingunit_df['h_geocode_tractid'].describe()

count           61505
unique             31
top       37155960600
freq             3263
Name: h_geocode_tractid, dtype: object
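
The tract ID is derived from the block ID because a 2010 Census block GEOID has 15 characters: state (2), county (3), tract (6), and block (4). The first 11 characters therefore identify the tract. The sketch below isolates that step. The helper name `block_to_tract` and the second example value are illustrative, not part of the notebook.

```python
import pandas as pd

def block_to_tract(block_id):
    """Return the 11-character tract GEOID from a 2010 block ID.

    A block GEOID is 15 characters: state (2) + county (3) + tract (6) + block (4).
    """
    # Numeric storage drops a leading zero, so restore it before slicing
    block_str = str(int(block_id)).zfill(15)
    return block_str[:11]

# First value is taken from the inventory above; the second is an illustrative
# block ID for a state whose FIPS code starts with zero
blocks = pd.Series([371559601011002.0, 10010201001000.0])
tracts = blocks.apply(block_to_tract).tolist()
print(tracts)
```

Keeping the result as a string matters: a tract ID stored as a number loses the leading zero and will not merge against a string tract ID.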

In [14]:
housingunit_df[['race','hispan']].describe()

Unnamed: 0,race,hispan
count,47997.0,46861.0
mean,2.249411,0.04842
std,1.236691,0.214654
min,1.0,0.0
25%,1.0,0.0
50%,2.0,0.0
75%,3.0,0.0
max,7.0,1.0


In [15]:
housingunit_df[['guid','geometry','x','y']].head()

Unnamed: 0,guid,geometry,x,y
0,,POINT (-78.94752 34.90556),-78.94752,34.90556
1,,POINT (-78.95882 34.89573),-78.95882,34.89573
2,,POINT (-78.95882 34.89573),-78.95882,34.89573
3,,POINT (-78.95882 34.89573),-78.95882,34.89573
4,,POINT (-78.96133 34.88307),-78.96133,34.88307


In [16]:
housingunit_df = housingunit_df.rename(columns={"guid": "h_guid", "geometry" : "h_geometry"})

## Read in LODES Joblist Data matched to School Building Data
The LODES joblist data have already been matched to the Rowland Norment Elementary school building.

In [17]:
sourceprogram = "IN-CORE_2cv1_MatchLodesSchoolBuilding_2021-08-19"
filename = outputfolder+"/"+sourceprogram+".csv"
joblist_df = pd.read_csv(filename)
joblist_df.head()

# Convert dataframe to gdf
joblist_gdf = df2gdf_WKTgeometry(df = joblist_df, projection = "epsg:26917",reproject="epsg:4326")
joblist_gdf.head(2)

Unnamed: 0.2,Unnamed: 0,guid,ncesid,school_name,uniquejobid,geometry_y,LON_y,LAT_y,Unnamed: 0.1,Unnamed: 0.1.1,...,blklatdd_w,blklondd_w,h_geocode_stfips,h_geocode_stabbr,tabblk2010_h,blklatdd_h,blklondd_h,od_distance,h_geocode_coarse,geometry
0,0,f8c00a5d-f1ed-400b-b4dc-398c76119a9b,370393002242,Rowland Norment Elementary,ID371559612002006370179503001059jidodJT07213jo...,POINT (683766.0696250959 3832976.288860729),683766.069625,3832976.0,0,1,...,34.622429,-78.995762,37,nc,370179503001059,34.644261,-78.71662,25.654264,37017950300,POINT (-78.99576 34.62243)
1,1,f8c00a5d-f1ed-400b-b4dc-398c76119a9b,370393002242,Rowland Norment Elementary,ID371559612002006370190203042017jidodJT07223jo...,POINT (683766.0696250959 3832976.288860729),683766.069625,3832976.0,1,2,...,34.622429,-78.995762,37,nc,370190203042017,33.953199,-78.086835,111.849522,37019,POINT (-78.99576 34.62243)


In [18]:
# Check columns in joblist after matching to the school building
#for col in joblist_df:
#    print(col)

In [19]:
joblist_gdf[['uniquejobid','guid','h_geocode_str','h_geocode_tractid','Race','Ethnicity']].head()

Unnamed: 0,uniquejobid,guid,h_geocode_str,h_geocode_tractid,Race,Ethnicity
0,ID371559612002006370179503001059jidodJT07213jo...,f8c00a5d-f1ed-400b-b4dc-398c76119a9b,370179503001059,37017950300,1,1
1,ID371559612002006370190203042017jidodJT07223jo...,f8c00a5d-f1ed-400b-b4dc-398c76119a9b,370190203042017,37019020304,3,1
2,ID371559612002006370479302001019jidodJT07333jo...,f8c00a5d-f1ed-400b-b4dc-398c76119a9b,370479302001019,37047930200,3,1
3,ID371559612002006370479306003057jidodJT07323jo...,f8c00a5d-f1ed-400b-b4dc-398c76119a9b,370479306003057,37047930600,1,1
4,ID371559612002006370510030013000jidodJT07233jo...,f8c00a5d-f1ed-400b-b4dc-398c76119a9b,370510030013000,37051003001,1,1


In [20]:
joblist_gdf[['uniquejobid','guid','h_geocode_str','h_geocode_tractid','Race','Ethnicity']].astype(str).describe()

Unnamed: 0,uniquejobid,guid,h_geocode_str,h_geocode_tractid,Race,Ethnicity
count,74,74,74,74,74,74
unique,74,1,72,41,3,2
top,ID371559612002006370179503001059jidodJT07213jo...,f8c00a5d-f1ed-400b-b4dc-398c76119a9b,371559615003025,37155960702,3,1
freq,1,74,2,6,27,73


In [21]:
# Tractid should be a string based on a 15 digit zerofilled blockid
joblist_gdf['h_geocode_tractid'].describe()

count    7.400000e+01
mean     3.692605e+10
std      1.850833e+09
min      2.122310e+10
25%      3.715596e+10
50%      3.715596e+10
75%      3.715596e+10
max      3.719500e+10
Name: h_geocode_tractid, dtype: float64

In [22]:
# Tract Merge ID - based on BLOCKID 
geocodevar = 'h_geocode'
joblist_gdf.loc[:,geocodevar+'_str'] = joblist_gdf[geocodevar].apply(lambda x : str(int(x)).zfill(15))
joblist_gdf.loc[:,'h_geocode_tractid'] = joblist_gdf[geocodevar+'_str'].str[0:11]
joblist_gdf['h_geocode_tractid'].describe()

count              74
unique             41
top       37155960702
freq                6
Name: h_geocode_tractid, dtype: object

In [23]:
joblist_gdf[['Race','Ethnicity']].describe()

Unnamed: 0,Race,Ethnicity
count,74.0,74.0
mean,2.027027,1.013514
std,0.843557,0.116248
min,1.0,1.0
25%,1.0,1.0
50%,2.0,1.0
75%,3.0,1.0
max,3.0,2.0


In [24]:
# Rename Race to race - lowercase r - to match Housing Unit Inventory
joblist_gdf = joblist_gdf.rename(columns={"Race": "race"})

In [25]:
# Set Hispanic variable to match housing unit inventory
joblist_gdf.loc[:,'hispan'] = 0
joblist_gdf.loc[joblist_gdf['Ethnicity'] == 1,'hispan'] = 0
joblist_gdf.loc[joblist_gdf['Ethnicity'] == 2,'hispan'] = 1
# Check new variable
pd.pivot_table(joblist_gdf, columns = ['hispan'], index = ['Ethnicity'], values = 'uniquejobid', aggfunc = 'count')

hispan,0,1
Ethnicity,Unnamed: 1_level_1,Unnamed: 2_level_1
1,73.0,
2,,1.0
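
The same recode can be written as a dictionary lookup, which keeps the coding in one place. This is a sketch on toy data, assuming the coding used in the cell above (Ethnicity 1 becomes hispan 0, Ethnicity 2 becomes hispan 1). The names `ethnicity_to_hispan` and `toy_jobs` are illustrative.

```python
import pandas as pd

# Coding assumed from the cell above: Ethnicity 1 -> hispan 0, Ethnicity 2 -> hispan 1
ethnicity_to_hispan = {1: 0, 2: 1}

toy_jobs = pd.DataFrame({"uniquejobid": ["a", "b", "c"],
                         "Ethnicity": [1, 2, 1]})
toy_jobs["hispan"] = toy_jobs["Ethnicity"].map(ethnicity_to_hispan)

# Cross-tabulate old and new codes; each Ethnicity row should have
# exactly one nonzero cell if the recode worked
check = pd.crosstab(toy_jobs["Ethnicity"], toy_jobs["hispan"])
print(check)
```

One design difference: with `map`, any Ethnicity code missing from the dictionary becomes NaN instead of silently defaulting to 0, which makes unexpected codes easier to spot.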


In [26]:
joblist_gdf[['Ethnicity','hispan']].head()

Unnamed: 0,Ethnicity,hispan
0,1,0
1,1,0
2,1,0
3,1,0
4,1,0


In [27]:
# Rename guid and geometry for work location
joblist_gdf = joblist_gdf.rename(columns={"guid": "w_guid", "geometry" : "w_geometry"})

## Merge Housing Unit Inventory with Joblist

1. Add a random sort counter to the Housing Unit Inventory and the Joblist.

The code below adapts the housing unit allocation code. The process for job allocation is similar to the housing unit allocation.

The main difference is the sort variables.

Original code:
https://github.com/IN-CORE/pyincore/blob/master/pyincore/analyses/housingunitallocation/housingunitallocation.py

In [28]:
def prepare_randommerge(df, seed, 
                       unique_sort_vars = ["huid"],
                       groupby_vars = ["blockid"],
                       groupby_vars_ascending = [True],
                       sort_vars = ["ownershp", "vacancy"],
                       sort_vars_ascending = [True, True]
                       ):
    """
    Set random merge order for Housing Unit Inventory, Address Point Inventory, and Joblists.

    Args:
        df (pd.DataFrame): Inventory to sort (housing unit inventory, address point inventory, or joblist).
        seed (int): Random number generator seed for reproducibility.
        unique_sort_vars: List of variables that uniquely identify the unit of analysis
            default = ["huid"]
            To replicate the random sort, the dataframe must first be sorted by a unique id.
        groupby_vars: List of variables to group data by - default ["blockid"]
        groupby_vars_ascending: List of booleans, one per groupby variable - default [True]
        sort_vars: List of variables to sort data by. This will prioritize some observations to be merged first
            default = ["ownershp", "vacancy"]
            The default matches prepare_infrastructure_inventory, which places residential structures
            with fewer housing units at the top of the list. This sorting helps ensure that
            owner occupied houses are more likely to be linked to single dwelling unit houses.
        sort_vars_ascending: List of booleans, one per sort variable - default [True, True]
    Returns:
        pd.DataFrame: Randomly sorted dataframe with random_order and random_mergeorder columns
    """
    size_row, size_col = df.shape

    random_generator = np.random.RandomState(seed)
    sorted_0 = df.sort_values(by=unique_sort_vars)

    # Create Random merge order for housing unit inventory
    random_order = random_generator.uniform(0, 1, size_row)

    sorted_0["random_order"] = random_order

    #  gsort BlockID -LiveTypeUnit Tenure randomaorderpop
    sorted_1 = sorted_0.sort_values(by=groupby_vars+sort_vars+["random_order"],
                                    ascending=groupby_vars_ascending + sort_vars_ascending+[True])

    # by BlockID: gen RandomMergeOrder = _n (+1 to be consistent with STATA starting from 1)
    sorted_1["random_mergeorder"] = sorted_1.groupby(groupby_vars).cumcount() + 1

    sorted_2 = sorted_1.sort_values(by=groupby_vars+["random_mergeorder"],
                                    ascending=groupby_vars_ascending + [True])
    return sorted_2
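
The initial sort on a unique ID is what makes the random order replicable: the random numbers are drawn in a fixed sequence, so they must always be assigned to rows in the same starting order. The sketch below checks this on toy data with a simplified stand-in for `prepare_randommerge` (no extra sort variables). The function name `random_merge_order` and the toy columns are illustrative.

```python
import numpy as np
import pandas as pd

def random_merge_order(df, seed, unique_id, groupby_vars):
    """Simplified stand-in for prepare_randommerge (no extra sort variables)."""
    rng = np.random.RandomState(seed)
    out = df.sort_values(by=unique_id).copy()       # fixed starting order
    out["random_order"] = rng.uniform(0, 1, len(out))
    out = out.sort_values(by=groupby_vars + ["random_order"])
    out["random_mergeorder"] = out.groupby(groupby_vars).cumcount() + 1
    return out

toy = pd.DataFrame({"huid": ["H3", "H1", "H4", "H2"],
                    "tractid": ["A", "A", "B", "A"]})
shuffled = toy.sample(frac=1, random_state=0)  # same rows, different input order

a = random_merge_order(toy, seed=2564, unique_id=["huid"], groupby_vars=["tractid"])
b = random_merge_order(shuffled, seed=2564, unique_id=["huid"], groupby_vars=["tractid"])

# Compare the merge order assigned to each huid in the two runs
order_a = a.set_index("huid")["random_mergeorder"].sort_index()
order_b = b.set_index("huid")["random_mergeorder"].sort_index()
same = order_a.equals(order_b)
print(same)
```

Without the initial sort, the same seed applied to a differently ordered file would assign the random numbers to different rows and the match could not be reproduced.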

### Prepare Housing Unit Inventory
The housing unit inventory needs a random merge order based on the tract ID, race, and Hispanic variables.
The merge to the joblist will be based on the same variables.
With this merge order, each job is linked to a housing unit in the same tract. This process assumes that the race and ethnicity of the householder in the housing unit inventory match the race and ethnicity recorded for the job.

In [29]:
# Select random seed - this can be any number but needs to be recorded to replicate results
seed = 2564
unique_sort_vars = ["huid"]
groupby_vars = ['h_geocode_tractid','race','hispan']
groupby_vars_ascending=[True, True, True]
sort_vars = []
sort_vars_ascending = []
housingunit_df = prepare_randommerge(df = housingunit_df,
                              seed = seed,
                              unique_sort_vars = unique_sort_vars,
                              groupby_vars = groupby_vars,
                              groupby_vars_ascending = groupby_vars_ascending,
                              sort_vars = sort_vars,
                              sort_vars_ascending = sort_vars_ascending)

In [30]:
housingunit_df[unique_sort_vars+groupby_vars+sort_vars+['random_order','random_mergeorder']].head()

Unnamed: 0,huid,h_geocode_tractid,race,hispan,random_order,random_mergeorder
999,H371559601011062016,37155960101,1.0,0.0,6.7e-05,1
641,H371559601011046033,37155960101,1.0,0.0,0.000201,2
9,H371559601011006002,37155960101,1.0,0.0,0.001027,3
896,H371559601011056092,37155960101,1.0,0.0,0.001794,4
375,H371559601011031094,37155960101,1.0,0.0,0.005493,5


### Prepare Joblist

In [31]:
# Select random seed - this can be any number but needs to be recorded to replicate results
seed = 2564
unique_sort_vars = ["uniquejobid"]
groupby_vars = ['h_geocode_tractid','race','hispan']
groupby_vars_ascending=[True, True, True]
sort_vars = []
sort_vars_ascending = []
joblist_gdf = prepare_randommerge(df = joblist_gdf,
                              seed = seed,
                              unique_sort_vars = unique_sort_vars,
                              groupby_vars = groupby_vars,
                              groupby_vars_ascending = groupby_vars_ascending,
                              sort_vars = sort_vars,
                              sort_vars_ascending = sort_vars_ascending)

In [32]:
joblist_gdf[unique_sort_vars+groupby_vars+sort_vars+['random_order','random_mergeorder']].head()

Unnamed: 0,uniquejobid,h_geocode_tractid,race,hispan,random_order,random_mergeorder
73,ID371559612002006212231002003022jidodJT07133jo...,21223100200,3,0,0.410508,1
0,ID371559612002006370179503001059jidodJT07213jo...,37017950300,1,0,0.992862,1
1,ID371559612002006370190203042017jidodJT07223jo...,37019020304,3,0,0.366463,1
2,ID371559612002006370479302001019jidodJT07333jo...,37047930200,3,0,0.176463,1
3,ID371559612002006370479306003057jidodJT07323jo...,37047930600,1,0,0.375654,1


## Merge Joblist with Housing Unit Inventory

In [33]:
sorted_housing_unit = housingunit_df[['h_guid','h_geometry','strctid','huid','h_geocode_tractid','race','hispan','gqtype','ownershp','numprec','random_mergeorder','plcname10','Building Data']]
sorted_joblist = joblist_gdf[['w_guid','w_geometry','school_name','uniquejobid','h_geocode_tractid',
                              'race','hispan','Earnings','random_mergeorder','jobtype','blklatdd_h', 'blklondd_h']]

huajoblist_inventory = pd.merge(left = sorted_housing_unit, 
                              right = sorted_joblist,
                              how='outer', 
                              left_on= ['h_geocode_tractid','race','hispan','random_mergeorder'],
                              right_on=['h_geocode_tractid','race','hispan','random_mergeorder'],
                              sort=True, suffixes=("_x", "_y"),
                              copy=True, indicator=True, validate="1:1")
huajoblist_inventory = huajoblist_inventory.rename(columns={"_merge": "huajoblistmerge"})
huajoblist_inventory.head()                   

Unnamed: 0,h_guid,h_geometry,strctid,huid,h_geocode_tractid,race,hispan,gqtype,ownershp,numprec,...,Building Data,w_guid,w_geometry,school_name,uniquejobid,Earnings,jobtype,blklatdd_h,blklondd_h,huajoblistmerge
0,,,,,21223100200,3.0,0.0,,,,...,,f8c00a5d-f1ed-400b-b4dc-398c76119a9b,POINT (-78.99576 34.62243),Rowland Norment Elementary,ID371559612002006212231002003022jidodJT07133jo...,3.0,JT07,38.51265,-85.271997,right_only
1,,,,,37017950300,1.0,0.0,,,,...,,f8c00a5d-f1ed-400b-b4dc-398c76119a9b,POINT (-78.99576 34.62243),Rowland Norment Elementary,ID371559612002006370179503001059jidodJT07213jo...,1.0,JT07,34.644261,-78.71662,right_only
2,,,,,37019020304,3.0,0.0,,,,...,,f8c00a5d-f1ed-400b-b4dc-398c76119a9b,POINT (-78.99576 34.62243),Rowland Norment Elementary,ID371559612002006370190203042017jidodJT07223jo...,2.0,JT07,33.953199,-78.086835,right_only
3,,,,,37047930200,3.0,0.0,,,,...,,f8c00a5d-f1ed-400b-b4dc-398c76119a9b,POINT (-78.99576 34.62243),Rowland Norment Elementary,ID371559612002006370479302001019jidodJT07333jo...,3.0,JT07,34.362963,-78.411764,right_only
4,,,,,37047930600,1.0,0.0,,,,...,,f8c00a5d-f1ed-400b-b4dc-398c76119a9b,POINT (-78.99576 34.62243),Rowland Norment Elementary,ID371559612002006370479306003057jidodJT07323jo...,2.0,JT07,34.284822,-78.899931,right_only


In [34]:
huajoblist_inventory['huajoblistmerge'].describe()

count         61526
unique            3
top       left_only
freq          61452
Name: huajoblistmerge, dtype: object
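
Reading the summary above: the merged file has 61,526 rows, of which 61,452 are `left_only` (housing units with no job). The remaining 74 rows are the jobs, either matched to a housing unit (`both`) or unmatched (`right_only`). The sketch below shows on toy data how an outer merge with `indicator=True` produces these three categories. The toy frames and their values are illustrative.

```python
import pandas as pd

# Toy inventories: one row per housing unit and one row per job
homes = pd.DataFrame({"tractid": ["A", "A", "B"],
                      "random_mergeorder": [1, 2, 1],
                      "huid": ["H1", "H2", "H3"]})
jobs = pd.DataFrame({"tractid": ["A", "C"],
                     "random_mergeorder": [1, 1],
                     "uniquejobid": ["J1", "J2"]})

# Outer merge keeps every row from both sides; validate="1:1" raises an
# error if the merge keys are duplicated on either side
merged = pd.merge(homes, jobs, how="outer",
                  on=["tractid", "random_mergeorder"],
                  indicator=True, validate="1:1")

counts = merged["_merge"].value_counts()
print(counts)
```

Using `value_counts()` instead of `describe()` reports all three categories at once, which makes it easier to see how many jobs were matched.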

In [35]:
# Keep only observations with linked data
huajoblist_inventory = huajoblist_inventory.loc[~huajoblist_inventory['jobtype'].isna()]
huajoblist_inventory.head()

Unnamed: 0,h_guid,h_geometry,strctid,huid,h_geocode_tractid,race,hispan,gqtype,ownershp,numprec,...,Building Data,w_guid,w_geometry,school_name,uniquejobid,Earnings,jobtype,blklatdd_h,blklondd_h,huajoblistmerge
0,,,,,21223100200,3.0,0.0,,,,...,,f8c00a5d-f1ed-400b-b4dc-398c76119a9b,POINT (-78.99576 34.62243),Rowland Norment Elementary,ID371559612002006212231002003022jidodJT07133jo...,3.0,JT07,38.51265,-85.271997,right_only
1,,,,,37017950300,1.0,0.0,,,,...,,f8c00a5d-f1ed-400b-b4dc-398c76119a9b,POINT (-78.99576 34.62243),Rowland Norment Elementary,ID371559612002006370179503001059jidodJT07213jo...,1.0,JT07,34.644261,-78.71662,right_only
2,,,,,37019020304,3.0,0.0,,,,...,,f8c00a5d-f1ed-400b-b4dc-398c76119a9b,POINT (-78.99576 34.62243),Rowland Norment Elementary,ID371559612002006370190203042017jidodJT07223jo...,2.0,JT07,33.953199,-78.086835,right_only
3,,,,,37047930200,3.0,0.0,,,,...,,f8c00a5d-f1ed-400b-b4dc-398c76119a9b,POINT (-78.99576 34.62243),Rowland Norment Elementary,ID371559612002006370479302001019jidodJT07333jo...,3.0,JT07,34.362963,-78.411764,right_only
4,,,,,37047930600,1.0,0.0,,,,...,,f8c00a5d-f1ed-400b-b4dc-398c76119a9b,POINT (-78.99576 34.62243),Rowland Norment Elementary,ID371559612002006370479306003057jidodJT07323jo...,2.0,JT07,34.284822,-78.899931,right_only


In [36]:
# Add check on connection between joblist and IN-CORE data environment
huajoblist_inventory.loc[:,'Job Data Link'] = "Not Linked"
huajoblist_inventory['Job Data Link'].notes = "Flag job data linked to structures and housing unit inventory"
huajoblist_inventory.loc[huajoblist_inventory['Building Data'] == "Linked to guid", 'Job Data Link'] = "Linked to huid and guid"
huajoblist_inventory.loc[huajoblist_inventory['Building Data'] == 'Not Linked', 'Job Data Link'] = "Linked to huid but not guid"
huajoblist_inventory.loc[huajoblist_inventory['huajoblistmerge'] == "right_only", 'Job Data Link'] = "Not Linked to huid or guid"

In [37]:
def add_job_earnings(df):

    df['Job Earnings'] = "0 No Earnings Data"
    df['Job Earnings'].notes = "Identify Job Earnings by LODES Data."

    df.loc[(df['Earnings'] == 1),'Job Earnings'] = "\\$1,250/month or less"
    df.loc[(df['Earnings'] == 2),'Job Earnings'] = "\\$1,251/month to \\$3,333/month"
    df.loc[(df['Earnings'] == 3),'Job Earnings'] = "greater than \\$3,333/month"
    # If earnings data are missing, set the label to missing (NaN) - makes tables look nicer
    df.loc[(df['Job Earnings'] == "0 No Earnings Data"),'Job Earnings'] = np.nan

    return df

In [38]:
huajoblist_inventory = add_race_ethnicity_to_hua_df(huajoblist_inventory)

In [39]:
huajoblist_inventory = add_job_earnings(huajoblist_inventory)

In [43]:
pop_results_table(huajoblist_inventory, 
                  who = "Total Jobs", 
                  what = "by Job Earnings and Data Link",
                  where = "Rowland Norment Elementary, Lumberton NC",
                  when = "2015",
                  row_index = 'Job Earnings',
                  col_index = 'Job Data Link',
                  row_percent = 'Linked to huid and guid')

Job Data Link,Linked to huid and guid,Linked to huid but not guid,Not Linked to huid or guid,Total Jobs,Percent Row Linked to huid and guid
Job Earnings,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1
"\$1,250/month or less",5,6,3,14,35.7%
"\$1,251/month to \$3,333/month",9,10,7,26,34.6%
"greater than \$3,333/month",13,10,11,34,38.2%
Total,27,26,21,74,36.5%
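
The row percent in the table is the count linked to both huid and guid divided by the row total. For the Total row this is 27 / 74, or about 36.5%. A table of this shape can be approximated with plain pandas, as sketched below on toy data. This is not the implementation of `pop_results_table`; the category labels and the column name `Percent Linked` are illustrative.

```python
import pandas as pd

toy = pd.DataFrame({
    "Job Earnings": ["low", "low", "high", "high", "high"],
    "Job Data Link": ["Linked", "Not Linked", "Linked", "Linked", "Not Linked"],
})

# Count jobs by earnings category and link status, with row and column totals
table = pd.crosstab(toy["Job Earnings"], toy["Job Data Link"],
                    margins=True, margins_name="Total")

# Row percent: share of each row's jobs that were linked
table["Percent Linked"] = (100 * table["Linked"] / table["Total"]).round(1)
print(table)
```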


In [44]:
pop_results_table(huajoblist_inventory, 
                  who = "Total Jobs", 
                  what = "by Job Race and Data Link",
                  where = "Rowland Norment Elementary, Lumberton NC",
                  when = "2015",
                  row_index = 'Race Ethnicity',
                  col_index = 'Job Data Link',
                  row_percent = 'Linked to huid and guid')

Job Data Link,Linked to huid and guid,Linked to huid but not guid,Not Linked to huid or guid,Total Jobs,Percent Row Linked to huid and guid
Race Ethnicity,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1
"1 White alone, Not Hispanic",7.0,7.0,11,25,28.0%
"2 Black alone, Not Hispanic",9.0,9.0,4,22,40.9%
"3 American Indian and Alaska Native alone, Not Hispanic",11.0,10.0,5,26,42.3%
"6 Any Race, Hispanic",,,1,1,nan%
Total,27.0,26.0,21,74,36.5%


## Check Joblist with School Building

In [42]:
# Save Work at this point as CSV
savefile = sys.path[0]+"/"+outputfolder+"/"+programname+".csv"
huajoblist_inventory.to_csv(savefile)