## Galveston Housing Unit Allocation

Housing Unit Allocation using the Galveston, Texas Housing Unit Inventory.
Notebook from
https://github.com/IN-CORE/pyincore/tree/develop/pyincore/analyses/housingunitallocation

In [1]:
import pandas as pd
import numpy as np
import sys # For displaying package versions
import os # For managing directories and file paths if drive is mounted


from pyincore import IncoreClient, Dataset, FragilityService, MappingSet, DataService
from pyincore.analyses.housingunitallocation import HousingUnitAllocation

In [2]:
client = IncoreClient()

Enter username:  mondrejc
Enter password:  ········


Connection successful to IN-CORE services. pyIncore version detected: 0.9.0


In [3]:
# Check package versions - good practice for replication
print("Python Version ",sys.version)
print("pandas version: ", pd.__version__)
print("numpy version: ", np.__version__)

Python Version  3.7.10 | packaged by conda-forge | (default, Feb 19 2021, 16:07:37) 
[GCC 9.3.0]
pandas version:  1.2.2
numpy version:  1.18.1


In [4]:
# Check working directory - good practice for relative path access
os.getcwd()

'/home/mondrejc'

## Initial Interdependent Community Description - Galveston, Texas

Explore the building inventory and social systems. Specifically, look at how the building inventory connects with the housing unit inventory through the housing unit allocation.
The housing unit allocation method provides detailed demographic characteristics for the households allocated to each structure.

In [5]:
# Galveston, TX Housing unit inventory
housing_unit_inv = "5fc6ab1cd2066956f49e7a03"
# Galveston, TX Address point inventory
address_point_inv = "5fc6aadcc38a0722f563392e"
# Galveston, TX Building inventory
building_inv = "60354b6c123b4036e6837ef7"

## Run Housing Unit Allocation 
https://github.com/IN-CORE/incore-docs/blob/master/notebooks/housingunitallocation.ipynb

Rosenheim, Nathanael, Roberto Guidotti, Paolo Gardoni & Walter Gillis Peacock. (2019). Integration of detailed household and housing unit characteristic data with critical infrastructure for post-hazard resilience modeling. Sustainable and Resilient Infrastructure. doi.org/10.1080/23789689.2019.1681821

In [6]:
# Create housing allocation 
hua = HousingUnitAllocation(client)

# Load input dataset
hua.load_remote_input_dataset("housing_unit_inventory", housing_unit_inv)
hua.load_remote_input_dataset("address_point_inventory", address_point_inv)
hua.load_remote_input_dataset("buildings", building_inv)

# Specify the result name
result_name = "Galveston_HUA"

seed = 1238
iterations = 1

# Set analysis parameters
hua.set_parameter("result_name", result_name)
hua.set_parameter("seed", seed)
hua.set_parameter("iterations", iterations)

Dataset already exists locally. Reading from local cached zip.
Unzipped folder found in the local cache. Reading from it...
Dataset already exists locally. Reading from local cached zip.
Unzipped folder found in the local cache. Reading from it...
Dataset already exists locally. Reading from local cached zip.
Unzipped folder found in the local cache. Reading from it...


True
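
The `seed` parameter is set because the allocation relies on random draws (the output includes columns such as `randomhu`). The sketch below illustrates why fixing a seed makes a random assignment repeatable. It uses only the standard library; the `assign_units` helper and the IDs are hypothetical and do not reproduce pyincore's actual algorithm.

```python
import random

def assign_units(unit_ids, address_ids, seed):
    """Randomly pair housing units with address points (illustrative only)."""
    rng = random.Random(seed)      # local generator, independent of global random state
    shuffled = list(address_ids)   # copy so the caller's list is not modified
    rng.shuffle(shuffled)
    return dict(zip(unit_ids, shuffled))

units = ["hu1", "hu2", "hu3", "hu4"]          # hypothetical housing unit IDs
addresses = ["ap1", "ap2", "ap3", "ap4"]      # hypothetical address point IDs

first = assign_units(units, addresses, seed=1238)
second = assign_units(units, addresses, seed=1238)
print(first == second)  # True: the same seed yields the same assignment
```

Reusing the seed recorded above (1238) is what should let someone else rerun the notebook and compare results row by row.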

In [7]:
# Run Housing unit allocation analysis
hua.run_analysis()

True

In [8]:
# Retrieve result dataset
result = hua.get_output_dataset("result")

# Convert dataset to Pandas DataFrame
hua_df = result.get_dataframe_from_csv(low_memory=False)

# Display top 5 rows of output data
hua_df.head()

Unnamed: 0,addrptid,strctid,archetype,struct_typ,year_built,no_stories,a_stories,b_stories,bsmt_type,sq_foot,...,race,hispan,hispan_flag,vacancy,gqtype,incomegroup,randincome,randomhu,aphumerge,geometry
0,XREF0628-0065-0000-000AP014,XREF0628-0065-0000-000,,W1,,,,,,,...,2.0,0.0,2.0,0,0,15.0,142028.8,0.391309,both,POINT (-94.79252 29.3092)
1,XREF0628-0065-0000-000AP012,XREF0628-0065-0000-000,,W1,,,,,,,...,1.0,0.0,1.0,0,0,17.0,325475.7,0.414422,both,POINT (-94.79252 29.3092)
2,XREF0628-0065-0000-000AP004,XREF0628-0065-0000-000,,W1,,,,,,,...,1.0,0.0,1.0,0,0,17.0,260189.9,0.927559,both,POINT (-94.79252 29.3092)
3,XREF0628-0065-0000-000AP005,XREF0628-0065-0000-000,,W1,,,,,,,...,1.0,0.0,1.0,0,0,17.0,347593.4,0.090391,both,POINT (-94.79252 29.3092)
4,XREF0628-0065-0000-000AP010,XREF0628-0065-0000-000,,W1,,,,,,,...,1.0,0.0,1.0,0,0,17.0,558643.5,0.284164,both,POINT (-94.79252 29.3092)


## Explore results from Housing Unit Allocation

Keep observations that are matched to a building.

In [9]:
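# Keep only matched rows (merge indicator 'both'); .copy() avoids
# pandas' SettingWithCopyWarning when new columns are added below
hua_df = hua_df.loc[hua_df['aphumerge'] == 'both'].copy()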

In [10]:
# Identify race and ethnicity housing unit characteristics.
# Default label; the rules below overwrite it for occupied units and group quarters.
hua_df['Race Ethnicity'] = "0 Vacant HU No Race Ethnicity Data"

hua_df.loc[(hua_df['race'] == 1) & (hua_df['hispan'] == 0),'Race Ethnicity'] = "1 White alone, Not Hispanic"
hua_df.loc[(hua_df['race'] == 2) & (hua_df['hispan'] == 0),'Race Ethnicity'] = "2 Black alone, Not Hispanic"
hua_df.loc[(hua_df['race'].isin([3,4,5,6,7])) & (hua_df['hispan'] == 0),'Race Ethnicity'] = "3 Other Race, Not Hispanic"
hua_df.loc[(hua_df['hispan'] == 1),'Race Ethnicity'] = "4 Any Race, Hispanic"
hua_df.loc[(hua_df['gqtype'] >= 1),'Race Ethnicity'] = "5 Group Quarters no Race Ethnicity Data"

# Check new variable
table_title = "Confirm housing unit characteristic by Race and Ethnicity."
pd.crosstab(hua_df['Race Ethnicity'], hua_df['race'], 
            margins=True, margins_name="Total").style.set_caption(table_title)

| Race Ethnicity \ race | 1.0 | 2.0 | 3.0 | 4.0 | 6.0 | 7.0 | Total |
|---|---|---|---|---|---|---|---|
| 1 White alone, Not Hispanic | 10912 | 0 | 0 | 0 | 0 | 0 | 10912 |
| 2 Black alone, Not Hispanic | 0 | 3607 | 0 | 0 | 0 | 0 | 3607 |
| 3 Other Race, Not Hispanic | 0 | 0 | 26 | 607 | 4 | 145 | 782 |
| 4 Any Race, Hispanic | 2738 | 25 | 57 | 6 | 1520 | 184 | 4530 |
| Total | 13650 | 3632 | 83 | 613 | 1524 | 329 | 19831 |
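
The assignments in cell `In [10]` are applied in order, so a later rule overrides an earlier one: any Hispanic household ends up in category 4 regardless of race, and group quarters override everything else. A minimal single-record sketch of that precedence follows. The function name `classify_race_ethnicity` is hypothetical, and `None` stands in for the missing values that vacant units carry in the DataFrame.

```python
def classify_race_ethnicity(race, hispan, gqtype):
    """Return the label the ordered .loc assignments would give one record."""
    label = "0 Vacant HU No Race Ethnicity Data"   # default when no rule matches
    if race == 1 and hispan == 0:
        label = "1 White alone, Not Hispanic"
    if race == 2 and hispan == 0:
        label = "2 Black alone, Not Hispanic"
    if race in (3, 4, 5, 6, 7) and hispan == 0:
        label = "3 Other Race, Not Hispanic"
    if hispan == 1:                                # overrides any race-based label
        label = "4 Any Race, Hispanic"
    if gqtype is not None and gqtype >= 1:         # group quarters override all
        label = "5 Group Quarters no Race Ethnicity Data"
    return label

print(classify_race_ethnicity(1.0, 0.0, 0))    # 1 White alone, Not Hispanic
print(classify_race_ethnicity(6.0, 1.0, 0))    # 4 Any Race, Hispanic
print(classify_race_ethnicity(None, None, 0))  # 0 Vacant HU No Race Ethnicity Data
```

This is why the "4 Any Race, Hispanic" row in the crosstab has nonzero counts in several `race` columns.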


In [11]:
# Check new variable
table_title = "Confirm housing unit characteristic by Race and Ethnicity."
pd.crosstab(hua_df['Race Ethnicity'], hua_df['hispan'], 
            margins=True, margins_name="Total").style.set_caption(table_title)

| Race Ethnicity \ hispan | 0.0 | 1.0 | Total |
|---|---|---|---|
| 1 White alone, Not Hispanic | 10912 | 0 | 10912 |
| 2 Black alone, Not Hispanic | 3607 | 0 | 3607 |
| 3 Other Race, Not Hispanic | 782 | 0 | 782 |
| 4 Any Race, Hispanic | 0 | 4530 | 4530 |
| Total | 15301 | 4530 | 19831 |


In [12]:
table_title = "Table 1. Housing Unit Characteristics by Race and Ethnicity"
table1 = pd.pivot_table(hua_df, values='numprec', index=['Race Ethnicity'],
                              margins = True, margins_name = 'Total',
                              aggfunc=[len, np.sum], 
                              fill_value=0).reset_index().rename(
                                                            columns={'len': 'Housing Unit',
                                                                     'sum' : 'Population',
                                                                     'numprec': 'Count'})

varformat = {('Housing Unit','Count'): "{:,}", ('Population','Count'): "{:,}"}

In [13]:
table1.style.set_caption(table_title).format(varformat).set_table_styles([
    dict(selector='th', props=[('text-align', 'center')]),])

| | Race Ethnicity | Housing Unit Count | Population Count |
|---|---|---|---|
| 0 | 0 Vacant HU No Race Ethnicity Data | 12657 | 0 |
| 1 | 1 White alone, Not Hispanic | 10912 | 21220 |
| 2 | 2 Black alone, Not Hispanic | 3607 | 8302 |
| 3 | 3 Other Race, Not Hispanic | 782 | 1698 |
| 4 | 4 Any Race, Hispanic | 4530 | 13424 |
| 5 | 5 Group Quarters no Race Ethnicity Data | 13 | 240 |
| 6 | Total | 32501 | 44884 |
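
A quick internal consistency check on Table 1 is to confirm that the category rows add up to the reported totals. The sketch below copies the values from the table above into a plain dictionary (the variable names are illustrative) and compares the sums.

```python
# (housing units, population) per category, copied from Table 1
table1_rows = {
    "0 Vacant HU No Race Ethnicity Data": (12657, 0),
    "1 White alone, Not Hispanic": (10912, 21220),
    "2 Black alone, Not Hispanic": (3607, 8302),
    "3 Other Race, Not Hispanic": (782, 1698),
    "4 Any Race, Hispanic": (4530, 13424),
    "5 Group Quarters no Race Ethnicity Data": (13, 240),
}
reported_total = (32501, 44884)  # the "Total" row of Table 1

housing_units = sum(hu for hu, _ in table1_rows.values())
population = sum(pop for _, pop in table1_rows.values())

print(housing_units == reported_total[0])  # True
print(population == reported_total[1])     # True
```

The four race and ethnicity categories (1 through 4) also sum to 19,831 housing units, the same total reported by the two crosstabs.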


## Validate the Housing Unit Allocation
The population totals for the community should closely match the counts from the 2010 Decennial Census.
This can be confirmed at data.census.gov:

https://data.census.gov/cedsci/table?q=DECENNIALPL2010.P1&g=1600000US4828068,4837252&tid=DECENNIALSF12010.P1
    
Differences between the housing unit allocation and the Census count may be due to differences between political boundaries and the extent of the building inventory. See Rosenheim et al. (2019) for more details.
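
One way to quantify "closely match" is a signed percent difference between the allocated population and the Census count. In the sketch below, the `percent_difference` helper is hypothetical, and `census_population` is a placeholder, not the actual Census figure; replace it with the value looked up at the link above.

```python
def percent_difference(model_total, reference_total):
    """Signed percent difference of model_total relative to reference_total."""
    if reference_total == 0:
        raise ValueError("reference_total must be nonzero")
    return 100.0 * (model_total - reference_total) / reference_total

hua_population = 44884      # "Total" population from Table 1
census_population = 45000   # PLACEHOLDER, not the real Census count

diff = percent_difference(hua_population, census_population)
print(f"HUA population differs from the reference by {diff:+.2f}%")
```

A negative value means the allocation places fewer people than the reference count.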

The housing unit allocation, together with the building damage results, will become the input for the dislocation model.

In [14]:
# Save cleaned HUA file as CSV
hua_df.to_csv(result_name+str(seed)+'_cleaned.csv')