# Understanding Housing Inventory
To create policies and take action, housing actors need to understand the deficit, or surplus, in available housing units for households at different income levels. Without it, builders, policy-makers, and service providers cannot tell how much inventory is needed, and by when, to alleviate housing burdens in a community and support healthy community growth. In this challenge, you’ll generate insights, repeatable processes, and prototype tools that housing stakeholders can use to inform their community action.

 - The big question a housing stakeholder is trying to answer with the results of this challenge is: does my target community have a housing inventory surplus or deficit for my target service population?
 - The big question a community resident is trying to answer is: is there housing affordable to me in my target community?
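
One way to frame the surplus-or-deficit question numerically is to compare, for each income segment, the number of units affordable to that segment with the number of households in it. The sketch below assumes both counts are already available. The function name `housing_gap` and all figures are made up for illustration and are not drawn from the challenge data.

```python
def housing_gap(households, affordable_units):
    """Return affordable units minus households for each income segment.

    Positive values suggest a surplus, negative values a deficit.
    Both arguments are dicts keyed by income segment (hypothetical inputs).
    """
    segments = households.keys() | affordable_units.keys()
    return {
        seg: affordable_units.get(seg, 0) - households.get(seg, 0)
        for seg in sorted(segments)
    }


# Illustrative numbers only, not real Orlando data.
households = {"extremely_low": 1200, "very_low": 900, "low": 1500}
units = {"extremely_low": 400, "very_low": 950, "low": 1500}

gap = housing_gap(households, units)
for segment, value in gap.items():
    label = "surplus" if value > 0 else "deficit" if value < 0 else "balanced"
    print(f"{segment}: {value:+d} ({label})")
```

A real analysis would still need a working definition of "affordable" for each segment, which this sketch leaves open.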
## Community focus: Orlando
The Christian Service Center, based in Central Florida, is looking to answer this specific inventory question for their service area of the Orlando-Kissimmee-Sanford metropolitan statistical area (MSA). Eric Gray, Executive Director of the Christian Service Center, has tried to answer this question in several ways and is looking to the DataKind community for help. Eric and his team will be expert resources throughout the DataKit. If you’re local to Orlando, The Orlando Devs will be arranging a campus tour during the DataKit, so stay tuned!

## Get started with existing data
To understand housing inventory needs, housing actors must understand how the population living in their community is distributed across income levels. Housing stakeholders use the income spectrum published by the US Department of Housing and Urban Development (HUD), which runs from extremely low income to upper income. This spectrum is anchored by HUD/US Census calculations of area median income (AMI), also called median family income (MFI). The five segments of the housing and income spectrum are as follows:

 - Extremely low income: Below 30% of AMI
 - Very low income: Below 50% of AMI
 - Low income: Below 80% of AMI
 - Moderate income: Between 80% and 120% of AMI
 - Upper income: Above 120% of AMI
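
The five segments can be expressed as a small lookup on the ratio of household income to AMI. The sketch below is one possible reading of the list: because the first three thresholds are nested, each household is assigned to the lowest band it falls under, and treating exactly 80% and exactly 120% of AMI as moderate income is an assumption. The function name `income_segment` and the AMI value are hypothetical.

```python
def income_segment(household_income, ami):
    """Map a household income to one of the five income segments."""
    if ami <= 0:
        raise ValueError("ami must be positive")
    ratio = household_income / ami
    if ratio < 0.30:
        return "Extremely low income"
    if ratio < 0.50:
        return "Very low income"
    if ratio < 0.80:
        return "Low income"
    if ratio <= 1.20:  # assumption: both 80% and 120% count as moderate
        return "Moderate income"
    return "Upper income"


# Hypothetical AMI of 80,000, used purely for illustration.
ami = 80_000
for income in (20_000, 39_000, 60_000, 90_000, 100_000):
    print(income, "->", income_segment(income, ami))
```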

Continue reading [here](https://github.com/datakind/datakit-housing-fall-2024/discussions/1).

### Import Libraries

In [17]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import streamlit as st
import os, sys
from dotenv import load_dotenv, find_dotenv

# Add the directory two levels up from the current working directory to Python's module search path
# so that modules located there (such as column_mapper) can be imported.
sys.path.append('../../')

from column_mapper import map_column_names


load_dotenv(find_dotenv('paths.env'))

True
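
The `True` above suggests that `load_dotenv` found `paths.env` and set at least one variable. Later cells read paths with `os.getenv`, which silently returns `None` for a missing key. A small guard can make that failure explicit. This is a minimal sketch: `require_env` is a hypothetical helper, not part of this repository, and the path below is a placeholder.

```python
import os


def require_env(name):
    """Return the value of an environment variable or raise a clear error."""
    value = os.getenv(name)
    if not value:
        raise KeyError(f"Environment variable {name!r} is not set; check paths.env")
    return value


# Placeholder value standing in for an entry that paths.env would provide.
os.environ["data_dictionary_1_CA"] = "path/to/data_dictionary.csv"  # hypothetical path
print(require_env("data_dictionary_1_CA"))
```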

In [18]:
inventory = pd.read_csv('housing-data/ca/data_1-CA.csv')

inventory.head()

| Unnamed: 0 | geoid | geoid_year | state | county | state_fips_code | county_fips_code | loan_amount | median_mortgage_amount | median_prop_value | median_sba504_loan_amount | ... | s2701_c05_015e | s2701_c05_015m | s2701_c05_001e | s2701_c05_001m | b19083_001e | b19083_001m | economic_distress_pop_agg | economic_distress_simple_agg | investment_areas | opzone |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 6001400400 | 2020 | 6 | 1 | 6 | 1 | 614.434904 | 825000.0 | 1775000.0 |  | ... | 3.5 | 3.3 | 2.3 | 1.9 | 0.5063 | 0.0557 | NO | NO | NO | 0 |
| 1 | 6001400700 | 2020 | 6 | 1 | 6 | 1 | 7888.636063 | 585000.0 | 1225000.0 | 379000.0 | ... | 2.8 | 2.2 | 2.5 | 1.6 | 0.4433 | 0.0512 | YES | YES | YES | 0 |
| 2 | 6001400800 | 2020 | 6 | 1 | 6 | 1 | 88162.749414 | 615000.0 | 995000.0 | 459500.0 | ... | 0.0 | 2.1 | 2.9 | 2.9 | 0.5274 | 0.0727 | NO | YES | YES | 0 |
| 3 | 6001400900 | 2020 | 6 | 1 | 6 | 1 | 111849.950855 | 625000.0 | 1125000.0 | 329500.0 | ... | 4.9 | 5.5 | 6.2 | 5.1 | 0.4619 | 0.0603 | YES | YES | YES | 0 |
| 4 | 6001401500 | 2020 | 6 | 1 | 6 | 1 | 269796.817978 | 515000.0 | 855000.0 | 391000.0 | ... | 5.1 | 3.6 | 7.6 | 3.3 | 0.5611 | 0.0546 | YES | YES | YES | 1 |
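
The `geoid` values in the preview have ten digits, while `state` and `county` show 6 and 1. That pattern is consistent with an 11-digit census-tract GEOID (2-digit state, 3-digit county, 6-digit tract) whose leading zero was dropped when the column was parsed as an integer. Assuming that reading is right, the sketch below restores the padding and splits the parts. `split_tract_geoid` is a hypothetical helper name.

```python
def split_tract_geoid(geoid):
    """Split a census-tract GEOID into (state, county, tract) FIPS strings."""
    text = str(geoid).zfill(11)  # restore a leading zero lost to integer parsing
    if len(text) != 11 or not text.isdigit():
        raise ValueError(f"expected an 11-digit tract GEOID, got {geoid!r}")
    return text[:2], text[2:5], text[5:]


print(split_tract_geoid(6001400400))  # ('06', '001', '400400')
```

Passing `dtype={'geoid': str}` to `pd.read_csv` would keep the identifier as text, though zero-padding is still needed if the file itself stores the value without the leading zero.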


### Load column name meanings

The meanings come from the data dictionary provided with the housing data. They are needed to understand what the coded feature names mean.

In [22]:
head_dict = map_column_names(os.getenv('data_dictionary_1_CA'))
head_dict.keys()

dict_keys(['b19083_001e', 'b19083_001m', 'b23025_002e', 'b23025_002m', 'b23025_004e', 'b23025_004m', 'b23025_005e', 'b23025_005m', 'b23025_006e', 'b23025_006m', 'dp05_0035pe', 'dp05_0037pe', 'dp05_0038pe', 'dp05_0039pe', 'dp05_0044pe', 'dp05_0052pe', 'dp05_0057pe', 'economic_distress_pop_agg', 'economic_distress_simple_agg', 'investment_areas', 'loan_amount', 'median_mortgage_amount', 'median_prop_value', 'median_sba504_loan_amount', 'median_sba7a_loan_amount', 'num_mortgage', 'num_mortgage_denials', 'num_mortgage_originated', 'number_of_sba504_loans', 'number_of_sba7a_loans', 'opzone', 's0101_c01_032e', 's0101_c01_032m', 's0101_c02_020e', 's0101_c02_020m', 's0101_c02_021e', 's0101_c02_021m', 's0101_c02_022e', 's0101_c02_022m', 's0101_c02_023e', 's0101_c02_023m', 's0101_c02_024e', 's0101_c02_024m', 's0101_c02_025e', 's0101_c02_025m', 's0101_c02_026e', 's0101_c02_026m', 's0101_c02_027e', 's0101_c02_027m', 's0101_c02_028e', 's0101_c02_028m', 's0101_c02_029e', 's0101_c02_029m', 's0101_c02

In [23]:
head_dict['s2701_c05_008m']

'ACS - Percentage of Noninstitutionalized Population without Health Insurance (Uninsured) - By Age - 55 to 64 years - Margin of Error'
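
The description returned for `s2701_c05_008m` ends in "Margin of Error", and the preview shows columns in pairs such as `s2701_c05_015e` and `s2701_c05_015m`. Assuming the trailing `e` marks an estimate and the trailing `m` its margin of error, the sketch below pairs each estimate column with its partner so the two can be read together. `pair_estimates_with_moe` is a hypothetical helper, not part of `column_mapper`.

```python
def pair_estimates_with_moe(columns):
    """Map each estimate column ('...e') to its margin-of-error column ('...m')."""
    available = set(columns)
    pairs = {}
    for name in columns:
        if name.endswith("e"):
            moe_name = name[:-1] + "m"
            # Names like 'state' or 'opzone' end in 'e' but have no '...m' partner.
            if moe_name in available:
                pairs[name] = moe_name
    return pairs


# Column names taken from the preview above, including two without a partner.
cols = ["s2701_c05_015e", "s2701_c05_015m", "b19083_001e", "b19083_001m", "state", "opzone"]
print(pair_estimates_with_moe(cols))
```

With the real data, the same call could be made as `pair_estimates_with_moe(list(inventory.columns))`.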