CTA-NorCal Homeless Program Outcomes Analysis
Which variables best predict whether an individual's outcome is categorized as ‘in permanent housing’? We examine this question for each of the following population segments:
- Chronically Homeless
- Continuously Homeless
- Has Disabling Condition
- Domestic Violence Victim
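To make the segment-level question concrete, here is a minimal sketch of computing the share of individuals with a permanent housing outcome in each segment. The field names (`chronically_homeless`, `in_permanent_housing`, and so on) and the toy records are hypothetical and do not come from the HMIS dataset; an individual may belong to more than one segment.

```python
from collections import defaultdict

# Hypothetical segment flags; a real HMIS export uses its own column names.
SEGMENTS = [
    "chronically_homeless",
    "continuously_homeless",
    "disabling_condition",
    "domestic_violence_victim",
]

def permanent_housing_rate_by_segment(records):
    """Return {segment: share of flagged individuals in permanent housing}."""
    totals = defaultdict(int)
    housed = defaultdict(int)
    for rec in records:
        for seg in SEGMENTS:
            if rec.get(seg):
                totals[seg] += 1
                if rec["in_permanent_housing"]:
                    housed[seg] += 1
    # Segments with no flagged individuals are left out of the result.
    return {seg: housed[seg] / totals[seg] for seg in SEGMENTS if totals[seg]}

# Made-up toy records, for illustration only.
toy_records = [
    {"chronically_homeless": True, "disabling_condition": True, "in_permanent_housing": True},
    {"chronically_homeless": True, "in_permanent_housing": False},
    {"domestic_violence_victim": True, "in_permanent_housing": True},
]
rates = permanent_housing_rate_by_segment(toy_records)
```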
The data is in HMIS format, a data standard defined by the US Department of Housing and Urban Development.
See the HMIS Data Science Study Presentation for a summary of our findings.
- Load and clean the data
- Explore the data
- Engineer features to prepare input variables
- Make outcome predictions with a logistic regression model
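The last two steps can be sketched as follows, using only the Python standard library. This is an illustrative sketch, not the code from the project notebooks: the `program_type` column, its values, and the toy outcomes are made up, and a real analysis would more likely rely on an established library implementation of logistic regression.

```python
import math

def one_hot(values):
    """Feature engineering: turn a categorical column into 0/1 indicator columns."""
    categories = sorted(set(values))
    rows = [[1.0 if v == c else 0.0 for c in categories] for v in values]
    return rows, categories

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the mean log loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            # Prediction error for this row: predicted probability minus label.
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict_proba(w, b, x):
    """Predicted probability of the positive outcome for one feature row."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

# Made-up toy data: hypothetical program type and outcome (1 = in permanent housing).
program_type = ["shelter", "rapid_rehousing", "shelter",
                "rapid_rehousing", "shelter", "rapid_rehousing"]
in_permanent_housing = [0, 1, 0, 1, 1, 1]

X, categories = one_hot(program_type)  # categories: ["rapid_rehousing", "shelter"]
w, b = fit_logistic(X, in_permanent_housing)

p_rapid_rehousing = predict_proba(w, b, [1.0, 0.0])
p_shelter = predict_proba(w, b, [0.0, 1.0])
```

On this toy data the fitted model assigns a higher probability of permanent housing to the hypothetical `rapid_rehousing` group than to `shelter`, which mirrors the outcome rates in the made-up labels.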
Install Jupyter Notebook; this is most easily done by installing Anaconda: https://www.continuum.io/downloads
Install seaborn. To do this in a new conda environment:
conda create --name datasci seaborn
To activate or deactivate the environment (with conda 4.4 or later, use `conda activate datasci` and `conda deactivate` instead):
source activate datasci
source deactivate
- Fork this repository and clone it locally.
- Locate the dataset (pinned in #datasci-homeless on Slack).
- Navigate to notebooks/load_data_example_v2.ipynb to start exploring the data.
Additional information on completed and open items can be found in the pinned documents in #datasci-homeless.