Forecasting Regional Aggregate Establishment Birth-Death Values: Using Algorithmic Modeling

Overview

This repository is the working codebase for our machine learning employment forecasting models, developed in partnership with the San Diego Association of Governments (SANDAG). Using several Python machine learning packages, we trained and evaluated candidate model architectures to forecast establishment growth in San Diego County.

There are currently two ways you can interact with our work:

1. Running our executable script run.py (recommended)
2. Exploring our work through our interactive Python development notebooks
- note the disclosures in notebooks/README.md

Setting up the Environment

Before running our models, you must first replicate our virtual environment by following the steps below:

Clone the repository and navigate to the project directory:

git clone https://github.com/inno-apfel/regional-business-growth-forecasting.git
cd regional-business-growth-forecasting

Create and activate a conda environment from the provided enviroment.yml file (note the spelling of the file name in this repository):

conda env create -f enviroment.yml
conda activate regional-business-growth-forecasting
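Before moving on, it can help to confirm that the environment is actually active. The following is a minimal sketch, not part of the repository: it relies on the CONDA_DEFAULT_ENV variable that conda sets on activation, and the helper name conda_env_is_active is hypothetical.

```python
import os

# Environment name taken from the `conda activate` command above.
EXPECTED_ENV = "regional-business-growth-forecasting"


def conda_env_is_active(expected=EXPECTED_ENV, environ=None):
    """Return True if the named conda environment is the active one.

    Hypothetical helper for illustration; it is not defined in this repository.
    """
    environ = os.environ if environ is None else environ
    # conda sets CONDA_DEFAULT_ENV to the active environment's name on activation.
    return environ.get("CONDA_DEFAULT_ENV") == expected


if __name__ == "__main__":
    if conda_env_is_active():
        print("Environment is active.")
    else:
        print(f"Environment not active. Run: conda activate {EXPECTED_ENV}")
```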

Retrieving the Data Locally:

Before running our forecasting models, you must set up a few required datasets. Our models are built on the Census Bureau's County Business Patterns ZIP code industry details and totals datasets, along with several American Community Survey sociodemographic and economic datasets. For ease of use and reproducibility, a mirror of the datasets we used can be obtained (here).

To access the included data, extract the compressed archive into the src/data directory. This should create a raw folder inside src/data.
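The extraction step can also be scripted. The sketch below assumes the mirror is distributed as a zip archive; the archive name data.zip and the helper extract_data are hypothetical and not part of the repository. It extracts into src/data and checks that the raw folder appears.

```python
import zipfile
from pathlib import Path


def extract_data(archive_path, data_dir="src/data"):
    """Extract a zip archive into data_dir and return the path of the raw folder.

    Hypothetical helper for illustration; assumes the archive is a zip file
    whose top-level folder is named `raw`.
    """
    data_dir = Path(data_dir)
    data_dir.mkdir(parents=True, exist_ok=True)

    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(data_dir)

    # The README states that extraction should create src/data/raw.
    raw_dir = data_dir / "raw"
    if not raw_dir.is_dir():
        raise FileNotFoundError(f"Expected {raw_dir} to exist after extraction")
    return raw_dir


if __name__ == "__main__":
    # `data.zip` is a placeholder for wherever you saved the downloaded archive.
    if Path("data.zip").exists():
        print(f"Data extracted to {extract_data('data.zip')}")
```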

Last Updated: 03/10/2024

Running the Project

To use our forecasting models, run the run.py script from your terminal with the following targets:

data: load and process the data according to config.json
- by default, models are trained and evaluated on data for the San Diego region from 2012 to 2021, which reproduces the results in report.pdf
- update config.json to include the relevant ZIP codes and years if you want to explore different regions or year ranges
features: build the necessary features for our models
models:
1. train and evaluate our forecasting models for immediate-next-year and long-term forecasting
2. generate comparison visualizations of model forecasts on the test data
forecast:
- generate choropleth maps of model forecasts for the last year in the test data
- generate region-level forecasts from the end of the training data up to a user-specified year
  - uses our autoregressive feedback LSTM models
clean:
- remove all temporary files in the following directories:
  - src/data/temp
  - out/forecast_tables
  - out/models
  - out/plots
all: run the above targets, except clean, in sequential order
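To make the relationship between the targets concrete, here is a minimal sketch of how a list of requested targets could be expanded into an ordered list of steps. It is an illustration only and does not reproduce the logic in run.py: the function resolve_targets and the constants are hypothetical, and only the target names and the behavior of all come from the list above.

```python
# Order in which `all` runs the targets; `clean` is deliberately excluded.
PIPELINE = ["data", "features", "models", "forecast"]
VALID_TARGETS = set(PIPELINE) | {"clean", "all"}


def resolve_targets(requested):
    """Expand `all` and drop repeats while keeping the requested order.

    Hypothetical helper for illustration; run.py may handle targets differently.
    """
    unknown = [target for target in requested if target not in VALID_TARGETS]
    if unknown:
        raise ValueError(f"Unknown target(s): {', '.join(unknown)}")

    resolved = []
    for target in requested:
        steps = PIPELINE if target == "all" else [target]
        for step in steps:
            if step not in resolved:
                resolved.append(step)
    return resolved


if __name__ == "__main__":
    print(resolve_targets(["clean", "all"]))
```

Under this sketch, requesting clean followed by all would first remove the saved outputs and then run the four pipeline steps in order.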

Note: stable versions of the saved models are included by default. Run clean to delete them before retraining.
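The forecast target extends predictions beyond the training data by feeding model outputs back in as inputs. The sketch below illustrates that general autoregressive feedback idea with a simple stand-in predictor in place of an LSTM. It is not the repository's implementation: the function names, the window parameter, and the drift model are all assumptions made for illustration.

```python
def autoregressive_forecast(predict_next, history, last_year, target_year, window=3):
    """Roll a one-step-ahead model forward, feeding each prediction back as input.

    predict_next: callable taking the most recent `window` values and
                  returning the next value (a stand-in for a trained model).
    history:      observed values, ending at `last_year`.
    """
    if target_year <= last_year:
        raise ValueError("target_year must be later than last_year")
    if len(history) < window:
        raise ValueError("history must contain at least `window` values")

    values = list(history)
    forecasts = {}
    for year in range(last_year + 1, target_year + 1):
        next_value = predict_next(values[-window:])
        forecasts[year] = next_value
        # Feedback step: the prediction becomes part of the next input window.
        values.append(next_value)
    return forecasts


def drift_model(recent):
    """Stand-in predictor: last value plus the average change over the window."""
    return recent[-1] + (recent[-1] - recent[0]) / (len(recent) - 1)


if __name__ == "__main__":
    # Made-up establishment counts, for illustration only.
    observed = [100.0, 104.0, 108.0]
    print(autoregressive_forecast(drift_model, observed, last_year=2021, target_year=2025))
```

Because each step consumes earlier predictions, errors can compound as the horizon grows, which is worth keeping in mind when choosing the final forecast year.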
