DEEP-MAPS Model of the Labor Force

This is the source code used to produce DEEP-MAPS estimates of the labor force, as described in this working paper.

Data Sources

To run this code, you will have to download (publicly available) data from the following sources:

Code is provided to download other data:


The code was primarily written in R. The following packages are used:

We also used the Vertica database to load and transform the CPS/ACS microdata. The transformations are straightforward and written in SQL; an alternative database can be used to run them, or they can be done inside R using the sqldf package (or others), given sufficient memory.
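As a rough illustration of the in-R alternative, the sketch below runs a SQL aggregation against a data frame with `sqldf::sqldf()`. The data frame, its column names, and the query are made up for illustration and are not taken from this repository's SQL; if sqldf is not installed, a base R fallback computes the same result.

```r
# Toy stand-in for CPS microdata; columns and values are illustrative only.
cps <- data.frame(
  state    = c("NY", "NY", "CA", "CA", "CA"),
  employed = c(1, 0, 1, 1, 0),
  weight   = c(1200, 800, 1500, 500, 1000)
)

# Hypothetical transformation: weighted employment rate by state.
query <- "
  SELECT state,
         SUM(weight * employed) / SUM(weight) AS emp_rate
  FROM cps
  GROUP BY state
  ORDER BY state
"

if (requireNamespace("sqldf", quietly = TRUE)) {
  # sqldf resolves the table name in FROM to the data frame `cps`.
  emp_by_state <- sqldf::sqldf(query)
} else {
  # Base R fallback computing the same weighted rate per state.
  num <- tapply(cps$weight * cps$employed, cps$state, sum)
  den <- tapply(cps$weight, cps$state, sum)
  emp_by_state <- data.frame(
    state    = names(num),
    emp_rate = as.numeric(num / den)
  )
}

print(emp_by_state)
```

Because sqldf copies data frames into an in-memory database by default, this approach is practical only when the microdata fits in memory, as noted above.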

Directory Structure

Here is a brief description of the most important components of the code. To run the code, you will have to change the working directory name (or remove it entirely) and download the data and packages listed above. To understand how the model works (and the order in which the code needs to be run), see the working paper. If there is sufficient outside interest, we intend to clean up and document the code more thoroughly in the future.

  • unemployment_insurance_claims: downloads data for weekly unemployment insurance claims.
  • unemployment_cps_mrp
    • mrsp: creates synthetic joint distributions at state, county, and tract level, built from ACS data. The structure of the code to create the county-level distributions is:
    • run_mrp: runs the bulk of the modeling and poststratification:
      • 01_load_acsocc_data.R: load data from the ACS, and prep data for the occupation/industry models.
      • 02_occ_models.R: fit occupation/industry models, to be used in the Demographically Adjusted Geographic Predictors (DAGPs).
      • 03a_ipums_version.R: run the labor force models, using IPUMS data as the source for microdata.
      • 03b_bls_version.R: run the labor force models, downloading data straight from the BLS website (if IPUMS data is not available yet).
      • 04a_state_yhat.R: apply models to state-level joint distribution.
      • 04b_county_yhat.R: apply models to county-level joint distribution.
      • 04c_tract_yhat.R: apply models to tract-level joint distribution.
      • 05_correct_to_laus.R: adjust estimates once new LAUS data is available.
      • 06_cps_checks.R: check resulting model output against CPS and LAUS.
    • cross_dataset_variable_lineups: where we lined up the different coding schemes across data sources.
    • downloaded_data: code to download some of the necessary data.
    • helper_functions: functions used throughout the rest of the process.
    • adhoc: additional pieces; currently holds scripts for prepping additional tract-level data.
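
To make the poststratification and correction steps in run_mrp more concrete, here is a minimal sketch in base R. It applies cell-level predictions to a toy county-level joint distribution, aggregates them to county rates, and then rescales the counts to match an official total. All names and numbers are hypothetical, and the simple proportional rescaling is an assumption standing in for the LAUS correction; the working paper describes the method actually used.

```r
# Toy joint distribution: one row per county x demographic cell.
# Counts and predictions are made up for illustration.
cells <- data.frame(
  county = c("A", "A", "B", "B"),
  group  = c("young", "old", "young", "old"),
  n      = c(1000, 3000, 2000, 2000),  # population count in the cell
  yhat   = c(0.10, 0.04, 0.08, 0.05)   # modeled unemployment rate for the cell
)

# Expected number of unemployed people in each cell.
cells$unemp <- cells$n * cells$yhat

# Poststratify: sum counts within county, then take the ratio.
by_county <- aggregate(cbind(n, unemp) ~ county, data = cells, FUN = sum)
by_county$rate <- by_county$unemp / by_county$n

# Hypothetical official total to match (standing in for a LAUS figure).
official_total <- 500

# Assumed correction: scale every county by the same factor so the
# modeled counts sum to the official total.
adj_factor <- official_total / sum(by_county$unemp)
by_county$unemp_adj <- by_county$unemp * adj_factor
by_county$rate_adj  <- by_county$unemp_adj / by_county$n

print(by_county)
```

The same pattern applies at the state and tract levels; only the grouping variable changes.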


This code was primarily written by Yair Ghitza, in collaboration with Mark Steitz. Other members of the Catalist team contributed ideas and code review. See the working paper for additional acknowledgements.




[1] Sarah Flood, Miriam King, Renae Rodgers, Steven Ruggles and J. Robert Warren. Integrated Public Use Microdata Series, Current Population Survey: Version 7.0 [dataset]. Minneapolis, MN: IPUMS, 2020.

[2] Steven Ruggles, Sarah Flood, Ronald Goeken, Josiah Grover, Erin Meyer, Jose Pacas and Matthew Sobek. IPUMS USA: Version 10.0 [dataset]. Minneapolis, MN: IPUMS, 2020.

