DEEP-MAPS Model of the Labor Force

This is the source code used to produce DEEP-MAPS estimates of the labor force, as described in this working paper.

Data Sources

To run this code, you will need to download publicly available data from the following sources:

IPUMS CPS: microdata for CPS surveys [1]
IPUMS USA: microdata for ACS surveys [2]
Weekly unemployment insurance claims: from the US Department of Labor. Code in the unemployment_insurance_claims folder scrapes historical data, but the most recent weeks have to be downloaded manually in PDF form.

Code is provided to download other data:

Geographic ACS averages: we use the tidycensus package.
CPS Time Series.
Local Area Unemployment Statistics.

Software

The code was primarily written in R. The following packages are used:

We also used the Vertica database to load and transform the CPS/ACS microdata. The transformations are straightforward and written in SQL; an alternative database can be used to do these transformations, or they can be done inside of R using the sqldf package (or others), given sufficient memory resources.
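
As a rough illustration of the kind of SQL transformation described above, the following sketch runs a weighted aggregation in an in-memory SQLite database. It is written in Python only to keep the example self-contained; the repository's own transformations were run in Vertica. The table name, column names, and values are hypothetical and do not reflect the actual CPS/ACS schema.

```python
import sqlite3

# Hypothetical microdata rows: (state, employment status, survey weight).
# These names and values are illustrative, not the repository's schema.
rows = [
    ("OH", "employed", 1500.0),
    ("OH", "unemployed", 250.0),
    ("OH", "employed", 1250.0),
    ("PA", "employed", 2000.0),
    ("PA", "unemployed", 500.0),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE microdata (state TEXT, status TEXT, weight REAL)")
conn.executemany("INSERT INTO microdata VALUES (?, ?, ?)", rows)

# Weighted count of respondents per state and employment status.
query = """
    SELECT state, status, SUM(weight) AS weighted_n
    FROM microdata
    GROUP BY state, status
    ORDER BY state, status
"""
weighted_counts = conn.execute(query).fetchall()
conn.close()

for state, status, weighted_n in weighted_counts:
    print(state, status, weighted_n)
```

The same GROUP BY query should carry over to other SQL engines with little or no change, which is why the choice of database is not essential here.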

Directory Structure

Here is a brief description of the most important components of the code. To run the code, you will have to change the working directory name (or remove it entirely) and download the data and packages listed above. To understand how the model works and the order in which the code needs to be run, see the working paper. If there is sufficient outside interest, we intend to clean up and document the code more thoroughly.

unemployment_insurance_claims: downloads data for weekly unemployment insurance claims.
unemployment_cps_mrp
- mrsp: creates synthetic joint distributions at state, county, and tract level, built from ACS data. The structure of the code to create the county-level distributions is:
  - 01_county_2018_joint.R: turns "marginal" demographic distributions into a full joint distribution for every county.
  - 02_county_2018_occscore.R: applies various occupation/industry models to each cell.
  - 03_county_2018_occshift.R: adjusts each cell to add up to the correct number at the county level.
  - Similar code is found for the tract level. For the state level, the joint demographic distribution is pulled directly from ACS microdata, but we still do steps 2 and 3.
- run_mrp: runs the bulk of the modeling and poststratification:
  - 01_load_acsocc_data.R: load data from the ACS, and prep data for the occupation/industry models.
  - 02_occ_models.R: fit occupation/industry models, to be used in the Demographically Adjusted Geographic Predictors (DAGPs).
  - 03a_ipums_version.R: run the labor force models, using IPUMS data as the source for microdata.
  - 03b_bls_version.R: run the labor force models, downloading data straight from the BLS website (if IPUMS data is not available yet).
  - 04a_state_yhat.R: apply models to state-level joint distribution.
  - 04b_county_yhat.R: apply models to county-level joint distribution.
  - 04c_tract_yhat.R: apply models to tract-level joint distribution.
  - 05_correct_to_laus.R: adjust estimates once new LAUS data is available.
  - 06_cps_checks.R: check resulting model output against CPS and LAUS.
- cross_dataset_variable_lineups: lines up the different coding schemes across data sources.
- downloaded_data: code to download some of the necessary data.
- helper_functions: functions used throughout the rest of the process.
- adhoc: additional pieces; currently holds scripts for prepping additional tract-level data.
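
The poststratification idea behind the mrsp and run_mrp steps can be sketched in a few lines. The example below is a minimal illustration in Python, not the repository's R code. It assumes the demographic marginals are independent when building the joint distribution (the method actually used is described in the working paper and may differ), uses made-up cell-level rates in place of fitted model predictions, and rescales the cell estimates proportionally to a hypothetical control total. All function names and numbers are assumptions for illustration.

```python
from itertools import product


def joint_from_marginals(total, marginals):
    """Build a joint distribution of counts, ASSUMING independent marginals.

    marginals maps a variable name to {category: share}; shares sum to 1.
    Returns {(category_1, category_2, ...): count}.
    """
    names = list(marginals)
    joint = {}
    for combo in product(*(marginals[name] for name in names)):
        share = 1.0
        for name, category in zip(names, combo):
            share *= marginals[name][category]
        joint[combo] = total * share
    return joint


def apply_rates(joint, rates):
    """Multiply each cell count by its predicted rate."""
    return {cell: count * rates[cell] for cell, count in joint.items()}


def scale_to_total(cell_values, target):
    """Rescale all cells proportionally so they sum to the target."""
    factor = target / sum(cell_values.values())
    return {cell: value * factor for cell, value in cell_values.items()}


# Hypothetical county: 1,000 people, two demographic marginals.
marginals = {
    "age": {"young": 0.4, "old": 0.6},
    "sex": {"f": 0.5, "m": 0.5},
}
joint = joint_from_marginals(1000, marginals)

# Made-up unemployment rates per cell, standing in for model predictions.
rates = {
    ("young", "f"): 0.10,
    ("young", "m"): 0.12,
    ("old", "f"): 0.05,
    ("old", "m"): 0.06,
}
raw = apply_rates(joint, rates)

# Hypothetical control total (for example, an official county count).
adjusted = scale_to_total(raw, 70)

print(round(sum(raw.values()), 6), round(sum(adjusted.values()), 6))
```

Proportional rescaling keeps the relative sizes of the cells unchanged while forcing their sum to match the control total, which is the simplest way to reconcile cell-level estimates with an external figure.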

Contributors

This code was primarily written by Yair Ghitza, in collaboration with Mark Steitz. Other members of the Catalist team contributed ideas and code review. See the working paper for additional acknowledgements.

License

See LICENSE.md.

References

[1] Sarah Flood, Miriam King, Renae Rodgers, Steven Ruggles and J. Robert Warren. Integrated Public Use Microdata Series, Current Population Survey: Version 7.0 [dataset]. Minneapolis, MN: IPUMS, 2020. https://doi.org/10.18128/D030.V7.0

[2] Steven Ruggles, Sarah Flood, Ronald Goeken, Josiah Grover, Erin Meyer, Jose Pacas and Matthew Sobek. IPUMS USA: Version 10.0 [dataset]. Minneapolis, MN: IPUMS, 2020. https://doi.org/10.18128/D010.V10.0
