This repository contains the data and code underlying the paper "The Comparative Advantage of Cities" by Donald Davis and Jonathan Dingel. This replication package was prepared by Jonathan Dingel, with assistance from Dylan Clarke, Luis Costa, Antonio Miscio, and Shirley Yarin.
Our project is organized as a series of tasks.
The main project directory contains 9 folders that represent 9 tasks.
Each task folder contains three folders:
A task's output is used as an input by one or more downstream tasks.
This graph depicts the input-output relationships between tasks.
We use Unix's `make` utility to automate this workflow.
After downloading this replication package (and installing the relevant software), you can reproduce the figures and tables appearing in the published paper and the online appendix by typing `make` at the command line.
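Before typing `make`, it can help to confirm that the required software is on your `PATH`. The following is a minimal sketch, not part of the replication package: the helper name `check_tools` is hypothetical, and only `make` is checked here because it is the one command this README names explicitly. Add the other programs your setup needs.

```shell
#!/bin/sh
# Hypothetical pre-flight helper (not shipped with the package):
# report whether each command passed as an argument is on the PATH.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    # Exit status is 0 only if every tool was found.
    return "$missing"
}

# Assumption: `make` is the only tool checked; extend the list as needed.
check_tools make || echo "Install the missing software before running make."
```

The function only reports what it finds; it does not install anything or change your environment.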
Note to Mac OS X users:
The code presumes that Stata scripts can be run from Terminal via the command
Please follow the instructions for Running Stata from the Terminal.
Download and configure
- Download (or clone) this repository by clicking the green `Clone or download` button above. Uncompress the ZIP file into a working directory on your cluster or local machine.
- Download the IPUMS micro data from https://usa.ipums.org/usa/. You will need to register as an IPUMS-USA user in order to download the public-use micro data from the 1980 and 2000 Census of Population releases. See details below.
- (Optional) If you will be running your jobs using Slurm on a computing cluster, edit the file `commoncode/code/run.sbatch` to specify the
IPUMS download details
- The files in `initialdata/input` contain lists of the variables to download.
- Do not extract any extra variables beyond the ones listed. The scripts in `initialdata/code` make assumptions about the contents of these files.
- Rename the `.dat` files that you download from IPUMS as `IPUMS_2000.dat` and place them in the
Running `make` in the working directory at the Linux/Mac OS X command line will execute all the project code.
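Because the scripts make assumptions about the downloaded files, it may be worth confirming that a renamed extract is in place before running `make`. This is a hedged sketch only: `check_extract` and the `DATA_DIR` variable are hypothetical names introduced for illustration, and you need to set `DATA_DIR` to the folder where you placed the files.

```shell
#!/bin/sh
# Hypothetical helper (not part of the package): confirm that a renamed
# IPUMS extract exists and is non-empty in the given directory.
check_extract() {
    dir=$1
    name=$2
    if [ -s "$dir/$name" ]; then
        echo "ok: $dir/$name"
    else
        echo "missing or empty: $dir/$name"
        return 1
    fi
}

# Assumption: DATA_DIR points at the folder holding the renamed extracts.
# It defaults to the current directory purely for illustration.
check_extract "${DATA_DIR:-.}" IPUMS_2000.dat \
    || echo "Place the renamed extract in the data folder before running make."
```

The check looks only at file presence and size; it does not verify that the extract contains the expected variables.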
Notes on running code
- It is best to replicate the project using the `make` approach described above. Nonetheless, it is also possible to produce the results task-by-task in the order depicted in the flow chart. If all upstream tasks have been completed, you can complete a task by navigating to the task's `code` directory and typing
`install_packages` tasks require an internet connection.
- The task `permutation_tests` is slow: it involves 30 (parallel) jobs that take about 12 hours each.
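To budget compute time for `permutation_tests`, the figures above can be turned into a rough estimate. The sketch below assumes every job takes about 12 hours and that jobs run in equal-length batches, one job per worker. The function name `wall_clock_hours` is hypothetical, and actual run times will depend on your hardware and scheduler.

```shell
#!/bin/sh
# Rough runtime arithmetic for permutation_tests, using the figures in this README.
n_jobs=30          # number of jobs
hours_per_job=12   # approximate hours per job

# Total compute across all jobs, in job-hours.
total_job_hours=$((n_jobs * hours_per_job))
echo "total job-hours: $total_job_hours"   # prints "total job-hours: 360"

# Hypothetical helper: estimated wall-clock hours given a number of workers,
# assuming jobs run in batches of that size (ceiling division).
wall_clock_hours() {
    workers=$1
    batches=$(( (n_jobs + workers - 1) / workers ))
    echo $((batches * hours_per_job))
}

wall_clock_hours 30   # prints 12: all jobs run at once
wall_clock_hours 10   # prints 36: three batches of ten
```

With fewer workers than jobs, the estimate grows in steps of one batch, so running on a single core would take roughly 360 hours under these assumptions.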