This repository was created to work on the paper "Does Compulsory School Attendance Affect Schooling and Earnings?" (1991) by Joshua D. Angrist and Alan B. Krueger. The group consists of Frank Alvarado, Jeffry Bedia, Fernanda Martel, and Andrea Romero.
1. Requirements
2. Setup
3. Folders
4. Files
5. Leveraging GitHub Capabilities
6. Writing
7. Journal Submissions
8. Principles
1. Requirements

This workflow requires:
- Python [Free]
- LaTeX [Free]
- Stata [Licensed]
2. Setup

Originally written for macOS (Apple) environments, but feel free to adapt it to Windows (and please share it with me!).

   1. Create or join a cloud folder (e.g. on Dropbox or Drive) where large non-versioned files will reside.
   2. Clone this repository to a local folder outside the cloud folder.
   3. Fill in the correct environment paths in setup.sh.
   4. Run sh setup.sh to create symbolic links to all non-versioned folders.

You're good to go. This repository is now ready for the standard workflow described below.
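The contents of setup.sh are not shown in this README, so the following is only a minimal Python sketch of the linking step, not the script itself. It assumes the cloud folder holds one subfolder per linked folder (input, output, tmp); the function name link_folders and that layout are hypothetical.

```python
from pathlib import Path

# Folders described in this README as symbolic links to non-versioned folders.
# That the cloud folder mirrors these names is an assumption.
LINKED_FOLDERS = ("input", "output", "tmp")


def link_folders(repo_dir, cloud_dir, names=LINKED_FOLDERS):
    """Create repo_dir/<name> as a symbolic link to cloud_dir/<name>."""
    repo_dir, cloud_dir = Path(repo_dir), Path(cloud_dir)
    created = []
    for name in names:
        target = cloud_dir / name
        target.mkdir(parents=True, exist_ok=True)  # make sure the target exists
        link = repo_dir / name
        if link.is_symlink():
            link.unlink()  # replace a stale link from an earlier setup
        elif link.exists():
            # Never overwrite a real folder that may hold data.
            raise FileExistsError(f"{link} exists and is not a symbolic link")
        link.symlink_to(target, target_is_directory=True)
        created.append(link)
    return created
```

Calling link_folders with the local clone and the cloud folder as arguments would leave input, output, and tmp in the clone pointing at the cloud copies. Refusing to replace a real folder is a deliberate choice here, so that rerunning the setup cannot delete data by accident.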
3. Folders

/code
Versioned folder containing code that builds data and performs analyses.
- All output data should be redirected into /output/data/, with one data file per observation level.
- All output logs should be redirected into /output/logs/.
- Other output files should be redirected into /output/tables/ or /output/figures/.
- Keep all code clean and modularized.
- /sub: Holds modularized code to implement subroutines for build and analysis code.

/input
Symbolic link to non-versioned folder with input data.
- Any original data source should be included here in clean and normalized form.
- Only include cleaned files. Raw external files should be cleaned in each data-source-specific folder.
- These data sets will then be manipulated and merged by the files in /code.

/output
Symbolic link to non-versioned folder with output data.
- Holds built data sets in /output/data/, to be then used in analysis code.
- Contains all analysis objects generated by files in /code.
- Serves as the source for generating .tex files inside /products/.

/tmp
Symbolic link to non-versioned folder with temporary files.
- Contains any temporary file created during the manipulation of input data sets or the analysis routine.

/extra
Contains any extra files relevant to the paper. Examples: grant materials, previous analyses, submissions.

/products
Versioned folder containing files for preliminary results, papers, talks, and others.
- /sub: Curated set of packages and shortcuts commonly used in Social Science papers and presentations.
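One way to keep the output conventions for /code consistent is to route every output file through a single helper. The sketch below is illustrative only: the subfolder names come from this README, while the helper name output_path and the mapping from file extensions to subfolders are assumptions.

```python
from pathlib import Path

# Subfolders follow the /code conventions in this README.
# Which extension goes where is an assumption made for this sketch.
OUTPUT_SUBFOLDERS = {
    ".csv": "data",
    ".dta": "data",
    ".log": "logs",
    ".tex": "tables",
    ".pdf": "figures",
    ".png": "figures",
}


def output_path(filename, output_root="output"):
    """Return the path under output_root where filename should be written."""
    name = Path(filename).name
    suffix = Path(name).suffix.lower()
    if suffix not in OUTPUT_SUBFOLDERS:
        raise ValueError(f"no output subfolder defined for {name!r}")
    return Path(output_root) / OUTPUT_SUBFOLDERS[suffix] / name
```

For example, output_path("person.dta") resolves to output/data/person.dta and output_path("build.log") to output/logs/build.log, where both file names are hypothetical.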
4. Files

run_paper.py
Automates the whole paper construction.
- Runs everything in a pre-specified order, from beginning (building data sets) to end (compiling .tex files).
- Makes clear what should be run when.
- Also cleans the /output and /tmp folders before running other code.

/code/get_input.py
Erases every file inside /input and copies in the original data sets from outside sources.
- Ensures consistency between original data generation and data building for the paper.

5. Leveraging GitHub Capabilities

- Use issues as tasks. Track them all on a project board named "Tasks".
- Add tags to tasks to track progress by area. Some template tags are included: build, analysis, writing, review, enhancement, bug.
- Name commits following conventional notation.
- Add forward-looking tags and milestones to plan and version work. These help mark relevant releases, such as a minimum viable product (MVP), a paper submission, or a talk. Use semantic versioning for naming, e.g. v0.1, v1.0.2.
- Only modify files via pull requests. Use closing keywords to close issues.
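Returning to run_paper.py as described above: its behavior (clean /output and /tmp, then run every step in a fixed order) could be sketched roughly as follows. This is not the actual script. The STEPS list is hypothetical and names only /code/get_input.py, the one file this README mentions; build, analysis, and LaTeX steps would be appended in the order they must run.

```python
import shutil
import subprocess
import sys
from pathlib import Path

# Hypothetical step list. Only code/get_input.py is named in this README;
# further build, analysis, and LaTeX commands would follow in run order.
STEPS = [
    [sys.executable, "code/get_input.py"],
]


def clean_folder(folder):
    """Delete everything inside folder but keep the folder (or link) itself."""
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)
    for entry in folder.iterdir():
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)
        else:
            entry.unlink()


def run_paper(steps=STEPS, root=".", folders_to_clean=("output", "tmp")):
    """Clean the output folders, then run each step in order from root."""
    root = Path(root)
    for name in folders_to_clean:
        clean_folder(root / name)
    for command in steps:
        # check=True stops the pipeline at the first failing step.
        subprocess.run(command, cwd=root, check=True)
```

Emptying the folders rather than deleting them keeps the symbolic links created during setup intact. Stopping at the first failure means that later steps never run on stale or partial output.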
6. Writing

All writing should be done within the repository to preserve versioning and consistency.
- Keep a set of continuously updated slides reflecting the current state and vision of the project.
- Sync this repository with online LaTeX editing tools, such as Overleaf, for simultaneous editing and comments.
- Follow these instructions to integrate Overleaf with Dropbox. Sync the Overleaf project to a folder /products/paper.
- Important: the project's "true" state stays in GitHub. Changes must therefore always be committed to GitHub after edits.
- Keep all .bib references organized in /extra/references/library.bib. See the principle about reference manager systems below.