Replication data and code for "The Unequal Impact of Parenthood in Academia"
Contains several helper or one-off scripts (scripts), as well as notebooks and scripts (allocation.py & difference_in_differences.py) for the major analyses. Information on respondents' knowledge of, perceptions of, and usage of parental leave policies is in parental_support_responses.ipynb (Fig 3 & Table 1). Comparisons of our survey respondents with the population they are drawn from are in survey_post_stratification.ipynb (Tables S1-S4).
The script allocation.py reads in censored survey data and allocates counterfactual births to the non-parent group. It also generates several plots comparing parents and non-parents (Figs 1 & S1-S2). The files it generates are read by difference_in_difference.py, where the comparative interrupted time series modeling is carried out (Figs 2 & 4 & Tables S5-S7).
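As a rough illustration of the parent versus non-parent comparison, the sketch below computes a plain two-period difference in differences on made-up numbers. It is a simplified stand-in, not the comparative interrupted time series model fitted in the script. The function name, the toy data, and the choice to treat t >= 0 as the "after" period are all assumptions made for this example.

```python
from statistics import mean


def diff_in_diff(treated, control):
    """Change in mean y from before to after birth, treated minus control.

    Each series is a list of (t, y) pairs. Treating t < 0 as "before" and
    t >= 0 as "after" is an assumption of this sketch.
    """
    def change(series):
        pre = mean(y for t, y in series if t < 0)
        post = mean(y for t, y in series if t >= 0)
        return post - pre

    return change(treated) - change(control)


# Hypothetical toy data: papers per year (y) at years relative to birth (t).
treated = [(-2, 3.0), (-1, 3.0), (0, 2.0), (1, 2.0)]
control = [(-2, 3.0), (-1, 3.0), (0, 3.0), (1, 3.0)]

estimate = diff_in_diff(treated, control)
print(estimate)
```

A negative estimate here would mean the treated group's output fell relative to the control group after t = 0. A regression-based model can additionally account for trends over time, which this sketch ignores.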
Given the sensitive nature of these data, we are releasing only separate files without identifiers, so that they cannot be joined together. Prestige has been binned into deciles, and real years and ages have been removed from the data to reduce the possibility of identifying individuals.
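To make the decile binning concrete, here is a minimal sketch of one way a rank could be coarsened into a decile. It is not the procedure used to prepare the released files. The function name, the 1-based rank input, and the convention that decile 1 holds the lowest ranks are assumptions made for illustration.

```python
import math


def to_decile(rank, n_institutions):
    """Map a 1-based rank among n_institutions to a decile from 1 to 10.

    Hypothetical helper: decile 1 holds the lowest-numbered ranks.
    """
    if not 1 <= rank <= n_institutions:
        raise ValueError("rank must lie between 1 and n_institutions")
    return math.ceil(10 * rank / n_institutions)


# With 200 institutions, each decile covers 20 consecutive ranks.
deciles = [to_decile(rank, 200) for rank in (1, 20, 21, 200)]
print(deciles)
```

Reporting only the decile means many institutions share each value, which is what makes it harder to trace a record back to a single institution.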
Basic statistics on faculty with and without children, such as their current institution's prestige decile and whether they had their child before the tenure track, can be found in general_parenthood_demographics.tsv. Data on usage and importance of parental leave by parenthood status and gender are in overall_parental_leave.tsv. The research expectations and goals of early-career faculty can be found in research_expectations.tsv. Parental leave policies (from here) have been replicated in parental_leave_policies_apr_2018.tsv.
The time series of faculty productivity were generated by the allocation.py script and contain data on papers (y) published per year relative to the first child's birth (t). Data for non-parents can be found in data/control and data for parents in data/treated. Career age has been omitted from the files to further protect respondent identity. Missing data (the respondent did not provide an answer) are coded as empty or -77.
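When loading these files, both missing-data codes need to be treated as missing rather than as numbers. The sketch below shows one way to do that with the standard library. The helper names and the inline sample are hypothetical, and the assumption that every column is numeric may not hold for all of the released files.

```python
import csv
import io

# Codes that mark a missing answer: an empty field or -77.
MISSING_CODES = {"", "-77"}


def parse_value(raw):
    """Return None for a missing-data code, otherwise the value as a float."""
    if raw is None:  # row had fewer fields than the header
        return None
    raw = raw.strip()
    if raw in MISSING_CODES:
        return None
    return float(raw)


def read_time_series(handle):
    """Read tab-separated rows into dicts, mapping missing codes to None."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [{key: parse_value(value) for key, value in row.items()}
            for row in reader]


# Hypothetical sample standing in for one file; real column sets may differ.
sample = io.StringIO("t\ty\n-1\t2\n0\t-77\n1\t\n2\t3\n")
rows = read_time_series(sample)

# Keep only the years with an observed paper count.
observed = [row["y"] for row in rows if row["y"] is not None]
print(observed)
```

To read a real file, pass an open file handle instead of the in-memory sample, for example `open(path, newline="")` with a path of your choosing.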