- This project contains Stata and R scripts that 'automagically' download labeled versions of IPEDS complete data files
  - The Stata implementation downloads the data, .do file, and dictionary for each file, cleans up any issues, then runs the .do file that IPEDS provides to add data labels
  - The R implementation reads the .do file into R as text, then converts it to labeling instructions that are passed to the `haven` R package
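To illustrate the general idea behind the R implementation, here is a minimal sketch of turning Stata `label define` lines into `haven` value labels. It is not the code used in `IPEDtaS.R`. The example lines, the regular expression, and the names `do_lines`, `control_labels`, and `control` are all assumptions made for illustration.

```r
library(haven)

# Hypothetical lines in the style of a Stata labeling .do file (not copied from IPEDS)
do_lines <- c(
  'label define label_control 1 "Public"',
  'label define label_control 2 "Private not-for-profit", add',
  'label define label_control 3 "Private for-profit", add'
)

# Capture the label set name, the numeric code, and the quoted label text
pattern <- '^label define (\\S+) (-?\\d+) "(.*)".*$'
parts <- regmatches(do_lines, regexec(pattern, do_lines))

# Each element of parts holds: full match, label set name, code, label text
codes <- as.numeric(vapply(parts, function(p) p[3], character(1)))
texts <- vapply(parts, function(p) p[4], character(1))

# haven expects a named vector: names are the labels, values are the codes
control_labels <- setNames(codes, texts)

# Attach the labels to a toy column of raw codes
control <- labelled(c(1, 3, 2), labels = control_labels)
print(control)
```

A real .do file contains more than `label define` lines (for example `label variable` and `label values`), so a full conversion needs to handle those as well.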
- The logic of using both scripts is very similar
  - Place either `IPEDtaS.do` (for Stata projects) or `IPEDtaS.R` (for R projects) in your main project folder
  - Edit the `selected_files` list to the IPEDS complete data files you need
    - By default the script will download the entirety of IPEDS, which takes multiple hours and around 10 GB
    - You can either delete or comment out any files you don't want. Simply download the script from this repository again if you need the full list back.
  - Hit "do" or "run"
  - After it's completed, both result in a `data/` folder containing labeled `.dta` files and a `dictionaries/` folder with the matching dictionaries
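As a rough illustration of the editing step in R, the sketch below assumes `selected_files` is a character vector of IPEDS complete data file names. The actual layout in `IPEDtaS.R` may differ, and the file names shown are examples only.

```r
# Assumption: selected_files is a character vector of IPEDS file names
selected_files <- c(
  "HD2022",      # keep: this file will be downloaded
  # "IC2022",    # commented out: this file will be skipped
  "EFFY2022"     # keep: this file will be downloaded
)

print(selected_files)
```

If the list is written this way, take care when commenting out the final entry: a trailing comma left inside `c()` causes an error in R.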
- The project is intended both to make IPEDS data files easier to work with and to enhance the reproducibility of research using IPEDS
  - I encourage you to include a copy of the `IPEDtaS` script you use in your analyses in any code you share for reproduction
    - If you do so, please cite the repository so others can easily find and use it:
Capaldi, M. J. (2024). IPEDtaS: Automagically Download Labeled .dta IPEDS Files in Stata and R (Version 0.1) [Computer software]. https://doi.org/10.5281/zenodo.13388846
Hints
- In R, use `as_factor()` from the `haven` library to convert a labeled column to a factor that uses the labels as its levels
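A small sketch of this hint follows. It uses a toy labeled vector rather than a real IPEDS column, so the variable name `control` and its codes and labels are assumptions for illustration.

```r
library(haven)

# Hypothetical labeled column standing in for an IPEDS variable
control <- labelled(
  c(1, 2, 3, 1),
  labels = c(
    "Public" = 1,
    "Private not-for-profit" = 2,
    "Private for-profit" = 3
  )
)

# Convert to a factor whose levels are the value labels
control_factor <- as_factor(control)
print(levels(control_factor))
```

With a data frame read from one of the labeled `.dta` files, the same call can be applied to a single column, or to the whole data frame to convert every labeled column at once.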