Notebook for looking at 35 years of historical US degrees data from NCES-IPEDS
AmacadCIPS.csv
outputs
.gitignore
1984.txt
1985.txt
1985to1990XWalk.csv
1986.txt
1987.txt
1988.txt
1990to2000XWalk.csv
1991.txt
1992.txt
1993.txt
1994.txt
1995.txt
1996.txt
1997.txt
Adjusted CIPS.csv
Aliens.Rmd
Atlantic.Rmd
Crosswalk2000to2010 (1).csv
EDA_functions.R
English_Model.csv
Long Term Humanities.xlsx
Perspectives.Rmd
README.md
Texas.Rmd
bookworming.Rmd
c1998_a.csv
c1999_a.csv
c2000_a.csv
c2001_a.csv
c2002_a.csv
c2003_a.csv
c2004_a.csv
c2005_a.csv
c2006_a.csv
c2007_a.csv
c2008_a_rv.csv
c2009_a_rv.csv
c2010_a.csv
c2010_a_rv.csv
c2011_a.csv
c2011_a_rv.csv
c2012_a.csv
c2012_a_rv.csv
c2013_a_rv.csv
c2014_a_rv.csv
c2015_a.csv
c2016_a.csv
c2017_a.csv
change_in_degrees_2011-2017.csv
functions.R
graduate_earnings_all.csv
hd2017.csv
history_changes.csv
ic2017_ay.csv
long_term.txt
long_term_gender.csv
pop_estimates.csv
read_cips.R
variable_fields_crosswalk.csv


US-degrees

This is a notebook for looking at 35 years of historical US degrees data from NCES-IPEDS.

Initial exploration for the blog post here is in the file bookworming.Rmd. The full text and code for the Atlantic article are in Atlantic.Rmd. That file does not contain the final edits on the piece.

The taxonomy used here is a modified version of one developed by the American Academy of Arts and Sciences for the Humanities Indicators. I generally defer to their categories, but I place communications in the social sciences (not the humanities) and don't put "General Studies" into the humanities bin. The precise taxonomy is in the folder AmacadCIPS.csv, using the file CIP-HI Crosswalk (1987-pres)-Table 1.csv. (Direct Link) I have changed this file to include some disciplines for non-humanities majors as well as to reflect my changed definition of the humanities. You can examine its version history if you want to see my exact changes.

The data was downloaded from the IPEDS series of the National Center for Education Statistics. Files ending in '.txt' are pre-cleaned.

Tidying is done in 'read_cips.R' (not really the right name for the file).

A few dplyr functions are bundled into 'EDA_functions.R'.

functions.R is legacy code that was used to build the pre-1998 data.

The data is encoded. Information about majors is in Adjusted CIPS.csv; information about institutions is in hd2017.csv.
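As a rough illustration of how coded rows can be labeled, the sketch below joins a toy completions table to two lookup tables with dplyr. Everything in it is an assumption made for the example: the column names (`UNITID`, `CIPCODE`, `INSTNM`, `major_name`, `degrees`), the codes, and the placeholder names are not taken from the files in this repository, so check the actual headers of Adjusted CIPS.csv and hd2017.csv before adapting it.

```r
library(dplyr)

# Toy stand-in for one year of degree counts (hypothetical columns and values).
completions <- data.frame(
  UNITID = c(1, 1, 2),
  CIPCODE = c("23.0101", "54.0101", "23.0101"),
  degrees = c(12, 8, 30),
  stringsAsFactors = FALSE
)

# Toy stand-in for the institution lookup (assumed to play the role of hd2017.csv).
institutions <- data.frame(
  UNITID = c(1, 2),
  INSTNM = c("Institution A", "Institution B"),
  stringsAsFactors = FALSE
)

# Toy stand-in for the majors lookup (assumed to play the role of Adjusted CIPS.csv).
majors <- data.frame(
  CIPCODE = c("23.0101", "54.0101"),
  major_name = c("English", "History"),
  stringsAsFactors = FALSE
)

# Attach readable names to each coded row; left joins keep every degree row.
labeled <- completions %>%
  left_join(institutions, by = "UNITID") %>%
  left_join(majors, by = "CIPCODE")

print(labeled)
```

Left joins are used so that a degree row with a code missing from a lookup table is kept (with `NA` labels) rather than silently dropped.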

Because parsing the individual year data files takes a while, read_cips.R creates intermediate .feather files to load on subsequent runs. This will take up additional disk space.
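The caching idea can be sketched roughly as follows. This is not the code in read_cips.R: `read_with_cache` is a hypothetical helper, the cache file naming is an assumption, and `read.csv` stands in for whatever parsing the real script performs. It relies only on `read_feather()` and `write_feather()` from the feather package.

```r
library(feather)

# Hypothetical helper: parse a file once, then reuse a cached .feather copy.
read_with_cache <- function(csv_path, parse_fn = read.csv) {
  # Assumed naming scheme: same path with the extension swapped for .feather.
  cache_path <- paste0(tools::file_path_sans_ext(csv_path), ".feather")

  # Fast path: a previous run already wrote the cache.
  if (file.exists(cache_path)) {
    return(read_feather(cache_path))
  }

  # Slow path: parse the raw file, then save it for subsequent runs.
  parsed <- parse_fn(csv_path)
  write_feather(parsed, cache_path)
  parsed
}
```

Deleting the .feather files forces a fresh parse and recovers the disk space.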
