diff --git a/README.md b/README.md
index 7cfb8749..5fd85159 100644
--- a/README.md
+++ b/README.md
@@ -340,9 +340,10 @@ Scrapi supports the addition of institutions in a separate index (` institutions
 much less frequently, meaning that simple parsers can be used to manually load data from providers instead of using scheduled harvesters.
 
 Currently, data from [GRID](https://grid.ac/) and [IPEDS](https://nces.ed.gov/ipeds/) is supported:
-- GRID: Provides data on international research facilities. The currently used dataset is ` grid_2015_10_09.json `, which can be found [here](https://grid.ac/downloads). To use this dataset
+- GRID: Provides data on international research facilities. The currently used dataset is ` grid_2015_11_05.json `, which can be found [here](https://grid.ac/downloads). To use this dataset
 move the file to '/institutions/', or override the file path and/or name on ` tasks.py `. This can be individually loaded using the function ` grid() ` in ` tasks.py `.
-- IPEDS: Provides data on secondary education institutions in the US. The currently used dataset is ` hd2013.csv `, which can be found [here](https://nces.ed.gov/ipeds/datacenter/DataFiles.aspx). To use this dataset
+- IPEDS: Provides data on secondary education institutions in the US. The currently used dataset is ` hd2014.csv `, which can be found [here](https://nces.ed.gov/ipeds/Home/UseTheData), by clicking on
+ Survey Data -> Complete data files -> 2014 -> Institutional Characteristics -> Directory information and unzipping the .csv file (on OSX this can be done by running ` unzip filename.zip `). To use this dataset
 move the file to '/institutions/', or override the file path and/or name on ` tasks.py `. This can be individually loaded using the function ` ipeds() ` in ` tasks.py `.
 
 Running ` invoke institutions ` will properly load up institution data into elastic search provided the datasets are provided.
diff --git a/tasks.py b/tasks.py
index 5284c8cf..34c9bbe8 100644
--- a/tasks.py
+++ b/tasks.py
@@ -305,7 +305,7 @@ def reset_all():
 
 
 @task
-def institutions(grid_file='institutions/grid_2015_10_09.json', ipeds_file='institutions/hd2013.csv'):
+def institutions(grid_file='institutions/grid_2015_11_05.json', ipeds_file='institutions/hd2014.csv'):
     grid(grid_file)
     ipeds(ipeds_file)
 