This repository accompanies a scientometric review of the journal Population Studies, written by Melinda C. Mills and Charles Rahal, entitled 'Population Studies at 75 Years: An empirical review'. To mark the journal's 75th anniversary, the review analyzes the ~2000 papers the journal has published, and it has been accepted for publication there. A link to an open-access (OSF) version of the paper can be found here.
The library tries to minimize the number of prerequisite installations outside of the standard library, and we recommend an Anaconda installation and a virtual environment to provide a managed package of comprehensive tools. However, some third-party modules are necessary; they are listed in the requirements.txt file (generated by pipreqs). Important libraries which the work couldn't be done without include pandas, matplotlib, NetworkX, and NLTK. The unsupervised LDA analysis is made possible with the help of MALLET (the 'MAchine Learning for LanguagE Toolkit'). Gender inference is done with a combination of Gender-Guesser and Gender-Detector.
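As an illustration of what combining two name-based gender classifiers can look like, here is a minimal sketch. It is not the rule used in this repository, which is not described here. The function name `combine_gender_labels` is hypothetical, and the label strings are assumptions about what the two packages return ('male', 'female', 'mostly_male', 'mostly_female', 'andy', 'unknown' for Gender-Guesser; 'male', 'female', 'unknown' for Gender-Detector). The sketch takes the two predictions as plain strings, so it runs without either package installed.

```python
def combine_gender_labels(guesser_label, detector_label):
    """Merge two name-based predictions into 'male', 'female', or 'unknown'.

    Hypothetical rule: accept a label when at least one classifier gives it
    and the other does not contradict it.
    """
    # Collapse the softer labels (assumed Gender-Guesser vocabulary).
    collapsed = {"mostly_male": "male", "mostly_female": "female"}.get(
        guesser_label, guesser_label
    )
    # Keep only decisive votes; 'andy' and 'unknown' carry no information.
    votes = {
        label for label in (collapsed, detector_label)
        if label in ("male", "female")
    }
    # One distinct decisive vote wins; none, or a conflict, stays unknown.
    return votes.pop() if len(votes) == 1 else "unknown"


print(combine_gender_labels("mostly_female", "unknown"))
print(combine_gender_labels("male", "female"))
```

Treating a disagreement as 'unknown' is a conservative choice: it trades coverage for fewer misclassified authors.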
The data originates from the Scopus Search API, queried with the journal's ISSN (00324728). To recreate the dataset, first clone the repo and then run the scraper:
git clone https://github.com/crahal/PopStudies_Review.git
cd PopStudies_Review/src
python popstudies_scraper.py
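For readers unfamiliar with the Scopus Search API, the sketch below shows roughly how a query by ISSN could be assembled. It is an illustration only and not the code in popstudies_scraper.py: the helper name `build_scopus_request`, the page size, and the placeholder API key are assumptions. It only builds the request and sends nothing over the network; a real call needs a valid Elsevier API key.

```python
from urllib.parse import urlencode

# Public endpoint of the Scopus Search API.
SCOPUS_SEARCH_URL = "https://api.elsevier.com/content/search/scopus"


def build_scopus_request(issn, api_key, start=0, count=25):
    """Return (url, headers) for one page of Scopus Search results."""
    params = {
        "query": f"ISSN({issn})",  # restrict results to a single journal
        "start": start,            # offset of the first result on this page
        "count": count,            # results per page (assumed page size)
    }
    headers = {"X-ELS-APIKey": api_key, "Accept": "application/json"}
    return f"{SCOPUS_SEARCH_URL}?{urlencode(params)}", headers


# "YOUR_API_KEY" is a placeholder, not a working credential.
url, headers = build_scopus_request("00324728", api_key="YOUR_API_KEY")
print(url)
```

Paging through the full record set would mean repeating the request with an increasing `start` until no further entries are returned.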
The functions in popstudies_preprocessor.py clean the data and prepare it for analysis. The formatted notebook which calls the analysis from popstudies_analysis.py is popstudies_notebook.ipynb.
This work is free. You can redistribute it and/or modify it under the terms of the GPL-2.0 License. It comes without any warranty, to the extent permitted by applicable law.
Research assistance for the manual data curation was provided by Sofia Gieystor and Jiani Yan. Funding was generously provided by the Leverhulme Trust, Leverhulme Centre for Demographic Science, and Nuffield College. We're also grateful to Anne Shepherd at the Population Investigation Committee for invaluable knowledge surrounding the journal. As always, all errors remain our own.
Last updated: 2020-13-09 (Accepted version)