ADS_Africa_authors.ipynb
ListCountries.txt
README.md

ADS-Query

The idea with this project was to determine the number of publishing astronomers on the African continent, and to see how the number of publications with African affiliations has increased over the recent past.

The data we used to make these figures were gathered by querying NASA's Astrophysics Data System for peer-reviewed astronomy articles in which any of the authors had affiliations that linked to one of the 54 countries in Africa. The search was further restricted by including only publications between 2013 and 2018.

Running the query

The query used the new ADS API and the Python wrapper for this API.

For ADS query syntax see this link.
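
As a rough illustration of what such a query might look like, the sketch below builds one ADS query string per year for a given country. The fields `aff:`, `year:`, `property:refereed` and `collection:astronomy` are part of the ADS search syntax, but the helper `build_query` and the exact combination of terms are assumptions for illustration and may differ from what the notebook does. Nothing is sent to ADS here.

```python
def build_query(country, year):
    """Return an ADS query string for refereed astronomy papers that have
    `country` in an affiliation and were published in `year`.

    Hypothetical helper; the notebook's actual query may differ.
    """
    return f'aff:"{country}" year:{year} property:refereed collection:astronomy'


# One query string per publication year, 2013 to 2018 inclusive.
queries = [build_query("South Africa", year) for year in range(2013, 2019)]

for query in queries:
    print(query)
```

Each string can then be handed to whichever API wrapper is in use. Splitting by year also keeps each individual result set smaller than a single query over the whole period.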

For each country in our list, the query output is parsed to search for that specific country in each affiliation. We then keep track of the Author, DOI, bibcode and date for each article.

We used multiple spellings for each country, listed in the file `ListCountries.txt`, as a catch-all for the different names a country may go by.
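
The affiliation matching with multiple spellings could be sketched roughly as follows. The `COUNTRY_VARIANTS` mapping and the function `countries_in_affiliation` are hypothetical: the real spellings live in `ListCountries.txt`, whose exact format is not assumed here. Whole-word matching is a design choice of this sketch, so that, for example, "Niger" does not match inside "Nigeria".

```python
import re

# Hypothetical spelling variants, keyed by a canonical country name.
# The real list is in ListCountries.txt.
COUNTRY_VARIANTS = {
    "Eswatini": ["Eswatini", "Swaziland"],
    "Niger": ["Niger"],
    "Nigeria": ["Nigeria"],
}


def countries_in_affiliation(affiliation, variants=COUNTRY_VARIANTS):
    """Return canonical names of all countries mentioned in `affiliation`."""
    found = []
    for country, spellings in variants.items():
        for spelling in spellings:
            # \b anchors the match to whole words, ignoring case.
            pattern = r"\b" + re.escape(spelling) + r"\b"
            if re.search(pattern, affiliation, flags=re.IGNORECASE):
                found.append(country)
                break  # one matching spelling is enough for this country
    return found


print(countries_in_affiliation("University of Nigeria, Nsukka, Nigeria"))
```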

We convert the lists into a pandas DataFrame and then drop any duplicate authors.
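
A minimal sketch of this step, using made-up placeholder records rather than real query output. The column names are assumptions chosen for illustration.

```python
import pandas as pd

# Placeholder records standing in for the parsed query output (not real data).
authors = ["Doe, Jane", "Doe, Jane", "Smith, A."]
dois = ["doi-1", "doi-2", "doi-3"]
bibcodes = ["bibcode-1", "bibcode-2", "bibcode-3"]
dates = ["2015-01", "2016-07", "2017-03"]

df = pd.DataFrame(
    {"author": authors, "doi": dois, "bibcode": bibcodes, "date": dates}
)

# Keep the first row for each author string and drop the later repeats.
unique_authors = df.drop_duplicates(subset="author").reset_index(drop=True)

print(len(df), "rows;", len(unique_authors), "unique author strings")
```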

The Jupyter notebook containing the query and an initial analysis, `ADS_Africa_authors.ipynb`, is in this repository.

Idiosyncrasies

  1. The ADS query returns a maximum of around 2800 lines, even if you set the number of rows larger than this. For that reason, we've had to run separate queries for South Africa (for each year). This is a problem that will affect any query that returns a large number of rows from the database.

  2. The pandas `drop_duplicates` method drops only literal duplicates, e.g. "Carignan, C" and "Carignan, Claude" are treated as different authors. We therefore had to implement a final, manual check for duplicates.
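
One possible way to narrow down the manual duplicate check is to group author strings by surname and first initial and review only the groups with more than one spelling. This is a sketch under that assumption, not the procedure used in the notebook. The helper `author_key` is hypothetical, and "Carignan, M." is an invented name added to show a non-match.

```python
def author_key(name):
    """Reduce 'Surname, Given' to a (surname, first initial) pair, lower-cased."""
    surname, _, given = name.partition(",")
    given = given.strip()
    initial = given[0].casefold() if given else ""
    return surname.strip().casefold(), initial


names = ["Carignan, C", "Carignan, Claude", "Carignan, M."]

# Group the original strings by their reduced key.
groups = {}
for name in names:
    groups.setdefault(author_key(name), []).append(name)

# Only keys shared by several spellings need a manual look.
candidates = {key: group for key, group in groups.items() if len(group) > 1}
print(candidates)
```

The groups are only candidates: two different people can share a surname and initial, so a human check is still needed before merging.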
