veemcb/ADS-Query

ADS-Query

The aim of this project was to determine the number of publishing astronomers on the African continent, and to see how the number of publications with African affiliations has increased over the recent past.

The data we used to make these figures were gathered by querying NASA's Astrophysics Data System for peer-reviewed astronomy articles in which any of the authors had affiliations that linked to one of the 54 countries in Africa. The search was further restricted by including only publications between 2013 and 2018.

Running the query

The query used the new ADS API and the Python wrapper for this API.

For ADS query syntax see this link.
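As a rough illustration of the kind of query string described above, here is a minimal sketch. The helper `build_query` is hypothetical and is not taken from the notebook; the field names (`aff`, `year`, `property:refereed`, `database:astronomy`) follow the public ADS search syntax, but the exact query used in this project may differ.

```python
def build_query(country, year_start, year_end=None):
    """Build an ADS query string (hypothetical helper, not from the notebook).

    Selects refereed astronomy papers whose affiliations mention `country`,
    restricted to a single year or to an inclusive year range.
    """
    if year_end is None or year_end == year_start:
        years = str(year_start)
    else:
        years = f"{year_start}-{year_end}"
    return f'aff:"{country}" year:{years} property:refereed database:astronomy'


# One query covering the whole 2013 to 2018 period.
whole_period = build_query("Kenya", 2013, 2018)

# One query per year, which keeps each result set smaller.
per_year = [build_query("South Africa", year) for year in range(2013, 2019)]

print(whole_period)
```

A string like this would then be passed to the Python wrapper, for example as the `q` argument of `ads.SearchQuery`, which requires an ADS API token.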

For each country in our list, the query output is parsed to search for that specific country in each affiliation. We then keep track of the Author, DOI, bibcode and date for each article.

We used multiple spellings for each country, listed in the file `ListCountries.txt`, as a catch-all for the different names a country can appear under.
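The affiliation check could look roughly like the sketch below. It assumes each paper is a plain dictionary with parallel `author` and `aff` lists plus `doi`, `bibcode` and `pubdate` fields; that layout, the function name `authors_from_country` and the sample records are all placeholders for illustration, not the notebook's actual code or data.

```python
def authors_from_country(papers, spellings):
    """Collect one row per author whose affiliation mentions the country.

    `papers` is a list of dicts (assumed layout); `spellings` is a list of
    alternative names for one country. Matching is case-insensitive.
    """
    wanted = [s.lower() for s in spellings]
    rows = []
    for paper in papers:
        # Authors and affiliations are assumed to be parallel lists.
        for author, aff in zip(paper["author"], paper["aff"]):
            if any(s in aff.lower() for s in wanted):
                rows.append({
                    "author": author,
                    "doi": paper["doi"],
                    "bibcode": paper["bibcode"],
                    "date": paper["pubdate"],
                })
    return rows


# Placeholder record, invented purely for this example.
sample = [{
    "author": ["Author, A.", "Author, B."],
    "aff": ["Example University, Abidjan, Cote d'Ivoire", "Example Institute, France"],
    "doi": "placeholder-doi",
    "bibcode": "placeholder-bibcode",
    "pubdate": "2015-03",
}]

rows = authors_from_country(sample, ["Ivory Coast", "Cote d'Ivoire"])
print(rows)
```

Plain substring matching is a simplification: the spelling "Niger" also matches "Nigeria", so a real check needs stricter matching or a manual review.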

We convert the lists into a pandas DataFrame and then drop any duplicate authors.
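A minimal sketch of this step, assuming the collected records are a list of dictionaries with an `author` column. The column names and placeholder rows are assumptions for illustration; the notebook may organise its lists differently.

```python
import pandas as pd

# Placeholder rows, invented for this example.
rows = [
    {"author": "Author, A.", "bibcode": "placeholder-1", "date": "2015-03"},
    {"author": "Author, A.", "bibcode": "placeholder-2", "date": "2017-08"},
    {"author": "Author, B.", "bibcode": "placeholder-2", "date": "2017-08"},
]

df = pd.DataFrame(rows)

# Keep the first row for each author name; later rows with the same
# exact name string are dropped.
unique_authors = df.drop_duplicates(subset="author", keep="first")

print(len(unique_authors))
```

Counting the rows of `unique_authors` then gives the number of distinct author name strings for that country.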

The Jupyter notebook containing the query and an initial analysis is in this repository.

Idiosyncrasies

  1. The ADS query returns a maximum of around 2800 lines, even if you set the number of rows to a larger value. For that reason, we had to run a separate query for each year for South Africa. This limit will affect any query that returns a large number of rows from the database.

  2. The pandas drop_duplicates command drops only literal duplicates, e.g. "Carignan, C" and "Carignan, Claude" are treated as different authors. We had to implement a final, manual check for duplicates.
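One way to make the manual duplicate check easier is to group names that share a surname and first initial and review only those groups by hand. The sketch below is an assumption about how this could be done, not the check actually used in this project; `name_key` and `candidate_duplicates` are hypothetical helpers, and names are assumed to be in "Surname, Given names" form.

```python
from collections import defaultdict


def name_key(name):
    """Reduce 'Surname, Given names' to (surname, first initial), lower-cased."""
    surname, _, given = name.partition(",")
    given = given.strip()
    initial = given[0].lower() if given else ""
    return (surname.strip().lower(), initial)


def candidate_duplicates(names):
    """Return groups of distinct name strings that share the same key."""
    groups = defaultdict(set)
    for name in names:
        groups[name_key(name)].add(name)
    return [sorted(group) for group in groups.values() if len(group) > 1]


names = ["Carignan, C", "Carignan, Claude", "Author, A."]
print(candidate_duplicates(names))
# prints [['Carignan, C', 'Carignan, Claude']]
```

The grouping only flags candidates. Two different people can share a surname and initial, so each group still needs to be checked by hand before any rows are merged.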
