Analyse an institution's publication output and determine the open access share.


Share of open access journal articles

We want to analyse an institution's publication output and determine the share of open access (OA) articles. To do so, we retrieve data from different bibliographic databases and match it with data from the Directory of Open Access Journals (DOAJ) and Unpaywall to identify OA articles in OA journals. The Python script described here analyses the retrieved data and produces both total numbers and a list of articles as downloadable files. To identify OA articles in hybrid journals and green OA articles in repositories, the script calls the Unpaywall API.
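The classification idea described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the function names (`normalise_issn`, `unpaywall_url`, `classify_article`) are hypothetical, and the sketch assumes that an Unpaywall v2 record exposes `is_oa` and `best_oa_location.host_type`. It also simplifies by treating every publisher-hosted OA copy in a non-DOAJ journal as hybrid.

```python
from urllib.parse import quote

UNPAYWALL_BASE = "https://api.unpaywall.org/v2/"


def normalise_issn(issn):
    """Reduce an ISSN to a comparable form (upper case, no hyphen)."""
    return issn.strip().upper().replace("-", "")


def unpaywall_url(doi, email):
    """Build the Unpaywall v2 lookup URL for a single DOI."""
    return f"{UNPAYWALL_BASE}{quote(doi)}?email={quote(email)}"


def classify_article(issns, doaj_issns, unpaywall_record=None):
    """Return 'gold', 'hybrid', 'green' or 'closed' for one article.

    issns            -- ISSNs of the journal the article appeared in
    doaj_issns       -- set of normalised ISSNs taken from a DOAJ export
    unpaywall_record -- parsed JSON from Unpaywall, or None if not queried
    """
    # Gold OA: the journal itself is listed in DOAJ.
    if any(normalise_issn(issn) in doaj_issns for issn in issns):
        return "gold"

    # No Unpaywall data, or Unpaywall knows of no OA copy.
    if not unpaywall_record or not unpaywall_record.get("is_oa"):
        return "closed"

    # Assumption: the host of the best OA location decides hybrid vs. green.
    location = unpaywall_record.get("best_oa_location") or {}
    host_type = location.get("host_type")
    if host_type == "publisher":
        return "hybrid"
    if host_type == "repository":
        return "green"
    return "closed"
```

The DOAJ check comes first so that articles in gold OA journals never trigger an API call; only the remaining articles need an Unpaywall lookup.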

Project description

The Python script analyses article data and identifies articles that were published a) in gold open access journals, b) in hybrid journals, and c) in repositories as green OA. The script DOES aggregate, normalise and de-duplicate article data. It writes several output files (tab-separated txt) for further analysis. The script DOES NOT retrieve article data from databases; this has to be done beforehand! Instructions for retrieving article data are included in the (German) manual.
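To illustrate what aggregating, normalising and de-duplicating can look like, here is a small sketch under stated assumptions. It is not the script's real implementation: the record fields (`doi`, `title`, `source`) and the helper names are hypothetical, and duplicates are detected by normalised DOI only, which is a simplification.

```python
import csv
import io

DOI_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def normalise_doi(doi):
    """Lower-case a DOI and strip common resolver prefixes."""
    doi = (doi or "").strip().lower()
    for prefix in DOI_PREFIXES:
        if doi.startswith(prefix):
            doi = doi[len(prefix):]
            break
    return doi.strip()


def merge_records(*sources):
    """Aggregate records from several databases, keeping the first per DOI."""
    merged = {}
    without_doi = []
    for records in sources:
        for record in records:
            doi = normalise_doi(record.get("doi"))
            if not doi:
                # Records without a DOI cannot be matched here; keep them all.
                without_doi.append(dict(record, doi=""))
                continue
            merged.setdefault(doi, dict(record, doi=doi))
    return list(merged.values()) + without_doi


def write_tsv(records, handle, fields=("doi", "title", "source")):
    """Write records as tab-separated text with a header row."""
    writer = csv.DictWriter(
        handle,
        fieldnames=fields,
        delimiter="\t",
        extrasaction="ignore",
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(records)
```

A usage example: `merge_records(wos_records, pubmed_records)` returns one record per DOI, and `write_tsv(merged, open("articles.txt", "w", newline="", encoding="utf-8"))` stores the result as a tab-separated txt file. The variable and file names are placeholders.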

How to

See the manual for further information on input/output files and the most important script variables.

A full manual in German is available as a PDF file and as a LaTeX project. See the manual for information on

  • how to retrieve article data from different databases (e.g. Web of Science, SciFinder, PubMed),
  • how to prepare article data for the python script,
  • output files,
  • how to add or delete databases from the analysis.

Please note: Some compatibility issues have cropped up since NumPy v1.14 was released last year. The script should work with NumPy v1.12. An updated version of the script should become available during the summer.
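A quick way to notice an incompatible environment before running the analysis is to compare the installed NumPy version against the first release reported as problematic. The sketch below is only an illustration: the helper names are hypothetical, and treating every version below 1.14 as acceptable is an assumption, since only v1.12 is stated to work.

```python
import re
import warnings


def numpy_version_tuple(version):
    """Extract (major, minor) from a version string such as '1.12.1'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def warn_if_numpy_too_new(version, first_problematic=(1, 14)):
    """Warn and return True if `version` is at or above `first_problematic`."""
    # Tuples compare numerically, so 1.9 sorts before 1.14 as intended.
    too_new = numpy_version_tuple(version) >= first_problematic
    if too_new:
        warnings.warn(
            f"NumPy {version} may be incompatible with this script; "
            "NumPy v1.12 is reported to work."
        )
    return too_new
```

In practice this would be called as `warn_if_numpy_too_new(numpy.__version__)` after `import numpy`.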

Contribution history

The Python script was developed mainly by Eva Bunge with support from Michaela Voigt. The script is maintained by the Open Access team of TU Berlin University Library.


Contribute to this project by

  • committing to the github repository,
  • improving the manual or translating it into English (see LaTeX files here),
  • contacting us via e-mail (openaccess at to report conceptual flaws and suggest possible improvements, to discuss ways to enhance the script, or to let us know what you are using this script for. Please note that the script is licensed under the BSD 3-Clause license. By contributing you agree to do so under these terms.


Please e-mail openaccess at


Some rights reserved. This work is distributed under the BSD 3-Clause license. See License for more information.
