Research compendium for a dataset about corresponding author country affiliations indexed in the Web of Science 2014 - 2018



This repository provides a dataset about corresponding author country affiliations indexed in the Web of Science 2014 - 2018 using the in-house database from the German Competence Center for Bibliometrics.

To start with, read the Data Descriptor, an overview of the dataset and analytical work.

This repository is organized as a research compendium, which bundles data, code, and the text associated with them. The R Markdown files in the analysis/ directory document the data analysis, in particular how the Web of Science in-house database from the German Competence Center for Bibliometrics was queried, and include the Data Descriptor. The data/ directory contains all aggregated data. Because of the proprietary nature of the Web of Science, neither raw data nor access to the in-house database can be shared.

Analysis files

The analysis/ directory contains the following reports written in R Markdown:

The analytical steps for obtaining the data from the Web of Science in-house database maintained by the German Competence Center for Bibliometrics (WoS-KB), and for enriching the data, are also provided as R Markdown reports:

Data files

The data/ directory contains the resulting datasets stored as comma-separated value files.
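
As a minimal sketch of how these files could be loaded, the helper below reads every CSV file in a directory into a named list of data frames. The function name `read_compendium_data` is hypothetical and not part of this compendium; the sketch only assumes that the files sit directly in data/ and end in `.csv`.

```r
# Hypothetical helper (not part of this compendium): read every CSV file
# in a directory into a list of data frames named after the files.
read_compendium_data <- function(data_dir = "data") {
  csv_files <- list.files(data_dir, pattern = "\\.csv$", full.names = TRUE)
  datasets <- lapply(csv_files, read.csv, stringsAsFactors = FALSE)
  names(datasets) <- tools::file_path_sans_ext(basename(csv_files))
  datasets
}

# Load all datasets and inspect the number of rows and columns of each
datasets <- read_compendium_data("data")
lapply(datasets, dim)
```

Base R's `read.csv()` is used here to avoid extra dependencies; the reports in analysis/ may read the files differently.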

Reproducibility notes

This repository follows the concept of a research compendium that uses the R package structure to port data and code.

Because access to the data infrastructure of the German Competence Center for Bibliometrics (WoS-KB) is restricted, there are different levels of reproducibility. Everyone will be able to reproduce the analysis in the data descriptor, the main document of this research compendium written in R Markdown. Users with access to the WoS-KB data infrastructure will also be able to replicate the R code and SQL queries locally, or on the script server.

Data descriptor

Clone the GitHub repository with all data and code.

git clone

Open an R session in the directory of this package and install the R package dependencies using a package snapshot from the date this package was built:

devtools::install_deps(repos = list(CRAN = ''))

To replicate the data descriptor:



The project was made Binder-ready using the holepunch package. Binder allows you to execute the data descriptor in the cloud from your web browser.

Launch Rstudio Binder

Users with access to the German Competence Center for Bibliometrics data infrastructure

If you have access to the German Competence Center for Bibliometrics data infrastructure, you can replicate how the data were obtained from the Web of Science. These steps are described in the R Markdown documents in the analysis/ folder whose file names start with 00.

To get started, follow the steps described above to download the research compendium including the necessary R packages. Next, add your database login credentials to your .Renviron file and save it. You can open your .Renviron file from R with usethis::edit_r_environ().


Reload your R session.
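
After reloading, it can be useful to confirm that the credentials are actually visible to R before any database call is made. The sketch below does this with `Sys.getenv()`. The variable names `KB_USER` and `KB_PASSWORD` and the function name are assumptions made for illustration; use the names that the reports in analysis/ expect.

```r
# Hypothetical helper: stop early if database credentials are missing.
# KB_USER and KB_PASSWORD are placeholder names, not taken from this repository.
check_db_credentials <- function(vars = c("KB_USER", "KB_PASSWORD")) {
  values <- Sys.getenv(vars, unset = "")
  missing_vars <- vars[!nzchar(values)]
  if (length(missing_vars) > 0) {
    stop("Not set in .Renviron: ", paste(missing_vars, collapse = ", "))
  }
  invisible(TRUE)
}
```

Calling `check_db_credentials()` at the top of a session returns silently when both variables are set and raises an error naming the missing ones otherwise.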

The Oracle database driver needed to access the remote database is included in this repository.

To replicate the R Markdown documents, call rmarkdown::render().

When working on the script-server, please note that you need to render the documents with knitr::knit(), because pandoc is not available.
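
The two rendering paths can be combined in one helper, sketched below under the assumption that the data-retrieval reports are the `.Rmd` files in analysis/ whose names start with 00. The function name `render_reports` is hypothetical. It uses `rmarkdown::render()` when pandoc is found and falls back to `knitr::knit()`, which produces Markdown output only, when it is not.

```r
# Hypothetical helper: render the data-retrieval reports, falling back to
# knitr::knit() where pandoc is unavailable (for example on the script-server).
render_reports <- function(dir = "analysis", pattern = "^00.*\\.Rmd$") {
  reports <- list.files(dir, pattern = pattern, full.names = TRUE)
  if (length(reports) == 0) {
    message("No matching R Markdown files found in ", dir)
    return(invisible(reports))
  }
  use_pandoc <- rmarkdown::pandoc_available()
  for (report in reports) {
    if (use_pandoc) {
      rmarkdown::render(report)
    } else {
      # Write the knitted Markdown file next to its source document
      out <- file.path(dirname(report), sub("\\.Rmd$", ".md", basename(report)))
      knitr::knit(report, output = out)
    }
  }
  invisible(reports)
}
```

Running `render_reports()` from the root of the compendium processes the reports in alphabetical order of their file names.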


A Docker container with all source code and data needed to reproduce this research compendium would reduce the setup effort described above. Unfortunately, access to the German Competence Center for Bibliometrics requires a VPN tunnel that, at least in my local setup, is not accessible from the Docker container. Furthermore, Docker is not available for users on the script-server.


Re-used data terms:

This work uses Web of Science data by Clarivate Analytics provided by the German Competence Center for Bibliometrics for research purposes.

Crossref asserts no claims of ownership to individual items of bibliographic metadata and associated Digital Object Identifiers (DOIs) acquired through the use of the Crossref Free Services. Individual items of bibliographic metadata and associated DOIs may be cached and incorporated into the user's content and systems.

ISSN-Matching of Gold OA Journals (ISSN-GOLD-OA) 3.0 and Country Geocodes obtained from Google are made available under CC-BY.

The documentation and data descriptor including the figures are made available under CC-BY 4.0.

Source Code: MIT (Najko Jahn, 2019)


in alphabetical order

  • Design and Conceptualization: Colleen Campbell, Kai Geschuhn, Najko Jahn
  • Data Analysis: Najko Jahn
  • Writing and Review: Colleen Campbell, Anne Hobert, Najko Jahn, Birgit Schmidt, Niels Taubert


This data analytics work has been developed using open tools. There are a number of ways you can help make it better:

  • If you don’t understand something, please let me know and submit an issue.
  • Feel free to add new features or fix bugs by sending a pull request.

Please note that this project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms.


This work is supported by the Federal Ministry of Education and Research of Germany (BMBF) in the framework Quantitative Research on the Science Sector (Project: "OAUNI Entwicklung und Einflussfaktoren des Open-Access-Publizierens an Universitäten in Deutschland", Förderkennzeichen: 01PU17023A).


Najko Jahn, Data Analyst, SUB Göttingen.


Global Publishing Output from Corresponding Authors 2014 - 2018


