jbartlewski edited this page Sep 11, 2023 · 14 revisions

OpenAPC data schemas

The following schemas describe the data sets aggregated by the OpenAPC initiative. Each row in the tables below corresponds to a column in the respective CSV file. At the moment, 3 data sets are maintained:

  1. OpenAPC data set (for APCs on a per-publication basis)
  2. BPC data set (for BPCs on a per-publication basis)
  3. Transformative Agreements (TA) data set (for journal articles published under alternative payment models, like Springer Compact or the Wiley DEAL in Germany)

OpenAPC data set

This is the original data set OpenAPC started with. It collects cost data on Article Processing Charges (APCs) on a per-publication basis. It consists of 18 metadata fields, with 5 of them being mandatory when contributing data.

Mandatory and backup columns

Only the first 5 columns are mandatory in all cases. The 4 columns marked as backup are required only if at least one of the articles in a contributed table does not have a DOI assigned. In that case, the DOI-less articles (and only those) have to provide these 4 data fields as additional information (Example).

| column | description | source | required? |
| --- | --- | --- | --- |
| institution | Top-level organisation which covered the reported costs, e.g. "Bielefeld University" | none | mandatory |
| period | Year of APC payment (YYYY) | none | mandatory |
| euro | The amount that was paid, in euros. Includes VAT and any discounts | none | mandatory |
| doi | Digital Object Identifier | none | mandatory |
| is_hybrid | Determines if the article has been published in a hybrid journal (TRUE) or in a fully/Gold OA journal (FALSE) | none | mandatory |
| publisher | Name of the publishing house that has charged the fee | CrossRef | backup |
| journal_full_title | Full name of the periodical that contains the article | CrossRef | backup |
| issn | International Standard Serial Number | CrossRef | backup |
| issn_print | International Standard Serial Number - print version | CrossRef | no |
| issn_electronic | International Standard Serial Number - electronic version | CrossRef | no |
| issn_l | Linking International Standard Serial Number | ISSN International Centre | no |
| license_ref | License under which the article has been published | CrossRef | no |
| indexed_in_crossref | Indicates if the contribution is registered with the DOI agency CrossRef (TRUE/FALSE) | CrossRef | no |
| pmid | ID for metadata records indexed in Europe PubMed Central (Europe PMC) | Europe PMC | no |
| pmcid | ID for articles available in the Europe PubMed Central full text collection | Europe PMC | no |
| ut | Web of Science unique item ID | Web of Science | no |
| url | URL to article if no DOI is available | none | backup |
| doaj | Indicates if the journal is indexed in the Directory of Open Access Journals (TRUE/FALSE) | DOAJ | no |
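
The mandatory and backup rules described above can be sketched as a small check over a contributed CSV table. This is only an illustration, not the validation OpenAPC itself performs: the function name `check_apc_rows`, the treatment of an empty string or `NA` as a missing value, and the sample rows (including the DOI and ISSN) are assumptions made for the example.

```python
import csv
import io

MANDATORY = ["institution", "period", "euro", "doi", "is_hybrid"]
BACKUP = ["publisher", "journal_full_title", "issn", "url"]
MISSING = {"", "NA"}  # assumption: how an absent value is encoded


def is_missing(value):
    """True if a cell is absent, empty, or one of the assumed placeholders."""
    return value is None or value.strip() in MISSING


def check_apc_rows(csv_text):
    """Return a list of (row_number, message) problems found in csv_text."""
    problems = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for number, row in enumerate(reader, start=1):
        has_doi = not is_missing(row.get("doi"))
        for column in MANDATORY:
            if column == "doi":
                continue  # a missing DOI is covered by the backup rule below
            if is_missing(row.get(column)):
                problems.append((number, f"missing mandatory field: {column}"))
        if not has_doi:
            # Only DOI-less rows have to provide the 4 backup fields.
            for column in BACKUP:
                if is_missing(row.get(column)):
                    problems.append(
                        (number, f"no DOI, missing backup field: {column}")
                    )
    return problems


# Made-up sample: row 1 has a DOI, row 2 has none and lacks the url field.
sample = (
    "institution,period,euro,doi,is_hybrid,publisher,journal_full_title,issn,url\n"
    "Bielefeld University,2023,1500,10.1234/abc,FALSE,,,,\n"
    "Bielefeld University,2023,1200,NA,TRUE,Some Publisher,Some Journal,1234-5678,\n"
)
print(check_apc_rows(sample))
```

Rows with a DOI are never asked for backup fields here, which mirrors the rule that only the DOI-less articles have to supply them.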

BPC data set

This data set is the newest addition to OpenAPC and collects data on BPCs (Book Processing Charges). It consists of 13 fields, with 5 being mandatory.

Mandatory and backup columns

The first 5 columns are mandatory in all cases. The isbn column is marked as backup and is required if the book does not have a DOI assigned. Since the usage of DOIs is not as widespread with books as it is with journal articles, we make two additional recommendations when contributing data:

  • The book_title column is marked recommended. It is not strictly necessary, but if you happen to have access to that kind of information, it could be helpful to add it to the table.
  • Books can have a variety of ISBNs, depending on the publication form (hardcover, softcover, PDF, epub...). If your original data provides fields for more than one ISBN type, we recommend including them all. The additional columns do not need specific names; a generic scheme (isbn_1, isbn_2...) will do.

| column | description | source | required? |
| --- | --- | --- | --- |
| institution | Top-level organisation which covered the reported costs, e.g. "Bielefeld University" | none | mandatory |
| period | Year of BPC payment (YYYY) | none | mandatory |
| euro | The amount that was paid, in euros. Includes VAT and any discounts | none | mandatory |
| doi | Digital Object Identifier | none | mandatory |
| backlist_oa | Was the book published OA in the first place (FALSE) or was it already part of a publisher's backlist and became OA retroactively (TRUE)? | none | mandatory |
| publisher | Name of the publishing house that has charged the fee | CrossRef | no |
| book_title | Title of the monograph | CrossRef | recommended |
| isbn | International Standard Book Number | CrossRef | backup |
| isbn_print | International Standard Book Number - print version | CrossRef | no |
| isbn_electronic | International Standard Book Number - electronic version | CrossRef | no |
| license_ref | License under which the book has been published | CrossRef | no |
| indexed_in_crossref | Indicates if the work is registered with the DOI agency CrossRef (TRUE/FALSE) | CrossRef | no |
| doab | Indicates if the book is listed in the Directory of Open Access Books (TRUE/FALSE) | DOAB | no |
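
As a rough illustration of the identifier rules for books, the sketch below collects every ISBN column from a table header, including generic ones such as isbn_1 and isbn_2, and checks that a row carries a DOI or, as backup, a value in the isbn column. The helper names, the `NA` placeholder convention, and the sample values are assumptions for this example, not part of the OpenAPC tooling.

```python
MISSING = {"", "NA"}  # assumption: how an absent value is encoded


def is_missing(value):
    """True if a cell is absent, empty, or one of the assumed placeholders."""
    return value is None or value.strip() in MISSING


def isbn_columns(header):
    """Return all ISBN columns in header order: isbn, isbn_print, isbn_1, ..."""
    return [name for name in header if name == "isbn" or name.startswith("isbn_")]


def identifier_ok(row):
    """A book needs a DOI or, as backup, a value in the isbn column."""
    return not is_missing(row.get("doi")) or not is_missing(row.get("isbn"))


# Hypothetical header of a contributed table with two extra ISBN columns.
header = ["institution", "period", "euro", "doi", "backlist_oa",
          "isbn", "isbn_1", "isbn_2"]
print(isbn_columns(header))
print(identifier_ok({"doi": "NA", "isbn": "978-3-16-148410-0"}))
```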

Transformative Agreements data set

The Transformative Agreements (TA) data set contains metadata on OA journal articles which were not paid for with APCs, but published under transformative agreements instead. These kinds of contracts are concluded with publishers and usually involve larger bodies like research organisations (e.g. Max Planck Society) or national consortia (Springer Compact agreements) as contract partners. Costs and payment modalities can differ a lot and usually result from individual negotiations, which means that it is often difficult to assign a per-publication price tag. Consequently, most records in the data set do not include cost information and the euro field is not mandatory.

Mandatory and backup columns

The TA data set consists of 19 fields and is very similar to the OpenAPC data set. It contains an additional column agreement denoting the name of the contract the article was published under. DOI registration is an accepted standard for TA articles, thus the "backup" rules of the OpenAPC data set do not apply here.

| column | description | source | required? |
| --- | --- | --- | --- |
| institution | Top-level organisation the article author is affiliated with | none | mandatory |
| period | Year of payment (YYYY) | none | mandatory |
| euro | Article cost, usually calculated in hindsight based on an agreed formula | none | no |
| doi | Digital Object Identifier | none | mandatory |
| is_hybrid | Determines if the article has been published in a hybrid journal (TRUE) or in a fully/Gold OA journal (FALSE) | none | mandatory |
| publisher | Name of the publisher the TA was concluded with | CrossRef | no |
| journal_full_title | Full name of the periodical that contains the article | CrossRef | no |
| issn | International Standard Serial Number | CrossRef | no |
| issn_print | International Standard Serial Number - print version | CrossRef | no |
| issn_electronic | International Standard Serial Number - electronic version | CrossRef | no |
| issn_l | Linking International Standard Serial Number | ISSN International Centre | no |
| license_ref | License under which the article has been published | CrossRef | no |
| indexed_in_crossref | Indicates if the contribution is registered with the DOI agency CrossRef (TRUE/FALSE) | CrossRef | no |
| pmid | ID for metadata records indexed in Europe PubMed Central (Europe PMC) | Europe PMC | no |
| pmcid | ID for articles available in the Europe PubMed Central full text collection | Europe PMC | no |
| ut | Web of Science unique item ID | Web of Science | no |
| url | URL to article if no DOI is available (not used) | none | no |
| doaj | Indicates if the journal is indexed in the Directory of Open Access Journals (TRUE/FALSE) | DOAJ | no |
| agreement | Name of the transformative agreement the article was published under | none | no |
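
The relationship between the two schemas can be sketched as follows: the TA column list is the OpenAPC column list plus agreement, and only 4 fields are mandatory because euro is optional and no backup rule applies. The names `TA_MANDATORY` and `missing_ta_fields`, the `NA` placeholder convention, and the sample record are assumptions made for this illustration.

```python
# The 18 OpenAPC columns, in the order given in the OpenAPC data set table.
APC_COLUMNS = [
    "institution", "period", "euro", "doi", "is_hybrid", "publisher",
    "journal_full_title", "issn", "issn_print", "issn_electronic", "issn_l",
    "license_ref", "indexed_in_crossref", "pmid", "pmcid", "ut", "url", "doaj",
]

# The TA data set adds one column naming the agreement.
TA_COLUMNS = APC_COLUMNS + ["agreement"]

# euro is not mandatory for TA records.
TA_MANDATORY = ["institution", "period", "doi", "is_hybrid"]

MISSING = {"", "NA"}  # assumption: how an absent value is encoded


def missing_ta_fields(row):
    """Return the mandatory TA fields that are absent or empty in row."""
    return [
        column for column in TA_MANDATORY
        if row.get(column) is None or row[column].strip() in MISSING
    ]


# Made-up record without cost information: still complete under the TA rules.
record = {"institution": "Bielefeld University", "period": "2023",
          "euro": "NA", "doi": "10.1234/xyz", "is_hybrid": "TRUE"}
print(len(TA_COLUMNS), missing_ta_fields(record))
```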

