-
Notifications
You must be signed in to change notification settings - Fork 4
add plots and analysis from _cr_springer.Rmd
again
#211
Comments
title: "Comparing the indexing coverage of SpringerLink with that of Crossref"
|
it does not need to migrated to the hoad implementation
…On Tue, 2 Jun 2020 at 17:39, Max Held ***@***.***> wrote:
------------------------------
title: "Comparing the indexing coverage of SpringerLink with that of
Crossref"
author: "Najko Jahn"
output:
html_document:
keep_md: true
df_print: paged
knitr::opts_chunk$set(echo = TRUE, message=FALSE)
In its blog post
<https://www.intact-project.org/general/openapc/2018/03/22/offsetting-coverage/>,
the INTACT project compares the indexing coverage of Crossref, a DOI
registration agency for scholarly works, with that of SpringerLink, a
digital library dedicated to content published by Springer. Examining five
journals including European Radiology, they found that the article coverage
differs between these two sources and concluded:
The results are clear: When it comes to journal metrics (both OA and
total), Crossref data is too sketchy to rely on.
This is very harsh given the importance of Crossref to study the
prevalence of open access <https://peerj.com/articles/4375/> and for open
access monitoring <http://www.knowledge-exchange.info/event/oa-monitoring>.
So, let's examine whether we come to the same conclusion.
Analyses
To do so, I firstly downloaded the yearly article volume from the journal European
Radiology <http://www.springer.com/medicine/radiology/journal/330> from
SpringerLink, starting in 2015.
Let's load these metadata into R and obtain information about when and in
which volumes articles were published:
library(tidyverse)
my_files <- list.files(pattern = ".csv")
springer <- purrr::map_df(my_files, readr::read_csv)
springer %>%
count(`Publication Year`, `Journal Volume`)
Four records seem to represent journal information. There are also
online-first articles published in 2017 and 2018, which have not appeared
in a printed volume, yet.
Now, let's obtain metadata via the Crossref API
<https://api.crossref.org/> using the rcrossref package
<https://github.com/ropensci/rcrossref>, and check whether Crossref's and
SpringerLink's indexing coverage of articles published in European
Radiology 2015 and 2016 is identical. For this aim, we firstly used the
from-pub-date parameter as the INTACT study did, and secondly, the
from-print-pub-date parameter was used to avoid confusion between
online-first and print publication.
library(rcrossref)
# R call representing from-pub-date query
cr_from_online <- rcrossref::cr_works(filter = c(issn = "0938-7994",
from_pub_date = "2015-01-01",
until_pub_date = "2016-12-31",
type = "journal-article"),
limit = 1000, cursor = "*", cursor_max = 5)
# R call representing from-print-pub-date query
cr_from_print <- rcrossref::cr_works(filter = c(issn = "0938-7994",
from_print_pub_date = "2015-01-01",
until_print_pub_date = "2016-12-31",
type = "journal-article"),
limit = 1000, cursor = "*", cursor_max = 5)
Are there different result sets?
Dataset obtained from querying by first date of publication:
cr_from_online$data %>%
count(volume)
Dataset obtained from querying by date of publication in a printed volume:
cr_from_print$data %>%
count(volume)
While articles queried by from-published-date were published in three
different yearly volumes, filtering with from_print_pub_date results in
an identical number of articles obtained via SpringerLink.
Finally, let's check whether the SpringerLink 2015-2016 and Crossref
from_print_pub_date sets are equal using DOIs:
# filter 2015 and 2016 publications
springer_15_16 <- springer %>%
filter(`Publication Year` %in% c(2015, 2016))
setequal(springer_15_16$`Item DOI`, cr_from_print$data$DOI)
Conclusion
In conclusion, by checking Crossref and SpringerLink for articles
published in "European Radiology" no article coverage differences could be
found between these two sources. However, when comparing the indexing
coverage of Crossref and SpringerLink, query parameters must be harmonized
in order to guarantee equal article sets.
Session info
sessionInfo()
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
<#211 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAM7YRWL4X6EEYRVTFQAQGLRUUMK5ANCNFSM4NQ245BQ>
.
|
thanks! |
same as #210 also temporarily removed to debug #82.
Must check whether there's something in here we're missing elsewhere.
The text was updated successfully, but these errors were encountered: