identify open content license urls used for open access articles in hybrid journals #81
Comments
Some background:
Existing White List
Existing script to harmonize license URLs: https://github.com/subugoe/hybrid_oa_dashboard/blob/8e1e50d9403ec90a94c699e51919a46aeb1c0418/R/license_normalise.R#L4-L20
Related approach with comprehensive White List: “Applying Crossref and Unpaywall information to identify gold, hidden gold, hybrid and delayed Open Access publications in the KB publication corpus”: https://osf.io/preprints/socarxiv/sdzft/
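To make the idea of harmonizing license URLs against a white list concrete, here is a minimal base R sketch. It is not the code from `license_normalise.R`; the names `open_license_patterns`, `normalise_license_url()` and `is_open_license()` are hypothetical, the two-entry pattern list is an assumption rather than the project's actual white list, and the example URLs are only illustrative.

```r
# Hypothetical allow-list of open content license patterns (an assumption,
# not the project's actual white list).
open_license_patterns <- c(
  "creativecommons.org/licenses/",
  "creativecommons.org/publicdomain/"
)

# Normalise a license URL so that trivial variants compare equal:
# lower-case, drop the protocol and a leading "www.", drop trailing slashes.
normalise_license_url <- function(url) {
  out <- tolower(trimws(url))
  out <- sub("^https?://", "", out)
  out <- sub("^www\\.", "", out)
  sub("/+$", "", out)
}

# TRUE where the normalised URL contains one of the allow-listed patterns.
# Missing URLs are reported as FALSE because grepl() returns FALSE for NA.
is_open_license <- function(url) {
  norm <- normalise_license_url(url)
  Reduce(`|`, lapply(open_license_patterns, function(p) {
    grepl(p, norm, fixed = TRUE)
  }))
}

# Illustrative input: two variants of the same CC BY URL, a publisher
# text-mining license, and a missing value.
license_urls <- c(
  "http://creativecommons.org/licenses/by/4.0/",
  "https://creativecommons.org/licenses/by/4.0",
  "http://www.elsevier.com/tdm/userlicense/1.0/",
  NA
)

normalised <- normalise_license_url(license_urls)
open <- is_open_license(license_urls)
print(data.frame(license_urls, normalised, open))
```

Substring matching keeps the sketch short, but an exact match against a curated list of full URLs would probably be stricter and easier to audit.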
Filtering journals where no license URLs were shared:
# required libraries
library(dplyr) # data transformation
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
library(tidyr) # working with list-columns
library(jsonlite) # working with json files
# load data, most recent dump, which also includes data from Jan and Feb 2020
license_df <- jsonlite::stream_in(url("https://raw.githubusercontent.com/subugoe/hybrid_oa_dashboard/update_jan_feb_20/data/jn_facets_df.json"), verbose = FALSE)
# list journals for which no license URLs were shared (empty license_refs)
license_df %>%
  select(license_refs, journal_title, publisher) %>%
  unnest(license_refs, keep_empty = TRUE) %>%
  filter(is.na(.id))
#> # A tibble: 470 x 4
#> .id V1 journal_title publisher
#> <chr> <int> <chr> <chr>
#> 1 <NA> NA Natures Sciences Sociétés EDP Sciences
#> 2 <NA> NA Journal of Neuroscience Society for Neuroscience
#> 3 <NA> NA Genes & Development Cold Spring Harbor Labora…
#> 4 <NA> NA Physiological Genomics American Physiological So…
#> 5 <NA> NA Climate Research Inter-Research Science Ce…
#> 6 <NA> NA Jahrbuch der Österreichischen By… Osterreichische Akademie …
#> 7 <NA> NA Zeitschrift für Antikes Christen… Walter de Gruyter GmbH
#> 8 <NA> NA Journal of Lipid Research American Society for Bioc…
#> 9 <NA> NA Molecular Biology of the Cell American Society for Cell…
#> 10 <NA> NA Journal of Biological Chemistry American Society for Bioc…
#> # … with 460 more rows
Created on 2020-03-19 by the reprex package (v0.3.0)
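The filter above relies on `keep_empty = TRUE`: without it, `unnest()` drops rows whose list-column element is empty, so journals without license information would disappear before they could be counted. A small sketch with made-up data (`toy_df` and its journal and publisher names are invented for illustration, and the real `license_refs` elements may be shaped differently) shows the difference:

```r
library(dplyr)
library(tidyr)

# Toy data: Journal A has one license entry, Journal B has none (NULL).
toy_df <- tibble(
  journal_title = c("Journal A", "Journal B"),
  publisher = c("Publisher X", "Publisher Y"),
  license_refs = list(
    data.frame(
      .id = "http://creativecommons.org/licenses/by/4.0/",
      V1 = 12L,
      stringsAsFactors = FALSE
    ),
    NULL
  )
)

# Default behaviour: the row with the empty element is dropped.
dropped <- toy_df %>%
  unnest(license_refs)

# keep_empty = TRUE keeps it as a single row of missing values.
kept <- toy_df %>%
  unnest(license_refs, keep_empty = TRUE)

# Journals without any license URL are the rows where .id is NA.
no_license <- kept %>%
  filter(is.na(.id))

print(no_license)
```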
I haven't touched any of this substantively, but I've created some scaffolding in 8e3cc1e.
Here's a reproducible example (reprex) to obtain licenses used for all hybrid journals covered by the Open APC initiative.
Created on 2020-03-06 by the reprex package (v0.3.0)