Emmanuel Blondel edited this page Aug 25, 2020 · 5 revisions

atom4R - Tools to read/write/publish metadata as Atom XML format

Provides tools to read/write/publish metadata based on the Atom XML syndication format. This includes support of Dublin Core XML implementation, and a client to APIs implementing the AtomPub SWORD API specification.


If you wish to sponsor atom4R, do not hesitate to contact me

Many thanks to the following organizations that have provided funding for strengthening the atom4R package:


Table of contents

1. Overview and vision
2. Development status
3. Credits
4. User guide
   4.1 Installation
   4.2 Write and Read Atom XML records
      4.2.1 Atom Feed and Entry objects
      4.2.2 Dublin Core Entry objects
   4.3 Publish Atom XML records
      4.3.1 SWORD API for Dataverse
         4.3.1.1 Create Dataverse record
         4.3.1.2 Read Dataverse record
         4.3.1.3 Update Dataverse record
         4.3.1.4 Delete Dataverse record
         4.3.1.5 Add/Remove files in a Dataverse record
         4.3.1.6 Publish Dataverse record
5. Issue reporting

1. Overview and vision


The atom4R package provides tools to read/write/publish metadata based on the Atom XML syndication format. This includes support of Dublin Core XML implementation, and a client to APIs implementing the AtomPub SWORD API specification.

An introduction to the Atom web standard(s), including the Atom XML Syndication format and the AtomPub protocol, can be found at https://en.wikipedia.org/wiki/Atom_(Web_standard).

atom4R is developed jointly with the geoflow package, which aims to facilitate and automate the production of geographic metadata documents and their associated data sources. In geoflow, atom4R is used to assign DOIs and to cross-reference these DOIs in other metadata documents, such as geographic metadata (ISO 19115/19139) hosted in metadata catalogues and open data portals.

2. Development status


  • June 2020: Inception. Source code managed on GitHub.
  • Coming soon: Publication on CRAN.

3. Credits


(c) 2020, Emmanuel Blondel

Package distributed under MIT license.

If you use atom4R, I would be very grateful if you could cite it in your published work. By citing atom4R, beyond acknowledging the work, you help make it more visible and support its growth and sustainability. For citation, please use the package DOI.

4. User guide


4.1 How to install atom4R in R

For now, the package can only be installed from GitHub. First install the devtools package:

install.packages("devtools")

Once the devtools package is loaded, you can use the install_github function to install atom4R. By default, the package is installed from the master branch, which holds the current development version (likely to be unstable).

require("devtools")
install_github("eblondel/atom4R")
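
As a possible alternative, the installation can be made conditional so that a script does not reinstall the package on every run. The sketch below is only an illustration: install_if_missing is a hypothetical helper, and it assumes the remotes package (which also provides install_github) rather than devtools.

```r
# Hypothetical helper (not part of atom4R): install a GitHub package only
# when it is not already available in the current R library.
# Assumption: 'remotes' is used instead of devtools.
install_if_missing <- function(pkg, repo) {
  if (requireNamespace(pkg, quietly = TRUE)) {
    return(invisible(FALSE))  # already installed, nothing to do
  }
  if (!requireNamespace("remotes", quietly = TRUE)) {
    install.packages("remotes")
  }
  remotes::install_github(repo)
  invisible(TRUE)
}

# Usage (needs network access):
# install_if_missing("atom4R", "eblondel/atom4R")
# library(atom4R)
```

The helper returns FALSE invisibly when the package was already present, and TRUE after triggering an installation.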

4.2 Write and read Atom XML records

4.2.1 Atom Feed and Entry objects

The example below shows how to create an AtomFeed object, add an AtomEntry to it, encode it as XML, and read an AtomFeed back from XML.

#encoding
atom <- AtomFeed$new()
atom$setId("my-atom-feed")
atom$setTitle("My Atom feed title")
atom$setSubtitle("My Atom feed subtitle")
author1 <- AtomAuthor$new(
  name = "John Doe",
  uri = "http://www.atomxml.com/johndoe",
  email = "johndoe@atom4R.com"
)
atom$addAuthor(author1)
author2 <- AtomAuthor$new(
  name = "John Doe's sister",
  uri = "http://www.atomxml.com/johndoesister",
  email = "johndoesister@atom4R.com"
)
atom$addAuthor(author2)
contrib1 <- AtomContributor$new(
  name = "Contrib1",
  uri = "http://www.atomxml.com/contrib1",
  email = "contrib1@atom4R.com"
)
atom$addContributor(contrib1)
contrib2 <- AtomContributor$new(
  name = "Contrib2",
  uri = "http://www.atomxml.com/contrib2",
  email = "contrib2@atom4R.com"
)
atom$addContributor(contrib2)
atom$setIcon("https://via.placeholder.com/300x150.png/03f/fff?text=atom4R")
atom$setSelfLink("http://example.com/atom.feed")
atom$setAlternateLink("http://example.com/my-atom-feed")
atom$addCategory("dataset")
atom$addCategory("spatial")
atom$addCategory("fisheries")

#add entry
entry <- AtomEntry$new()
entry$setId("my-atom-entry")
entry$setTitle("My Atom feed entry")
entry$setSummary("My Atom feed entry very comprehensive abstract")
author1 <- AtomAuthor$new(
  name = "John Doe",
  uri = "http://www.atomxml.com/johndoe",
  email = "johndoe@atom4R.com"
)
entry$addAuthor(author1)
author2 <- AtomAuthor$new(
  name = "John Doe's sister",
  uri = "http://www.atomxml.com/johndoesister",
  email = "johndoesister@atom4R.com"
)
entry$addAuthor(author2)
contrib1 <- AtomContributor$new(
  name = "Contrib1",
  uri = "http://www.atomxml.com/contrib1",
  email = "contrib1@atom4R.com"
)
entry$addContributor(contrib1)
contrib2 <- AtomContributor$new(
  name = "Contrib2",
  uri = "http://www.atomxml.com/contrib2",
  email = "contrib2@atom4R.com"
)
entry$addContributor(contrib2)
entry$addCategory("dataset")
entry$addCategory("spatial")
entry$addCategory("fisheries")

atom$addEntry(entry)

xml <- atom$encode()

#decoding
atom2 <- AtomFeed$new(xml = xml)
xml2 <- atom2$encode()
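
The example above builds the same authors and contributors twice, once for the feed and once for the entry. One way to reduce the repetition is a small helper that attaches a list of people to any object exposing addAuthor and addContributor. This is a sketch only: add_people is a hypothetical function, not part of atom4R, and it assumes that reusing the same author and contributor objects in the feed and in the entry is acceptable.

```r
# Hypothetical helper (not part of atom4R): attach the same people to any
# object exposing addAuthor() and addContributor(), such as a feed or an entry.
add_people <- function(target, authors = list(), contributors = list()) {
  for (author in authors) target$addAuthor(author)
  for (contributor in contributors) target$addContributor(contributor)
  invisible(target)
}

# Usage with the objects created above (requires atom4R to be loaded):
# add_people(atom,  list(author1, author2), list(contrib1, contrib2))
# add_people(entry, list(author1, author2), list(contrib1, contrib2))
```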

4.2.2 Dublin Core Entry objects

The example below shows how to create a DCEntry object in R, encode it as XML, and read a DCEntry back from XML. A DCEntry can be used as an AtomEntry in an AtomFeed object.

#encoding
dcentry <- DCEntry$new()
dcentry$setId("my-dc-entry")

#fill dc entry
dcentry$addDCDate(Sys.time())
dcentry$addDCTitle("atom4R - Tools to read/write and publish metadata as Atom XML format")
dcentry$addDCType("Software")
creator <- DCCreator$new(value = "Blondel, Emmanuel")
creator$attrs[["affiliation"]] <- "Independent"
dcentry$addDCCreator(creator)
dcentry$addDCSubject("R")
dcentry$addDCSubject("FAIR")
dcentry$addDCSubject("Interoperability")
dcentry$addDCSubject("Open Science")
dcentry$addDCDescription("Atom4R offers tools to read/write and publish metadata as Atom XML syndication format, including Dublin Core entries. Publication can be done using the Sword API which implements AtomPub API specifications")
dcentry$addDCPublisher("GitHub")

funder <- DCContributor$new(value = "CNRS")
dcentry$addDCContributor(funder)
dcentry$addDCRelation("Github repository: https://github.com/eblondel/atom4R")
dcentry$addDCSource("Atom Syndication format - https://www.ietf.org/rfc/rfc4287")
dcentry$addDCSource("AtomPub, The Atom publishing protocol - https://tools.ietf.org/html/rfc5023")
dcentry$addDCSource("Sword API - http://swordapp.org/")
dcentry$addDCSource("Dublin Core Metadata Initiative - https://www.dublincore.org/")
dcentry$addDCSource("Guidelines for implementing Dublin Core in XML - https://www.dublincore.org/specifications/dublin-core/dc-xml-guidelines/")
dcentry$addDCLicense("NONE")
dcentry$addDCRights("MIT License")

xml <- dcentry$encode()

#decoding
dcentry2 <- DCEntry$new(xml = xml)
xml2 <- dcentry2$encode()
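
Since a DCEntry can be used as an entry of an AtomFeed, the two examples can be combined. The sketch below wraps this in a hypothetical function, make_dc_feed, which is not part of atom4R; it only relies on the AtomFeed methods used earlier on this page (setId, setTitle, addEntry).

```r
# Hypothetical helper (not part of atom4R): build an AtomFeed whose entries
# are Dublin Core entries. atom4R is only needed when the function is called.
make_dc_feed <- function(id, title, dc_entries) {
  feed <- atom4R::AtomFeed$new()
  feed$setId(id)
  feed$setTitle(title)
  for (dc_entry in dc_entries) {
    feed$addEntry(dc_entry)  # a DCEntry is added like any AtomEntry
  }
  feed
}

# Usage with the dcentry object created above:
# feed <- make_dc_feed("my-dc-feed", "My Dublin Core feed", list(dcentry))
# xml <- feed$encode()
```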

4.3 Publish Atom XML records with AtomPub

The Atom Publishing Protocol (AtomPub or APP) is a simple HTTP-based protocol for creating and updating web resources.

atom4R aims to offer a standard R interface to APIs implementing the AtomPub protocol. One of the key APIs targeted by the package is the SWORD API. For the time being, atom4R offers an R interface to SWORD API v2, using the open-source Dataverse as its main testing platform. Additional platforms implementing AtomPub / SWORD may be tested in the future, depending on the needs of the user community.

4.3.1 SWORD API for Dataverse

atom4R defines an interface to the Dataverse SWORD API with the SwordDataverseClient class. To connect to the Dataverse SWORD API, run the following code, filling in your Dataverse hostname and user token:

SWORD <- SwordDataverseClient$new(
  hostname = "localhost:8085",
  token = "<token>",
  logger = "DEBUG"
)
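
To avoid writing the token in a script, it can be read from an environment variable. The sketch below is an illustration under assumptions: connect_dataverse is a hypothetical wrapper and DATAVERSE_TOKEN is a variable name chosen for this example; only the SwordDataverseClient$new call and its arguments come from the code above.

```r
# Hypothetical wrapper (not part of atom4R): read the API token from an
# environment variable and fail early with a clear message if it is missing.
# Assumption: the variable name DATAVERSE_TOKEN is chosen for this example.
connect_dataverse <- function(hostname, token = Sys.getenv("DATAVERSE_TOKEN")) {
  if (!nzchar(token)) {
    stop("No API token found: set the DATAVERSE_TOKEN environment variable")
  }
  atom4R::SwordDataverseClient$new(
    hostname = hostname,
    token = token,
    logger = "DEBUG"
  )
}

# Usage:
# SWORD <- connect_dataverse("localhost:8085")
```

The variable can be set for the current R session with Sys.setenv, or in a user-level .Renviron file so that it stays out of version control.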

The following sections detail how to run the SWORD API operations with atom4R.

4.3.1.1. Create Dataverse Record

To create a Dataverse record, you should specify the ID of the dataverse (collection) in which you want to deposit the record. The record should be an object of class DCEntry.

#Create with SWORD
out <- SWORD$createDataverseRecord("<dataverse ID>", dcentry)

4.3.1.2. Read Dataverse Record

To read an existing record, the Dataverse SWORD API requires the global identifier of the previously deposited record. This identifier is a string giving the DOI reserved by Dataverse when the record was created:

#Read with SWORD
out <- SWORD$getDataverseRecord("doi:10.XXX/10XXXX")

4.3.1.3. Update Dataverse Record

To update an existing record, the Dataverse SWORD API requires the global identifier of the previously deposited record, in addition to the dataverse ID and the updated DCEntry. This identifier is a string giving the DOI reserved by Dataverse when the record was created:

#Update with SWORD
out <- SWORD$updateDataverseRecord("<dataverse ID>", dcentry, "doi:10.XXX/10XXXX")

4.3.1.4. Delete Dataverse Record

To delete an existing record, the Dataverse SWORD API requires the global identifier of the previously deposited record. This identifier is a string giving the DOI reserved by Dataverse when the record was created:

#Delete with SWORD
out <- SWORD$deleteDataverseRecord("doi:10.XXX/10XXXX")

4.3.1.5. Add/Remove files in a Dataverse record

One or more files can be added to a Dataverse record. As for the other methods, the global identifier (the DOI assigned by Dataverse) is required to locate the record to which the files should be added.

SWORD$addFilesToDataverseRecord("doi:10.XXX/10XXXX", files = c("file1", "file2", ...))

The files should be given as a simple character vector of file names.
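
Before uploading, it can help to check that the listed files actually exist on disk. The sketch below uses only base R; existing_files is a hypothetical helper, not part of atom4R.

```r
# Hypothetical helper (not part of atom4R): keep only the files that exist,
# and warn about the ones that are skipped.
existing_files <- function(files) {
  found <- file.exists(files)
  if (any(!found)) {
    warning("Skipping missing file(s): ",
            paste(files[!found], collapse = ", "), call. = FALSE)
  }
  files[found]
}

# Usage with the SWORD client created above:
# SWORD$addFilesToDataverseRecord(
#   "doi:10.XXX/10XXXX",
#   files = existing_files(c("file1", "file2"))
# )
```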

In a similar way, files can be removed from a Dataverse record. To delete all files:

SWORD$deleteFilesFromDataverseRecord("doi:10.XXX/10XXXX")

To delete specific files, use the files argument, as with the addFilesToDataverseRecord method.

4.3.1.6. Publish Dataverse record

To publish an existing record, the Dataverse SWORD API requires the global identifier of the previously deposited record. This identifier is a string giving the DOI reserved by Dataverse when the record was created:

#Publish with SWORD
out <- SWORD$publishDataverseRecord("doi:10.XXX/10XXXX")

You cannot delete a published Dataverse record yourself; to have it deleted, contact your Dataverse administrator. You can, however, still edit a published record: publishing it again creates a new version of the record in Dataverse.
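
The edit-and-publish cycle for an existing record can be sketched as a single function that chains the methods shown in the previous sections. This is an illustration only: republish_record is a hypothetical helper, not part of atom4R, and it assumes that each method stops with an error on failure so that later steps are not run.

```r
# Hypothetical helper (not part of atom4R): update an existing record,
# optionally add files, then publish it again.
# 'client' is a SwordDataverseClient, 'entry' a DCEntry, 'doi' the global identifier.
republish_record <- function(client, dataverse_id, entry, doi,
                             files = character(0)) {
  client$updateDataverseRecord(dataverse_id, entry, doi)
  if (length(files) > 0) {
    client$addFilesToDataverseRecord(doi, files = files)
  }
  client$publishDataverseRecord(doi)
}

# Usage with the objects created above:
# out <- republish_record(SWORD, "<dataverse ID>", dcentry, "doi:10.XXX/10XXXX")
```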

5. Issue reporting


Issues can be reported at https://github.com/eblondel/atom4R/issues

Related to Dataverse