Discussion: Can/should we be able to import a dataspiced dataset from the web/elsewhere? #57

amoeba · 2018-05-30T17:01:40Z

This comes from a good question in my dataspice demo today: If user X authors a dataspice page for their dataset, and another scientist, Y, wants to use it, it'd be cool if they just ran:

import_spice("https://amoeba.github.io/some-dataset")

And their computer downloaded something like some-dataset.zip which had the dataspice.json and the files described in access.csv attached to it somehow.

khondula · 2018-05-30T17:10:44Z

That seems cool! Would that depend on the reliability/persistence of 'contentUrl' or 'contentUrl' + 'fileName'?

Would there maybe be a way to generate a .bib as well, to suggest a citation?

cboettig · 2018-05-30T17:19:34Z

👏

I think it might be more robust to have a function that just extracts the metadata and returns an R object containing the download URLs, e.g. something like:

x <- import_spice()
read_csv(x$files[[1]])

(Some examples of schema.org Dataset contentUrls do not contain direct links to download a data file, but rather a web page that has links).

Could potentially make this behavior part of a read_spice() function; i.e. read_spice could work locally on a dataspice.json object or could extract dataspice.json from HTML content on the web.
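As a rough illustration of the "extract dataspice.json from HTML content" half of that idea, here is a minimal sketch. It assumes the published page embeds its schema.org metadata in a `<script type="application/ld+json">` element, and it relies on the xml2 and jsonlite packages. `extract_spice_json()` is a hypothetical name used only for this example, not an existing dataspice function.

```r
library(xml2)
library(jsonlite)

# Hypothetical helper (not part of dataspice): pull schema.org JSON-LD
# out of an HTML page. `x` may be a URL, a file path, or literal HTML,
# since xml2::read_html() accepts all three.
extract_spice_json <- function(x) {
  page <- xml2::read_html(x)
  # Assumption: the metadata lives in the first JSON-LD <script> element.
  node <- xml2::xml_find_first(page, "//script[@type='application/ld+json']")
  if (inherits(node, "xml_missing")) {
    stop("No JSON-LD <script> element found", call. = FALSE)
  }
  # Keep nested lists as lists so the result mirrors the JSON structure.
  jsonlite::fromJSON(xml2::xml_text(node), simplifyVector = FALSE)
}
```

Under those assumptions, something like `meta <- extract_spice_json("https://amoeba.github.io/some-dataset")` would return the metadata as a nested list without downloading any data files.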

An R object could also contain the citation (perhaps as an R bibitem object, which R can already turn into either bibtex or text-based citation). i.e. simply x$citation; or we could have a methods-y interface like citation(x)
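To make the shape of such an object concrete, here is a minimal sketch using only base R. It assumes the metadata has already been parsed into a nested list following schema.org Dataset conventions (`distribution` entries with `contentUrl`, `creator` entries with `name`). The function name `as_spice_sketch()` and the layout of the returned list are hypothetical, not a proposal for the final API.

```r
# Hypothetical constructor (not part of dataspice): wrap parsed
# schema.org Dataset metadata (a nested list) in a small R object.
as_spice_sketch <- function(meta) {
  # Collect download URLs; entries without a contentUrl become NA.
  files <- vapply(meta$distribution, function(d) {
    if (is.null(d$contentUrl)) NA_character_ else d$contentUrl
  }, character(1))

  # Assumption: creators are schema.org Person entries with a `name` field.
  authors <- lapply(meta$creator, function(p) utils::person(given = p$name))

  fields <- list(
    bibtype = "Misc",
    title   = meta$name,
    author  = if (length(authors)) do.call(c, authors),
    year    = if (!is.null(meta$datePublished)) substr(meta$datePublished, 1, 4),
    url     = meta$url
  )
  # Drop fields the metadata did not supply before building the entry.
  fields <- fields[!vapply(fields, is.null, logical(1))]

  list(
    metadata = meta,
    files    = files,
    citation = do.call(utils::bibentry, fields)
  )
}
```

With an object like this, `read_csv(x$files[[1]])` fetches a file only when the user asks for it, and `toBibtex(x$citation)` or `format(x$citation, style = "text")` covers the .bib and plain-text citation cases.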

amoeba · 2018-05-30T17:30:40Z

@khondula wrote:

That seems cool! Would that depend on the reliability/persistence of 'contentUrl' or 'contentUrl' + 'fileName'?

Yes, I see it as a huge need to resolve this stuff soon. @cboettig's idea below helps alleviate that: don't fetch the data at first, just the metadata, then give the user a way to fetch some or all of it.

returns an R object

Ooh nice! More robust yes.

Could potentially make this behavior part of a read_spice() function; i.e. read_spice could work locally on a dataspice.json object or could extract dataspice.json from HTML content on the web.

👍 and 👍 on all those ideas @cboettig

amoeba added the help wanted label May 30, 2018

cboettig mentioned this issue May 30, 2018

Discussion: Non-HTML output formats (like Rmd/md) #56

Open

amoeba mentioned this issue Jun 23, 2018

EML to dataspice formats #62

Closed

isteves mentioned this issue Jun 28, 2018

Add citation NCEAS/metajam#36

Open

amoeba modified the milestones: v1.0, v1.1 Jul 21, 2020
