Content and formatting in the output of `retriever ls` #210

cboettig · 2014-09-24T22:23:23Z

As discussed in ropensci/rdataretriever#36 (comment)

It's not trivial to map the available datasets listed to the more
meaningful descriptions on the website,
http://ecodataretriever.org/available-data.html.
The naming convention for datasets is unclear, and not obviously scalable as an
identifier.

dmcglinn · 2015-04-26T20:35:14Z

I agree this is a bit of a problem. The names listed by `retriever ls` could be mapped to the descriptions at http://www.ecodataretriever.org/available-data.html by adding a table to that page with two columns: 1) the long, hyperlinked descriptive name of the dataset, and 2) the short name that retriever users will refer to the dataset by (e.g., BBS).
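A minimal sketch of the two-column table proposed above, written as a Python lookup. Everything here is an assumption for illustration: the `DATASET_TABLE` structure, the `describe` and `render_table` helpers, and the placeholder description for Palmer2007 are hypothetical and are not part of the retriever itself.

```python
# Hypothetical two-column table: descriptive name <-> short name.
# The entries are illustrative only, not taken from the retriever's metadata.
DATASET_TABLE = [
    {"shortname": "BBS", "description": "North American Breeding Bird Survey"},
    {"shortname": "Palmer2007", "description": "(placeholder description)"},
]


def describe(shortname, table=DATASET_TABLE):
    """Return the descriptive name for a short name (case-insensitive)."""
    key = shortname.lower()
    for row in table:
        if row["shortname"].lower() == key:
            return row["description"]
    raise KeyError(f"unknown dataset short name: {shortname}")


def render_table(table=DATASET_TABLE):
    """Render the table as plain text, one dataset per line."""
    width = max(len(row["shortname"]) for row in table)
    return "\n".join(
        f"{row['shortname']:<{width}}  {row['description']}" for row in table
    )


if __name__ == "__main__":
    print(render_table())
```

Keeping the mapping in one structure would let the same data feed both the website table and the command-line listing, so the two cannot drift apart.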

With respect to the scalability of names, it appears that many of the dataset names have the format LastnameYEAR, where Lastname is the last name of the dataset's first author. This system could be codified as a rule for published datasets. For unpublished datasets, I don't think a simple system will be that easy to define.
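As a rough sketch of how the LastnameYEAR pattern could be codified for published datasets, assuming the only inputs are the first author's last name and a four-digit publication year. The `make_shortname` helper and its normalization choices (stripping accents and punctuation) are hypothetical, not an existing retriever rule.

```python
import re
import unicodedata


def make_shortname(lastname, year):
    """Build a LastnameYEAR identifier, e.g. ("palmer", 2007) -> "Palmer2007"."""
    # Reduce accented characters to their closest ASCII letters.
    ascii_name = (
        unicodedata.normalize("NFKD", lastname)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Keep letters only, so spaces, hyphens and apostrophes are dropped.
    letters = re.sub(r"[^A-Za-z]", "", ascii_name)
    if not letters:
        raise ValueError("last name contains no usable letters")
    year = int(year)
    if not 1000 <= year <= 9999:
        raise ValueError("year must have four digits")
    # Capitalize the first letter and leave the rest of the name untouched.
    return letters[0].upper() + letters[1:] + str(year)
```

A rule like this does not by itself resolve collisions (two datasets with the same first author and year), which is part of why it may not scale as an identifier without further conventions.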

Additionally it appears that the output of retriever ls does not automatically detect all of the available datasets. For example the newly created script EA_palmer2007.script which generates the dataset named Palmer2007 is not listed by retriever ls.

ethanwhite · 2015-04-27T14:17:25Z

I also agree that this is something that needs improving, just haven't had time to work on it.

> Additionally it appears that the output of retriever ls does not automatically detect all of the available datasets. For example the newly created script EA_palmer2007.script which generates the dataset named Palmer2007 is not listed by retriever ls.

This is probably because you're running the most recent release rather than the current version of master. The current release will only download scripts that existed as of that release. The truth is that the entire relationship between scripts and releases needs to be thought through from both a technical and a user perspective, and I'm hoping that once I get a software engineer hired we'll be able to tackle that side of things.

shreyneil · 2018-02-20T22:16:43Z

@ethanwhite I think this issue was resolved by #488. Please review and close this issue.

ethanwhite · 2019-03-03T17:29:25Z

I agree that this has generally been addressed. Thanks for pointing this out @shreyneil!

We've fully addressed the first point: dataset names and their descriptions are now explicitly linked at https://retriever.readthedocs.io/en/latest/datasets_list.html and through the verbose presentation that @shreyneil points to.

We've also improved the ability to digest and work with this metadata in Python and R.

We haven't yet grappled with the naming conventions issue, but to be honest that feels like a broader community discussion involving Frictionless Data and other folks. Specifically, I think any conventions related to naming should end up in https://frictionlessdata.io/specs/data-package/#required-properties, which doesn't include any specifics about naming at this time.

Thanks for the issue @cboettig. We may not get to things fast, but we do seem to get to them eventually 😄.

goelakash mentioned this issue May 13, 2016

Verbose mode for dataset display using 'retriever ls' #488

Merged

henrykironde assigned ethanwhite Mar 3, 2019

ethanwhite closed this as completed Mar 3, 2019
