R package to store/access metadata associated with data/functions #18
Comments
Great idea. My proposal issue title was perhaps too specific to ...
I am not sure from what you wrote if your idea builds on existing ...
My suggestion was based on a pressing unmet need I face when ingesting, ...
In terms of choosing the standard, EML seems to be the most generically ...
I also like the idea that this topic may have cross-cutting potential with ...
Good stuff mate, let me know if I am off track with where you saw this ...
On Wed, Mar 30, 2016 at 1:35 PM, Jonathan Carroll notifications@github.com
Python docstrings are a standard in that they are strongly encouraged and are handled as official attributes, but I'm only using those as an example to launch from. From a structural point of view, the EML standard would be perfect, but I was thinking more in terms of Roxygen-defined attributes than an XML structure. The attributes would be retrievable as first-class objects via some method, or printable with a ...

I have in mind (and remember, this is all purely brainstorming at this point) the case where you load some ...
A somewhat complicated extension of this would be to overload ...

Some related reading: http://simplystatistics.org/2015/11/06/how-i-decide-when-to-trust-an-r-package/
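To make the "retrievable as first-class objects via some method" idea a little more concrete, here is a minimal base R sketch. It is only one possible shape for the idea, not a proposed API: the accessor names `object_doc()` and `object_doc<-` and the attribute name `"doc"` are hypothetical, and `celsius_to_kelvin()` is a made-up example function.

```r
# Hypothetical accessor pair (illustrative names, not from any existing package).
# The docstring is stored as an ordinary attribute on the object.
`object_doc<-` <- function(x, value) {
  attr(x, "doc") <- value
  x
}

object_doc <- function(x) {
  attr(x, "doc", exact = TRUE)
}

# A made-up function to document.
celsius_to_kelvin <- function(temp_c) temp_c + 273.15

# Attach the docstring; the function itself still behaves as before.
object_doc(celsius_to_kelvin) <- "Convert Celsius to Kelvin. Source: example only."

# Retrieve the docstring without opening the help system.
object_doc(celsius_to_kelvin)

# The documented function is still callable.
celsius_to_kelvin(0)
```

Because this relies on plain attributes, it inherits their main weakness: any function that rebuilds the object without copying attributes will silently drop the docstring.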
First off, I see that there is already ropensci/EML and the associated idea, but I'm not a fan of S4, and I'm thinking bigger.

I've brought this up in discussions elsewhere in the past and I know that hadley hasn't made attributes a priority in his workflows (e.g. in relation to `assertr()` https://twitter.com/hadleywickham/status/559183346144522241) -- in fact, it was only recently that attributes were preserved in `dplyr` pipelines. They're certainly not preserved in `plyr` functions.

I'd love to be able to attach a Python-esque docstring to data and functions that can be printed without invoking the full help menu (`?library`), which might contain the last time the object was updated (either automated or manually stated), source, attribution, etc. It's certainly possible to use `comment()` on a `data.frame` but I'm thinking perhaps these can be stored similarly to `.Rmd` files (with full markdown capability?) in a cache and searched/loaded independently to ensure they survive processing. This could include a checksum on the object to enforce reproducibility and perhaps even a trigger system if an object is declared immutable but is altered (override `<-` ... does one dare?). Needless to say, these would have to be transparent to existing structures, so that would need some careful consideration and balance.

Just thoughts at this stage.
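As a rough illustration of the docstring-plus-checksum idea, the sketch below uses only base R and the `tools` package that ships with it. Everything named here is an assumption made for the example: the helpers `object_checksum()`, `set_meta()`, and `is_unchanged()`, the `"meta"` attribute, and the toy data do not come from any existing package. It also stores the metadata as an attribute rather than in a separate cache, so it does not address the concern about attributes being dropped during processing.

```r
# Hypothetical helpers (illustrative names only).

# MD5 of an object's serialized bytes, ignoring the "meta" attribute itself
# so that attaching metadata does not change the checksum.
object_checksum <- function(x) {
  attr(x, "meta") <- NULL
  tmp <- tempfile()
  on.exit(unlink(tmp))
  writeBin(serialize(x, connection = NULL), tmp)
  unname(tools::md5sum(tmp))
}

# Attach a free-form docstring plus provenance fields as one attribute.
set_meta <- function(x, doc, source = NA_character_) {
  attr(x, "meta") <- list(
    doc      = doc,
    source   = source,
    updated  = Sys.time(),
    checksum = object_checksum(x)
  )
  x
}

# TRUE if the data still match the checksum recorded in the metadata.
is_unchanged <- function(x) {
  identical(attr(x, "meta")$checksum, object_checksum(x))
}

df <- data.frame(id = c(1L, 2L, 3L), value = c(2.5, 3.1, 4.8))

# Base R's existing mechanism: a plain comment attribute.
comment(df) <- "Toy measurements"
comment(df)

# The richer, hypothetical metadata record.
df <- set_meta(df, doc = "Toy measurements", source = "made up for this example")
attr(df, "meta")$doc

is_unchanged(df)   # data untouched since the checksum was recorded

df$value[2] <- 0
is_unchanged(df)   # data modified after the checksum was recorded
```

For the "declared immutable" case, base R's `lockBinding()` can make a variable binding read-only without overriding `<-`, which may be a gentler starting point than a custom trigger system.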