Feedback on using emld to ease creation of helper methods #17

Closed
amoeba opened this issue Feb 28, 2018 · 3 comments

Comments

@amoeba
Contributor

amoeba commented Feb 28, 2018

Hey @cboettig I carved out some time to look at re-writing some EML helpers in emld to get a sense of how the two packages compare. The main thing I'm coming away with so far is that my helpers nearly melt away. I think this is evidence that it will be easier to write a full suite of generic helpers by hand, and also any special case helpers the community needs.

I started out by rewriting the helpers related to parties in arcticdatautils. Three major differences emerged:

  • I no longer have to do the NULL checking because emld handles the propagation of NULLs gracefully
  • I'm no longer wrapping things in ListOfs. This is always a pain for new users.
  • Sub-elements of parties, like email addresses, are way easier to create

I think these were all benefits you hoped we'd see.

Take a helper for creating a simple contact. Its only real purpose is to provide the user with some autocompletion and roxygen docs:

set_contact <- function(givenNames = NULL, surName, email = NULL) {
  list(individualName = list(givenName = givenNames,
                             surName = surName),
       electronicMailAddress = email)
}

Adding support for another (simple non-nested) attribute of the party like phone number is an easy change:

set_contact <- function(givenNames = NULL, surName, email = NULL, phone = NULL) {
  list(individualName = list(givenName = givenNames,
                             surName = surName),
       electronicMailAddress = email,
       phone = phone)
}

Good start so far.
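
For illustration, here is a minimal usage sketch of the set_contact() helper above, in base R only. The name and email address are hypothetical placeholders. It only shows that an unsupplied argument stays NULL in the plain list; dropping that NULL at serialization time is left to emld, as described earlier in this comment.

```r
# set_contact() as defined above (the version with phone support).
set_contact <- function(givenNames = NULL, surName, email = NULL, phone = NULL) {
  list(individualName = list(givenName = givenNames,
                             surName = surName),
       electronicMailAddress = email,
       phone = phone)
}

# Hypothetical contact; the name and email are placeholders.
contact <- set_contact(givenNames = "Jane", surName = "Doe",
                       email = "jane.doe@example.org")

# Inspect the nested list; no S4 classes or ListOf wrappers are involved.
str(contact)

# phone was not supplied, so the helper leaves it as NULL without any
# explicit NULL checks in the helper body.
is.null(contact$phone)
```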

@cboettig
Member

Thanks! This is encouraging to hear. I agree that getting past having to deal with NULL and ListOf makes these a lot easier to write.

It's been both exciting and a bit scary to see the EML R package getting real uptake (i.e. by people I don't already know), which probably says more about the EML standard than the R package. I know you need a stable platform for the arctic data work, but are also doing a lot of custom development in arcticdatautils -- do you think we have a path to get some of that to be emld based instead of S4 based?

Just a reminder that I have emld-based versions of most (all?) of the set_ methods from EML over in https://github.com/cboettig/eml2 now. Recall that eml2 also has a construct function for all complex EML elements (basically providing tab completion but not documentation at this point, which could/should be added), e.g. construct$creator() etc.

I haven't written all the get_ methods yet, in particular the ones that return tabular data from EML (attribute table, taxa table), though I've done some. The default rectangling performed by jsonlite::fromJSON() provides a pretty nice generic algorithm for mapping nested JSON to a data.frame, though the result sometimes needs a little tidying.
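
A minimal sketch of that rectangling, assuming a small hand-written JSON array (the field names loosely mirror EML attribute metadata, but the structure and values are invented for illustration and are not the exact EML JSON that emld produces):

```r
library(jsonlite)

# Hypothetical attribute metadata; values are invented for illustration.
json <- '[
  {"attributeName": "site",
   "attributeDefinition": "Sampling site code",
   "measurementScale": {"type": "nominal"}},
  {"attributeName": "temp",
   "attributeDefinition": "Water temperature",
   "measurementScale": {"type": "interval", "unit": "celsius"}}
]'

# fromJSON() simplifies an array of objects into a data.frame by default.
# The nested measurementScale object becomes a data.frame-valued column.
attrs <- fromJSON(json)

# The "little tidying": flatten() turns nested data.frame columns into
# ordinary columns named like measurementScale.type.
flat <- flatten(attrs)
names(flat)
```

The missing unit in the first record comes back as NA in the flattened table, which is typically what you want for an attribute table.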

I'm also practicing a pattern based on jq queries to get 'flatter' JSON returned that can be easily rectangled. I think the jq approach will also be useful for going between, say, schema.org and EML JSON, i.e. using schema.org as a lighter-weight input format of key-value pairs than EML (mostly it has slightly less typing and cleaner semantics, e.g. givenName and familyName are properties of creator, not properties of individualName, which is a more kludgy way of implying type: Organization). Okay, I digress...
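
A rough sketch of the jq pattern, assuming the jqr package as the R binding to jq. The input JSON is a hypothetical, simplified EML-like fragment, and the query and output key names (schema.org-style givenName / familyName) are illustrative rather than the actual queries used in eml2:

```r
library(jqr)
library(jsonlite)

# Hypothetical EML-like JSON; names and emails are placeholders.
json <- '{"creator": [
  {"individualName": {"givenName": "Jane", "surName": "Doe"},
   "electronicMailAddress": "jane.doe@example.org"},
  {"individualName": {"givenName": "John", "surName": "Smith"},
   "electronicMailAddress": "john.smith@example.org"}
]}'

# jq query: pull the nested name fields up one level so that each creator
# becomes a flat object of key-value pairs, collected into a single array.
query <- paste0(
  "[.creator[] | {",
  "givenName: .individualName.givenName, ",
  "familyName: .individualName.surName, ",
  "email: .electronicMailAddress}]"
)

# Run the query, then rectangle the flatter JSON with jsonlite.
flat_json <- jqr::jq(json, query)
creators <- jsonlite::fromJSON(as.character(flat_json))
creators
```

Because the query already returns an array of flat objects, fromJSON() can simplify it to a data.frame with no further tidying.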

@amoeba
Contributor Author

amoeba commented Mar 19, 2018

> do you think we have a path to get some of that to be emld based instead of S4 based?

Maybe. I'm no longer working on arcticdatautils, and have transferred maintenance to the team that works with arcticdata.io and other projects. I know they're quite capable of learning the new way, though they may not see the point because they already have the helpers in S4. Asking them to play around with it might be a good use case.

@cboettig
Member

I've opted to go back from construct$dataset etc. to the shorter eml$dataset in the eml2 constructor methods. eml2 also now has a set_responsibleParty method.
