Change default schemaLocation so it points at a URL #45
Thanks much for this. A few things here:
First, I think we should discuss whether we want to do it this way, or just stop putting anything in schemaLocation at all by default. I'm kind of inclined to the latter (because all the test files had this set, I had historically assumed it was a "good idea", but now I'm not convinced). Relatedly, given the proposal that eml_validate ignores schemaLocation anyway and always uses the local copy, just leaving that blank seems the simplest (i.e. it feels a bit weird to me that we have to add a function to set the schema location while also ignoring that value in validation, though I do agree with @mbjones that it is better we always use the local copy to validate).
Second, I think (but may be mistaken!) that we have a somewhat bigger nuisance in fixing #44. Currently validation is always local, but because the test suite includes a bunch of documents that are not valid complete EML, but are instead only valid subsets (e.g. dataset etc), the validator is currently parsing schemaLocation in order to decide if it should be testing against the full eml.xsd or some subset like dataset.xsd. This causes validation to fail if schemaLocation is unset. It sounds like this should be a simple fix, though I did take a quick stab at it on the plane and created some test failures which I have not yet dug into. See patch-schemaLocation branch....
R/eml_version.R
# Helper string used below to factor out the host from the generation of
# schemaLocation values
eml_host <- "https://eml.ecoinformatics.org"
This is fine, unless we feel it should be configurable as an option. But why make it a package global variable? It seems more natural to me to make eml_host an optional argument to the new eml_schema_location() function, with this as the default.
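A rough sketch of that suggestion, with the host as a defaulted argument rather than a package-level variable. The `version` argument, its default of "2.2.0", and the exact URL layout are assumptions made for illustration; only the function name, the `eml_host` name, and the default host come from the diff and the comment.

```r
# Sketch only: `version`, its default, and the URL layout are assumptions.
eml_schema_location <- function(version = "2.2.0",
                                eml_host = "https://eml.ecoinformatics.org") {
  namespace <- paste0(eml_host, "/eml-", version)
  # xsi:schemaLocation is a "namespace location" pair separated by a space
  paste(namespace, paste0(namespace, "/eml.xsd"))
}

# Default host
eml_schema_location()

# Caller overrides the host without touching any package state
eml_schema_location("2.2.0", eml_host = "https://example.org")
```

Keeping the host in the signature makes the default visible in the function's documentation and lets tests pass a different host directly.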
That's a better approach.
Hey @cboettig: I think I'm fine with either setting Considering the other two points you bring up, I can get onboard with not setting I'm happy to close this PR if we agree.
You make a good case for setting it as being 'good practice' even though it's not required. I think we should set it by default then. Let's change the PR to
Does that make sense? Not sure if
I wanna think through this a bit... The user-facing interface for setting So if it's not set, we automatically generate a value for them which would be a side-effect and should be overridable. Good so far. (Aside: Should it be overridable in This makes me want to add support for an
Yup, good point! I agree the user way to set it should be in the list construction (or also in an argument to a dedicated constructor function, e.g. the proposed I guess the only trick is that it's a bit less obvious how to have a default value that the user can override with
Hrm, at the risk of discussing this to death, what about
This could be exposed at the If you aren't a fan, I am happy to do
@amoeba thanks, yeah, I like your proposal better, I think it's more intuitive than Also if you have a chance, maybe you want to take a whack at the validator code to remove any use of
As per discussion in ropensci#45, this refactors the behavior of the schemaLocation argument in as_xml to provide an API for asking emld to automatically guess a helpful, web-resolvable schemaLocation value when a value isn't set on the document before serialization.
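One way to read the behavior described in that commit message is sketched below. The helper name `resolve_schema_location` is hypothetical, and the accepted values of the argument (TRUE to guess, FALSE to leave unset, a string to use verbatim) are an assumption, since the thread does not spell them out.

```r
# Hypothetical helper; the TRUE / FALSE / character semantics are assumed.
# current:        value already set on the document, or NULL if none
# schemaLocation: the argument passed by the caller at serialization time
# guessed:        the automatically generated, web-resolvable value
resolve_schema_location <- function(current, schemaLocation = TRUE, guessed) {
  # A value already set on the document always wins
  if (!is.null(current)) return(current)
  # A character value is used verbatim
  if (is.character(schemaLocation)) return(schemaLocation)
  # TRUE asks for the guessed value; anything else leaves it unset
  if (isTRUE(schemaLocation)) guessed else NULL
}

guess <- "https://eml.ecoinformatics.org/eml-2.2.0 https://eml.ecoinformatics.org/eml-2.2.0/eml.xsd"
resolve_schema_location(NULL, TRUE, guess)               # uses the guess
resolve_schema_location(NULL, FALSE, guess)              # stays unset
resolve_schema_location("my-ns my.xsd", TRUE, guess)     # document value is kept
```

The point of this shape is that the guess is only a fallback: it never overwrites something the user put on the document, and the user can opt out entirely.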
I refactored the schemaLocation argument as discussed above. Please take a look. I'll move on to looking at excising schemaLocation from validation logic. Edit: And will send a separate PR for that.
thanks for this, current approach looks great!
Closes #44
Couple of things to look at here:
eml_version.R
?