Add Support for Resource URI Scheme #1451
| // only set to systemId if it's different from the diagnosticFile path | ||
| !xercesError.getSystemId.endsWith(schemaFileLocation.diagnosticFile.getPath) | ||
| ) new File(xercesError.getSystemId) | ||
| else schemaFileLocation.diagnosticFile), |
There was a problem hiding this comment.
So is this saying that there are cases where the file in which Xerces finds an error isn't necessarily the same as the schema file location that Daffodil provided to Xerces? Like the error is in an imported file or something?
Should we just always use the systemId from Xerces if it's not null? Or are there cases where it might be defined but diagnosticFile is the right thing to use?
Also, some Windows tests are failing that look like they might be related to this, causing us to lose depersonalized paths in diagnostics?
There was a problem hiding this comment.
Correct, there are cases where systemId is defined but it's the absolute version of diagnosticFile, causing us to lose the depersonalized paths. So the check is there to mitigate that.
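A minimal sketch of the check being discussed, assuming `systemId` is the string Xerces reports and `diagnosticFile` is the depersonalized file Daffodil supplied. `chooseDiagnosticFile` is a hypothetical helper name for illustration, not code from this PR:

```scala
import java.io.File

object DiagnosticFileChoice {
  // Hypothetical helper, for illustration only.
  // Prefer the file Xerces reports, unless it is just the absolute form of
  // the depersonalized diagnosticFile path, in which case keep diagnosticFile.
  def chooseDiagnosticFile(systemId: String, diagnosticFile: File): File =
    if (systemId != null && !systemId.endsWith(diagnosticFile.getPath))
      new File(systemId)
    else
      diagnosticFile
}
```

Note that `endsWith` compares raw strings, so a systemId written with `/` separators would not match a `diagnosticFile.getPath` that uses `\`. That may be worth checking against the Windows failures mentioned above.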
There was a problem hiding this comment.
And the depersonalized one we have in schemaFileLocation isn't necessarily the correct file that the error is about?
There was a problem hiding this comment.
Correct, that's what I found while playing around with the quarkus stuff. I'll confirm the accuracy of that again tho. It was complaining about line 577 in a 34 line file iirc
There was a problem hiding this comment.
Yep just confirmed that we're getting the right paths now when we weren't before
There was a problem hiding this comment.
Xerces Validation works without needing to do anything extra
There was a problem hiding this comment.
As an aside, I think XercesSchemaFileLocation is only used when we run into an error loading the schema. So we don't use it for validation.
There was a problem hiding this comment.
As an aside, I think XercesSchemaFileLocation is only used when we run into an error loading the schema. So we don't use it for validation.
You're talking about loading a DFDL schema right? I don't think it should be used for loading the XMLSchema_for_DFDL.xsd schema.
As such I don't think it represents an error in Daffodil.
It might not necessarily be a bug in Daffodil, it could be in Quarkus as you say, but it still represents a bug somewhere that broke functionality Daffodil needs in order to load the XML Schema for DFDL schemas. For example, if some tool modified the Daffodil jars and removed the XMLSchema_for_DFDL.xsd schema, we wouldn't want that to be reported as an SDE. That's a bug in the tool that broke an assertion that those schemas are available to Daffodil. Breaking that assertion should result in some kind of exception that's not an SDE, so those tools or Daffodil can be fixed.
There was a problem hiding this comment.
Hmm, we use XercesSchemaFileLocation for all loader errors. Are you suggesting we add some special logic for just XMLSchema_for_DFDL.xsd?
There was a problem hiding this comment.
Kind of. I'm suggesting that when we load XMLSchema_for_DFDL.xsd, not when we validate with it, we need to detect that it wasn't loaded.
I was hoping we could wrap this line in a try/catch block:
And throw an assertion when it fails, but I'm not sure if that actually throws an exception on failure. Might need a different approach if that doesn't work.
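The idea can be sketched roughly as follows. Everything here is hypothetical: `InvariantFailure` stands in for whatever non-SDE assertion Daffodil would raise, and `loadOrFail` wraps an arbitrary loading expression rather than the actual line in the loader. Because the loader might signal failure by returning null instead of throwing, the sketch covers both cases:

```scala
import scala.util.control.NonFatal

object RequiredSchemaLoad {
  // Hypothetical stand-in for an invariant/assertion failure that is not an SDE.
  final class InvariantFailure(msg: String, cause: Throwable)
    extends RuntimeException(msg, cause)

  // Evaluates `load`; any non-fatal exception or a null result is reported
  // as an InvariantFailure instead of a schema definition error.
  def loadOrFail[A <: AnyRef](description: String)(load: => A): A = {
    val result =
      try load
      catch {
        case NonFatal(e) =>
          throw new InvariantFailure(s"Unable to load $description", e)
      }
    if (result == null)
      throw new InvariantFailure(s"Unable to load $description (loader returned null)", null)
    result
  }
}
```

A caller would wrap only the load of XMLSchema_for_DFDL.xsd this way, so that errors found while loading a user's DFDL schema still surface as SDEs.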
| } | ||
| } | ||
|
|
||
| def optResourceURI(contextURI: URI, relPath: String): Option[URI] = { |
There was a problem hiding this comment.
Suggest we call this optRelativeResourceURI to match the pattern of optRelativeJarFileURI, which looks like it does a similar thing but for jar contexts?
| if (Paths.get(relURI).toFile.exists()) | ||
| Some(relURI) | ||
| else None | ||
| } else if (contextURI.getScheme == "resource") { |
There was a problem hiding this comment.
It might be worth doing a little refactoring of this getResourceRelativeOption function to make it clearer and more correct. For example, the contextURI.isOpaque condition probably wasn't the right check, since the logic of that branch expects the context to be a "jar" URI. Any other opaque URI won't work and will probably lead to a weird error. I think this only works because the only opaque URI we support is "jar", so isOpaque kind of implies it's a jar URI, but in theory we could add support for other opaque URIs that aren't jar, and this would break.
Since this function always expects contextURI to have a scheme, what if we move the logic of the "file" scheme to a new optRelativeFileURI, and then this function just becomes:
contextURI.getScheme match {
  case "jar" => optRelativeJarFileURI(...)
  case "file" => optRelativeFileURI(...)
  case "resource" => optRelativeResourceURI(...)
  case _ => throw new IllegalArgumentException(...)
}
I think that makes this function clearer and leaves it to the functions it calls to figure out how to correctly handle resolving the relative path.
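A compilable sketch of that shape, under stated assumptions: only the "file" case is filled in, the jar and resource resolvers are placeholders that always return None, and the signatures are guesses based on the names used in this thread rather than Daffodil's actual code.

```scala
import java.net.URI
import java.nio.file.{Files, Paths}

object RelativeResolution {
  // Resolves relPath against a file: context and keeps it only if it exists.
  def optRelativeFileURI(contextURI: URI, relPath: String): Option[URI] = {
    val resolvedURI = contextURI.resolve(relPath)
    if (Files.exists(Paths.get(resolvedURI))) Some(resolvedURI) else None
  }

  // Placeholders only: the real jar and resource logic is not shown here.
  def optRelativeJarFileURI(contextURI: URI, relPath: String): Option[URI] = None
  def optRelativeResourceURI(contextURI: URI, relPath: String): Option[URI] = None

  // Dispatch on the scheme itself instead of inferring "jar" from isOpaque.
  def getResourceRelativeOption(contextURI: URI, relPath: String): Option[URI] =
    contextURI.getScheme match {
      case "jar" => optRelativeJarFileURI(contextURI, relPath)
      case "file" => optRelativeFileURI(contextURI, relPath)
      case "resource" => optRelativeResourceURI(contextURI, relPath)
      case other =>
        throw new IllegalArgumentException(s"Unsupported URI scheme: $other")
    }
}
```

The reason isOpaque is a weak proxy is that `java.net.URI` treats any URI whose scheme-specific part does not start with `/` as opaque. So `jar:file:/tmp/a.jar!/x.xsd` and `mailto:dev@example.org` are both opaque, and with the explicit match the latter is rejected up front instead of falling into jar-specific logic.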
| } | ||
|
|
||
| def optResourceURI(contextURI: URI, relPath: String): Option[URI] = { | ||
| val relURI = contextURI.resolve(relPath) |
There was a problem hiding this comment.
I think resolvedURI might be more clear. relURI makes me think this URI is relative, but it should be an absolute URI at this point.
There was a problem hiding this comment.
I know this was just copied from getResourceRelativeOnlyOption, but we should rename relURI to resolvedURI here also. relURI really isn't a good description of what this variable represents.
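To illustrate why resolvedURI is the more accurate name, here is a small sketch using a made-up context URI and relative path. Only `java.net.URI` is used; no handler for the "resource" scheme is needed just to resolve.

```scala
import java.net.URI

object ResolveDemo {
  // Hypothetical context and relative path, for illustration only.
  val contextURI: URI = URI.create("resource:/org/example/main.dfdl.xsd")
  val relPath: String = "types/common.dfdl.xsd"

  // The result of resolve carries the context's scheme and a full path,
  // so it is absolute even though relPath was relative.
  val resolvedURI: URI = contextURI.resolve(relPath)
}
```

Here `ResolveDemo.resolvedURI.isAbsolute` is true, while `URI.create(ResolveDemo.relPath).isAbsolute` is false.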
| val relURI = contextURI.resolve(relPath) | ||
| try { | ||
| relURI.toURL.openStream().close() | ||
| this.getClass.getResource(relURI.getPath) |
There was a problem hiding this comment.
It looks like these two lines are doing the same thing: checking to see whether the resolved URI actually exists. So I'm not sure we need both. I imagine calling getResource is more efficient since it doesn't need to actually open the file as a stream and deal with exceptions? So can we remove the try/catch and just do something like this:
if (this.getClass.getResource(relURI.getPath) != null) Some(relURI) else None
| case "file" => Paths.get(uri).toFile | ||
| case "resource" => { | ||
| val resourceFilePart = uri.getPath | ||
| new File(resourceFilePart) |
There was a problem hiding this comment.
The rest of this function uses Paths.get(foo).toFile. I'm not sure why we did that, maybe Paths.get is doing something additional? Should we do that here for consistency?
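Both suggestions for the resource branch (a single getResource existence check, and Paths.get(...).toFile for consistency) can be sketched together. This is an assumption-laden illustration: the object name and `resourceToFile` are hypothetical, and the sketch passes the URI's path string to Paths.get because `Paths.get(uri)` needs a registered file system provider for the URI's scheme, which is assumed not to exist for "resource".

```scala
import java.io.File
import java.net.URI
import java.nio.file.Paths

object ResourceURIs {
  // getResource returns null when the resource is absent, so wrapping the
  // call in Option replaces the openStream/try/catch existence probe.
  def optRelativeResourceURI(contextURI: URI, relPath: String): Option[URI] = {
    val resolvedURI = contextURI.resolve(relPath)
    Option(getClass.getResource(resolvedURI.getPath)).map(_ => resolvedURI)
  }

  // Hypothetical helper: same Paths.get(...).toFile pattern as the other
  // branches, applied to the path component of a resource URI.
  def resourceToFile(uri: URI): File = Paths.get(uri.getPath).toFile
}
```

Whether the class loader behind `getClass` can see the same resources as the "resource" URL handler depends on the runtime, so that equivalence would need to be confirmed.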
daffodil-lib/src/main/scala/org/apache/daffodil/lib/xml/DaffodilXMLLoader.scala
Created https://cwiki.apache.org/confluence/display/DAFFODIL/Working+with+Native+Executables as a companion to this ticket as well. @stevedlawrence @mbeckerle @jadams-tresys
stevedlawrence
left a comment
There was a problem hiding this comment.
+1, one variable rename, and maybe the XercesSchemaFileLocation change isn't needed anymore?
| else | ||
| new File(sysIdNormalizedPath) | ||
| } | ||
| }), |
There was a problem hiding this comment.
Can this change be reverted? Now that failures to load the validation schema result in an assertion, I think XercesSchemaFileLocation will only be created where schemaFileLocation.diagnosticFile is correct.
54a8d16 to b53ed68
- open up check for schema in resolveCommon to allow for file, jar and resource scheme
- throw invariantFailed assertion in the case that we're unable to load XMLSchema_for_DFDL.xsd so the user is aware something is broken outside of the schema, rather than an SDE that indicates something is wrong with the schema

DAFFODIL-2973
b53ed68 to f97b339
DAFFODIL-2973