Relative URLs over http and file don't seem to work. #56

Closed
toverly opened this issue Jun 21, 2013 · 26 comments

Comments

@toverly

toverly commented Jun 21, 2013

Issue

I have been working to make our Schema project more robust and allow it to be used both offline and online. Your validator is the best, but I can't seem to get it to validate a schema that is linked through a relative URI.

I have looked at your "Example 5" and have also written two tests against our schema. The issue is that I would like to set the namespace depending on how I want to use the schema files. It would be really nice to be able to check out the whole schema repository and set the namespace to a local file location, or, if I wanted to, set the namespace to the public website instead.

Tests

Here is the test I wrote: https://github.com/spidasoftware/schema/blob/cee/utils/test/groovy/com/spidasoftware/schema/validation/ConceptualSchemaTest.groovy

Here is the base schema file: https://github.com/spidasoftware/schema/blob/cee/v1/spidacalc/calc/point.schema
that references two other files.

Am I doing something totally wrong, or is this a bug?

@fge
Collaborator

fge commented Jun 21, 2013

Hello,

First of all: what version are you using? This is important since quite a few things are changing under the bonnet at the moment...

I'll try and read your test file, but I don't know Groovy too much...

@fge
Collaborator

fge commented Jun 21, 2013

One thing I have noticed:

def fileUrl = "file:"+new File("v1").absolutePath+"/"

This will give, as a URI: file:/path/to/file/. But this is not a valid file URI! It should be file:///path/to/file/.

You should use the .toURI() method of File.
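
A minimal sketch of that suggestion in plain Java (which Groovy can call unchanged). The class and method names are hypothetical, and the trailing-slash handling is an assumption about what a namespace URI needs: `File.toURI()` only appends `/` when the directory actually exists on disk. Note that Java prints the single-slash form `file:/path/`.

```java
import java.io.File;
import java.net.URI;

// Hypothetical helper, not part of json-schema-validator.
final class FileUriDemo {
    /** Builds an absolute file URI for a directory, guaranteeing a trailing slash. */
    static URI directoryUri(File dir) {
        // toURI() resolves relative paths against the working directory
        // and percent-encodes characters such as spaces.
        String s = dir.getAbsoluteFile().toURI().toString();
        // Add the slash ourselves in case the directory does not exist yet.
        return URI.create(s.endsWith("/") ? s : s + "/");
    }

    public static void main(String[] args) {
        URI base = directoryUri(new File("v1"));
        System.out.println(base);
        // Relative references resolve below the directory thanks to the trailing slash.
        System.out.println(base.resolve("spidacalc/calc/point.schema"));
    }
}
```

Compared with string concatenation, this avoids hand-building the scheme prefix and handles escaping.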

@toverly
Author

toverly commented Jun 21, 2013

Don't worry, it is the same as Java, just without semicolons, and def == Object. If it is confusing, I could convert it to Java for you. I also moved all these changes to a new branch because I was breaking things, and I updated the links above.

@fge
Collaborator

fge commented Jun 21, 2013

Uh, OK, sorry, I was mistaken, this is a valid file URI...

@toverly
Author

toverly commented Jun 21, 2013

Yeah, same error. I have tried 2.01 and 2.1.5 for this, with the same result.

@toverly
Author

toverly commented Jun 21, 2013

This is the mvn error for 2.1.5:

testThatRelativeHTTPNamespaceWorks(com.spidasoftware.schema.validation.ConceptualSchemaTest) Time elapsed: 0.367 sec <<< ERROR!
java.lang.NullPointerException
at com.github.fge.jsonschema.main.JsonSchemaFactoryBuilder.setLoadingConfiguration(JsonSchemaFactoryBuilder.java:107)
at com.github.fge.jsonschema.main.JsonSchemaFactoryBuilder$setLoadingConfiguration.call(Unknown Source)
at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:45)
at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:108)
at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:116)
at com.spidasoftware.schema.validation.ConceptualSchemaTest.testThatRelativeHTTPNamespaceWorks(ConceptualSchemaTest.groovy:37)

@fge
Collaborator

fge commented Jun 21, 2013

ARGH! Why doesn't maven show the error message??

The line in the code is:

BUNDLE.checkNotNull(loadingCfg, "nullLoadingCfg");

This means the exception's .getMessage(), which is not shown here, should be:

nullLoadingCfg = loading configuration must not be null

Uhm. I'll look at the test source again.

@toverly
Author

toverly commented Jun 21, 2013

That is odd. Why is it null? I set it just before. Also, is there an easy way for me to enable logging?

@fge
Collaborator

fge commented Jun 21, 2013

Now, as to your initial question: I plan to add path redirects to -core, and maybe this is what you are looking for. That is, if you have a base URI for your schemas of, say, https://my.site/schemas/, but have a local copy under, for instance, resource:/com/myprojects/schemas/, I plan to make it possible to write:

LoadingConfiguration.newBuilder()
    .pathRedirect("https://my.site/schemas/", "resource:/com/myprojects/schemas/")

This, in combination with .setNamespace(), should, I think, answer this need. Or did you have something else in mind?
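
To make the idea concrete, here is a rough sketch of what a prefix-based redirect could do before a URI is dereferenced. It is plain Java and deliberately not the library's implementation: `PrefixRedirectSketch`, `add` and `apply` are hypothetical names, and first-match-wins ordering is an assumption.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of prefix rewriting; not the -core API.
final class PrefixRedirectSketch {
    // Insertion order is kept so that the first registered prefix wins.
    private final Map<String, String> redirects = new LinkedHashMap<>();

    PrefixRedirectSketch add(String fromPrefix, String toPrefix) {
        redirects.put(fromPrefix, toPrefix);
        return this;
    }

    /** Rewrites the URI if it starts with a registered prefix; otherwise returns it unchanged. */
    String apply(String uri) {
        for (Map.Entry<String, String> e : redirects.entrySet()) {
            if (uri.startsWith(e.getKey())) {
                return e.getValue() + uri.substring(e.getKey().length());
            }
        }
        return uri;
    }

    public static void main(String[] args) {
        PrefixRedirectSketch r = new PrefixRedirectSketch()
            .add("https://my.site/schemas/", "resource:/com/myprojects/schemas/");
        System.out.println(r.apply("https://my.site/schemas/point.schema"));
    }
}
```

With such a rewrite in place, schemas can keep absolute references to the public site while being loaded from the local copy.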

@toverly
Author

toverly commented Jun 21, 2013

Not sure why it is throwing a null pointer, the LoadingConfiguration isn't null.

com.spidasoftware.schema.validation.ConceptualSchemaTest - fileUri: file:/Users/toverly/Code/schema/v1/
com.spidasoftware.schema.validation.ConceptualSchemaTest - cfg: com.github.fge.jsonschema.load.configuration.LoadingConfiguration@4df8b14

As to redirect:

I think that would get me to nearly the same spot, but it seems like the more elegant solution would be to have the namespace set. This would eliminate the need to have the large "https://blah blah blah" all over the place and would make it much easier to move it to a different location, since none of the code would actually reference an absolute location.

I think the redirect is an excellent feature for schemas that aren't referenced in a relative way though.

@fge
Collaborator

fge commented Jun 21, 2013

> is there an easy way for me to enable logging?

Easy, easy... I don't know about that ;) But the answer is yes. You can supply an implementation of ReportProvider to a JsonSchemaFactory.

This means you would, for instance, provide an implementation of ProcessingReport which uses your logging system, then implement a ReportProvider in order to provide those ProcessingReports.

The interface is a little complex at the moment (this could be an abstract class with only one method to implement instead of three). And the number of "injections" I do makes me wonder about using DI...

@fge
Collaborator

fge commented Jun 21, 2013

> This would eliminate the need to have the large "https://blah blah blah" all over the place and would make it much easier to move it to a different location, since none of the code would actually reference an absolute location.

I am afraid I do not understand... Do you mean you rely on the ids of your schemas?

I'll try and write a test case here reproducing your tests. I really can't figure out what is going on...

@toverly
Author

toverly commented Jun 21, 2013

Yea sorry, I am not being all that clear. Let me try to be a little more systematic.

Currently, on the master branch of our schema, every "$ref" points to a specific URL on the master branch. This leads to issues: if I load any of the .schema files, either locally or from the web, any $ref they contain points back to a specific version on the web. This makes versioning very hard, because I don't know who or what is pointing to "master".

This file shows how I currently have the $ref on a simple object: absolute point.schema. As you can see, the id and ref are all absolute.

However, I would much rather have ALL references be relative to the namespace. I changed that same file on the branch I am working on to use relative refs, and that is what the test I provided exercises: relative point.schema.
This would allow more generic placement of the file. As best I can tell, this is how you did Example 5, except I am trying 'file' and 'https' instead of 'resource'.

Now the "redirect" would solve the issue if people are using your library, but I can't guarantee that :-), and that would be a nice feature for working with other schema files outside of our own.
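
The behaviour being asked for can be illustrated with plain `java.net.URI` resolution: the same relative reference lands in a different place depending on the namespace. This is only a sketch of standard URI resolution, not of how the validator loads schemas; the class name and the `https://example.org/...` namespace are hypothetical, while the file namespace is the one from the log output above.

```java
import java.net.URI;

// Hypothetical demo class; uses only standard java.net.URI resolution.
final class NamespaceResolutionDemo {
    /** Resolves a relative schema reference against a namespace (base URI). */
    static URI resolve(String namespace, String relativeRef) {
        return URI.create(namespace).resolve(relativeRef);
    }

    public static void main(String[] args) {
        String ref = "spidacalc/calc/point.schema";
        // Local checkout as the namespace.
        System.out.println(resolve("file:/Users/toverly/Code/schema/v1/", ref));
        // Public site as the namespace (hypothetical host).
        System.out.println(resolve("https://example.org/schema/v1/", ref));
        // Without the trailing slash, the last path segment is replaced.
        System.out.println(resolve("https://example.org/schema/v1", ref));
    }
}
```

The last call shows why the namespace should end in a slash: without it, `v1` is treated as a file name and dropped during resolution.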

@fge
Collaborator

fge commented Jun 21, 2013

Aaah, I understand, now...

It is not the loading configuration that is null, but the message bundle!

Do you compile from source?

@toverly
Author

toverly commented Jun 21, 2013

No, but I could. Want me to try something?

@fge
Collaborator

fge commented Jun 21, 2013

Uhm, no, not at this moment.

Can you check, in the 2.1.5 jar (and in the 1.1.6 -core jar), whether the following file is packaged?

META-INF/services/com.github.fge.msgsimple.serviceloader.MessageBundleProvider

If it is not there, it means there is a bug in my packaging :/
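
One possible way to check this from code rather than by listing the jar contents is to ask the class loader for the entry. This is a sketch under the assumption that the jars in question are on the application classpath; the class and method names are hypothetical.

```java
// Hypothetical helper; checks the application classpath, not one specific jar.
final class ServiceEntryCheck {
    /** Returns true if a META-INF/services entry for the given interface is visible. */
    static boolean isServiceFileVisible(String serviceInterface) {
        return ClassLoader.getSystemResource("META-INF/services/" + serviceInterface) != null;
    }

    public static void main(String[] args) {
        System.out.println(isServiceFileVisible(
            "com.github.fge.msgsimple.serviceloader.MessageBundleProvider"));
    }
}
```

If this prints `false` with the jars on the classpath, the provider file was not packaged (or is not visible to the system class loader).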

Also, at one point in the validation, whether it is from the file URI or the https URI, I get an I/O error; I'll let you know where.

I'll get back to you in a few minutes, I have spotted other problems.

@fge
Collaborator

fge commented Jun 21, 2013

OK, I have found the packaging bug, and why you get an NPE where you are getting it. A very, very stupid mistake on my part :(

I need to fix another mistake I have spotted in -core (this one I don't quite understand where it comes from, but it shouldn't be long to spot). I'll get back to you ASAP. And in the meanwhile I'll also read your post above!

@fge
Collaborator

fge commented Jun 21, 2013

First, about your test... The base URI is https://github.com/spidasoftware/schema/blob/master/v1/spidacalc/calc/point.schema#. This gives a 404!

I have found this which is valid: https://raw.github.com/spidasoftware/schema/master/v1/spidacalc/point.schema#

Also, if you don't mind sending me a mail (see my profile), I'd like us to continue via IRC; it is better for interactivity.

@fge
Collaborator

fge commented Jun 21, 2013

OK, 2.1.6 released which fixes the bug wrt message bundles.

Now, back to the main point...

> Currently, on the master branch of our schema, every "$ref" points to a specific URL on the master branch. This leads to issues: if I load any of the .schema files, either locally or from the web, any $ref they contain points back to a specific version on the web. This makes versioning very hard, because I don't know who or what is pointing to "master".

OK, but please keep in mind one thing: if an id is found at the top of the schema, it conditions the resolution scope of all JSON references which follow. For instance, if you have:

{
    "id": "/foo/bar",
    "items": {
        "$ref": "meh"
    }
}

then the fully resolved JSON reference will be /foo/meh#.
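
The same resolution can be reproduced with `java.net.URI`, as a sketch of the general URI rules rather than of the validator's internals. The class and method names are hypothetical, as is the loading location `file:/tmp/schemas/point.schema`; the library additionally writes the result with a trailing empty fragment (`#`), which this sketch does not.

```java
import java.net.URI;

// Hypothetical demo class; standard URI resolution only.
final class IdScopeDemo {
    /** Resolves a $ref against the scope established by an id. */
    static URI resolveRef(String id, String ref) {
        return URI.create(id).resolve(ref);
    }

    /** First resolves the id against where the schema was loaded from, then the $ref against that. */
    static URI scopedRef(String loadedFrom, String id, String ref) {
        URI scope = URI.create(loadedFrom).resolve(id);
        return scope.resolve(ref);
    }

    public static void main(String[] args) {
        System.out.println(resolveRef("/foo/bar", "meh")); // /foo/meh
        // The id moves the scope away from the loading location.
        System.out.println(scopedRef("file:/tmp/schemas/point.schema", "/foo/bar", "meh"));
    }
}
```

The second call shows the practical consequence: once an id is present, relative references no longer resolve next to the file that was loaded.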

Schema addressing is unfortunately a weak point right now in JSON Schema. In particular, there is inline dereferencing, which uselessly complicates matters and which I tried to scrap from the core, without succeeding because some people did not agree. But in v5 it will be gone. I'll fight for that.

Also, I don't know how other libraries deal with addressing. Those who pass the test suite have at least a modicum of common sense when dealing with URI resolution. As to others, it is completely unknown.

This is why path redirect would be a solution for you. If all your ids are absolute, a path redirect would cause URIs to be transformed before resolution.

On a more general note, I think you should try and use gh-pages: this is how I publish my Javadoc for instance. All you have to do is create a gh-pages branch on your repo, publish content and push it to github: the pages will then be visible at https://yourusername.github.io/path/to/the/files.

@toverly
Author

toverly commented Jun 21, 2013

I see about the reference and the id, and I agree that this is a weak point. We will probably use gh-pages. But right now, I can't seem to find a good way to accomplish what I need within JSON Schema. Is there any way I can voice my opinion on the spec in a useful manner? I would love to be able to say "given the root of X, load my multiple files"; essentially the same way a file system works would be perfect.

@fge
Collaborator

fge commented Jun 21, 2013

As to voicing your opinion on the spec, the best medium is the Google group.

> I would love to be able to say "given the root of X, load my multiple files" essentially the same way a file system works, would be perfect.

Heh, I wish Java 6 had FileSystem... Unfortunately, it doesn't have it at the moment ;)

I do have the means to preload schemas already, but one by one; if I were to walk a base URI, that would only make sense in a couple of cases: file URIs, for instance; even resource is a royal pain to deal with; or from a zip archive, maybe. But HTTP? ;)

The problem with id remains, though. If you have an id, it changes the URI resolution scope, and unfortunately there is no way around this at the moment :/ Or I could go around the spec and propose a mode where id would be completely ignored. That would be another option. As a consequence, however, this would be a closed schema set. What do you think?

@toverly
Author

toverly commented Jun 25, 2013

I think the bug that was there is resolved. Did you want to close?

@fge
Collaborator

fge commented Jun 25, 2013

Uhm, sorry, I have lost track. What bug are you talking about exactly?

@toverly
Author

toverly commented Jun 25, 2013

This was loading relative schemas, I got it to work as the standard defined it. You also found a bug with the reporting stuff.

@fge
Collaborator

fge commented Jun 27, 2013

-core 1.1.9 will have full path redirection support FWIW.

As to this issue, is it OK for you?

@fge
Collaborator

fge commented Apr 8, 2014

-ETIMEOUT...

fge closed this as completed Apr 8, 2014