Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft 4 support #28

Closed
gazpachoking opened this issue Sep 20, 2012 · 44 comments
Closed

Draft 4 support #28

gazpachoking opened this issue Sep 20, 2012 · 44 comments
Labels
Enhancement Some new desired functionality
Milestone

Comments

@gazpachoking
Copy link
Contributor

I started writing up a bit of code for the current revision of draft 4. Branch is here
I haven't done any testing, or added unit tests, so it is liable to be broken, just thought I'd let people know I'm working on it.

@gazpachoking
Copy link
Contributor Author

I'm wondering if we should have another property other than self._version for doing the version check within the validation methods. It seems rather wasteful to have to compare the whole dictionary every time you want to see what version you are validating against. It might be nice to have a numerical property, such that you could check if self._version_num > 3

@Julian
Copy link
Member

Julian commented Sep 20, 2012

Fantastic! Thanks, that's certainly a useful thing – one I've had in mind but haven't gotten to.

Let me give you a few pointers (and I'll probably add a few if I remember more) if I may so that we don't do any extra work.

The direction this should go in would not involve having lots of conditionals inside Validator. Instead, I think ultimately, we would gradually phase out using Validator to construct validators, and have it turn into a factory function. You don't have to worry about that part until this is close to being done, but for right now what that means is that I think you should be writing a new class, Draft4Validator, and working on re-factoring so that you can re-use whatever validation is common between the two versions of the draft, and ergo the two validators (preferably via a Mixin, rather than inheritance). Then when that's close to done, we can work on properly selecting a Validator class.

The advantages here are testability and clarity, as you've already noticed.

@gazpachoking
Copy link
Contributor Author

Yeah, I was wondering about making it a separate class, that seems like a good idea. I'm not sure the best way to reduce the duplication between classes yet though. I could easily create a base with the methods that both drafts have in common, I'm just wondering how it will work if draft 5 comes out which no longer uses the same common base. Were you thinking more along the lines of not having actual validate_? methods in the mixin, but methods to make writing those easier?

@Julian
Copy link
Member

Julian commented Sep 20, 2012

Well, if Draft 5 has different semantics for the same properties, it just won't inherit from the mixin, or more methods will be factored out into another mixin that suits it, or more functions.

I can't tell without seeing what it looks like (or attempting to implement it) whether a mixin would be better than a bunch of factory functions that write validator functions – I think that's going to be dependent on each validator method on a case by case basis, depending on how much the functionality changes (and the third option of rewriting it completely might be appropriate for drastic changes). So I leave that up to your judgement at the moment :).

@gazpachoking
Copy link
Contributor Author

Sounds good. I'll give it a go, and let you give some more input from there.

@Julian
Copy link
Member

Julian commented Sep 20, 2012

Great! Thanks again.

@Julian
Copy link
Member

Julian commented Sep 21, 2012

One more tip here. If I had to do all the tests over again, I'd have done it in a language agnostic way outside of a python source file, and then just load that file and loop over it in the test suite, creating a test for each line.

If you'd like to be adventurous, you can try that. If not I'll probably migrate to it myself at some point, or bug the JSON Schema guys to make a language agnostic set of test cases.

@gazpachoking
Copy link
Contributor Author

Yeah, I was hoping the json-schema guys had something like that. It makes sense to have the test cases be usable by any validation implementation. Here's hoping they come up with something of that nature.

@gazpachoking
Copy link
Contributor Author

Well, it still needs some cleanup, but I have the current tests for draft 3 running again after splitting it out into its own class. (except doctests)

@gazpachoking
Copy link
Contributor Author

Branch has been updated with master changes and $ref support.

@gazpachoking
Copy link
Contributor Author

I'll give this another go on top of the new clean branch.

@Julian
Copy link
Member

Julian commented Oct 28, 2012

OK Great. I think tomorrow I'm going to be pushing out 0.7, and that the things I want to prioritize for 0.8 are getting some real docs and porting to the JSON test suite, so keep that in mind I guess (use JSON tests) if you can.

@gazpachoking
Copy link
Contributor Author

OK, should we implement a test runner for the json tests first then? How are you thinking of including them, git submodule from the test repo? I guess I'll work on some more tests for draft 3 branch over there first.

EDIT: I made this test runner a while ago, although I was not concentrating on making it clean, just making it work.

@Julian
Copy link
Member

Julian commented Oct 28, 2012

Yes to a test runner, haven't thought about how yet but it should be
runnable with nose / any ordinary runner and each example should show up as
an individual test. Think there's a load_tests thing that discoverers
understand but I forget offhand.

I've come to hate submodules and the extra hassle they cause, so I think
the way to package it is to just include the directory containing (just)
the JSON tests themselves. We should add a version ID to the JSON test
suite and then just update it manually on new releases after checking them.

Sound good to you?

On Sunday, October 28, 2012, Chase Sterling wrote:

OK, should we implement a test runner for the json tests first then? How
are you thinking of including them, git submodule from the test repo? I
guess I'll work on some more tests for draft 3 branch over there first.


Reply to this email directly or view it on GitHubhttps://github.com//issues/28#issuecomment-9842769.

@gazpachoking
Copy link
Contributor Author

Yep sounds good to me. I didn't use any load_tests thing in my preliminary one, I'll have to investigate that. My method did make a separate case for each test, but I'm fairly confident there should be a better way to do it.

@gazpachoking
Copy link
Contributor Author

So, do you think there is a better way to define common tests between draft 3 and draft 4? I'm just thinking about if we update common tests how to keep them in sync other than just updating the test in both folders.

@Julian
Copy link
Member

Julian commented Nov 6, 2012

I'm hoping that that doesn't happen enough to annoy us (updating vs. writing new tests), but if it does we'll have to be more clever. For now I guess I was assuming that the naive way is fine. If you're finding it annoying I guess we should come up with something.

@gazpachoking
Copy link
Contributor Author

I'm not finding it annoying, especially since I didn't write draft 4 tests yet. :-P Just wanted to check if you had some ideas. I bet you are right though, and it won't really be an issue.

@gazpachoking
Copy link
Contributor Author

@Julian I figured I'd move discussion from here for stuff specific to this implementation. Did a quick start on draft 4 support tonight, mostly just moved stuff around so far. Let me know if I'm going in the right direction, and any more ideas you have.
https://github.com/gazpachoking/jsonschema/tree/draft4_final

@Julian
Copy link
Member

Julian commented Feb 7, 2013

Yeah that looks like it's definitely headed in the right direction. I should say, there is a medium sized feature branch that I have to merge in the format branch which add support for format validation, obviously.

I don't think it should be getting in your way but I'll get it in this afternoon anyhow since it's essentially done and the rest of the stuff can be done directly inside develop.

@gazpachoking
Copy link
Contributor Author

Okay, got my branch updated and (I think) fully working with draft 4. There were a couple spots I wasn't sure how much detail to provide in the error message. I didn't commit the draft 4 tests here, wasn't exactly sure how that's supposed to work, but they are working with my most recent branch on the test suite repo. One issue I ran in to is with the definitions tests I made there, which have a schema which is a $ref to the draft 4 metaschema, and the metaschema has $refs to itself, which were then not resolving properly. I'll merge this with the format stuff soon, was in the middle of changes when you pushed that.

@Julian
Copy link
Member

Julian commented Feb 8, 2013

Yeah -- we're gonna need circular ref detection for that I think. @fge has
mentioned some stuff about that before.

Maybe the test suite should have a test for that itself even.

I'll take a look at this later, thanks :).
On Feb 7, 2013 11:02 PM, "Chase Sterling" notifications@github.com wrote:

Okay, got my branch updated and (I think) fully working with draft 4.
There were a couple spots I wasn't sure how much detail to provide in the
error message. I didn't commit the draft 4 tests here, wasn't exactly sure
how that's supposed to work, but they are working with my most recent
branch on the test suite repo. One issue I ran in to is with the
definitions tests I made there, which have a schema which is a $ref to the
draft 4 metaschema, and the metaschema has $refs to itself, which were then
not resolving properly. I'll merge this with the format stuff soon, was in
the middle of changes when you pushed that.


Reply to this email directly or view it on GitHubhttps://github.com//issues/28#issuecomment-13275966.

@gazpachoking
Copy link
Contributor Author

Alright, merged in the format stuff. Still need to add some changes to that for the draft 3 vs draft 4 validator, (draft 4 renamed ip-address to ipv4, and removed colors format,) I'll have to look into the workings a bit closer to see how to do that most cleanly.

@fge
Copy link

fge commented Feb 8, 2013

About $ref, FWIW: my ref resolution processing works as follows. I have three methods on a schema tree (JsonSchemaTree):

  • a method to fully resolve the reference against the current resolution context of the schema;
  • a method to tell whether a resolved $ref may resolve within the current schema (it may, but still fail to do, see below);
  • a method returning a JSON Pointer into the schema if the ref actually resolves, or null otherwise.

Let us take this scenario for instance.

The schema, with loading URI x://y.z/t#, is as follows:

{
    "definitions": {
        "foo": {},
        "baz": { "$ref": "#/definitions/bar" }
    }
}

We are at pointer /definitions/baz. The node is a reference:

  • the resolution method will yield reference x://y.z/t#/definitions/bar;
  • the method telling whether that reference is contained with the current scope returns true;
  • but the method returning a pointer to the actual reference returns null.

Therefore, this is a dangling JSON reference.

I do the same for inline dereferencing and this can be quite fun! For instance, a ref reading foo://bar#/a/b/c will return true (and resolve!) in the following scenario:

{
    "id": "foo://bar#/a/b",
    "c": { "type": "string" }
}

Hope this helps!

@fge
Copy link

fge commented Feb 8, 2013

Also, to detect ref loops, it is simple enough: I use a LinkedHashSet (Linked to preserve insertion order) and collect in there all refs I have stumbled upon during the resolution process. When I compute a ref, if it has already been seen, it means that we have a ref loop.

Full algorithm here:

https://github.com/fge/json-schema-validator/blob/master/src/main/java/com/github/fge/jsonschema/processing/ref/RefResolverProcessor.java#L97

@gazpachoking
Copy link
Contributor Author

@Julian So, should we include 2 subclasses of FormatChecker for draft 3 and 4 formats?

@Julian
Copy link
Member

Julian commented Feb 8, 2013

I think two instances is fine. Having the common names be on the class and
the instances (draft3_format_checker and draft4_format_checker) adding
their own aliases.

Sound OK?

@Julian
Copy link
Member

Julian commented Feb 8, 2013

And thanks fge, I'm sure that'll be helpful :).

@gazpachoking
Copy link
Contributor Author

@fge Yeah, didn't mean to ignore you, still trying to process how to apply that. Thanks!

@Julian That sounds good to me, if I understand properly. Something like this you were thinking?

+draft3_format_checker = FormatChecker()
+draft4_format_checker = FormatChecker()
...
-@FormatChecker.cls_checks("ip-address")
+@draft3_format_checker.checks("ip-address")
+@draft4_format_checker.checks("ipv4")
 def is_ip_address(instance):

@Julian
Copy link
Member

Julian commented Feb 8, 2013

Exactly. Probably prefer the d4 name in the class though.
On Feb 8, 2013 10:20 AM, "Chase Sterling" notifications@github.com wrote:

@fge https://github.com/fge Yeah, didn't mean to ignore you, still
trying to process how to apply that. Thanks!

@Julian https://github.com/Julian That sounds good to me, if I
understand properly. Something like this you were thinking?

+draft3_format_checker = FormatChecker()+draft4_format_checker = FormatChecker()
...-@FormatChecker.cls_checks("ip-address")+@draft3_format_checker.checks("ip-address")+@draft4_format_checker.checks("ipv4")
def is_ip_address(instance):


Reply to this email directly or view it on GitHubhttps://github.com//issues/28#issuecomment-13294729.

@gazpachoking
Copy link
Contributor Author

So, should each function (that's supported in both drafts) end up with 3 decorators then? One for the class, and 1 for each instance? Should the default class checkers end up the same as draft4 ones?

@Julian
Copy link
Member

Julian commented Feb 8, 2013

Yeah I think the default names should be from draft 4 since those seem a
little saner. Do you agree?

So all the functions would need only 1 decorator from the class, except for
the ones that had different names in draft 3 which would have a second on
the draft 3 format checker with their alias (this means that you be able to
even use the draft 4 names, but I think that's okay).

@Julian
Copy link
Member

Julian commented Feb 8, 2013

oh and, regarding the tests directory, it's a Got subtree as opposed to a sub module if you're familiar. Actually there's some stuff in there for the format checkers that needs to be merged back into the suite.

generally the idea is that you can commit either here or there and we should be able to merge the changes in both directions. I think the way to do some is wiwitgit diff-tree

@gazpachoking
Copy link
Contributor Author

This what you had in mind? gazpachoking@724b6ee

I'll look in to git subtrees, is it just for when the tests need to be pushed back to the suite? If I just copy in the files and make a new commit in this repo does that mess anything up?

@Julian
Copy link
Member

Julian commented Feb 8, 2013

It's for either merging forward or back. When merging in changes you need
to use a particularly arcane merge command that merges everything in 1
commit. I think if you google for git subtree 1 of the first links should
have both of the commands, otherwise I'll find it for you later or just do
the merge myself.

And yeah that looks right, though you could just use the decorator again
rather than assigning directly to checkers, I think?
On Feb 8, 2013 12:06 PM, "Chase Sterling" notifications@github.com wrote:

This what you had in mind? gazpachoking@724b6eehttps://github.com/gazpachoking/jsonschema/commit/724b6eeb04dc916728360f51098fd4ea2f44b424

I'll look in to git subtrees, is it just for when the tests need to be
pushed back to the suite? If I just copy in the files and make a new commit
in this repo does that mess anything up?


Reply to this email directly or view it on GitHubhttps://github.com//issues/28#issuecomment-13300212.

@gazpachoking
Copy link
Contributor Author

Yeah, wasn't sure if that would be better, the double function call looked a little bit weird. draft3_format_checker.checks("ip-address")(is_ipv4) I can switch it to that though.

@Julian
Copy link
Member

Julian commented Feb 8, 2013

Ah OK. It's fine leaving it.

@gazpachoking
Copy link
Contributor Author

This is what you're using to keep the test suite in sync?

@Julian
Copy link
Member

Julian commented Feb 8, 2013

It has become part of git itself a few versions ago.
http://git-scm.com/book/ch6-7.html is a good resource.

@gazpachoking
Copy link
Contributor Author

Yeah, I saw that one too. That appears to be subtree merge strategy, which is different than the subtree command. I can't find any official docs on subtree command other than on the git repo I linked. If I understand correctly, if doing a --squash with either method, it's no different than just copying the latest version of the files in any other manner and committing. As for pushing changes the other direction.. haven't quite figured that out yet.

@gazpachoking
Copy link
Contributor Author

Maybe this is a good workflow? http://psionides.eu/2010/02/04/sharing-code-between-projects-with-git-subtree/ Have you already done a push back to the test suite from changes in here? If so, how did that work? (I didn't see that part on the git book page you linked.) I suspect you would switch to the test suite branch and do another read-tree to get the changes commited to the main jsonschema repo? It seems like splitting the commits to the subdir for pushing back is the main area where using the new subtree command would help.

@gazpachoking
Copy link
Contributor Author

So, here is my quick and dirty fix for giving context to our ref resolver. It feels we should use a cleaner way, and it also feels like there should be some problems with it. It passes all of the tests, including the definitions tests I added, which just makes me think we need some more comprehensive $ref tests.

@@ -359,8 +359,10 @@

     def validate_ref(self, ref, instance, schema):
         resolved = self.resolver.resolve(ref)
+        self.resolver.referrer = resolved
         for error in self.iter_errors(instance, resolved):
             yield error
+        self.resolver.referrer = self.schema

@gazpachoking
Copy link
Contributor Author

Ahh yes, the issue with my previous $ref solution would be refs to # inside another referenced section in the same schema. With that hack, the context would change when resolving the inner $ref, even though # should point to the root of the whole schema for both.

I want to add some tests for this, it'd be nice if we can merge the current draft4 tests into the suite first, so a new pull request can have updates for both.

@Julian
Copy link
Member

Julian commented Feb 17, 2013

Looks like you're right about what I did just being the merge strategy, I was a bit careless.

Maybe we'll consider actual subtree later. For now though, closing this since we'll hopefully get this done in #59.

@Julian Julian closed this as completed Feb 17, 2013
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Enhancement Some new desired functionality
Projects
None yet
Development

No branches or pull requests

3 participants