Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a context manager to RefResolver #60

Closed

Conversation

gazpachoking
Copy link
Contributor

@Julian Still need to come up with some more tests, and fix one current test to work with the new method to make sure this is working as intended. Just wanted to get your thoughts on this method of handling nested $ref resolution before proceeding.

@Julian
Copy link
Member

Julian commented Feb 14, 2013

I'm confused, I've been slightly busy this week so haven't been following so closely, but what problem is this aiming to solve?

@gazpachoking
Copy link
Contributor Author

So, for example, in these tests, (which currently fail,) we are validating a schema with a $ref to another schema. This schema in turn has $refs to itself (#), currently, when evaluating those $refs, our resolver still looks them up in the context of our original schema, this branch keeps track of the proper context for which we should be resolving the $refs.

Did that make sense? It's a bit confusing.

@gazpachoking
Copy link
Contributor Author

I also have remote URIs adding to the RefResolver store in this branch. That has nothing to do with this change and I didn't intend to include it.

@Julian
Copy link
Member

Julian commented Feb 14, 2013

I specifically avoided the latter. Caching is good and we should have one way to make a ref resolver that caches, but I'm not sure I like it as the default. Actually I guess now thinking again I do like it, but there should be a cache_remote=False flag or the like in the constructor to turn it off.

@Julian
Copy link
Member

Julian commented Feb 14, 2013

As for the main bug here, I have to think about this a bit more but doesn't it just sound like the referring document should be provided to resolve, not init?

@gazpachoking
Copy link
Contributor Author

doesn't it just sound like the referring document should be provided to resolve, not init?

Yeah, I thought about that, we don't necessarily have access to it from validate_ref though, as the schema argument may be a subschema in the document, whereas we need the whole document to resolve refs from.

@Julian
Copy link
Member

Julian commented Feb 14, 2013

Ah, OK.

How about removing it from __init__ then and having something like:

with resolver.enter_context(schema):
    resolver.resolve(whatever)

which is fairly close to what you have I guess. And then having refresolver raise an error if no context is set during a resolve where one is needed.

@Julian
Copy link
Member

Julian commented Feb 14, 2013

Or maybe in_context is a better name.

@gazpachoking
Copy link
Contributor Author

That could be better, we still have the issue of not necessarily having the proper context available from validate_ref though. The schema argument passed to validate_ref may be a subschema, where as the proper resolution context should be the whole schema.

@Julian
Copy link
Member

Julian commented Feb 14, 2013

Oh it should be the whole schema? then just always use self.schema no?

@gazpachoking
Copy link
Contributor Author

then just always use self.schema no?

How about when we are inside a remote $ref though? In that case the context needs to be the base of the remote schema.

@Julian
Copy link
Member

Julian commented Feb 14, 2013

In which case resolve_remote would call in_context as well. I'm probably still not thinking clearly here?

@Julian
Copy link
Member

Julian commented Feb 14, 2013

Probably going to have to fiddle with this later to see where the problems crop up

@gazpachoking
Copy link
Contributor Author

Hmm, you could be right, let me see what I can come up with.

@gazpachoking
Copy link
Contributor Author

Yeah, I'm not coming up with any better ways where we track the context outside of RefResolver. RefResolver is the one who needs to calculate the new context, so it seems like we shouldn't be worried about it externally. If we set the current context from outside we'd end up with something like:

context = resolver.get_context(ref) 
resolved = resolver.resolve(ref)
with resolver.in_context(context):
    # Do the validation against resolved

Which doesn't feel any better to me. Maybe I'm missing something, see if you come up with something better while playing around.

@gazpachoking
Copy link
Contributor Author

Also, even with my above idea we'd still need to set the initial context in RefResolver.init I don't know where we'd put the with resolver.in_context(self.schema): that it wraps the initial validation but doesn't reset the context to self.schema when validating subschemas.

@gazpachoking
Copy link
Contributor Author

@Julian I guess the crux of the matter is this:

  • We must ask the RefResolver what the new resolution context will be for given document lookup.
  • This new context may not be the same as the resolved document, if the reference had a fragment.
  • We need to change the resolution context after the current document lookup, for any future lookups that occur while we are validating the referenced document.
  • We will not need to look up a context for a given ref without also needing to lookup the document that ref points to.

For these reasons, I picked to keep the current context entirely within RefResolver, and combine the context change with a document lookup. One thing is clear however, if there is not a better way to do this, the name resolve_context sucks for this method, as does my docstring for it.

@gazpachoking
Copy link
Contributor Author

Added a couple tests to the suite that should test the current problem. json-schema-org/JSON-Schema-Test-Suite#29

@gazpachoking
Copy link
Contributor Author

Made a separate PR for the remote caching stuff, I might have a better idea for the ref context stuff, I'll give it a try.

@gazpachoking
Copy link
Contributor Author

@Julian Okay, scrapped the old stuff, did it slightly differently. I think it's at least slightly better now, not sure about the method names still, especially context_and_document. Let me know what you think.

@gazpachoking
Copy link
Contributor Author

Maybe scrap context_and_document and add a refresolver.split_fragment, so that resolve_context, and resolve_fragment can be called directly from validate_ref... need to see how that works out.

@Julian
Copy link
Member

Julian commented Feb 17, 2013

I'm not neglecting this :P I'll try to give it some poking tonight.

@Julian Julian mentioned this pull request Feb 18, 2013
@gazpachoking
Copy link
Contributor Author

I think at least the method name is a bit better now.

@gazpachoking
Copy link
Contributor Author

Okay, I think I have a better idea. We move the store to be a class attribute of RefResolver, then add a from_uri method similar to from_schema method. Then we add a contextmanager to the Validator class to change self.resolver. validate_ref then becomes something like:

def validate_ref(self, ref, instance, schema):
    with self.with_resolver(self.resolver.from_uri(ref)):
        resolved = self.resolver.resolve(ref)
        for error in self.iter_errors(instance, resolved):
            yield error

@gazpachoking
Copy link
Contributor Author

I think I like that idea. We could also then add inline addressing support by scanning through the schema on __init__ for "id" keys and adding them to the store.

EDIT: Hmm, that might not be as easy as I thought, as there can be id keys in dicts which are not schemas. Well, that bit will need more thought.

@gazpachoking
Copy link
Contributor Author

Seems whatever solution we settle on needs to be expanded more than what I currently have here. I explained the issue in #66

@gazpachoking
Copy link
Contributor Author

Going to close this. We can discuss further in #66, I'll make another pull request once we have more tests going.

Julian added a commit that referenced this pull request Oct 5, 2014
3cef243 Applies to Draft4 as well.
f323723 Check comparisons for very negative numbers as well.
c73cf22 Merge pull request #60 from gilesbowkett/added-jsck-to-who-uses-it
e77c86a jsck uses JSON Schema Test Suite
1a96775 Merge pull request #57 from tanaka-tatsuya/fix/max-length-test-case
b124928 fixed maxLength test case

git-subtree-dir: json
git-subtree-split: 3cef243acce1290c8bbbc546f7f3fcbd6c10ff8b
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Enhancement Some new desired functionality
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants