Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Keywords alongside of $ref #523

Closed
handrews opened this issue Dec 27, 2017 · 26 comments · Fixed by #628
Closed

Keywords alongside of $ref #523

handrews opened this issue Dec 27, 2017 · 26 comments · Fixed by #628

Comments

@handrews
Copy link
Contributor

An effect of moving to a delegation model for $ref (see #514) is that there is no longer any need to forbid adjacent keywords. This was mentioned in passing in the first comment but I wanted to call it out to get clear agreement (without getting bogged down in the very long and nicely resolved #514).

The resolution to #514 allows implementations to do inclusions under certain circumstances. This can be made to work with adjacent keywords by simply stuffing the included target under an allOf:

{
    "$defs": {
        "foo": {"required": ["f"]}
    },
    "required": ["a", "b"],
    "$ref": {"#/$defs/foo"}
}

is equivalent to

{
    "required": ["a", "b"],
    "allOf": [{"required": ["f"]}]
}

If there is already an adjacent allOf, the included $ref is simply appended to the existing list.

Any objections? It's extremely common to see people put keywords beside $ref even with all documentation saying that it is not allowed, so we really should make it more intuitive.

@handrews handrews added this to the draft-08 milestone Dec 27, 2017
@gregsdennis
Copy link
Member

I wonder how this would interact with #515. I think I mentioned it as a solution somewhere in there. I'll try to find the specific comment and link to it later, but I'm on my phone for the moment.

@handrews
Copy link
Contributor Author

@gregsdennis it should be orthogonal. additionalProperties and additionalItems wouldn't be able to "see" through a $ref: they work within the same schema object, and $ref should not be special in that regard.

With $ref as delegation, it fits into the keyword taxonomy as an applicator keyword- specifically, it applies the schema at the target URI to the current instance location. This is why, when you want to inline the target schema, but there are adjacent keywords, you MUST wrap the inlined schema in an allOf.

Otherwise we're into schema merging territory which we will not be discussing here as that's what #515 is for. I'm serious about this, any discussion of topics belonging to 515 on this issue will lead to immediate deletion of the entire comment. I will probably cackle maniacally while doing so, in order to fit the mad dictator stereotype more fully. We don't need another 200+ comment issue, I filed this specifically to keep it separate.

@handrews
Copy link
Contributor Author

(it's OK to talk about whether something would put us in merge territory, it's just not OK to drag actual proposals related to merging or similar stuff over here)

@gregsdennis
Copy link
Member

gregsdennis commented Jan 16, 2018

I don't know if it's particularly an issue with this as a feature, but it would be fairly easy to create a contradictory schema.

For instance, if a $ref'd schema defines a maximum of 12, and the $ref has an adjacent minimum of 20, it creates a contradiction that will never validate any value.

Maybe this is something that we leave to the author to manage, but I figured it should still be mentioned.


Another thing that should be mentioned is that these refs could be internal or external. Internal refs are trivial, but I think we should account for the possibility of a $ref'd schema being of a different schema version, in which case simply housing it in an allOf would only work logically, not practically.

@handrews
Copy link
Contributor Author

@gregsdennis in general it is very easy to create nonsensical / contradictory schemas with JSON Schema. The scenario you mention exists today- just wrap the $ref in an allOf and it's exactly functionally identical to the proposed adjacent-keyword behavior.

The adjacent keyword behavior is not actually new at all, it just removes an unnecessary allOf.

Also, the whole point of $ref as delegation was in order to let it work as kind of a logical inline but not really a physical inline because of things like differing meta-schemas. So I'm not sure I'm getting your point in the second part- I thought we had an exhaustive discussion of that already.

@gregsdennis
Copy link
Member

So I'm not sure I'm getting your point in the second part- I thought we had an exhaustive discussion of that already.

I wasn't trying to bring in new discussion so much as clarify the equivalence in your opening comment.

@handrews
Copy link
Contributor Author

@gregsdennis gotcha. Yeah, that was perhaps not as clear as it should be.

@Relequestual
Copy link
Member

OOOOH. I hadn't realised the implication of making it clear that $ref was delegation...
This is brilliant and will solve a lot of questions (I hope).
I totaly think this is the right way to go!

@gregsdennis
Copy link
Member

gregsdennis commented Jan 27, 2018

I was thinking on this some more and I realized a neat side effect of using the sibling $ref syntax: it enables an author to create "abstract" schemas, that is, schemas that depend on certain definitions but don't define those definitions themselves, instead relying on extending schemas to define them.

The example that came to mind was generic classes. If we consider .Net's List<T>, it's essentially an array where all of the items are the same type, but we don't know what type that is. So a schema might look like this:

{
  "id" : "http://dotnetschemas.com/list-of-T",
  "type" : "array",
  "items" : { "$ref" : "#/definitions/T" }
}

By itself, this schema would cause an error as T is not defined. But if we then extend (inherit) it, we can define T to be whatever we want.

{
  "$ref" : "http://dotnetschemas.com/generic-list",
  "definitions" : {
    "T" : { "type" : "string" }
  }
}

Of course, this particular example would only apply to languages which support generics, but there are likely other "abstract"-type use cases.

Maybe... it depends on how the $ref is dereferenced.

@epoberezkin
Copy link
Member

epoberezkin commented Jan 29, 2018

The problem is that "$ref" : "#/definitions/T" will be resolved in the context of the first schema (i.e. won't be resolved at all).

@Relequestual
Copy link
Member

Yeah, @epoberezkin is right @gregsdennis, that wouldn't work.
We shouldn't (at least not right now), but if we were to change $ref to include the same syntax suggested by the annotation collection key format shown in examples in #530, then it COULD in theory be possible.

@handrews
Copy link
Contributor Author

if we were to change $ref to include the same syntax suggested by the annotation collection key format shown in examples in #530, then it COULD in theory be possible.

This is actually how Doca's cfRecurse works: It is like $ref except that instead of a URI it takes a JSON Pointer which is lazily evaluated across the post-$ref-resolution in-memory data structure (2nd phase lazy evaluation? Lazy-lazy evaluation?). It allows re-using recursive structures like meta-schemas very easily, including the dreaded API response envelope :-D

{
    "$schema": "https://example.com/schemas/doca#",
    "id": "https://example.com/schemas/envelope",
    "type": "object",
    "properties": {
        "success": {"type": "boolean"},
        "errors": {"type": "array"},
        "result": {"cfRecurse": ""}
    }
}

and then...

{
    "$schema": "http://json-schema.org/draft-04/hyper-schema#",
    "id": "https://example.com/schemas/foo",
    "type": "object",
    "properties": {
        "id": {"type": "integer", "minimum": 1},
        "this": {"type": "number"},
        "that": {"type": "string"}
    },
    "links": [
        {
            "rel": "self",
            "href": "foos/{id}",
            "method": "GET",
            "targetSchema": {"$ref": "https://example.com/schemas/envelope"}
        },
        {
            "rel": "self",
            "href": "foos/{id}",
            "method": "PUT",
            "schema": {"$ref": "#"},
            "targetSchema": {"$ref": "https://example.com/schemas/envelope"}
        }
    ]
}

For the targetSchema values, first the $ref to the envelope is de-referenced in the context of the "foo" schema, and then when the cfRecurse is evaluated it points back to the root of that post-$ref-resolution "foo" schema (the entry point of overall processing) instead of the root of the envelope schema file (the entry point of the current file).

@gregsdennis
Copy link
Member

Okay, well, I don't want to get us off-topic, but it was a thought I had.

@Relequestual Relequestual self-assigned this Feb 12, 2018
@Relequestual
Copy link
Member

I'm assinging this to myself, as I feel it will follow on from #515 .

I may actually look to resolve both of these issues with one PR, as I feel they are tightly coupled.
I have an idea how I'd like to phrase this change.

@gregsdennis
Copy link
Member

gregsdennis commented Jun 15, 2018

@handrews If there is already an adjacent allOf, the included $ref is simply appended to the existing list.

EDIT: I just realized this isn't what you were talking about, but the below is still important to the issue.

I think that it's probably important to define the "merging" behavior for each of the keywords. This is a first attempt:

  • Explicit-value keywords (e.g. minimum) would overwrite. Simple.
  • Array-typed keywords (e.g. *Of) would use concatenation. Fairly easy.
  • Object-typed keywords (e.g. properties / definitions) would use an overwriting merge, where new properties are concatenated, but conflicts overwrite the value in the referenced schema. This could produce some weirdness.

You could gather annotations from both the current and referenced schemas, so that wouldn't be an issue.


What do references in the child (not sub-) schema look like? If I "inherit" from draft 7, and I have a property with a reference to #/definitions/nonNegativeInteger, does that resolve properly, or do I need #/$ref/definitions/nonNegativeInteger (goodness, that's ugly)?

@handrews
Copy link
Contributor Author

@gregsdennis There is absolutely no merging behavior.

$ref is an in-place applicator that has the same results as evaluating its target schema. That is all.

@handrews
Copy link
Contributor Author

handrews commented Jun 15, 2018

[EDIT: I was confusing this issue with the $recursive PR, so this didn't make sense, sorry]*

@gregsdennis
Copy link
Member

gregsdennis commented Jun 15, 2018

Well, if we're going to allow $ref adjacent to other keywords, then we need to define the behavior of those keywords relative to the same keyword present in the $ref'd schema.

Here's an example that attempts to consider the different scenarios we might encounter:

Schema A

{
  "$id":"http://schema/A",
  "definitions":{
    "arrayMinLength5":{
      "type":"array",
      "minItems":5
    }
  },
  "properties":{
    "justAString":{
      "type":"string"
    },
    "justAnInteger":{
      "type":"integer"
    }
  },
  "allOf":[
    {"required":["A"]},
    {"required":["B"]}
  ],
  "minimum":5
}

Schema B

{
  "$id":"http://schema/B",
  "$ref":"http://schema/A",
  "properties":{
    "justAString":{                               // 1
      "type":"string",
      "minLength":10
    },
    "justAnInteger":{                             // 2
      "maximum":10
    },
    "newInB":{                                    // 3
      "$ref":"#/definitions/arrayMinLength5"      // 4
    }
  },
  "allOf":[
    {"required":["C"]}                            // 5
    {"required":["A"]}                            // 6
  ],
  "minimum":10                                    // 7
}
  1. justAString is redefined by adding new constraints. The new property overwrites its A counterpart.
  2. justAnInteger is redefined but drops a previous constraint. Is this still a strict redefinition? (I think it should be. I think this is what you mean by "no merging.")
  3. newInB is a new property in B. It's new, so it's just appended to the list of defined properties.
  4. Since the definitions are "imported" they can be referenced as normal.
  5. The requirement for the C property is new, but since allOf is an array-valued property this is fine.
  6. The repetition of a requirement from the base allOf doesn't matter; it's like saying true && true.
  7. minimum is an explicit-value keyword, so the behavior here is to just overwrite.

Hypothetical "combined" schema

{
  "$id":"http://schema/B2",
  "definitions":{
    "arrayMinLength5":{
      "type":"array",
      "minItems":5
    }
  },
  "properties":{
    "justAString":{
      "type":"string",
      "minLength":10
    },
    "newInB":{
      "type":"integer"
    }
  },
  "allOf":[
    {"required":["A"]},
    {"required":["B"]},
    {"required":["C"]}
  ],
  "minimum":10
}

There might be other scenarios, but I think that covers the basics.

@handrews
Copy link
Contributor Author

Also, let's call things "extension" schemas, even though I'm not thrilled with the term, since "child schema" is frequently used to mean a specific subschema that is an immediate child of the current schema object.

@handrews
Copy link
Contributor Author

@gregsdennis no, there is no need to do anything of the sort. There is no interaction. At all. None.

$ref becomes a totally normal keyword. All keywords in a schema object have their assertion results ANDed, and their annotation results combined according to the rules of each annotation.

These schemas behave completely identically:

{
    "type": "object",
    "properties": {"foo": true},
    "$ref": "#/$defs/moreProperties"
}

and

{
    "type": "object",
    "properties": {"foo": true},
    "allOf": [{"$ref": "#/$defs/moreProperties"}]
}

This, in fact, becomes the way to "dereference" a $ref with adjacent keywords: Stuff it under allOf (by adding an allOf or appending an entry to an existing allOf) and then replace the $ref with its target. Originally the reason that $ref forbade adjacent keywords was because its internal model was replacement.

With #514 we changed that internal model to delegation- it's just a normal keyword, and its results get combined with all the other keywords just as they are combined with each other.

There is no visibility or interaction through $ref.

@gregsdennis
Copy link
Member

gregsdennis commented Jun 15, 2018

If that's the case, then schema authors will have to be very careful to not write contradictory schemas.

I'm concerned that this is just a shorthand for placing the $ref in an allOf. The nuance is one of inheritance (my example) vs encapsulation (your example). They are two very different solutions to similar problems in object-oriented programming.

I feel that having $ref alongside other keywords implies inheritance, whereas inclusion in an allOf implies encapsulation.

@handrews
Copy link
Contributor Author

@gregsdennis I appreciate your persistence on this because it's helping show where people might find the new approach confusing.

If that's the case, then schema authors will have to be very careful to not write contradictory schemas.

This exact problem exists now with allOf, so I'm not sure why it's suddenly a lot harder when there's no allOf involved.

I feel that having $ref alongside other keywords implies inheritance, whereas inclusion in an allOf implies encapsulation.

This is an assumption that I'm not sure is broadly held. We've been pretty clear that JSON Schema is not OO (whether anyone has listened is debatable, which is a concern). Even the $recursiveRef situation is not OO extension, which is why I'm trying to find another name for it. It's a dynamic reference target, but once the reference target is determined, the reference behaves like $ref.

I think the appropriate approach here is that @Relequestual needs to put a very clear statement of the intended usage in PR #585. I have asked for the allOf transformation equivalence to be included in that PR, because I think it makes the point very clearly.

@gregsdennis
Copy link
Member

Then what's the benefit of this feature? How is pulling a $ref out of an allOf an improvement if the semantics are the same? To me, the intent is much clearer when it's still inside the allOf, and I'd personally still use that syntax, especially if there's already an allOf in the schema. There's no new functionality here.

Having it alongside other keywords implies a different intent (to me that intent is inheritance). A new usage of a keyword should imply new functionality that can't be achieved using the old mechanism.

I think that an inheritance-style extension is something that is worth supporting, especially since there are many OO language using JSON Schema. This doesn't require JSON Schema to be OO, however. This has also been pushed by other issues through new keyword proposals like $extend and $merge/$patch.

The desire for the functionality is present, and having $ref in an allOf versus alongside other keywords could facilitate that distinction.

@gregsdennis
Copy link
Member

From the opening comment:

It's extremely common to see people put keywords beside $ref even with all documentation saying that it is not allowed

Shouldn't we strive to understand why people are doing this and what the expected intent is?

@handrews
Copy link
Contributor Author

handrews commented Jun 16, 2018

Then what's the benefit of this feature? How is pulling a $ref out of an allOf an improvement if the semantics are the same?

Because people do it anyway and get confused that it doesn't work. There's even a whole school of thought as shown by several implementations and ambiguous test cases in the test suite that maintains that only adjacent assertions are a problem, and adjacent annotations are not. Without any clear concept of what that means. See json-schema-org/JSON-Schema-Test-Suite#197 and json-schema-org/JSON-Schema-Test-Suite#198.

This is not a good situation, and requires clarification. As discussed in #514, we feel that the best approach is to normalize $ref instead of having it be this weird exception with unintuitive special behavior.

Having it alongside other keywords implies

We will document its purpose explicitly. I am quite confident that not everyone sees it as inheritance, haven spoken with a great many people about this topic. We definitely need to be clear about it, but the solution to people making assumptions is to explicitly state which common/likely assumptions are incorrect, and why.


Regarding make JSON Schema more OO-friendly, please do not derail this issue, which is only about $ref behavior as a result specifically of the decision in #514 to switch from replacement to delegation, within this enormous topic.

Typically, OO analogies are the result of using JSON Schema to generate code. The proposed code generation vocabulary (or rather, the idea of proposing one, it's not there yet) is the appropriate place to address this. Which, like pretty much everything right now, depends on getting vocabulary support solidified in draft-08.

$merge/$patch has been resoundingly rejected after 500+ comments of discussion, and we will not revisit it. $extends failed to garner a single positive vote in the vote-a-rama related to the $merge discussion.

We decided to go with unevaluatedProperties (for the additionalProperties: false problems) plus formalizing annotation collection (to allow various flexible mechanisms for overriding annotations). These were the two main use cases driving all of those other keywords. Unless someone finds a completely fatal problem in those approaches preventing us from continuing, those decisions stand.

I spent much of the last year exploring every option, use case, and proposal in this area, and scouring the community for people to participate in the discussion. And then doing my absolute best to let everyone have their say, despite some people really not being willing to do anything but throw bombs. I have no desire to do all of that again, as nothing has changed.

@handrews
Copy link
Contributor Author

Shouldn't we strive to understand why people are doing this and what the expected intent is?

People don't read documentation. That seems to be the main thing. Otherwise, the main use case was overriding annotations (because yes, I did a metric f^&*ton of striving here).

Whether or not something was OO-style inheritance has always been equally confusing for people with or without keywords adjacent to $ref. allOf definitely does not make people less likely to assume inheritance-like behavior, so that concern is orthogonal to this issue.

Relequestual added a commit to Relequestual/json-schema-spec that referenced this issue Jun 28, 2018
@ghost ghost removed the Status: In Progress label Jun 30, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment