allOf with object schemas and additionalProperties = false #116
Comments
The desired results make sense, but the logic validators would have to implement sounds horrible. Would be nice to have some way to achieve these results though. |
It doesn't feel like something that would be too hard. I was planning on having a go at forking Francis Galiegue's Java validator to see. I'm currently working professionally on an application that uses JSON in ajax requests (pretty common), and uses the JSON schema to validate the requests (also not uncommon, I suspect). However, right now we have no alternative but to cut'n'paste common bits of schemas, because our risk control chaps are, quite understandably, against permitting schemas that can't flag unexpected content as incorrect. We've talked about writing a preprocessor to do the merging, but that feels like an admission of defeat to me. |
The solution I like best is actually "ban unknown properties mode". This would be a mode/config for the validator, where the presence of an undocumented property (one not accounted for by any of the schemas) is raised as an error. The idea is that during development you could throw a wobbly at any unexpected input, during production you could log a warning but continue, or whatever - all using the same schema. It is also very simple to implement - simply maintain two sets ("properties I have seen", "properties I have validated") and compare the two at the end, after all schema validation. At least one validator (tv4, my one) already supports this, and it will likely be documented explicitly in v5. If you're modifying Francis's Java validator, then it might be a better move to add support for "banUnknownProperties mode" instead. |
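The two-set check described above can be sketched as follows (a toy, assuming the validator only needs to walk `properties` and the `allOf`/`anyOf`/`oneOf` combinators; `unknown_properties` and its helper are invented names, not tv4's code):

```python
# Sketch of the two-set "ban unknown properties" check described above.
# A real validator would track which subschemas actually applied; this toy
# simply walks "properties" and the combinator keywords.

def collect_known(schema, known):
    """Record every property name any subschema accounts for."""
    if not isinstance(schema, dict):
        return
    known.update(schema.get("properties", {}).keys())
    for comb in ("allOf", "anyOf", "oneOf"):
        for sub in schema.get(comb, []):
            collect_known(sub, known)

def unknown_properties(schema, instance):
    seen = set(instance.keys())   # "properties I have seen"
    validated = set()             # "properties I have validated"
    collect_known(schema, validated)
    return seen - validated       # compare the two sets at the end

schema = {
    "properties": {"a": {}},
    "allOf": [{"properties": {"b": {}}}],
}
print(unknown_properties(schema, {"a": 1, "b": 2, "c": 3}))  # {'c'}
```

Because the check runs after all schema validation, it never changes whether any individual subschema passes; it only adds a final report of leftover names.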
That doesn't solve the problem. Schema "patching" still seems like the best solution to me, especially since JSON Patch is an RFC. |
@fge - What case doesn't it solve? The idea is that the rule "no properties I don't understand" is set at the validator level, and applied after all schema validation. If a property is accounted for by any schema (inside an `allOf`, `anyOf`, or anywhere else), it is not flagged. |
@geraintluff the problem here is that you define this at the validator level, and this is information unavailable in the schema itself. Another party may choose to use the schema but not this validation mode. What is more, it is uncomfortably tied to JSON Object instances only. Compare that with my approach, which does not suffer any of these problems ;) Which is why I proposed it in the first place. |
Patching/merging is basically generating schemas at runtime, which (as well as being conceptually complicated and fairly inelegant) causes problems if you want to reference parts of the generated schema. And yes - "ban unknown properties" is defined at the validator level. But the requirement always seems to be "I don't care which schema accounts for which properties, but I want to know they're all covered" - and this handles that fairly well. (Also, we're talking about typo'd property names - I reckon it's fair to limit the discussion to objects). |
Given these two JSON objects:

```json
{
    "prop1": 1,
    "prop2": 2
}
```

```json
{
    "prop1": 1,
    "prop2": "2"
}
```

It seems like I should be able to use the following schema to represent this:

```json
{
    "type": "object",
    "additionalProperties": false,
    "properties": {
        "prop1": { "type": "number" }
    },
    "required": ["prop1"],
    "oneOf": [
        {
            "properties": {
                "prop2": { "type": "number" }
            },
            "required": ["prop2"]
        },
        {
            "properties": {
                "prop2": { "type": "string" }
            },
            "required": ["prop2"]
        }
    ]
}
```

According to version 4 of the spec, though, nothing can ever validate against this schema, since the top-level `"additionalProperties": false` rejects `prop2` regardless of which `oneOf` branch describes it. The "ban unknown properties" mode can partly fix this issue, but what if you want to embed the above schema into another one where you don't want the "ban unknown properties" mode to apply? There doesn't seem to be a way to do that. |
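A minimal illustration of why the schema above can never pass. This is not a real validator; `extra_property_errors` is an invented helper that applies only the draft-4 `additionalProperties` rule, which looks exclusively at its sibling `properties`/`patternProperties`:

```python
import re

# Toy check for the draft-4 rule: when "additionalProperties" is false,
# any instance key not matched by the *sibling* "properties" or
# "patternProperties" is an error. Subschemas in "oneOf"/"allOf" are
# deliberately ignored, exactly as the spec requires.

def extra_property_errors(schema, instance):
    if schema.get("additionalProperties", True) is not False:
        return set()
    allowed = set(schema.get("properties", {}))
    patterns = list(schema.get("patternProperties", {}))
    return {
        key for key in instance
        if key not in allowed
        and not any(re.search(p, key) for p in patterns)
    }

schema = {
    "type": "object",
    "additionalProperties": False,
    "properties": {"prop1": {"type": "number"}},
    "oneOf": [  # invisible to additionalProperties in draft 4
        {"properties": {"prop2": {"type": "number"}}},
        {"properties": {"prop2": {"type": "string"}}},
    ],
}
# "prop2" is flagged even though both oneOf branches describe it:
print(extra_property_errors(schema, {"prop1": 1, "prop2": 2}))  # {'prop2'}
```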
I heard someone mention elsewhere that the spec may already cover this. See section 5.5.2.1 of http://json-schema.org/latest/json-schema-validation.html#anchor79 |
This would mean "hacking" {one,all,any}Of to behave differently depending on whether you validate JSON Objects or not. And that is the whole problem. @geraintluff's solution is to allow two validation modes - for objects only. And that is also a problem to me: it is not a behaviour enforced by the schema itself. Sure enough, my solution requires generating schemas at runtime, but honestly I don't see the problem. After all, if you use references, you are already resolving schemas at runtime. I still think my idea is the better one because it presents no ambiguity ;) |
Thanks for the responses - I acknowledge that this is seen as a concern. As I read it, there are three possible solutions proposed:

1. Alter the spec so that `additionalProperties` interacts with `allOf` (collecting properties across all the schemas before applying the restriction).
2. A validator-level "ban unknown properties" mode.
3. A preprocessor that patches/merges schemas before validation.
So laying out my store, as it were: I can see the attraction of 2, because it means that individual validator implementations can make up their own minds whether to support this. But it also means that a schema could validate successfully or unsuccessfully depending on a setting on the validator (which may or may not be available). This feels like a bad thing.

Going the preprocessor route has similar challenges, I think: the attraction is that we don't have to change the spec, but the downside is that we lose the certainties that the spec brings. It also seems to me that preprocessing is actually just doing the same thing as 1, while avoiding explicitly altering the spec.

Meanwhile, I totally understand that it is a big deal to break the symmetry afforded by the simple definition of 'allOf' - all schemas must validate, no exceptions. However, there's clearly a need to do something here - and at least there's no risk of anyone relying on the extant behaviour, because as it stands, including more than one object schema with "additionalProperties": false in an 'allOf' clause would always fail.

Incidentally, this topic is very much related to the 'extends' keyword that was dropped from version 3 of the schema. I believe, and correct me if I'm wrong, that this was done to avoid polymorphism, in which agreeing to validate a base schema implicitly commits to validating any schema that derives from it (bad). However, in practice, notwithstanding the risks of using 'mixin types' in data modelling, the alternative often means a lot of duplicated code that must be kept in sync.

Another way round this, without altering the spec, that's been suggested to me, would be to insist on aggregation instead of mixing in: rather than pulling the base schema's properties in alongside our own, the base schema becomes the value of a named property. Logically, this amounts to almost the same thing - the difference is if there are properties that overlap, which would necessarily appear twice in the second schema, and once in the first.
However, there are two reasons to prefer the first:
|
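A hypothetical illustration of the mixin-versus-aggregation contrast discussed above (the `address` base schema and all property names are invented; schemas are written as Python dicts):

```python
# Invented example: a "base" schema reused by a derived schema in the
# two styles contrasted above. Names are illustrative only.

base = {"properties": {"street": {}, "city": {}}}

# Mixing in: the base schema's properties sit alongside the derived
# schema's own, at the same level, via allOf.
mixin = {
    "allOf": [base],
    "properties": {"name": {}},
}

# Aggregation: the base schema appears as the value of a single named
# property, so the base's property names never collide with the
# derived schema's.
aggregate = {
    "properties": {"name": {}, "address": base},
}
```

With aggregation, an instance must nest the base fields under `"address"`, which is why the two forms are only *almost* the same thing.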
OK, I have another proposal. That would be a new keyword, `propertiesFrom`:

```json
{
    "type": "object",
    "propertiesFrom": { "$ref": "link://to/other/schema#" },
    "properties": { "other": { "properties": "here" } },
    "additionalProperties": false
}
```

The value of `propertiesFrom` would be a schema whose `properties` members count as if they were defined in this schema, so that `additionalProperties` takes them into account. Another possible keyword could do the same for `patternProperties`. Comments? |
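One way a validator might expand the proposed `propertiesFrom` keyword, sketched under the assumption that the referenced schema's `properties` are simply folded into the local ones before `additionalProperties` is applied (the `SCHEMAS` lookup table and `expand_properties_from` are invented stand-ins for real `$ref` resolution):

```python
# Hypothetical expansion of the proposed "propertiesFrom" keyword.
# A lookup table fakes $ref resolution for this sketch.

SCHEMAS = {
    "link://to/other/schema#": {"properties": {"base1": {}, "base2": {}}},
}

def expand_properties_from(schema):
    source = schema.pop("propertiesFrom", None)
    if source is None:
        return schema
    referenced = SCHEMAS[source["$ref"]]
    merged = dict(referenced.get("properties", {}))
    merged.update(schema.get("properties", {}))  # local declarations win
    schema["properties"] = merged
    return schema

schema = {
    "type": "object",
    "propertiesFrom": {"$ref": "link://to/other/schema#"},
    "properties": {"other": {}},
    "additionalProperties": False,
}
print(sorted(expand_properties_from(schema)["properties"]))
# ['base1', 'base2', 'other']
```

After expansion, the sibling `"additionalProperties": false` naturally accounts for `base1` and `base2`, which is the whole point of the keyword.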
That would work. The negative is that it introduces a couple more keywords into the standard. So, assuming we go for the maxim "less is more", what creates more complexity: asymmetry in the logic for 'allOf', or extra keywords? I'd argue that since 'allOf', when used with object schemas with additionalProperties set to false, has little utility, the equation sits on the side of changing the meaning of 'allOf' for objects. However, that said, I'd support anything that presents a workable solution. |
Any progress on this? |
@wking well, since then I have taken another route; and I strongly disagree with the "ban unknown properties" mode, it's a hack (note that it cannot be enforced at the schema level). My proposal is instead to implement schema patching via either "$merge" or "$patch"; please see the Google group for more information, I have posted threads recently about this. |
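A sketch of what `$merge` might do, assuming it builds on JSON Merge Patch (RFC 7386) semantics as those threads discuss; `merge_patch` here is a toy implementation written for this illustration, not the proposal's actual code:

```python
# Toy JSON Merge Patch (RFC 7386): merge the "with" value into the
# "source" schema; an explicit null deletes a member.

def merge_patch(target, patch):
    if not isinstance(patch, dict):
        return patch  # non-object patches replace the target outright
    result = dict(target) if isinstance(target, dict) else {}
    for key, value in patch.items():
        if value is None:
            result.pop(key, None)  # null means "remove this member"
        else:
            result[key] = merge_patch(result.get(key, {}), value)
    return result

source = {"properties": {"a": {}}, "additionalProperties": False}
with_ = {"properties": {"b": {}}, "additionalProperties": None}
print(merge_patch(source, with_))
# {'properties': {'a': {}, 'b': {}}}
```

The example shows the appeal for this issue: the derived schema can add `b` *and* lift the base's `additionalProperties` restriction, producing one plain schema before validation ever starts.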
On Thu, Jan 08, 2015 at 12:10:39PM -0800, Francis Galiegue wrote:
Links to recent threads about $merge and $patch: |
And the thread kicking off $merge and $patch seems to be 1, although |
I just got hit by this problem and I'm not a big fan of trying to stick this behaviour onto the existing keywords. I first thought of merge before reading the mailing list, but done this way, with `$merge` taking an array of schemas, we'll be able to merge more than two schemas, which may have its uses. It also looks a lot like `allOf`, but clear / explicit / readable (I personally, as a newcomer to JSON Schema, don't understand why the "array / item" form would do the trick for now - I'll have to read the doc again ^-^). |
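A sketch of merging a whole list of schemas, assuming a hypothetical array-valued `$merge`; the shallow folding rules here (object-valued keywords are combined, list-valued ones concatenated, everything else overwritten) are invented for illustration:

```python
# Hypothetical fold for an array-valued "$merge": later schemas in the
# list are merged onto the accumulated result.

def merge_all(schemas):
    result = {}
    for schema in schemas:
        for key, value in schema.items():
            if isinstance(value, dict) and isinstance(result.get(key), dict):
                result[key] = {**result[key], **value}      # combine objects
            elif isinstance(value, list) and isinstance(result.get(key), list):
                result[key] = result[key] + value           # concatenate lists
            else:
                result[key] = value                         # overwrite
    return result

merged = merge_all([
    {"properties": {"a": {}}, "required": ["a"]},
    {"properties": {"b": {}}, "required": ["b"]},
    {"additionalProperties": False},
])
print(merged["required"])  # ['a', 'b']
```

Unlike `allOf`, the result is a single schema, so the final `"additionalProperties": false` sees every merged property.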
@dawidcha I agree that the current definition of additionalProperties in the standard causes considerable confusion. See, for example, this recent issue about it. The problem is that we try to assign language semantics to rules that are defined in a very strict, shallow way (i.e. not affected by anything but siblings), and in doing so we confuse ourselves.

I nevertheless think that bending the standard to solve the confusion problem is an absolutely terrible idea. There are a lot of applications and schemas relying on this strictly defined behaviour, regardless of how intuitive it is, and making the same keywords mean different things in the next version of the standard would make migration very difficult for a lot of people. So I think all currently defined keywords should keep their meaning, and the confusion problem should be solved with education - there are plenty of materials around. I think even removing keywords is a bad idea, but changing them is just terrible...

We can extend the standard to allow it to solve the problems that are currently difficult to solve. I think the "unknown properties" concept proposed by @geraintluff can be a good idea, although I have yet to see a case where the schema cannot be refactored to achieve what you want without it. At the same time, refactoring often hides the meaning of the schema, which is not a good thing. I agree with @fge that this being a "mode of validation" rather than part of the schema makes it a bit strange and almost out of the scope of the standard.

I think we could introduce the keyword "unknownProperties", which can only be boolean (no need to make it accept a schema, I think :), and would take into account all properties defined in properties/patternProperties/patternGroups, both on the same level and in all subschemas in allOf, anyOf, oneOf, and maybe even the "not" and "switch" keywords.

The downside is that this approach would require using the keyword on every level where such a restriction is necessary. The upside is that it would allow flexibility, while a validation mode won't. Validators could still implement it as a mode, as z-schema did with additionalProperties and many other keywords, but that would be outside of the standard. The complexity comes from the question of what to do with properties defined in $refs that are used inside allOf etc. @geraintluff, does your current implementation of "unknown properties" take properties from refs into account? What if they are recursive? All this complexity makes me love the current shallowness of the standard and holds me back from implementing unknown properties in Ajv. |
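The `$ref` complexity raised here can be illustrated with a toy collector that follows local `#/definitions/...` refs and keeps a visited set so recursive schemas terminate (hypothetical code written for this sketch, not Ajv's implementation):

```python
# Toy "known properties" collector: walks properties plus the
# combinator keywords, resolving local $refs against the root schema.
# The visited set stops infinite recursion on self-referencing schemas;
# a ref visited earlier already contributed its names, so returning an
# empty set the second time keeps the union correct.

def known_properties(schema, root, visited=None):
    visited = visited if visited is not None else set()
    if not isinstance(schema, dict):
        return set()
    ref = schema.get("$ref")
    if ref is not None:
        if ref in visited:  # recursive ref: stop here
            return set()
        visited.add(ref)
        target = root
        for part in ref.lstrip("#/").split("/"):
            target = target[part]
        return known_properties(target, root, visited)
    names = set(schema.get("properties", {}))
    for comb in ("allOf", "anyOf", "oneOf"):
        for sub in schema.get(comb, []):
            names |= known_properties(sub, root, visited)
    return names

root = {
    "definitions": {"node": {
        "properties": {"value": {}},
        "allOf": [{"$ref": "#/definitions/node"}],  # recursive schema
    }},
    "allOf": [{"$ref": "#/definitions/node"}, {"properties": {"id": {}}}],
}
print(sorted(known_properties(root, root)))  # ['id', 'value']
```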
OK, I am not an active member of the process anymore, but might I just mention that my patch proposal would just solve all of these miseries? |
I missed those ideas. It works, I think; it can probably be simpler. |
+1 for |
@dawidcha There has been further discussion of this general topic here: https://groups.google.com/forum/#!forum/json-schema You can join the current work at https://github.com/json-schema-org/json-schema-spec if you'd like to continue discussing this. |
cc: @domoritz -- I don't understand why the output for this one correctly validates the examples while the version using `&` is wrong. Maybe this is a bug.
As I understand it, the pattern for reusing base schema definitions across multiple schemas is to use 'allOf' to include the base schemas that we're mixing in. If this is to validate successfully, all such base schemas must have additionalProperties set to true (or omitted), which means that they, and the derived schema, have no way of limiting the allowed property set.
Since I can't actually think of any case where using allOf on several object schemas with additionalProperties set to false would make sense, I'd like to propose that the draft spec be updated to say that when using allOf with object schemas, the additionalProperties setting is treated as 'true' for the purposes of validating each subschema, but if all schemas in the allOf list actually have additionalProperties set to false, then the derived schema inherits that setting.
Did that make any sense?
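Under one reading of the proposal above, validation would ignore each member's `"additionalProperties": false` while checking that member, and only afterwards, if *every* member set it to false, check the instance against the union of the members' properties. A hypothetical sketch (`all_of_extras` and its semantics are my interpretation, not spec text):

```python
# Sketch of the proposed allOf behaviour: additionalProperties is
# treated as true per-member; the ban is only inherited (and applied
# to the union of properties) when every member sets it to false.

def all_of_extras(members, instance):
    if not all(m.get("additionalProperties") is False for m in members):
        return set()  # at least one member is open: nothing is banned
    union = set()
    for member in members:
        union |= set(member.get("properties", {}))
    return set(instance) - union

members = [
    {"properties": {"a": {}}, "additionalProperties": False},
    {"properties": {"b": {}}, "additionalProperties": False},
]
print(all_of_extras(members, {"a": 1, "b": 2}))  # set()
print(all_of_extras(members, {"a": 1, "c": 3}))  # {'c'}
```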