support JSON Schema 2019-09 (formerly draft-08) #44

Closed
emiltin opened this issue Sep 23, 2019 · 33 comments

Comments

@emiltin

emiltin commented Sep 23, 2019

No description provided.

@emiltin
Author

emiltin commented Dec 2, 2019

@leh

leh commented Mar 7, 2021

Hey, are there any plans to support 2019-09-validation as well?

@davishmcclurg
Owner

Hey, are there any plans to support 2019-09-validation as well?

Supporting 2019-09 would include validation.


Looks like there's a new 2020-12 version as well.

@dbourguignon

👋 Any news on that?
It seems that the 2020-12 version is the one targeted by OpenAPI 3.1, so it would be interesting to support it here.

@davishmcclurg
Owner

👋 Any news on that?
It seems that the 2020-12 version is the one targeted by OpenAPI 3.1, so it would be interesting to support it here.

No news at the moment. I recently quit my job though so I may have more time to look into the changes.

@emiltin
Author

emiltin commented Apr 5, 2022

hi @davishmcclurg, thank you for a great gem! any news on supporting 2019-09 or 2020-12?

@kyrofa

kyrofa commented Nov 22, 2022

I'd love to see this as well. unevaluatedProperties opens up new possibilities.

@ekzobrain

@davishmcclurg I started working on this here: https://github.com/ekzo-dev/json_schemer/tree/feature/draft2019-09
I currently implemented only this core change https://json-schema.org/draft/2019-09/release-notes.html#keyword-changes:
$ref | changed | Other keywords are now allowed alongside of it

I also adjusted the test suite to run the draft2019-09 ref tests, but some of them still fail because other core changes need to be implemented as well.
I will continue this work step by step, and it would be great if you participated as well.

davishmcclurg added a commit that referenced this issue Jul 9, 2023
Lots of stuff here. The main goal is to support the newer JSON Schema
drafts (2020-12 and 2019-09) including output formats and annotations.
The biggest change is pulling individual keywords into separate classes
which contain parsing and validation logic. All drafts now use the same
`Schema` class with the new "vocabularies" concept handling behavior
differences.

Each draft has its own meta schema (meta.rb), vocabularies (vocab.rb),
and, if necessary, keyword classes (vocab/*.rb). Most keywords are
defined in the latest draft with previous drafts removing/adding things
from there. Old drafts (4, 6, and 7) only have a single vocabulary
because they predate the concept.

`Schema` contains some logic but I tried to keep as much as possible in
keyword classes. `Schema` and `Keyword` have a similar interface
(`value`, `keyword`, `parent`, etc) and share some code using the
`Output` module because it didn't feel quite right to have `Schema` be a
subclass of `Keyword`.

There are two basic methods for schemas and keywords:

`#parse`: parses the provided definition (generates relevant subschemas,
side effects, etc). Basically anything that can be done before data
validation.
`#validate`: iterates through the parsed schema/keywords, validates
data, and returns a `Result` object (possibly with nested results).

One exception is `Ref`, which doesn't resolve refs at parse time because
of a circular dependency when generating meta schemas.

Output formats (introduced in 2019-09) are supported via `Result`. I
think the only tricky thing there is that nested results are returned as
enumerators instead of arrays for performance reasons. This matches the
"classic" behavior as well.

2019-09 also introduced "annotations" which are used for some
validations (`unevaluatedProperties`, `unevaluatedItems`, etc) and are
returned with successful results in a similar format to errors. The
"classic" output format drops them to match existing behavior.

Notes:

- `Location` is used for performance reasons so that JSON pointer
  resolution can be cached and deferred until output time.
- `instance_location` isn't cached between validations because it's
  possibly unbounded.
- `ref_resolver` and `regexp_resolver` are lazily created for performance
  reasons.

Known breaking changes (so far):

- Custom keyword output
- `not` and `dependencies` output
- Property validation hooks (`before_property_validation` and
  `after_property_validation`) are now called immediately surrounding
  `properties` validation. Previously, `before_property_validation` was
  called before all "object" validations (`dependencies`,
  `patternProperties`, `additionalProperties`, etc) and
  `after_property_validation` was called after.

Related:

- #27
- #44
- #116
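
To make the parse/validate split described in this commit message more concrete, here is a toy sketch. It is not json_schemer's code: every class and method name below (`ToySchema`, `TypeKeyword`, `PropertiesKeyword`, the `Result` struct) is hypothetical, and only two keywords are modelled. It shows keyword classes that prepare their state in `parse`, return a result object from `validate`, and expose nested results as enumerators rather than arrays.

```ruby
# Toy illustration only: names are hypothetical and do not mirror the
# library's internals.
Result = Struct.new(:source, :valid, :nested)

class TypeKeyword
  TYPES = { 'string' => String, 'integer' => Integer, 'object' => Hash }.freeze

  def initialize(value)
    @value = value
  end

  # Parse step: everything that can be prepared before data is seen.
  def parse
    @klass = TYPES.fetch(@value)
    self
  end

  # Validate step: check one instance and return a Result with no children.
  def validate(instance)
    Result.new('type', instance.is_a?(@klass), [].each)
  end
end

class PropertiesKeyword
  def initialize(value)
    @value = value
  end

  # Subschemas are generated once, at parse time.
  def parse
    @subschemas = @value.transform_values { |definition| ToySchema.new(definition).parse }
    self
  end

  def validate(instance)
    return Result.new('properties', true, [].each) unless instance.is_a?(Hash)

    # Nested results are produced on demand by an enumerator, not stored in
    # an array. Each enumeration re-runs the nested validations.
    nested = Enumerator.new do |yielder|
      @subschemas.each do |name, subschema|
        yielder << subschema.validate(instance[name]) if instance.key?(name)
      end
    end
    Result.new('properties', nested.all?(&:valid), nested)
  end
end

class ToySchema
  KEYWORDS = { 'type' => TypeKeyword, 'properties' => PropertiesKeyword }.freeze

  def initialize(definition)
    @definition = definition
  end

  # Builds and parses one keyword object per entry in the definition.
  def parse
    @keywords = @definition.map { |name, value| KEYWORDS.fetch(name).new(value).parse }
    self
  end

  def validate(instance)
    nested = Enumerator.new do |yielder|
      @keywords.each { |keyword| yielder << keyword.validate(instance) }
    end
    Result.new('schema', nested.all?(&:valid), nested)
  end
end

schema = ToySchema.new(
  'type' => 'object',
  'properties' => { 'name' => { 'type' => 'string' } }
).parse
puts schema.validate({ 'name' => 'example' }).valid
puts schema.validate({ 'name' => 1 }).valid
```

The enumerator keeps the sketch short but recomputes child results on every pass; a real implementation would have to weigh that against holding arrays in memory.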
@davishmcclurg
Owner

Hi @ekzobrain—I appreciate the contribution! I've actually been working for the last couple months on a rewrite to support the latest drafts: https://github.com/davishmcclurg/json_schemer/compare/next

There are a lot of changes there so I probably won't be able to include yours, but let me know if you want to help out elsewhere because there are still some things to finish up.

@ekzobrain

ekzobrain commented Jul 10, 2023

@davishmcclurg Hi. You've done great work! Of course my changes are not applicable here because this is a major architecture rewrite. I can help with testing: we have tens of thousands of complex schemas (some up to 8000 lines of source), which are currently built on Draft 7, but we need to migrate them to at least Draft 2019-09, or maybe even 2020-12 since you already support it. What is the current state of your work, and what remains to be done?
While working with your current (Draft 7) implementation we also faced issues with ambiguous error output. Sometimes it is very hard to understand the source of an error when validating complex structured schemas with allOf/oneOf keywords. Also, some errors (an additionalProperties mismatch, for example) do not have their own descriptions. This validator, https://www.jsonschemavalidator.net/ (supports only up to Draft 2019-09), gives very good error descriptions; it would be great to implement something identical. If you wish, I can provide examples.

@davishmcclurg
Owner

I can help with testing: we have tens of thousands of complex schemas (some up to 8000 lines of source), which are currently built on Draft 7, but we need to migrate them to at least Draft 2019-09, or maybe even 2020-12 since you already support it. What is the current state of your work, and what remains to be done?

It would be great if you could help with testing. There are a few small things I'm still planning to get done (docs, cleanup, etc), but that branch should be working and ready for you to test.

While working with your current (Draft 7) implementation we also faced issues with ambiguous error output. Sometimes it is very hard to understand the source of an error when validating complex structured schemas with allOf/oneOf keywords. Also, some errors (an additionalProperties mismatch, for example) do not have their own descriptions. This validator, https://www.jsonschemavalidator.net/ (supports only up to Draft 2019-09), gives very good error descriptions; it would be great to implement something identical. If you wish, I can provide examples.

I agree the error output is not that helpful—additionalProperties is one I noticed as well. Can you provide some examples? The branch I'm working on includes "output formats" (added in draft 2019-09) so you can try those, but I'm not sure that really solves the problem. If error messaging is something you're willing to work on, we can talk through some ideas.

@ekzobrain

Ok, I'll give you feedback about my tests and error output in about a week

davishmcclurg added a commit that referenced this issue Jul 23, 2023
@davishmcclurg
Owner

@ekzobrain I just added a commit to that branch that adds more descriptive error messages. I don't think it addresses your concerns (eg, additionalProperties) but it should cover what people are using JSONSchemer::Errors.pretty for.

davishmcclurg added a commit that referenced this issue Jul 27, 2023
davishmcclurg added a commit that referenced this issue Jul 31, 2023
Features:

- Draft 2020-12 support
- Draft 2019-09 support
- Output formats
- Annotations
- OpenAPI 3.1 schema support
- OpenAPI 3.0 schema support
- `insert_property_defaults` in conditional subschemas
- Error messages
- Non-string schema and data keys

See individual commits for more details.

Closes:

- #27
- #44
- #55
- #91
- #94
- #116
- #123
davishmcclurg added a commit that referenced this issue Aug 1, 2023
@davishmcclurg davishmcclurg mentioned this issue Aug 1, 2023
@davishmcclurg
Owner

Ok, I'll give you feedback about my tests and error output in about a week

@ekzobrain have you had a chance to test things out? I've got a PR open now that I'm planning to merge in a couple weeks. It would be helpful if you could test it out before I merge. Let me know!

@neilpa-inv

@davishmcclurg Just to give you another point of reference. Our team has a fairly complex schema spread across a couple dozen files. It's using draft-7 but I switched to 2020-12 to try against the big rewrite commit (6de2a04) earlier this week. Our suite of >1000 examples that tests both valid and invalid data fully passed on the new version.

I wanted to experiment with unevaluatedProperties: false to try and simplify our schema definitions. Unfortunately it didn't make things much simpler for our use case so I'm exploring some alternatives (#136). Figured you'd at least want to know that I was able to seamlessly upgrade to the newest library version.

@davishmcclurg
Owner

Thanks @neilpa-inv! I appreciate the feedback

@ekzobrain

ekzobrain commented Aug 18, 2023

@davishmcclurg I've started testing against my files and faced an issue with Draft 7 schema. I've attached an example schema and data files.

This code throws an exception "JSONSchemer::InvalidRefPointer: /definitions/n1:TStatementForm1":

schemer = JSONSchemer.schema(Pathname.new('schema.json'))
schemer.validate(Pathname.new('data.json'))

While this code returns true:

JSONSchemer.valid_schema?(Pathname.new('schema.json'))

example.zip

@davishmcclurg
Owner

This code throws an exception "JSONSchemer::InvalidRefPointer: /definitions/n1:TStatementForm1":

Looks like the issue is you're using other keywords (ie, definitions) at the same level as $ref. That's not allowed in Draft 7:

All other properties in a "$ref" object MUST be ignored.

To get around it, you can wrap the $ref in allOf:

"allOf": [{ "$ref": "#/definitions/n1:TStatementForm1" }],
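
For a large set of Draft 7 schemas, the allOf workaround could be applied mechanically. The sketch below is a hypothetical helper, not part of json_schemer: `wrap_ref_siblings` walks a parsed schema and, wherever `$ref` has sibling keywords, moves it into an `allOf` entry. It assumes every nested hash is a schema object, so it would also rewrite a property literally named `$ref` or data inside `enum`, `const`, or `default`. Treat it as a starting point.

```ruby
require 'json'

# Hypothetical helper: returns a copy of the schema in which any object
# holding "$ref" next to other keywords has the "$ref" moved into "allOf".
def wrap_ref_siblings(node)
  case node
  when Array
    node.map { |item| wrap_ref_siblings(item) }
  when Hash
    rewritten = node.transform_values { |value| wrap_ref_siblings(value) }
    if rewritten.key?('$ref') && rewritten.size > 1
      ref = rewritten.delete('$ref')
      # Keep any existing allOf entries instead of overwriting them.
      rewritten['allOf'] = [{ '$ref' => ref }] + Array(rewritten['allOf'])
    end
    rewritten
  else
    node
  end
end

schema = JSON.parse(<<~JSON)
  {
    "definitions": { "n1:TStatementForm1": { "type": "object" } },
    "$ref": "#/definitions/n1:TStatementForm1"
  }
JSON

puts JSON.pretty_generate(wrap_ref_siblings(schema))
```

A schema object that contains only `$ref` is left untouched, so the rewrite only affects the places where Draft 7 would otherwise ignore the siblings.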

@neilpa-inv

This code throws an exception "JSONSchemer::InvalidRefPointer: /definitions/n1:TStatementForm1":

Looks like the issue is you're using other keywords (ie, definitions) at the same level as $ref. That's not allowed in Draft 7:

All other properties in a "$ref" object MUST be ignored.

To get around it, you can wrap the $ref in allOf:

"allOf": [{ "$ref": "#/definitions/n1:TStatementForm1" }],

But shouldn't the JSONSchemer.valid_schema? call also fail for the same invalid ref?

@davishmcclurg
Owner

But shouldn't the JSONSchemer.valid_schema? call also fail for the same invalid ref?

valid_schema? only validates the provided schema against the published meta schema, which doesn't cover this case. If you look at the published schema, it's fairly simple and won't guarantee that a schema is correct. valid_schema? also does not resolve any refs because it treats the provided schema as any other JSON data.
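
Since meta schema validation does not resolve refs, a separate pre-flight check can catch refs that point nowhere. The sketch below uses only the Ruby standard library; `pointer_resolves?` and `dangling_local_refs` are hypothetical names, not json_schemer APIs. It handles only local refs of the form `#` or `#/...`, assumes fragments are not percent-encoded, and does plain JSON Pointer lookup against the raw document. It does not model the Draft 7 rule about `$ref` siblings discussed above, so it would not flag that particular schema.

```ruby
require 'json'

# Hypothetical helper: true when the JSON Pointer (RFC 6901) resolves
# against the parsed document.
def pointer_resolves?(document, pointer)
  return true if pointer.empty?
  return false unless pointer.start_with?('/')

  tokens = pointer.split('/', -1).drop(1).map { |t| t.gsub('~1', '/').gsub('~0', '~') }
  current = document
  tokens.each do |token|
    case current
    when Hash
      return false unless current.key?(token)
      current = current[token]
    when Array
      return false unless token.match?(/\A(0|[1-9]\d*)\z/) && token.to_i < current.size
      current = current[token.to_i]
    else
      return false
    end
  end
  true
end

# Hypothetical helper: collects every local "$ref" whose pointer does not
# resolve within the same document.
def dangling_local_refs(schema, node = schema, found = [])
  case node
  when Hash
    ref = node['$ref']
    if ref.is_a?(String) && (ref == '#' || ref.start_with?('#/'))
      found << ref unless pointer_resolves?(schema, ref[1..])
    end
    node.each_value { |value| dangling_local_refs(schema, value, found) }
  when Array
    node.each { |value| dangling_local_refs(schema, value, found) }
  end
  found
end

schema = JSON.parse(<<~JSON)
  {
    "definitions": { "a": { "type": "string" } },
    "properties": {
      "x": { "$ref": "#/definitions/a" },
      "y": { "$ref": "#/definitions/missing" }
    }
  }
JSON

p dangling_local_refs(schema)
```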

@ekzobrain

ekzobrain commented Aug 19, 2023

This code throws an exception "JSONSchemer::InvalidRefPointer: /definitions/n1:TStatementForm1":

Looks like the issue is you're using other keywords (ie, definitions) at the same level as $ref. That's not allowed in Draft 7:

All other properties in a "$ref" object MUST be ignored.

To get around it, you can wrap the $ref in allOf:

"allOf": [{ "$ref": "#/definitions/n1:TStatementForm1" }],

I know about this rule, but always thought that it applies only to validation vocabulary keywords. Your 1.0 version worked correctly with this schema, and other validators that I know and use also handle it correctly, for example: https://www.jsonschemavalidator.net/

@davishmcclurg
Owner

I know about this rule, but always thought that it applies only to validation vocabulary keywords.

I don't think that's the case since there's a json-schema-test-suite test for ignoring $id when it's next to $ref: https://github.com/json-schema-org/JSON-Schema-Test-Suite/blob/5cc9214e82f1e0a5e9644960b6fe0166afb7b283/tests/draft7/ref.json#L178-L213

Looks like the definitions case is controversial though: json-schema-org/JSON-Schema-Test-Suite#458

I'll look into supporting definitions since that's the previous behavior.

davishmcclurg added a commit that referenced this issue Aug 19, 2023
davishmcclurg added a commit that referenced this issue Aug 19, 2023
I don't think this is really correct according to the
[specification][0], but it matches the previous behavior and seems
useful. Drafts after 7 don't have a problem because they allow all
keywords as `$ref` siblings.

Related:

- #44 (comment)
- json-schema-org/JSON-Schema-Test-Suite#458

[0]: https://datatracker.ietf.org/doc/html/draft-handrews-json-schema-01#section-8.3
davishmcclurg added a commit that referenced this issue Aug 19, 2023
Features:

- Draft 2020-12 support
- Draft 2019-09 support
- Output formats
- Annotations
- OpenAPI 3.1 schema support
- OpenAPI 3.0 schema support
- `insert_property_defaults` in conditional subschemas
- Error messages
- Non-string schema and data keys
- Schema bundling

See individual commits for more details.

Closes:

- #27
- #44
- #55
- #91
- #94
- #116
- #123
- #136
@davishmcclurg
Owner

davishmcclurg commented Aug 19, 2023

@ekzobrain I updated that branch to make definitions work as you described. Let me know if you have any trouble. f82215c

@ekzobrain

ekzobrain commented Aug 19, 2023

@davishmcclurg It now works as expected.
Now about error messages. I've updated my example files to fail with an additionalProperties error. The schema is fairly complex: it contains oneOf/allOf branches, and when an error occurs in some branch it is very hard to locate the actual problem. One of the major problems is that oneOf has no separate error message when it does not pass validation, but it should.
Run this code:

schemer = JSONSchemer.schema(Pathname.new('schema.json'))
e = schemer.validate(JSON.parse(File.read('data.json')))
e.each { |e| puts e['error'] } # or JSONSchemer::Errors.pretty(e)

And compare output with this validator: https://www.jsonschemavalidator.net/ , it gives a correct error tree for all nested schemas (especially for oneOf) and it is easy to find an actual source of the problem.
Also text for additionalProperties error is much cleaner there.

It would be great to try to mimic their output messages as close as possible.

example2.zip

@ekzobrain

ekzobrain commented Aug 19, 2023

Another note, about the public API: the JSONSchemer.validate() method accepts a Pathname, but the resolved file contents are processed as a string and not JSON.parse()'d as JSONSchemer.schema() does.
I think it would be better to align this behavior and parse the data file contents as JSON when it is given as a Pathname.
Currently this won't work (it will always return false):

schemer = JSONSchemer.schema(Pathname.new('schema.json'))
schemer.valid?(Pathname.new('data.json'))
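
Until that behavior is aligned, one way around it is to parse the data file before handing it to the validator. This is a minimal sketch using only the standard library; `load_instance` is a hypothetical helper name, and the `schemer.valid?` call is shown only in a comment because it depends on the gem being loaded.

```ruby
require 'json'
require 'pathname'

# Hypothetical helper: returns parsed JSON when given a Pathname and passes
# any other value through unchanged.
def load_instance(data)
  data.is_a?(Pathname) ? JSON.parse(data.read) : data
end

# Intended use with the schemer object from the snippet above:
#   schemer.valid?(load_instance(Pathname.new('data.json')))
```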

@davishmcclurg
Owner

And compare output with this validator: https://www.jsonschemavalidator.net/ , it gives a correct error tree for all nested schemas (especially for oneOf) and it is easy to find an actual source of the problem.

If you want a tree of errors, try output_format: 'detailed'. The output is more verbose than jsonschemavalidator.net, but it's similar:

"detailed" output
{
  "valid": false,
  "keywordLocation": "/$ref/properties/header/$ref/properties/appliedDocument/items/$ref/oneOf",
  "absoluteKeywordLocation": "file:///Users/dharsha/Downloads/example2/schema.json#/definitions/n1:TSomeDocument/oneOf",
  "instanceLocation": "/header/appliedDocument/0",
  "error": "instance at `/header/appliedDocument/0` does not match exactly one `oneOf` schema",
  "errors": [
    {
      "valid": false,
      "keywordLocation": "/$ref/properties/header/$ref/properties/appliedDocument/items/$ref/oneOf/0/properties/otherDocument/$ref/properties/documentTypes/items/$ref/oneOf",
      "absoluteKeywordLocation": "file:///Users/dharsha/Downloads/example2/schema.json#/definitions/n1:DocumentTypes/oneOf",
      "instanceLocation": "/header/appliedDocument/0/otherDocument/documentTypes/0",
      "error": "instance at `/header/appliedDocument/0/otherDocument/documentTypes/0` does not match exactly one `oneOf` schema",
      "errors": [
        {
          "valid": false,
          "keywordLocation": "/$ref/properties/header/$ref/properties/appliedDocument/items/$ref/oneOf/0/properties/otherDocument/$ref/properties/documentTypes/items/$ref/oneOf/0/additionalProperties",
          "absoluteKeywordLocation": "file:///Users/dharsha/Downloads/example2/schema.json#/definitions/n1:DocumentTypes/oneOf/0/additionalProperties",
          "instanceLocation": "/header/appliedDocument/0/otherDocument/documentTypes/0/testAddProp",
          "error": "instance at `/header/appliedDocument/0/otherDocument/documentTypes/0/testAddProp` does not match schema"
        },
        {
          "valid": false,
          "keywordLocation": "/$ref/properties/header/$ref/properties/appliedDocument/items/$ref/oneOf/0/properties/otherDocument/$ref/properties/documentTypes/items/$ref/oneOf/1",
          "absoluteKeywordLocation": "file:///Users/dharsha/Downloads/example2/schema.json#/definitions/n1:DocumentTypes/oneOf/1",
          "instanceLocation": "/header/appliedDocument/0/otherDocument/documentTypes/0",
          "error": "instance at `/header/appliedDocument/0/otherDocument/documentTypes/0` does not match schema",
          "errors": [
            {
              "valid": false,
              "keywordLocation": "/$ref/properties/header/$ref/properties/appliedDocument/items/$ref/oneOf/0/properties/otherDocument/$ref/properties/documentTypes/items/$ref/oneOf/1/propertyNames/pattern",
              "absoluteKeywordLocation": "file:///Users/dharsha/Downloads/example2/schema.json#/definitions/n1:DocumentTypes/oneOf/1/propertyNames/pattern",
              "instanceLocation": "/header/appliedDocument/0/otherDocument/documentTypes/0",
              "error": "string at `/header/appliedDocument/0/otherDocument/documentTypes/0` does not match pattern: ^(?!documentTypeCode$).*"
            },
            {
              "valid": false,
              "keywordLocation": "/$ref/properties/header/$ref/properties/appliedDocument/items/$ref/oneOf/0/properties/otherDocument/$ref/properties/documentTypes/items/$ref/oneOf/1/required",
              "absoluteKeywordLocation": "file:///Users/dharsha/Downloads/example2/schema.json#/definitions/n1:DocumentTypes/oneOf/1/required",
              "instanceLocation": "/header/appliedDocument/0/otherDocument/documentTypes/0",
              "error": "hash at `/header/appliedDocument/0/otherDocument/documentTypes/0` is missing required keys: [\"representativeDocTypeCode\"]"
            }
          ]
        }
      ]
    },
    {
      "valid": false,
      "keywordLocation": "/$ref/properties/header/$ref/properties/appliedDocument/items/$ref/oneOf/1",
      "absoluteKeywordLocation": "file:///Users/dharsha/Downloads/example2/schema.json#/definitions/n1:TSomeDocument/oneOf/1",
      "instanceLocation": "/header/appliedDocument/0",
      "error": "instance at `/header/appliedDocument/0` does not match schema",
      "errors": [
        {
          "valid": false,
          "keywordLocation": "/$ref/properties/header/$ref/properties/appliedDocument/items/$ref/oneOf/1/propertyNames/pattern",
          "absoluteKeywordLocation": "file:///Users/dharsha/Downloads/example2/schema.json#/definitions/n1:TSomeDocument/oneOf/1/propertyNames/pattern",
          "instanceLocation": "/header/appliedDocument/0",
          "error": "string at `/header/appliedDocument/0` does not match pattern: ^(?!otherDocument$|idDocument$|powerOfAttorney$|mapPlanDocument$|legalAct$|confirmPrivilege$).*"
        },
        {
          "valid": false,
          "keywordLocation": "/$ref/properties/header/$ref/properties/appliedDocument/items/$ref/oneOf/1/required",
          "absoluteKeywordLocation": "file:///Users/dharsha/Downloads/example2/schema.json#/definitions/n1:TSomeDocument/oneOf/1/required",
          "instanceLocation": "/header/appliedDocument/0",
          "error": "hash at `/header/appliedDocument/0` is missing required keys: [\"paymentDocument\"]"
        }
      ]
    },
    {
      "valid": false,
      "keywordLocation": "/$ref/properties/header/$ref/properties/appliedDocument/items/$ref/oneOf/2",
      "absoluteKeywordLocation": "file:///Users/dharsha/Downloads/example2/schema.json#/definitions/n1:TSomeDocument/oneOf/2",
      "instanceLocation": "/header/appliedDocument/0",
      "error": "instance at `/header/appliedDocument/0` does not match schema",
      "errors": [
        {
          "valid": false,
          "keywordLocation": "/$ref/properties/header/$ref/properties/appliedDocument/items/$ref/oneOf/2/propertyNames/pattern",
          "absoluteKeywordLocation": "file:///Users/dharsha/Downloads/example2/schema.json#/definitions/n1:TSomeDocument/oneOf/2/propertyNames/pattern",
          "instanceLocation": "/header/appliedDocument/0",
          "error": "string at `/header/appliedDocument/0` does not match pattern: ^(?!otherDocument$|paymentDocument$|powerOfAttorney$|mapPlanDocument$|legalAct$|confirmPrivilege$).*"
        },
        {
          "valid": false,
          "keywordLocation": "/$ref/properties/header/$ref/properties/appliedDocument/items/$ref/oneOf/2/required",
          "absoluteKeywordLocation": "file:///Users/dharsha/Downloads/example2/schema.json#/definitions/n1:TSomeDocument/oneOf/2/required",
          "instanceLocation": "/header/appliedDocument/0",
          "error": "hash at `/header/appliedDocument/0` is missing required keys: [\"idDocument\"]"
        }
      ]
    },
    {
      "valid": false,
      "keywordLocation": "/$ref/properties/header/$ref/properties/appliedDocument/items/$ref/oneOf/3",
      "absoluteKeywordLocation": "file:///Users/dharsha/Downloads/example2/schema.json#/definitions/n1:TSomeDocument/oneOf/3",
      "instanceLocation": "/header/appliedDocument/0",
      "error": "instance at `/header/appliedDocument/0` does not match schema",
      "errors": [
        {
          "valid": false,
          "keywordLocation": "/$ref/properties/header/$ref/properties/appliedDocument/items/$ref/oneOf/3/propertyNames/pattern",
          "absoluteKeywordLocation": "file:///Users/dharsha/Downloads/example2/schema.json#/definitions/n1:TSomeDocument/oneOf/3/propertyNames/pattern",
          "instanceLocation": "/header/appliedDocument/0",
          "error": "string at `/header/appliedDocument/0` does not match pattern: ^(?!otherDocument$|paymentDocument$|idDocument$|mapPlanDocument$|legalAct$|confirmPrivilege$).*"
        },
        {
          "valid": false,
          "keywordLocation": "/$ref/properties/header/$ref/properties/appliedDocument/items/$ref/oneOf/3/required",
          "absoluteKeywordLocation": "file:///Users/dharsha/Downloads/example2/schema.json#/definitions/n1:TSomeDocument/oneOf/3/required",
          "instanceLocation": "/header/appliedDocument/0",
          "error": "hash at `/header/appliedDocument/0` is missing required keys: [\"powerOfAttorney\"]"
        }
      ]
    },
    {
      "valid": false,
      "keywordLocation": "/$ref/properties/header/$ref/properties/appliedDocument/items/$ref/oneOf/4",
      "absoluteKeywordLocation": "file:///Users/dharsha/Downloads/example2/schema.json#/definitions/n1:TSomeDocument/oneOf/4",
      "instanceLocation": "/header/appliedDocument/0",
      "error": "instance at `/header/appliedDocument/0` does not match schema",
      "errors": [
        {
          "valid": false,
          "keywordLocation": "/$ref/properties/header/$ref/properties/appliedDocument/items/$ref/oneOf/4/propertyNames/pattern",
          "absoluteKeywordLocation": "file:///Users/dharsha/Downloads/example2/schema.json#/definitions/n1:TSomeDocument/oneOf/4/propertyNames/pattern",
          "instanceLocation": "/header/appliedDocument/0",
          "error": "string at `/header/appliedDocument/0` does not match pattern: ^(?!otherDocument$|paymentDocument$|idDocument$|powerOfAttorney$|legalAct$|confirmPrivilege$).*"
        },
        {
          "valid": false,
          "keywordLocation": "/$ref/properties/header/$ref/properties/appliedDocument/items/$ref/oneOf/4/required",
          "absoluteKeywordLocation": "file:///Users/dharsha/Downloads/example2/schema.json#/definitions/n1:TSomeDocument/oneOf/4/required",
          "instanceLocation": "/header/appliedDocument/0",
          "error": "hash at `/header/appliedDocument/0` is missing required keys: [\"mapPlanDocument\"]"
        }
      ]
    },
    {
      "valid": false,
      "keywordLocation": "/$ref/properties/header/$ref/properties/appliedDocument/items/$ref/oneOf/5",
      "absoluteKeywordLocation": "file:///Users/dharsha/Downloads/example2/schema.json#/definitions/n1:TSomeDocument/oneOf/5",
      "instanceLocation": "/header/appliedDocument/0",
      "error": "instance at `/header/appliedDocument/0` does not match schema",
      "errors": [
        {
          "valid": false,
          "keywordLocation": "/$ref/properties/header/$ref/properties/appliedDocument/items/$ref/oneOf/5/propertyNames/pattern",
          "absoluteKeywordLocation": "file:///Users/dharsha/Downloads/example2/schema.json#/definitions/n1:TSomeDocument/oneOf/5/propertyNames/pattern",
          "instanceLocation": "/header/appliedDocument/0",
          "error": "string at `/header/appliedDocument/0` does not match pattern: ^(?!otherDocument$|paymentDocument$|idDocument$|powerOfAttorney$|mapPlanDocument$|confirmPrivilege$).*"
        },
        {
          "valid": false,
          "keywordLocation": "/$ref/properties/header/$ref/properties/appliedDocument/items/$ref/oneOf/5/required",
          "absoluteKeywordLocation": "file:///Users/dharsha/Downloads/example2/schema.json#/definitions/n1:TSomeDocument/oneOf/5/required",
          "instanceLocation": "/header/appliedDocument/0",
          "error": "hash at `/header/appliedDocument/0` is missing required keys: [\"legalAct\"]"
        }
      ]
    },
    {
      "valid": false,
      "keywordLocation": "/$ref/properties/header/$ref/properties/appliedDocument/items/$ref/oneOf/6",
      "absoluteKeywordLocation": "file:///Users/dharsha/Downloads/example2/schema.json#/definitions/n1:TSomeDocument/oneOf/6",
      "instanceLocation": "/header/appliedDocument/0",
      "error": "instance at `/header/appliedDocument/0` does not match schema",
      "errors": [
        {
          "valid": false,
          "keywordLocation": "/$ref/properties/header/$ref/properties/appliedDocument/items/$ref/oneOf/6/propertyNames/pattern",
          "absoluteKeywordLocation": "file:///Users/dharsha/Downloads/example2/schema.json#/definitions/n1:TSomeDocument/oneOf/6/propertyNames/pattern",
          "instanceLocation": "/header/appliedDocument/0",
          "error": "string at `/header/appliedDocument/0` does not match pattern: ^(?!otherDocument$|paymentDocument$|idDocument$|powerOfAttorney$|mapPlanDocument$|legalAct$).*"
        },
        {
          "valid": false,
          "keywordLocation": "/$ref/properties/header/$ref/properties/appliedDocument/items/$ref/oneOf/6/required",
          "absoluteKeywordLocation": "file:///Users/dharsha/Downloads/example2/schema.json#/definitions/n1:TSomeDocument/oneOf/6/required",
          "instanceLocation": "/header/appliedDocument/0",
          "error": "hash at `/header/appliedDocument/0` is missing required keys: [\"confirmPrivilege\"]"
        }
      ]
    }
  ]
}

Is that close to what you're hoping for?

One more note, about the public API: JSONSchemer.validate() supports passing a Pathname, but the resolved file contents are processed as a string and not JSON.parse()'d as JSONSchemer.schema() does.
I think it would be better to align this behavior and parse the data file contents as JSON if they are given as a Pathname.

That's an interesting idea. I'm hesitant because JSONSchemer.schema also automatically parses strings as json, which I don't think we'd want to do for JSONSchemer::Schema#validate (because bare strings can be validated by a schema). I'm going to hold off on Pathname support for now, but please open a separate issue to discuss it further.
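
Until this is settled, the Pathname case can be handled on the caller's side. A minimal sketch, assuming a hypothetical helper named `load_instance` (it is not part of json_schemer): it parses only Pathname input and leaves everything else alone, so a bare string is still validated as a JSON string.

```ruby
require 'json'
require 'pathname'

# Hypothetical helper, not part of json_schemer: a Pathname is read and
# parsed as JSON; any other value (including a String) is returned as-is,
# so bare strings can still be validated as string instances.
def load_instance(data)
  return JSON.parse(data.read) if data.is_a?(Pathname)

  data
end
```

The result could then be passed to the validator, e.g. `schemer.validate(load_instance(Pathname.new('data.json')))`, where `schemer` and the file name are placeholders.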

@ekzobrain

ekzobrain commented Aug 19, 2023

OK, that's what I was looking for! A few notes about the error messages, though:
[Screenshot 2023-08-19 at 22 19 00]

  1. The error says "instance at /header/appliedDocument/0/otherDocument/documentTypes/0/testAddProp does not match schema", but that is misleading. "Property 'testAddProp' has not been defined and the schema does not allow additional properties." is much clearer, don't you agree? Can it be fixed somehow?
  2. This is an aggregate message for oneOf/1 saying that it doesn't match the schema. But why was there no such grouping message for oneOf/0? Isn't it redundant? jsonschemavalidator does not have it. Also, its instance path is the same as in the next two errors, which is misleading too, because one error says "instance", the second "string", and the third "hash", all pointing at the same instance path :) That's another argument for removing it.
  3. "string at /header/appliedDocument/0/otherDocument/documentTypes/0 does not match pattern: ^(?!documentTypeCode$).*", but there's no string at that path, there's an object... The message (or instance path) should be fixed somehow to express this correctly.
  4. "hash at /header/appliedDocument/0/otherDocument/documentTypes/0 is missing required keys: ["representativeDocTypeCode"]": could it be "hash at /header/appliedDocument/0/otherDocument/documentTypes/0 is missing required keys: representativeDocTypeCode, xxx, yyy" for cleanliness?

@ekzobrain

ekzobrain commented Aug 19, 2023

I also ran into an issue with contentMediaType. I have schemas with contentMediaType: "text/xml".
Validation against such schemas throws an exception:
[Screenshot 2023-08-19 at 23 30 31]

Maybe you could add a configuration callback for validating unknown contentMediaType/format values? As it stands, validation is not usable with these schemas.

@davishmcclurg
Owner

  1. The error says "instance at /header/appliedDocument/0/otherDocument/documentTypes/0/testAddProp does not match schema", but that is misleading. "Property 'testAddProp' has not been defined and the schema does not allow additional properties." is much clearer, don't you agree? Can it be fixed somehow?

I updated the branch to show a more useful error message:

{
  "valid": false,
  "keywordLocation": "/$ref/properties/header/$ref/properties/appliedDocument/items/$ref/oneOf/0/properties/otherDocument/$ref/properties/documentTypes/items/$ref/oneOf/0/additionalProperties",
  "absoluteKeywordLocation": "file:///Users/dharsha/Downloads/example2/schema.json#/definitions/n1:DocumentTypes/oneOf/0/additionalProperties",
  "instanceLocation": "/header/appliedDocument/0/otherDocument/documentTypes/0/testAddProp",
  "error": "property at `/header/appliedDocument/0/otherDocument/documentTypes/0/testAddProp` is not defined and schema does not allow additional properties"
}

2. This is an aggregate message for oneOf/1 saying that it doesn't match the schema. But why was there no such grouping message for oneOf/0? Isn't it redundant? jsonschemavalidator does not have it. Also, its instance path is the same as in the next two errors, which is misleading too, because one error says "instance", the second "string", and the third "hash", all pointing at the same instance path :) That's another argument for removing it.

I believe the behavior currently matches what's in the spec for the detailed output format, specifically:

The following rules govern the construction of the results object:

  • All applicator keywords ("*Of", "$ref", "if"/"then"/"else", etc.) require a node.
  • Nodes that have no children are removed.
  • Nodes that have a single child are replaced by the child.

That's likely why oneOf/0 doesn't have the grouping message (single child replaced by child).
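
To make the effect of those rules concrete, here is a rough sketch of the pruning step on a simplified result tree. The node shape (`:location`, `:error`, `:errors`) and the `prune` function are assumptions for illustration only, not json_schemer's actual implementation.

```ruby
# Simplified model: an applicator node is a Hash with an :errors array of
# child nodes; a leaf node has an :error message and no :errors key.
# Returns the pruned node, or nil when the node should be removed.
def prune(node)
  return node unless node.key?(:errors) # leaves are kept as they are

  children = node[:errors].map { |child| prune(child) }.compact
  case children.size
  when 0 then nil            # nodes that have no children are removed
  when 1 then children.first # a single child replaces its parent
  else node.merge(errors: children)
  end
end

tree = {
  location: '/oneOf',
  errors: [
    { location: '/oneOf/0',
      errors: [{ location: '/oneOf/0/additionalProperties', error: 'not defined' }] },
    { location: '/oneOf/1',
      errors: [{ location: '/oneOf/1/propertyNames/pattern', error: 'no match' },
               { location: '/oneOf/1/required', error: 'missing keys' }] },
    { location: '/oneOf/2', errors: [] }
  ]
}

pruned = prune(tree)
```

In this sketch the `/oneOf/0` wrapper disappears because it has a single child, while `/oneOf/1` keeps its grouping node because it has two.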

3. "string at /header/appliedDocument/0/otherDocument/documentTypes/0 does not match pattern: ^(?!documentTypeCode$).*", but there's no string at that path, there's an object... The message (or instance path) should be fixed somehow to express this correctly.

This one's tricky because it's a propertyNames error and there's not a good way to provide a json pointer path to a key (only a value).
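
A small illustration of that limitation, using a minimal JSON Pointer (RFC 6901) resolver written for this comment (`resolve_pointer` and the sample document are hypothetical, not json_schemer code). Every pointer resolves to a value, so the closest location for a failing property name is the object that contains it.

```ruby
# Minimal RFC 6901 resolver for parsed JSON (Hashes and Arrays).
def resolve_pointer(document, pointer)
  return document if pointer.empty?

  pointer.split('/', -1).drop(1).reduce(document) do |current, token|
    # Unescape in the order required by RFC 6901: "~1" first, then "~0".
    key = token.gsub('~1', '/').gsub('~0', '~')
    current.is_a?(Array) ? current.fetch(Integer(key, 10)) : current.fetch(key)
  end
end

document = { 'documentTypes' => [{ 'documentTypeCode' => '01' }] }

# The object holding the offending key: the best a propertyNames error can point at.
containing_object = resolve_pointer(document, '/documentTypes/0')

# Appending the key yields the property's value, not the key itself.
property_value = resolve_pointer(document, '/documentTypes/0/documentTypeCode')
```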

4. "hash at /header/appliedDocument/0/otherDocument/documentTypes/0 is missing required keys: ["representativeDocTypeCode"]": could it be "hash at /header/appliedDocument/0/otherDocument/documentTypes/0 is missing required keys: representativeDocTypeCode, xxx, yyy" for cleanliness?

Yes, I updated this as well.

I also ran into an issue with contentMediaType. I have schemas with contentMediaType: "text/xml".

Did this work in previous versions? I believe application/json is the only media type that ever worked:

if content_media_type && decoded_data
  case content_media_type.downcase
  when 'application/json'
    yield error(instance, 'contentMediaType') unless valid_json?(decoded_data)
  else
    raise NotImplementedError
  end
end

@ekzobrain

ekzobrain commented Aug 19, 2023

Did this work in previous versions? I believe application/json is the only media type that ever worked:

Seems like this is a new schema and I did not try it with previous versions... But that doesn't change my suggestion to add callback options for those :)
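
To sketch what I mean by callback options: the function name, the `validators` parameter, and the pass-through for unknown types below are all assumptions made for illustration, not an existing json_schemer API.

```ruby
require 'json'

# Built-in check, mirroring the only media type handled today.
def valid_json?(data)
  JSON.parse(data)
  true
rescue JSON::ParserError
  false
end

# Hypothetical dispatcher: `validators` maps lowercase media types to
# callables that return true or false. Unknown media types pass instead of
# raising NotImplementedError (a design choice of this sketch).
def content_media_type_valid?(media_type, data, validators = {})
  checks = { 'application/json' => method(:valid_json?) }.merge(validators)
  check = checks[media_type.downcase]
  return true if check.nil?

  check.call(data)
end

# Placeholder callback: a naive check only, NOT real XML validation.
xml_check = ->(data) { data.lstrip.start_with?('<') }
xml_ok = content_media_type_valid?('text/xml', '<doc/>', 'text/xml' => xml_check)
```

A real callback for text/xml would call an XML parser of the user's choice; the point is only that the library would not need to know about it.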

@ekzobrain

ekzobrain commented Aug 19, 2023

Also I think it is better to rename hash to object in error messages, to stick closer to JSON type names. As you renamed keys to properties.

@davishmcclurg
Owner

Seems like this is a new schema and I did not try it with previous versions... But that doesn't change my suggestion to add callback options for those :)

👍 please open a separate issue to keep track of the suggestion.

Also I think it is better to rename hash to object in error messages, to stick closer to JSON type names. As you renamed keys to properties.

Good call.


@ekzobrain I appreciate the testing and suggestions. Did you run into any other issues testing things out with your schemas?

davishmcclurg added a commit that referenced this issue Aug 20, 2023
I don't think this is really correct according to the
[specification][0], but it matches the previous behavior and seems
useful. Drafts after 7 don't have a problem because they allow all
keywords as `$ref` siblings.

Related:

- #44 (comment)
- json-schema-org/JSON-Schema-Test-Suite#458

[0]: https://datatracker.ietf.org/doc/html/draft-handrews-json-schema-01#section-8.3
davishmcclurg added a commit that referenced this issue Aug 20, 2023
Features:

- Draft 2020-12 support
- Draft 2019-09 support
- Output formats
- Annotations
- OpenAPI 3.1 schema support
- OpenAPI 3.0 schema support
- `insert_property_defaults` in conditional subschemas
- Error messages
- Non-string schema and data keys
- Schema bundling

See individual commits for more details.

Closes:

- #27
- #44
- #55
- #91
- #94
- #116
- #123
- #136
@ekzobrain

@ekzobrain I appreciate the testing and suggestions. Did you run into any other issues testing things out with your schemas?

Currently not. I think it's time to make a release, if you are ready, and then move forward.
