Update guidance on use with non-JSON formats #1390

Open
awwright opened this issue Mar 27, 2023 · 8 comments

@awwright
Member

The extent to which JSON Schema can be used to validate data structured as a non-JSON input isn't defined well enough. The spec currently says:

"However, any document or memory structure that can be parsed into or processed according to the JSON Schema data model can be interpreted against a JSON Schema, including data formats like CBOR"

In my personal opinion, this is an interesting fact to point out. However, this isn't enough guidance to ensure that different implementations are compatible. Additionally, it is somewhat outside the scope of JSON Schema, and so should be removed.

If this should be written into the standard, it should go into more detail about how this works technically. For any JSON-compatible format, there should be an isomorphism to JSON, or there should be guidance on how to handle the larger value space (for example, CBOR provides data tags, which applications might like to distinguish).

But I think the best option is to remove this for now, and publish guidance on handling non-JSON inputs separately.
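
As a rough illustration of the "larger value space" problem, here is a minimal Python sketch. It does not use a real CBOR library: `Tagged`, `strip_tags`, and `reject_tags` are hypothetical names made up for this example, and the two policies are only assumptions about what separate guidance might allow. The one detail taken from CBOR itself is that tag 1 marks an epoch-based date/time.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Tagged:
    """Hypothetical stand-in for a CBOR tagged value (tag number plus content)."""
    tag: int
    value: Any


def strip_tags(value: Any) -> Any:
    """Lossy policy: drop tags and keep only the tagged content."""
    if isinstance(value, Tagged):
        return strip_tags(value.value)
    if isinstance(value, list):
        return [strip_tags(v) for v in value]
    if isinstance(value, dict):
        return {k: strip_tags(v) for k, v in value.items()}
    return value


def reject_tags(value: Any) -> Any:
    """Strict policy: refuse anything outside the JSON value space."""
    if isinstance(value, Tagged):
        raise ValueError(f"tag {value.tag} has no JSON representation")
    if isinstance(value, list):
        return [reject_tags(v) for v in value]
    if isinstance(value, dict):
        return {k: reject_tags(v) for k, v in value.items()}
    return value


epoch_time = Tagged(1, 1679875200)  # CBOR tag 1: epoch-based date/time
plain_int = 1679875200

# Two distinct inputs collapse to the same JSON value,
# so the lossy mapping is not an isomorphism.
assert strip_tags(epoch_time) == strip_tags(plain_int)
```

Under the lossy policy a schema cannot tell the timestamp from the plain integer. Under the strict policy the tagged input is rejected before validation starts.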

Closes #1274

@gregsdennis
Member

Would you include YAML as a non-JSON input?

@awwright
Member Author

Yes, YAML also has a larger value space than JSON. For example, it supports circular references (Anchors and Aliases)—there's no way to encode this to JSON and then back to YAML. However, for a certain subset of YAML, you can just convert it to JSON, then validate that. (Or some equivalent calculation, if you want to optimize away the "conversion to JSON" step.)
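
To make the circular-reference point concrete without depending on a YAML parser, here is a small Python sketch that builds the in-memory structures by hand. The assumption (labeled in the comments) is that a YAML loader supporting recursive aliases could produce a self-containing list for a document like `&a [*a]`. The standard `json` module stands in for "convert it to JSON".

```python
import json

# A list that contains itself. Assumption: a YAML loader that supports
# recursive aliases could build this for the document "&a [*a]".
node = []
node.append(node)

try:
    json.dumps(node)
    serializable = True
except ValueError:
    # json.dumps detects the cycle and refuses to serialize it.
    serializable = False

# A non-circular alias is different: it can be expanded into ordinary JSON.
shared = {"host": "example.org"}
doc = {"primary": shared, "backup": shared}
expanded = json.dumps(doc, sort_keys=True)

# After the round trip the values are equal, but the sharing is gone,
# so the anchor/alias relationship cannot be recovered from the JSON.
loaded = json.loads(expanded)
```

The second case is the "certain subset of YAML" situation: the data validates fine as JSON, but the JSON form cannot be turned back into the original aliased YAML.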

Maybe this can be addressed: "Non-JSON formats may be validated if there is a single correct representation as JSON. Values without a JSON representation will either be indistinguishable, or cause an error." Maybe that's enough guidance?
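
One way to read that proposed wording, sketched in Python under stated assumptions: `to_json_model` is a hypothetical helper, not part of any JSON Schema implementation, and the choice of which Python types are merged and which are rejected is mine, purely for illustration. Tuples show the "indistinguishable" case; bytes, non-string keys, and NaN show the "cause an error" case.

```python
import math
from typing import Any


def to_json_model(value: Any) -> Any:
    """Map an in-memory value onto JSON's value space, or raise ValueError."""
    if value is None or isinstance(value, (bool, str)):
        return value
    if isinstance(value, (int, float)):
        if isinstance(value, float) and not math.isfinite(value):
            raise ValueError("NaN and infinities have no JSON representation")
        return value
    if isinstance(value, (list, tuple)):
        # Tuples and lists both become JSON arrays: indistinguishable afterwards.
        return [to_json_model(v) for v in value]
    if isinstance(value, dict):
        out = {}
        for key, item in value.items():
            if not isinstance(key, str):
                raise ValueError(f"non-string key {key!r} has no JSON representation")
            out[key] = to_json_model(item)
        return out
    # Everything else (bytes, sets, dates, ...) is an error in this sketch.
    raise ValueError(f"{type(value).__name__} has no JSON representation")
```

A validator following this reading would run the conversion (or an equivalent check) first and only evaluate the schema against the result.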

@gregsdennis
Member

I think that's why we state that we operate on the JSON data model. I believe there's already text that says JSON Schema can operate in any format that maps into that data model.

@awwright
Member Author

Well, that's the paragraph I'm proposing to remove, at least from core. (Again that wouldn't suggest you can't pass alternate serializations to a validator, just that it's out of scope to describe in core.)

Related to this, I was thinking that "data model" could be simplified too. The data model is something I introduced to address the fact that the same value in JSON can be represented in multiple different ways. But the section is largely a paraphrase of the instance equality section, so it may be easier just to say "the data model distinguishes JSON documents by those that are not instance equal."

And then after this, we can re-examine how non-JSON formats fit into this, maybe by specifying how a non-JSON document can be compared for instance equality to a JSON document.
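
For reference, instance equality as I understand it from the spec can be sketched in a few lines of Python. This is an illustration, not normative text: `instance_equal` is a hypothetical name, and it assumes values shaped like the output of Python's `json.loads` (None, bool, int/float, str, list, dict).

```python
import json
from typing import Any


def instance_equal(a: Any, b: Any) -> bool:
    """Compare two parsed JSON values by value, ignoring serialization details."""
    # Booleans first: in Python True == 1, but JSON true and 1 are different values.
    if isinstance(a, bool) or isinstance(b, bool):
        return isinstance(a, bool) and isinstance(b, bool) and a == b
    if a is None or b is None:
        return a is None and b is None
    if isinstance(a, (int, float)) and isinstance(b, (int, float)):
        return a == b  # 1 and 1.0 have the same mathematical value
    if isinstance(a, str) and isinstance(b, str):
        return a == b  # code point by code point
    if isinstance(a, list) and isinstance(b, list):
        return len(a) == len(b) and all(instance_equal(x, y) for x, y in zip(a, b))
    if isinstance(a, dict) and isinstance(b, dict):
        return a.keys() == b.keys() and all(instance_equal(a[k], b[k]) for k in a)
    return False


# Different serializations, same instance.
same = instance_equal(json.loads('{"a": 1.0, "b": [2]}'), json.loads('{"b":[2],"a":1}'))
# Array order matters, so these are different instances.
different = instance_equal(json.loads("[1, 2]"), json.loads("[2, 1]"))
```

Extending this to non-JSON formats would mean defining how each format's values are turned into something this comparison accepts, which is the open question here.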

@awwright
Member Author

Like I mentioned above, this issue may be a good place to consolidate "Instance Data Model" and "Instance Equality" into a single section. Each section is describing essentially the same concept just in different terms.

@jdesrosiers
Member

"this is an interesting fact to point out. However, this [...] is somewhat outside the scope of JSON Schema, and so should be removed."

I agree. In fact, I think this kind of thing happens a lot in the spec and it would be nice to clean some of these things up.

@gregsdennis
Member

@awwright What, precisely, are you proposing be removed: that whole statement, or just the bit at the end about CBOR?

@gregsdennis
Member

Action is to remove the phrase highlighting CBOR. I think the rest is pertinent.

@gregsdennis gregsdennis added this to the stable-release milestone Jun 18, 2024
Labels
None yet
Projects
Status: Awaiting PR
Development

No branches or pull requests

3 participants