$data #51

handrews · 2016-09-16T20:33:59Z

Originally written by @geraintluff at https://github.com/json-schema/json-schema/wiki/%24data-(v5-proposal)

NOTE: JSON Relative Pointer is defined as an extension of JSON Pointer, which means that an absolute JSON pointer is legal anywhere that a relative pointer is mentioned (but not vice versa).

Absolute JSON Pointers always begin with /, while relative JSON pointers always begin with a digit. Resolving a pointer beginning with / behaves the same whether it is being resolved "relative" to a specific location or not, just as resolving a URI "/foo/bar" is resolved the same whether there is an existing path component to the URI or not.
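
The resolution rule in the note above can be sketched in a few lines. This is only an illustration, not part of the proposal: the function names, and the choice to represent the current location as a list of reference tokens, are assumptions made for the example. The `#` suffix form of Relative JSON Pointer is deliberately left out.

```python
import re

def pointer_tokens(pointer):
    """Split a JSON Pointer such as '/a/b' into unescaped reference tokens."""
    if pointer == "":
        return []
    # Per RFC 6901, '~1' decodes to '/' and '~0' decodes to '~', in that order.
    return [t.replace("~1", "/").replace("~0", "~") for t in pointer.split("/")[1:]]

def walk(document, tokens):
    """Follow reference tokens downwards from document."""
    value = document
    for token in tokens:
        value = value[int(token)] if isinstance(value, list) else value[token]
    return value

def resolve(root, current_path, pointer):
    """Resolve an absolute or relative pointer.

    current_path (an assumption of this sketch) is the list of reference
    tokens that locates the current instance inside root.
    """
    if pointer.startswith("/"):
        # Absolute pointers ignore the current location entirely.
        return walk(root, pointer_tokens(pointer))
    match = re.fullmatch(r"(0|[1-9][0-9]*)(/.*)?", pointer, re.DOTALL)
    if match is None:
        raise ValueError(f"not a supported pointer: {pointer!r}")
    levels_up = int(match.group(1))
    if levels_up > len(current_path):
        raise ValueError("pointer climbs above the document root")
    base = list(current_path[: len(current_path) - levels_up])
    return walk(root, base + pointer_tokens(match.group(2) or ""))

doc = {"smaller": 3, "larger": 5}
# Both calls reach the same value: one climbs from /larger, one starts at the root.
print(resolve(doc, ["larger"], "1/smaller"))
print(resolve(doc, ["larger"], "/smaller"))
```

An absolute pointer gives the same result no matter which `current_path` is passed, which is the point of the URI analogy above.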

Proposed keywords

$data

This keyword would be available:

- inside any schema
- contained in an object ({"$data": ...}) for the following schema properties:
  - minimum/maximum
  - exclusiveMinimum/exclusiveMaximum
  - minItems/maxItems
  - enum
  - more...
- contained in an object ({"$data": ...}) for the following LDO properties:
  - href
  - rel
  - title
  - mediaType
  - more...

Purpose

This keyword would allow schemas to use values from the data, specified using Relative JSON Pointers.

This allows more complex behaviour, including interaction between different parts of the data.

When used inside LDOs, this allows extraction of many more link attributes/parameters from the data.

Values

Wherever it is used, the value of $data is a Relative JSON Pointer.

Behaviour

If the $data keyword is defined in a schema, then before any further processing of the schema:

The value of $data is interpreted as a Relative JSON Pointer.
The pointer is resolved relative to the current instance being validated/processed/etc.
The resolved value is taken to be the value of the schema for all further processing.

When used in one of the permitted schema/LDO properties, then before any further processing of the schema/LDO:

The value of $data is interpreted as a Relative JSON Pointer.
The pointer is resolved relative to the current instance being validated/processed/etc.
The resolved value is substituted as the property value.

Example

{
    "$schema": "http://json-schema.org/draft-04/schema#",
    "type": "object",
    "properties": {
        "smaller": {"type": "number"},
        "larger": {
            "type": "number",
            "minimum": {"$data": "1/smaller"},
            "exclusiveMinimum": true
        }
    },
    "required": ["larger", "smaller"]
}

In the above example, the "larger" property must be strictly greater than the "smaller" property.
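
To make the substitution concrete, here is a rough sketch of how a validator might evaluate the "larger" subschema. It is not taken from any implementation: `resolve_relative` and `check_minimum` are hypothetical helpers, the resolver only handles object keys (no arrays, no escaping), and the boolean `exclusiveMinimum` follows draft-04 as in the example.

```python
def resolve_relative(root, path, pointer):
    """Resolve a pointer like '1/smaller' from the location given by path.

    Simplified on purpose: object keys only, no '~' escapes, no '#' suffix.
    """
    up, _, rest = pointer.partition("/")
    target = path[: len(path) - int(up)] + (rest.split("/") if rest else [])
    value = root
    for token in target:
        value = value[token]
    return value

def check_minimum(root, path, subschema):
    """Apply minimum/exclusiveMinimum after substituting any $data value."""
    minimum = subschema["minimum"]
    if isinstance(minimum, dict) and "$data" in minimum:
        # The resolved instance value is substituted as the keyword value.
        minimum = resolve_relative(root, path, minimum["$data"])
    value = resolve_relative(root, path, "0")
    if subschema.get("exclusiveMinimum", False):
        return value > minimum
    return value >= minimum

larger_schema = {
    "type": "number",
    "minimum": {"$data": "1/smaller"},
    "exclusiveMinimum": True,
}

print(check_minimum({"smaller": 3, "larger": 5}, ["larger"], larger_schema))  # True
print(check_minimum({"smaller": 5, "larger": 5}, ["larger"], larger_schema))  # False
```

What should happen when the resolved value is not a number is not covered by the proposal text, so the sketch does not try to handle it.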

Concerns

Theoretical purity

Currently, validation is "context-free", meaning that one part of the data has minimal effect on the validation of another part. This has an effect on things like referencing sub-schemas. Changing this is a big issue, and should not be done lightly.

Some interplay of different parts of the data can currently be specified using oneOf (and the proposed switch) - but crucially, these constraints are specified in the schema for a common parent node, meaning that sub-schema referencing is still simple.

The use of $data also (in some cases) limits the amount of static analysis that can be done on schemas, because their behaviour becomes much more data-dependent. However, the expressive power it opens up is quite substantial.

Not available for all keywords

It's also tempting to allow its use for all schema keywords - however, not only is that a bad idea for keywords such as properties/id, but it also might present an obstacle to anybody extending the standard.

Not available inside `enum` values

It should be noted that while {"enum": {"$data":...}} would extract a list of possible values from the data, {"enum": [{"$data":...}]} would not - it would in fact specify that there is only one valid value: {"$data":...}.
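
The distinction can be illustrated with a small sketch. `effective_enum` and the `lookup` callable are hypothetical names used only here, and the lookup is stubbed with a fixed table instead of a real pointer resolver.

```python
def effective_enum(enum_value, lookup):
    """Return the list of allowed values for an enum keyword.

    Substitution happens only when the keyword value itself is a
    {"$data": ...} object; anything nested inside an array is literal.
    """
    if isinstance(enum_value, dict) and "$data" in enum_value:
        return lookup(enum_value["$data"])
    return enum_value

# Stub standing in for pointer resolution against some instance (assumption).
lookup = {"1/allowed": ["red", "green"]}.__getitem__

# Keyword-level $data: the allowed values come from the instance.
print(effective_enum({"$data": "1/allowed"}, lookup))

# $data inside the array: a single allowed value, the literal object itself.
print(effective_enum([{"$data": "1/allowed"}], lookup))
```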

Similar concerns would exist with an extra keyword like constant - what if you want the constant value to be a literal {"$data":...}? However, perhaps constant could be given this data-templating ability, and if you want a literal {"$data":...}, then you can still use enum.

Describing using the meta-schema

The existing mechanics of $ref can be nicely described using a rel="full" link relation.

The mechanics of $data, however, would be impossible to even approach in the meta-schema. We could describe the syntax, but nothing more. Is this a problem?


handrews · 2016-09-17T07:29:52Z

I feel that there are several use cases here, and it might be best to split them up.

URI Template resolution

For hypermedia interactions, where the instance data must be referenced in order for hyperlinking to work at all, the extended templating syntax from issue #52 covers the necessary use cases. It narrowly targets hyperlinking and does not involve instance data in any other aspect of JSON Schema, as it applies only in situations where we were already referencing instance data.

Link title

I can see how a link title might reference data in the same way as the URI Template. The template may include the id of a related thing, while the title may include the related thing's name. Either way, the URI and the title are both things that are presented back to the user, which should be affected by the instance data as they are describing a relation involving that data.

Therefore I would prefer to see the URI Template extended syntax (with "vars") be used here rather than a more generic approach that applies to more than just hypermedia values.

rel and mediaType

The use cases for mediaType and rel are not immediately obvious to me. The relation type should not, in my mind, change based on the instance data. Only the specific instance to which the relation points should change. A mediaType specified at runtime from instance data would not be of use in planning what a program can do with different representations. It's not clear to me why you wouldn't just list out the possibilities. If some media type links may only be present some of the time, there are other ways to express that using "oneOf" (or possibly "switch") to associate links with only certain variations of the content.

Interactions during validation

I feel like this should somehow express the constraints in terms of the relations among the fields rather than pulling in data that will produce the desired result. So somehow explicitly saying that "larger" should be strictly greater than "smaller" without loading the data into the schema before validation.

I'm not 100% sure what that would look like. In this case, possibly very much like the $data example, as either way you need to reference the related field. But I would be more comfortable with something that clearly states "this is describing relationships among data" rather than "this loads a value from instance data, and treats it as part of the schema, whatever that happens to mean."

I don't feel like I'm articulating this well, but I'm going to go ahead and post this comment in the interest of provoking discussion :-)

HotelDon · 2016-09-19T03:28:49Z

@handrews
So, for me, my use cases sit entirely in the interactions during validation category, so that's what I'm going to speak to.

I think I have a basic understanding of what you're trying to say - you'd prefer a solution where JSON Schema validators use $data (or some other similar feature) as a pointer whose value is checked during validation, instead of recompiling the schema before validation even begins, where it inserts the value of those pointers directly into the schema.

Would it be possible to modify the proposal to remove the portions about modifying the schema directly, and include language elsewhere in v6+ that validation shouldn't modify schemas for any reason? Then, $data would be functionally identical to the way it works now, but doesn't encourage "bad behavior" among the various JSON Schema validators.

handrews · 2016-09-19T03:47:23Z

@HotelDon this is what I mean by "I'm not articulating this well"

It's not so much the reading/loading of the data (which likely has to be done lazily because of $refs), it's what you can do with it after it is loaded. Although thinking about this more I may be OK with it.

The way the proposal is written, with a list of allowed properties that trails off with "more..." left me very concerned about the scope. However what's not explicitly called out, but I think would be better than listing the fields, is that all of the fields proposed for $data take a literal value, and not a schema. We really need to make sure that data is never interpreted as a schema- that's a security nightmare- just use it to shove in links to all of your favorite malware sites!

But I think the intent here is that $data is only used to load data in place of a literal value. I can get behind that.

I still think it is valuable to separate the hypermedia template resolution cases out and use vars as specified in issue #52 for that. Since the values in vars are already assumed to be pointers into the instance, requiring them to be little {"$data": "/pointer"} objects instead of just pointers is overkill.

HotelDon · 2016-09-19T07:15:27Z

I had never considered the possibility of someone trying to load schemas with $data, so I guess that is what got me confused.

So maybe fix it to sound more like this:

This keyword would be available:

- inside any schema
- contained in an object ({"$data": ...}) for most schema properties that accept literal values. For example:
  - minimum/maximum
  - minItems/maxItems
  - pattern
  - enum
  - etc...

I'm still having a hard time wrapping my head around the hypermedia/LDO portion of this proposal, so I don't have much of an opinion on it. It might be helpful if @geraintluff chimed in to defend his original proposal a bit, assuming he still has any interest in doing so.

epoberezkin · 2016-10-28T20:35:47Z

My 2¢: people seem to use it a lot with Ajv, judging by the questions. So it must be useful.

I think relative JSON pointer should be extended to allow navigating array items (see #115)

handrews · 2016-10-31T19:02:39Z

I've become more receptive to this proposal while working with some of the more difficult hyper-schema problems, such as those discussed in #108.

awwright · 2016-12-03T18:52:49Z

I'm solidly of the opinion that checking data consistency is solidly out of the scope of JSON Schema. Although it's certainly an option for validators that do want to offer the feature.

And if it's a popular feature then... maybe it's something we have to look into, perhaps as a separate document though.

handrews · 2016-12-03T19:15:53Z

@awwright $data has important uses in hyper-schema whether it is available in general validation or not.

Relequestual · 2017-01-05T11:46:21Z

I can see this could be useful. A few clear use cases might be helpful if anyone has the time or inclination.

handrews · 2017-01-05T21:33:09Z

@Relequestual I'd like to see whether PR #179 is accepted or not before digging into use cases here. If it is accepted, that will clarify how to present the future use cases. If it is not, I'll need to come up with a different approach anyway.

handrews · 2017-08-30T23:03:16Z

I'm moving this out of draft-07/wright-*-02. It is a huge topic that has seen no progress and almost no real discussion in the past year. And there is no clear advocate with time available to move it forward.

handrews · 2017-09-26T18:51:55Z

Random thought: Would it make sense to define $data as part of a separate vocabulary for data interaction? (for lack of a better term)

If we went this route, it would also add to the use cases for #314 for understanding multiple vocabularies in use simultaneously.

epoberezkin · 2017-10-06T19:19:10Z

Separate vocabulary seems overkill...

johandorland · 2019-06-29T12:48:16Z

The context is of huge importance. Most validators split validation into two parts, parsing the schema and validating a document with this parsed schema. With $data you are missing some values when parsing the schema as their content is part of the document. It's not extremely difficult to implement in a validator, as it indeed uses the same reference mechanics as $ref, but it does require a fundamental change in the way schemas are processed.
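
A rough sketch of what that two-phase split can look like once $data is involved: the pointer is parsed when the schema is compiled, but the bound it names can only be looked up when an instance is validated. `compile_minimum` is a hypothetical helper, not the API of any validator, and the pointer handling is simplified to object keys.

```python
def compile_minimum(subschema):
    """Parse phase: build a validate(root, path) closure for 'minimum'."""
    raw = subschema["minimum"]
    if isinstance(raw, dict) and "$data" in raw:
        # The pointer syntax is known at parse time...
        up, _, rest = raw["$data"].partition("/")
        levels = int(up)
        tail = rest.split("/") if rest else []

        def bound(root, path):
            # ...but the value it names exists only at validation time.
            value = root
            for token in path[: len(path) - levels] + tail:
                value = value[token]
            return value
    else:
        def bound(root, path):
            # A literal bound is fully known at parse time.
            return raw

    def validate(root, path):
        """Validate phase: path locates the current instance inside root."""
        value = root
        for token in path:
            value = value[token]
        return value >= bound(root, path)

    return validate

static_check = compile_minimum({"minimum": 10})
dynamic_check = compile_minimum({"minimum": {"$data": "1/smaller"}})

print(static_check({"n": 12}, ["n"]))
print(dynamic_check({"smaller": 3, "larger": 5}, ["larger"]))
```

The validate closure now needs the whole document and the current location, not just the value being checked, which is the change in processing described above.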

askirmas · 2019-06-29T12:53:43Z

Most validators

Does another way exist? I would expect all validators to work this way.

The context is of huge importance

And it's obvious for a validator. That's why I see bingo

askirmas · 2019-06-29T13:11:27Z

Also, I see a big question for $data that is not so critical for $ref: the merge strategy. For $ref it is replace, as in Object.assign({}, $ref, holder), but for $data I'd prefer to allow the full variety (with one option per project, depending on its specification).
Ajv proposes $merge and $patch, but my opinion is the same as for $data: it is just

entities ~~should not~~ be multiplied unnecessarily

{
  "$schema": "http://json-schema.org/draft-07/schema",
  "properties": {
    "$data": {
      "oneOf": [
        {"type": "string", "format": "uri-reference"},
        {
          "type": "object",
          "required": ["uri"],
          "properties": {
            "uri": {
              "type": "string", "format": "uri-reference"
            },
            "strategy": {
              "type": "string",
              "default": "replace",
              "enum": ["replace", "merge", "replace_recursive", "merge_recursive"]
            },
            "mergeRecursiveTypeConflicts": {
              "description": "For next step of processing $data will be instruction. Let's say like $ref/type. So merge ['boolean', 'integer'] with 'string' can produce ['boolean', 'integer', 'string'] that is OOP extending"
            }
          }
        }
      ]     
    }
  }
}

handrews · 2020-01-15T00:09:49Z

See @awwright 's Scope of JSON Schema Validation document for why this is unlikely to be taken up.

awwright · 2020-02-25T07:53:11Z

Can we close this out since the general consensus is against this?

handrews · 2020-02-25T09:42:19Z

@awwright yes, I think it's time. The most relevant current discussions are in #855 and #549.

yww325 · 2020-07-26T13:41:40Z

See @awwright 's Scope of JSON Schema Validation document for why this is unlikely to be taken up.

I can't understand why the scope document is against this proposal.
The scope document says: "Validating data consistency may involve: Scanning the rest of the document for a referenced value".
Isn't that what this proposal is all about?

awwright · 2020-07-26T23:32:07Z

@yww325 Find this paragraph:

Many applications simply wish to test that an ID is defined in another part of the same document. Even though this case would be compute-bound, it is still outside the scope of JSON Schema validation for several reasons...

1valdis · 2021-11-22T16:40:33Z

Just a note from my perspective as an AJV user. I've been using AJV for 3 years and only today discovered the $data keyword in AJV, and that it can also reference array lengths. It felt like an epiphany. Before, JSON Schema was just a way to do some very basic and dumb validation that still required supporting scripting for more complex validation scenarios. With $data, it suddenly became so much more "intelligent" and useful.

So as a user of JSON Schema I'm totally up for this (or an at least equally powerful alternative) being considered as a part or at least an extension of some sort.

laurisvan · 2021-12-06T17:18:29Z

While commenting on already closed issues might not be good practice, I must also say that $data in ajv has been extremely beneficial for us. I don't know how I could otherwise have done cross-references between the values in validated data.

Leaving it out makes JSON Schema much less expressive for us, and we need to do more validation in application code that we could have left to the schema.

gregsdennis · 2021-12-06T18:25:55Z

The sentiment is shared, @laurisvan. However, it's disagreement about how it should work that has kept it out of the spec. AJV's implementation is how that author thought it should be done.

I also have implemented my own idea (and put it in a vocabulary) for my library JsonSchema.Net.

While AJV may be in wider use, my implementation is arguably more in line with the spec's current state since it defines a vocabulary.

Until alignment on how it should work can be achieved, we can't put it in the spec. If you would like to open this conversation again, I would suggest opening a new issue that:

summarizes the discussion here
describes how known approaches (notably mine and AJV's) work along with their advantages and deficiencies

sandrina-p · 2023-03-03T16:55:13Z

For those in the future who will read @gregsdennis's last comment, here's the updated link to the vocabulary he mentioned.

gregsdennis · 2023-03-22T00:27:54Z

🤦 It's changed again. I've updated the link above. Hopefully this one will stick.

flq · 2023-06-14T10:31:46Z

Sorry @gregsdennis but the documentation doesn't bring me to the point of understanding how to use the "data" keyword in the context of Schema.NET.

I'm looking at https://docs.json-everything.net/schema/vocabs/data-2022/ and there's only an example for minValue and that data MUST be an object and may use a JSON pointer. I'm trying to express the (apparently often requested) constraint that an item within an array must be one of the names laid out in a different array in the instance.

gregsdennis · 2023-06-14T10:38:13Z

Hey there @flq. So that's the vocab spec itself. There's a subfolder in that "Prebuilt Vocabularies" folder called "Examples". In there you'll find examples for instance and external data. I suspect you want instance data.

The link to that doc page is https://docs.json-everything.net/schema/examples/data-ref.

If you have any other issues you find with my site or libraries, please feel free to open an issue in my repo.

flq · 2023-06-14T12:03:05Z

Thank you for the response - I've joined the slack channel to elaborate.

clenk mentioned this issue Oct 19, 2016

Validation: Improving date-time: min/max, linking & step #99

Closed

This was referenced Oct 22, 2016

Checking value of an attribute against value of other attributes in input json data json-schema/json-schema#160

Closed

Validating equivalence between two (or more) properties json-schema/json-schema#155

Closed

handrews mentioned this issue Nov 3, 2016

Revive Relative JSON Pointer I-D #126

Closed

handrews changed the title ~~v6 validation and hyper-schema: $data~~ validation and hyper-schema: $data Nov 24, 2016

Relequestual added the Type: Enhancement label Jan 5, 2017

timgdavies mentioned this issue Jan 19, 2017

Articulating cross-references in the schema open-contracting/standard#414

Closed

epoberezkin mentioned this issue Feb 13, 2017

Relative constraints #250

Closed

handrews added $data labels May 16, 2017

handrews added this to the draft-07 (wright-*-02) milestone May 16, 2017

handrews mentioned this issue Jul 28, 2017

enum type without specifying values #340

Closed

mrkvon mentioned this issue Aug 3, 2017

Refactoring validation to json-schema ditup/ditapi#14

Merged

handrews modified the milestones: draft-future, draft-07 (wright-*-02) Aug 30, 2017

handrews removed the hypermedia label Sep 2, 2017

handrews changed the title ~~validation and hyper-schema: $data~~ $data Sep 2, 2017

epicfaace mentioned this issue Jul 25, 2019

[Question] Dynamic titles for array fields rjsf-team/react-jsonschema-form#649

Closed

epicfaace mentioned this issue Dec 9, 2019

Enable $data reference usage for AJV validation rjsf-team/react-jsonschema-form#1255

Closed


SridharSubramaniam mentioned this issue Dec 19, 2019

why schema keyword value can not be a path？ networknt/json-schema-validator#236

Closed

jgaehring mentioned this issue Jan 17, 2020

Provide field requirements/restrictions/defaults for log types at /farm.json farmOS/farmOS#231

Closed

epoberezkin mentioned this issue Jan 23, 2020

Update link to $data proposal ajv-validator/ajv#1153

Closed

handrews closed this as completed Feb 25, 2020

yhack mentioned this issue Feb 28, 2020

Support non-hierarchical dependencies json-schema-org/json-schema-vocabularies#20

Open

epicfaace mentioned this issue Mar 22, 2020

$data passed into AJV instance rjsf-team/react-jsonschema-form#1668

Closed


avan2s mentioned this issue Jun 17, 2020

Unable to use $data reference networknt/json-schema-validator#302

Closed

grv87 mentioned this issue Dec 9, 2021

Cross-field validation konform-kt/konform#29

Open

manoj-pillay-10gen mentioned this issue Apr 6, 2023

Validation for interdependent properties manoj-pillay-10gen/simple-search-index-validator#10

Open

aahei mentioned this issue May 23, 2023

Add Attribute Type Checking for Sentiment, Aggregation, Hash Join Operators in Front-End Texera/texera#1924

Merged

georeith mentioned this issue Oct 6, 2023

Support @required and $data annotations. vega/ts-json-schema-generator#1788

Closed

constantinpopa10 mentioned this issue Jul 11, 2024

Support for JSON schema draft 2020-12, 2019-09 Apicurio/apicurio-registry#2689

Open
