Add section on schema resource and compound schema resource #675

Open
wants to merge 1 commit into
base: main
15 changes: 14 additions & 1 deletion pages/learn/glossary.md
@@ -14,6 +14,12 @@ If you encounter a term you wish were defined here, please feel free to [file an

The entries on this page can be linked to via anchor links (e.g. `https://json-schema.org/learn/glossary.html#vocabulary`) when sharing a definition with others.

### compound schema resource
Member

The correct term is "Compound Schema Document".


A compound schema resource is a JSON document which has multiple embedded [JSON schema resources](#schema-resource). It is important to note that for a schema to be embeddable it must define a `$id` keyword, which is used to hold its unique identifier. These embedded schema resources collectively define various aspects or features of the overall schema.
Member

> A compound schema resource is a JSON document which has multiple embedded JSON schema resources.

You can't use the word "embedded" in this sentence. The root schema counts as one of the "multiple" Schema Resources, but it isn't "embedded". So, the way this is currently worded, there would have to be two embedded schemas (three Schema Resources) before it would be considered a Compound Schema Document.

Member

> It is important to note that for a schema to be embeddable it must define a $id keyword, which is used to hold its unique identifier.

Maybe it's just me, but it feels awkward to phrase it like this. I would say it the other way around. A subschema is an embedded schema if it has an $id keyword.

Member

> These embedded schema resources collectively define various aspects or features of the overall schema.

"Features" isn't the right word here. We usually use that term to refer to features of JSON Schema as a language. I think the word you're looking for here is "constraints". We say that each keyword in a schema adds a constraint to the schema.


Compound schema resource is vital when working with multiple schema resources. By [bundling](../understanding-json-schema/structuring#bundlingbundling) these schema resources into a single schema document, distribution is made easier.
Member

> Compound schema resource is vital when working with multiple schema resources.

I'm not sure this is true. A system can happily work with every resource in its own document. Ideally, this is the way it should be done, but that's not always practical.

Also, bundling, as described by that link, isn't the only reason to embed resources. In an OpenAPI document block, each schema could have its own identifier, making it a distinct resource which could be `$ref`'d from anywhere.


### dialect

A cohesive collection of [keywords](#keyword) available for use within a schema, often representing a use-case specific single release of the JSON Schema specification.
@@ -106,6 +112,13 @@ Strictly speaking, according to the specification, schemas are themselves [JSON

In recent [drafts](#draft) of the specification, a schema is either a JSON object or a JSON boolean value.

#### schema resource

A schema resource is a [schema](#schema) which is canonically identified by an absolute URI. This implies all root/parent schemas are schema resources as they can be identified by an absolute URI.
Member

> A schema resource is a schema which is canonically identified by an absolute URI.

This is just taken verbatim from Core 4.3.5. The idea of this glossary is to make things a bit more plain.

> This implies all root/parent schemas are schema resources as they can be identified by an absolute URI.

This isn't precisely true.

```json
{ "type": "object" }
```

is a root schema, but it's not a resource because it's not identifiable.


We should say that a schema resource is any schema or subschema which carries an `$id` keyword and then reiterate the specification's suggestion that all root schemas should carry an `$id` keyword.

Member

Greg, I think you've uncovered an inconsistency in the spec. I think that your example has to be considered a Schema Resource. Here's a slightly more complex schema to explain why.

```json
{
  "type": "object",
  "properties": {
    "foo": { "$ref": "https://example.com/schema/foo" }
  },
  "$defs": {
    "foo": {
      "$id": "https://example.com/schema/foo",
      "type": "string"
    }
  }
}
```

The root schema is one of the Schema Resources that makes up this Compound Schema Document, but it doesn't have an identifier. If we don't consider the root schema a Schema Resource, then this wouldn't be a Compound Schema Document, which doesn't make sense. There has to be an exception so that the root schema doesn't need to have $id.
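
The counting argument above can be made concrete with a rough Python sketch. It is not taken from the specification or from any library: `iter_subschemas` and `list_resources` are hypothetical helpers, only a small subset of subschema-bearing keywords is walked, and the sketch adopts the reading argued for here, namely that the root schema counts as a resource even when it has no `$id`.

```python
# Hypothetical helpers for illustration; not part of any JSON Schema library.
# Assumption: only this small subset of subschema-bearing keywords is walked.
SCHEMA_MAP_KEYWORDS = ("properties", "$defs")         # value: name -> subschema
SCHEMA_LIST_KEYWORDS = ("allOf", "anyOf", "oneOf")    # value: list of subschemas
SINGLE_SCHEMA_KEYWORDS = ("not", "items", "additionalProperties")


def iter_subschemas(schema):
    """Yield the direct subschemas of `schema` for the keywords listed above."""
    if not isinstance(schema, dict):  # boolean schemas contain no subschemas
        return
    for keyword in SCHEMA_MAP_KEYWORDS:
        value = schema.get(keyword)
        if isinstance(value, dict):
            yield from value.values()
    for keyword in SCHEMA_LIST_KEYWORDS:
        value = schema.get(keyword)
        if isinstance(value, list):
            yield from value
    for keyword in SINGLE_SCHEMA_KEYWORDS:
        if keyword in schema:
            yield schema[keyword]


def list_resources(root):
    """Return one label per schema resource found in the document.

    Assumption: the root always counts as a resource, even without `$id`;
    every subschema that carries `$id` is an embedded resource.
    """
    root_label = root.get("$id", "<root, no $id>") if isinstance(root, dict) else "<root, no $id>"
    resources = [root_label]
    stack = list(iter_subschemas(root))
    while stack:
        sub = stack.pop()
        if isinstance(sub, dict) and "$id" in sub:
            resources.append(sub["$id"])
        stack.extend(iter_subschemas(sub))
    return resources


# The example schema from the comment above.
document = {
    "type": "object",
    "properties": {"foo": {"$ref": "https://example.com/schema/foo"}},
    "$defs": {"foo": {"$id": "https://example.com/schema/foo", "type": "string"}},
}

resources = list_resources(document)
is_compound = len(resources) > 1  # more than one resource in a single document
print(resources, is_compound)
```

Under these assumptions the example document yields two resources (the unidentified root and the embedded `foo` schema), so it is treated as a compound document, while `{ "type": "object" }` on its own yields just one.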

Member

> A schema resource is a schema which is canonically identified by an absolute URI.

I'd drop the word "canonically". It's just techno-babble that's not necessary to describe the concept to laypeople.


The URI serves as a canonical identifier for the schema. This allows it to be referenced and reused in other schemas. Consequently, all schema resource must maintain semantic clarity, without which reusability will be difficult.
Member

Suggested change
The URI serves as a canonical identifier for the schema. This allows it to be referenced and reused in other schemas. Consequently, all schema resource must maintain semantic clarity, without which reusability will be difficult.
The value in the `$id` keyword should resolve to an absolute URI, which serves as a canonical identifier for the schema. This allows it to be referenced and reused in other schemas.

> Consequently, all schema resource must maintain semantic clarity, without which reusability will be difficult.

I'm not sure what you mean by "semantic clarity".

Member

In addition to my 👍 on Greg's suggested change, I suggest dropping the word "canonical". The lay-person doesn't know that word and it's not necessary to convey the concept.
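
To illustrate what "resolve to an absolute URI" in the suggested wording could look like in practice, here is a minimal Python sketch. It uses the standard library's `urllib.parse.urljoin` for reference resolution; `resolve_id` and the retrieval URI are hypothetical, and real implementations handle more cases (nested base URIs, fragments) than this does.

```python
from urllib.parse import urljoin


def resolve_id(base_uri, schema):
    """Resolve a schema's `$id` against a base URI (hypothetical helper).

    Returns None when the schema carries no `$id`.
    """
    id_value = schema.get("$id")
    if id_value is None:
        return None
    # urljoin leaves an absolute `$id` unchanged and resolves a relative
    # one against the base URI.
    return urljoin(base_uri, id_value)


# Assumed retrieval URI of the enclosing document, made up for this example.
retrieval_uri = "https://example.com/schemas/root.json"

absolute = resolve_id(retrieval_uri, {"$id": "https://example.com/schema/foo"})
relative = resolve_id(retrieval_uri, {"$id": "address.json"})
print(absolute)  # https://example.com/schema/foo
print(relative)  # https://example.com/schemas/address.json
```

Either way the result is a single absolute URI that other schemas can point a `$ref` at.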



### subschema

A [schema](#schema) which is itself contained within a surrounding parent schema.
@@ -117,7 +130,7 @@ For example, the `not` keyword takes a subschema value and inverts its result, s
Some subschemas may appear in more complex nested locations within a parent schema.
The `allOf` keyword, for instance, takes an array of multiple subschemas and succeeds whenever all of the subschemas do individually.

Whether something that otherwise *appears* to be a schema (based on its contents) actually *is* a subschema can be misleading at first glance without context or knowlege about its location within the parent schema.
Whether something that otherwise *appears* to be a schema (based on its contents) actually *is* a subschema can be misleading at first glance without context or knowledge about its location within the parent schema.
Member

The spelling fix is appreciated, but the wording here is weird. Can we clean this up while we're here?

Specifically, in our above example, `{"type": "string"}` was a subschema of a larger schema, but in the schema `{"const": {"type": "string"}}`, it is *not* a subschema.
Even though as a value it looks the same, the `const` keyword, which compares instances against a specific expected value, does *not* take a subschema as its value, its value is an opaque value with no particular meaning (such that in this schema, the number 12 would be invalid, but the precise instance `{"type": "string"}` is valid).
Said more plainly, whether a particular value is a subschema or not depends on its precise location within a parent schema, as interpretation of the value depends on the defined behavior of the keyword(s) it lives under.
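
The difference between the two keywords can be sketched with a deliberately tiny validator. This is an illustration only, not how any real implementation works: `is_valid` is a hypothetical function that understands just `type` (and only `"string"`), `not`, and `const`, and it uses Python's `==` as a stand-in for JSON equality.

```python
def is_valid(instance, schema):
    """Toy validator supporting only `type: "string"`, `not`, and `const`."""
    if schema is True:
        return True
    if schema is False:
        return False
    if schema.get("type") == "string" and not isinstance(instance, str):
        return False
    # `not` takes a subschema: its value is evaluated recursively as a schema.
    if "not" in schema and is_valid(instance, schema["not"]):
        return False
    # `const` takes plain data: its value is only compared, never evaluated.
    # (Python equality is a simplification of JSON equality.)
    if "const" in schema and instance != schema["const"]:
        return False
    return True


value = {"type": "string"}  # the same JSON value in both schemas

# Under `not`, the value is a subschema.
print(is_valid(12, {"not": value}))       # True: 12 is not a string
print(is_valid("hi", {"not": value}))     # False: "hi" is a string

# Under `const`, the value is opaque data.
print(is_valid(12, {"const": value}))     # False: 12 != {"type": "string"}
print(is_valid(value, {"const": value}))  # True: the exact same value
```

The same dictionary is interpreted as a schema in one position and compared as plain data in the other, which is the point the paragraph above makes.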