v3.2: Guidance on searching and evaluating schemas #4743

handrews · 2025-06-21T01:12:41Z

NOTE 1: This is intended to clarify requirements that already exist but have never been well-defined, both by making certain things required and stating clearly that other things are not. It is particularly relevant in light of the Encoding Object changes, although the vaguely-defined behavior predates the new features.

Some OAS features casually state that they depend on the type of data being examined, or are implicitly ambiguous about how the data should be parsed.

This section attempts to provide some guidance and limits, requiring only that implementations follow the unambiguous, statically deterministic keywords $ref and allOf.

It also provides for just validating the data (when possible) and using the actual in-memory type when a schema is too complex to analyze statically.
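As a rough illustration of the approach described above, the sketch below searches a schema for a keyword by following only `$ref` and `allOf`, and falls back to the in-memory type of the data when the search is inconclusive. The function names (`resolve_ref`, `search_keyword`, `effective_type`, `json_type_of`) are hypothetical and not taken from the specification text, and the reference resolver assumes same-document `#/...` references only.

```python
from typing import Any, List, Optional, Set


def resolve_ref(ref: str, root: dict) -> Any:
    """Resolve a same-document reference such as '#/components/schemas/Id'.

    Assumption: only '#/...' JSON Pointer fragments are handled here.
    """
    if not ref.startswith("#/"):
        raise ValueError(f"unsupported reference in this sketch: {ref}")
    node: Any = root
    for token in ref[2:].split("/"):
        # Undo JSON Pointer escaping (~1 is '/', ~0 is '~').
        node = node[token.replace("~1", "/").replace("~0", "~")]
    return node


def search_keyword(schema: Any, keyword: str, root: dict,
                   _seen: Optional[Set[int]] = None) -> List[Any]:
    """Collect values of `keyword`, following only `$ref` and `allOf`."""
    _seen = set() if _seen is None else _seen
    if not isinstance(schema, dict) or id(schema) in _seen:
        return []  # boolean schemas and already-visited schemas add nothing
    _seen.add(id(schema))
    found = [schema[keyword]] if keyword in schema else []
    if "$ref" in schema:
        found += search_keyword(resolve_ref(schema["$ref"], root), keyword, root, _seen)
    for subschema in schema.get("allOf", []):
        found += search_keyword(subschema, keyword, root, _seen)
    return found


def json_type_of(value: Any) -> str:
    """Name the JSON type of an in-memory Python value."""
    if value is None:
        return "null"
    # bool is checked before int because bool is a subclass of int in Python.
    for py_type, name in ((bool, "boolean"), (int, "integer"), (float, "number"),
                          (str, "string"), (list, "array"), (dict, "object")):
        if isinstance(value, py_type):
            return name
    raise TypeError(f"not JSON-compatible: {type(value).__name__}")


def effective_type(schema: Any, root: dict, instance: Any) -> str:
    """Use the statically found type if unambiguous, else the in-memory type."""
    types = search_keyword(schema, "type", root)
    if types and all(isinstance(t, str) and t == types[0] for t in types):
        return types[0]
    return json_type_of(instance)


root = {
    "components": {
        "schemas": {
            "Id": {"type": "string"},
            "Wrapped": {"allOf": [{"$ref": "#/components/schemas/Id"},
                                  {"format": "int64"}]},
            "Either": {"oneOf": [{"type": "string"}, {"type": "integer"}]},
        }
    }
}
schemas = root["components"]["schemas"]
wrapped_type = effective_type(schemas["Wrapped"], root, "42")
either_type = effective_type(schemas["Either"], root, 42)
```

Here `Wrapped` resolves statically to `"string"` through `allOf` and `$ref`, while `Either` uses `oneOf`, which this sketch deliberately does not search, so the type comes from the in-memory value instead.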

One use of this is breaking apart schemas to use them with mixed binary and JSON-compatible data, and a new section has been added to address that.

Finally, a typo in a related section was fixed.

schema changes are included in this pull request
schema changes are needed for this pull request but not done yet
no schema changes are needed for this pull request

src/oas.md

handrews · 2025-06-22T01:16:07Z

@karenetheridge while I have your attention, do you think this is fine where it is or should it go under the Schema Object somewhere? I really could not decide.

handrews · 2025-06-22T03:59:54Z

I'm putting this in draft because based on @karenetheridge's feedback I'm going to rework it fairly substantially, but it's still of use when understanding how it fits with the other related PRs.

The effect of the rewrite should be the same, but I think the wording and organization will be significantly different. It's clear that the different use cases here need to be separated out and clarified. I think this ended up being a bit oddly abstract because of how I tried to split things up into PRs that don't conflict.

Move things under the Schema Object, organize by use case and by the point in the process at which things occur, and link directly from more parts of the spec so that the parts in the Schema Object section can stay more focused.

handrews · 2025-06-22T21:37:00Z

I have added a commit that almost totally rewrites this. You probably just want to review the whole thing rather than the per-commit diff, which will be a mess. The new version:

- Puts most things under the Schema Object
- Organizes use cases by the point in the process they occur relative to schema evaluation
- Links from elsewhere in the spec so that we do not need to include quite as much in the main part of the text

I do not think that has changed anything substantial, but it's essentially a new PR now.

handrews · 2025-06-23T01:13:19Z

@karenetheridge I'm going to mark various threads as resolved since the text is now so different that they are confusing. Please do not take that to mean I'm dismissing open questions; just re-start whatever is needed with comments on the new text, or as new top-level comments. Apologies for the inconvenience.

src/oas.md

Co-authored-by: Karen Etheridge <ether@cpan.org>

Also clarify that there is no one set list of keywords to search for, but rather each use case defines what is relevant.

handrews · 2025-06-28T22:42:14Z

@karenetheridge I trimmed back the multi-valued type requirements as from our discussion I just see too many ways it can go wrong. Now it's just "if you have [X, "null"] treat it like X" and everything else is optional guidance. How does that sit with you?
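A minimal sketch of the `[X, "null"]` rule mentioned above, for illustration only. The function name `normalize_type` is hypothetical, and accepting `"null"` in either position is an assumption of this sketch rather than something stated in the comment.

```python
from typing import Any, Optional


def normalize_type(type_value: Any) -> Optional[str]:
    """Reduce a `type` keyword value to a single type name where possible.

    A plain string is returned unchanged, and a two-element list containing
    "null" is treated as its other element. Anything else returns None,
    meaning this sketch does not determine a single type.
    """
    if isinstance(type_value, str):
        return type_value
    if isinstance(type_value, list) and len(type_value) == 2 and "null" in type_value:
        others = [t for t in type_value if t != "null"]
        if len(others) == 1:
            return others[0]
    return None


nullable_string = normalize_type(["string", "null"])
mixed = normalize_type(["string", "integer"])
```

Other multi-valued types, such as `["string", "integer"]`, are left undetermined here, in line with the comment's statement that everything else is optional guidance.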

handrews · 2025-07-09T19:09:00Z

@karenetheridge I'm marking various threads resolved as I think subsequent commits addressed them, and it's a lot of at least somewhat outdated discussion for folks to have to read through before tomorrow's call. Please feel free to re-raise anything that is still not addressed.

karenetheridge

One small edit (not a change introduced by you, but still an improvement, I think).

src/oas.md

lornajane · 2025-07-10T16:21:40Z

src/oas.md

@@ -2599,6 +2601,10 @@ Note that JSON Schema Draft 2020-12 does not require an `x-` prefix for extensio
 The [`format` keyword (when using default format-annotation vocabulary)](https://www.ietf.org/archive/id/draft-bhutton-json-schema-validation-01.html#section-7.2.1) and the [`contentMediaType`, `contentEncoding`, and `contentSchema` keywords](https://www.ietf.org/archive/id/draft-bhutton-json-schema-validation-01.html#section-8.2) define constraints on the data, but are treated as annotations instead of being validated directly.
 Extended validation is one way that these constraints MAY be enforced.

+In addition to extended validation, annotations are the most effective way to determine whether these keywords impact the type and structure of the fully parsed data.
+For example, formats such as `int64` can be applied to JSON strings, as JSON numbers have limitations that make large integers non-portable.
+If annotation collection is not available, implementations MUST perform a [schema search](#searching-schemas) for these keywords, and MUST document the limitations this imposes.


Suggested change

-If annotation collection is not available, implementations MUST perform a [schema search](#searching-schemas) for these keywords, and MUST document the limitations this imposes.
+If annotation collection is not available, implementations MUST perform a [schema search](#searching-schemas) for these keywords, and SHOULD document the limitations this imposes.

(removed, commented in wrong section)

lornajane

Minor suggestions from the TSC call

lornajane · 2025-07-10T16:38:52Z

src/oas.md

+For example, if `foo` had the schema `{"type": "string", "format": "int64"}`, the data structure used for validation would still be the same, but the application will need to convert the string `"42"` to the 64-bit integer `42`.
+Similarly, the `content*` keywords can indicate further structure within a string.
+
+Implementations MUST either use [annotation collection](#extended-validation-with-annotations) to gather this information, or perform a [schema search](#searching-schemas), and MUST document which approach it implements.
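
The conversion described in the quoted text can be sketched as follows. This is an illustrative assumption about how an application might act on the `format` value once it has been found (by annotation collection or by a schema search). The function name `apply_format` is hypothetical, and only the `int64` case is handled.

```python
import re
from typing import Any

INT64_MIN, INT64_MAX = -(2 ** 63), 2 ** 63 - 1


def apply_format(value: Any, schema: dict) -> Any:
    """Convert a validated JSON string to an integer when `format` is `int64`.

    Values that do not match this one case are returned unchanged.
    """
    is_int64_string = (
        isinstance(value, str)
        and schema.get("type") == "string"
        and schema.get("format") == "int64"
    )
    if not is_int64_string:
        return value
    # Accept only an optional minus sign followed by ASCII digits; Python's
    # int() alone would also accept whitespace and underscores.
    if not re.fullmatch(r"-?[0-9]+", value):
        raise ValueError(f"not a decimal integer: {value!r}")
    number = int(value)
    if not INT64_MIN <= number <= INT64_MAX:
        raise ValueError(f"out of int64 range: {value!r}")
    return number


converted = apply_format("42", {"type": "string", "format": "int64"})
untouched = apply_format("42", {"type": "string"})
```

Validation still sees the string `"42"` in both cases; only the application-level value differs.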


Suggested change

-Implementations MUST either use [annotation collection](#extended-validation-with-annotations) to gather this information, or perform a [schema search](#searching-schemas), and MUST document which approach it implements.
+Implementations MUST either use [annotation collection](#extended-validation-with-annotations) to gather this information, or perform a [schema search](#searching-schemas), and SHOULD document which approach it implements.

As discussed in the meeting, if implementations don't do this, what would they do instead? If there isn't anything they can do, then I think the MUST would stand.

I really did not expect this PR to get hung up on a debate about how much to require implementations to document their behavior, which I thought would be thoroughly non-controversial. Why would we not want them to do so?

So... I have no idea. I want everyone else to resolve their differences around documentation requirements so it doesn't hang up this PR, that's my opinion on the matter.

lornajane · 2025-07-10T16:42:02Z

src/oas.md

+
+Implementations MUST document which strategy or strategies they use, as well as any known limitations.
+
+##### Searching Schemas


Question about moving this section a little further up the document: who has thoughts?

lornajane · 2025-07-10T16:42:17Z

src/oas.md

+1. Use a placeholder value, on the assumption that no assertions will apply to the binary data and no conditional schema keywords will cause the schema to treat the placeholder value differently (e.g. a part that could be either plain text or binary might behave unexpectedly if a string is used as a binary placeholder, as it would likely be treated as plain text and subject to different subschemas and keywords).
+2. Perform [schema searches](#searching-schemas) to find the appropriate keywords (`properties`, `prefixItems`, etc.) in order to break up the subschemas and apply them separately to binary and JSON-compatible data.
+
+Implementations MUST document which strategy or strategies they use, as well as any known limitations.
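
The second strategy in the quoted list can be sketched roughly as below, for a single object schema whose keywords sit directly in it. The function name `split_object_schema` is hypothetical, and the sketch assumes the caller already knows which property names carry binary data, since how that is determined is outside what this hunk states.

```python
from typing import Dict, Set, Tuple


def split_object_schema(schema: dict, binary_names: Set[str]) -> Tuple[dict, Dict[str, dict]]:
    """Split an object schema into a JSON-compatible schema and binary subschemas.

    `binary_names` is supplied by the caller (an assumption of this sketch).
    Other keywords are copied unchanged, which may be wrong for schemas whose
    other keywords also mention the binary properties.
    """
    properties = schema.get("properties", {})
    json_props = {k: v for k, v in properties.items() if k not in binary_names}
    binary_props = {k: v for k, v in properties.items() if k in binary_names}

    json_schema = {k: v for k, v in schema.items() if k not in ("properties", "required")}
    json_schema["properties"] = json_props
    if "required" in schema:
        # Binary parts are handled separately, so only require the JSON parts here.
        json_schema["required"] = [n for n in schema["required"] if n in json_props]
    return json_schema, binary_props


form_schema = {
    "type": "object",
    "required": ["id", "file"],
    "properties": {
        "id": {"type": "string"},
        "file": {"contentMediaType": "image/png"},
    },
}
json_part, binary_part = split_object_schema(form_schema, {"file"})
```

The JSON-compatible part can then be validated as usual, while each entry in `binary_part` is applied separately to the corresponding binary data.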


Suggested change

-Implementations MUST document which strategy or strategies they use, as well as any known limitations.
+Implementations SHOULD document which strategy or strategies they use, as well as any known limitations.

Co-authored-by: Karen Etheridge <ether@cpan.org>

handrews · 2025-07-10T20:37:01Z

@lornajane @karenetheridge @duncanbeevers Can y'all sort out what we should be doing on documentation requirements and why? I have no idea why MUST requirements around documenting behavior are controversial, but all I really care about is that this does not hang up this PR. It sounds like @karenetheridge is disagreeing on one? I just want a broadly applicable rule that tells me what to do here.

handrews added this to the v3.2.0 milestone Jun 21, 2025

handrews requested a review from a team as a code owner June 21, 2025 01:12

handrews added the schema-object label Jun 21, 2025

handrews added the media and encoding label (Issues regarding media type support and how to encode data, outside of query/path params) Jun 21, 2025

This was referenced Jun 21, 2025

v3.2: (Split and smaller!) Support ordered multipart including streaming #4745 (Open)

v3.2: Support ordered multipart including streaming #4589 (Closed)

v3.2: Support nested multipart with nested Encoding Objects #4747 (Open)

karenetheridge reviewed Jun 21, 2025

handrews marked this pull request as draft June 22, 2025 03:57

Rework schema searching guidance

a3db2bb

Move things under the Schema Object, organize by use case and by the point in the process at which things occur, and link directly from more parts of the spec so that the parts in the Schema Object section can stay more focused.

handrews marked this pull request as ready for review June 22, 2025 22:08

karenetheridge reviewed Jun 26, 2025

handrews and others added 2 commits June 27, 2025 15:06

Fix spelling

0912400

Co-authored-by: Karen Etheridge <ether@cpan.org>

Punt on most multi-valued types

6290e79

Also clarify that there is no one set list of keywords to search for, but rather each use case defines what is relevant.

Fix incorrect bit about binary and schemas

fa12074

handrews mentioned this pull request Jul 4, 2025

Open Community (TDC) Meeting, Thursday 10 July 2025 #4752 (Open)

Fix typos

7928dbe

karenetheridge reviewed Jul 10, 2025

lornajane reviewed Jul 10, 2025

handrews mentioned this pull request Jul 10, 2025

Open Community (TDC) Meeting, Thursday 17 July 2025 #4763 (Open)

Improved old wording that had not been changed

e446e40

Co-authored-by: Karen Etheridge <ether@cpan.org>
