Clarify how Schema Objects require full-document parsing (3.1.1) #3758

Merged 3 commits on May 14, 2024
25 changes: 25 additions & 0 deletions versions/3.1.1.md
@@ -140,6 +140,31 @@ An OpenAPI Description (OAD) MAY be made up of a single document or be divided i

It is RECOMMENDED that the entry OpenAPI document be named: `openapi.json` or `openapi.yaml`.

#### <a name="parsingDocuments"></a>Parsing Documents

In order to properly handle [Schema Objects](#schemaObject), OAS 3.1 inherits the parsing requirements of [JSON Schema draft 2020-12 §9](https://datatracker.ietf.org/doc/html/draft-bhutton-json-schema-00#section-9), with appropriate modifications regarding base URIs as specified in [Relative References In URIs](#relativeReferencesURI).

This includes a requirement to parse complete documents before deeming a Schema object reference to be unresolvable, in order to detect keywords that might provide the reference target or impact the determination of the appropriate base URI.
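
As an illustration of why the enclosing document matters (a minimal sketch, not part of the specification text), the Python below resolves the same relative `$ref` twice: once after walking the whole document, and once as if the fragment had been parsed in isolation. The document, its retrieval URL, and the helper `base_uri_at` are hypothetical, and the helper deliberately simplifies base URI handling by treating any string-valued `$id` on the path as a schema identifier.

```python
from urllib.parse import urljoin


def base_uri_at(document, retrieval_uri, tokens):
    """Return the base URI in effect at the location reached by `tokens`.

    Simplification: any string-valued "$id" met on the way down is treated
    as a schema identifier. A real implementation would honor "$id" only
    where the node is actually parsed as a Schema Object.
    """
    base = retrieval_uri
    node = document
    for token in tokens:
        node = node[token]
        if isinstance(node, dict) and isinstance(node.get("$id"), str):
            # "$id" is itself resolved against the base URI in effect so far.
            base = urljoin(base, node["$id"])
    return base


# Hypothetical OAD, assumed to have been retrieved from this URL.
RETRIEVAL_URI = "https://example.com/api/openapi.json"
DOCUMENT = {
    "openapi": "3.1.0",
    "components": {
        "schemas": {
            "Pet": {
                "$id": "https://schemas.example.org/v2/pet",
                "properties": {"owner": {"$ref": "owner"}},
            }
        }
    },
}
OWNER_PATH = ["components", "schemas", "Pet", "properties", "owner"]

# Whole-document parsing sees the enclosing "$id" before resolving "$ref".
full = urljoin(base_uri_at(DOCUMENT, RETRIEVAL_URI, OWNER_PATH), "owner")

# Parsing the "owner" fragment in isolation only knows the retrieval URI.
isolated = urljoin(RETRIEVAL_URI, "owner")

print(full)
print(isolated)
```

Under these assumptions the two resolutions differ: the whole-document walk yields `https://schemas.example.org/v2/owner`, while the isolated fragment yields `https://example.com/api/owner`.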

Implementations MAY support complete-document parsing in any of the following ways:

* Detecting OpenAPI or JSON Schema documents using media types
* Detecting OpenAPI documents through the root `openapi` property
* Detecting JSON Schema documents through detecting keywords or otherwise successfully parsing the document in accordance with the JSON Schema specification
* Detecting a document containing a referenceable Object at its root based on the expected type of the reference
* Allowing users to configure the type of documents that might be loaded due to a reference to a non-root Object
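
The detection options in the list above could be combined roughly as follows. This is a hedged sketch only: the function name `classify_document`, the order in which the checks run, the media type sets, and the choice of "distinctive" JSON Schema keywords are all assumptions made for illustration, and the list itself does not prescribe any ordering.

```python
# Assumed media types; check what your tooling and servers actually send.
OPENAPI_MEDIA_TYPES = {"application/openapi+json", "application/openapi+yaml"}
SCHEMA_MEDIA_TYPES = {"application/schema+json"}

# Heuristic: core keywords that are unlikely to appear in other OAS Objects.
# Keywords such as "type" or "$ref" are left out because other Objects
# (Security Scheme Objects, Reference Objects) use the same names.
DISTINCTIVE_SCHEMA_KEYWORDS = ("$schema", "$id", "$defs", "$anchor", "$dynamicAnchor")


def classify_document(doc, media_type=None, expected_type=None, configured_type=None):
    """Guess what kind of document `doc` is before parsing it in full."""
    # User configuration takes priority in this sketch.
    if configured_type is not None:
        return configured_type
    # Media type supplied by the retrieval layer, if any.
    if media_type in OPENAPI_MEDIA_TYPES:
        return "openapi"
    if media_type in SCHEMA_MEDIA_TYPES:
        return "json-schema"
    # Root "openapi" property marks an OpenAPI document.
    if isinstance(doc, dict) and "openapi" in doc:
        return "openapi"
    # Boolean schemas are valid JSON Schema documents in draft 2020-12.
    if isinstance(doc, bool):
        return "json-schema"
    if isinstance(doc, dict) and any(k in doc for k in DISTINCTIVE_SCHEMA_KEYWORDS):
        return "json-schema"
    # Fall back to the Object type expected by the reference that led here.
    return expected_type if expected_type is not None else "unknown"
```

For example, `classify_document({"openapi": "3.1.0"})` returns `"openapi"`, while `classify_document({}, expected_type="Response Object")` falls through every check and returns the expected type of the reference.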

Implementations that parse referenced fragments of OpenAPI content without regard for the content of the rest of the containing document will miss keywords that change the meaning and behavior of the reference target.
In particular, failing to take into account keywords that change the base URI introduces security risks by causing references to resolve to unintended URIs, with unpredictable results.
While some implementations support this sort of parsing due to the requirements of past versions of this specification, in version 3.1, the result of parsing fragments in isolation is _undefined_ and likely to contradict the requirements of this specification.
Contributor


I may be missing some distinction, but this seems like the same behavior that's described as "implementation defined" in https://github.com/OAI/OpenAPI-Specification/pull/3732/files#diff-b92507e7acda65ae00a02236c555cefc68b6fca4661077b84c2fb9ab150e5e17R151

This section particularly addresses schema objects but the other PR seems to encompass schema objects too.

Member Author


@notEthan eh... (wobbles hand back and forth in a wishy-washy way). There's an overlap, but it's not quite the same. This is where we get into the "it's too hard to even explain" that I mentioned in the undefined/implementation-defined PR, but this one's not one of the worst, so here's an explanation:

If you have a document that consists of an empty JSON object, {}, that document is syntactically valid as several different Objects. Parsing that document from different reference contexts is still full-document parsing, so it does not fall into this undefined behavior. But it is parsing using multiple conflicting contexts, so it falls into the implementation-defined behavior of the other PR.

It's a pretty fine hair to split, but it is deterministic. The whole-document case is implementation-defined because at least one implementor I've spoken with did not think it required a fix in the spec. I'm trying to invalidate as few existing implementations as possible, which is tricky to balance with the scenarios that result in outright wrong behavior.

Member Author

@handrews handrews May 6, 2024


Neither of these changes gets into the deeper question of whether documents are parsed as JSON/YAML once or each time, or whether, once parsed as JSON/YAML, the resulting structures are further parsed as OAS Objects once or each time. That would start getting into whether various parsing steps are cached, and what is done when a cached Object doesn't agree with a new parsing context. There are many ways this might or might not be detected, and many strategies one could pick to handle it: new each time, first wins, last wins, any conflict is an error, etc.
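
To make the strategies named above concrete, here is an illustrative sketch (not something proposed in this thread or required by the specification). The `ObjectCache` class, its policy names, and the example location are hypothetical; the cache records only which Object type a location was parsed as, and shows how each policy reacts when the empty-document case is reached from two conflicting reference contexts.

```python
class ConflictError(Exception):
    """Raised when a location is re-parsed as a different Object type."""


class ObjectCache:
    """Tracks the Object type each location (URI plus fragment) was parsed as."""

    POLICIES = {"new-each-time", "first-wins", "last-wins", "error"}

    def __init__(self, policy="first-wins"):
        if policy not in self.POLICIES:
            raise ValueError(f"unknown policy: {policy}")
        self.policy = policy
        self._entries = {}

    def record(self, location, object_type):
        """Return the Object type that is in effect for `location`."""
        if self.policy == "new-each-time":
            # Nothing is cached, so every context gets its own parse.
            return object_type
        cached = self._entries.get(location)
        if cached is None or cached == object_type:
            self._entries[location] = object_type
            return object_type
        if self.policy == "first-wins":
            return cached
        if self.policy == "last-wins":
            self._entries[location] = object_type
            return object_type
        raise ConflictError(
            f"{location} was parsed as {cached}, now requested as {object_type}"
        )


# A document consisting of {} reached from two different reference contexts.
location = "https://example.com/empty.json#"
cache = ObjectCache("first-wins")
cache.record(location, "Response Object")
print(cache.record(location, "Parameter Object"))
```

With the `first-wins` policy the second call returns `Response Object`; switching the policy to `last-wins` returns `Parameter Object`, and `error` raises `ConflictError` instead.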

Contributor


> There's an overlap, but it's not quite the same.

I'm happy to accept that, and appreciate the explanation.

Member Author


I figured it might help folks to see what "complicated to explain" looks like :-)


While it is possible to structure certain OpenAPI Descriptions to ensure that they will behave correctly when references are parsed as isolated fragments, depending on this is NOT RECOMMENDED.
This specification does not explicitly enumerate the conditions under which such behavior is safe, and provides no guarantee for continued safety in any future versions of the OAS.

A special case of parsing fragments of OAS content would be if such fragments are embedded in another format, referred to as an _embedding format_ with respect to the OAS.
Note that the OAS itself is an embedding format with respect to JSON Schema, which is embedded as Schema Objects.
It is the responsibility of an embedding format to define how to parse embedded content, and OAS implementations that do not document support for an embedding format cannot be expected to parse embedded OAS content correctly.

#### <a name="structuralInteroperability"></a>Structural Interoperability

When parsing an OAD, JSON or YAML objects are parsed into specific Objects (such as [Operation Objects](#operationObject), [Response Objects](#responseObject), [Reference Objects](#referenceObject), etc.) based on the parsing context. Depending on how references are arranged, a given JSON or YAML object can be parsed in multiple different contexts: