Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Validate incoming OpenAPI docs using OpenAPIKit's built-in validation #130

Merged
merged 5 commits into from Jul 21, 2023

Conversation

czechboy0
Copy link
Collaborator

@czechboy0 czechboy0 commented Jul 19, 2023

Motivation

When provided with an OpenAPI document that has recursion, the generator either crashes, or produces Swift code that doesn't compile. We should catch recursion earlier in the process, and emit a descriptive error.

Modifications

This PR adds two validation steps:

  • OpenAPIKit's validate method catches some structural issues in the OpenAPI doc.
  • OpenAPIKit's locallyDereferenced() method is a great way to ensure no reference cycles appear in the document.

Catching reference cycles earlier in the process is great, as even if we generate the Swift code, it won't compile until we have some explicit support for recursive types (tracked by #70).

Result

Now, when an OpenAPI document with recursion is provided, instead of crashing or producing non-compiling Swift code, it prints a descriptive error and returns a non-0 code.

Test Plan

Tested manually on the CLI with purposefully malformed documents. But since this isn't our code, I don't want to add detailed tests of the validation details.

Copy link
Collaborator

@simonjbeaumont simonjbeaumont left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than roll our own cycle detection, can we leverage something from OpenAPIKit? It seems to provide dereferenced(...) functions that throw on cycles—here's a simple example:

import XCTest
import OpenAPIKit30
import Yams

final class ReferenceCycleTests: XCTestCase {
    func test() throws {
        let componentsYAML = """
        schemas:
          A:
            type: string
          B:
            $ref: "#/components/schemas/A"
          C:
            $ref: "#/components/schemas/B"
          D:
            $ref: "#/components/schemas/E"
          E:
            $ref: "#/components/schemas/D"
        """
        let components = try YAMLDecoder().decode(OpenAPI.Components.self, from: componentsYAML)
        XCTAssertNotNil(try components.schemas["A"]!.dereferenced(in: components))
        XCTAssertNotNil(try components.schemas["B"]!.dereferenced(in: components))
        XCTAssertNotNil(try components.schemas["C"]!.dereferenced(in: components))
        XCTAssertThrowsError(try components.schemas["D"]!.dereferenced(in: components))
        XCTAssertThrowsError(try components.schemas["D"]!.dereferenced(in: components))
    }
}

This test passes, and the error returned for the cycle cases is:

"Encountered a JSON Schema $ref cycle that prevents fully dereferencing document at '#/components/schemas/B'. This type of reference cycle is not inherently problematic for JSON Schemas, but it does mean OpenAPIKit cannot fully resolve references because attempting to do so results in an infinite loop over any reference cycles. You should still be able to parse the document, just avoid requesting a `locallyDereferenced()` copy."

Additionally, once we land a solution here, could you add something to the reference test suite? e.g. adding this currently results in an infinite loop, as you expect, perhaps something that tests that we throw the right error:

    func testComponentsSchemasReferenceCycle() throws {
        try self.assertSchemasTranslation(
            """
            schemas:
              A:
                $ref: "#/components/schemas/B"
              B:
                $ref: "#/components/schemas/A"
            """,
            """
            public enum Schemas {
                public typealias A = A
            }
            """
        )
    }

@czechboy0
Copy link
Collaborator Author

czechboy0 commented Jul 19, 2023

As the OpenAPIKit error says, You should still be able to parse the document, just avoid requesting a locallyDereferenced() copy." - that's what we do. We never use the dereferencing functions in OpenAPIKit, because our generator retains as much structure as possible, and dereferencing is lossy.

So we could dereference the whole document once at the start, and throw away the result if it doesn't throw an error – would you prefer that?

Regarding the snippet test, since we're testing that we throw an error, I'm not sure what code we'd expect to be generated.

@simonjbeaumont
Copy link
Collaborator

As the OpenAPIKit error says, You should still be able to parse the document, just avoid requesting a locallyDereferenced() copy." - that's what we do. We never use the dereferencing functions in OpenAPIKit, because our generator retains as much structure as possible, and dereferencing is lossy.

So we could dereference the whole document once at the start, and throw away the result if it doesn't throw an error – would you prefer that?

That's what I was meaning. We could use this function at the top of isSchemaSupported to test for and rethrow if there is a cycle. We'd then discard the result of the function. But it would mean we wouldn't need to carry around the bag of seen references and implement our own cycle detection IIUC.

Regarding the snippet test, since we're testing that we throw an error, I'm not sure what code we'd expect to be generated.

Right, I said "e.g. adding this currently results in an infinite loop, as you expect, perhaps something that tests that we throw the right error" so I'm not proposing that exact code, just used it as a way of showing the OpenAPI snippet. It would be a test that verifies the error thrown is whatever ends up in this PR.

@czechboy0
Copy link
Collaborator Author

The dereferencing is expensive, so I think we should just let OpenAPIKit dereference the full document at the start, during preflight validation, and then just assume there are no cycles from that point on. We call things like isSchemaSupported frequently, so I don't think we should call it every time. But the result will be the same, and will mean I can undo most of this PR 🙂

Re the error, sure we can test for the exact error. I'll change the PR now to use OpenAPIKit dereferencing, will take it back to draft.

@czechboy0 czechboy0 marked this pull request as draft July 19, 2023 15:25
@simonjbeaumont
Copy link
Collaborator

The dereferencing is expensive, so I think we should just let OpenAPIKit dereference the full document at the start, during preflight validation, and then just assume there are no cycles from that point on. We call things like isSchemaSupported frequently, so I don't think we should call it every time. But the result will be the same, and will mean I can undo most of this PR 🙂

That works for me. I guess we'll just do this as part of the parsing stage of the pipeline. Might be a good place to use the post-transition hook in that pipeline stage, or just bolt it into the implementation of that stage—up to you.

Re the error, sure we can test for the exact error. I'll change the PR now to use OpenAPIKit dereferencing, will take it back to draft.

SGTM—thanks @czechboy0!

@czechboy0 czechboy0 changed the title Detect reference cycles through allOf/anyOf/oneOf Validate incoming OpenAPI docs using OpenAPIKit's built-in validation Jul 20, 2023
@czechboy0 czechboy0 marked this pull request as ready for review July 20, 2023 20:03
@czechboy0
Copy link
Collaborator Author

Ok, ready for another look, @simonjbeaumont 🙏

Copy link
Collaborator

@simonjbeaumont simonjbeaumont left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for updating this. I think it's great that we found a way to leverage this in the library we already depend on, so thanks for looking into it!

Just a couple of questions on this one.

@czechboy0 czechboy0 merged commit b6d82cd into apple:main Jul 21, 2023
6 checks passed
@czechboy0 czechboy0 deleted the hd-detect-reference-cycles branch July 21, 2023 16:08
@czechboy0 czechboy0 added the semver/patch No public API change. label Jul 31, 2023
czechboy0 added a commit that referenced this pull request Aug 2, 2023
Hide strict OpenAPI validation behind a feature flag

### Motivation

In #130, we introduced stricted validation of input OpenAPI documents. Reflecting back, since this might start rejecting OpenAPI document that were previously accepted, that is considered a breaking change and should be hidden behind a feature flag, only to be enabled unconditionally in the next breaking version.

### Modifications

Hide the new validation behind a feature flag.

### Result

Without providing the feature flag explicitly, the generator will maintain its old behavior and not perform strict validation.

### Test Plan

N/A - this is a speculative fix to maintain backwards compatibility in the 0.1.x version.


Reviewed by: simonjbeaumont

Builds:
     ✔︎ pull request validation (5.8) - Build finished. 
     ✔︎ pull request validation (5.9) - Build finished. 
     ✔︎ pull request validation (docc test) - Build finished. 
     ✔︎ pull request validation (integration test) - Build finished. 
     ✔︎ pull request validation (nightly) - Build finished. 
     ✔︎ pull request validation (soundness) - Build finished. 

#162
czechboy0 added a commit that referenced this pull request Aug 8, 2023
Stop treating schema warnings as errors

### Motivation

In #130, we introduced extra validation by calling into OpenAPIKit's `validate` method. (It's hidden behind a feature flag, but I've been testing with it enabled.)

It looks for structural issues, such as non-unique operation ids, which would result in us generating non-compiling code anyway, so the validation helps catch these early and emit descriptive errors that adopters can use to fix their doc.

However, the method also takes a parameter `strict`, which defaults to `true`, and when enabled, it turns warnings emitted during schema parsing into errors. This part is _too_ strict for us and was rejecting OpenAPI documents that were valid enough for our needs.

An example of a schema warning is a schema having `minItems: 1` on a non-array schema. While it's not technically correct, it also doesn't impede our understanding of the non-array schema, as we never actually check what the value of `minItems` is. That's why these are just warnings, not errors, so we should stop promoting them to fatal errors that block an adopter from generating code.

### Modifications

This PR flips the `strict` parameter to `false`. This doesn't make us miss out on these warnings, as recently (in #174), we started forwarding these schema warnings into generator diagnostics, so the adopter will see them still, and can address them on their own schedule.

### Result

Now documents with only schema warnings aren't rejected by the generator anymore.

### Test Plan

Added a unit test of the broken-out `validateDoc` function, ensures that a schema with warnings doesn't trip it up anymore.


Reviewed by: simonjbeaumont

Builds:
     ✔︎ pull request validation (5.8) - Build finished. 
     ✔︎ pull request validation (5.9) - Build finished. 
     ✔︎ pull request validation (docc test) - Build finished. 
     ✔︎ pull request validation (integration test) - Build finished. 
     ✔︎ pull request validation (nightly) - Build finished. 
     ✔︎ pull request validation (soundness) - Build finished. 

#178
@hactar
Copy link

hactar commented Sep 5, 2023

I was trying out this generator for the first time and triggered this issue. The problem is that the error message can be misleading:

Encountered a JSON Schema $ref cycle that prevents fully dereferencing document at '#/components/schemas/KommentarDTO'. This type of reference cycle is not inherently problematic for JSON Schemas, but it does mean OpenAPIKit cannot fully resolve references because attempting to do so results in an infinite loop over any reference cycles. You should still be able to parse the document, just avoid requesting a locallyDereferenced() copy.

I understood this to mean I have to change some generator config setting to not try to request a locallyDereferenced copy. Spent some time trying to "fix" this first ☺️. Only once I read the source code comment I now understand that there is no workaround at this moment and that the openapi file that I would like to use will not work. I think it would be useful to catch this specific error and state that cyclic jsons are not supported yet?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
semver/patch No public API change.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants