Possible solution for non-JSON references support relying on schemaFormat #930

Open
derberg opened this issue Apr 26, 2023 · 8 comments
Labels
💭 Strawman (RFC 0) RFC Stage 0 (See CONTRIBUTING.md)

Comments

@derberg
Member

derberg commented Apr 26, 2023

This issue aims to summarize the idea mentioned during the Adding support for non-JSON schemas (April 18th 2023) meeting, which could be a good solution for using JSON Reference with non-JSON structures like Protobuf or XSD.

JSON Reference spec

The JSON Reference currently used in the AsyncAPI spec is JSON Reference v0.3.0.

In general, JSON Reference focuses on defining what a reference object should look like and requires only the following:

  • that it should have property $ref
  • that $ref is a string value containing a URI
  • that other properties in reference object must be ignored

Key words used to describe requirements, like SHALL or SHOULD, follow https://datatracker.ietf.org/doc/html/rfc2119

Even though the JSON Reference says in the intro:

This specification defines a JSON [RFC4627] structure which allows a JSON value __to reference another value in a JSON document__

Later, in the Resolution section, the spec uses only SHOULD:

Resolution of a JSON Reference object SHOULD yield the referenced JSON value

Also, for fragments that start after #, JSON Pointer is not required but recommended:

If the representation of the referrant document is JSON, then the fragment identifier SHOULD be interpreted as a [JSON-Pointer].

The conclusion is:

  • We can have a $ref pointing to .proto or .xsd for example
  • We can have a custom solution for fragment resolution, for example in the case of .proto, because JSON Pointer is only a recommended solution. So we can, for example, do the following:
    • For .proto, when nested types need to be referenced, we can follow the standard Protobuf approach for pointing to nested types. So for $ref: "https://gist.githubusercontent.com/shankarshastri/c1b4d920188da78a0dbc9fc707e82996/raw/49e733499bfb302d9ecf320f2eca2f752f7e257b/LearnXInYMinProtocolBuffer.proto" if someone wants to reference the FirstLevelNestedMessage type, the fragment would look like, probably 😄, #NestedMessages.FirstLevelNestedMessage. So the final reference would be:
      {
          "$ref": "https://gist.githubusercontent.com/shankarshastri/c1b4d920188da78a0dbc9fc707e82996/raw/49e733499bfb302d9ecf320f2eca2f752f7e257b/LearnXInYMinProtocolBuffer.proto#NestedMessages.FirstLevelNestedMessage"
      }
    • For .xsd it is much easier, as there is already XPath in place, which allows creating a fragment that points to a specific part of the XML/XSD. That requires further exploration and some input from experts, but in theory, if we have the https://www.w3.org/2001/XMLSchema schema, the XPath fragment for the formChoice simple type would probably be xs:schema/xs:simpleType[@name='formChoice'], so the resulting reference is something like:
      {
          "$ref": "https://www.w3.org/2001/XMLSchema#xs:schema/xs:simpleType[@name='formChoice']"
      }
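To make the conclusion a bit more concrete, here is a minimal sketch of how a tool could split such a $ref into the document URL and the raw fragment before handing the fragment to a format-specific resolver. `splitRef` and `protoTypePath` are hypothetical helpers, not part of any existing library, the example.com URL is a placeholder, and the dot-separated type path is just the Protobuf-style convention suggested above. One caveat that probably applies: RFC 3986 does not allow raw `[` and `]` in a fragment, so the XPath variant may need percent-encoding in a strict URI.

```javascript
// Hypothetical helper: split a $ref into the document URL and the raw fragment.
// Only the first "#" is treated as the separator.
function splitRef(ref) {
  const hashIndex = ref.indexOf('#');
  if (hashIndex === -1) {
    return { url: ref, fragment: '' };
  }
  return { url: ref.slice(0, hashIndex), fragment: ref.slice(hashIndex + 1) };
}

// Hypothetical helper: read a fragment as a dot-separated Protobuf type path.
function protoTypePath(fragment) {
  return fragment === '' ? [] : fragment.split('.');
}

// Placeholder URL, used only for illustration.
const protoRef = splitRef(
  'https://example.com/schemas/example.proto#NestedMessages.FirstLevelNestedMessage'
);
console.log(protoRef.url);
console.log(protoTypePath(protoRef.fragment));
```

The fragment is kept as an opaque string on purpose, so that the decision on how to interpret it (JSON Pointer, Protobuf type path, XPath) can be made later.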

Now the question is how to specify, on a spec level, that dereferencing for a given $ref should not use JSON Pointer but XPath or some other solution.

Solutions on a spec level

schemaFormat based

As discussed during the Adding support for non-JSON schemas (April 18th 2023) meeting, we can say that the dereferencing mechanism depends on the schemaFormat.

  • If schemaFormat specifies AsyncAPI Schema, JSON Schema, Avro or any other JSON structure, we follow JSON Reference + JSON Pointer in fragments
  • If schemaFormat specifies Protobuf, we follow JSON Reference + Protobuf Nested Types Reference in fragments
  • If schemaFormat specifies XSD, we follow JSON Reference + XPath in fragments
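As a rough sketch of the rule above, a tool could pick the fragment syntax from the schemaFormat value. Everything here is an assumption for illustration: `fragmentSyntaxFor` is a hypothetical function, the Protobuf media type is the one used later in this thread, `application/xml` is only a placeholder because no media type for XSD has been agreed on here, and falling back to JSON Pointer for unknown formats is my guess at a sensible default.

```javascript
// Hypothetical mapping from a schemaFormat media type to the fragment syntax.
const FRAGMENT_SYNTAX_BY_MEDIA_TYPE = new Map([
  ['application/vnd.aai.asyncapi', 'json-pointer'],
  ['application/schema+json', 'json-pointer'],
  ['application/vnd.apache.avro', 'json-pointer'],
  // Media type as written in this thread.
  ['application/vnd.google.proto', 'protobuf-type-path'],
  // Placeholder: the thread does not name a media type for XSD.
  ['application/xml', 'xpath'],
]);

function fragmentSyntaxFor(schemaFormat) {
  // Drop parameters such as ";version=3" and normalise case.
  const mediaType = schemaFormat.split(';')[0].trim().toLowerCase();
  // Assumption: unknown formats keep today's JSON Pointer behaviour.
  return FRAGMENT_SYNTAX_BY_MEDIA_TYPE.get(mediaType) ?? 'json-pointer';
}

console.log(fragmentSyntaxFor('application/vnd.google.proto;version=3'));
console.log(fragmentSyntaxFor('application/vnd.apache.avro;version=1.9.0'));
```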

Alternative

Something I mentioned before somewhere in an issue or Slack, but I can't find the reference.

We can just follow the example from the JSON reference resolver that we use, which explains how custom resolvers work -> https://apitools.dev/json-schema-ref-parser/docs/plugins/resolvers.html. For example, they had a use case where someone keeps schemas in MongoDB and wants to reference the database directly. The solution is that the $ref value starts with "mongodb://", because plugins can discover that with something like canRead: /^mongodb:/i.

So we can do:

  • XSD like "$ref": "xsd|https://www.w3.org/2001/XMLSchema#xs:schema/xs:simpleType[@name='formChoice']"
  • Proto like "$ref": "proto|https://gist.githubusercontent.com/shankarshastri/c1b4d920188da78a0dbc9fc707e82996/raw/49e733499bfb302d9ecf320f2eca2f752f7e257b/LearnXInYMinProtocolBuffer.proto#NestedMessages.FirstLevelNestedMessage"
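A minimal sketch of how such a prefix could be detected, in the same spirit as the canRead pattern above. `parsePrefixedRef` and the `PREFIXED_REF` pattern are hypothetical names, and treating anything without a prefix as a regular JSON reference is an assumption.

```javascript
// Hypothetical pattern: a "<kind>|" prefix in front of the actual reference.
const PREFIXED_REF = /^(xsd|proto)\|(.+)$/i;

function parsePrefixedRef(ref) {
  const match = PREFIXED_REF.exec(ref);
  if (!match) {
    // Assumption: no prefix means a regular JSON Reference.
    return { kind: 'json', target: ref };
  }
  return { kind: match[1].toLowerCase(), target: match[2] };
}

console.log(
  parsePrefixedRef("xsd|https://www.w3.org/2001/XMLSchema#xs:schema/xs:simpleType[@name='formChoice']")
);
console.log(parsePrefixedRef('#/components/schemas/User'));
```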

Nevertheless, I'm sharing this just for reference, because when I thought about it I did not take schemaFormat into consideration. So relying on schemaFormat is better imho.

Tooling

json-schema-ref-parser that we use is pretty flexible:

And writing custom resolvers is not hard, same with using them. So, again in theory, in the current JS Parser we can do $RefParser.dereference(mySchema, { resolve: { proto: ourProtoResolver }}); whenever we encounter a proto reference. This will of course require some refactoring in the Parser, as dereferencing currently happens once, globally for the entire AsyncAPI document, and in the new approach it should be done one by one, applying a different resolver depending on schemaFormat.
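To give an idea of the shape, here is a sketch of what ourProtoResolver could look like, following the `order` / `canRead` / `read(file)` structure described in the resolver plugin docs linked above. The in-memory `FAKE_FILES` map is a hypothetical stand-in for real network access so the snippet runs on its own, and the assumption that `file.url` arrives without a fragment is mine. As far as I understand, a resolver only fetches the file, so parsing the .proto content and resolving the fragment would still need separate handling.

```javascript
// Hypothetical stand-in for real network access (placeholder URL and content).
const FAKE_FILES = new Map([
  [
    'https://example.com/schemas/example.proto',
    'syntax = "proto3";\nmessage NestedMessages {\n  message FirstLevelNestedMessage {}\n}\n',
  ],
]);

// Resolver object in the plugin shape: `order`, `canRead` and `read(file)`.
const ourProtoResolver = {
  order: 1,
  // Assumption: file.url carries no fragment when the resolver is consulted.
  canRead: /\.proto$/i,
  // Returns the raw file content synchronously, or throws if it is unknown.
  read(file) {
    const content = FAKE_FILES.get(file.url);
    if (content === undefined) {
      throw new Error(`Cannot read ${file.url}`);
    }
    return content;
  },
};

// With the real library this object would be passed as:
// $RefParser.dereference(mySchema, { resolve: { proto: ourProtoResolver } });
console.log(ourProtoResolver.read({ url: 'https://example.com/schemas/example.proto' }));
```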


I think I covered the entire discussion and summary. Lemme know if something is missing.

pinging meeting participants: @GreenRover @jonaslagoni @fmvilas

@derberg derberg added the 💭 Strawman (RFC 0) RFC Stage 0 (See CONTRIBUTING.md) label Apr 26, 2023
@GreenRover
Collaborator

I really like it and would go with relying on schemaFormat as well

@GreenRover
Collaborator

What do you think about this scenario:

components:
  messages:
    firstLevelNestedMessage:
      payload:
        $ref: '#components/schemas/LearnXInYMinProtocolBuffer#NestedMessages.FirstLevelNestedMessage'
  schemas:
    LearnXInYMinProtocolBuffer:
      schemaFormat: application/vnd.google.proto;version=3
      schema:
        $ref: https://gist.githubusercontent.com/shankarshastri/c1b4d920188da78a0dbc9fc707e82996/raw/49e733499bfb302d9ecf320f2eca2f752f7e257b/LearnXInYMinProtocolBuffer.proto
  • components.schemas.LearnXInYMinProtocolBuffer pointing to a whole proto|xsd file containing multiple schemas.
  • components.messages.firstLevelNestedMessage pointing to components.schemas.LearnXInYMinProtocolBuffer and picking NestedMessages.FirstLevelNestedMessage
  • Does a schema need to be distinct, or can it be multiple schemas? My opinion is that it should be distinct. Either the reference needs to contain an anchor pointing to a single schema, or the file is only allowed to contain a single root level schema
    • I would define a "root level schema" as a schema that is not used by other schemas.

@derberg
Member Author

derberg commented May 9, 2023

Does a schema need to be distinct, or can it be multiple schemas? My opinion is that it should be distinct. Either the reference needs to contain an anchor pointing to a single schema, or the file is only allowed to contain a single root level schema

sorry I do not get what you mean by distinct in this context. Not sure how it relates to your example

payload:
      $ref: '#components/schemas/LearnXInYMinProtocolBuffer#NestedMessages.FirstLevelNestedMessage'

this will not work, right? according to your PR, afaik, payload.$ref means it is pointing to AsyncAPI Schema. So normal resolution will be applied and fail

you would need to do

payload:
      schemaFormat: application/vnd.google.proto;version=3
      schema: '#components/schemas/LearnXInYMinProtocolBuffer#NestedMessages.FirstLevelNestedMessage'

@derberg
Member Author

derberg commented May 11, 2023

@dalelane @char0n @smoya can you have a look 🙏🏼

basically the understanding is that this is a non breaking change that we could introduce in 3.0 and release as 3.1

but if it is breaking, then better know it sooner 😆

@dalelane
Collaborator

I'm not super confident in making statements on this, as this is getting much deeper into the weeds of JSON specs than I've got before!

If I've understood the description above correctly, then it seems reasonable to do this in 3.x as adding behaviour for new schemaFormat values.

Has anyone tried spiking a prototype along the lines of the custom resolvers described above to convince ourselves that this is workable?

@github-actions

This issue has been automatically marked as stale because it has not had recent activity 😴

It will be closed in 120 days if no further activity occurs. To unstale this issue, add a comment with a detailed explanation.

There can be many reasons why some specific issue has no activity. The most probable cause is lack of time, not lack of interest. AsyncAPI Initiative is a Linux Foundation project not owned by a single for-profit company. It is a community-driven initiative ruled under open governance model.

Let us figure out together how to push this issue forward. Connect with us through one of many communication channels we established here.

Thank you for your patience ❤️

@github-actions github-actions bot added the stale label Sep 12, 2023
@github-actions github-actions bot closed this as not planned Jan 10, 2024
@AnimeshKumar923
Contributor

Still valid? @derberg @dalelane @GreenRover

@GreenRover
Collaborator

Yes, definitely a 3.1 candidate for me.

@smoya smoya reopened this Feb 13, 2024
@smoya smoya removed the stale label Feb 13, 2024