Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OpenAPI: support sibling values for reference schemas #1020

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

agboom
Copy link
Contributor

@agboom agboom commented Mar 7, 2022

I reproduced the issue from #888 in an existing test ReferencedSchemaTest.scala by using the withExample method on some schemas. After some debugging I found out that original contains the desired schema, so I came up with the solution to add the original value to the sibling fields. I'm quite new to the semantics of OpenAPI, so I'm not sure I got it right, but at least this test shows promising results. I also pasted the JSON in https://redocly.github.io/redoc and it rendered fine and with the examples in the right place.

If the approach is right, maybe some additional tests with description, title and default are desired?

- I.e. description, example, title and default
- Fixes endpoints4s#888
@codecov
Copy link

codecov bot commented Mar 7, 2022

Codecov Report

Merging #1020 (0073d09) into master (24aa7a1) will increase coverage by 0.01%.
The diff coverage is 62.50%.

❗ Current head 0073d09 differs from pull request most recent head de59377. Consider uploading reports for the commit de59377 to get more accurate results

@@            Coverage Diff             @@
##           master    #1020      +/-   ##
==========================================
+ Coverage   75.75%   75.77%   +0.01%     
==========================================
  Files         144      144              
  Lines        4764     4772       +8     
  Branches       57       59       +2     
==========================================
+ Hits         3609     3616       +7     
- Misses       1155     1156       +1     
Impacted Files Coverage Δ
...main/scala/endpoints4s/openapi/model/OpenApi.scala 58.22% <62.50%> (+0.28%) ⬆️
...main/scala-2/endpoints4s/generic/JsonSchemas.scala 68.00% <0.00%> (+1.33%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 24aa7a1...de59377. Read the comment docs.

Comment on lines +345 to +358
| "example": {
| "id": "2cb38bf2-ecbf-401c-be37-9ec325ae91a0",
| "title": "Essential Scala",
| "author": {
| "name": "Noel Welsh and Dave Gurnell"
| },
| "isbnCodes": [
|
| ],
| "storage": {
| "link": "https://underscore.io/books/essential-scala/",
| "storageType": "Online"
| }
| },
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The example is still here. I believe we should remove it from here. Probably, there is something wrong with the way we handle “named” schemas.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you're probably right that we should remove it, but I'm yet to find out how. With this added feature of silbing values, do we want to leave out examples etc. in the components/schemas object altogether or is there still a use for this?

Copy link
Member

@julienrf julienrf Mar 8, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my it is fine to have examples in the components but here the issue is that the example is attached to the wrong model.

I believe the issue is that when we call "someSchema.named("foo").withExample(...)", the example is added to the schema foo although it should be added to schema that references foo instead (like you did, actually, but we also need to make sure that we don't also add the example to the referenced schema).

Copy link
Contributor Author

@agboom agboom Mar 9, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking more closely now, I see that you're right that it is attached to the wrong model. In fact, the behavior of the referenced schema example hasn't changed (yet). With this new code, the other models just have added schemas. So we'll either need to prevent the example from being added to components/schemas or make sure that it contains the expected example.

In this case the first example is added to the generic Book schema in this line: https://github.com/endpoints4s/endpoints4s/pull/1020/files#diff-8eee4ea78e0bdeb38dd981988cae59c64e82196da2d576610a4cfcc5f62b1629R77

As a a user I would expect this example to be added to each endpoint that uses the Book schema, except if I override it with another example on a specific schema, like on this line: https://github.com/endpoints4s/endpoints4s/pull/1020/files#diff-8eee4ea78e0bdeb38dd981988cae59c64e82196da2d576610a4cfcc5f62b1629R99

Is this a correct assumption? If so, I can see two ways to achieve this in the documentation JSON: either leave examples out of components/schemas entirely and do everything in the specific schemas, or make sure that the right examples are present in the referenced schemas. I can imagine that the former solution results in more verbose JSON, but is less complex to implement. I'll try some things and see what happens.

Copy link
Contributor Author

@agboom agboom Mar 9, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still have some difficulty in understanding the purpose of original and how it's set. If I search for usages of constructors with original and withOriginal I cannot find any, yet original is set for all three Reference schemas (two Book and one Color). Do you have an idea how original is set via withExample/withExampleJsonSchema or other?

@julienrf
Copy link
Member

julienrf commented Mar 9, 2022

My understanding is the following.

When we use genericSchema[Foo], the derivation mechanism eventually calls .named("Foo") on the derived schema. This is important because the problem applies to every named schema.

To create a named schema, we just return the same schema with a name field set with the given name. When we call withExample (or withTitle, etc.) on that schema, we do return a new schema with the given example (or title, etc.) but also with the same name as the former schema. This means that if we call .withExample twice, we get two distinct schemas with the same name.

Then, when we produce the OpenAPI “components”, we discard the schemas with duplicated names, so only one schema “wins”.

Maybe a solution could be to handle “named” schema completely differently. Instead of having a field name on all the schemas, we could create a separate schema type that would contain the reference (the name and the referenced schema). Then calling any update method on it such as withExample would return an updated reference schema (instead updating the referenced schema). What do you think?

@agboom
Copy link
Contributor Author

agboom commented Mar 11, 2022

Thank you for your explanation, I understand it a lot better now!

Maybe a solution could be to handle “named” schema completely differently. Instead of having a field name on all the schemas, we could create a separate schema type that would contain the reference (the name and the referenced schema). Then calling any update method on it such as withExample would return an updated reference schema (instead updating the referenced schema). What do you think?

Can your elaborate a bit more on the separate schema type? Right now there's a Reference schema type that contains the name and the referenced schema (that's the original field right?), so how would this schema type be different?

In the end we need to have a unique name for each referenced schema, and the right model should have the right schema reference. So I agree that we should do something different than name because that's derived from the classname AFAIK. How exactly I'm not sure, but I can try some things out in the coming week.

@julienrf
Copy link
Member

Right now, the operation namedRecord returns a new Record. Instead, we should see if it is possible to return something like a RecordReference that would contain the name and the referenced record.

def namedRecord[A](schema: Record[A], name: String): Record[A] =
new Record(schema.ujsonSchema, schema.docs.withName(name))
def namedTagged[A](schema: Tagged[A], name: String): Tagged[A] =
new Tagged(schema.ujsonSchema, schema.docs.withName(name))
def namedEnum[A](schema: Enum[A], name: String): Enum[A] =
new Enum(schema.ujsonSchema, schema.docs.copy(name = Some(name)))

@harpocrates
Copy link
Collaborator

Just wanted to chime in and say I'm super excited to review this PR and intend to do so this weekend!

@agboom
Copy link
Contributor Author

agboom commented Mar 19, 2022

I did some more digging into the code. The generic json schemas interpreter is new to me as well, so it took a while to familiarize myself with this part of the codebase.
As an experiment I added a RecordReference to the openapi interpreter, but I have some difficulty seeing how this will affect the schema naming:

  • The generic interpreter doesn't know about this new type unless we add it to the JsonSchemas algebra, should we do that?
  • The RecordReference stores the reference name, so I imagine that this name will be used when producing the components, but how do we know when a reference name is already in use and a new one should be created to prevent duplicate names? Should we introduce a new kind of GenericSchemaName for this?

Sorry if the questions don't make sense, I'm trying to wrap my head around the name subject 🙈

@julienrf
Copy link
Member

  • The generic interpreter doesn't know about this new type unless we add it to the JsonSchemas algebra, should we do that?

If possible, we should see if we can manage to not change the JsonSchemas algebra. Does the generic interpreter need to be aware of RecordReference?

  • The RecordReference stores the reference name, so I imagine that this name will be used when producing the components, but how do we know when a reference name is already in use and a new one should be created to prevent duplicate names? Should we introduce a new kind of GenericSchemaName for this?

At some point in the openapi interpreter we create a large Map that contains all the referenced schemas indexed by name. Maybe during the process of creating this Map we could emit some warnings if we detected duplicates. Unfortunately, there is no logging system in endpoints4s yet, so you would have to introduce one, that you could use to write logger.warn("Reference xxx refers to different schemas").

@agboom
Copy link
Contributor Author

agboom commented Mar 21, 2022

If possible, we should see if we can manage to not change the JsonSchemas algebra. Does the generic interpreter need to be aware of RecordReference?

You're right, after some more tinkering I found out that the algebra doesn't need to be changed at all :)

At some point in the openapi interpreter we create a large Map that contains all the referenced schemas indexed by name. Maybe during the process of creating this Map we could emit some warnings if we detected duplicates. Unfortunately, there is no logging system in endpoints4s yet, so you would have to introduce one, that you could use to write logger.warn("Reference xxx refers to different schemas").

Hmm, I didn't think of warning on detected duplicates as a possible outcome. I was actually working towards solving the duplicate by giving the new referenced schema a different name so that it would show up in the eventual OpenApi JSON. What do you think?

I'm still connecting the dots, but slowly getting there. The references are built up in two places, here and here. The latter is the large Map as you mention. We can detect duplicates there and log them, but if you agree I'd like to try my hand at resolving the duplicates, because that would be more user friendly.

Thanks for your help so far, btw! 🙏

@agboom
Copy link
Contributor Author

agboom commented Mar 21, 2022

I pushed a work-in-progress commit to demonstrate what I mean. The RecordReference[A] is a Record[A] with an explicit name field. When adding an example to an existing JsonSchema (by calling withExample), the name is updated to be made unique here.

The code is far from ready, but I'm curious to know what you think of this approach.

@julienrf
Copy link
Member

julienrf commented Mar 22, 2022

by giving the new referenced schema a different name so that it would show up in the eventual OpenApi JSON. What do you think?

I am not sure, because it means that people manually calling .named("xxx") may get a schema with a different name, eventually (ie, the name could become xxx$). But at least for people using val x = genericJsonSchema[X] and then x.withExample(foo) and x.withExample(bar) at different places, the resulting schema would be correct.

In any case, users can always call .named wherever they want to override the generated unique name if they don’t like it, so I think this is fine.

Copy link
Collaborator

@harpocrates harpocrates left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for only actually getting to this quite late. I've taken a look and I think this is an interesting idea - I'm going to go try out some things in a REPL now just to try to remember more about this problem.

At a high level, I remember thinking that we should only need to modify the DocumentedJsonSchema by adding extra classes which only override examples/descriptions/etc. on the underlying schema (quite like RecordReference). That way, we also wouldn't need to start generating schema names either.

Inserting extra names into the OpenAPI schema is a little bit unsavoury. From a theoretical perspective, it risks colliding with an actual user-provided name. More practically, some OpenAPI documentation generators (at least Swagger, but I think others too) list out all named schemas at the bottom of the page, and it would be nice not to see that list clouded with autogenerated stuff.

Comment on lines +47 to +50
class RecordReference[A](
val name: String,
referencedRecord: Record[A]
) extends Record[A](referencedRecord.ujsonSchema, referencedRecord.docs.withName(name)) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting - I was expecting to see a subclass like this on DocumentedRecord, not Record.

Should it ever be possible to have a RecordReference inside of a RecordReference? Eg. RecordReference("foo", RecordReference("bar", RecordReference("baz", ...)))? I see some effort here to avoid that case, but I can't tell yet if that is an invariant.

schema match {
case s: RecordReference[A] =>
println(s"RECORDREFERENCE ${s.name}")
s.withName(s.name + "$").withExample(exampleJson)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a clever idea, but I'm not sure it'll actually work unless the name mangling is made more complicated.

Consider the case where the same JsonSchema is re-used twice to create two other schemas that just each have their examples changed. Since both calls for adding the example will use withExampleJsonSchema, they'll both end up adding the same $ to the initial schema's name. We're now back where we started with a a collision in the schema name.

@harpocrates
Copy link
Collaborator

@agboom I resurrected and committed the old ideas I had a while ago on this topic into harpocrates@ef9aaa7. That also solves #888 (only for records, but the idea should work for the other nameable schemas) without needing to generate extra names. IIRC, I gave up mostly because I had run out of time to investigate. Maybe some of the ideas will be useful here...

@agboom
Copy link
Contributor Author

agboom commented Mar 24, 2022

@harpocrates Thanks for pursuing this! Your solution looks a lot better than mine, so let's continue with that.

@harpocrates
Copy link
Collaborator

@agboom let me know if I can help out with anything in particular, and thanks again for looking into this. 😄

@harpocrates
Copy link
Collaborator

@agboom are you still working on this? Would you mind if I picked it up?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Consider preserving referential transparency of OpenAPI schemas
3 participants