Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

protoc-gen-openapi - incorrect output for multiple proto files #268

Closed
jan-sykora opened this issue Nov 11, 2021 · 12 comments · Fixed by #359
Closed

protoc-gen-openapi - incorrect output for multiple proto files #268

jan-sykora opened this issue Nov 11, 2021 · 12 comments · Fixed by #359

Comments

@jan-sykora
Copy link

Description

When running protoc-gen-openapi for generating OpenAPI description for multiple .proto files, the output is incorrectly mixed into one OpenAPI description file.

Expected behaviour is to generate OpenAPI description for each .proto file separately when there's more than one proto file on the input. The protoc-gen-openapiv2 from grpc-gateway also generates for each proto file a separate OpenAPI description file.

Example

Let there be two .proto files:

.
├── foo.proto # contains FooService proto definition
└── bar.proto # contains BarService proto definition

and the protoc-gen-openapi plugin is run for both of these files at once:

protoc foo.proto bar.proto -I=. --openapi_out=.

Expected output:

.
├── foo.proto
├── bar.proto
├── foo_openapi.yaml # contains only FooService OpenAPI description
└── bar_openapi.yaml # contains only BarService OpenAPI description

Actual output:

.
├── foo.proto
├── bar.proto
└── openapi.yaml # contains both FooService and BarService + the common content is incorrectly overwritten by the latest processed proto file

Buf incompatibility

This behaviour makes the protoc-gen-openapi plugin impossible to be used by Buf since Buf is designed to process multiple proto files when generating content (it works with modules which are basically directories containing multiple proto files) and it is not possible to configure Buf to process proto files one by one.

@clintberry
Copy link

For us we love this feature. We define our whole api in multiple proto files, but want one openapi spec for mapping from REST. I understand the buf incompatibility isn't great.

@casimcdaniels
Copy link

casimcdaniels commented Jan 7, 2022

On the feature side, I think it would be great to be able to provide an option such as output_mode that has choice of merged output, separated output, or both(?). That way users can pick which one makes sense for their API structure. For the separate output, the existing parameters for title, description, etc could either override any initial values (as the parameters work now) or be ignored and generate from the protobuf descriptor values.

@morphar
Copy link
Collaborator

morphar commented Jan 7, 2022

I don't share the opinion that the output is incorrect - like @clintberry, I wanted multiple proto files to be converted into 1 Open API file, which is one of the reasons I chose gnostic over Buf and protoc-gen-openapiv2 .

I do agree with @casimcdaniels, that it would be nice to have output to multiple files as an option.
Interoperability between projects is a nice feature.

It shouldn't be too hard to add this functionality, though there is some logic that might need some extra thinking.
The current logic has the benefit of gathering everything into 1 structure - multiple outputs might be a bit more tricky.
E.g.:

  • What should happen, when there is proto files with only messages?
  • If some proto files only defines messages for importing, how should these files be parsed and handled?
  • If the output should be compatible with Buf, which restrictions does that put on the output?

@jan-sykora if you (or anybody else) has some example proto files + example OpenAPI output files, that would be greatly appreciated.
The more elaborate the better.

/cc @timburks

@timburks
Copy link
Contributor

I also think that the current behavior "works as intended". At Google, APIs are frequently described with multiple .proto files that contain multiple services, and the API Discovery Service describes these APIs with a single document, which various third-party tools convert into OpenAPI descriptions.

@jan-sykora if you run protoc twice (once for foo.proto and once for bar.proto), you would get an appropriate output for each, and as long as your include path was correct, if foo imported bar, I would expect protoc to compile these files without errors.

@jan-sykora
Copy link
Author

First, thank you all for your replies! Below are some answers to your comments.

@jan-sykora if you run protoc twice (once for foo.proto and once for bar.proto), you would get an appropriate output for each, and as long as your include path was correct, if foo imported bar, I would expect protoc to compile these files without errors.

@timburks that's a good point. However, as I mentioned, this approach is not easily applicable when using Buf. We had to create a script for copying each proto file into tmp folder, running buf for each of these files, renaming each openapi output, cleaning afterwards, etc. just to get openAPI for each service.

I also think that the current behavior "works as intended". At Google, APIs are frequently described with multiple .proto files that contain multiple services, and the API Discovery Service describes these APIs with a single document, which various third-party tools convert into OpenAPI descriptions.

IMHO having openAPI per service is more readable than having it all mixed in one spec. It allows readers to easily display API of only one specific service that they care about. Though that may be a matter of personal preference.

I don't share the opinion that the output is incorrect - like @clintberry, I wanted multiple proto files to be converted into 1 Open API file, which is one of the reasons I chose gnostic over Buf and protoc-gen-openapiv2.

@morphar protoc-gen-openapiv2 supports option (allow_merge) to merge multiple proto files into 1 OpenAPI file. It would be nice if gnostic's protoc-gen-openapi could also support both options, not only one of them.

@jan-sykora if you (or anybody else) has some example proto files + example OpenAPI output files, that would be greatly appreciated.
The more elaborate the better.

I found that the generated info.title value is incorrect. Here's a minimal working example:

// foo.proto
syntax = "proto3";
package example;
option go_package="example/example";
service FooService {}
// bar.proto
syntax = "proto3";
package example;
option go_package="example/example";
service BarService {}
protoc foo.proto bar.proto -I=. --openapi_out=.

generates

# Generated with protoc-gen-openapi
# https://github.com/googleapis/gnostic/tree/master/apps/protoc-gen-openapi

openapi: 3.0.3
info:
    title: BarService
    version: 0.0.1
paths: {}
components:
    schemas: {}

--> value of info.title shouldn't be BarService as the API does not contain only BarService but also FooService. This is no big deal (but still a minor issue though).

When I saw the merged title I was afraid that the content of paths or components could get incorrectly merged (overwritten) as well. Luckily, I did not find any evidence of incorrectly merged paths or components in my generated openapi. I suppose it should not happen the way the openAPI is generated, is that right?

@scottbla
Copy link

I also think that the current behavior "works as intended". At Google, APIs are frequently described with multiple .proto files that contain multiple services, and the API Discovery Service describes these APIs with a single document, which various third-party tools convert into OpenAPI descriptions.

FWIW...the API Discovery Service descriptions of the APIs are describing the HTTP/"REST" interface to the API. For the Google APIs you mention, the HTTP interface is derived from the HTTP bindings added to the proto description of an RPC style interface. The HTTP interface is centered around "resources" (aka "collections") and methods that can be applied to them. In creating the discovery doc description of the HTTP interface, (by default) it derives the resource names from the HTTP paths that have been associated with the method (e.g. an HTTP path like /v1/shelves/*/books/* would probably map to a shelves.books resource within that API). So, the proto entity defined by a service keyword doesn't really map to anything in the HTTP interface described in those discovery docs.

@jeffsawatzky
Copy link
Contributor

I recently ran into this issue as well, and would really like the ability to have multiple files generated instead of just the one.

  1. First of all, while merging all of the files together into one openapi.yaml file makes the easy cases really easy, it makes the more complicated cases super duper complicated. Having to move files around into different folders just to get the output we'd like seems error prone. Having multiple files makes the easy case only slightly harder, and the more complicated case much easier and less error prone to deal with.
  2. Pretty much all of the other protoc generators support multiple files, and having this one not support it seems to go against the pattern
  3. Having information generated for me automatically (title, description, version, etc) seems to violate the principle of least surprise. I was not expecting it at least.
  4. We are also using tools such as buf which doesn't play well with this approach. We don't have fine grained control over how protoc is run when using buf, and if you use remote plugins then protoc is run on a completely different machine.

My use case is as follows:

  • we have many teams
  • each team manages one or more microservices
  • we want to allow teams to version their APIs when they need to make breaking changes (hopefully not that often)
  • we want to provide OpenAPI doc with only the most recent versions of our microservice APIs

For example, say on 2022-01-01 we have two microservices, service_a and service_b, both at v1. Our proto folder might look something like this:

protos\
   service_a\
      v1\
         a.proto
   service_b\
      v1\
         b.proto

Now, say that on 2022-12-31 the team that manages service_a needs to make a breaking change so rolls out v2, so now we have something like:

protos\
   service_a\
      v1\
         a.proto
      v2\
         a.proto
   service_b\
      v1\
         b.proto

What I'd like to provide is two OpenAPI documents:

  1. one with version 2022-01-01 which will contain the service_a.v1 and service_b.v1 endpoints
  2. one with version 2022-12-31 which will contain the service_a.v2 and service_b.v1 endpoints.

The way I'd like to be able to do that is by generating individual OpenAPI docs for each service, and then merging the ones I care about for each "date version" using something like yq to merge the relevant ones together into one.

I could also use the new feature added in #324 to create openapi.proto files to contain additional information about my API (with something like in #332 to specify my authentication schemes for each version).

For instance, I could do something like this:

protos\
   service_a\
      v1\
         a.proto
      v2\
         a.proto
   service_b\
      v1\
         b.proto
   openapi\
      v20220101\
         openapi.proto
      v20221231\
         openapi.proto

Then in the openapi.proto files I would specify additional information, such as title, description, authentication schemes (including scopes specific to that version), etc.

Running protoc-gen-openapi would then generate something like this:

gen\
   openapi\
      service_a\
         v1\
            a.yaml
         v2\
            a.yaml
      service_b\
         v1\
            b.yaml
      openapi\
         v20220101\
            openapi.yaml
         v20221231\
            openapi.yaml

Then to generate the full documention of my 2022-01-01 I would do something like:

yq m "gen/openapi/openapi/v20220101/openapi.yaml" "gen/openapi/service_a/v1/a.yaml" "gen/openapi/service_b/v1/b.yaml" "./openapi-v20220101.yaml"

With this solution there is no moving of files around to various places just to make things work.

But in order to do this I would need protoc-gen-openapi to support two things:

  1. Multiple files (obviously)
  2. Stop putting stuff in the file for me automatically. Especially now that [protoc-gen-openapi] Support Fully Qualified Schema Names and Annotations for File/Document and Method/Operation #324 is done, I feel like that is a better way to allow people to control the extra meta data. The way it is now, the automatic title, description, etc might override the stuff I specifically put in my openapi.proto files.

@morphar
Copy link
Collaborator

morphar commented Apr 5, 2022

It all makes sense :)

@timburks you know the goals of gnostic the best, what are your current thoughts on 1 vs multiple files?
Is 1 file still the wanted default behaviour?
Maybe it would make sense to have a clearly stated set of goals and non-goals for gnostic, in the README?
gnostic has changed a lot and still has some way to go, before coming close to a v1.0.0.
But a lot of discussions are centered around "the goals" of gnostic, which is not clearly stated, but something you can slowly pick up (filtered through own opinions), when you read code, comments and pull requests :)

@jeffsawatzky the automatic title and description stuff, was me expanding on functionality already there.
I haven't put much thought into whether it should be there or not, just that it was sane and reversible.
I coded it, based on my understanding of the goal of gnostic being able to convert between any formats.

I don't necessarily agree, that gnostic should follow other tools' way of doing things, just because that is the norm.
Just for the sake of discussion:

Google had and has a certain set of ideas and goals for protobuf and the API Discovery Service spec.
Google probably also put the absolutely most man-power and thought into spec'ing, documenting and developing the eco-system.

So I would personally prefer, whatever Google sees as the best way, regardless of the norm and standards arising from other tools.

It's of course always nice to have compatibility through e.g. options, when defaults breaks compatibility.

But the aim of being able to convert between different formats, is vastly different than "just" being able to create e.g. an OpenAPI spec from .proto files, but not vice versa.

This doesn't mean that it shouldn't be possible to "break" this capability, by using "unsafe" options, but the defaults should reflect the goals, in my opinion, regardless of whatever else is out there.

A personal example of why I'd rather stick to Google's way of doing things:
They're really good at making things "just work" when interacting.
When I moved to Envoy, everything "just worked", except for some REST endpoints, which was apparently broken.
The definition was perfectly fine, according to some other tools, but the spec disagreed and so did Envoy.
Of course I needed to be backwards-compatible, so I had to jump through hoops to convert wrong calls, before they hit Envoy. Which in this case was at least possible.

I don't know if this was a bug or a "nice feature" (default behaviour) of the other tool, but it caused a lot of problems down the line.

I would personally much prefer gnostic as an opinionated tool, that can convert between anything, without loosing knowledge, rather than focusing on compatible with other tools.
Especially since that would sometimes mean breaking away from correct functionality.

Anyway... That became a rant, sorry.
And again: just my personal opinions :)

@jeffsawatzky
Copy link
Contributor

jeffsawatzky commented Apr 6, 2022

@timburks if someone were to put up a PR to optionally support multiple files, would you be ok with that or does it go against gnostic ideals and get rejected?

@morphar many of the tools that I refer to are the built in tools created by Google themself. So if, as you say, Google has put a lot of thought into this then there is probably a good reason as to why their tools support multiple files.

Also I'm not sure how supporting multiple files loses any information or breaks away from correct functionality, especially if it is optional. Plus, gnostic is already losing information. For instance with oneof. The way gnostic is currently converting protobuf oneofs to openapi spec, it is not possible to then take that openapi spec and produce the same protobuf as the oneof information has been lost in the openapi spec. It could also be argued that creating one big openapi spec loses information about the packages that the original messages were defined in, especially if you use the default of not using fully qualified schema names in the output. In my mind, having fully qualified schema names along with separate output files per proto file provides you with more information with which you can then use to regenerate the original proto files (if that is the goal of gnostic).

Anyway, long story short, I just want to know whether a PR to add multiple output file support would be ok or rejected. 😄 As for how it would work, I'd suggest following something similar to what grpc-gateway's protoc-gen-openapiv2 is doing.

@morphar
Copy link
Collaborator

morphar commented Apr 6, 2022

@jeffsawatzky short answer: ask @timburks, I'm also just a contributor ;)
I personally can't see why, multiple files should not be supported as optional.

many of the tools that I refer to are the built in tools created by Google themself

I was referring to proto and API Discovery with regards to 1 file.

Also I'm not sure how supporting multiple files loses any information or breaks away from correct functionality, especially if it is optional.

I am totally for supporting as much functionality and being as compatible as possible - especially as optional.
I didn't say that it would cause loss of information, I was just sharing my personal opinion of how I would prefer that gnostic works - which I also think is relatively in line with the goals of gnostic.

Plus, gnostic is already losing information. For instance with oneof. The way #251 (comment), it is not possible to then take that openapi spec and produce the same protobuf as the oneof information has been lost in the openapi spec.

Again: I agree. gnostic is not at v1.0.0 and has not achieved all goals yet.
Some code is closer to hitting the goals than other - e.g. protoc-gen-openapi has received a lot of love lately and I think your suggestions and our talk about oneof, is going to push protoc-gen-openapi a lot closer to perfection and a v1.0.0.

As I have stated earlier, I think this project would benefit hugely from having officially stated goals.
Not that they can't change, but we have a lot of back and forth talks, that could be easily resolved, by looking at suck a list.
@timburks would something like that be possible?

@timburks
Copy link
Contributor

timburks commented Apr 9, 2022

Hi, I agree with @morphar's suggestion that it would be good to discuss overall project goals and influences. I'll start an issue about that (edit: see #340).

On this topic, I can say that an original goal of protoc-gen-openapi was to bypass Google's API Discovery format, which to my knowledge, has no open source tooling for going from Google's primary API design language (protos) to the documents published on the Google API Discovery Service. These "Discovery documents" can be used to generate OpenAPI with tools like @IvanGoncharov and @MikeRalphson's google-discovery-to-swagger, but with more public definitions of HTTP transcoding (like [aip-127](https://aip.dev/127 and the Envoy and grpc-gateway implementations), a direct path to OpenAPI seems easier to maintain.

Basically this means supporting the googleapis protos and generating OpenAPI docs for each transcoded API.

@jeffsawatzky, the Discovery docs are organized one-per-API-version. These are all generated from protos, either from public or internal ones, and each API version corresponds to a directory path that ends in a version name (v1, v2beta1, v2, etc). So the Google API methodology would produce separate Discovery docs for your service_a/v1, service_a/v2, service_b/v1, etc. and we would get those by running each directory of protos separately through protoc and plugins. AFAIK, we don't distribute any formal API descriptions that merge these Discovery docs, and I wouldn't expect that we would do that for OpenAPI documents, either... although I see that some companies do produce huge aggregated OpenAPI docs. One problem with aggregations is that they are monolithically versioned - a change to any sub-API bumps the version of the entire collection, and it's not clear what, if anything, relates the versions of the aggregate to the versions of its components.

Could aggregation be treated as a separate operation on OpenAPI documents? If so, that moves it out of scope for protoc-gen-openapi. If we really need to have it here (is there a buf-related limitation that I haven't fully grasped yet?), then I wouldn't rule it out -- but let's first discuss an interface (plugin options) to control this.

@gfelbing
Copy link
Contributor

gfelbing commented Jul 12, 2022

@timburks in regards what @jeffsawatzky proposed: I'd like to propose an PR, discussing the interface (plugin options) there: #359

We have the usecase of using buf in a monorepo.
The repo contains multiple services, each holding a set of protos, defining their own API.
Generation of a single openapi yaml doesn't make sense for us.

My suggestion would be:

add a flag output_mode with two options:

  • merged: Default, the current behavior
  • source_relative:
    • for each file (having file.Generate == true) path/to/file.proto: Create a separate path/to/file.openapi.yaml
    • each file contains only the paths (rpcs) from the proto it corresponds to, as well as all schemas it references (keeping the yamls self-contained)

optional: add a boolean flag skip_empty

  • false: Default, the current behavior
  • true: to avoid generating openapi.yaml that contain an empty skeleton, skip the generation when there are no paths.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

8 participants