Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't generate a file if the .proto doesn't contain any services #42

Closed
rwlincoln opened this issue Oct 9, 2018 · 9 comments
Closed
Assignees

Comments

@rwlincoln
Copy link

Some protobuf packages may be split into multiple .proto files, but generated with a single call to protoc. For example:

/myapi/myapi.proto
/myapi/services.proto

In this case, a myapi_grpc.py will be generated even if myapi.proto doesn't contain any services. Could grpclib be modified to only generate files if a service is defined in the corresponding .proto file.

@rwlincoln
Copy link
Author

The official gRPC Python plug-in generates _pb2_grpc.py files even if the corresponding .proto file doesn't contain any services. As a workaround, they can be deleted with the command:

find . -type f -name "*_grpc.py" -size -500c -delete

@vmagamedov vmagamedov self-assigned this Oct 9, 2018
@vmagamedov
Copy link
Owner

I agree with you, it is worthless to generate stub files for .proto files without service definitions.

@aaliddell
Copy link
Contributor

aaliddell commented Mar 24, 2019

This change actually complicates using this package in build systems, as now they need to be aware of the content of files to tell which will actually produce output files. More precisely, there is no longer a N input -> NxM output mapping of files as you would expect from a typical 'compiler' such as you'd see *.cpp->*.o etc (e.g if gcc started omitting .o files for empty .cpp files, there would be a similar issue).

All other protoc plugins that I have come across (bar one exception that has this same build issue) produce this consistent N input NxM output (or single output for N input) behaviour. See grpc/grpc#14172 for this same discussion on mainline grpc, which settled on keeping the empty output files for the reason given above, amongst others.

So, would it be possible to reconsider this change, as removing these files hasn't gained much except some 'cleanliness', but breaks use with build systems?

@vmagamedov
Copy link
Owner

@aaliddell I see that there are two common ways to generate files:

  • lazy approach: using one-liner with **/*.proto wildcards
  • manage several lists of proto files: messages-only, messages+services, services-only

I'm using second approach in my projects, so I precisely know which files contain messages, and which files contain services. So I don't have any problems in any case.

For me it is strange that if I have 10 files with only messages and 1 file with a service, build system generates 22 files where 11 files are empty. To be fair there will be 10 empty files, pb2.py file of a service will contain a service descriptor.

Here is an example, where build system knows explicitly when proto file contains services: https://github.com/envoyproxy/envoy/blob/d841d20/api/envoy/api/v2/BUILD#L35

Can you elaborate on how current behavior brakes use with build systems? Maybe I'm missing something. grpc/grpc#14172 issue doesn't answers this question.

@aaliddell
Copy link
Contributor

Take for example the current Bazel rules_proto package, which is the current advised method of generating grpc libraries for that build system. Whilst in the analysis (setup) phase, Bazel expects to construct a graph of all input files, output files and processes, so that it can figure out the sequence of operations to produce a given target. In order to do this, Bazel must know in advance what outputs a given process (compiler, protoc etc) will produce given any valid set of inputs (i.e the operation must be consistent across inputs). Using this consistency, the build system can predict at the analysis phase that running protoc with example.proto will always produce two outputs (example_pb2.py and example_pb2_grpc.py) without ever looking at the files, which may not even exist yet if they are created by a previous operation. This same expectancy on consistent compiler behaviour extends to pretty much any graph based build system, beyond just Bazel.

Your second method works for single package builds where the proto files are known and segmented in advance, but once you start using dependency chains on proto libraries it begins to fail, as there is typically no method for distinguishing proto file types in transitive dependencies. It's also particularly brittle to unexpectedly fail if you happen to add or remove services to a file in the future without remapping the groups, as the build system will keel over because an output file it expected now doesn't exist, or a new unexpected file pops up.

So to sum up in one sentence: In its current state, the grpclib package cannot reasonably integrate with such build systems, as the outputs generated are not consistently determined by just the input file names, but rather by the content of the input files.

The reason this came up was that I was looking to add grpclib support to rules_proto, but to do so one must be able to specify a proto_plugin, which requires this consistent behaviour. See the outputs lines here, which specify what output file suffix will be generated for every input file: https://github.com/stackb/rules_proto/blob/master/python/BUILD.bazel https://github.com/stackb/rules_proto/blob/master/plugin.bzl#L32-L34

@vmagamedov
Copy link
Owner

Thank you for the explanation. Seems reasonable to revert original behavior.

vmagamedov added a commit that referenced this issue Mar 26, 2019
@aaliddell
Copy link
Contributor

Ok, thank you.

If you want a middle-ground, perhaps there could be an option to the plugin using the passed parameter (https://github.com/protocolbuffers/protobuf/blob/master/src/google/protobuf/compiler/plugin.proto#L75) to select that empty services may be skipped for those that don't want them.

@vmagamedov
Copy link
Owner

perhaps there could be an option to the plugin using the passed parameter

This is how this issue should have been fixed from the beginning 🤦‍♂️ Thanks!

@vmagamedov
Copy link
Owner

Implemented skip-empty parameter for protoc plugin long time ago, but can't merge this feature, this just doesn't feels right. It is better to properly configure build system with explicit inputs and outputs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants