
Feature request: ability to specify output package name to protoc #2283

Closed
geigerj opened this issue Oct 25, 2016 · 17 comments
@geigerj

geigerj commented Oct 25, 2016

Context: googleapis/artman#97. This currently affects Python only.

What

We'd like the ability to specify the output package name in calls to protoc (actually, we are currently using the gRPC Python wrapper grpc.tools.protoc). Currently the Python package corresponds exactly to the proto directory structure, for example, foo/bar/baz.proto generates the Python module foo.bar.baz_pb2. It should be possible to substitute an arbitrary value for foo.bar. Currently this only seems to be possible by moving the input proto to a different directory; ideally this could just be an argument to protoc.
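The default mapping described above can be illustrated with a small helper. `proto_to_module` is a hypothetical name used only for illustration; it is not part of protobuf or grpcio-tools, it just mimics the naming convention protoc applies:

```python
# Illustration of protoc's default Python naming convention: the generated
# module path mirrors the .proto file path (not the proto package statement).
# proto_to_module is a hypothetical helper, not a real protobuf API.

def proto_to_module(proto_path: str) -> str:
    """Return the Python module name protoc generates for a .proto file."""
    stem = proto_path[:-len(".proto")] if proto_path.endswith(".proto") else proto_path
    return stem.replace("/", ".") + "_pb2"

print(proto_to_module("foo/bar/baz.proto"))  # foo.bar.baz_pb2
```

The feature request is for a protoc argument that overrides the `foo.bar` prefix in this mapping without moving the input file.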

Why

We do code generation from googleapis/googleapis, which by convention keeps protos in a directory structure the same as the proto package. Sometimes the proto package is not usable as a Python package (for example, it may already be used by google-cloud-python). Ideally we shouldn't need to move the protos in googleapis/googleapis around just for Python code generation.

@geigerj geigerj changed the title Ability to specify output package name Feature request: ability to specify output package name to protoc Oct 25, 2016
@bjwatson

A related issue is that internal imports use the proto package rather than the directory where the proto file is found, so we have to manually fix these imports when we do this "change proto directory" trick to change the Python package. An example is the dependency of image_annotator_pb2.py on geometry_pb2.py in Cloud Vision v1.

@xfxyjwf
Contributor

xfxyjwf commented Oct 25, 2016

@haberman

@xfxyjwf
Contributor

xfxyjwf commented Oct 25, 2016

@bjwatson I don't quite understand the issue you mentioned. What do you mean by "internal imports use the proto package"? For example:

```proto
// foo/bar/baz.proto
package a.b.c;
message X {}
```

"a.b.c" is the proto package. foo.bar is the Python package, which is the same as the file path. What are the "internal imports" you are referring to?

@bjwatson

@xfxyjwf Take for example Cloud Vision image_annotator.proto.

Move it into a path like google/cloud/grpc/vision/v1 (which we have to do for technical reasons related to our Python google-cloud and GAPIC client libraries), and then use python -m grpc.tools.protoc to build google/cloud/grpc/vision/v1/image_annotator_pb2.py.

You will see that everything builds correctly for the new path (google/cloud/grpc/vision/v1), except for one line which says:

```python
# note no 'grpc' is in the path
from google.cloud.vision.v1 import geometry_pb2 as google_dot_cloud_dot_vision_dot_v1_dot_geometry__pb2
```

Maybe this won't be an issue once Jacob's feature request here is implemented (and we don't have to move the proto to a different directory), but I want to make sure this issue with our current workaround is understood.

@xfxyjwf
Contributor

xfxyjwf commented Oct 26, 2016

@bjwatson Have you updated the import statement in image_annotator.proto to point to the correct path of "google/cloud/grpc/vision/v1/geometry.proto"? If you haven't, what you observed is the right behavior.

@bjwatson

@xfxyjwf Yes, I have to manually update the import statement every time I regenerate image_annotator_pb2.py. We will work around this in the short term by scripting it, but the overarching purpose of this ticket is to future-proof what we're doing.

Basically, we need to be able to reliably build Python code that has a different path than the proto package. We have a recipe that allows us to do this right now, but there's no guarantee that future protobuf and/or gRPC changes won't break it.
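The "scripting it" workaround mentioned above might look something like the sketch below. The function name `rewrite_imports` and the package paths are taken from the Cloud Vision example in this thread; this is an illustration of the post-processing step, not a supported protobuf mechanism:

```python
import re

# Sketch of the import-rewriting workaround: after moving the protos and
# regenerating, patch the stale import paths in the generated *_pb2.py text.
# OLD_PKG/NEW_PKG use the Cloud Vision paths discussed in this thread.

OLD_PKG = "google.cloud.vision.v1"
NEW_PKG = "google.cloud.grpc.vision.v1"

def rewrite_imports(generated_source: str) -> str:
    """Replace imports of the old package path with the new one."""
    pattern = r"\bfrom %s import\b" % re.escape(OLD_PKG)
    return re.sub(pattern, "from %s import" % NEW_PKG, generated_source)

src = "from google.cloud.vision.v1 import geometry_pb2 as g\n"
print(rewrite_imports(src))  # from google.cloud.grpc.vision.v1 import geometry_pb2 as g
```

As bjwatson notes, the fragility is that this depends on the exact textual shape of generated code, which protobuf or gRPC releases may change at any time.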

@sigmonsays

+1

@xfxyjwf xfxyjwf self-assigned this Nov 2, 2016
@xfxyjwf
Contributor

xfxyjwf commented Nov 2, 2016

We talked about this issue in a protobuf team meeting, and it was pointed out that supporting an alternative Python package requires build-tool changes. Right now Google's internal build tool, blaze, assumes a .proto file foo/bar.proto will generate foo/bar_pb2.py. There would need to be a way to teach blaze that certain .proto files generate Python code in a different package, and I can imagine it's not going to be an easy change. Comparing the complexity (updating both protobuf and blaze, and potentially lots of other tools that analyze BUILD files) against the benefit (giving some people an easier way to avoid name conflicts), I would say it's not worthwhile to support.

@bjwatson

bjwatson commented Nov 2, 2016

@xfxyjwf Can you hold this issue open for a week or so? I'd like to discuss the trade-offs with you, but I have a few other urgent items I need to address first.

@xfxyjwf
Contributor

xfxyjwf commented Nov 2, 2016

@bjwatson sure, I will keep this issue open for a while.

@bjwatson

bjwatson commented Nov 2, 2016

Thanks.

@geigerj
Author

geigerj commented Nov 8, 2016

@xfxyjwf Is this still problematic if it is done by a proto option, like java_package/go_package?
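For context, the per-language options geigerj is referring to look like this in a .proto file. The java_package and go_package options are real; the python_package option is the hypothetical one proposed in this thread and does not exist in protobuf (the go_package value is illustrative):

```proto
syntax = "proto3";
package google.cloud.vision.v1;

// Existing per-language overrides:
option java_package = "com.google.cloud.vision.v1";
option go_package = "example.com/gen/vision/v1";  // illustrative value

// Hypothetical option proposed in this thread; NOT supported by protoc:
// option python_package = "google.cloud.grpc.vision.v1";
```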

@xfxyjwf
Contributor

xfxyjwf commented Nov 8, 2016

@geigerj I don't know about Go, but for Java the java_package option doesn't affect the generated artifacts of blaze, because we put all generated Java code into one single srcjar file. A python_package option would work differently: the generated .py file would be in a different path, and that would require a blaze build-rule attribute to work (like the py_api_version we have internally). It's doable, but I don't see a very compelling reason to support it. In this particular issue, a python_package option does not eliminate package conflicts; it just offers an alternative way to rename packages. I believe there are other means to rename packages without this option. Yes, the alternative may be more complicated than a python_package option, but the complication a python_package option adds to protobuf/blaze is much greater.

@bjwatson

bjwatson commented Dec 5, 2016

@xfxyjwf I think you can close this issue based on our discussion a few weeks ago. @geigerj do you have any concerns about that?

@geigerj
Author

geigerj commented Dec 5, 2016

@bjwatson Nope, no concerns.

@tjerkw

tjerkw commented Jan 17, 2020

This is weird. So I cannot generate Python proto files in a specific package?
Do I really have to manually edit this?

@wyattanderson

> It's doable but I don't see a very compelling reason to support it.

Is this issue not a compelling enough reason to support this? How else could we resolve conflicts like this? This is extremely cumbersome to work around otherwise given how many places in the generated code this would need to be fixed—not just import paths but the __module__ property on GeneratedProtocolMessageType. Maintaining this externally requires deep knowledge of the internals of generated code and is very brittle. An option is the only way to facilitate this in a stable, maintainable manner.

> In this particular issue, a python_package option does not eliminate package conflicts. It just offers you an alternative approach to rename packages.

On its own does it eliminate package conflicts? No. Would it provide a way for package authors and Bazel users to eliminate package conflicts? Yes.

> I believe there are other means to rename packages without this option, and yes, the alternative may be more complicated than a python_package option, but the complication a python_package option adds to protobuf/blaze is much more than that.

What alternative is there for the solution above? Is it really a compelling alternative to have to add an intermediate processing step to change generated code—the structure of which may change upstream at any time? Again, it's not just a matter of changing imports in an AST-aware manner, but internal string constants like __module__ need to change as well.
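The point about `__module__` can be shown without protobuf installed. Generated code passes a module path as a plain string into the message metaclass (GeneratedProtocolMessageType), which `type()` mimics here; `SerializedMessage` and the path are illustrative. Any rename workaround has to patch such string constants as well as imports:

```python
# Plain-Python illustration: generated protobuf code embeds the module path
# as a string constant handed to the message metaclass, so moving the package
# leaves this string stale unless it is patched too.
# SerializedMessage and the module path are hypothetical examples.

SerializedMessage = type(
    "Geometry",
    (object,),
    {"__module__": "google.cloud.vision.v1.geometry_pb2"},  # stale path
)

print(SerializedMessage.__module__)  # google.cloud.vision.v1.geometry_pb2
```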
