Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

opentelemetry-exporter-otlp-proto-grpc breaks with protobuf 4.21.0 #2717

Closed
tl-moreno-vendra opened this issue May 26, 2022 · 10 comments · Fixed by #2720
Closed

opentelemetry-exporter-otlp-proto-grpc breaks with protobuf 4.21.0 #2717

tl-moreno-vendra opened this issue May 26, 2022 · 10 comments · Fixed by #2720
Labels
bug Something isn't working

Comments

@tl-moreno-vendra
Copy link

opentelemetry-exporter-otlp-proto-grpc is broken on its latest stable release (v1.11.1). This is due to the breaking changes contained in the latest release of protobuf. You can find more details about the breaking changes here: https://developers.google.com/protocol-buffers/docs/news/2022-05-06#python-updates

Describe your environment

  • Python version: 3.9.10
  • opentelemetry-exporter-otlp-proto-grpc version: v1.11.1

Steps to reproduce

from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter

What is the expected behavior?
The OTLPSpanExporter is imported correctly.

What is the actual behavior?
This exception is thrown:

Traceback (most recent call last):
  File "/Users/vendra/Projects/tl/tmp/python-otel/protobuf_issue.py", line 1, in <module>
    from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
  File "/Users/vendra/Library/Caches/pypoetry/virtualenvs/python-otel-issue-demo-UdOd2j3B-py3.9/lib/python3.9/site-packages/opentelemetry/exporter/otlp/proto/grpc/trace_exporter/__init__.py", line 22, in <module>
    from opentelemetry.exporter.otlp.proto.grpc.exporter import (
  File "/Users/vendra/Library/Caches/pypoetry/virtualenvs/python-otel-issue-demo-UdOd2j3B-py3.9/lib/python3.9/site-packages/opentelemetry/exporter/otlp/proto/grpc/exporter.py", line 39, in <module>
    from opentelemetry.proto.common.v1.common_pb2 import (
  File "/Users/vendra/Library/Caches/pypoetry/virtualenvs/python-otel-issue-demo-UdOd2j3B-py3.9/lib/python3.9/site-packages/opentelemetry/proto/common/v1/common_pb2.py", line 36, in <module>
    _descriptor.FieldDescriptor(
  File "/Users/vendra/Library/Caches/pypoetry/virtualenvs/python-otel-issue-demo-UdOd2j3B-py3.9/lib/python3.9/site-packages/google/protobuf/descriptor.py", line 560, in __new__
    _message.Message._CheckCalledFromGeneratedFile()
TypeError: Descriptors cannot not be created directly.
If this call came from a _pb2.py file, your generated code is out of date and must be regenerated with protoc >= 3.19.0.
If you cannot immediately regenerate your protos, some other possible workarounds are:
 1. Downgrade the protobuf package to 3.20.x or lower.
 2. Set PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python (but this will use pure-Python parsing and will be much slower).

More information: https://developers.google.com/protocol-buffers/docs/news/2022-05-06#python-updates

Additional context
More context about the issue and the latest protobuf release can be found here:
https://developers.google.com/protocol-buffers/docs/news/2022-05-06#python-updates

@tl-moreno-vendra tl-moreno-vendra added the bug Something isn't working label May 26, 2022
@srikanthccv
Copy link
Member

@TBBle
Copy link
Contributor

TBBle commented May 26, 2022

We could just change install_requires to protobuf~=3.20, although I'm not sure if the protobuf package is intentionally using "major-version==API version" like that though. (I assume it is since it's a 4.x release that broke the back-compat, but I can't see that documented anywhere)

@tl-moreno-vendra
Copy link
Author

tl-moreno-vendra commented May 26, 2022

We could just change install_requires to protobuf~=3.20, although I'm not sure if the protobuf package is intentionally using "major-version==API version" like that though. (I assume it is since it's a 4.x release that broke the back-compat, but I can't see that documented anywhere)

I ran some local tests and pinning to the previous major version of protobuf actually fixes it. Here's the list of breaking changes in the 4.x protobuf release (taken from here):

The new release does contain some breaking changes. Specifically:

  • The UnknownFields() method, which relied on an implicitly created class, is replaced with the explicitly-created UnknownFieldSet class.
  • Some non-core characteristics may have changed, such as the specific format of certain strings or error messages. These are not considered breaking changes, but may still impact your existing code base.
  • Applications that rely on sharing messages between Python and C++ break in the new version. Most developers won't be affected by this, but users of Nucleus{.external} and possibly other libraries may be. As a workaround, you can set an environment variable that forces the library to preserve compatibility.
  • Python upb requires generated code that has been generated from protoc 3.19.0 or newer.

@TBBle
Copy link
Contributor

TBBle commented May 26, 2022

Yeah, explicitly depending on protobuf~=3.20 is the workaround we're using here, and it would be a short-term solution to do this in setup.cfg.

Long-term, regenerating the code with "protoc 3.19.0 or newer" needs to happen, and in fact (per the protobuf Python install docs) there seems to be an expectation that when using protobuf version X, the generated code should be generated with protoc of the same version.

So perhaps the protobuf version should always have been pinned to the protoc version used to generate the code, and conversely, regeneration done more often to shift the pin?

@TBBle
Copy link
Contributor

TBBle commented May 26, 2022

For the record, https://github.com/open-telemetry/opentelemetry-python-contrib/blob/main/instrumentation/opentelemetry-instrumentation-grpc/setup.cfg#L53 (in the opentelemetry-python-contrib repo) has the same problem. I didn't look around to see if anything else in either repo has the overly-optimistic protobuf dependency spec.

@TBBle
Copy link
Contributor

TBBle commented May 26, 2022

I posted #2720 to fix this in the easy (and backportable) way, but in the meantime, the protobuf 4.x release package (4.21.0) has been yanked, so this is maybe not-so-urgent.

@aabmass
Copy link
Member

aabmass commented May 26, 2022

Thanks for jumping on this @TBBle!

and in fact (per the protobuf Python install docs) there seems to be an expectation that when using protobuf version X, the generated code should be generated with protoc of the same version.

I don't think this is feasible, if someone has multiple libraries with generated code which don't use the same protobuf version, they will hit dependency conflicts, e.g. a google cloud client libraries and opentelemetry-proto.

although I'm not sure if the protobuf package is intentionally using "major-version==API version" like that though. (I assume it is since it's a 4.x release that broke the back-compat, but I can't see that documented anywhere)

I think that is a fair assumption, see protocolbuffers/protobuf#10051 (comment)

@TBBle
Copy link
Contributor

TBBle commented May 27, 2022

Yup, that makes sense, since Python package managers resolve dependencies across a project, not per-project, there needs to be a commonly-compatible protobuf package for all libraries in a project.

Given that there's clearly a wide-spread habit of protobuf >= 3.x dependency-spec, based on the blast radius visible in protocolbuffers/protobuf#10051, there's going to be a period of pain as people adapt their libraries to exclude or support 4.x. Once that's past then the risk is only when there are non-4.x and 4.x-only libraries in the same project, so regenerating with newer protoc still needs to be done (and is probably a good idea anyway) over time, to eventually unstick everyone.

I see protobuf 4.21.0 has been unyanked again, and there's still some activity on that ticket, e.g.

We generally do not recommend people checkin generated source code, but instead generate it just in time as part of their builds to prevent this sorta thing. However, if it has to happen, I would also expect end users to pin the runtime to the exact version they generated said code with.

which suggests that even if not-feasible, at least some committer on that project had the same expectation as I think the docs implied (and also a workflow in mind that I don't think I've actually seen in the various protobuf-using projects I've touched in Python or Go).

@grotrek
Copy link

grotrek commented May 30, 2022

If I understand correctly if you regenerate OTEL code using protobuf version 3.19 it will be working with both 3.x proto and 4.x proto. And 3.19 does not have breaking changes with 3.13, so it should work with previously installed setups. Maybe it will be a good solution for now? Protobuf 4.x offers significantly better parsing performance and it will be sad if I cannot use it with OTEL in my project.

@TBBle
Copy link
Contributor

TBBle commented May 31, 2022

Yes, regenerating with modern protoc and changing the dependency to ~=3.19 || ~=4.0 (or probably 3.20, whatever protoc version is used) is the correct next step.

The reasons I didn't do it in my PR:

  • Stuff was broken right now and "fix that works immediately" had high value.
  • The fix I posted can be easily/safely/trivially backported if desired to create a new 1.11 release, since past releases are broken too, and 1.12.0rc0 has some outstanding issues.
  • I noticed (and more in the -contrib repo) that a bunch of stuff was depending on quite old protobuf versions, and possibly need closer examination when increasing the minimum to 3.19.x. (This shouldn't be a problem, that's not the same as knowing it's not a problem.)
  • I've never dealt with protoc Python codegen, so thought I'd leave it for someone who knows what they're doing and would know if the resulting changes were valid/test-covered etc.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants