[Jobs] Couldn't build proto file into descriptor pool! exception after installing ray in project #31358

Closed
brary opened this issue Dec 29, 2022 · 10 comments · Fixed by #31632
Labels:
bug: Something that is supposed to be working; but isn't
core: Issues that should be addressed in Ray Core
core-clusters: For launching and managing Ray clusters/jobs/kubernetes
P0: Issues that should be fixed in short order

Comments


brary commented Dec 29, 2022

Couldn't build proto file into descriptor pool! exception after installing ray in project

Hi, I have an existing, working project to which I added the ray[default] dependency. Since then, the following error is raised on import:

from opentelemetry.exporter.opencensus.trace_exporter import OpenCensusSpanExporter
  File "/tmp/ray/session_2022-12-23_07-50-00_227742_8/runtime_resources/pip/513edf5d83ec05c69a73be1be828c66f252e0b48/virtualenv/lib/python3.7/site-packages/opentelemetry/exporter/opencensus/trace_exporter/__init__.py", line 21, in <module>
    from opencensus.proto.agent.trace.v1 import (
  File "/tmp/ray/session_2022-12-23_07-50-00_227742_8/runtime_resources/pip/513edf5d83ec05c69a73be1be828c66f252e0b48/virtualenv/lib/python3.7/site-packages/opencensus/proto/agent/trace/v1/trace_service_pb2.py", line 16, in <module>
    from opencensus.proto.resource.v1 import resource_pb2 as opencensus_dot_proto_dot_resource_dot_v1_dot_resource__pb2
  File "/tmp/ray/session_2022-12-23_07-50-00_227742_8/runtime_resources/pip/513edf5d83ec05c69a73be1be828c66f252e0b48/virtualenv/lib/python3.7/site-packages/opencensus/proto/resource/v1/resource_pb2.py", line 22, in <module>
    serialized_pb=_b('\n+opencensus/proto/resource/v1/resource.proto\x12\x1copencensus.proto.resource.v1\"\x8b\x01\n\x08Resource\x12\x0c\n\x04type\x18\x01 \x01(\t\x12\x42\n\x06labels\x18\x02 \x03(\x0b\x32\x32.opencensus.proto.resource.v1.Resource.LabelsEntry\x1a-\n\x0bLabelsEntry\x12\x0b\n\x03key\x18\x01 \x01(\t\x12\r\n\x05value\x18\x02 \x01(\t:\x02\x38\x01\x42\x98\x01\n\x1fio.opencensus.proto.resource.v1B\rResourceProtoP\x01ZEgithub.com/census-instrumentation/opencensus-proto/gen-go/resource/v1\xea\x02\x1cOpenCensus.Proto.Resource.V1b\x06proto3')
  File "/home/ray/anaconda3/lib/python3.7/site-packages/google/protobuf/descriptor.py", line 982, in __new__
    return _message.default_pool.AddSerializedFile(serialized_pb)
TypeError: Couldn't build proto file into descriptor pool!
Invalid proto descriptor for file "opencensus/proto/resource/v1/resource.proto":
  opencensus/proto/resource/v1/resource.proto: A file with this name is already in the pool.

Versions / Dependencies

opentelemetry-exporter-opencensus = 0.17b0
ray = 2.2.0

Reproduction script

test.py

from opentelemetry.exporter.opencensus.trace_exporter import OpenCensusSpanExporter
from ray.job_submission import JobSubmissionClient, JobStatus

pyproject.toml

[tool.poetry]
name = "testing"
version = "0.1.0"
description = "Testing Library"
authors = ["Engineering <navinder_brar@yahoo.com>"]

[[tool.poetry.source]]
name = 'pypi-public'
url = "https://pypi.org/simple/"

[tool.poetry.dependencies]
python = "^3.7"
ray = {extras = ["default"], version = "^2.2.0"}
opentelemetry-api = "^0.17b0"
opentelemetry-sdk = "^0.17b0"
opentelemetry-instrumentation-grpc = "^0.17b0"
opentelemetry-instrumentation = "^0.17b0"
opentelemetry-exporter-opencensus = "^0.17b0"
opentelemetry-exporter-prometheus = "^0.17b0"
protobuf = "3.20.3"

[build-system]
requires = ["poetry>=0.12", "pip>=20.2"]
build-backend = "poetry.masonry.api"

After installing the dependencies with poetry install, run test.py.

Issue Severity

High: It blocks me from completing my task.

@brary brary added bug Something that is supposed to be working; but isn't triage Needs triage (eg: priority, bug/not-bug, and owning component) labels Dec 29, 2022
@brary brary changed the title [Jobs] [Jobs] Couldn't build proto file into descriptor pool! exception after installing ray in project Dec 29, 2022
@architkulkarni architkulkarni added the core-clusters For launching and managing Ray clusters/jobs/kubernetes label Dec 30, 2022
@architkulkarni
Contributor

Hi @brary, how are you running test.py? Based on the traceback, it looks like you're specifying a runtime_env with the pip field, and somehow import OpenCensusSpanExporter is failing in that environment. Knowing the full runtime_env specification would help debug this.

Author

brary commented Dec 31, 2022

For now I am running it directly via python test.py. I am not running this on Ray worker nodes. I have only added the ray dependency to my project, and the imports now fail because the proto files clash.

Contributor

rkooo567 commented Jan 6, 2023

I think the root cause is that Ray already has the same protobuf built into its own package. See ray/BUILD.bazel, line 2908 in 5695c93:

sed -i -E 's/from opencensus.proto.resource.v1 import/from . import/' "$${files[@]}"

There is a collision because of that (Ray uses opencensus as a dependency).
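To make the effect of that sed expression concrete, here is a minimal Python sketch of the same substitution applied to the import line seen in the tracebacks. The helper name `rewrite_imports` is hypothetical and only for illustration; the pattern and replacement are copied from the sed command. The rewrite changes the Python import path only. The proto file name stored inside the serialized descriptor stays `opencensus/proto/resource/v1/resource.proto`, which is the name the error complains about.

```python
import re

# Pattern and replacement copied from the sed expression quoted above.
# As in sed's regex, the unescaped dots match any character.
PATTERN = r"from opencensus.proto.resource.v1 import"
REPLACEMENT = "from . import"


def rewrite_imports(source: str) -> str:
    """Apply the substitution once per line, like sed's s/// without the g flag.

    Hypothetical helper for illustration; Ray's build runs sed, not this code.
    """
    return "\n".join(
        re.sub(PATTERN, REPLACEMENT, line, count=1) for line in source.split("\n")
    )


original = (
    "from opencensus.proto.resource.v1 import resource_pb2 "
    "as opencensus_dot_proto_dot_resource_dot_v1_dot_resource__pb2"
)
rewritten = rewrite_imports(original)
print(rewritten)
```

The rewritten line has the same shape as the `from . import resource_pb2 as ...` line that appears in ray/core/generated/metrics_pb2.py in the traceback further down the thread.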

As a workaround, the approach in ValvePython/csgo#8 may work. We will investigate a root-cause fix soon.

@rkooo567 rkooo567 added core Issues that should be addressed in Ray Core P1 Issue that should be fixed within a few weeks Ray 2.4 and removed triage Needs triage (eg: priority, bug/not-bug, and owning component) labels Jan 6, 2023
Contributor

pcmoritz commented Jan 6, 2023

The --no-binary=protobuf workaround is already deployed by the user -- but we should make sure we fix the root cause, because (a) this is very annoying for users and (b) using the workaround can have a performance impact :)

@pcmoritz pcmoritz added Ray 2.3 and removed Ray 2.4 labels Jan 6, 2023
Contributor

pcmoritz commented Jan 6, 2023

Let's make sure we fix this for Ray 2.3 :)

Contributor

rkooo567 commented Jan 6, 2023

Sounds good. I will prioritize fixing this for 2.3.

@rkooo567 rkooo567 self-assigned this Jan 6, 2023
@zhe-thoughts zhe-thoughts added P0 Issues that should be fixed in short order and removed P1 Issue that should be fixed within a few weeks labels Jan 9, 2023
Contributor

scv119 commented Jan 9, 2023

protocolbuffers/protobuf#3002 suggests that an alternative is to set

PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION="python"

Contributor

scv119 commented Jan 9, 2023

Verified locally that setting

PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION="python"

works.
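If exporting the variable in the shell is inconvenient, the same setting can be applied from Python. The sketch below is only an illustration: the helper name `force_pure_python_protobuf` is hypothetical, and it assumes that protobuf reads the variable when `google.protobuf` is first imported, so the call has to happen before any import that pulls protobuf in.

```python
import os
import sys

ENV_VAR = "PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION"


def force_pure_python_protobuf() -> bool:
    """Request the pure-Python protobuf backend for this process.

    Hypothetical helper. Returns True if the variable was set before
    google.protobuf was imported, and False if protobuf was already loaded,
    in which case the setting most likely comes too late for this process.
    """
    already_imported = any(
        name == "google.protobuf" or name.startswith("google.protobuf.")
        for name in sys.modules
    )
    os.environ[ENV_VAR] = "python"
    return not already_imported


# Call this first, then do the imports from the reproduction script, e.g.:
#   from opentelemetry.exporter.opencensus.trace_exporter import OpenCensusSpanExporter
#   from ray.job_submission import JobSubmissionClient, JobStatus
set_in_time = force_pure_python_protobuf()
print(os.environ[ENV_VAR], set_in_time)
```

This only sidesteps the collision; it is not the root-cause fix discussed above.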

Contributor

scv119 commented Jan 9, 2023

(ray) ubuntu@ip-172-31-45-118:/tmp/workspace$ python conflict.py
Traceback (most recent call last):
  File "conflict.py", line 2, in <module>
    from ray.job_submission import JobSubmissionClient, JobStatus
  File "/home/ubuntu/.local/lib/python3.8/site-packages/ray/__init__.py", line 109, in <module>
    import ray._raylet  # noqa: E402
  File "python/ray/_raylet.pyx", line 117, in init ray._raylet
  File "/home/ubuntu/.local/lib/python3.8/site-packages/ray/exceptions.py", line 17, in <module>
    from ray.util.annotations import DeveloperAPI, PublicAPI
  File "/home/ubuntu/.local/lib/python3.8/site-packages/ray/util/__init__.py", line 5, in <module>
    from ray._private.services import get_node_ip_address
  File "/home/ubuntu/.local/lib/python3.8/site-packages/ray/_private/services.py", line 26, in <module>
    from ray._private.gcs_utils import GcsClient
  File "/home/ubuntu/.local/lib/python3.8/site-packages/ray/_private/gcs_utils.py", line 14, in <module>
    from ray.core.generated import gcs_service_pb2, gcs_service_pb2_grpc
  File "/home/ubuntu/.local/lib/python3.8/site-packages/ray/core/generated/gcs_service_pb2.py", line 18, in <module>
    from . import pubsub_pb2 as src_dot_ray_dot_protobuf_dot_pubsub__pb2
  File "/home/ubuntu/.local/lib/python3.8/site-packages/ray/core/generated/pubsub_pb2.py", line 20, in <module>
    from . import reporter_pb2 as src_dot_ray_dot_protobuf_dot_reporter__pb2
  File "/home/ubuntu/.local/lib/python3.8/site-packages/ray/core/generated/reporter_pb2.py", line 15, in <module>
    from . import metrics_pb2 as opencensus_dot_proto_dot_metrics_dot_v1_dot_metrics__pb2
  File "/home/ubuntu/.local/lib/python3.8/site-packages/ray/core/generated/metrics_pb2.py", line 17, in <module>
    from . import resource_pb2 as opencensus_dot_proto_dot_resource_dot_v1_dot_resource__pb2
  File "/home/ubuntu/.local/lib/python3.8/site-packages/ray/core/generated/resource_pb2.py", line 17, in <module>
    DESCRIPTOR = _descriptor_pool.Default().AddSerializedFile(b'\n+opencensus/proto/resource/v1/resource.proto\x12\x1copencensus.proto.resource.v1\"\xa5\x01\n\x08Resource\x12\x12\n\x04type\x18\x01 \x01(\tR\x04type\x12J\n\x06labels\x18\x02 \x03(\x0b\x32\x32.opencensus.proto.resource.v1.Resource.LabelsEntryR\x06labels\x1a\x39\n\x0bLabelsEntry\x12\x10\n\x03key\x18\x01 \x01(\tR\x03key\x12\x14\n\x05value\x18\x02 \x01(\tR\x05value:\x02\x38\x01\x42\x98\x01\n\x1fio.opencensus.proto.resource.v1B\rResourceProtoP\x01ZEgithub.com/census-instrumentation/opencensus-proto/gen-go/resource/v1\xea\x02\x1cOpenCensus.Proto.Resource.V1b\x06proto3')
TypeError: Couldn't build proto file into descriptor pool!
Invalid proto descriptor for file "opencensus/proto/resource/v1/resource.proto":
  opencensus/proto/resource/v1/resource.proto: A file with this name is already in the pool.

(ray) ubuntu@ip-172-31-45-118:/tmp/workspace$  PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION="python" python conflict.py
(ray) ubuntu@ip-172-31-45-118:/tmp/workspace$ cat conflict.py
from opentelemetry.exporter.opencensus.trace_exporter import OpenCensusSpanExporter
from ray.job_submission import JobSubmissionClient, JobStatus
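For readers trying to picture the failure, here is a toy model of the collision. It is not protobuf's implementation: `ToyDescriptorPool` is a hypothetical class, and the byte strings are placeholders rather than the real descriptors. It assumes that the process-wide pool is keyed by proto file name and rejects a second registration under the same name when the content differs. The two serialized descriptors shown in the tracebacks in this thread share the name `opencensus/proto/resource/v1/resource.proto` but are not byte-identical.

```python
from typing import Dict


class ToyDescriptorPool:
    """Illustrative stand-in for a process-wide descriptor pool (not protobuf's code)."""

    def __init__(self) -> None:
        # Maps proto file name -> serialized file descriptor.
        self._files: Dict[str, bytes] = {}

    def add_serialized_file(self, name: str, serialized: bytes) -> None:
        existing = self._files.get(name)
        # Assumption: only a *different* definition under the same name is rejected.
        if existing is not None and existing != serialized:
            raise TypeError(f"{name}: A file with this name is already in the pool.")
        self._files[name] = serialized


NAME = "opencensus/proto/resource/v1/resource.proto"
pool = ToyDescriptorPool()

# First import: the opencensus package registers its resource.proto (placeholder bytes).
pool.add_serialized_file(NAME, b"descriptor-from-opencensus-proto")

# Second import: Ray's generated resource_pb2 registers the same name with other bytes.
try:
    pool.add_serialized_file(NAME, b"descriptor-from-ray-core-generated")
    conflict = None
except TypeError as exc:
    conflict = str(exc)

print(conflict)
```

In this model, whichever module is imported second fails. That is consistent with the two tracebacks in this thread, where the failing frame is in the opencensus package in one case and in ray/core/generated in the other.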

@rkooo567
Contributor

We decided to fix this ASAP. I am working on the fix now.

6 participants