[Build] Explicitly set protobuf dependency version to 3.16.0 #15756

Merged
merged 1 commit into from
May 13, 2021

Conversation

mwtian
Member

@mwtian mwtian commented May 12, 2021

Why are these changes needed?

This change is based on #14117 (comment) and #14728. It allows building Ray with Bazel 4.0.0, which is currently the latest version. Declaring the protobuf dependency explicitly should keep the build from implicitly picking up an older version of protobuf.

Java protobuf dependencies are also updated to the same version, to avoid build failures.
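
The consistency requirement above can be sketched as a small check. This is only an illustration, not part of the PR: the component names (`bazel_workspace`, `java`) and the `<older>` placeholder are hypothetical, and the real versions are declared in the Bazel and Java build files rather than in a Python dict.

```python
from collections import defaultdict


def group_by_version(declared):
    """Group component names by the protobuf version each one declares."""
    groups = defaultdict(list)
    for component, version in declared.items():
        groups[version].append(component)
    return dict(groups)


def is_consistent(declared):
    """Return True when every component declares the same version."""
    return len(set(declared.values())) <= 1


# Hypothetical declarations after this change: both sides pinned to 3.16.0.
pinned = {"bazel_workspace": "3.16.0", "java": "3.16.0"}

# Hypothetical declarations before: one side still resolves an older version.
# "<older>" is a placeholder, not a real version number.
drifted = {"bazel_workspace": "3.16.0", "java": "<older>"}

print(is_consistent(pinned))
print(group_by_version(drifted))
```

A check like this would flag the kind of mismatch that the Java dependency update is meant to prevent.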

Related issue number

Closes #14117

Checks

  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

@mwtian mwtian marked this pull request as draft May 12, 2021 04:03
@mwtian
Member Author

mwtian commented May 12, 2021

Test failure in windows-msvc should hopefully be fixed by #15758.
IIUC, the segfault in the Java test io.ray.test.JobConfigTest.testInActor in buildkite/ray-builders-pr/java is unrelated and also happens in other pull requests: https://buildkite.com/ray-project/ray-builders-pr/builds/5154#b8c2dfd3-d572-41e1-bdc8-6413e6f17e0f

@mwtian mwtian marked this pull request as ready for review May 12, 2021 06:24
@mwtian mwtian changed the title Explicitly set protobuf dependency version to 3.16.0 [Build] Explicitly set protobuf dependency version to 3.16.0 May 12, 2021
… bazel 4.0.0

Java protobuf dependency version is made to be consistent as well.
@mwtian
Member Author

mwtian commented May 13, 2021

The remaining test failure is from //python/ray/tests:test_placement_group with:

2021-05-12 23:18:23,724	WARNING worker.py:1102 -- The agent on node 1e6b800dbf31 failed with the following error:
Traceback (most recent call last):
  File "/ray/python/ray/new_dashboard/agent.py", line 326, in <module>
    loop.run_until_complete(agent.run())
  File "/opt/miniconda/lib/python3.6/asyncio/base_events.py", line 488, in run_until_complete
    return future.result()
  File "/ray/python/ray/new_dashboard/agent.py", line 138, in run
    modules = self._load_modules()
  File "/ray/python/ray/new_dashboard/agent.py", line 92, in _load_modules
    c = cls(self)
  File "/ray/python/ray/new_dashboard/modules/reporter/reporter_agent.py", line 148, in __init__
    self._metrics_agent = MetricsAgent(dashboard_agent.metrics_export_port)
  File "/ray/python/ray/_private/metrics_agent.py", line 76, in __init__
    namespace="ray", port=metrics_export_port)))
  File "/ray/python/ray/_private/prometheus_exporter.py", line 334, in new_stats_exporter
    options=option, gatherer=option.registry, collector=collector)
  File "/ray/python/ray/_private/prometheus_exporter.py", line 266, in __init__
    self.serve_http()
  File "/ray/python/ray/_private/prometheus_exporter.py", line 321, in serve_http
    port=self.options.port, addr=str(self.options.address))
  File "/opt/miniconda/lib/python3.6/site-packages/prometheus_client/exposition.py", line 149, in start_wsgi_server
    httpd = make_server(addr, port, app, ThreadingWSGIServer, handler_class=_SilentHandler)
  File "/opt/miniconda/lib/python3.6/wsgiref/simple_server.py", line 153, in make_server
    server = server_class((host, port), handler_class)
  File "/opt/miniconda/lib/python3.6/socketserver.py", line 456, in __init__
    self.server_bind()
  File "/opt/miniconda/lib/python3.6/wsgiref/simple_server.py", line 50, in server_bind
    HTTPServer.server_bind(self)
  File "/opt/miniconda/lib/python3.6/http/server.py", line 136, in server_bind
    socketserver.TCPServer.server_bind(self)
  File "/opt/miniconda/lib/python3.6/socketserver.py", line 470, in server_bind
    self.socket.bind(self.server_address)
OSError: [Errno 98] Address already in use

https://buildkite.com/ray-project/ray-builders-pr/builds/5213#a0ad406a-a457-47ec-9fbc-e341268a2686
I believe this is a known issue from #13763 (review).
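
For readers unfamiliar with this error: the traceback ends in `socket.bind` failing because another process already holds the metrics export port. A minimal sketch of how the error arises, and of one common way to sidestep it in tests (asking the OS for a free port by binding to port 0), follows. The helper names `find_free_port` and `port_in_use` are hypothetical and are not Ray APIs; this is not how the PR or Ray handles the issue.

```python
import errno
import socket


def find_free_port(host="127.0.0.1"):
    """Ask the OS for a currently unused TCP port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind((host, 0))
        return s.getsockname()[1]


def port_in_use(port, host="127.0.0.1"):
    """Return True if binding (host, port) fails with EADDRINUSE."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        try:
            s.bind((host, port))
        except OSError as exc:
            # On Linux, EADDRINUSE is errno 98, as in the traceback above.
            if exc.errno == errno.EADDRINUSE:
                return True
            raise
        return False


# Hold a port open, the way an already running exporter would.
holder = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
holder.bind(("127.0.0.1", 0))
holder.listen(1)
held_port = holder.getsockname()[1]

print(port_in_use(held_port))  # a second bind to the held port is refused
holder.close()
```

Note that a port returned by `find_free_port` can still be taken by another process before it is used, so this reduces such collisions rather than eliminating them.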

There is also a CI build that timed out: https://travis-ci.com/github/ray-project/ray/jobs/505058849. CI builds from most other pull requests seem to use cached C++ build artifacts and spend little time building C++ libraries, so the timeout is likely due to the new protobuf dependency forcing many libraries to be rebuilt. If so, the CI slowdown should be transient.
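
The caching argument above can be illustrated with a toy model. This is a simplified sketch, not Bazel's actual cache-key algorithm: the target names (`ray_proto`, `raylet`, `util`), the file names, and the `<older>` version placeholder are all hypothetical. The idea is that each target's cache key covers its own inputs plus the keys of its dependencies, so changing one low-level dependency changes the key of everything built on top of it.

```python
import hashlib


def action_key(name, sources, dep_keys):
    """Hash a target's own inputs together with the keys of its dependencies."""
    h = hashlib.sha256()
    h.update(name.encode())
    for src in sources:
        h.update(src.encode())
    for key in sorted(dep_keys):
        h.update(key.encode())
    return h.hexdigest()


def build_keys(protobuf_version):
    """Compute cache keys for a hypothetical four-target build graph."""
    protobuf = action_key("protobuf", [protobuf_version], [])
    ray_proto = action_key("ray_proto", ["common.proto"], [protobuf])
    raylet = action_key("raylet", ["main.cc"], [ray_proto])
    util = action_key("util", ["util.cc"], [])  # does not depend on protobuf
    return {
        "protobuf": protobuf,
        "ray_proto": ray_proto,
        "raylet": raylet,
        "util": util,
    }


before = build_keys("<older>")  # placeholder for whatever version was used before
after = build_keys("3.16.0")

# Targets whose key changed miss the cache and must be rebuilt.
invalidated = sorted(name for name in before if before[name] != after[name])
print(invalidated)
```

In this model only `util` keeps its cached artifact. Once the rebuilt artifacts are cached again, later builds hit the cache, which is why the slowdown is expected to be temporary.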

@rkooo567
Contributor

I will just re-run tests just in case.

@mwtian
Member Author

mwtian commented May 13, 2021

> I will just re-run tests just in case.

Thanks! The rerun of the tests seems to pass. I'm rerunning the build too.

@rkooo567
Contributor

The failing test is a known flaky test. Merging it.

@rkooo567 rkooo567 merged commit dce13d3 into ray-project:master May 13, 2021
ijrsvt added a commit to ijrsvt/ray that referenced this pull request May 13, 2021
Development

Successfully merging this pull request may close these issues.

[Bazel] Build error from source
4 participants