[Core] Upgrade grpc from 1.46.6 to 1.57.0 #39210
Signed-off-by: Jiajun Yao <jeromeyjj@gmail.com>
@@ -54,7 +54,8 @@ class LocalModeTaskSubmitter : public TaskSubmitter {

   absl::Mutex actor_contexts_mutex_;

-  std::unordered_map<std::string, ActorID> named_actors_ GUARDED_BY(named_actors_mutex_);
+  std::unordered_map<std::string, ActorID> named_actors_
+      ABSL_GUARDED_BY(named_actors_mutex_);
Breaking change from absl:
The legacy spellings of the thread annotation macros/functions (e.g. GUARDED_BY()) [have been removed by default](https://github.com/abseil/abseil-cpp/commit/6acb60c161f1203e6eca929b87f2041da7714bfe) in favor of the ABSL_ prefixed versions (e.g. ABSL_GUARDED_BY()) due to clashes with other libraries. The compatibility macro ABSL_LEGACY_THREAD_ANNOTATIONS can be defined on the compile command-line to temporarily restore these spellings, but this compatibility macro will be removed in the future.
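The rename described above is mechanical, so it can be scripted. The following is a minimal sketch, not the tooling used in this PR: the macro list is a partial, assumed subset of the legacy spellings, and `add_absl_prefix` is a hypothetical helper name.

```python
import re

# Partial, assumed list of legacy spellings; extend it as needed.
LEGACY_MACROS = (
    "PT_GUARDED_BY",
    "GUARDED_BY",
    "EXCLUSIVE_LOCKS_REQUIRED",
    "SHARED_LOCKS_REQUIRED",
    "LOCKS_EXCLUDED",
)

# Match a legacy macro only when it is a whole identifier followed by "(",
# so already-prefixed names such as ABSL_GUARDED_BY are left alone.
_PATTERN = re.compile(r"(?<!\w)(" + "|".join(LEGACY_MACROS) + r")(?=\s*\()")


def add_absl_prefix(source: str) -> str:
    """Rewrite legacy thread-annotation macros to their ABSL_ spellings."""
    return _PATTERN.sub(lambda m: "ABSL_" + m.group(1), source)


before = (
    "std::unordered_map<std::string, ActorID> named_actors_ "
    "GUARDED_BY(named_actors_mutex_);"
)
after = add_absl_prefix(before)
print(after)
```

Because the pattern refuses to match when the macro is preceded by an identifier character, running the helper twice leaves already-converted code unchanged.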
python/setup.py
Outdated
@@ -349,7 +349,7 @@ def get_packages(self):
         "numpy >= 1.16; python_version < '3.9'",
         "numpy >= 1.19.3; python_version >= '3.9'",
         "packaging",
-        "protobuf >= 3.15.3, != 3.19.5",
+        "protobuf >= 3.20.0",
@jjyao can you post screenshots of the before and after memory usage for the serve workload?
Note: all the core tests are failing.
@@ -8,6 +8,11 @@ build:windows --action_env=PATH
# For --compilation_mode=dbg, consider enabling checks in the standard library as well (below).
build --compilation_mode=opt
# Using C++ 17 on all platforms.
build:linux --host_cxxopt="-std=c++17"
Are these necessary?
This is needed to compile the upgraded abseil.
strip_prefix = "protobuf-2c5fa078d8e86e5f4bd34e6f4c9ea9e8d7d4d44a",
urls = [
# https://github.com/protocolbuffers/protobuf/commits/v23.4
"https://github.com/protocolbuffers/protobuf/archive/2c5fa078d8e86e5f4bd34e6f4c9ea9e8d7d4d44a.tar.gz",
Does it mean we will use the lower version? (3.19 -> 2.3?)
It's not 2.3 but 23, so 19 -> 23.
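The confusion above comes from reading the tag as a decimal number. A minimal sketch of comparing such tags numerically, component by component, is shown here; `parse_tag` is a hypothetical helper, and the tag strings are the ones quoted in this PR's description.

```python
def parse_tag(tag: str) -> tuple:
    """Parse a tag like 'v23.4' into (23, 4); a leading 'v' is optional."""
    return tuple(int(part) for part in tag.lstrip("v").split("."))


old = parse_tag("v19.4")
new = parse_tag("v23.4")

# Component-wise integer comparison: 23 > 19, so this is an upgrade.
print(old, new, new > old)

# Misreading the tag as "2.3" would look like a downgrade instead.
print(parse_tag("v2.3") < old)
```

Comparing integer tuples avoids both the decimal misreading and the pitfalls of comparing version strings lexicographically.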
@@ -279,11 +289,11 @@ def ray_deps_setup():
# https://github.com/grpc/grpc/blob/1ff1feaa83e071d87c07827b0a317ffac673794f/bazel/grpc_deps.bzl#L189
# Ensure this rule matches the rule used by grpc's bazel/grpc_deps.bzl
name = "boringssl",
sha256 = "534fa658bd845fd974b50b10f444d392dfd0d93768c4a51b61263fd37d851c40",
strip_prefix = "boringssl-b9232f9e27e5668bc0414879dcdedb2a59ea75f2",
sha256 = "0675a4f86ce5e959703425d6f9063eaadf6b61b7f3399e77a154c0e85bad46b1",
how did we choose this? is it for a specific version (add a comment for the version?)
# Ensure this rule matches the rule used by grpc's bazel/grpc_deps.bzl
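A pinned sha256 like the one in the hunk above can be checked independently of the build. This is a minimal sketch using only the standard library; `matches_pinned_sha256` is a hypothetical helper, and the payload is a stand-in rather than the real boringssl archive.

```python
import hashlib


def matches_pinned_sha256(archive_bytes: bytes, pinned_hex: str) -> bool:
    """Return True when the archive's SHA-256 digest equals the pinned value."""
    actual = hashlib.sha256(archive_bytes).hexdigest()
    return actual == pinned_hex.lower()


# Stand-in payload; in practice this would be the downloaded .tar.gz contents.
payload = b"example archive contents"
pinned = hashlib.sha256(payload).hexdigest()

print(matches_pinned_sha256(payload, pinned))
print(matches_pinned_sha256(payload + b"tampered", pinned))
```

Since the digest is tied to the exact archive bytes, changing the pinned commit means the sha256 has to change with it, which is why the two values move together in the diff.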
patches = [
"@com_github_grpc_grpc//third_party:protobuf.patch",
],
patch_args = ["-p1"],
can you add a code comment explaining why this patch is needed?
# This is copied from grpc's bazel/grpc_deps.bzl
@@ -8,8 +8,8 @@ def gen_java_deps():
         "com.github.java-json-tools:json-schema-validator:2.2.14",
         "com.google.code.gson:gson:2.9.1",
         "com.google.guava:guava:32.0.1-jre",
-        "com.google.protobuf:protobuf-java:3.19.6",
-        "com.google.protobuf:protobuf-java-util:3.19.6",
+        "com.google.protobuf:protobuf-java:3.23.4",
maybe not necessary?
Needed since we upgraded protoc.
@@ -30,6 +30,8 @@ diff --git bazel/cython_library.bzl bazel/cython_library.bzl
+ srcs = [stem + ".cpp"] + cc_kwargs.pop("srcs", []),
can you tell me why we need these changes?
These are existing patches; I don't know why we needed them in the first place.
@@ -5,7 +5,7 @@ index 3bb5962..706b9b4 100644
@@ -25,7 +25,7 @@ namespace opencensus {
namespace exporters {
namespace stats {
why are there logic changes in this file?
No logic change, only GUARDED_BY -> ABSL_GUARDED_BY.
@@ -1,12 +0,0 @@
diff --git third_party/py/python_configure.bzl third_party/py/python_configure.bzl
should we delete a file? Seems like it is not used?
BUILD.bazel
Outdated
"@io_opencensus_proto//opencensus/proto/metrics/v1:metrics.proto",
"@io_opencensus_proto//opencensus/proto/resource/v1:resource.proto",
] + glob([
"src/ray/protobuf/**/*.proto",
I feel that not all protos are actually needed. For example, not all protos are compiled for Python.
Yeah, I just kept it simple and compiled everything; it should do no harm.
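To see what "compile everything" picks up, here is a rough sketch of how a recursive `**/*.proto` pattern behaves. It uses Python's `pathlib` as a stand-in, on the assumption that Bazel's `glob` treats `**` the same way (zero or more directory levels); the file names are hypothetical and created in a temporary directory.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as root:
    base = Path(root)
    # Hypothetical layout: one top-level proto, one nested proto, one non-proto.
    for rel in (
        "src/ray/protobuf/common.proto",
        "src/ray/protobuf/experimental/autoscaler.proto",
        "src/ray/protobuf/README.md",
    ):
        path = base / rel
        path.parent.mkdir(parents=True, exist_ok=True)
        path.touch()

    # "**" matches zero or more directory levels, so both protos are found.
    matched = sorted(
        p.relative_to(base).as_posix()
        for p in base.glob("src/ray/protobuf/**/*.proto")
    )

print(matched)
```

The pattern selects by extension only, so every `.proto` under the directory is included whether or not a Python target consumes it, which is the trade-off discussed in the two comments above.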
codeowner stamp for lint change
LGTM if tests pass from my end (please get @iycheng's approval before merging it!). It seems like it has 13K lines of changes; is this expected?
I think a lot of it is checking in generated pb files.
@jjyao I think the core nightly test failure is fixed by 313a923. You can double-check by downloading gcs_server.out and checking whether the check failure comes from the autoscaler manager. Can you also post the performance difference from master here, just in case, to see if there's a big change?
Microbenchmark:
Seems ok.
Stamping for test_state_api_log being flaky for known problems. #39410
1.46.6 has a memory leak and 1.57.0 fixes it.
grpc: 1.46.6 -> 1.57.0
protobuf: v19.4 -> v23.4
absl: 20220623.1 -> 20230125.3
boringssl: b9232f9e27e5668bc0414879dcdedb2a59ea75f2 -> 342e805bc1f5dfdd650e3f031686d6c939b095d9
Signed-off-by: Jiajun Yao <jeromeyjj@gmail.com>
Co-authored-by: Yi Cheng <74173148+iycheng@users.noreply.github.com>
Revert to the old way (before #39210) of auto-generating protobuf and grpc code so that we don't need to manually generate the files and check them in. Also decouple the protobuf version we use in C++ from the protobuf version we use to auto-generate Python and Java code, so that they can be upgraded independently. (The protobuf version used to auto-generate Python code affects which Python protobuf library versions Ray can support.) Signed-off-by: Jiajun Yao <jeromeyjj@gmail.com>
Why are these changes needed?
grpc 1.46.6 has a memory leak, and 1.57.0 fixes it.
grpc: 1.46.6 -> 1.57.0
protobuf: v19.4 -> v23.4
absl: 20220623.1 -> 20230125.3
boringssl: b9232f9e27e5668bc0414879dcdedb2a59ea75f2 -> 342e805bc1f5dfdd650e3f031686d6c939b095d9
For a serve workload, the memory usage before and after:
![Screenshot 2023-09-01 at 7 40 03 AM](https://private-user-images.githubusercontent.com/898023/266067329-06218117-07e9-4929-a526-ae88e4b60257.png)
![Screenshot 2023-09-01 at 7 40 26 AM](https://private-user-images.githubusercontent.com/898023/266067351-47944600-3ebc-417d-a60d-adc1bf7c16e2.png)
![Screenshot 2023-09-01 at 7 41 09 AM](https://private-user-images.githubusercontent.com/898023/266067369-4c45f73f-0344-4dbc-b992-ae7f94ce1b2a.png)
NOTE
For now, we check in the auto-generated proto source files for Python, since we need to use an old version of protoc to support Python 3.7. Once we deprecate Python 3.7, we can undo this part.
Related issue number
Closes #38591
Checks
I've signed off every commit (git commit -s) in this PR.
I've run scripts/format.sh to lint the changes in this PR.
If I added a method in Tune, I've added it in doc/source/tune/api/ under the corresponding .rst file.