[core] add grpc opencensus plugin #39082
Nice, this would be very valuable.
Wait, it compiles on my laptop but not in CI.
Thanks! LGTM but I'll defer to @rkooo567 since he is more familiar with this code.
Can we make the gRPC stats off by default (add a system config to enable them)? Also, can you give me a list of the gRPC stats? I wonder if we should filter some of them, since there will be a lot. Lastly, can you pick which stats we should track for the multi-cloud setup?
Config added. The list of gRPC stats:
https://github.com/census-instrumentation/opencensus-specs/blob/master/stats/gRPC.md
SGTM. I guess we can also track completed RPCs by status to understand network failures.
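To make the "completed RPCs by status" idea concrete, here is a minimal sketch of how such records could be turned into a per-method failure rate. It is not code from this PR: the function name, the record shape (method, status) and the method names in the usage note are assumptions for illustration. Only the status strings follow the standard gRPC status codes.

```python
from collections import Counter


def failure_rate_by_method(completed_rpcs):
    """Return {method: fraction of completed RPCs whose status is not OK}.

    `completed_rpcs` is an iterable of (method, status) pairs. This shape is
    a hypothetical stand-in for a "completed RPCs" metric tagged by method
    and status; it is not the exporter's actual data format.
    """
    totals = Counter()
    failures = Counter()
    for method, status in completed_rpcs:
        totals[method] += 1
        if status != "OK":
            failures[method] += 1
    return {method: failures[method] / totals[method] for method in totals}
```

For example, feeding it `[("GetNode", "OK"), ("GetNode", "UNAVAILABLE"), ("Heartbeat", "OK")]` (made-up method names) gives a rate of 0.5 for `GetNode` and 0.0 for `Heartbeat`, which is the kind of signal that would point at network failures.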
LGTM! Excited to have grpc metrics :)! Nit comments on adding unit tests
TODO
- Allow choosing which components (gcs, worker, raylet) export gRPC metrics, due to cardinality concerns.
I think we can do this by creating a separate method and calling it at the beginning of the entrypoint script.
Blocked on #39210: we need gRPC upgraded to >= 1.49.0 to pick up grpc/grpc#30567. Once #39210 is merged, I will merge it into this PR and CI should work.
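A rough sketch of the "separate method called at the beginning of the entrypoint" idea, assuming the only hard requirement is ordering (register metrics before the first RPC). Everything here is hypothetical: `GrpcMetricsSetup`, `register`, `on_first_rpc` and `entrypoint` are illustrative names, not Ray or gRPC APIs.

```python
class GrpcMetricsSetup:
    """Hypothetical guard that enforces 'register before any gRPC traffic'."""

    def __init__(self):
        self.registered = False
        self.traffic_started = False

    def register(self):
        # Registering late would miss (or break) metrics, so fail loudly.
        if self.traffic_started:
            raise RuntimeError(
                "gRPC metrics must be registered before any gRPC traffic"
            )
        self.registered = True

    def on_first_rpc(self):
        # Stand-in for the moment the component sends or serves its first RPC.
        self.traffic_started = True


def entrypoint(component, enabled_components, setup):
    """Illustrative entrypoint: decide on metrics first, then start gRPC."""
    if component in enabled_components:
        setup.register()
    setup.on_first_rpc()
    return setup.registered
```

With this shape, `entrypoint("gcs", {"gcs"}, GrpcMetricsSetup())` registers metrics, while a component that is not listed simply skips registration. Calling `register()` after `on_first_rpc()` raises, which mirrors the ordering concern raised in this thread.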
Signed-off-by: Ruiyang Wang <rywang014@gmail.com>
… fix. Signed-off-by: Ruiyang Wang <rywang014@gmail.com>
Can you merge the latest master? Also, there is a lint failure, and I left some comments.
Ready to merge.
Can you update the PR description? Also, the test_prometheus_physical_stats_record failure seems related.
Updated.
gRPC has a built-in plugin to export metrics via OpenCensus, as detailed in https://github.com/census-instrumentation/opencensus-specs/blob/master/stats/gRPC.md . We already use OpenCensus to collect Ray metrics, and here we add the gRPC-sourced metrics to the list. Because of the potential performance impact, gRPC metrics collection is disabled by default and can be enabled per component via the config enable_grpc_metrics_collection_for. For now only gcs is supported; follow-up patches are planned to support raylet and core_worker. They are not supported right away because the plugin must be configured before any gRPC traffic, but the raylet and the core worker initialize their stats only after a gRPC call to the gcs/raylet to get the updated configs. Signed-off-by: Simon Zehnder <simon.zehnder@gmail.com>
gRPC has a built-in plugin to export metrics via OpenCensus, as detailed in https://github.com/census-instrumentation/opencensus-specs/blob/master/stats/gRPC.md . We already use OpenCensus to collect Ray metrics, and here we add the gRPC-sourced metrics to the list.
Because of the potential performance impact, gRPC metrics collection is disabled by default and can be enabled per component via the config enable_grpc_metrics_collection_for. For now only gcs is supported; follow-up patches are planned to support raylet and core_worker. They are not supported right away because the plugin must be configured before any gRPC traffic, but the raylet and the core worker initialize their stats only after a gRPC call to the gcs/raylet to get the updated configs.
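As an illustration of the per-component switch, the sketch below shows one way a value of enable_grpc_metrics_collection_for could be interpreted. It assumes the value is a comma-separated list of component names and that unsupported names are rejected; neither detail is stated in this description. The helper names are hypothetical and are not part of Ray.

```python
# Per the description, only "gcs" is supported for now.
SUPPORTED_COMPONENTS = frozenset({"gcs"})


def parse_enabled_components(config_value):
    """Parse an assumed comma-separated component list into a frozenset.

    An empty string (the disabled-by-default case) yields an empty set.
    Rejecting unsupported names is a choice made for this sketch only.
    """
    names = {name.strip() for name in config_value.split(",") if name.strip()}
    unsupported = names - SUPPORTED_COMPONENTS
    if unsupported:
        raise ValueError(
            "gRPC metrics are not supported for: " + ", ".join(sorted(unsupported))
        )
    return frozenset(names)


def grpc_metrics_enabled(component, config_value=""):
    """Return True if `component` is listed in the config value."""
    return component in parse_enabled_components(config_value)
```

With the default empty value, `grpc_metrics_enabled("gcs")` is False; with the value `"gcs"` it is True. Listing `raylet` raises a `ValueError` in this sketch, which reflects that raylet and core_worker support is planned for later patches.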