Add Prometheus gRPC metrics for hubble and hubble-relay #20376
Awesome work. I'm not familiar with interceptors or the Prometheus gRPC metrics package, so I mostly focused on the existing Hubble logic. There are a few minor things that I think need to be addressed.
67848d9 to 02baed5
@gandro Thanks for the review, I've addressed your comments. LMK what you think.
02baed5 to 6ef7e6b
4c2619e to e84ad3d
Awesome, this looks great. Thanks a lot!
/test
Job 'Cilium-PR-K8s-GKE' failed.
If it is a flake and a GitHub issue doesn't already exist to track it, comment
Very comprehensive PR 💯. I have only two minor comments, as per below.
Sample metrics list for other reviewers:
# HELP grpc_server_handled_total Total number of RPCs completed on the server, regardless of success or failure.
# TYPE grpc_server_handled_total counter
grpc_server_handled_total{grpc_code="Aborted",grpc_method="Check",grpc_service="grpc.health.v1.Health",grpc_type="unary"} 0
grpc_server_handled_total{grpc_code="Aborted",grpc_method="GetAgentEvents",grpc_service="observer.Observer",grpc_type="server_stream"} 0
grpc_server_handled_total{grpc_code="Aborted",grpc_method="GetDebugEvents",grpc_service="observer.Observer",grpc_type="server_stream"} 0
grpc_server_handled_total{grpc_code="Aborted",grpc_method="GetFlows",grpc_service="observer.Observer",grpc_type="server_stream"} 0
grpc_server_handled_total{grpc_code="Aborted",grpc_method="GetNodes",grpc_service="observer.Observer",grpc_type="unary"} 0
grpc_server_handled_total{grpc_code="Aborted",grpc_method="ServerReflectionInfo",grpc_service="grpc.reflection.v1alpha.ServerReflection",grpc_type="bidi_stream"} 0
grpc_server_handled_total{grpc_code="Aborted",grpc_method="ServerStatus",grpc_service="observer.Observer",grpc_type="unary"} 0
grpc_server_handled_total{grpc_code="Aborted",grpc_method="Watch",grpc_service="grpc.health.v1.Health",grpc_type="server_stream"} 0
grpc_server_handled_total{grpc_code="AlreadyExists",grpc_method="Check",grpc_service="grpc.health.v1.Health",grpc_type="unary"} 0
grpc_server_handled_total{grpc_code="AlreadyExists",grpc_method="GetAgentEvents",grpc_service="observer.Observer",grpc_type="server_stream"} 0
grpc_server_handled_total{grpc_code="AlreadyExists",grpc_method="GetDebugEvents",grpc_service="observer.Observer",grpc_type="server_stream"} 0
grpc_server_handled_total{grpc_code="AlreadyExists",grpc_method="GetFlows",grpc_service="observer.Observer",grpc_type="server_stream"} 0
grpc_server_handled_total{grpc_code="AlreadyExists",grpc_method="GetNodes",grpc_service="observer.Observer",grpc_type="unary"} 0
grpc_server_handled_total{grpc_code="AlreadyExists",grpc_method="ServerReflectionInfo",grpc_service="grpc.reflection.v1alpha.ServerReflection",grpc_type="bidi_stream"} 0
grpc_server_handled_total{grpc_code="AlreadyExists",grpc_method="ServerStatus",grpc_service="observer.Observer",grpc_type="unary"} 0
grpc_server_handled_total{grpc_code="AlreadyExists",grpc_method="Watch",grpc_service="grpc.health.v1.Health",grpc_type="server_stream"} 0
grpc_server_handled_total{grpc_code="Canceled",grpc_method="Check",grpc_service="grpc.health.v1.Health",grpc_type="unary"} 0
grpc_server_handled_total{grpc_code="Canceled",grpc_method="GetAgentEvents",grpc_service="observer.Observer",grpc_type="server_stream"} 0
grpc_server_handled_total{grpc_code="Canceled",grpc_method="GetDebugEvents",grpc_service="observer.Observer",grpc_type="server_stream"} 0
grpc_server_handled_total{grpc_code="Canceled",grpc_method="GetFlows",grpc_service="observer.Observer",grpc_type="server_stream"} 0
grpc_server_handled_total{grpc_code="Canceled",grpc_method="GetNodes",grpc_service="observer.Observer",grpc_type="unary"} 0
grpc_server_handled_total{grpc_code="Canceled",grpc_method="ServerReflectionInfo",grpc_service="grpc.reflection.v1alpha.ServerReflection",grpc_type="bidi_stream"} 0
grpc_server_handled_total{grpc_code="Canceled",grpc_method="ServerStatus",grpc_service="observer.Observer",grpc_type="unary"} 0
grpc_server_handled_total{grpc_code="Canceled",grpc_method="Watch",grpc_service="grpc.health.v1.Health",grpc_type="server_stream"} 0
grpc_server_handled_total{grpc_code="DataLoss",grpc_method="Check",grpc_service="grpc.health.v1.Health",grpc_type="unary"} 0
grpc_server_handled_total{grpc_code="DataLoss",grpc_method="GetAgentEvents",grpc_service="observer.Observer",grpc_type="server_stream"} 0
grpc_server_handled_total{grpc_code="DataLoss",grpc_method="GetDebugEvents",grpc_service="observer.Observer",grpc_type="server_stream"} 0
grpc_server_handled_total{grpc_code="DataLoss",grpc_method="GetFlows",grpc_service="observer.Observer",grpc_type="server_stream"} 0
grpc_server_handled_total{grpc_code="DataLoss",grpc_method="GetNodes",grpc_service="observer.Observer",grpc_type="unary"} 0
grpc_server_handled_total{grpc_code="DataLoss",grpc_method="ServerReflectionInfo",grpc_service="grpc.reflection.v1alpha.ServerReflection",grpc_type="bidi_stream"} 0
grpc_server_handled_total{grpc_code="DataLoss",grpc_method="ServerStatus",grpc_service="observer.Observer",grpc_type="unary"} 0
grpc_server_handled_total{grpc_code="DataLoss",grpc_method="Watch",grpc_service="grpc.health.v1.Health",grpc_type="server_stream"} 0
grpc_server_handled_total{grpc_code="DeadlineExceeded",grpc_method="Check",grpc_service="grpc.health.v1.Health",grpc_type="unary"} 0
grpc_server_handled_total{grpc_code="DeadlineExceeded",grpc_method="GetAgentEvents",grpc_service="observer.Observer",grpc_type="server_stream"} 0
grpc_server_handled_total{grpc_code="DeadlineExceeded",grpc_method="GetDebugEvents",grpc_service="observer.Observer",grpc_type="server_stream"} 0
grpc_server_handled_total{grpc_code="DeadlineExceeded",grpc_method="GetFlows",grpc_service="observer.Observer",grpc_type="server_stream"} 0
grpc_server_handled_total{grpc_code="DeadlineExceeded",grpc_method="GetNodes",grpc_service="observer.Observer",grpc_type="unary"} 0
grpc_server_handled_total{grpc_code="DeadlineExceeded",grpc_method="ServerReflectionInfo",grpc_service="grpc.reflection.v1alpha.ServerReflection",grpc_type="bidi_stream"} 0
grpc_server_handled_total{grpc_code="DeadlineExceeded",grpc_method="ServerStatus",grpc_service="observer.Observer",grpc_type="unary"} 0
grpc_server_handled_total{grpc_code="DeadlineExceeded",grpc_method="Watch",grpc_service="grpc.health.v1.Health",grpc_type="server_stream"} 0
grpc_server_handled_total{grpc_code="FailedPrecondition",grpc_method="Check",grpc_service="grpc.health.v1.Health",grpc_type="unary"} 0
grpc_server_handled_total{grpc_code="FailedPrecondition",grpc_method="GetAgentEvents",grpc_service="observer.Observer",grpc_type="server_stream"} 0
grpc_server_handled_total{grpc_code="FailedPrecondition",grpc_method="GetDebugEvents",grpc_service="observer.Observer",grpc_type="server_stream"} 0
grpc_server_handled_total{grpc_code="FailedPrecondition",grpc_method="GetFlows",grpc_service="observer.Observer",grpc_type="server_stream"} 0
grpc_server_handled_total{grpc_code="FailedPrecondition",grpc_method="GetNodes",grpc_service="observer.Observer",grpc_type="unary"} 0
grpc_server_handled_total{grpc_code="FailedPrecondition",grpc_method="ServerReflectionInfo",grpc_service="grpc.reflection.v1alpha.ServerReflection",grpc_type="bidi_stream"} 0
grpc_server_handled_total{grpc_code="FailedPrecondition",grpc_method="ServerStatus",grpc_service="observer.Observer",grpc_type="unary"} 0
grpc_server_handled_total{grpc_code="FailedPrecondition",grpc_method="Watch",grpc_service="grpc.health.v1.Health",grpc_type="server_stream"} 0
grpc_server_handled_total{grpc_code="Internal",grpc_method="Check",grpc_service="grpc.health.v1.Health",grpc_type="unary"} 0
grpc_server_handled_total{grpc_code="Internal",grpc_method="GetAgentEvents",grpc_service="observer.Observer",grpc_type="server_stream"} 0
grpc_server_handled_total{grpc_code="Internal",grpc_method="GetDebugEvents",grpc_service="observer.Observer",grpc_type="server_stream"} 0
grpc_server_handled_total{grpc_code="Internal",grpc_method="GetFlows",grpc_service="observer.Observer",grpc_type="server_stream"} 0
grpc_server_handled_total{grpc_code="Internal",grpc_method="GetNodes",grpc_service="observer.Observer",grpc_type="unary"} 0
grpc_server_handled_total{grpc_code="Internal",grpc_method="ServerReflectionInfo",grpc_service="grpc.reflection.v1alpha.ServerReflection",grpc_type="bidi_stream"} 0
grpc_server_handled_total{grpc_code="Internal",grpc_method="ServerStatus",grpc_service="observer.Observer",grpc_type="unary"} 0
grpc_server_handled_total{grpc_code="Internal",grpc_method="Watch",grpc_service="grpc.health.v1.Health",grpc_type="server_stream"} 0
grpc_server_handled_total{grpc_code="InvalidArgument",grpc_method="Check",grpc_service="grpc.health.v1.Health",grpc_type="unary"} 0
grpc_server_handled_total{grpc_code="InvalidArgument",grpc_method="GetAgentEvents",grpc_service="observer.Observer",grpc_type="server_stream"} 0
grpc_server_handled_total{grpc_code="InvalidArgument",grpc_method="GetDebugEvents",grpc_service="observer.Observer",grpc_type="server_stream"} 0
grpc_server_handled_total{grpc_code="InvalidArgument",grpc_method="GetFlows",grpc_service="observer.Observer",grpc_type="server_stream"} 0
grpc_server_handled_total{grpc_code="InvalidArgument",grpc_method="GetNodes",grpc_service="observer.Observer",grpc_type="unary"} 0
grpc_server_handled_total{grpc_code="InvalidArgument",grpc_method="ServerReflectionInfo",grpc_service="grpc.reflection.v1alpha.ServerReflection",grpc_type="bidi_stream"} 0
grpc_server_handled_total{grpc_code="InvalidArgument",grpc_method="ServerStatus",grpc_service="observer.Observer",grpc_type="unary"} 0
grpc_server_handled_total{grpc_code="InvalidArgument",grpc_method="Watch",grpc_service="grpc.health.v1.Health",grpc_type="server_stream"} 0
grpc_server_handled_total{grpc_code="NotFound",grpc_method="Check",grpc_service="grpc.health.v1.Health",grpc_type="unary"} 0
grpc_server_handled_total{grpc_code="NotFound",grpc_method="GetAgentEvents",grpc_service="observer.Observer",grpc_type="server_stream"} 0
grpc_server_handled_total{grpc_code="NotFound",grpc_method="GetDebugEvents",grpc_service="observer.Observer",grpc_type="server_stream"} 0
grpc_server_handled_total{grpc_code="NotFound",grpc_method="GetFlows",grpc_service="observer.Observer",grpc_type="server_stream"} 0
grpc_server_handled_total{grpc_code="NotFound",grpc_method="GetNodes",grpc_service="observer.Observer",grpc_type="unary"} 0
grpc_server_handled_total{grpc_code="NotFound",grpc_method="ServerReflectionInfo",grpc_service="grpc.reflection.v1alpha.ServerReflection",grpc_type="bidi_stream"} 0
grpc_server_handled_total{grpc_code="NotFound",grpc_method="ServerStatus",grpc_service="observer.Observer",grpc_type="unary"} 0
grpc_server_handled_total{grpc_code="NotFound",grpc_method="Watch",grpc_service="grpc.health.v1.Health",grpc_type="server_stream"} 0
grpc_server_handled_total{grpc_code="OK",grpc_method="Check",grpc_service="grpc.health.v1.Health",grpc_type="unary"} 0
grpc_server_handled_total{grpc_code="OK",grpc_method="GetAgentEvents",grpc_service="observer.Observer",grpc_type="server_stream"} 0
grpc_server_handled_total{grpc_code="OK",grpc_method="GetDebugEvents",grpc_service="observer.Observer",grpc_type="server_stream"} 0
grpc_server_handled_total{grpc_code="OK",grpc_method="GetFlows",grpc_service="observer.Observer",grpc_type="server_stream"} 0
grpc_server_handled_total{grpc_code="OK",grpc_method="GetNodes",grpc_service="observer.Observer",grpc_type="unary"} 0
grpc_server_handled_total{grpc_code="OK",grpc_method="ServerReflectionInfo",grpc_service="grpc.reflection.v1alpha.ServerReflection",grpc_type="bidi_stream"} 0
grpc_server_handled_total{grpc_code="OK",grpc_method="ServerStatus",grpc_service="observer.Observer",grpc_type="unary"} 0
grpc_server_handled_total{grpc_code="OK",grpc_method="Watch",grpc_service="grpc.health.v1.Health",grpc_type="server_stream"} 0
grpc_server_handled_total{grpc_code="OutOfRange",grpc_method="Check",grpc_service="grpc.health.v1.Health",grpc_type="unary"} 0
grpc_server_handled_total{grpc_code="OutOfRange",grpc_method="GetAgentEvents",grpc_service="observer.Observer",grpc_type="server_stream"} 0
grpc_server_handled_total{grpc_code="OutOfRange",grpc_method="GetDebugEvents",grpc_service="observer.Observer",grpc_type="server_stream"} 0
grpc_server_handled_total{grpc_code="OutOfRange",grpc_method="GetFlows",grpc_service="observer.Observer",grpc_type="server_stream"} 0
grpc_server_handled_total{grpc_code="OutOfRange",grpc_method="GetNodes",grpc_service="observer.Observer",grpc_type="unary"} 0
grpc_server_handled_total{grpc_code="OutOfRange",grpc_method="ServerReflectionInfo",grpc_service="grpc.reflection.v1alpha.ServerReflection",grpc_type="bidi_stream"} 0
grpc_server_handled_total{grpc_code="OutOfRange",grpc_method="ServerStatus",grpc_service="observer.Observer",grpc_type="unary"} 0
grpc_server_handled_total{grpc_code="OutOfRange",grpc_method="Watch",grpc_service="grpc.health.v1.Health",grpc_type="server_stream"} 0
grpc_server_handled_total{grpc_code="PermissionDenied",grpc_method="Check",grpc_service="grpc.health.v1.Health",grpc_type="unary"} 0
grpc_server_handled_total{grpc_code="PermissionDenied",grpc_method="GetAgentEvents",grpc_service="observer.Observer",grpc_type="server_stream"} 0
grpc_server_handled_total{grpc_code="PermissionDenied",grpc_method="GetDebugEvents",grpc_service="observer.Observer",grpc_type="server_stream"} 0
grpc_server_handled_total{grpc_code="PermissionDenied",grpc_method="GetFlows",grpc_service="observer.Observer",grpc_type="server_stream"} 0
grpc_server_handled_total{grpc_code="PermissionDenied",grpc_method="GetNodes",grpc_service="observer.Observer",grpc_type="unary"} 0
grpc_server_handled_total{grpc_code="PermissionDenied",grpc_method="ServerReflectionInfo",grpc_service="grpc.reflection.v1alpha.ServerReflection",grpc_type="bidi_stream"} 0
grpc_server_handled_total{grpc_code="PermissionDenied",grpc_method="ServerStatus",grpc_service="observer.Observer",grpc_type="unary"} 0
grpc_server_handled_total{grpc_code="PermissionDenied",grpc_method="Watch",grpc_service="grpc.health.v1.Health",grpc_type="server_stream"} 0
grpc_server_handled_total{grpc_code="ResourceExhausted",grpc_method="Check",grpc_service="grpc.health.v1.Health",grpc_type="unary"} 0
grpc_server_handled_total{grpc_code="ResourceExhausted",grpc_method="GetAgentEvents",grpc_service="observer.Observer",grpc_type="server_stream"} 0
grpc_server_handled_total{grpc_code="ResourceExhausted",grpc_method="GetDebugEvents",grpc_service="observer.Observer",grpc_type="server_stream"} 0
grpc_server_handled_total{grpc_code="ResourceExhausted",grpc_method="GetFlows",grpc_service="observer.Observer",grpc_type="server_stream"} 0
grpc_server_handled_total{grpc_code="ResourceExhausted",grpc_method="GetNodes",grpc_service="observer.Observer",grpc_type="unary"} 0
grpc_server_handled_total{grpc_code="ResourceExhausted",grpc_method="ServerReflectionInfo",grpc_service="grpc.reflection.v1alpha.ServerReflection",grpc_type="bidi_stream"} 0
grpc_server_handled_total{grpc_code="ResourceExhausted",grpc_method="ServerStatus",grpc_service="observer.Observer",grpc_type="unary"} 0
grpc_server_handled_total{grpc_code="ResourceExhausted",grpc_method="Watch",grpc_service="grpc.health.v1.Health",grpc_type="server_stream"} 0
grpc_server_handled_total{grpc_code="Unauthenticated",grpc_method="Check",grpc_service="grpc.health.v1.Health",grpc_type="unary"} 0
grpc_server_handled_total{grpc_code="Unauthenticated",grpc_method="GetAgentEvents",grpc_service="observer.Observer",grpc_type="server_stream"} 0
grpc_server_handled_total{grpc_code="Unauthenticated",grpc_method="GetDebugEvents",grpc_service="observer.Observer",grpc_type="server_stream"} 0
grpc_server_handled_total{grpc_code="Unauthenticated",grpc_method="GetFlows",grpc_service="observer.Observer",grpc_type="server_stream"} 0
grpc_server_handled_total{grpc_code="Unauthenticated",grpc_method="GetNodes",grpc_service="observer.Observer",grpc_type="unary"} 0
grpc_server_handled_total{grpc_code="Unauthenticated",grpc_method="ServerReflectionInfo",grpc_service="grpc.reflection.v1alpha.ServerReflection",grpc_type="bidi_stream"} 0
grpc_server_handled_total{grpc_code="Unauthenticated",grpc_method="ServerStatus",grpc_service="observer.Observer",grpc_type="unary"} 0
grpc_server_handled_total{grpc_code="Unauthenticated",grpc_method="Watch",grpc_service="grpc.health.v1.Health",grpc_type="server_stream"} 0
grpc_server_handled_total{grpc_code="Unavailable",grpc_method="Check",grpc_service="grpc.health.v1.Health",grpc_type="unary"} 0
grpc_server_handled_total{grpc_code="Unavailable",grpc_method="GetAgentEvents",grpc_service="observer.Observer",grpc_type="server_stream"} 0
grpc_server_handled_total{grpc_code="Unavailable",grpc_method="GetDebugEvents",grpc_service="observer.Observer",grpc_type="server_stream"} 0
grpc_server_handled_total{grpc_code="Unavailable",grpc_method="GetFlows",grpc_service="observer.Observer",grpc_type="server_stream"} 0
grpc_server_handled_total{grpc_code="Unavailable",grpc_method="GetNodes",grpc_service="observer.Observer",grpc_type="unary"} 0
grpc_server_handled_total{grpc_code="Unavailable",grpc_method="ServerReflectionInfo",grpc_service="grpc.reflection.v1alpha.ServerReflection",grpc_type="bidi_stream"} 0
grpc_server_handled_total{grpc_code="Unavailable",grpc_method="ServerStatus",grpc_service="observer.Observer",grpc_type="unary"} 0
grpc_server_handled_total{grpc_code="Unavailable",grpc_method="Watch",grpc_service="grpc.health.v1.Health",grpc_type="server_stream"} 0
grpc_server_handled_total{grpc_code="Unimplemented",grpc_method="Check",grpc_service="grpc.health.v1.Health",grpc_type="unary"} 0
grpc_server_handled_total{grpc_code="Unimplemented",grpc_method="GetAgentEvents",grpc_service="observer.Observer",grpc_type="server_stream"} 0
grpc_server_handled_total{grpc_code="Unimplemented",grpc_method="GetDebugEvents",grpc_service="observer.Observer",grpc_type="server_stream"} 0
grpc_server_handled_total{grpc_code="Unimplemented",grpc_method="GetFlows",grpc_service="observer.Observer",grpc_type="server_stream"} 0
grpc_server_handled_total{grpc_code="Unimplemented",grpc_method="GetNodes",grpc_service="observer.Observer",grpc_type="unary"} 0
grpc_server_handled_total{grpc_code="Unimplemented",grpc_method="ServerReflectionInfo",grpc_service="grpc.reflection.v1alpha.ServerReflection",grpc_type="bidi_stream"} 0
grpc_server_handled_total{grpc_code="Unimplemented",grpc_method="ServerStatus",grpc_service="observer.Observer",grpc_type="unary"} 0
grpc_server_handled_total{grpc_code="Unimplemented",grpc_method="Watch",grpc_service="grpc.health.v1.Health",grpc_type="server_stream"} 0
grpc_server_handled_total{grpc_code="Unknown",grpc_method="Check",grpc_service="grpc.health.v1.Health",grpc_type="unary"} 0
grpc_server_handled_total{grpc_code="Unknown",grpc_method="GetAgentEvents",grpc_service="observer.Observer",grpc_type="server_stream"} 0
grpc_server_handled_total{grpc_code="Unknown",grpc_method="GetDebugEvents",grpc_service="observer.Observer",grpc_type="server_stream"} 0
grpc_server_handled_total{grpc_code="Unknown",grpc_method="GetFlows",grpc_service="observer.Observer",grpc_type="server_stream"} 0
grpc_server_handled_total{grpc_code="Unknown",grpc_method="GetNodes",grpc_service="observer.Observer",grpc_type="unary"} 0
grpc_server_handled_total{grpc_code="Unknown",grpc_method="ServerReflectionInfo",grpc_service="grpc.reflection.v1alpha.ServerReflection",grpc_type="bidi_stream"} 0
grpc_server_handled_total{grpc_code="Unknown",grpc_method="ServerStatus",grpc_service="observer.Observer",grpc_type="unary"} 0
grpc_server_handled_total{grpc_code="Unknown",grpc_method="Watch",grpc_service="grpc.health.v1.Health",grpc_type="server_stream"} 0
# HELP grpc_server_msg_received_total Total number of RPC stream messages received on the server.
# TYPE grpc_server_msg_received_total counter
grpc_server_msg_received_total{grpc_method="Check",grpc_service="grpc.health.v1.Health",grpc_type="unary"} 0
grpc_server_msg_received_total{grpc_method="GetAgentEvents",grpc_service="observer.Observer",grpc_type="server_stream"} 0
grpc_server_msg_received_total{grpc_method="GetDebugEvents",grpc_service="observer.Observer",grpc_type="server_stream"} 0
grpc_server_msg_received_total{grpc_method="GetFlows",grpc_service="observer.Observer",grpc_type="server_stream"} 0
grpc_server_msg_received_total{grpc_method="GetNodes",grpc_service="observer.Observer",grpc_type="unary"} 0
grpc_server_msg_received_total{grpc_method="ServerReflectionInfo",grpc_service="grpc.reflection.v1alpha.ServerReflection",grpc_type="bidi_stream"} 0
grpc_server_msg_received_total{grpc_method="ServerStatus",grpc_service="observer.Observer",grpc_type="unary"} 0
grpc_server_msg_received_total{grpc_method="Watch",grpc_service="grpc.health.v1.Health",grpc_type="server_stream"} 0
# HELP grpc_server_msg_sent_total Total number of gRPC stream messages sent by the server.
# TYPE grpc_server_msg_sent_total counter
grpc_server_msg_sent_total{grpc_method="Check",grpc_service="grpc.health.v1.Health",grpc_type="unary"} 0
grpc_server_msg_sent_total{grpc_method="GetAgentEvents",grpc_service="observer.Observer",grpc_type="server_stream"} 0
grpc_server_msg_sent_total{grpc_method="GetDebugEvents",grpc_service="observer.Observer",grpc_type="server_stream"} 0
grpc_server_msg_sent_total{grpc_method="GetFlows",grpc_service="observer.Observer",grpc_type="server_stream"} 0
grpc_server_msg_sent_total{grpc_method="GetNodes",grpc_service="observer.Observer",grpc_type="unary"} 0
grpc_server_msg_sent_total{grpc_method="ServerReflectionInfo",grpc_service="grpc.reflection.v1alpha.ServerReflection",grpc_type="bidi_stream"} 0
grpc_server_msg_sent_total{grpc_method="ServerStatus",grpc_service="observer.Observer",grpc_type="unary"} 0
grpc_server_msg_sent_total{grpc_method="Watch",grpc_service="grpc.health.v1.Health",grpc_type="server_stream"} 0
# HELP grpc_server_started_total Total number of RPCs started on the server.
# TYPE grpc_server_started_total counter
grpc_server_started_total{grpc_method="Check",grpc_service="grpc.health.v1.Health",grpc_type="unary"} 0
grpc_server_started_total{grpc_method="GetAgentEvents",grpc_service="observer.Observer",grpc_type="server_stream"} 0
grpc_server_started_total{grpc_method="GetDebugEvents",grpc_service="observer.Observer",grpc_type="server_stream"} 0
grpc_server_started_total{grpc_method="GetFlows",grpc_service="observer.Observer",grpc_type="server_stream"} 0
grpc_server_started_total{grpc_method="GetNodes",grpc_service="observer.Observer",grpc_type="unary"} 0
grpc_server_started_total{grpc_method="ServerReflectionInfo",grpc_service="grpc.reflection.v1alpha.ServerReflection",grpc_type="bidi_stream"} 0
grpc_server_started_total{grpc_method="ServerStatus",grpc_service="observer.Observer",grpc_type="unary"} 0
grpc_server_started_total{grpc_method="Watch",grpc_service="grpc.health.v1.Health",grpc_type="server_stream"} 0
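A listing like the sample above can be summarized per status code to check whether the API is returning errors. The following is a minimal sketch using only the Go standard library. It assumes the plain-text exposition format shown in the sample, with no timestamps after the value. The function names `handledByCode` and `nonOKRatio` are hypothetical, and the numbers in `main` are made up for illustration rather than taken from this PR.

```go
package main

import (
	"bufio"
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

// codeLabel extracts the value of the grpc_code label from a sample line.
var codeLabel = regexp.MustCompile(`grpc_code="([^"]*)"`)

// handledByCode sums grpc_server_handled_total samples per grpc_code label.
// Assumption: each sample line ends with its value (no trailing timestamp).
func handledByCode(exposition string) (map[string]float64, error) {
	totals := make(map[string]float64)
	sc := bufio.NewScanner(strings.NewReader(exposition))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if !strings.HasPrefix(line, "grpc_server_handled_total{") {
			continue // skips "# HELP", "# TYPE" and other metric families
		}
		m := codeLabel.FindStringSubmatch(line)
		if m == nil {
			continue
		}
		fields := strings.Fields(line)
		v, err := strconv.ParseFloat(fields[len(fields)-1], 64)
		if err != nil {
			return nil, fmt.Errorf("parsing %q: %w", line, err)
		}
		totals[m[1]] += v
	}
	return totals, sc.Err()
}

// nonOKRatio returns the share of completed RPCs whose code is not "OK".
func nonOKRatio(totals map[string]float64) float64 {
	var all, bad float64
	for code, v := range totals {
		all += v
		if code != "OK" {
			bad += v
		}
	}
	if all == 0 {
		return 0
	}
	return bad / all
}

func main() {
	// Made-up values for illustration only.
	sample := `# TYPE grpc_server_handled_total counter
grpc_server_handled_total{grpc_code="OK",grpc_method="GetFlows",grpc_service="observer.Observer",grpc_type="server_stream"} 8
grpc_server_handled_total{grpc_code="OK",grpc_method="ServerStatus",grpc_service="observer.Observer",grpc_type="unary"} 1
grpc_server_handled_total{grpc_code="Unavailable",grpc_method="GetFlows",grpc_service="observer.Observer",grpc_type="server_stream"} 1
`
	totals, err := handledByCode(sample)
	if err != nil {
		panic(err)
	}
	fmt.Printf("non-OK ratio: %.2f\n", nonOKRatio(totals))
}
```

In a real deployment this kind of aggregation would more likely be done with a query in Prometheus itself; the sketch only illustrates how the `grpc_code` label separates successful from failed calls.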
install/kubernetes/cilium/templates/hubble-relay/metrics_service.yaml
e84ad3d to 9fc526b
Excellent 🚀
I left a non-blocking suggestion.
In Go, Serve() blocks, and it's the caller's responsibility to decide whether it should run in a goroutine. Signed-off-by: Chance Zibolski <chance.zibolski@gmail.com>
Signed-off-by: Chance Zibolski <chance.zibolski@gmail.com>
Signed-off-by: Chance Zibolski <chance.zibolski@gmail.com>
Too much initialization was occurring in the Serve() function which should generally only deal with listening on the socket/port and starting the server. Signed-off-by: Chance Zibolski <chance.zibolski@gmail.com>
Signed-off-by: Chance Zibolski <chance.zibolski@gmail.com>
Signed-off-by: Chance Zibolski <chance.zibolski@gmail.com>
Signed-off-by: Chance Zibolski <chance.zibolski@gmail.com>
The cancel() looks potentially unused in the local observer code here, so added a comment indicating how it's used and why the cancellation is there at all. Signed-off-by: Chance Zibolski <chance.zibolski@gmail.com>
This is based off the same approach we take within observer.NewLocalServer, which allows other packages to extend the default options of Hubble relay the same way packages can extend the options of the Hubble observer server. Signed-off-by: Chance Zibolski <chance.zibolski@gmail.com>
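The commit notes above say that Serve() should block and only deal with listening and starting the server, leaving the choice of goroutine to the caller. The following is a minimal sketch of that convention using the standard library's `net/http` server as a stand-in for a gRPC server. The `server` type, `newServer`, and `runAndStop` are hypothetical names for illustration and are not the Hubble code.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net"
	"net/http"
	"time"
)

// server is a hypothetical wrapper; all initialization happens in newServer,
// so Serve only has to accept connections on the listener it is given.
type server struct {
	http *http.Server
}

func newServer() *server {
	mux := http.NewServeMux()
	mux.HandleFunc("/healthz", func(w http.ResponseWriter, _ *http.Request) {
		fmt.Fprintln(w, "ok")
	})
	return &server{http: &http.Server{Handler: mux}}
}

// Serve blocks until the server is stopped. It does not start a goroutine
// itself; the caller decides how to run it.
func (s *server) Serve(l net.Listener) error {
	err := s.http.Serve(l)
	if errors.Is(err, http.ErrServerClosed) {
		return nil // a clean shutdown is not an error for the caller
	}
	return err
}

// Stop shuts the server down gracefully, which makes Serve return.
func (s *server) Stop(ctx context.Context) error {
	return s.http.Shutdown(ctx)
}

// runAndStop shows the caller's side: run Serve in a goroutine, stop the
// server, then collect Serve's return value.
func runAndStop() error {
	l, err := net.Listen("tcp", "127.0.0.1:0") // port 0 picks a free port
	if err != nil {
		return err
	}
	s := newServer()
	errCh := make(chan error, 1)
	go func() { errCh <- s.Serve(l) }()

	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()
	if err := s.Stop(ctx); err != nil {
		return err
	}
	return <-errCh
}

func main() {
	if err := runAndStop(); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("server stopped cleanly")
}
```

Keeping Serve blocking means a caller that wants to run it in the foreground can simply call it directly, while a caller that manages several servers can run each in its own goroutine and collect the errors.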
9fc526b to f99d8d7
Even though it's late in the 1.12 cycle, more metrics typically help with production operations and the risk here seems low. I didn't review the code in depth since it looked like others had already provided sufficient attention.
/test
Job 'Cilium-PR-K8s-GKE' failed.
If it is a flake and a GitHub issue doesn't already exist to track it, comment
/mlh new-flake Cilium-PR-K8s-GKE 👍 created #20445
/test
/ci-l4lb
/ci-aks
@joestringer I believe the L4LB, AKS and GKE tests were disabled due to flakes, so at this point I think this is ready.
Travis didn't kick off on this PR for some reason, but I checked out the PR locally & ran
I wasn't sure whether to split this into multiple PRs (one for hubble, one for relay, and maybe more to separate adding interceptors from the metrics middleware). I kept it as one because splitting would mean adding the same dependency in each PR and resolving conflicts later, and the code is quite similar for the two subsystems. Let me know if you would prefer that this be split up.
This PR:
These metrics can help determine whether errors are occurring in the Hubble APIs and diagnose the potential performance impact of queries made against Hubble.
Additionally, by making gRPC interceptors configurable, we can further enhance our gRPC servers with additional middleware (e.g., logging, tracing, auth) in the future.
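To illustrate how configurable interceptors let middleware such as metrics be layered onto a server, here is a minimal sketch using only the standard library. `Handler`, `Interceptor`, `Option`, `WithInterceptors`, `Server`, and `callStats` are simplified, hypothetical stand-ins: they mirror the general shape of a unary interceptor chain and of functional options, but they are not the grpc-go types or the code in this PR.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
)

// Handler and Interceptor are simplified stand-ins for a unary RPC handler
// and a unary server interceptor (assumption: string requests/responses).
type Handler func(ctx context.Context, req string) (string, error)
type Interceptor func(ctx context.Context, method, req string, next Handler) (string, error)

// Option lets other packages extend the server's defaults.
type Option func(*Server)

// WithInterceptors appends interceptors; the first one added runs outermost.
func WithInterceptors(ics ...Interceptor) Option {
	return func(s *Server) { s.interceptors = append(s.interceptors, ics...) }
}

type Server struct{ interceptors []Interceptor }

func NewServer(opts ...Option) *Server {
	s := &Server{}
	for _, opt := range opts {
		opt(s)
	}
	return s
}

// Call wraps the handler in the configured interceptors and invokes it.
func (s *Server) Call(ctx context.Context, method, req string, h Handler) (string, error) {
	chained := h
	for i := len(s.interceptors) - 1; i >= 0; i-- {
		ic, next := s.interceptors[i], chained
		chained = func(ctx context.Context, req string) (string, error) {
			return ic(ctx, method, req, next)
		}
	}
	return chained(ctx, req)
}

// callStats is a toy metrics middleware: it counts started and failed calls.
type callStats struct {
	mu      sync.Mutex
	started int
	failed  int
}

func (c *callStats) Interceptor() Interceptor {
	return func(ctx context.Context, method, req string, next Handler) (string, error) {
		c.mu.Lock()
		c.started++
		c.mu.Unlock()
		resp, err := next(ctx, req)
		if err != nil {
			c.mu.Lock()
			c.failed++
			c.mu.Unlock()
		}
		return resp, err
	}
}

// demo makes one successful and one failing call through the chain.
func demo() (started, failed int) {
	stats := &callStats{}
	srv := NewServer(WithInterceptors(stats.Interceptor()))
	ok := func(_ context.Context, req string) (string, error) { return "echo:" + req, nil }
	bad := func(context.Context, string) (string, error) { return "", errors.New("unavailable") }
	srv.Call(context.Background(), "GetFlows", "a", ok)
	srv.Call(context.Background(), "GetFlows", "b", bad)
	return stats.started, stats.failed
}

func main() {
	started, failed := demo()
	fmt.Printf("started=%d failed=%d\n", started, failed) // prints "started=2 failed=1"
}
```

Because the handlers never need to know about the counters, further middleware (logging, tracing, auth) could be added by passing more interceptors to the same option.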