PR #104394 - ui: add Networking metrics dashboard #17252

Closed
cockroach-teamcity opened this issue Jun 14, 2023 · 5 comments

Comments

cockroach-teamcity commented Jun 14, 2023

Exalate commented:

Related PR: cockroachdb/cockroach#104394
Commit: cockroachdb/cockroach@0f9d338
Epic: none


Release note (ui change): the DB Console metrics dashboard now has a
"Networking" tab, and a few metrics previously displayed on the Hardware tab
have moved there.

Jira Issue: DOC-8075

florence-crl commented Sep 8, 2023

When the Networking dashboard is documented, update the troubleshooting page (cluster-setup-troubleshooting.md) as described in this comment in PR 17824.

Florence Morris (florence-crl) commented:
Hi Tobias Grieger, since you worked on PR 104394, would you be able to provide more description for these 3 graphs on the Networking tab? I have included their tooltips below.

h2. RPC Heartbeat Latency: 50th percentile

  • Round-trip latency for recent successful outgoing heartbeats.
  • Metric: {{cr.node.round-trip-latency-p50}}

h2. RPC Heartbeat Latency: 99th percentile

  • Round-trip latency for recent successful outgoing heartbeats.
  • Metric: {{cr.node.round-trip-latency-p99}}

h2. Unhealthy RPC Connections

  • The number of outgoing connections on each node that are in an unhealthy state.
  • Metric: {{cr.node.rpc.connection.unhealthy}} Gauge of current connections in an unhealthy state (not bidirectionally connected or heartbeating)

Questions:

  1. Does “successful outgoing heartbeats” refer to the RPC ping mentioned in the docs here?

{quote}From the client, we send an additional RPC ping every 1s to check that connections are alive.{quote}

  2. Does the Unhealthy RPC Connections metric include the “heartbeating” pings?

  3. What does “bidirectionally connected” refer to?

Thanks!

Florence Morris (florence-crl) commented:
Hi Andrii Vorobyov, I saw in Slack that Tobias is on sabbatical until July. As the reviewer on PR 104394, would you be able to answer the questions in my previous comment?

Andrii Vorobyov (koorosh) commented:
Florence Morris, here’s the extended comment that Tobias added regarding RPC Heartbeat Latency:

{noformat}Name: "round-trip-latency",
Help: `Distribution of round-trip latencies with other nodes.

This only reflects successful heartbeats and measures gRPC overhead as well as
possible head-of-line blocking. Elevated values in this metric may hint at
network issues and/or saturation, but they are no proof of them. CPU overload
can similarly elevate this metric. The operator should look towards OS-level
metrics such as packet loss, retransmits, etc, to conclusively diagnose network
issues. Heartbeats are not very frequent (~seconds), so they may not capture
rare or short-lived degradations.
`,{noformat}
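
To illustrate why the dashboard shows both a 50th and a 99th percentile graph, here is a minimal Python sketch with made-up heartbeat samples. The `percentile` helper and the sample values are hypothetical and use a simple nearest-rank method; this is not how CockroachDB computes `round-trip-latency-p50` or `round-trip-latency-p99` internally.

```python
import math
from typing import List


def percentile(samples: List[float], p: float) -> float:
    """Nearest-rank percentile: smallest sample with at least p% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]


# Hypothetical round-trip latencies in milliseconds for 100 successful heartbeats:
# 98 fast ones and 2 slow ones (for example, during a brief period of congestion).
latencies_ms = [1.0] * 98 + [50.0] * 2

p50 = percentile(latencies_ms, 50)
p99 = percentile(latencies_ms, 99)
print(f"p50={p50} ms, p99={p99} ms")
```

In this toy data set the median stays at the fast value while the 99th percentile picks up the slow heartbeats, which is why an elevated p99 with a flat p50 can hint at an intermittent problem. As the help text notes, such a hint still needs confirmation from OS-level metrics.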

{quote}1. Does “successful outgoing heartbeats” refer to the RPC ping mentioned in the docs here?{quote}

Correct.

{quote}2. Does the Unhealthy RPC Connections metric include the “heartbeating” pings?{quote}

Correct, it includes all outgoing requests including “heartbeating” pings.

{quote}3. What does “bidirectionally connected” refer to?{quote}

It refers to the process of establishing a connection between nodes.

“Bidirectionally connected” means that a connection is considered successful only if Node 1 sends a request to Node 2 and Node 2 dials back (sends a request back to Node 1). This ensures that communication is healthy in both directions.
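
As a rough illustration of the dial-back idea, the following Python sketch models reachability as a set of directed links and counts, per node, the outgoing connections that are not confirmed in both directions. All names here (`is_bidirectionally_connected`, `unhealthy_outgoing`, the `reachable` set) are hypothetical; this is a toy model, not CockroachDB's implementation of `rpc.connection.unhealthy`, which also accounts for heartbeating state.

```python
from typing import Iterable, Set, Tuple

Link = Tuple[int, int]  # (from_node, to_node): from_node can send requests to to_node


def is_bidirectionally_connected(a: int, b: int, reachable: Set[Link]) -> bool:
    """A connection counts as healthy only if a can reach b AND b can dial back to a."""
    return (a, b) in reachable and (b, a) in reachable


def unhealthy_outgoing(node: int, peers: Iterable[int], reachable: Set[Link]) -> int:
    """Count peers of `node` that are not bidirectionally connected to it."""
    return sum(
        1
        for peer in peers
        if peer != node and not is_bidirectionally_connected(node, peer, reachable)
    )


# Hypothetical 3-node cluster: nodes 1 and 2 can reach each other,
# node 1 can reach node 3, but node 3 cannot dial back to anyone.
nodes = [1, 2, 3]
reachable = {(1, 2), (2, 1), (1, 3)}

for n in nodes:
    print(f"node {n}: {unhealthy_outgoing(n, nodes, reachable)} unhealthy outgoing connection(s)")
```

In this model, node 1's connection to node 3 is counted as unhealthy even though requests from node 1 arrive at node 3, because the dial-back from node 3 to node 1 fails.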

Florence Morris (florence-crl) commented:
Hi Andrii Vorobyov, would you be able to tech review my docs PR #18075? I was not able to add you as a reviewer in GitHub. Thanks!
