[docdb] Expose timeout metrics from the RPC layer #2196

bmatican · 2019-09-01T01:57:27Z

Jira Link: DB-1595
In the absence of explicit application-level metrics, timeouts at the TS layer are probably the best available indicator of user impact. If we can expose timeout counters from the top-level TS RPCs (read/write), and potentially even from the proxies (cql/ysql), we can use those as a high-level filter for system issues (back-pressure, network hiccups, etc.).

Furthermore, ensuring these counters stay at a flat 0 during rolling restarts would help validate that we provide zero-downtime operations!
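
To make the idea concrete, here is a minimal sketch of a per-RPC timeout counter plus the "stays flat" check described above. It is illustrative only and not the actual YugabyteDB metrics code: the names `rpc_timeouts`, `call_with_timeout_metric`, and `timeouts_stayed_flat` are hypothetical, and it assumes a timed-out handler surfaces as Python's built-in `TimeoutError`.

```python
from collections import Counter
from typing import Callable, TypeVar

T = TypeVar("T")

# Hypothetical registry: one timeout counter per RPC name (e.g. "Read", "Write").
rpc_timeouts: Counter = Counter()


def call_with_timeout_metric(rpc_name: str, handler: Callable[[], T]) -> T:
    """Run handler; if it times out, bump that RPC's counter and re-raise."""
    try:
        return handler()
    except TimeoutError:
        rpc_timeouts[rpc_name] += 1
        raise


def timeouts_stayed_flat(before: Counter, after: Counter) -> bool:
    """Return True if no RPC recorded a new timeout between two snapshots."""
    return all(after[name] == before[name] for name in after)
```

Under these assumptions, a rolling-restart test could snapshot the counters with `Counter(rpc_timeouts)` before and after the restart and assert that `timeouts_stayed_flat` returns `True`.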

bmatican added the kind/enhancement and area/docdb labels Sep 1, 2019

bmatican added this to the v2.1 milestone Sep 1, 2019

bmatican added this to To Do in YBase features via automation Sep 1, 2019

bmatican mentioned this issue Sep 1, 2019

[docdb] Improve experience during planned restarts #2198

Open

bmatican added this to To do in Usability via automation Oct 30, 2019

bmatican removed this from the v2.1 milestone Jun 8, 2020

yugabyte-ci added the priority/medium label Jun 8, 2022
