Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[YSQL] Optimize use of gettimeofday() for timing RPCs in Write Path #17864

Closed
1 task done
karthik-ramanathan-3006 opened this issue Jun 20, 2023 · 0 comments
Closed
1 task done
Assignees
Labels
area/ysql Yugabyte SQL (YSQL) kind/enhancement This is an enhancement of an existing feature priority/medium Medium priority issue

Comments

@karthik-ramanathan-3006
Copy link
Contributor

karthik-ramanathan-3006 commented Jun 20, 2023

Jira Link: DB-6950

Description

Background:
RPCs to DocDB in the read path tend to be synchronous ie. query execution blocks until the RPCs obtain a response.
Consequently, when the timing is disabled for Explain Analyze queries that perform reads, we can avoid timing the appropriate RPCs by passing down the query flags down to the execution layer.

Since RPCs that perform writes are asynchronous, there is a disconnect between the flags requested by the query (such as timing on/off), and the asynchronous execution of the RPCs. Today, to get accurate information about all RPCs, timing information is enabled by default for all RPCs in the write path, with no option to control it.

The goal of this ticket is to come up with a better approach for timing write RPCs and to have it disabled by default.

Warning: Please confirm that this issue does not contain any sensitive information

  • I confirm this issue does not contain any sensitive information.
@karthik-ramanathan-3006 karthik-ramanathan-3006 added area/ysql Yugabyte SQL (YSQL) status/awaiting-triage Issue awaiting triage labels Jun 20, 2023
@yugabyte-ci yugabyte-ci added kind/bug This issue is a bug priority/medium Medium priority issue labels Jun 20, 2023
@karthik-ramanathan-3006 karthik-ramanathan-3006 added kind/enhancement This is an enhancement of an existing feature and removed kind/bug This issue is a bug labels Jun 20, 2023
@yugabyte-ci yugabyte-ci added kind/bug This issue is a bug kind/enhancement This is an enhancement of an existing feature and removed kind/enhancement This is an enhancement of an existing feature status/awaiting-triage Issue awaiting triage kind/bug This issue is a bug labels Jun 20, 2023
karthik-ramanathan-3006 added a commit that referenced this issue Jul 27, 2023
…locks in Linux.

Summary:
This changes enables the usage of a high resolution (more granular) clock for timing RPCs requests in the asynchronous write path (pg_operation_buffer).

Prior to D26457, we had the following behavior for timing RPCs:
| Catalog Reads  |  Timing enabled by default.    | Low res timer in prod, high res in tests.
| Data Reads       |  Timing disabled by default.   | Low res timer in prod, high res in tests.
| Data Writes       | Timing disabled by default.   | Low res timer in prod, high res in tests.

After D26457, this is the default behavior:
| Catalog Reads |  Timing enabled by default.    | High res timer in prod and tests.
| Data Reads |  Timing disabled by default.        | High res timer in prod and tests.
| Data Writes | Timing disabled by default.         | Low res timer in prod and tests.

**With this change, we seek to make Data Writes use the the high res timer in both prod and tests.**
That is, the proposed behavior is as follows:
| Catalog Reads |  Timing enabled by default.    | High res timer in prod and tests.
| Data Reads |  Timing disabled by default.        | High res timer in prod and tests.
| Data Writes | Timing disabled by default.         | High res timer in prod and tests.

This change can be done for the following reasons:
1. Explain Analyze uses global variables for tracking execution state (such as timing = on/off). This allows such information to be accessible in the asynchronous write path.
2. Write flushes necessarily happen at/before statement boundaries. This means that we will NOT mistakenly associate writes flushes of a query (n - 1) with that of query (n).

Attaching perf reports in comments.
Jira: DB-7383, DB-6950

Test Plan:
```
ybd --java-test 'org.yb.pgsql.TestPgExplainAnalyze'
```

Reviewers: telgersma

Reviewed By: telgersma

Subscribers: yql, smishra

Differential Revision: https://phorge.dev.yugabyte.com/D27285
karthik-ramanathan-3006 added a commit that referenced this issue Jul 28, 2023
…t to granular clocks in Linux.

Summary:
Original commit: 55f595f / D27285
This changes enables the usage of a high resolution (more granular) clock for timing RPCs requests in the asynchronous write path (pg_operation_buffer).

Prior to D26457, we had the following behavior for timing RPCs:
| Catalog Reads  |  Timing enabled by default.    | Low res timer in prod, high res in tests.
| Data Reads       |  Timing disabled by default.   | Low res timer in prod, high res in tests.
| Data Writes       | Timing disabled by default.   | Low res timer in prod, high res in tests.

After D26457, this is the default behavior:
| Catalog Reads |  Timing enabled by default.    | High res timer in prod and tests.
| Data Reads |  Timing disabled by default.        | High res timer in prod and tests.
| Data Writes | Timing disabled by default.         | Low res timer in prod and tests.

**With this change, we seek to make Data Writes use the the high res timer in both prod and tests.**
That is, the proposed behavior is as follows:
| Catalog Reads |  Timing enabled by default.    | High res timer in prod and tests.
| Data Reads |  Timing disabled by default.        | High res timer in prod and tests.
| Data Writes | Timing disabled by default.         | High res timer in prod and tests.

This change can be done for the following reasons:
1. Explain Analyze uses global variables for tracking execution state (such as timing = on/off). This allows such information to be accessible in the asynchronous write path.
2. Write flushes necessarily happen at/before statement boundaries. This means that we will NOT mistakenly associate writes flushes of a query (n - 1) with that of query (n).

Attaching perf reports in comments.
Jira: DB-7383, DB-6950

Test Plan:
```
ybd --java-test 'org.yb.pgsql.TestPgExplainAnalyze'
```

Reviewers: telgersma, smishra

Reviewed By: smishra

Subscribers: smishra, yql

Differential Revision: https://phorge.dev.yugabyte.com/D27354
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/ysql Yugabyte SQL (YSQL) kind/enhancement This is an enhancement of an existing feature priority/medium Medium priority issue
Projects
None yet
Development

No branches or pull requests

2 participants