sql: add client time to top-level execution stats #117591

Merged

craig bot merged 1 commit into cockroachdb:master on Jan 23, 2024

Conversation

@yuzefovich (Member) commented Jan 10, 2024

This commit adds a "client time" execution statistic to the plan.txt file of statement bundles. It tracks the amount of time spent waiting for the client to consume the query result (as well as constructing the PGWire messages in the first place). This should be helpful when investigating slow queries where the bottleneck is actually the client. Note, however, that this new statistic is unlikely to help in explicit EXPLAIN ANALYZE cases since in that setup we discard the output rows, so there is nothing for the client to consume.

An example output of plan.txt from the test added in this commit (where we inject 1s of sleep between reading messages corresponding to two rows):

        planning time: 104µs
        execution time: 1s
        distribution: full
        vectorized: true
        rows decoded from KV: 2 (48 B, 1 gRPC calls)
        cumulative time spent in KV: 434µs
        maximum memory usage: 10 KiB
        network usage: 0 B (0 messages)
        sql cpu time: 4µs
        client time: 1s
        isolation level: serializable
        priority: normal
        quality of service: regular

        • scan
          nodes: n1
          actual row count: 2
          KV time: 434µs
          KV contention time: 0µs
          KV rows decoded: 2
          KV bytes read: 48 B
          KV gRPC calls: 1
          estimated max memory allocated: 10 KiB
          sql cpu time: 4µs
          missing stats
          table: t@t_pkey
          spans: FULL SCAN
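
To make the scenario concrete, below is a minimal sketch (not part of this PR) of a Go client that pauses between reading rows, which is the kind of slow consumer the new statistic is meant to surface. The connection string, table name `t`, and column `k` are assumptions for illustration; the test in this commit instead injects the delay directly between PGWire messages, so it does not depend on network buffering.

```go
package main

import (
	"context"
	"fmt"
	"log"
	"time"

	"github.com/jackc/pgx/v5"
)

func main() {
	ctx := context.Background()
	// Hypothetical connection string; point this at your own cluster.
	conn, err := pgx.Connect(ctx, "postgresql://root@localhost:26257/defaultdb?sslmode=disable")
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close(ctx)

	rows, err := conn.Query(ctx, "SELECT k FROM t")
	if err != nil {
		log.Fatal(err)
	}
	defer rows.Close()

	for rows.Next() {
		var k int
		if err := rows.Scan(&k); err != nil {
			log.Fatal(err)
		}
		fmt.Println(k)
		// Simulate a slow consumer: while the client isn't draining results,
		// the server can block on sending PGWire messages, and that wait is
		// what "client time" accounts for (provided the result set is large
		// enough that it isn't fully absorbed by network buffers).
		time.Sleep(time.Second)
	}
	if err := rows.Err(); err != nil {
		log.Fatal(err)
	}
}
```

If a statement bundle is collected for such an execution (for example via the statement diagnostics "Activate diagnostics" flow in the DB Console), the resulting plan.txt should show a client time close to the total injected delay, similar to the output above.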

Fixes: #115110.

Release note (sql change): An additional execution statistic that measures "client time" is now included in the plan.txt files of statement bundles. This client time tracks how long the query execution was blocked on the client receiving the PGWire protocol messages. Note that in the EXPLAIN ANALYZE context it doesn't make much sense because in that variant the output rows are discarded and are not communicated to the client.


blathers-crl bot commented Jan 10, 2024

It looks like your PR touches production code but doesn't add or edit any test code. Did you consider adding tests to your PR?

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

@cockroach-teamcity (Member)

This change is Reviewable

@yuzefovich (Member, Author)

Ignore the first commit here - it's in a separate PR #118149.

@DrewKimball (Collaborator) left a comment

:lgtm: Cool!

Reviewed 1 of 1 files at r1, 8 of 8 files at r2, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @michae2)

@yuzefovich (Member, Author)

TFTR!

bors r+

craig bot (Contributor) commented Jan 23, 2024

Build succeeded.

craig bot merged commit 585ffa0 into cockroachdb:master on Jan 23, 2024
8 of 9 checks passed
@yuzefovich deleted the client-time branch on January 23, 2024 03:58
@michae2 (Collaborator) left a comment

:lgtm: Very nice! This is going to come in handy!

Reviewed 8 of 8 files at r2, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 1 stale)


pkg/sql/explain_bundle_test.go line 650 at r3 (raw file):

	urlTemplate := getBundleDownloadURL(t, sqlutils.MatrixToStr(rows))
	prefixLength := strings.LastIndex(urlTemplate, "/")
	url := urlTemplate[:prefixLength] + "/" + strconv.Itoa(id)

Hah, wow. Very clever.
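
For readers following along, here is a self-contained sketch of what the quoted lines do, with made-up values (the URL shape and IDs are assumptions, not taken from the test): the template is the download URL reported for the most recently collected bundle, the trailing bundle ID is stripped with strings.LastIndex, and the ID of the bundle under test is spliced back in.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

func main() {
	// Hypothetical download URL for the latest bundle (ID 123 is made up).
	urlTemplate := "https://127.0.0.1:8080/_admin/v1/stmtbundle/123"

	// Cut off everything after the last "/" to drop the trailing bundle ID.
	prefix := urlTemplate[:strings.LastIndex(urlTemplate, "/")]

	// Splice in the ID of the bundle we actually want to fetch.
	id := 456
	fmt.Println(prefix + "/" + strconv.Itoa(id))
	// Prints: https://127.0.0.1:8080/_admin/v1/stmtbundle/456
}
```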

Development

Successfully merging this pull request may close these issues.

sql: add client / protocol time to explain analyze stats
4 participants