Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
[BACKPORT 2.20][#12676] YSQL: Add Storage Rows Scanned to output of E…
…XPLAIN(ANALYZE, DIST). Summary: Original commit: a302b50 / D31111 This revision adds Storage Row statistics to the output of `EXPLAIN (ANALYZE, DIST)`. These stats help explain the selectivity of remote filters and index conditions, thereby allowing the user to distinguish between the work done by the storage layer vs the query layer. Consider this example: ``` yugabyte=# \d filter_test Table "public.filter_test" Column | Type | Collation | Nullable | Default --------+-----------------------------+-----------+----------+--------- h | integer | | not null | v1 | integer | | | ts | timestamp without time zone | | | Indexes: "filter_test_pkey" PRIMARY KEY, lsm (h ASC), colocation: true "filter_test_v1" lsm (v1 ASC), colocation: true Colocation: true yugabyte=# SELECT COUNT(*) FROM filter_test; count ------- 10000 (1 row) yugabyte=# EXPLAIN (ANALYZE, DIST) SELECT * FROM filter_test WHERE h <= 1000 AND h % 2 = 0 AND ts >= NOW() - (interval '30 days'); QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------------- Index Scan using filter_test_pkey on filter_test (cost=0.00..4.14 rows=1 width=16) (actual time=30.375..31.272 rows=146 loops=1) Index Cond: (h <= 1000) Remote Filter: ((h % 2) = 0) Filter: (ts >= (now() - '30 days'::interval)) Rows Removed by Filter: 354 Storage Table Read Requests: 1 Storage Table Read Execution Time: 29.232 ms Storage Table Rows Scanned: 1000 Planning Time: 0.311 ms Execution Time: 31.549 ms Storage Read Requests: 1 Storage Read Execution Time: 29.232 ms Storage Rows Scanned: 1000 Storage Write Requests: 0 Catalog Read Requests: 0 Catalog Write Requests: 0 Storage Flush Requests: 0 Storage Execution Time: 29.232 ms Peak Memory Usage: 24 kB (19 rows) ``` From the output of EXPLAIN for the above query, the user understands that: - 1000 rows out of 10000 rows in the table filter_test were scanned by Index Scan in DocDB. This represents the selectivity of the Index Condition - 146 + 354 = 500 rows out of 1000 matched the modulo (remote) filter and were returned to Postgres - 146 rows out of 500 matched the timestamp filter and were returned as the result of the query This revision adds the following stats to scan plan nodes: # `Storage Table Rows Scanned`: The number of logical rows scanned in the main table # `Storage Index Rows Scanned`: The number of logical rows scanned in secondary indexes **Can existing DocDB metrics be used instead?** The difference between the DocDB metrics `docdb_keys_found` and `docdb_obsolete_keys_found` for a given query/plan node provides a good approximation of the number of scanned rows. However, this approach suffers from the following pitfalls: - It is an approximation and does not provide the exact the number. - Relying on the implementation of key retrieval in DocDB causes a loss of abstraction. - More work has to be done to distinguish between Table Rows and Index Rows. **Is the distinction between Table and Index Rows scanned important?** In most cases, one can expect to see identical number of scanned rows for tables and indexes. However the two can differ when an index has both a condition and a remote filter on it. Other applicable scenarios include lossy bitmap scans. Consider this example: ``` yugabyte=# CREATE TABLE filter_test(a int, b int, ts timestamp); CREATE TABLE yugabyte=# INSERT INTO filter_test (SELECT i, i, NOW() FROM generate_series(1, 1000) AS i); INSERT 0 1000 yugabyte=# CREATE INDEX ON filter_test(a ASC); CREATE INDEX yugabyte=# \d filter_test Table "public.filter_test" Column | Type | Collation | Nullable | Default --------+-----------------------------+-----------+----------+--------- a | integer | | | b | integer | | | ts | timestamp without time zone | | | Indexes: "filter_test_a_idx" lsm (a ASC) yugabyte=# EXPLAIN (ANALYZE, DIST) SELECT * FROM filter_test WHERE a < 100 AND a % 2 = 0; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------- Index Scan using filter_test_a_idx on filter_test (cost=0.00..5.33 rows=10 width=16) (actual time=6.279..6.328 rows=49 loops=1) Index Cond: (a < 100) Remote Index Filter: ((a % 2) = 0) Storage Table Read Requests: 1 Storage Table Read Execution Time: 1.207 ms Storage Table Rows Scanned: 49 Storage Index Read Requests: 1 Storage Index Read Execution Time: 2.639 ms Storage Index Rows Scanned: 99 Planning Time: 17.864 ms Execution Time: 6.852 ms Storage Read Requests: 2 Storage Read Execution Time: 3.845 ms Storage Rows Scanned: 148 Storage Write Requests: 0 Catalog Read Requests: 20 Catalog Read Execution Time: 54.329 ms Catalog Write Requests: 0 Storage Flush Requests: 0 Storage Execution Time: 58.174 ms Peak Memory Usage: 24 kB (21 rows) ``` In this example: - The index condition `a < 100` is applied to selectively scan 99 rows out of 1000 in DocDB. - The index filter `(a % 2) = 0` then filters out 49 rows of 99 in which the value of `a` is even. - The corresponding 49 rows are scanned on the main table (`filter_test`) and returned to Postgres which in turn returns them as the result of the query. Notes: - The number of rows scanned by Catalog requests are not shown. - The rows scanned by an IndexScan on the Primary Key of a user table are counted as Storage Table Rows Scanned. **Upgrade/Rollback safety:** This revision does not alter persisted data. New fields are added to the tserver response (to pggate) proto. Checks are present in pggate to handle both the presence and absence of these fields in the response proto. This is upgrade and rollback safe. No new flags are added to guard the feature. Jira: DB-569 Test Plan: ``` ./yb_build.sh ---java-test org.yb.pgsql.TestPgExplainAnalyze ./yb_build.sh --java-test org.yb.pgsql.TestPgExplainAnalyzeColocated ./yb_build.sh --java-test org.yb.pgsql.TestPgExplainAnalyzeScans#testIndexScanConditionAndFilter ``` Reviewers: esheng, telgersma, amartsinchyk Reviewed By: telgersma Subscribers: yql, smishra, ybase Tags: #jenkins-ready Differential Revision: https://phorge.dev.yugabyte.com/D31931
- Loading branch information