[YSQL] Change explain output of distinct pushdown to be more user friendly #20831

tanujnay112 · 2024-01-29T21:17:41Z

Jira Link: DB-9820

Description

5d84faa introduced planner support for pushing down DISTINCT queries to supported indexes.
Here is an example of its usage.

yugabyte=# explain select distinct  col_int_key from mm;
                                            QUERY PLAN                                             
---------------------------------------------------------------------------------------------------
 Unique  (cost=0.00..12.00 rows=93 width=4)
   ->  Distinct Index Only Scan using idx_mm_col_int_key on mm  (cost=0.00..12.00 rows=93 width=4)
         Distinct Prefix: 1
(3 rows)

The output "Distinct Prefix" encodes what prefix of the index we run distinct on. In this case, we do so on the first column of idx_mm_col_int_key, col_int_key. It would be more user-friendly to have this field read something else like maybe
Distinct columns: col_int_key

Issue Type

kind/enhancement

Warning: Please confirm that this issue does not contain any sensitive information

I confirm this issue does not contain any sensitive information.

The text was updated successfully, but these errors were encountered:

…tput Summary: ### Objective Before this change, the explain output looks as follows ``` yugabyte=# explain select distinct col_int_key from mm; QUERY PLAN --------------------------------------------------------------------------------------------------- Unique (cost=0.00..12.00 rows=93 width=4) -> Distinct Index Only Scan using idx_mm_col_int_key on mm (cost=0.00..12.00 rows=93 width=4) Distinct Prefix: 1 (3 rows) ``` We displayed the length of the distinct prefix because that is the sole parameter necessary for the HybridScan on the DocDB side. However, users may find it more useful if we instead displayed the keys corresponding to the index columns instead. The new output looks ``` yugabyte=# explain select distinct k1 from t; QUERY PLAN ---------------------------------------------------------------------------------- Unique (cost=0.00..22.80 rows=200 width=4) -> Distinct Index Scan using t_pkey on t (cost=0.00..22.80 rows=200 width=4) Distinct Prefix Keys: k1 (3 rows) ``` ### Solution Pick the prefix length number of leading columns from the list of index targets and print them in explain. Also, change the regression test output to reflect the change. Jira: DB-9820 Test Plan: Jenkins ./yb_build.sh --java-test TestPgRegressDistinctPushdown Reviewers: tnayak Reviewed By: tnayak Subscribers: yql Differential Revision: https://phorge.dev.yugabyte.com/D33042

…in the explain output Summary: Original commit: e75e20d / D33042 ### Objective Before this change, the explain output looks as follows ``` yugabyte=# explain select distinct col_int_key from mm; QUERY PLAN --------------------------------------------------------------------------------------------------- Unique (cost=0.00..12.00 rows=93 width=4) -> Distinct Index Only Scan using idx_mm_col_int_key on mm (cost=0.00..12.00 rows=93 width=4) Distinct Prefix: 1 (3 rows) ``` We displayed the length of the distinct prefix because that is the sole parameter necessary for the HybridScan on the DocDB side. However, users may find it more useful if we instead displayed the keys corresponding to the index columns instead. The new output looks ``` yugabyte=# explain select distinct k1 from t; QUERY PLAN ---------------------------------------------------------------------------------- Unique (cost=0.00..22.80 rows=200 width=4) -> Distinct Index Scan using t_pkey on t (cost=0.00..22.80 rows=200 width=4) Distinct Prefix Keys: k1 (3 rows) ``` ### Solution Pick the prefix length number of leading columns from the list of index targets and print them in explain. Also, change the regression test output to reflect the change. Jira: DB-9820 Test Plan: Jenkins ./yb_build.sh --java-test TestPgRegressDistinctPushdown Reviewers: tnayak, smishra Reviewed By: tnayak Subscribers: yql Tags: #jenkins-ready Differential Revision: https://phorge.dev.yugabyte.com/D33230

… the explain output Summary: Original commit: e75e20d / D33042 Before this change, the explain output looks as follows ``` yugabyte=# explain select distinct col_int_key from mm; QUERY PLAN --------------------------------------------------------------------------------------------------- Unique (cost=0.00..12.00 rows=93 width=4) -> Distinct Index Only Scan using idx_mm_col_int_key on mm (cost=0.00..12.00 rows=93 width=4) Distinct Prefix: 1 (3 rows) ``` We displayed the length of the distinct prefix because that is the sole parameter necessary for the HybridScan on the DocDB side. However, users may find it more useful if we instead displayed the keys corresponding to the index columns instead. The new output looks ``` yugabyte=# explain select distinct k1 from t; QUERY PLAN ---------------------------------------------------------------------------------- Unique (cost=0.00..22.80 rows=200 width=4) -> Distinct Index Scan using t_pkey on t (cost=0.00..22.80 rows=200 width=4) Distinct Prefix Keys: k1 (3 rows) ``` Pick the prefix length number of leading columns from the list of index targets and print them in explain. Also, change the regression test output to reflect the change. Jira: DB-9820 Test Plan: Jenkins ./yb_build.sh --java-test TestPgRegressDistinctPushdown Reviewers: tnayak, smishra Reviewed By: tnayak Subscribers: yql Tags: #jenkins-ready Differential Revision: https://phorge.dev.yugabyte.com/D33234

…plain output Summary: ### Objective Before this change, the explain output looks as follows ``` yugabyte=# explain select distinct col_int_key from mm; QUERY PLAN --------------------------------------------------------------------------------------------------- Unique (cost=0.00..12.00 rows=93 width=4) -> Distinct Index Only Scan using idx_mm_col_int_key on mm (cost=0.00..12.00 rows=93 width=4) Distinct Prefix: 1 (3 rows) ``` We displayed the length of the distinct prefix because that is the sole parameter necessary for the HybridScan on the DocDB side. However, users may find it more useful if we instead displayed the keys corresponding to the index columns instead. The new output looks ``` yugabyte=# explain select distinct k1 from t; QUERY PLAN ---------------------------------------------------------------------------------- Unique (cost=0.00..22.80 rows=200 width=4) -> Distinct Index Scan using t_pkey on t (cost=0.00..22.80 rows=200 width=4) Distinct Prefix Keys: k1 (3 rows) ``` ### Solution Pick the prefix length number of leading columns from the list of index targets and print them in explain. Also, change the regression test output to reflect the change. Jira: DB-9820 Test Plan: Jenkins ./yb_build.sh --java-test TestPgRegressDistinctPushdown Reviewers: tnayak Reviewed By: tnayak Subscribers: yql Differential Revision: https://phorge.dev.yugabyte.com/D33042

tanujnay112 added area/ysql Yugabyte SQL (YSQL) status/awaiting-triage Issue awaiting triage labels Jan 29, 2024

yugabyte-ci added kind/enhancement This is an enhancement of an existing feature priority/medium Medium priority issue labels Jan 29, 2024

tanujnay112 assigned vbalvida Jan 29, 2024

pao214 assigned pao214 and unassigned vbalvida Feb 9, 2024

sushantrmishra removed the status/awaiting-triage Issue awaiting triage label Feb 9, 2024

pao214 closed this as completed Mar 18, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[YSQL] Change explain output of distinct pushdown to be more user friendly #20831

[YSQL] Change explain output of distinct pushdown to be more user friendly #20831

tanujnay112 commented Jan 29, 2024 •

edited by jira bot

Loading

[YSQL] Change explain output of distinct pushdown to be more user friendly #20831

[YSQL] Change explain output of distinct pushdown to be more user friendly #20831

Comments

tanujnay112 commented Jan 29, 2024 • edited by jira bot Loading

Description

Issue Type

Warning: Please confirm that this issue does not contain any sensitive information

tanujnay112 commented Jan 29, 2024 •

edited by jira bot

Loading