Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[YSQL] Change explain output of distinct pushdown to be more user friendly #20831

Closed
1 task done
tanujnay112 opened this issue Jan 29, 2024 · 0 comments
Closed
1 task done
Assignees
Labels
area/ysql Yugabyte SQL (YSQL) kind/enhancement This is an enhancement of an existing feature priority/medium Medium priority issue

Comments

@tanujnay112
Copy link
Contributor

tanujnay112 commented Jan 29, 2024

Jira Link: DB-9820

Description

5d84faa introduced planner support for pushing down DISTINCT queries to supported indexes.
Here is an example of its usage.

yugabyte=# explain select distinct  col_int_key from mm;
                                            QUERY PLAN                                             
---------------------------------------------------------------------------------------------------
 Unique  (cost=0.00..12.00 rows=93 width=4)
   ->  Distinct Index Only Scan using idx_mm_col_int_key on mm  (cost=0.00..12.00 rows=93 width=4)
         Distinct Prefix: 1
(3 rows)

The output "Distinct Prefix" encodes what prefix of the index we run distinct on. In this case, we do so on the first column of idx_mm_col_int_key, col_int_key. It would be more user-friendly to have this field read something else like maybe
Distinct columns: col_int_key

Issue Type

kind/enhancement

Warning: Please confirm that this issue does not contain any sensitive information

  • I confirm this issue does not contain any sensitive information.
@tanujnay112 tanujnay112 added area/ysql Yugabyte SQL (YSQL) status/awaiting-triage Issue awaiting triage labels Jan 29, 2024
@yugabyte-ci yugabyte-ci added kind/enhancement This is an enhancement of an existing feature priority/medium Medium priority issue labels Jan 29, 2024
@pao214 pao214 assigned pao214 and unassigned vbalvida Feb 9, 2024
@sushantrmishra sushantrmishra removed the status/awaiting-triage Issue awaiting triage label Feb 9, 2024
pao214 added a commit that referenced this issue Mar 15, 2024
…tput

Summary:
### Objective

Before this change, the explain output looks as follows

```
yugabyte=# explain select distinct  col_int_key from mm;
                                            QUERY PLAN
---------------------------------------------------------------------------------------------------
 Unique  (cost=0.00..12.00 rows=93 width=4)
   ->  Distinct Index Only Scan using idx_mm_col_int_key on mm  (cost=0.00..12.00 rows=93 width=4)
         Distinct Prefix: 1
(3 rows)
```

We displayed the length of the distinct prefix because that is the sole parameter necessary for the HybridScan on the DocDB side.

However, users may find it more useful if we instead displayed the keys corresponding to the index columns instead. The new output looks

```
yugabyte=# explain select distinct k1 from t;
                                    QUERY PLAN
----------------------------------------------------------------------------------
 Unique  (cost=0.00..22.80 rows=200 width=4)
   ->  Distinct Index Scan using t_pkey on t  (cost=0.00..22.80 rows=200 width=4)
         Distinct Prefix Keys: k1
(3 rows)
```

### Solution

Pick the prefix length number of leading columns from the list of index targets and print them in explain.
Also, change the regression test output to reflect the change.
Jira: DB-9820

Test Plan:
Jenkins
./yb_build.sh --java-test TestPgRegressDistinctPushdown

Reviewers: tnayak

Reviewed By: tnayak

Subscribers: yql

Differential Revision: https://phorge.dev.yugabyte.com/D33042
pao214 added a commit that referenced this issue Mar 17, 2024
…in the explain output

Summary:
Original commit: e75e20d / D33042
### Objective

Before this change, the explain output looks as follows

```
yugabyte=# explain select distinct  col_int_key from mm;
                                            QUERY PLAN
---------------------------------------------------------------------------------------------------
 Unique  (cost=0.00..12.00 rows=93 width=4)
   ->  Distinct Index Only Scan using idx_mm_col_int_key on mm  (cost=0.00..12.00 rows=93 width=4)
         Distinct Prefix: 1
(3 rows)
```

We displayed the length of the distinct prefix because that is the sole parameter necessary for the HybridScan on the DocDB side.

However, users may find it more useful if we instead displayed the keys corresponding to the index columns instead. The new output looks

```
yugabyte=# explain select distinct k1 from t;
                                    QUERY PLAN
----------------------------------------------------------------------------------
 Unique  (cost=0.00..22.80 rows=200 width=4)
   ->  Distinct Index Scan using t_pkey on t  (cost=0.00..22.80 rows=200 width=4)
         Distinct Prefix Keys: k1
(3 rows)
```

### Solution

Pick the prefix length number of leading columns from the list of index targets and print them in explain.
Also, change the regression test output to reflect the change.
Jira: DB-9820

Test Plan:
Jenkins
./yb_build.sh --java-test TestPgRegressDistinctPushdown

Reviewers: tnayak, smishra

Reviewed By: tnayak

Subscribers: yql

Tags: #jenkins-ready

Differential Revision: https://phorge.dev.yugabyte.com/D33230
pao214 added a commit that referenced this issue Mar 17, 2024
… the explain output

Summary:
Original commit: e75e20d / D33042

Before this change, the explain output looks as follows

```
yugabyte=# explain select distinct  col_int_key from mm;
                                            QUERY PLAN
---------------------------------------------------------------------------------------------------
 Unique  (cost=0.00..12.00 rows=93 width=4)
   ->  Distinct Index Only Scan using idx_mm_col_int_key on mm  (cost=0.00..12.00 rows=93 width=4)
         Distinct Prefix: 1
(3 rows)
```

We displayed the length of the distinct prefix because that is the sole parameter necessary for the HybridScan on the DocDB side.

However, users may find it more useful if we instead displayed the keys corresponding to the index columns instead. The new output looks

```
yugabyte=# explain select distinct k1 from t;
                                    QUERY PLAN
----------------------------------------------------------------------------------
 Unique  (cost=0.00..22.80 rows=200 width=4)
   ->  Distinct Index Scan using t_pkey on t  (cost=0.00..22.80 rows=200 width=4)
         Distinct Prefix Keys: k1
(3 rows)
```

Pick the prefix length number of leading columns from the list of index targets and print them in explain.
Also, change the regression test output to reflect the change.
Jira: DB-9820

Test Plan:
Jenkins
./yb_build.sh --java-test TestPgRegressDistinctPushdown

Reviewers: tnayak, smishra

Reviewed By: tnayak

Subscribers: yql

Tags: #jenkins-ready

Differential Revision: https://phorge.dev.yugabyte.com/D33234
asrinivasanyb pushed a commit to asrinivasanyb/yugabyte-db that referenced this issue Mar 18, 2024
…plain output

Summary:
### Objective

Before this change, the explain output looks as follows

```
yugabyte=# explain select distinct  col_int_key from mm;
                                            QUERY PLAN
---------------------------------------------------------------------------------------------------
 Unique  (cost=0.00..12.00 rows=93 width=4)
   ->  Distinct Index Only Scan using idx_mm_col_int_key on mm  (cost=0.00..12.00 rows=93 width=4)
         Distinct Prefix: 1
(3 rows)
```

We displayed the length of the distinct prefix because that is the sole parameter necessary for the HybridScan on the DocDB side.

However, users may find it more useful if we instead displayed the keys corresponding to the index columns instead. The new output looks

```
yugabyte=# explain select distinct k1 from t;
                                    QUERY PLAN
----------------------------------------------------------------------------------
 Unique  (cost=0.00..22.80 rows=200 width=4)
   ->  Distinct Index Scan using t_pkey on t  (cost=0.00..22.80 rows=200 width=4)
         Distinct Prefix Keys: k1
(3 rows)
```

### Solution

Pick the prefix length number of leading columns from the list of index targets and print them in explain.
Also, change the regression test output to reflect the change.
Jira: DB-9820

Test Plan:
Jenkins
./yb_build.sh --java-test TestPgRegressDistinctPushdown

Reviewers: tnayak

Reviewed By: tnayak

Subscribers: yql

Differential Revision: https://phorge.dev.yugabyte.com/D33042
@pao214 pao214 closed this as completed Mar 18, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/ysql Yugabyte SQL (YSQL) kind/enhancement This is an enhancement of an existing feature priority/medium Medium priority issue
Projects
None yet
Development

No branches or pull requests

5 participants