[YSQL] Unexpected SELECT DISTINCT Behaviour with mixed primary key, non-primary key clauses or aggregate targets. #17648

vbalvida · 2023-06-01T18:23:42Z

Jira Link: DB-6769

Description

We fail to retrieve certain rows with SELECT DISTINCT because we push the uniqueness constraint to lower layers without also pushing the required filters or aggregation functions past the de-duplication operator. Below we present one such example. See the comments for more such examples.

Example

Setup

CREATE TABLE t1(r1 INT, r2 INT, s INT, PRIMARY KEY(r1 ASC, r2 ASC));
INSERT INTO t1 (SELECT 1, i, i FROM GENERATE_SERIES(1, 1000) AS i);

Distinct Query

/*+Set(enable_hashagg false)*/ SELECT DISTINCT r1 FROM t1 WHERE s = 500 AND r1 <= 1000;

The expected behavior is to return one row with the value 1, but the current behavior does not yield any rows.

To further elaborate the issue, lets consider two more scenarios,

No clause on r1

/*+Set(enable_hashagg false)*/ SELECT DISTINCT r1 FROM t1 WHERE s = 1000;

This query returns the expected row since we do not use scan choices in this scenario.

No clause on s

/*+Set(enable_hashagg false)*/ SELECT DISTINCT r1 FROM t1 WHERE r1 <= 500;

This query returns the expected row since the scan choices module is aware of all the bounds.

However, scan choices is not aware of where clauses on the non-primary keys. Hence, in combination, the query behaves unexpectedly.

We just discussed an example involving a filter pushed down to the DocDB layer. There are more examples in comments where further processing on the postgres side makes it inaccurate to push down the uniqueness constraint to DocDB without also pushing down the surrounding constraints/functions.

Warning: Please confirm that this issue does not contain any sensitive information

I confirm this issue does not contain any sensitive information.

The text was updated successfully, but these errors were encountered:

vbalvida · 2023-06-13T21:15:51Z

Found another example that fails to return the correct results

With the following setup,

CREATE TABLE t2 (r1 INT, r2 INT, s timestamp, PRIMARY KEY(r1 ASC, r2 ASC));
INSERT INTO t2 VALUES (1, 1, NOW()), (1, 2, NOW() + INTERVAL '1 day');

the query

/*+Set(enable_hashagg false)*/ SELECT DISTINCT r1 FROM t2 WHERE s > now() AND r1 <= 1000;

Returns no rows when the query

/*+Set(enable_hashagg false)*/ SELECT r1 FROM t2 WHERE s > now() AND r1 <= 1000;

returns 1 row corresponding to the second tuple in the original table.

This particular query differs from the one present in the issue in that the filtering happens on the postgres side and not all filter clauses are pushed down to DocDB.

The above behavior has an notable implication. Avoiding pushing down DISTINCT past any filters (both on the DocDB side and the postgres side) is non-trivial. Instead, we approach the bug by being very apprehensive about when to use the prefix logic. This is not optimal and hence created #17741 for improvements increasing the applicability of prefix based skipping.

vbalvida · 2023-06-15T19:52:37Z

Found yet another interesting query that behaves unexpectedly

Setup

CREATE TABLE t(r1 INT, r2 INT, PRIMARY KEY(r1 ASC, r2 ASC));
INSERT INTO t (SELECT 1, i FROM GENERATE_SERIES(1, 1000) AS i);
SET yb_debug_log_docdb_requests TO true;

Query

/*+Set(enable_hashagg false)*/ SELECT DISTINCT COUNT(tableoid) FROM t WHERE r1 <= 1000 GROUP BY r1;

Returned result

 count 
-------
     1
(1 row)

Postgres result

 count
-------
  1000
(1 row)

DocDB log

{ READ active: 1 read_time: { read: <invalid> local_limit: <invalid> global_limit: <invalid> in_txn_limit: <invalid> serial_no: 0 } request: client: YQL_CLIENT_PGSQL stmt_id: 4698329376 schema_version: 0 targets { column_id: 10 } targets { column_id: -8 } column_refs { ids: 10 } is_forward_scan: 1 is_aggregate: 0 limit: 1024 return_paging_state: 1 ysql_catalog_version: 1 table_id: "000033f5000030008000000000004000" condition_expr { condition { op: QL_OP_AND operands { condition { op: QL_OP_LESS_THAN_EQUAL operands { column_id: 10 } operands { value { int32_value: 1000 } } } } } } upper_bound { key: "48800003E87E007E21" is_inclusive: 1 } col_refs { column_id: 10 attno: 1 } partition_key: "$~!" prefix_length: 1 size_limit: 0 }

Observe the length of the prefix. The prefix must encompass both columns of the row because the query involves an aggregation.

Additional Caveat

This issue manifests when we request a system column and not a regular column which might sound surprising. The truth is that the postgres optimization layer requests the complete tuple in situations where projecting the tuples is more expensive for the executor. This doesn't happen for system columns. More importantly, the optimization to return the whole tuple may not be accurate in our case since we transfer tuples over the network. Hence, the optimization maybe disabled in the future.

Conclusion

Ensuring that pushdown works with all aspects of SELECT DISTINCT is non-trivial because of the complexity of postgres grammar. Created #17801 for emergency scenarios for when we need to disable this particular feature.

…nditions Summary: We fail to retrieve certain rows with SELECT DISTINCT because we **push** the **uniqueness** constraint past some **filters** or some **aggregation** targets. **Context** The change https://phorge.dev.yugabyte.com/D20742 introduces pushdown logic for the SELECT DISTINCT clause. In this case, we ask the DocDB layer to not return duplicate values. The DocDB layer utilizes the fact that the primary key is unique to avoid executing any uniqueness constraints. Moreover, whenever the requested columns are a subset of a prefix of the primary key then an additional optimization is applied. This optimization skips over certain rows as long as these rows do not change the said prefix. //Example Table:// | r1 | r2 | v | | 1 | 1 | 1 | | 1 | 2 | 2 | Here, `r1` and `r2` combined constitute the primary key. `s` is not primary. If we request only the first column in a SELECT DISTINCT clause, the DocDB layer does not seek the second row since it is a duplicate as far as the key `r1` is concerned. **Issue** Observe that we cannot apply the uniqueness constraint (DISTINCTness) until all the filters have been applied. However, in the above example, returning only the first row leads to an incorrect result when we have an additional filter on the third column `v` such as `v = 2`. This is because, the scan logic in DocDB only returns the first row while it is in fact the second row that satisfies the condition. Additionally, the predicates may also be of the form `r1 < now()` where we are unable to pushdown the predicate to the DocDB layer or even push it down as an index scan predicate such as `r1 * r2 >= 0`. **Fix** Do not set a prefix whenever there is a reference to non primary keys. Also do not set a prefix whenever there are aggregate functions as target since an incorrect prefix there might produce inaccurate results. **Applicability** Prefix pushdown optimization is in effect when we have - An Index Scan - No additional filters on top of those required by the Index Scan - No aggregate functions as targets Jira: DB-6769 Test Plan: Jenkins ``` ybd --java-test org.yb.pgsql.TestPgSelect#testDistinctRemoteFilter ybd --java-test org.yb.pgsql.TestPgSelect#testDistinctLocalFilter ybd --java-test org.yb.pgsql.TestPgSelect#testDistinctAgg ``` Reviewers: tnayak, mihnea Reviewed By: tnayak Differential Revision: https://phorge.dev.yugabyte.com/D25980

…th non index conditions Summary: Original commit: db3d7f4 / D25980 We fail to retrieve certain rows with SELECT DISTINCT because we **push** the **uniqueness** constraint past some **filters** or some **aggregation** targets. **Context** The change https://phorge.dev.yugabyte.com/D20742 introduces pushdown logic for the SELECT DISTINCT clause. In this case, we ask the DocDB layer to not return duplicate values. The DocDB layer utilizes the fact that the primary key is unique to avoid executing any uniqueness constraints. Moreover, whenever the requested columns are a subset of a prefix of the primary key then an additional optimization is applied. This optimization skips over certain rows as long as these rows do not change the said prefix. //Example Table:// | r1 | r2 | v | | 1 | 1 | 1 | | 1 | 2 | 2 | Here, `r1` and `r2` combined constitute the primary key. `s` is not primary. If we request only the first column in a SELECT DISTINCT clause, the DocDB layer does not seek the second row since it is a duplicate as far as the key `r1` is concerned. **Issue** Observe that we cannot apply the uniqueness constraint (DISTINCTness) until all the filters have been applied. However, in the above example, returning only the first row leads to an incorrect result when we have an additional filter on the third column `v` such as `v = 2`. This is because, the scan logic in DocDB only returns the first row while it is in fact the second row that satisfies the condition. Additionally, the predicates may also be of the form `r1 < now()` where we are unable to pushdown the predicate to the DocDB layer or even push it down as an index scan predicate such as `r1 * r2 >= 0`. **Fix** Do not set a prefix whenever there is a reference to non primary keys. Also do not set a prefix whenever there are aggregate functions as target since an incorrect prefix there might produce inaccurate results. **Applicability** Prefix pushdown optimization is in effect when we have - An Index Scan - No additional filters on top of those required by the Index Scan - No aggregate functions as targets Jira: DB-6769 Test Plan: Jenkins ``` ybd --java-test org.yb.pgsql.TestPgSelect#testDistinctRemoteFilter ybd --java-test org.yb.pgsql.TestPgSelect#testDistinctLocalFilter ybd --java-test org.yb.pgsql.TestPgSelect#testDistinctAgg ``` Reviewers: tnayak, mihnea Reviewed By: tnayak Differential Revision: https://phorge.dev.yugabyte.com/D26351

Summary: Add a GUC option named `yb_enable_distinct_pushdown`. This option controls whether or not the pushdown feature for SELECT DISTINCT is in effect. **Context** The issue #17648 documents several issues found with the current implementation of SELECT DISTINCT. More importantly, the source of all these issues can be traced back to pushing down the DISTINCT operator too deep past operations such as filters and aggregations. These lead to correctness issues and we want to provide users with the flexibility to toggle the pushdown feature to mitigate issues with the current implementation as well as in anticipation of issues arising from future changes to the pushdown feature. Jira: DB-6890 Test Plan: ``` ./yb_build.sh --java-test TestPgRegressDistinctPushdown ``` Reviewers: tnayak Reviewed By: tnayak Subscribers: jason Differential Revision: https://phorge.dev.yugabyte.com/D26268

Summary: Original commit: 5104be4 / D26268 Add a GUC option named `yb_enable_distinct_pushdown`. This option controls whether or not the pushdown feature for SELECT DISTINCT is in effect. **Context** The issue #17648 documents several issues found with the current implementation of SELECT DISTINCT. More importantly, the source of all these issues can be traced back to pushing down the DISTINCT operator too deep past operations such as filters and aggregations. These lead to correctness issues and we want to provide users with the flexibility to toggle the pushdown feature to mitigate issues with the current implementation as well as in anticipation of issues arising from future changes to the pushdown feature. Jira: DB-6890 Test Plan: ``` ./yb_build.sh --java-test TestPgRegressDistinctPushdown ``` Reviewers: tnayak Reviewed By: tnayak Subscribers: jason Differential Revision: https://phorge.dev.yugabyte.com/D26589

…index conditions Summary: We fail to retrieve certain rows with SELECT DISTINCT because we **push** the **uniqueness** constraint past some **filters** or some **aggregation** targets. **Context** The change https://phorge.dev.yugabyte.com/D20742 introduces pushdown logic for the SELECT DISTINCT clause. In this case, we ask the DocDB layer to not return duplicate values. The DocDB layer utilizes the fact that the primary key is unique to avoid executing any uniqueness constraints. Moreover, whenever the requested columns are a subset of a prefix of the primary key then an additional optimization is applied. This optimization skips over certain rows as long as these rows do not change the said prefix. //Example Table:// | r1 | r2 | v | | 1 | 1 | 1 | | 1 | 2 | 2 | Here, `r1` and `r2` combined constitute the primary key. `s` is not primary. If we request only the first column in a SELECT DISTINCT clause, the DocDB layer does not seek the second row since it is a duplicate as far as the key `r1` is concerned. **Issue** Observe that we cannot apply the uniqueness constraint (DISTINCTness) until all the filters have been applied. However, in the above example, returning only the first row leads to an incorrect result when we have an additional filter on the third column `v` such as `v = 2`. This is because, the scan logic in DocDB only returns the first row while it is in fact the second row that satisfies the condition. Additionally, the predicates may also be of the form `r1 < now()` where we are unable to pushdown the predicate to the DocDB layer or even push it down as an index scan predicate such as `r1 * r2 >= 0`. **Fix** Do not set a prefix whenever there is a reference to non primary keys. Also do not set a prefix whenever there are aggregate functions as target since an incorrect prefix there might produce inaccurate results. **Applicability** Prefix pushdown optimization is in effect when we have - An Index Scan - No additional filters on top of those required by the Index Scan - No aggregate functions as targets Jira: DB-6769 Test Plan: Jenkins ``` ybd --java-test org.yb.pgsql.TestPgSelect#testDistinctRemoteFilter ybd --java-test org.yb.pgsql.TestPgSelect#testDistinctLocalFilter ybd --java-test org.yb.pgsql.TestPgSelect#testDistinctAgg ``` Reviewers: tnayak, mihnea Reviewed By: tnayak Differential Revision: https://phorge.dev.yugabyte.com/D25980

Summary: Add a GUC option named `yb_enable_distinct_pushdown`. This option controls whether or not the pushdown feature for SELECT DISTINCT is in effect. **Context** The issue yugabyte#17648 documents several issues found with the current implementation of SELECT DISTINCT. More importantly, the source of all these issues can be traced back to pushing down the DISTINCT operator too deep past operations such as filters and aggregations. These lead to correctness issues and we want to provide users with the flexibility to toggle the pushdown feature to mitigate issues with the current implementation as well as in anticipation of issues arising from future changes to the pushdown feature. Jira: DB-6890 Test Plan: ``` ./yb_build.sh --java-test TestPgRegressDistinctPushdown ``` Reviewers: tnayak Reviewed By: tnayak Subscribers: jason Differential Revision: https://phorge.dev.yugabyte.com/D26268

vbalvida added area/ysql Yugabyte SQL (YSQL) status/awaiting-triage Issue awaiting triage labels Jun 1, 2023

yugabyte-ci added kind/bug This issue is a bug priority/medium Medium priority issue labels Jun 1, 2023

vbalvida self-assigned this Jun 1, 2023

yugabyte-ci added priority/high High Priority and removed priority/medium Medium priority issue status/awaiting-triage Issue awaiting triage labels Jun 2, 2023

vbalvida mentioned this issue Jun 12, 2023

[YSQL] Support DISTINCT pushdown even in the presence of push down non-index predicates or aggregates #17741

Open

5 tasks

vbalvida changed the title ~~[YSQL] Unexpected SELECT DISTINCT Behaviour with mixed primary key, non-primary key clauses~~ [YSQL] Unexpected SELECT DISTINCT Behaviour with mixed primary key, non-primary key clauses and aggregate targets. Jun 15, 2023

vbalvida changed the title ~~[YSQL] Unexpected SELECT DISTINCT Behaviour with mixed primary key, non-primary key clauses and aggregate targets.~~ [YSQL] Unexpected SELECT DISTINCT Behaviour with mixed primary key, non-primary key clauses or aggregate targets. Jun 15, 2023

vbalvida mentioned this issue Jun 15, 2023

[YSQL] Create a flag to toggle the SELECT DISTINCT pushdown feature #17801

Closed

1 task

yugabyte-ci added priority/medium Medium priority issue and removed priority/high High Priority labels Jun 21, 2023

vbalvida mentioned this issue Jun 21, 2023

[YSQL] Support distinct pushdown in YSQL #17872

Closed

1 task

yugabyte-ci closed this as completed Jun 23, 2023

vbalvida mentioned this issue Aug 31, 2023

[YSQL] Distinct Pushdown Framework #18944

Open

1 task

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[YSQL] Unexpected SELECT DISTINCT Behaviour with mixed primary key, non-primary key clauses or aggregate targets. #17648

[YSQL] Unexpected SELECT DISTINCT Behaviour with mixed primary key, non-primary key clauses or aggregate targets. #17648

vbalvida commented Jun 1, 2023 •

edited

vbalvida commented Jun 13, 2023 •

edited

vbalvida commented Jun 15, 2023 •

edited

[YSQL] Unexpected SELECT DISTINCT Behaviour with mixed primary key, non-primary key clauses or aggregate targets. #17648

[YSQL] Unexpected SELECT DISTINCT Behaviour with mixed primary key, non-primary key clauses or aggregate targets. #17648

Comments

vbalvida commented Jun 1, 2023 • edited

Description

Example

No clause on r1

No clause on s

Warning: Please confirm that this issue does not contain any sensitive information

vbalvida commented Jun 13, 2023 • edited

vbalvida commented Jun 15, 2023 • edited

vbalvida commented Jun 1, 2023 •

edited

vbalvida commented Jun 13, 2023 •

edited

vbalvida commented Jun 15, 2023 •

edited