Some In/Exists Subqueries will generate wrong PhysicalPlan #5265

@ygf11

Describe the bug

DataFusion cannot currently execute non-correlated subqueries such as IN/EXISTS.
These queries should therefore return a NotImplemented("Physical plan does not support logical expression In/Exists") error. Instead, the subquery filter is currently pushed down to the TableScan.

> explain select * from t1 where exists(select 1 from t2 where t2.t2_id > 0);
+---------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------+
| plan_type     | plan                                                                                                                                                        |
+---------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------+
| logical_plan  | Projection: t1.t1_id, t1.t1_name, t1.t1_int                                                                                                                 |
|               |   TableScan: t1 projection=[t1_id, t1_name, t1_int], full_filters=[EXISTS (<subquery>)]                                                                     |
|               |     Subquery:                                                                                                                                               |
|               |       Projection: Int64(1)                                                                                                                                  |
|               |         Filter: CAST(t2.t2_id AS Int64) > Int64(0)                                                                                                          |
|               |           TableScan: t2                                                                                                                                     |
| physical_plan | ProjectionExec: expr=[t1_id@0 as t1_id, t1_name@1 as t1_name, t1_int@2 as t1_int]                                                                           |
|               |   CsvExec: files={1 group: [[home/work/tools/datafusion-test-data/join-context/t1.csv]]}, has_header=false, limit=None, projection=[t1_id, t1_name, t1_int] |
|               |                                                                                                                                                             |
+---------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------+

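For the query above, the EXISTS subquery filter appears in the logical_plan (as full_filters on the TableScan of t1) but is lost in the final physical_plan: the CsvExec carries no filter at all.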

To Reproduce
Run the EXPLAIN query shown above.

Expected behavior
The query should return a NotImplemented("Physical plan does not support logical expression In/Exists") error.
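
The expected behavior could be sketched roughly as follows. This is a simplified, self-contained model, not DataFusion's actual code: the Expr and PlanError types and the functions contains_subquery and check_scan_filters are hypothetical names introduced only for illustration. The idea is that before scan filters are turned into a physical plan, any filter still containing an IN/EXISTS subquery is rejected instead of being silently dropped.

```rust
/// Hypothetical, simplified stand-in for a logical expression
/// (not DataFusion's real `Expr` type).
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    Literal(i64),
    Gt(Box<Expr>, Box<Expr>),
    And(Box<Expr>, Box<Expr>),
    Not(Box<Expr>),
    /// EXISTS (<subquery>)
    Exists,
    /// expr IN (<subquery>)
    InSubquery(Box<Expr>),
}

/// Hypothetical error type mirroring the NotImplemented error named in this issue.
#[derive(Debug, PartialEq)]
enum PlanError {
    NotImplemented(String),
}

/// Returns true if the expression tree contains an IN/EXISTS subquery anywhere.
fn contains_subquery(expr: &Expr) -> bool {
    match expr {
        Expr::Exists | Expr::InSubquery(_) => true,
        Expr::Gt(l, r) | Expr::And(l, r) => contains_subquery(l) || contains_subquery(r),
        Expr::Not(e) => contains_subquery(e),
        Expr::Column(_) | Expr::Literal(_) => false,
    }
}

/// Rejects scan filters that the physical planner cannot execute,
/// rather than letting them disappear from the physical plan.
fn check_scan_filters(filters: &[Expr]) -> Result<(), PlanError> {
    for filter in filters {
        if contains_subquery(filter) {
            return Err(PlanError::NotImplemented(format!(
                "Physical plan does not support logical expression {:?}",
                filter
            )));
        }
    }
    Ok(())
}
```

Under these assumptions, a filter list such as [Expr::Exists] (the shape of full_filters in the plan above) yields a NotImplemented error, while an ordinary comparison filter passes through unchanged.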

Labels

bug, help wanted
