Skip to content

Nested IN/EXISTS subqueries should not be converted into semi-joins if the outer context is a sub-query which wasn't unnested #8265

@dyemanov

Description

@dyemanov

Related to SubQueryConversion = true.

Currently semi-join is limited to the hash join algorithm only and it's known to often work bad if executed repeatedly (e.g. inside a correlated sub-query). Such cases should be detected and the conversion should not be attempted.

Examples:

(1)

select count(*) from A
where field1 in (
  select id from B
  where field2 in (
    select id from C
    where name like '%ABC%' ));

Here both sub-queries can (and should) be unnested.
Expected plan:

PLAN HASH (A NATURAL, HASH (B NATURAL, C NATURAL))

(2)

select count(*) from A
where field1 in (
  select id from B
  where 1=1 or field2 in (
    select id from C
    where name like '%ABC%' ));

Here the inner sub-query cannot be unnested due to OR condition present, but the outer sub-query can.
Expected plan:

PLAN (C INDEX (PK_C))
PLAN HASH (A NATURAL, B NATURAL)

(3)

select count(*) from A
where 1=1 or field1 in (
  select id from B
  where field2 in (
    select id from C
    where name like '%ABC%' ));

Here the outer sub-query cannot be unnested due to OR condition present, so the inner sub-query should not be unnested too.
Expected plan:

PLAN (B INDEX (PK_B))
PLAN (C INDEX (PK_C))
PLAN (A NATURAL)

(4)

select count(*) from A
where 1=1 or field1 in (
  select id from B
  where 1=1 or field2 in (
    select id from C
    where name like '%ABC%' ));

Here both sub-queries cannot be unnested due to OR conditions present.
Expected plan:

PLAN (B INDEX (PK_B))
PLAN (C INDEX (PK_C))
PLAN (A NATURAL)

In the future this heuristics should be replaced with a cost-based approach between hash and nested loop semi-join.

Metadata

Metadata

Assignees

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions