Skip to content

WHERE on join key isn't pushed into left-side primary key index for RIGHT / FULL JOIN #103038

@rienath

Description

@rienath

Problem

For RIGHT / FULL JOIN with a WHERE predicate on the USING column, the predicate isn't used as a primary key index condition on the left input. The left MergeTree reads all granules. INNER and LEFT optimise correctly.

Repro

CREATE TABLE mt (k UInt64) ENGINE = MergeTree ORDER BY k
AS SELECT number FROM numbers(100000000) ORDER BY rand();

EXPLAIN PLAN indexes = 1 SELECT k FROM mt AS l RIGHT JOIN (SELECT 1 AS k) AS r USING (k) WHERE k = 1;
-- ... Granules: 120/120

EXPLAIN PLAN indexes = 1 SELECT k FROM mt AS l FULL  JOIN (SELECT 1 AS k) AS r USING (k) WHERE k = 1;
-- ... Granules 120/120

Workaround

We can push the predicate manually using PREWHERE. The above queries become 6x and 14x times faster respectively.

RIGHT JOIN:

SELECT k FROM mt AS l RIGHT JOIN (SELECT 1 AS k) AS r USING (k) WHERE k = 1;
-- 1 row in set. Elapsed: 0.058 sec.

SELECT k FROM mt AS l RIGHT JOIN (SELECT 1 AS k) AS r USING (k) PREWHERE k = 1;
-- 1 row in set. Elapsed: 0.010 sec.

FULL JOIN:

SELECT k FROM mt AS l FULL  JOIN (SELECT 1 AS k) AS r USING (k) WHERE k = 1;
-- 1 row in set. Elapsed: 0.142 sec.

SELECT k FROM mt AS l FULL  JOIN (SELECT 1 AS k) AS r USING (k) PREWHERE k = 1;
-- 1 row in set. Elapsed: 0.010 sec.

Misc

A naive fix for won't do as we can have a JOIN that significantly reduces cardinality and a predicate that's not very selective. Joining first is a better approach then. This mostly applies to FULL JOIN

Metadata

Metadata

Assignees

No one assigned

    Labels

    comp-joinsJOINs end-to-end (planning hooks + runtime join operators/algorithms). Single bucket to avoid pla...comp-query-analyzerSemantic analysis + logical/physical planning + rewrites (Analyzer and Planner modules).performance

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions