Problem
For a RIGHT or FULL JOIN with a WHERE predicate on the USING column, the predicate is not used as a primary key index condition on the left input, so the left MergeTree table reads all granules. INNER and LEFT JOIN are optimised correctly.
Repro
CREATE TABLE mt (k UInt64) ENGINE = MergeTree ORDER BY k
AS SELECT number FROM numbers(100000000) ORDER BY rand();
EXPLAIN PLAN indexes = 1 SELECT k FROM mt AS l RIGHT JOIN (SELECT 1 AS k) AS r USING (k) WHERE k = 1;
-- ... Granules: 120/120
EXPLAIN PLAN indexes = 1 SELECT k FROM mt AS l FULL JOIN (SELECT 1 AS k) AS r USING (k) WHERE k = 1;
-- ... Granules: 120/120
Workaround
We can push the predicate manually using PREWHERE. With it, the queries above become roughly 6x and 14x faster, respectively.
RIGHT JOIN:
SELECT k FROM mt AS l RIGHT JOIN (SELECT 1 AS k) AS r USING (k) WHERE k = 1;
-- 1 row in set. Elapsed: 0.058 sec.
SELECT k FROM mt AS l RIGHT JOIN (SELECT 1 AS k) AS r USING (k) PREWHERE k = 1;
-- 1 row in set. Elapsed: 0.010 sec.
FULL JOIN:
SELECT k FROM mt AS l FULL JOIN (SELECT 1 AS k) AS r USING (k) WHERE k = 1;
-- 1 row in set. Elapsed: 0.142 sec.
SELECT k FROM mt AS l FULL JOIN (SELECT 1 AS k) AS r USING (k) PREWHERE k = 1;
-- 1 row in set. Elapsed: 0.010 sec.
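To check that the speedup comes from primary key pruning on the left input rather than from something else, the EXPLAIN from the repro can be rerun on the PREWHERE variants. This is only a sketch that reuses the table and queries above. The output is not shown because it was not part of the original report.

```sql
-- Same plans as in the repro, but with the predicate moved to PREWHERE.
-- If the workaround prunes by primary key, the Granules line for mt
-- should report fewer selected granules than the total.
EXPLAIN PLAN indexes = 1 SELECT k FROM mt AS l RIGHT JOIN (SELECT 1 AS k) AS r USING (k) PREWHERE k = 1;
EXPLAIN PLAN indexes = 1 SELECT k FROM mt AS l FULL JOIN (SELECT 1 AS k) AS r USING (k) PREWHERE k = 1;
```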
Misc
A naive fix won't do: a JOIN can significantly reduce cardinality while the predicate is not very selective, in which case joining first is the better approach. This mostly applies to FULL JOIN.