Describe the situation
When indexHint function is used with a complementary predicate, ClickHouse doesn't push the complementary predicates to Prewhere. The query needs to be restructured so that CH can use the prewhere optimization. Ideally, CH should remove the indexHint functions from the query if they are not needed to be executed and only used for filtering.
How to reproduce
CREATE TABLE test_indexHint_prewhere
(
`id` UInt32,
`colA` String,
`colB` String,
INDEX colA_tokens_idx tokens(colA) TYPE bloom_filter GRANULARITY 1,
INDEX colB_tokens_idx tokens(colB) TYPE bloom_filter GRANULARITY 1
)
ENGINE = MergeTree
ORDER BY id
SETTINGS index_granularity = 1;
insert into test_indexHint_prewhere SELECT number, randomPrintableASCII(30), randomPrintableASCII(40) FROM system.numbers LIMIT 100;
For the following query:
Query 1:
select * from test_indexHint_prewhere where id in (62,88,89,67) and ((indexHint(has(tokens(colA), 'ymo82')) and colA like '%ymo82%') OR (indexHint(has(tokens(colB), 'dKappNQY6I')) and colB like '%dKappNQY6I%'))
The corresponding test log shows the following:
2025.01.22 22:20:19.314012 [ 15942738 ] {ce06c3ac-f0ed-4fc3-9785-add535ecd49e} <Trace> MergeTreeSelectProcessor: PREWHERE condition was split into 1 steps
2025.01.22 22:20:19.314055 [ 15942738 ] {ce06c3ac-f0ed-4fc3-9785-add535ecd49e} <Test> MergeTreeSelectProcessor: PREWHERE conditions: "in(__table1.id, __set_UInt32_10358064706842451232_4242968589107113876)", Original PREWHERE DAG:
But if the query is restructured:
Query 2:
select * from test_indexHint_prewhere where id in (62,88,89,67) and (indexHint(has(tokens(colA), 'ymo82')) OR indexHint(has(tokens(colB), 'dKappNQY6I'))) and (colA like '%ymo82%' or colB like '%dKappNQY6I%')
The corresponding test log shows the following:
2025.01.22 22:21:56.223222 [ 15942738 ] {51b12568-580c-4765-a499-3217047d9428} <Trace> MergeTreeSelectProcessor: PREWHERE condition was split into 2 steps
2025.01.22 22:21:56.223298 [ 15942738 ] {51b12568-580c-4765-a499-3217047d9428} <Test> MergeTreeSelectProcessor: PREWHERE conditions: "or(like(__table1.colA, '%ymo82%'_String), like(__table1.colB, '%dKappNQY6I%'_String))", "and(or(like(__table1.colA, '%ymo82%'_String), like(__table1.colB, '%dKappNQY6I%'_String)), in(__table1.id, __set_UInt32_10358064706842451232_4242968589107113876))", Original PREWHERE DAG:
Which pushes the predicates to prewhere. Ideally, combining the indexHint with its complementary predicate is natural, like query 1. Is it possible to exclude the indexHint predicates before QueryPlanOptimizePrewhere?
Expected performance
No response
Additional context
No response
Describe the situation
When indexHint function is used with a complementary predicate, ClickHouse doesn't push the complementary predicates to Prewhere. The query needs to be restructured so that CH can use the prewhere optimization. Ideally, CH should remove the indexHint functions from the query if they are not needed to be executed and only used for filtering.
How to reproduce
For the following query:
Query 1:
The corresponding test log shows the following:
But if the query is restructured:
Query 2:
The corresponding test log shows the following:
Which pushes the predicates to prewhere. Ideally, combining the indexHint with its complementary predicate is natural, like query 1. Is it possible to exclude the indexHint predicates before QueryPlanOptimizePrewhere?
Expected performance
No response
Additional context
No response