
Performance of WHERE vs PREWHERE when selecting by primary key #2601

Closed
filimonov opened this issue Jul 6, 2018 · 6 comments

filimonov (Contributor) commented Jul 6, 2018

According to the docs: "Keep in mind that it does not make much sense for PREWHERE to only specify those columns that have an index, because when using an index, only the data blocks that match the index are read."

In practice there is a significant difference between WHERE and PREWHERE when selecting by primary key. If the PK condition is inside WHERE, ClickHouse reads much more data and responds much slower than in the PREWHERE case.

If this is expected behaviour, then why does optimize_move_to_prewhere skip PK fields?

Test case:

CREATE TABLE where_vs_prewhere_test ENGINE = MergeTree() ORDER BY number PARTITION BY tuple() SETTINGS index_granularity = 128
AS SELECT
    number,
    arrayStringConcat( arrayMap( x -> toString( rand64(x) ), range( 20 )) ) AS a,
    arrayStringConcat( arrayMap( x -> toString( rand64(100+x) ), range( 20 )) ) AS b,
    arrayStringConcat( arrayMap( x -> toString( rand64(200+x) ), range( 20 )) ) AS c
FROM numbers(10000);

SET max_bytes_to_read = 20000;

SELECT * FROM where_vs_prewhere_test PREWHERE number < 10; -- works
SELECT * FROM where_vs_prewhere_test WHERE number < 10;    -- reads too much, and fails

DROP TABLE where_vs_prewhere_test;
fillest commented Oct 29, 2018

Another example:
CREATE TABLE IF NOT EXISTS testdb.test (time DateTime, text String) ENGINE = MergeTree() PARTITION BY toDate(time) ORDER BY (time)
I inserted 17,280,000,000 rows, 99 GB uncompressed in total (which compressed to ~2 GB).
SELECT time, text FROM testdb.test WHERE time > 1540581226 ORDER BY time ASC LIMIT 100 takes, for example, 40 s to complete. (The time value passed covers the whole dataset.)
SELECT time, text FROM testdb.test PREWHERE time > 1540581226 ORDER BY time ASC LIMIT 100 takes 10 s (4x faster). With WHERE I also see memcpy at the top of perf top; with PREWHERE I do not (most other symbols are not resolved). Two cores are fully utilized during the query in both cases.

alex-zaitsev (Contributor) commented Nov 16, 2018

The table has an index on (tags_id, created_at) and 100M rows. PREWHERE on the index columns improves performance 3x.

:) SELECT sum(cityHash64(*)) a FROM benchmark_cpu  where (tags_id, created_at) in (select tags_id, max(created_at) from benchmark_cpu group by tags_id);

1 rows in set. Elapsed: 2.132 sec. Processed 33.52 million rows, 4.46 GB (15.72 million rows/s., 2.09 GB/s.) 
:) SELECT sum(cityHash64(*)) a FROM benchmark_cpu  prewhere (tags_id, created_at) in (select tags_id, max(created_at) from benchmark_cpu group by tags_id);

1 rows in set. Elapsed: 0.680 sec. Processed 33.52 million rows, 268.69 MB (49.33 million rows/s., 395.37 MB/s.) 

alex-zaitsev (Contributor) commented
I checked the trace; everything is the same. The only difference is the query pipeline, which has a 'Filter' stage in the WHERE case:

where, slow:

<Debug> executeQuery: Query pipeline:
Expression
 Expression
  ParallelAggregating
   Expression × 4
    Filter
     MergeTreeThread

prewhere, fast:

<Debug> executeQuery: Query pipeline:
Expression
 Expression
  ParallelAggregating
   Expression × 4
    MergeTreeThread

@alexey-milovidov alexey-milovidov self-assigned this Nov 24, 2018
alexey-milovidov added a commit that referenced this issue Nov 24, 2018
proller pushed a commit to proller/ClickHouse that referenced this issue Nov 30, 2018
alexey-milovidov added a commit that referenced this issue Dec 4, 2018
alexey-milovidov (Member) commented

> According to docs: "Keep in mind that it does not make much sense for PREWHERE to only specify those columns that have an index, because when using an index, only the data blocks that match the index are read."

This statement is obsolete; I have removed it from the docs.

Sometimes it does make sense to use PREWHERE even for indexed columns.
The index selects ranges of data (granules containing index_granularity rows) that can match the condition.

PREWHERE then reads the columns specified in the PREWHERE condition and filters them by that condition. This can narrow the ranges to be read further, by cutting off their "tails".

The other columns are then read only for these narrower ranges, which can span fewer than index_granularity rows. This helps especially when the other columns are heavy enough.

(The possibility to read ranges smaller than one granule was implemented about a year ago in #903.)
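The mechanics described above can be sketched as a toy model (plain Python, not ClickHouse internals; the granule size, column contents, and helper names are invented for illustration). The index prunes granules by their key range; WHERE then reads every column for all surviving granules, while PREWHERE reads only the key column first and touches the heavy column just for the matching rows:

```python
# Toy model of index pruning + PREWHERE filtering (not ClickHouse code).
GRANULE = 4  # rows per granule (stands in for index_granularity)

# One "part": a sorted key column plus a heavy payload column.
key = list(range(20))                        # primary-key column, sorted
heavy = [f"payload-{k}" * 10 for k in key]   # a wide column we want to avoid reading

def granules():
    """Yield (start, end) row ranges, one per granule."""
    for start in range(0, len(key), GRANULE):
        yield start, min(start + GRANULE, len(key))

def index_prune(pred_min, pred_max):
    """Index step: keep granules whose [min, max] key range may match."""
    kept = []
    for start, end in granules():
        g_min, g_max = key[start], key[end - 1]
        if g_max >= pred_min and g_min <= pred_max:
            kept.append((start, end))
    return kept

def run(pred, pred_min, pred_max, use_prewhere):
    ranges = index_prune(pred_min, pred_max)
    rows_read_heavy = 0
    result = []
    for start, end in ranges:
        if use_prewhere:
            # PREWHERE: read only the key column, build a row mask,
            # then read the heavy column just for the matching rows.
            matching = [i for i in range(start, end) if pred(key[i])]
            rows_read_heavy += len(matching)
            result += [heavy[i] for i in matching]
        else:
            # WHERE: read all columns for the whole granule, filter afterwards.
            rows_read_heavy += end - start
            result += [heavy[i] for i in range(start, end) if pred(key[i])]
    return result, rows_read_heavy

# Condition: key < 10. The index keeps granules [0,4), [4,8), [8,12); PREWHERE
# additionally trims the tail of the last granule before the heavy column is read.
res_w, cost_w = run(lambda k: k < 10, 0, 9, use_prewhere=False)
res_p, cost_p = run(lambda k: k < 10, 0, 9, use_prewhere=True)
assert res_w == res_p        # same answer either way
print(cost_w, cost_p)        # prints "12 10": WHERE reads more heavy-column rows
```

With a tiny key column and wide payload columns, the gap between 12 and 10 heavy rows is what scales into the multi-GB differences reported in the benchmarks above.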

> If this is expected behaviour, then why does optimize_move_to_prewhere skip PK fields?

Actually, the case where PREWHERE helps for indexed fields is not frequent. If we allowed conditions on indexed fields to be moved to PREWHERE, they would almost always be selected for PREWHERE, because these fields are usually highly compressed. But that usually yields less benefit than moving other columns.

We will leave the current behaviour as is until we can implement a smarter solution.

> Test case

Your test case also has a complication: the amount of filtered data is not accounted correctly against the max_bytes_to_read limit.

filimonov (Contributor, Author) commented Dec 6, 2018

> This statement is obsolete; I have removed it from the docs.
> Sometimes it does make sense to use PREWHERE even for indexed columns.
> The index selects ranges of data (granules containing index_granularity rows) that can match the condition.
> The possibility to read ranges smaller than one granule was implemented about a year ago in #903.

Cool. :) Didn't know that.

> Actually, the case where PREWHERE helps for indexed fields is not frequent. If we allowed conditions on indexed fields to be moved to PREWHERE, they would almost always be selected for PREWHERE, because these fields are usually highly compressed. But that usually yields less benefit than moving other columns.

I see. So generally it would lead to reading the PK column for all selected granules, and if the PK condition matches a lot of granules, that is a lot of extra work.

> We will leave the current behaviour as is until we can implement a smarter solution.

That makes sense. A possible optimization: check the PREWHERE condition only for granules that match the PK condition partially (i.e. when marks[n-1] <= PK_condition <= marks[n] and marks[n-1] < marks[n]). In other words, skip the PREWHERE check when the whole granule matches (or does not match) the PK condition.
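The idea could be sketched like this (toy Python, not ClickHouse code; `marks`, `classify`, and the labels are invented names; `marks` holds the first key of each granule, as in a sparse index). Only granules that straddle a boundary of the key range need the per-row check; granules fully inside the range match entirely and could skip it:

```python
# Toy sketch of the proposed optimization (not ClickHouse internals):
# label each granule so that only partially-matching ("boundary") granules
# get a per-row filter, while fully-matching granules skip it entirely.
GRANULE = 4
key = list(range(20))                      # sorted primary-key column
marks = key[::GRANULE] + [key[-1] + 1]     # sparse index: first key per granule

def classify(lo, hi):
    """For the condition lo <= key < hi, label each granule."""
    labels = []
    for n in range(len(marks) - 1):
        g_min, g_max = marks[n], marks[n + 1] - 1
        if g_max < lo or g_min >= hi:
            labels.append("skip")          # no row can match: pruned by the index
        elif lo <= g_min and g_max < hi:
            labels.append("all")           # every row matches: no per-row check
        else:
            labels.append("filter")        # partial overlap: run the per-row check
    return labels

# Condition: 2 <= key < 10.  Granules cover [0,3] [4,7] [8,11] [12,15] [16,19].
labels = classify(2, 10)
print(labels)   # ['filter', 'all', 'filter', 'skip', 'skip']
```

Here only the two boundary granules would pay for the per-row condition; the fully-covered middle granule is passed through untouched.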

> Your test case also has a complication: the amount of filtered data is not accounted correctly against the max_bytes_to_read limit.

In the cases where PREWHERE is more efficient than WHERE, the biggest difference is the amount of data read (I checked the server logs / query_log). That's why I used max_bytes_to_read to make the test fail (and it fails even if the estimate is inaccurate).

filimonov (Contributor, Author) commented

#7769
