
Performance of WHERE vs PREWHERE when selecting by primary key #2601

Closed
filimonov opened this issue Jul 6, 2018 · 6 comments

filimonov (Contributor) commented Jul 6, 2018

According to the docs: "Keep in mind that it does not make much sense for PREWHERE to only specify those columns that have an index, because when using an index, only the data blocks that match the index are read."

In practice there is a significant difference between WHERE and PREWHERE when selecting by primary key. If the PK condition is inside WHERE, ClickHouse reads much more data and responds much slower than in the PREWHERE case.

If this is expected behaviour, then why does optimize_move_to_prewhere skip PK fields?

Test case:

CREATE TABLE where_vs_prewhere_test ENGINE = MergeTree() ORDER BY number PARTITION BY tuple() SETTINGS index_granularity = 128
AS SELECT
    number,
    arrayStringConcat( arrayMap( x -> toString( rand64(x) ), range( 20 )) ) AS a,
    arrayStringConcat( arrayMap( x -> toString( rand64(100+x) ), range( 20 )) ) AS b,
    arrayStringConcat( arrayMap( x -> toString( rand64(200+x) ), range( 20 )) ) AS c
FROM numbers(10000);

SET max_bytes_to_read = 20000;

SELECT * FROM where_vs_prewhere_test PREWHERE number < 10; -- works
SELECT * FROM where_vs_prewhere_test WHERE number < 10;    -- reads too much, and fails

DROP TABLE where_vs_prewhere_test;
fillest commented Oct 29, 2018

Another example:
CREATE TABLE IF NOT EXISTS testdb.test (time DateTime, text String) ENGINE = MergeTree() PARTITION BY toDate(time) ORDER BY (time)
I inserted 17,280,000,000 rows, 99 GB uncompressed in total (which compressed to ~2 GB).
SELECT time, text FROM testdb.test WHERE time > 1540581226 ORDER BY time ASC LIMIT 100 takes, for example, 40 s to complete. (The time value passed covers the whole dataset.)
SELECT time, text FROM testdb.test PREWHERE time > 1540581226 ORDER BY time ASC LIMIT 100 takes 10 s (4x faster). With WHERE I also see memcpy at the top of perf top; with PREWHERE I do not (most other symbols are not resolved). Two cores are fully utilized during the query in both cases.

alex-zaitsev (Contributor) commented Nov 16, 2018

The table has an index on (tags_id, created_at) and 100M rows. PREWHERE on the index columns improves performance 3x.

:) SELECT sum(cityHash64(*)) a FROM benchmark_cpu  where (tags_id, created_at) in (select tags_id, max(created_at) from benchmark_cpu group by tags_id);

1 rows in set. Elapsed: 2.132 sec. Processed 33.52 million rows, 4.46 GB (15.72 million rows/s., 2.09 GB/s.) 
:) SELECT sum(cityHash64(*)) a FROM benchmark_cpu  prewhere (tags_id, created_at) in (select tags_id, max(created_at) from benchmark_cpu group by tags_id);

1 rows in set. Elapsed: 0.680 sec. Processed 33.52 million rows, 268.69 MB (49.33 million rows/s., 395.37 MB/s.) 

alex-zaitsev (Contributor) commented
I checked the trace; everything is the same. The only difference is the query pipeline, which has a 'Filter' stage in the WHERE case:

where, slow:

<Debug> executeQuery: Query pipeline:
Expression
 Expression
  ParallelAggregating
   Expression × 4
    Filter
     MergeTreeThread

prewhere, fast:

<Debug> executeQuery: Query pipeline:
Expression
 Expression
  ParallelAggregating
   Expression × 4
    MergeTreeThread

@alexey-milovidov alexey-milovidov self-assigned this Nov 24, 2018
alexey-milovidov added a commit that referenced this issue Nov 24, 2018
proller pushed a commit to proller/ClickHouse that referenced this issue Nov 30, 2018
alexey-milovidov added a commit that referenced this issue Dec 4, 2018
alexey-milovidov (Member) commented

> According to docs: "Keep in mind that it does not make much sense for PREWHERE to only specify those columns that have an index, because when using an index, only the data blocks that match the index are read."

This statement is obsolete; I have removed it from the docs.

Sometimes it does make sense to use PREWHERE even for indexed columns.
The index selects ranges of data (granules containing index_granularity rows) that can match the condition.

PREWHERE then reads the columns specified in the PREWHERE condition and filters them by that condition. This can narrow the ranges to be read further, by cutting off their "tails".

The other columns are then read only for these narrower ranges, which can span fewer than index_granularity rows. This helps especially when the other columns are heavy enough.

(The possibility to read ranges smaller than one granule was implemented about a year ago in #903.)
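The mechanics described above can be sketched as a toy model (plain Python, not ClickHouse internals; the granule size, column contents, and helper names are invented for illustration). The index prunes granules by their key range; WHERE then reads every column for all surviving granules, while PREWHERE reads only the key column first and touches the heavy column just for the matching rows:

```python
# Toy model of index pruning + PREWHERE filtering (not ClickHouse code).
GRANULE = 4  # rows per granule (stands in for index_granularity)

# One "part": a sorted key column plus a heavy payload column.
key = list(range(20))                        # primary-key column, sorted
heavy = [f"payload-{k}" * 10 for k in key]   # a wide column we want to avoid reading

def granules():
    """Yield (start, end) row ranges, one per granule."""
    for start in range(0, len(key), GRANULE):
        yield start, min(start + GRANULE, len(key))

def index_prune(pred_min, pred_max):
    """Index step: keep granules whose [min, max] key range may match."""
    kept = []
    for start, end in granules():
        g_min, g_max = key[start], key[end - 1]
        if g_max >= pred_min and g_min <= pred_max:
            kept.append((start, end))
    return kept

def run(pred, pred_min, pred_max, use_prewhere):
    ranges = index_prune(pred_min, pred_max)
    rows_read_heavy = 0
    result = []
    for start, end in ranges:
        if use_prewhere:
            # PREWHERE: read only the key column, build a row mask,
            # then read the heavy column just for the matching rows.
            matching = [i for i in range(start, end) if pred(key[i])]
            rows_read_heavy += len(matching)
            result += [heavy[i] for i in matching]
        else:
            # WHERE: read all columns for the whole granule, filter afterwards.
            rows_read_heavy += end - start
            result += [heavy[i] for i in range(start, end) if pred(key[i])]
    return result, rows_read_heavy

# Condition: key < 10. The index keeps granules [0,4), [4,8), [8,12); PREWHERE
# additionally trims the tail of the last granule before the heavy column is read.
res_w, cost_w = run(lambda k: k < 10, 0, 9, use_prewhere=False)
res_p, cost_p = run(lambda k: k < 10, 0, 9, use_prewhere=True)
assert res_w == res_p        # same answer either way
print(cost_w, cost_p)        # prints "12 10": WHERE reads more heavy-column rows
```

With a tiny key column and wide payload columns, the gap between 12 and 10 heavy rows is what scales into the multi-GB differences reported in the benchmarks above.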

> If this is expected behaviour, then why does optimize_move_to_prewhere skip PK fields?

Actually, the case where PREWHERE helps for indexed fields is not frequent. If we allowed conditions on indexed fields to be moved to PREWHERE, they would almost always be selected for PREWHERE, because these fields are usually highly compressed. But that usually yields less benefit than moving other columns.

We will leave the current behaviour as is until we can implement a smarter solution.

> Test case

Your test case also has a complication: the amount of filtered data is not accounted correctly against the max_bytes_to_read limit.

filimonov (Contributor, Author) commented Dec 6, 2018

> This statement is obsolete; I have removed it from the docs.
> Sometimes it does make sense to use PREWHERE even for indexed columns.
> The index selects ranges of data (granules containing index_granularity rows) that can match the condition.
> The possibility to read ranges smaller than one granule was implemented about a year ago in #903.

Cool. :) Didn't know that.

> Actually, the case where PREWHERE helps for indexed fields is not frequent. If we allowed conditions on indexed fields to be moved to PREWHERE, they would almost always be selected for PREWHERE, because these fields are usually highly compressed. But that usually yields less benefit than moving other columns.

I see. So generally it would lead to reading the PK column for all selected granules, and if the PK condition matches a lot of granules, that is a lot of extra work.

> We will leave the current behaviour as is until we can implement a smarter solution.

That makes sense. A possible optimization: check the PREWHERE condition only for granules that match the PK condition partially (i.e. when marks[n-1] <= PK_condition <= marks[n] and marks[n-1] < marks[n]). In other words, skip the PREWHERE check when the whole granule matches (or does not match) the PK condition.
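The idea could be sketched like this (toy Python, not ClickHouse code; `marks`, `classify`, and the labels are invented names; `marks` holds the first key of each granule, as in a sparse index). Only granules that straddle a boundary of the key range need the per-row check; granules fully inside the range match entirely and could skip it:

```python
# Toy sketch of the proposed optimization (not ClickHouse internals):
# label each granule so that only partially-matching ("boundary") granules
# get a per-row filter, while fully-matching granules skip it entirely.
GRANULE = 4
key = list(range(20))                      # sorted primary-key column
marks = key[::GRANULE] + [key[-1] + 1]     # sparse index: first key per granule

def classify(lo, hi):
    """For the condition lo <= key < hi, label each granule."""
    labels = []
    for n in range(len(marks) - 1):
        g_min, g_max = marks[n], marks[n + 1] - 1
        if g_max < lo or g_min >= hi:
            labels.append("skip")          # no row can match: pruned by the index
        elif lo <= g_min and g_max < hi:
            labels.append("all")           # every row matches: no per-row check
        else:
            labels.append("filter")        # partial overlap: run the per-row check
    return labels

# Condition: 2 <= key < 10.  Granules cover [0,3] [4,7] [8,11] [12,15] [16,19].
labels = classify(2, 10)
print(labels)   # ['filter', 'all', 'filter', 'skip', 'skip']
```

Here only the two boundary granules would pay for the per-row condition; the fully-covered middle granule is passed through untouched.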

> Your test case also has a complication: the amount of filtered data is not accounted correctly against the max_bytes_to_read limit.

In the cases where PREWHERE is more efficient than WHERE, the biggest difference is the amount of data read (I checked the server logs / query_log). That's why I used max_bytes_to_read to make the test fail (and it fails even if the estimate is inaccurate).

filimonov (Contributor, Author) commented

#7769
