-
Notifications
You must be signed in to change notification settings - Fork 8.1k
Open
Labels
Description
Read in order optimization reads at least one granule from every source part. This can be improved in case if data between parts is almost globally sorted, which is typical for time-series usecase.
As an example, create a table.
create table t (x UInt64, y String) engine = MergeTree order by x;
insert into t select number, toString(number) from numbers(8192 * 3);
insert into t select number + 8192 * 3, toString(number) from numbers(8192 * 3);
This table has two parts which do not intersect by primary key.
select name, marks, rows from system.parts where table = 't' and database = 'default'
SELECT
name,
marks,
rows
FROM system.parts
WHERE (table = 't') AND (database = 'default')
Query id: f4d71ad3-e89f-4eaa-9218-21c25ba9c038
┌─name──────┬─marks─┬──rows─┐
│ all_1_1_0 │ 4 │ 24576 │
│ all_2_2_0 │ 4 │ 24576 │
└───────────┴───────┴───────┘
Run a query.
select y from t where toUInt64(y) > 8192 * 2 order by x limit 4 settings max_block_size = 8192
SELECT y
FROM t
WHERE toUInt64(y) > (8192 * 2)
ORDER BY x ASC
LIMIT 4
SETTINGS max_block_size = 8192
Query id: ae5c18f8-efa2-4093-814b-bc9448e7374f
┌─y─────┐
│ 16385 │
│ 16386 │
│ 16387 │
│ 16388 │
└───────┘
This query uses read-in-order optimization, and planner is going to read all the data from parts.
2022.10.25 15:32:22.021478 [ 40507 ] {ae5c18f8-efa2-4093-814b-bc9448e7374f} <Trace> MergeTreeInOrderSelectProcessor: Reading 2 ranges in order from part all_2_2_0, approx. 24576 rows starting from 0
2022.10.25 15:32:22.021520 [ 40507 ] {ae5c18f8-efa2-4093-814b-bc9448e7374f} <Trace> MergeTreeInOrderSelectProcessor: Reading 2 ranges in order from part all_1_1_0, approx. 24576 rows starting from 0
In this case, a query is a full scan.
2022.10.25 15:32:22.023843 [ 40507 ] {ae5c18f8-efa2-4093-814b-bc9448e7374f} <Information> executeQuery: Read 49152 rows, 778.30 KiB in 0.003676755 sec., 13368309 rows/sec., 206.72 MiB/sec.
Actually, all the data needed contains only in the first part. We did not have to read all_2_2_0 at all. We can improve it by analyzing PK.
cp clickhouse/data/default/t/all_1_1_0/primary.idx clickhouse/user_files/1.primary.idx
cp clickhouse/data/default/t/all_2_2_0/primary.idx clickhouse/user_files/2.primary.idx
select x from file('1.primary.idx', 'RowBinary', 'x UInt64')
SELECT x
FROM file('1.primary.idx', 'RowBinary', 'x UInt64')
Query id: 25a84d26-c8a6-4c20-8665-c1e8b48e9703
┌─────x─┐
│ 0 │
│ 8192 │
│ 16384 │
│ 24575 │
└───────┘
select x from file('2.primary.idx', 'RowBinary', 'x UInt64')
SELECT x
FROM file('2.primary.idx', 'RowBinary', 'x UInt64')
Query id: 0bc6dd52-17d2-4588-b3b1-f4b7a00370fc
┌─────x─┐
│ 24576 │
│ 32768 │
│ 40960 │
│ 49151 │
└───────┘
From PK we can decide that we don't even need to start reading all_2_2_0 till we read until x = 24575 from all_1_1_0.
Reactions are currently unavailable