A SELECT query with aggregation and a join on an external dictionary runs slowly when the join expression has no explicit cast. The table and the dictionary both have UInt64 ids. With an explicit cast such as table.dict_id = toUInt64(dict.id), the query runs fast.
Does it reproduce on recent release?
Yes
How to reproduce
ClickHouse server version to use: 21.4.6.55
create table test (key UInt64, value String) engine=MergeTree order by key;
insert into test select number, '' from numbers(1000000);
create dictionary test_dict (key UInt64, value String)
primary key key source(clickhouse(table test db 'default' user 'default'))
lifetime(min 0 max 0) layout(hashed());
-- Select without GROUP BY
-- 1.1
select test.key, test_dict.value from test join test_dict on test.key = test_dict.key limit 1;
-- Elapsed: 0.038 sec.
-- 1.2
select test.key, test_dict.value from test join test_dict on test.key = toUInt64(test_dict.key) limit 1;
-- Elapsed: 0.316 sec. Processed 1.00 million rows, 17.00 MB (3.17 million rows/s., 53.86 MB/s.)
-- 1.3
select test.key, dictGetString('test_dict', 'value', test.key) from test limit 1;
-- Elapsed: 0.007 sec.
-- Select with GROUP BY
-- 2.1
select test.key from test join test_dict on test.key = test_dict.key group by test.key limit 1;
-- Elapsed: 1286.466 sec. Processed 1.00 million rows, 8.00 MB (777.32 rows/s., 6.22 KB/s.)
-- 2.2
select test.key from test join test_dict on test.key = toUInt64(test_dict.key) group by test.key limit 1;
-- Elapsed: 0.169 sec. Processed 2.00 million rows, 16.00 MB (11.81 million rows/s., 94.49 MB/s.)
-- 2.3
select test.key, dictGetString('test_dict', 'value', test.key) from test group by test.key limit 1;
-- Elapsed: 0.056 sec. Processed 1.00 million rows, 8.00 MB (17.91 million rows/s., 143.32 MB/s.)
Expected behavior
Both queries, with and without the explicit cast in the join, are expected to run fast.
vkosh added the "bug" label (Confirmed user-visible misbehaviour in official release) on May 17, 2021
@vkosh You have LIMIT 1 in the queries without GROUP BY, so they return the first record as soon as it has been read.
In contrast, the queries with GROUP BY have to finish reading and aggregating all the data before returning a result.
Workaround: remove GROUP BY and replace it with DISTINCT.
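Applied to query 2.1 from the report, the workaround might look roughly like the sketch below. It is untested: it reuses the test table and test_dict dictionary from the reproduction steps, assumes DISTINCT is an acceptable substitute because the query uses no aggregate functions, and no timing was measured for it.

```sql
-- Query 2.1 with GROUP BY replaced by DISTINCT (sketch, not benchmarked).
-- Assumes the `test` table and `test_dict` dictionary from the reproduction steps.
select distinct test.key
from test
join test_dict on test.key = test_dict.key
limit 1;
```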