Hi, I am trying to use AggregatingMergeTree to store the unique users for each day so that I can calculate the intersection between days,
but as the number of users grows, the query becomes slower.
minimal sample:
Simulation Data
DROP TABLE IF EXISTS test.dayliyuniqusers;
CREATE TABLE test.dayliyuniqusers
(
intotime Date,
pageid UInt32,
users AggregateFunction(groupUniqArray, UInt64)
)
ENGINE = AggregatingMergeTree
PARTITION BY (intotime)
ORDER BY intotime
SETTINGS index_granularity = 512 ;
INSERT INTO test.dayliyuniqusers (intotime, pageid, users)
SELECT
'2018-02-21',
2538,
groupUniqArrayState(number) AS users
FROM
(
select number from system.numbers limit 30000000
);
INSERT INTO test.dayliyuniqusers (intotime, pageid, users)
SELECT
'2018-02-22',
2538,
groupUniqArrayState(number) AS users
FROM
(
select number from system.numbers limit 5000000
);
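Before running the intersection, the stored states can be sanity-checked per day. This is a minimal sketch that assumes only the table defined above. The alias uniq_users is illustrative. Since the inserted numbers are all distinct, it should report 30000000 for 2018-02-21 and 5000000 for 2018-02-22.

```sql
-- Merge the stored groupUniqArray states for each day and count the elements.
SELECT
    intotime,
    length(groupUniqArrayMerge(users)) AS uniq_users
FROM test.dayliyuniqusers
GROUP BY intotime
ORDER BY intotime;
```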
Query
SELECT
day1,
day2
FROM
(
SELECT
groupUniqArrayMergeIf(users, intotime = '2018-02-21') AS aa,
groupUniqArrayMergeIf(users, intotime = '2018-02-22') AS bb,
length(arrayFilter(x -> (x >= 2), arrayEnumerateUniq(arrayConcat(aa, aa)))) AS day1,
length(arrayFilter(x -> (x >= 2), arrayEnumerateUniq(arrayConcat(aa, bb)))) AS day2
FROM test.dayliyuniqusers
)
┌─────day1─┬────day2─┐
│ 30000000 │ 5000000 │
└──────────┴─────────┘
1 rows in set. Elapsed: 11.073 sec.
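The array expression in the query counts values that occur in both arrays. Each groupUniqArray result contains no duplicates, so any value whose occurrence number in the concatenation reaches 2 must be present in both inputs. A minimal sketch on small literal arrays (the literals and the aliases occurrence and common are illustrative only, not taken from the table above):

```sql
-- arrayEnumerateUniq numbers each occurrence of a value,
-- so for [1, 2, 3, 2, 3, 4] it yields [1, 1, 1, 2, 2, 1].
-- Keeping numbers >= 2 leaves one entry per value shared by both arrays.
SELECT
    arrayEnumerateUniq(arrayConcat([1, 2, 3], [2, 3, 4])) AS occurrence,
    length(arrayFilter(x -> (x >= 2), occurrence)) AS common;
-- common = 2 (the shared values 2 and 3)
```

The same reasoning explains the result above: arrayConcat(aa, aa) repeats every value once, so day1 is simply the size of aa, while day2 is the size of the intersection of aa and bb.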
lamberken changed the title from "[Question] groupUniqArrayMergeIf, When the array length becomes longer, the speed becomes slower" to "[Question] groupUniqArrayMergeIf, when the array length becomes longer, the speed becomes slower" on Mar 27, 2018
Question
How can I optimize this query? Thank you.