MultiChannelGroupByHash wastes too much CPU compared with Impala #16443

dengweisysu · 2021-07-20T02:45:59Z

This is for a query that counts distinct values over about 10 billion rows, like the one below:

```sql
SELECT *
FROM (
    SELECT fdate AS a_ds,
           count(distinct(userid)) AS index_0_7366
    FROM (
        SELECT t1.fdate AS fdate,
               t1.userid AS userid
        FROM (
            SELECT t0.fdate AS fdate,
                   t0.userid AS userid
            FROM test.test_table t0
            WHERE t0.fdate >= 20210401 AND t0.fdate < 20210531
        ) t1
    ) a
    GROUP BY a_ds
) t_ret
ORDER BY a_ds DESC
LIMIT 5001
```
Both engines ran on the same hardware (83 nodes, each with 48 cores / 96 processors and 256 GB of memory).

Presto (0.242): 22 seconds
Impala (3.4, multi-threaded): 28 seconds

Although Presto runs faster than Impala, it uses far more CPU.
Is this a disadvantage of Java (Presto) compared with C++ (Impala)?

CPU utilization of one Presto host (50%+)

CPU utilization of the Impala cluster, one line per machine (15%+)

I captured thread stacks while the query was running and counted the class in the first frame of each runnable thread. The top 10 classes are below:
full class name ----- occurrence count in the thread stacks

alluxio.shaded.client.io.netty.channel.epoll.Native-----382
com.facebook.presto.operator.MultiChannelGroupByHash-----59
io.airlift.slice.Slices-----31
sun.nio.ch.EPoll-----20
com.facebook.presto.common.block.AbstractVariableWidthBlock-----13
io.airlift.slice.DynamicSliceOutput-----10
com.facebook.presto.common.type.AbstractLongType-----9
com.facebook.airlift.http.client.jetty.BufferingResponseListener-----7
com.facebook.presto.common.block.VariableWidthBlock-----6
sun.management.ThreadImpl-----5

P.S. The high CPU usage has nothing to do with Alluxio, because the runnable Alluxio threads are sitting in epollWait:
alluxio.shaded.client.io.netty.channel.epoll.Native.epollWait0(Native.java:-2)-----241
alluxio.shaded.client.io.netty.channel.epoll.Native.epollWait(Native.java:-2)-----141
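For anyone who wants to reproduce the per-class counts above, here is a minimal sketch of the counting step in Python. It assumes the dump is plain JDK 8-style `jstack` text (a quoted thread header, a `java.lang.Thread.State:` line, then `at ...` frames). The function name `top_frame_classes` and the sample dump with its `example.*` classes are hypothetical and only illustrate the format; this is not the script used for the numbers in this issue.

```python
import re
from collections import Counter

# Matches the state line that jstack prints under each thread header.
STATE = re.compile(r"java\.lang\.Thread\.State: (\w+)")
# Matches a stack frame and captures the fully qualified class name.
FRAME = re.compile(r"^\s*at ([\w.$]+)\.([\w$<>]+)\(")


def top_frame_classes(dump: str) -> Counter:
    """Count the class of the first frame of every RUNNABLE thread."""
    counts = Counter()
    runnable = False
    for line in dump.splitlines():
        if line.startswith('"'):
            # A new thread header resets the state.
            runnable = False
            continue
        state = STATE.search(line)
        if state:
            runnable = state.group(1) == "RUNNABLE"
            continue
        frame = FRAME.match(line)
        if frame and runnable:
            counts[frame.group(1)] += 1
            # Only the first frame of each thread is counted.
            runnable = False
    return counts


# Hypothetical sample dump, made up for illustration only.
SAMPLE = '''\
"worker-1" #31 prio=5 tid=0x1 nid=0x2 runnable
   java.lang.Thread.State: RUNNABLE
    at example.GroupByHash.putIfAbsent(GroupByHash.java:10)
    at example.Operator.addInput(Operator.java:20)

"worker-2" #32 prio=5 tid=0x3 nid=0x4 runnable
   java.lang.Thread.State: RUNNABLE
    at example.GroupByHash.rehash(GroupByHash.java:30)

"io-1" #33 prio=5 tid=0x5 nid=0x6 runnable
   java.lang.Thread.State: RUNNABLE
    at example.NetPoller.epollWait(Native Method)

"idle-1" #34 prio=5 tid=0x7 nid=0x8 waiting on condition
   java.lang.Thread.State: WAITING
    at example.Sleeper.park(Sleeper.java:5)
'''

if __name__ == "__main__":
    for cls, n in top_frame_classes(SAMPLE).most_common(10):
        print(f"{cls}-----{n}")
```

A thread parked in a native call such as `epollWait` is still reported as RUNNABLE by the JVM, which is why the first-frame method matters when reading these counts.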

Impala uses code generation to speed up execution. Why doesn't Presto?

kaikalur · 2021-11-19T20:11:19Z

We need a more reproducible test. Also, Presto has the mark_distinct operator for count distinct. See if turning that off makes any difference.

dengweisysu · 2021-11-23T11:02:38Z

Setting "use_mark_distinct=false" makes no difference, and a query with a single distinct aggregation is optimized into a group-by query anyway.
The problem is similar to this issue: #13015
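To illustrate the rewrite mentioned above, here is a small Python sketch showing that a per-group distinct count gives the same result as a two-step group-by: first group on (fdate, userid) to drop duplicates, then count rows per fdate. It only models the logical equivalence. The function names and sample rows are hypothetical, and it is not Presto's implementation.

```python
from collections import defaultdict


def count_distinct_direct(rows):
    """count(distinct userid) ... GROUP BY fdate, computed directly."""
    seen = defaultdict(set)
    for fdate, userid in rows:
        seen[fdate].add(userid)
    return {fdate: len(users) for fdate, users in seen.items()}


def count_distinct_via_group_by(rows):
    """The same result computed as two plain group-by steps."""
    # Step 1: GROUP BY (fdate, userid). The grouping key spans two columns,
    # and the hash table holds one entry per distinct pair.
    pairs = set(rows)
    # Step 2: GROUP BY fdate and count the surviving rows.
    counts = defaultdict(int)
    for fdate, _userid in pairs:
        counts[fdate] += 1
    return dict(counts)


# Hypothetical sample rows: (fdate, userid)
ROWS = [
    (20210401, "u1"),
    (20210401, "u1"),
    (20210401, "u2"),
    (20210402, "u1"),
]

if __name__ == "__main__":
    print(count_distinct_direct(ROWS))
    print(count_distinct_via_group_by(ROWS))
```

In this model the first step does almost all of the hashing, since it touches every input row with a two-column key. That would be consistent with a group-by hash class showing up in the stack samples, though the sketch cannot confirm it.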
