[Bug] Paimon Table produce more reduce task for same data volume #864

JingsongLi · 2023-04-10T07:36:24Z

Search before asking

I searched in the issues and found nothing similar.

Paimon version

0.4

Compute Engine

hive

Minimal reproduce step

Compare the Paimon table and the Hive table using the same statement and the same amount of data in the Hive client:
Paimon table:

Hive table:

What doesn't meet your expectations?

Reading the Paimon table should use fewer reducers.

Anything else?

No response

Are you willing to submit a PR?

I'm willing to submit a PR!

calvinjiang · 2023-04-16T01:11:17Z

I'd like to fix this issue.

JingsongLi · 2023-04-23T08:33:08Z

> I'd like to fix this issue.

How do you plan to fix it?

wg1026688210 · 2023-04-28T01:40:29Z

Hi @JingsongLi, is it because there are more files in the Paimon table? Would it help reduce the number of reduce tasks if we set the number of upstream map tasks with mapred.map.tasks, which would reduce the number of shuffle files?

JingsongLi · 2023-05-11T02:11:22Z

> Hi @JingsongLi, is it because there are more files in the Paimon table? Would it help reduce the number of reduce tasks if we set the number of upstream map tasks with mapred.map.tasks, which would reduce the number of shuffle files?

We should figure out how Hive infers the number of tasks, and then try to work around it.
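
As a starting point for that investigation, here is a minimal sketch of the size-based reducer estimate that Hive on MapReduce is generally understood to apply when the reducer count is not set explicitly: the total input size seen by the planner is divided by hive.exec.reducers.bytes.per.reducer and capped at hive.exec.reducers.max. The function name, the default values, and the byte counts in the example are assumptions for illustration. Nothing in this thread confirms that the Paimon query takes this path or why its reported input size would differ.

```python
def estimate_reducers(total_input_bytes,
                      bytes_per_reducer=256_000_000,  # assumed default of hive.exec.reducers.bytes.per.reducer
                      max_reducers=1009):             # assumed default of hive.exec.reducers.max
    """Simplified sketch of a size-based reducer estimate (not Hive's actual code)."""
    # Integer ceiling division, with a floor of one reducer.
    reducers = max(1, (total_input_bytes + bytes_per_reducer - 1) // bytes_per_reducer)
    # Never exceed the configured upper bound.
    return min(max_reducers, reducers)


# Hypothetical numbers: the same logical data, but the planner is told
# two different input sizes for the two tables.
reported_size_table_a = 1_000_000_000    # 1 GB reported (illustrative)
reported_size_table_b = 10_000_000_000   # 10 GB reported (illustrative)

print(estimate_reducers(reported_size_table_a))
print(estimate_reducers(reported_size_table_b))
```

If this estimate is what drives the reducer count, the thing to compare between the two tables would be the input size each one reports to the planner, rather than the number of map tasks.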

JingsongLi added the bug label Apr 10, 2023

JingsongLi mentioned this issue Apr 10, 2023

Excessive resources are requested in reduce phase after hive queries the paimon table. Procedure JingsongLi/paimon-trino#11

Closed

wg1026688210 mentioned this issue Jul 3, 2023

[hive] Support syncing partition into Hive metastore when using Hive catalog #1445

Closed
