[HUDI-6950] Query should process listed partitions to avoid driver oom due to large number files in table first partition by xuzifu666 · Pull Request #9875 · apache/hudi

xuzifu666 · 2023-10-17T09:35:00Z

Change Logs

Queries should process the listed partitions to avoid a driver OOM caused by the large number of files in the table's first-level partition.
https://issues.apache.org/jira/browse/HUDI-6950

Impact

Currently, querying a table with multiple partition levels can easily cause an OOM.
e.g.:
CREATE TABLE hudi_test.tmp_hudi_test_1 (
id string,
name string,
dt bigint,
day STRING COMMENT 'date partition',
hour INT COMMENT 'hour partition'
)using hudi
OPTIONS ('hoodie.datasource.write.hive_style_partitioning' 'false', 'hoodie.datasource.meta.sync.enable' 'false', 'hoodie.datasource.hive_sync.enable' 'false')
tblproperties (
'primaryKey' = 'id',
'type' = 'mor',
'preCombineField'='dt',
'hoodie.index.type' = 'BUCKET',
'hoodie.bucket.index.hash.field' = 'id',
'hoodie.bucket.index.num.buckets'=512
)
PARTITIONED BY (day,hour);

select count(1) from hudi_test.tmp_hudi_test_1 where day='2023-10-17' lists a large number of file statuses on the driver, and the driver can OOM (for example, when the table holds hundreds of billions of records in the single partition day='2023-10-17').

The same table can be queried correctly in Hive,
so this PR is submitted to fix the issue.

Risk level (write none, low medium or high below)

If medium or high, explain what verification was done to mitigate the risks.

Documentation Update

Describe any necessary documentation update if there is any new feature, config, or user-facing change

The config description must be updated if new configs are added or the default value of the configs are changed
Any new feature or user-facing change requires updating the Hudi website. Please create a Jira ticket, attach the
ticket number here and follow the instruction to make
changes to the website.

Contributor's checklist

Read through contributor's guide
Change Logs and Impact were stated clearly
Adequate tests were added if applicable
CI passed

danny0405 · 2023-10-17T09:48:14Z

Should be already fixed in: #9863 ?

xuzifu666 · 2023-10-17T10:02:35Z

No, this PR is not related to #9863. Since #9366, when querying a table with multiple partition levels that is large enough, a query such as select count(1) from tb where day='2023-10-12' no longer processes the hour sub-partitions one by one. Instead it lists all files in the day partition, which causes the driver to OOM. @danny0405

xuzifu666 · 2023-10-17T10:08:08Z

@codope @danny0405 @wecharyu Please consider reverting #9366 through this PR for stability.

danny0405 · 2023-10-17T11:27:10Z

@wecharyu It would be great if you could review this. @xuzifu666 Can you add more details, especially the difference in Spark stages?

xuzifu666 · 2023-10-17T11:48:19Z

Sure, I have added the stage details to the issue https://issues.apache.org/jira/browse/HUDI-6950 (see oom_stages and fix_stages). @danny0405

wecharyu · 2023-10-17T15:20:35Z

hudi-common/src/main/java/org/apache/hudi/metadata/FileSystemBackedTableMetadata.java

+            if (HoodiePartitionMetadata.hasPartitionMetadata(fileSystem, fileStatus.getPath())) {
+              return Pair.of(Option.of(FSUtils.getRelativePartitionPath(dataBasePath.get(), fileStatus.getPath())), Option.empty());
+            } else if (!fileStatus.getPath().getName().equals(HoodieTableMetaClient.METAFOLDER_NAME)) {
+              return Pair.of(Option.empty(), Option.of(fileStatus.getPath()));


@xuzifu666 "Processing listed partitions" will leave the intermediate paths to be listed with listStatus in the next iteration, which is the same as what the community version does now.

I have tested the query select count(1) from hudi_test where day='2023-10-17', which only lists the partition directories under partition '2023-10-17'. Could you provide more details on how to reproduce the driver OOM issue?
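
The level-by-level listing described in this comment can be sketched roughly as follows. This is a minimal illustration on the local filesystem with java.nio, not the code in FileSystemBackedTableMetadata: the class name PartitionWalkSketch and the helpers listPartitions and createDemoTable are hypothetical, the listing runs in a single thread rather than on executors, and the marker file name .hoodie_partition_metadata is assumed to be what HoodiePartitionMetadata.hasPartitionMetadata checks for.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;
import java.util.stream.Stream;

final class PartitionWalkSketch {
  // Assumed on-disk names; treat both as assumptions of this sketch.
  static final String PARTITION_MARKER = ".hoodie_partition_metadata";
  static final String META_FOLDER = ".hoodie";

  /** Returns relative partition paths under basePath/prefix, one directory level per iteration. */
  static TreeSet<String> listPartitions(Path basePath, String prefix) {
    TreeSet<String> partitions = new TreeSet<>();
    List<Path> pending = new ArrayList<>();
    pending.add(basePath.resolve(prefix));
    while (!pending.isEmpty()) {
      List<Path> nextLevel = new ArrayList<>();
      for (Path dir : pending) {
        try (Stream<Path> children = Files.list(dir)) {
          children.filter(Files::isDirectory).forEach(child -> {
            if (Files.exists(child.resolve(PARTITION_MARKER))) {
              // A directory with partition metadata is a partition: keep only its path.
              partitions.add(basePath.relativize(child).toString().replace('\\', '/'));
            } else if (!child.getFileName().toString().equals(META_FOLDER)) {
              // An intermediate directory: list it in the next iteration.
              nextLevel.add(child);
            }
          });
        } catch (IOException e) {
          throw new UncheckedIOException(e);
        }
      }
      pending = nextLevel;
    }
    return partitions;
  }

  /** Builds a tiny day/hour layout in a temp directory for the demo. */
  static Path createDemoTable() {
    try {
      Path base = Files.createTempDirectory("hudi_sketch");
      Files.createDirectories(base.resolve(META_FOLDER));
      String[] parts = {"2023-10-17/0", "2023-10-17/1", "2023-10-17/2", "2023-10-18/0"};
      for (String part : parts) {
        Path dir = Files.createDirectories(base.resolve(part));
        Files.createFile(dir.resolve(PARTITION_MARKER));
        Files.createFile(dir.resolve("data.parquet"));
      }
      return base;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    Path base = createDemoTable();
    System.out.println(listPartitions(base, "2023-10-17"));
  }
}
```

With the prefix 2023-10-17, only the hour directories under that day are visited, and only partition paths (not per-file entries) are accumulated.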

OK. Consider a case where the day = 2023-10-13 partition holds 200000000000 records (1 KB per record), driver memory is 4 GB, and the 'hour' sub-partition ranges from 1 to 24. A query such as select count(1) from table where day='2023-10-13' or select * from table where day='2023-10-13' then makes the driver OOM directly. With #9366 reverted, the same query succeeds within 1 minute. @wecharyu @danny0405

Even if driver memory is raised to 12 GB, it still OOMs. Judging from the DAG, the cause is that all file statuses of the day partition are listed to the driver. After the revert it recovers: the DAG computes each hour sub-partition one by one, and it is stable.
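
For a rough sense of the scale in this scenario, here is a back-of-the-envelope sketch that turns the stated record count into bytes and then into file counts. It is only an estimate under assumptions that the thread does not state: 1 KB is taken as 1000 bytes, the average file size is a hypothetical 128 MB, and files are spread evenly over 24 hour sub-partitions. The class and method names are made up for illustration, and the sketch says nothing about where the listing actually happens.

```java
final class DayPartitionScaleSketch {
  /** Total data volume in bytes; fails loudly on overflow. */
  static long totalBytes(long records, long bytesPerRecord) {
    return Math.multiplyExact(records, bytesPerRecord);
  }

  /** Number of files needed to hold totalBytes, rounded up. */
  static long fileCount(long totalBytes, long bytesPerFile) {
    return (totalBytes + bytesPerFile - 1) / bytesPerFile;
  }

  /** Files per sub-partition when files are spread evenly, rounded up. */
  static long filesPerSubPartition(long files, int subPartitions) {
    return (files + subPartitions - 1) / subPartitions;
  }

  public static void main(String[] args) {
    long bytes = totalBytes(200_000_000_000L, 1_000L);   // record count and size from the comment
    long files = fileCount(bytes, 128L * 1024 * 1024);   // hypothetical 128 MB per file
    long perHour = filesPerSubPartition(files, 24);      // assumed even spread over 24 hours
    System.out.println("bytes in day partition: " + bytes);
    System.out.println("files in day partition: " + files);
    System.out.println("files per hour sub-partition: " + perHour);
  }
}
```

Under these assumptions the day partition holds about 200 TB, so listing it in one pass handles roughly 24 times as many file entries at once as listing a single hour sub-partition.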

hudi-bot · 2023-10-17T15:38:56Z

CI report:

eeb64f5 UNKNOWN
118b8ea Azure: SUCCESS

wecharyu · 2023-10-18T15:45:17Z

The dump lists many Pairs of FileStatus, which are not generated in getPartitionPathWithPathPrefixUsingFilterExpression().
We may need a deeper investigation into how the OOM occurs.

boneanxs · 2023-10-19T06:17:27Z

Agree with @wecharyu that this PR should not use more driver memory than before, so we don't need to revert it.

In this method, fetching the fileStatus happens only on the executor side, and the executors return partition paths to the driver, exactly as before.

@xuzifu666 Appreciate it if you could provide more details to help us track it.

xuzifu666 · 2023-10-19T06:21:44Z

OK, I have added more details to the issue. Using the master branch (7c79ebee1ff1c9a0f5117252cb12fa2f03ac4b24), I built a table in which each partition has 4000000 parquet files, with 2 GB of driver memory. Before the revert the driver OOMs directly; after the revert the query succeeds with the same 2 GB driver. The dump has been added to the issue. @boneanxs

danny0405 · 2023-10-19T07:40:14Z

@boneanxs Would you like to take some time to look into this?

xuzifu666 added 2 commits October 17, 2023 17:30

fix

eeb64f5

fix

b783334

xuzifu666 added 3 commits October 17, 2023 17:48

style changed

e0f5b1e

style changed

4110ad1

style changed

18151a4

style changed

29b7054

xuzifu666 changed the title ~~[HUDI-6959] query should process listed partitions avoid driver oom due to large number files in table~~ [HUDI-6959] Query should process listed partitions avoid driver oom due to large number files in table Oct 17, 2023

style changed

1d8c05e

xuzifu666 changed the title ~~[HUDI-6959] Query should process listed partitions avoid driver oom due to large number files in table~~ [HUDI-6950] Query should process listed partitions avoid driver oom due to large number files in table Oct 17, 2023

style changed

118b8ea

xuzifu666 changed the title ~~[HUDI-6950] Query should process listed partitions avoid driver oom due to large number files in table~~ [HUDI-6950] Query should process listed partitions avoid driver oom due to large number files in table first partition Oct 17, 2023

wecharyu reviewed Oct 17, 2023

danny0405 added the issue:performance-regression (Performance degradation), release-0.14.1, and priority:critical (Production degraded; pipelines stalled) labels Oct 18, 2023

danny0405 approved these changes Oct 18, 2023

danny0405 changed the title ~~[HUDI-6950] Query should process listed partitions avoid driver oom due to large number files in table first partition~~ [HUDI-6950] Query should process listed partitions to avoid driver oom due to large number files in table first partition Oct 18, 2023

danny0405 merged commit fae20cd into apache:master Oct 18, 2023

nsivabalan pushed a commit that referenced this pull request Nov 21, 2023

[HUDI-6950] Query should process listed partitions to avoid driver oo…

e60690a

…m due to large number files in table first partition (#9875)

5 participants
