[HUDI-6950] Query should process listed partitions to avoid driver oom due to large number files in table first partition#9875
Conversation
|
Should be already fixed in: #9863 ? |
no,this pr is not relate to #9863,from #9366 if query a multiple parttion table,but the table is large enough,select count(1) from tb where day='2023-10-12',would not process sub partition hour one by one. it would list all files in day parititon, cause drive oom @danny0405 |
|
@codope @danny0405 @wecharyu conside revert #9366 though this pr for stability |
|
@wecharyu It is great if you have the review, @xuzifu666 can you supplement with more details, expecially the spark stages difference. |
sure,had add stages detail in issue https://issues.apache.org/jira/browse/HUDI-6950 @danny0405 oom_stages and fix_stages |
| if (HoodiePartitionMetadata.hasPartitionMetadata(fileSystem, fileStatus.getPath())) { | ||
| return Pair.of(Option.of(FSUtils.getRelativePartitionPath(dataBasePath.get(), fileStatus.getPath())), Option.empty()); | ||
| } else if (!fileStatus.getPath().getName().equals(HoodieTableMetaClient.METAFOLDER_NAME)) { | ||
| return Pair.of(Option.empty(), Option.of(fileStatus.getPath())); |
There was a problem hiding this comment.
@xuzifu666 "Processing listed partitions" will left the intermediate path to call listStatus in the next iterator, which is the same as community version now.
I have test the query select count(1) from hudi_test where day='2023-10-17', which only list the partition directory underlying partition '2023-10-17'. Could you provide more details on how to reproduce the driver OOM issue?
There was a problem hiding this comment.
ok,in a condition that day = 2023-10-13 partition are 200000000000 records(1kb per record),driver memory is 4gb ,sub parition 'hour' from 1 to 24,than query select count(1) from table where day='2023-10-13' or select * from table where day='2023-10-13',driver would oom directly. at the same time revert the #9366 would query success in 1 min @wecharyu @danny0405
There was a problem hiding this comment.
ok,in a condition that day = 2023-10-13 partition are 200000000000 records(1kb per record),driver memory is 4gb ,sub parition 'hour' from 1 to 24,than query select count(1) from table where day='2023-10-13' or select * from table where day='2023-10-13',driver would oom directly. at the same time revert the #9366 would query success in 1 min @wecharyu @danny0405
even if driver memory raise to 12GB,still oom. from dag, would list all file status of day partition to driver cause it. after revert it recover,and dag is to get each sub hour partition compute one by one,it is stable
|
The dump lists many Pairs of |
|
Agree with @wecharyu that this pr should not use extra driver memory than before, we don't need to revert it. In this method, obtaining fileStatus only happens in executor side, and executors will return partition paths to driver, which is exactly like before. @xuzifu666 Appreciate it if you could provide more details to help us track it. |
ok,i add more details in issue,use master branch(7c79ebee1ff1c9a0f5117252cb12fa2f03ac4b24) and build a table each partition of 4000000 parquet files,driver memory is 2gb,before revert it,driver would oom directly,and after revert it,2gb driver is run success,dump added to issue @boneanxs |
|
@boneanxs Would you like to take some time to look into this? |
…m due to large number files in table first partition (#9875)
Change Logs
query should process listed partitions avoid driver oom due to large number files in table first partition
https://issues.apache.org/jira/browse/HUDI-6950
Impact
currently if multiple partition table,would cause oom easy
eg:
CREATE TABLE
hudi_test.tmp_hudi_test_1(idstring,namestring,dtbigint,daySTRING COMMENT '日期分区',hourINT COMMENT '小时分区')using hudi
OPTIONS ('hoodie.datasource.write.hive_style_partitioning' 'false', 'hoodie.datasource.meta.sync.enable' 'false', 'hoodie.datasource.hive_sync.enable' 'false')
tblproperties (
'primaryKey' = 'id',
'type' = 'mor',
'preCombineField'='dt',
'hoodie.index.type' = 'BUCKET',
'hoodie.bucket.index.hash.field' = 'id',
'hoodie.bucket.index.num.buckets'=512
)
PARTITIONED BY (
day,hour);select count(1) from
hudi_test.tmp_hudi_test_1where day='2023-10-17' would list much filestatus to driver,and driver would oom(such as table with hundreds billion records in a partition(day='2023-10-17'))but table in hive can be queried rightly
so submit the pr to fix it
Risk level (write none, low medium or high below)
If medium or high, explain what verification was done to mitigate the risks.
Documentation Update
Describe any necessary documentation update if there is any new feature, config, or user-facing change
ticket number here and follow the instruction to make
changes to the website.
Contributor's checklist