[SUPPORT] ExecutorLostFailure - Try Processing 1TB Of Data

Hello,

I am running some tests with Hudi and tried to process a 1 TB Parquet dataset.

Cluster:
18 nodes
r5.4xlarge
128 GB RAM, 16 cores per node

My spark-submit command:
spark-submit --deploy-mode cluster --conf spark.executor.cores=5 --conf spark.executor.memoryOverhead=3000 --conf spark.executor.memory=33g --packages org.apache.hudi:hudi-spark-bundle_2.11:0.6.0,org.apache.spark:spark-avro_2.11:2.4.4 --conf spark.serializer=org.apache.spark.serializer.KryoSerializer s3://data-lake/temp/apache_hudi.py

Hudi Options:
hudi_options = {
  'hoodie.table.name': tableName,
  'hoodie.datasource.write.recordkey.field': 'id',
  'hoodie.datasource.write.table.name': tableName,
  'hoodie.datasource.write.operation': 'bulk_insert',
  'hoodie.datasource.write.precombine.field': 'LineCreatedTimestamp',
  'hoodie.bulkinsert.shuffle.parallelism': 4000,
  'hoodie.parquet.small.file.limit': 536870912,
  'hoodie.parquet.max.file.size': 1073741824,
  'hoodie.parquet.block.size': 536870912,
  'hoodie.copyonwrite.record.size.estimate': 1024,
  'hoodie.datasource.write.hive_style_partitioning': 'true',
  'hoodie.datasource.hive_sync.enable': 'true',
  'hoodie.datasource.hive_sync.table': 'stockout_test',
  'hoodie.datasource.hive_sync.database': 'raw_courier_api',
  'hoodie.datasource.write.partitionpath.field': 'created_date_brt',
  'hoodie.datasource.hive_sync.partition_fields': 'created_date_brt',
  'hoodie.datasource.hive_sync.partition_extractor_class': 'org.apache.hudi.hive.MultiPartKeysValueExtractor',
  'hoodie.datasource.hive_sync.jdbcurl': 'jdbc:hive2://emr:10000'
}
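
To make the byte values above easier to read, here is a small sketch that converts them to MiB and sanity-checks the bulk insert parallelism. It assumes the input is exactly 1 TiB and uses the rule of thumb from the Hudi tuning guide as I understand it (shuffle parallelism of at least input size / 500 MB). The names `INPUT_BYTES` and `TARGET_BYTES_PER_TASK` are made up for this example and are not Hudi settings.

```python
import math

MIB = 1024 ** 2
TIB = 1024 ** 4

# Size-related values copied from hudi_options above (all in bytes).
size_options = {
    'hoodie.parquet.small.file.limit': 536870912,
    'hoodie.parquet.max.file.size': 1073741824,
    'hoodie.parquet.block.size': 536870912,
}
sizes_mib = {key: value // MIB for key, value in size_options.items()}

# Assumption: the dataset is exactly 1 TiB of input.
INPUT_BYTES = 1 * TIB
# Assumption: rule of thumb of roughly 500 MB of input per shuffle partition.
TARGET_BYTES_PER_TASK = 500 * MIB

bulkinsert_parallelism = 4000  # hoodie.bulkinsert.shuffle.parallelism
input_per_task_mib = INPUT_BYTES / bulkinsert_parallelism / MIB
min_parallelism = math.ceil(INPUT_BYTES / TARGET_BYTES_PER_TASK)

# Lower bound on file count if every file were filled to max.file.size;
# the real count is higher because data is split across partition paths.
min_files = math.ceil(INPUT_BYTES / size_options['hoodie.parquet.max.file.size'])

for key, mib in sizes_mib.items():
    print(f"{key}: {mib} MiB")
print(f"input per shuffle partition: {input_per_task_mib:.0f} MiB")
print(f"rule-of-thumb minimum parallelism: {min_parallelism}")
print(f"lower bound on output files: {min_files}")
```

Under these assumptions, 4000 partitions means about 262 MiB of input per task, which is already above the rule-of-thumb minimum, so the parallelism setting alone does not look like the obvious cause.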

Error:
ExecutorLostFailure (executor 53 exited caused by one of the running tasks) Reason: Container killed by YARN for exceeding memory limits. 36.4 GB of 35.9 GB physical memory used. Consider boosting spark.yarn.executor.memoryOverhead or disabling yarn.nodemanager.vmem-check-enabled because of YARN-4714.
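
For reference, the 35.9 GB limit in the message lines up with the spark-submit settings: YARN sizes the container as executor memory plus memoryOverhead. Below is a quick sketch of that arithmetic. It assumes no other memory settings (such as `spark.executor.pyspark.memory`) are in play, that Spark's default overhead is 10% of executor memory with a 384 MiB floor, and it ignores the memory YARN reserves on each node.

```python
MIB_PER_GIB = 1024

executor_memory_mib = 33 * MIB_PER_GIB  # spark.executor.memory=33g
memory_overhead_mib = 3000              # spark.executor.memoryOverhead=3000

# Container size requested from YARN: heap plus overhead.
container_mib = executor_memory_mib + memory_overhead_mib
container_gib = container_mib / MIB_PER_GIB

# Assumption: default overhead is max(384 MiB, 10% of executor memory).
default_overhead_mib = max(384, int(0.10 * executor_memory_mib))

# Upper bounds on executors per node (128 GB RAM, 16 cores, 5 cores each).
executors_by_cores = 16 // 5
executors_by_memory = int(128 // container_gib)
executors_per_node = min(executors_by_cores, executors_by_memory)

print(f"container limit: {container_gib:.1f} GiB")
print(f"configured overhead: {memory_overhead_mib} MiB")
print(f"default overhead would be: {default_overhead_mib} MiB")
print(f"executors per node (upper bound): {executors_per_node}")
```

So the configured 3000 MiB overhead is slightly below what Spark would have picked by default (about 3379 MiB), and the executor was killed at 36.4 GB, roughly 0.5 GB over the limit.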


<img width="1680" alt="Captura de Tela 2021-01-20 às 01 44 54" src="https://user-images.githubusercontent.com/36298331/105128131-2c19ac00-5ac1-11eb-9304-645766359823.png">
<img width="1680" alt="Captura de Tela 2021-01-20 às 01 44 25" src="https://user-images.githubusercontent.com/36298331/105128135-2e7c0600-5ac1-11eb-8194-1c0177ca292a.png">

<img width="1680" alt="Captura de Tela 2021-01-20 às 01 47 24" src="https://user-images.githubusercontent.com/36298331/105128292-7f8bfa00-5ac1-11eb-8154-81f4f7949843.png">


Could you help me process this dataset?


[SUPPORT] ExecutorLostFailure - Try Processing 1TB Of Data #2464
