[SUPPORT] Metadata compaction periodically fails/hangs #12261

@liiang-huang

Describe the problem you faced

Hi Hudi community, I have a Glue job that ingests data into a Hudi MOR table. However, this job periodically fails in the stage shown in the screenshots below.
[Three screenshots of the failing stage; images not preserved]

Could you help investigate this issue? I have gone through this issue, but it doesn't seem to be the same problem. I deleted the requested/inflight deltacommit and also tried increasing resources, but the errors still persisted. Thanks!
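For anyone reproducing the cleanup step, here is a minimal sketch of how pending deltacommits could be identified from a timeline listing before deleting anything. It assumes timeline files are named `<instant>.<action>[.<state>]`, where a completed deltacommit has no state suffix; that naming is an assumption about the 0.13.x timeline layout, and the function name and the sample listing are hypothetical, not taken from this table.

```python
import re

# Assumed file name pattern for a deltacommit that has not completed:
#   <instant>.deltacommit.requested  or  <instant>.deltacommit.inflight
_PENDING = re.compile(r"^(?P<instant>\d+)\.deltacommit\.(?P<state>requested|inflight)$")


def pending_deltacommits(file_names):
    """Return sorted (instant, state) pairs for deltacommits still pending."""
    pending = []
    for name in file_names:
        match = _PENDING.match(name)
        if match:
            pending.append((match.group("instant"), match.group("state")))
    return sorted(pending)


# Hypothetical listing of a timeline directory (illustrative values only).
listing = [
    "20241024101500000.deltacommit",            # completed, ignored
    "20241024102000000.deltacommit.requested",  # pending
    "20241024102000000.deltacommit.inflight",   # pending
    "20241024103000000.compaction.requested",   # different action, ignored
    "hoodie.properties",                        # not an instant file
]

for instant, state in pending_deltacommits(listing):
    print(instant, state)
```

The sketch only reports candidates; it does not delete files, so the listing can be reviewed first.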

Environment Description

  • Hudi version : 0.13.1

  • Spark version : 3.1

  • Storage (HDFS/S3/GCS..) : S3

Stacktrace

Exception in User Class: jp.ne.paypay.daas.data.exceptions.JobFatalError : Streaming batch load failed with error: Could not compact s3://pay2-datalake-prod-standard/datasets/bronze/payment-accounting-db1-20241010-aurora-prod/payment_accounting/sub_payments_accounting-1761348391


Job aborted due to stage failure: Task 169 in stage 87.0 failed 4 times, most recent failure: Lost task 169.3 in stage 87.0 (TID 21675) (10.12.56.40 executor 13): ExecutorLostFailure (executor 13 exited caused by one of the running tasks) Reason: Executor heartbeat timed out after 508519 ms