Open
Labels
area:metadata-table (Metadata table related), priority:critical (Production degraded; pipelines stalled)
Description
Describe the problem you faced
Hi Hudi community, I have a Glue job that ingests data into a Hudi MOR table. However, this job periodically fails in the stage below.



Could you help investigate this issue? I have gone through this issue, but it doesn't seem to be the same problem. I deleted the requested/inflight deltacommit and also tried increasing resources, but the errors still persisted. Thanks!
Environment Description
- Hudi version : 0.13.1
- Spark version : 3.1
- Storage (HDFS/S3/GCS..) : S3
Stacktrace
Exception in User Class: jp.ne.paypay.daas.data.exceptions.JobFatalError : Streaming batch load failed with error: Could not compact s3://pay2-datalake-prod-standard/datasets/bronze/payment-accounting-db1-20241010-aurora-prod/payment_accounting/sub_payments_accounting-1761348391
Job aborted due to stage failure: Task 169 in stage 87.0 failed 4 times, most recent failure: Lost task 169.3 in stage 87.0 (TID 21675) (10.12.56.40 executor 13): ExecutorLostFailure (executor 13 exited caused by one of the running tasks) Reason: Executor heartbeat timed out after 508519 ms