
Fix EMR Serverless task failure on transient AWS throttling errors #67222

Merged
eladkal merged 2 commits into apache:main from Subham-KRLX:fix/aws-waiter-throttling
May 22, 2026

Conversation

@Subham-KRLX
Contributor

This pull request fixes premature task failures in EMR Serverless and other AWS operators using the waiter utility. The custom waiter now checks the AWS error code and continues polling on transient server errors or throttling errors instead of failing immediately.
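The behaviour described above can be illustrated with a minimal, self-contained sketch. It is not the code merged in this PR: `PollError` is a hypothetical stand-in for botocore's `WaiterError` (which also carries a `last_response`), and `RETRYABLE_ERROR_CODES`, `is_transient`, and `wait_until_done` are names assumed here for illustration. The list of codes is also an assumption and may differ from what the provider actually treats as transient.

```python
import time

# Assumed set of retryable AWS error codes; the real list in the provider may differ.
RETRYABLE_ERROR_CODES = frozenset(
    {
        "ThrottlingException",
        "Throttling",
        "TooManyRequestsException",
        "RequestLimitExceeded",
        "InternalServerException",
        "ServiceUnavailable",
    }
)


class PollError(Exception):
    """Hypothetical stand-in for botocore's WaiterError; keeps the last service response."""

    def __init__(self, message, last_response):
        super().__init__(message)
        self.last_response = last_response


def is_transient(error):
    """Return True when the error code in the last response is considered retryable."""
    code = (error.last_response or {}).get("Error", {}).get("Code")
    return code in RETRYABLE_ERROR_CODES


def wait_until_done(poll_once, max_attempts=10, delay=0.0):
    """Call poll_once() until it returns True.

    Transient errors (throttling, server-side failures) are swallowed and the
    loop keeps polling; any other error is re-raised immediately.
    """
    for _ in range(max_attempts):
        try:
            if poll_once():
                return
        except PollError as error:
            if not is_transient(error):
                raise
        time.sleep(delay)
    raise TimeoutError("Resource did not reach the target state in time")
```

In this sketch a throttled poll counts as one attempt and the loop moves on to the next one, while an error such as an access-denied response still fails the wait right away.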

closes: #67178

Was generative AI tooling used to co-author this PR?
Yes — Claude (for PR description)

@Subham-KRLX Subham-KRLX requested a review from o-nikolas as a code owner May 20, 2026 04:22
@boring-cyborg boring-cyborg Bot added area:providers provider:amazon AWS/Amazon - related issues labels May 20, 2026
@ROOBALJINDAL

Thanks @Subham-KRLX for taking this up on priority. We are blocked from releasing our builds to prod environments due to this issue. Do you have any workaround for this? If this gets merged, how soon will it be rolled out?

@eladkal eladkal requested a review from vincbeck May 20, 2026 05:40
@eladkal
Contributor

eladkal commented May 20, 2026

> Thanks @Subham-KRLX for taking this up on priority. We are blocked from releasing our builds to prod environments due to this issue. Do you have any workaround for this? If this gets merged, how soon will it be rolled out?

You are not blocked.
see #67178 (comment)

Comment thread providers/amazon/src/airflow/providers/amazon/aws/utils/waiter_with_logging.py Outdated
@Subham-KRLX Subham-KRLX force-pushed the fix/aws-waiter-throttling branch from 74f2a3d to bcf19dd Compare May 21, 2026 03:29
@eladkal eladkal force-pushed the fix/aws-waiter-throttling branch from f674d96 to 5047a65 Compare May 21, 2026 17:11
@eladkal eladkal merged commit 8199e76 into apache:main May 22, 2026
94 checks passed

Labels

area:providers provider:amazon AWS/Amazon - related issues

Development

Successfully merging this pull request may close these issues.

EmrServerlessStartJobOperator task fails randomly for few tasks in 20-21s even though job is submitted and succeeds fine in emr serverless in background

4 participants