Extend EMR System Test waiter timeout #31071

Merged
o-nikolas merged 2 commits into apache:main from aws-mwaa:ferruzzi/system-tests/emr-test-timeout
May 4, 2023

@ferruzzi commented May 4, 2023

On rare occasions this system test has timed out. This PR extends the maximum wait to try to mitigate that.

cc: @o-nikolas @vincbeck @vandonr-amz @syedahsn

task_id="add_steps",
job_flow_id=create_job_flow.output,
steps=SPARK_STEPS,
wait_for_completion=True,
@ferruzzi (Contributor, Author):

I just moved this outside of the example code block (to line 170 below), as done elsewhere. We've been trying to clean up the snippets that get pasted into the docs but must have missed this one. No functional change.

Comment on lines +173 to +174
add_steps.waiter_delay = 60
add_steps.waiter_max_attempts = 90
@ferruzzi (Contributor, Author):

The default values are 30 (waiter_delay, in seconds) and 60 (waiter_max_attempts).

Contributor:

The new config means we'll wait an hour and a half, which feels a bit too long? If this step hung, it would block the whole test suite from completing for that long.

If we used to wait 30min total, I'd bump it up to 40min or 45min. Anything beyond that and we really need to find some other fix, I think 😬
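
For reference, the totals in this thread come from multiplying the delay by the number of attempts. A minimal sketch of that arithmetic follows; the helper name `max_wait_minutes` is hypothetical, and it assumes the waiter sleeps one full delay per attempt, which is a simplification.

```python
def max_wait_minutes(waiter_delay: int, waiter_max_attempts: int) -> float:
    """Upper bound on the total wait, in minutes.

    Assumes one full delay (in seconds) is spent per attempt.
    """
    return waiter_delay * waiter_max_attempts / 60


print(max_wait_minutes(30, 60))  # defaults mentioned above -> 30.0
print(max_wait_minutes(60, 90))  # values proposed in this PR -> 90.0
```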

@ferruzzi (Contributor, Author):

Sure, sounds good. I'll dial these back a bit. Maybe add more retries at the default delay instead? That would mean we can still succeed sooner, but get to keep trying for longer.
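
The trade-off can be sketched with a simplified polling model. This is not the provider's waiter implementation: `seconds_until_detected` is a hypothetical helper that assumes the first check happens after one full delay and that checks are evenly spaced.

```python
import math
from typing import Optional


def seconds_until_detected(job_seconds: int, delay: int, max_attempts: int) -> Optional[int]:
    """Return when a poller would notice completion, or None if it gives up first.

    Simplified model: checks happen at delay, 2*delay, ..., max_attempts*delay.
    """
    attempts_needed = max(1, math.ceil(job_seconds / delay))
    if attempts_needed > max_attempts:
        return None  # the waiter would time out before the job finishes
    return attempts_needed * delay


job = 2000  # hypothetical job duration in seconds (about 33 minutes)
print(seconds_until_detected(job, delay=30, max_attempts=60))  # None: times out at 30 min
print(seconds_until_detected(job, delay=30, max_attempts=90))  # 2010
print(seconds_until_detected(job, delay=60, max_attempts=90))  # 2040
```

Under these assumptions, keeping the 30-second delay and raising only the attempt count notices the finished job slightly earlier than a 60-second delay, while capping the total wait at 45 minutes rather than 90.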

@o-nikolas merged commit 68f0a3a into apache:main on May 4, 2023
@ferruzzi deleted the ferruzzi/system-tests/emr-test-timeout branch on May 4, 2023 21:00