Adjusted the EMRServerlessStartJobOperator to cancel failed jobs #51883
base: main
I feel like this is a very opinionated decision. I wonder whether this is something the user should configure via on_failure_callback
rather than a decision we make for them.
I'd like to hear more thoughts on that from others.
providers/amazon/src/airflow/providers/amazon/aws/operators/emr.py
…rlessStartJobTrigger
Apologies if there is an obvious answer, but is there a use case where you would not want the job to be cancelled in EMR when a retry creates a new one, leaving both running or pending concurrently?
It's hard to know every user's use case, but I think you're correct: I can't see one, so my perception is probably wrong :)
That’s true, there’s definitely a point in letting the user decide, albeit at the cost of more complexity. As you said, let's wait for other opinions :)
closes: #42401
I have introduced a cancel_job method to the EMRServerlessHook, which wraps the cancel_job_run method from boto3. For a non-deferrable job run, cancel_job is executed if an exception is thrown because waiter_max_attempts has been reached. If deferrable is set to True, the cancellation logic is placed inside execute_complete, as this method evaluates the job state in this case.
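For illustration only, here is a minimal sketch of the non-deferrable path described above. It is not the code in this PR: WaiterMaxAttemptsReached, StubEmrServerlessClient, and wait_or_cancel are hypothetical names, and the stub replaces a real boto3 "emr-serverless" client so the snippet runs without AWS access. The only real API it assumes is cancel_job_run with the applicationId and jobRunId keyword arguments.

```python
class WaiterMaxAttemptsReached(Exception):
    """Hypothetical stand-in for the error raised once waiter_max_attempts is hit."""


class StubEmrServerlessClient:
    """Hypothetical in-memory stand-in for boto3.client("emr-serverless")."""

    def __init__(self):
        self.cancelled = []

    def cancel_job_run(self, applicationId, jobRunId):
        # Same keyword arguments as the boto3 cancel_job_run call.
        self.cancelled.append((applicationId, jobRunId))
        return {"applicationId": applicationId, "jobRunId": jobRunId}


def cancel_job(client, application_id, job_run_id):
    """Thin wrapper around cancel_job_run, as the description outlines."""
    return client.cancel_job_run(applicationId=application_id, jobRunId=job_run_id)


def wait_or_cancel(client, application_id, job_run_id, wait):
    """Call wait(); if the waiter gives up, cancel the job run and re-raise."""
    try:
        wait()
    except WaiterMaxAttemptsReached:
        cancel_job(client, application_id, job_run_id)
        raise


def _waiter_that_gives_up():
    # Simulates a job that is still running when the waiter stops polling.
    raise WaiterMaxAttemptsReached("waiter_max_attempts reached")


client = StubEmrServerlessClient()
try:
    wait_or_cancel(client, "app-123", "job-456", _waiter_that_gives_up)
except WaiterMaxAttemptsReached:
    pass  # the task still fails; the job run has been cancelled first
```

Re-raising after the cancellation keeps the task failure visible to the scheduler, so a retry starts a fresh job run without the previous one still running or pending.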