livy provider can not kill the spark job when the trigger is timeout. #37898

Closed
1 of 2 tasks
coffee34 opened this issue Mar 5, 2024 · 3 comments · Fixed by #38916

Comments

coffee34 commented Mar 5, 2024

Apache Airflow Provider(s)

apache-livy

Versions of Apache Airflow Providers

3.5.4

Apache Airflow version

2.7.2

Operating System

Amazon MWAA

Deployment

Amazon (AWS) MWAA

Deployment details

I use LivyOperator with deferrable=True to submit a Spark job through Livy, with execution_timeout set to 2 hours.

What happened

When the Spark job runs for more than 2 hours, Airflow detects the timeout and cancels the trigger.
It then wakes the task up to run its on_kill function.
The Livy operator then fails with
AttributeError: 'LivyOperator' object has no attribute '_batch_id'
while executing on_kill.

What you think should happen instead

The Livy operator should fail due to a timeout, and simultaneously, it should kill the Spark job.

How to reproduce

Use the LivyOperator in a DAG.
Configure the operator with deferrable=True and execution_timeout=timedelta(minutes=10) (execution_timeout expects a datetime.timedelta).
Submit a Spark job that is expected to run for longer than 10 minutes.
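
For illustration, the configuration described in these steps could be written as the keyword arguments below. This is only a sketch: the task_id and file path are hypothetical placeholders, and the dict is kept free of Airflow imports so it runs on its own. Passing it to LivyOperator inside a DAG assumes the apache-livy provider is installed.

```python
from datetime import timedelta

# Keyword arguments for LivyOperator.
# task_id and file are hypothetical placeholders, not taken from the report.
livy_task_kwargs = {
    "task_id": "long_running_spark_job",
    "file": "s3://example-bucket/jobs/long_job.py",
    "deferrable": True,
    # execution_timeout is a timedelta; 10 minutes == 600 seconds.
    "execution_timeout": timedelta(minutes=10),
}

# Inside a DAG definition, with the apache-livy provider installed:
#   from airflow.providers.apache.livy.operators.livy import LivyOperator
#   LivyOperator(**livy_task_kwargs)
```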

Anything else

It seems that when the Livy operator is woken up again after the trigger times out, _batch_id is not initialized on the operator, which causes on_kill to fail.
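
The suspected mechanism can be sketched without Airflow. The class below is a hypothetical stand-in, not the provider's code. It assumes that the resumed task runs on a fresh operator instance whose execute() was never called, and that some state (here a plain dict named store) survives the deferral. The guarded variant is one possible approach, not necessarily what the eventual fix does.

```python
class SketchOperator:
    """Hypothetical stand-in for a deferrable operator; not provider code."""

    def __init__(self, store):
        self.store = store   # assumption: state that survives the deferral
        self.deleted = []    # batch ids this instance asked Livy to delete

    def execute(self):
        # Only the instance that actually ran execute() gets _batch_id.
        self._batch_id = 42
        self.store["batch_id"] = self._batch_id

    def on_kill_unguarded(self):
        # Mirrors the reported failure: assumes _batch_id always exists.
        self.deleted.append(self._batch_id)

    def on_kill_guarded(self):
        # Fall back to the persisted id when the attribute is missing.
        batch_id = getattr(self, "_batch_id", None)
        if batch_id is None:
            batch_id = self.store.get("batch_id")
        if batch_id is not None:
            self.deleted.append(batch_id)


store = {}
SketchOperator(store).execute()   # first run: submit the batch, then defer

resumed = SketchOperator(store)   # after the timeout: a fresh instance
try:
    resumed.on_kill_unguarded()
    error = None
except AttributeError as exc:
    error = str(exc)

resumed.on_kill_guarded()
```

In this sketch the unguarded call raises the same kind of AttributeError as in the report, while the guarded call recovers the batch id from the persisted state and can still request deletion.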

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

coffee34 added the area:providers, kind:bug, and needs-triage labels on Mar 5, 2024

boring-cyborg bot commented Mar 5, 2024

Thanks for opening your first issue here! Be sure to follow the issue template! If you are willing to raise PR to address this issue please do so, no need to wait for approval.

Taragolis added the provider:apache-livy and good first issue labels and removed the needs-triage label on Mar 5, 2024

mateuslatrova (Contributor) commented

Hi, I would like to work on this issue!

coffee34 (Author) commented

Oh thank you so much
