Race condition when running multiple schedulers #19038

Closed
timgriffiths opened this issue Oct 18, 2021 · 4 comments
Labels
affected_version:2.2, area:core, area:HA-Scheduler, kind:bug, provider:cncf-kubernetes

Comments

timgriffiths commented Oct 18, 2021

Apache Airflow version

2.2.0 (latest released)

Operating System

Debian GNU/Linux 10 (buster)

Versions of Apache Airflow Providers

apache-airflow-providers-amazon==2.3.0
apache-airflow-providers-celery==2.1.0
apache-airflow-providers-cncf-kubernetes==2.0.3
apache-airflow-providers-docker==2.2.0
apache-airflow-providers-elasticsearch==2.0.3
apache-airflow-providers-ftp==2.0.1
apache-airflow-providers-google==6.0.0
apache-airflow-providers-grpc==2.0.1
apache-airflow-providers-hashicorp==2.1.1
apache-airflow-providers-http==2.0.1
apache-airflow-providers-imap==2.0.1
apache-airflow-providers-microsoft-azure==3.2.0
apache-airflow-providers-mysql==2.1.1
apache-airflow-providers-odbc==2.0.1
apache-airflow-providers-postgres==2.3.0
apache-airflow-providers-redis==2.0.1
apache-airflow-providers-sendgrid==2.0.1
apache-airflow-providers-sftp==2.1.1
apache-airflow-providers-slack==4.1.0
apache-airflow-providers-sqlite==2.0.1
apache-airflow-providers-ssh==2.2.0

Deployment

Official Apache Airflow Helm Chart

Deployment details

Helm deployed using the official Apache Airflow Helm chart

What happened

We recently upgraded to 2.2.0 but have now noticed some of our jobs being killed by the scheduler not long after they start.

So we are using KubernetesPodOperator to launch all our tasks.

What I can see happening is:

  • Scheduler 1 -> queues the job ... it then launches the intermediate pod
  • Scheduler 2 -> oh, a queued job that I haven't seen before ... let me re-schedule that for you
  • Scheduler 1 -> I can't queue that again ... something's gone wrong, let me clean up what I was doing
  • Scheduler 1 -> kills the pod, which kills the successfully running pod
  • Scheduler 1 -> let's queue that again ... it then launches the intermediate pod
  • Scheduler 2 -> oh, a queued job that I haven't seen before ... let me re-schedule that for you
    and repeat

Tracking it back, this seems to have been introduced in #18152. Because this function is now run on a schedule, you can get into a situation where a task has been launched correctly but the scheduler that kicked it off hasn't yet had time to move the task instance out of the queued state.

i.e. this function:

def clear_not_launched_queued_tasks(self, session=None) -> None:
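
For context, here is a simplified sketch of what that function does (a paraphrase, not the actual Airflow source, and the label selector below is illustrative): it looks up task instances stuck in QUEUED, checks whether a matching worker pod already exists, and resets the task instance to SCHEDULED when it finds none. That pod check is the race window described above, since it can run before the launching scheduler has actually created the pod.

from airflow.models import TaskInstance
from airflow.utils.state import State

def clear_not_launched_queued_tasks_sketch(session, kube_client, namespace):
    # Simplified paraphrase of the real method in kubernetes_executor.py.
    queued = session.query(TaskInstance).filter(TaskInstance.state == State.QUEUED).all()
    for ti in queued:
        # Look for a worker pod whose labels match this task instance
        # (illustrative label selector, not the exact labels Airflow uses).
        selector = f"dag_id={ti.dag_id},task_id={ti.task_id},try_number={ti.try_number}"
        pods = kube_client.list_namespaced_pod(namespace=namespace, label_selector=selector)
        if not pods.items:
            # No pod found, so assume the task was never launched and hand it
            # back to the scheduler. With more than one scheduler this can fire
            # in the window between "task set to QUEUED" and "pod created" on
            # another scheduler, resetting a task that is in fact launching.
            ti.state = State.SCHEDULED
    session.commit()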

What you expected to happen

Tasks that have been scheduled shouldn't be killed

How to reproduce

Start up at least 2 schedulers.

Launch a set of tasks using the KubernetesPodOperator (or anything else that delays a task instance moving out of the queued state); a minimal example DAG is sketched below.
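
A minimal reproduction sketch (the DAG id, namespace and image are placeholders): many short parallel KubernetesPodOperator tasks give one scheduler plenty of queued task instances to "clean up" while another scheduler is still busy creating pods.

from datetime import datetime

from airflow import DAG
from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import KubernetesPodOperator

with DAG(
    dag_id="race_condition_repro",
    start_date=datetime(2021, 10, 1),
    schedule_interval=None,
    catchup=False,
) as dag:
    # Many short-lived parallel pods widen the window in which a second
    # scheduler sees QUEUED task instances with no matching pod yet.
    for i in range(50):
        KubernetesPodOperator(
            task_id=f"sleep_{i}",
            name=f"sleep-{i}",
            namespace="airflow",
            image="busybox",
            cmds=["sh", "-c", "sleep 30"],
        )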

Anything else

The workaround at the moment seems to be to run just 1 scheduler (see the values override sketched below), but it would be great if this could be patched.
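
For deployments on the official Helm chart, pinning to a single scheduler is just a values override (assuming the chart's scheduler.replicas setting, which the official chart exposes):

scheduler:
  replicas: 1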

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

@timgriffiths added the area:core and kind:bug labels on Oct 18, 2021
@uranusjr
Member

cc @andrewgodwin

@andrewgodwin
Contributor

Hm, yes, this is probably a lot more likely after my commit but would also have been present before (since that function would still run on boot).

I'm curious why it queues it, though, if the pod is there - the function was meant, by my understanding, to only move it from QUEUED to SCHEDULED if there's no pod that would match. Is the pod around in this case when the task instance gets moved back to SCHEDULED?

@jedcunningham added the affected_version:2.2 label on Oct 19, 2021
@eladkal changed the title from "I believe merge 18152 has introduced a race condition when running multiple schedulers" to "Race condition when running multiple schedulers" on Nov 22, 2021
@eladkal added the area:HA-Scheduler and provider:cncf-kubernetes labels on Nov 22, 2021
@george-zubrienko

Seeing this issue a lot with 3 schedulers on 2.2.2, KubernetesExecutor on k8s 1.21. The code is a PythonOperator calling a webservice. We do get some logs on startup though - this is right before the job code starts:

[2021-12-07, 13:32:00 UTC] {taskinstance.py:1262} INFO - Executing <Task(PythonOperator): ...> on 2021-12-07 06:00:00+00:00
[2021-12-07, 13:32:00 UTC] {standard_task_runner.py:52} INFO - Started process 13 to run task
[2021-12-07, 13:32:00 UTC] {standard_task_runner.py:76} INFO - Running: ['airflow', 'tasks', 'run', ....]
[2021-12-07, 13:32:00 UTC] {standard_task_runner.py:77} INFO - Job 129472: Subtask ...
[2021-12-07, 13:32:05 UTC] {local_task_job.py:211} WARNING - State of this instance has been externally set to queued. Terminating instance.
[2021-12-07, 13:32:05 UTC] {process_utils.py:100} INFO - Sending Signals.SIGTERM to GPID 13
[2021-12-07, 13:32:07 UTC] {process_utils.py:66} INFO - Process psutil.Process(pid=13, status='terminated', exitcode=1, started='13:32:00') (13) terminated with exit code 1

The next try for this task started running, but then:

[2021-12-07, 13:32:04 UTC] {chained.py:84} INFO - DefaultAzureCredential acquired a token from EnvironmentCredential
[2021-12-07, 13:32:06 UTC] {taskinstance.py:1411} ERROR - Received SIGTERM. Terminating subprocesses.
[2021-12-07, 13:32:06 UTC] {taskinstance.py:1703} ERROR - Task failed with exception

And then retry 3 finally goes through. We only see this issue on pipelines with a high number of parallel tasks - in our case, 3 task pools with 48 + 48 + 90 total capacity. Also, with >1 scheduler, the scheduler pods sometimes print this to the logs:

sqlalchemy.exc.OperationalError: (psycopg2.errors.DeadlockDetected) deadlock detected
DETAIL:  Process 1368 waits for ShareLock on transaction 30815670; blocked by process 1160.
Process 1160 waits for ShareLock on transaction 30815664; blocked by process 1368.
HINT:  See server log for query details.
CONTEXT:  while updating tuple (13513,3) in relation "task_instance"

@jedcunningham
Member

This should be fixed in 2.2.3 via #19904.
