EcsRunTaskOperator is blocking in deferrable mode #47312

@m1racoli

Apache Airflow Provider(s)

amazon

Versions of Apache Airflow Providers

apache-airflow-providers-amazon==9.4.0
apache-airflow-providers-common-compat==1.3.0
apache-airflow-providers-common-io==1.5.0
apache-airflow-providers-common-sql==1.21.0
apache-airflow-providers-fab==1.5.2
apache-airflow-providers-ftp==3.12.0
apache-airflow-providers-http==4.13.3
apache-airflow-providers-imap==3.8.0
apache-airflow-providers-postgres==6.0.0
apache-airflow-providers-smtp==1.9.0
apache-airflow-providers-snowflake==6.0.0
apache-airflow-providers-sqlite==3.9.1

Apache Airflow version

2.10.5

Operating System

Debian GNU/Linux 12 (bookworm)

Deployment

Astronomer

Deployment details

No response

What happened

We observe the following log message 10-19 times per minute, with reported blocking durations of 2-11 seconds:

Triggerer's async thread was blocked for 8.11 seconds, likely by a badly-written trigger. Set PYTHONASYNCIODEBUG=1 to get more information on overrunning coroutines.

This happens when the only operator running in deferrable mode is EcsRunTaskOperator.
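
For context, a warning like the one quoted above can be produced by a watchdog coroutine that sleeps for a short interval and checks how long the sleep really took. The following is a minimal sketch of that idea using only the standard library. It is not Airflow's actual implementation, and the names `block_watchdog` and `demo`, the interval, and the threshold are all assumptions chosen for illustration.

```python
import asyncio
import time


async def block_watchdog(found: list, interval: float = 0.1, threshold: float = 0.2) -> None:
    """Record a message whenever a short sleep takes much longer than requested."""
    while True:
        start = time.monotonic()
        await asyncio.sleep(interval)
        elapsed = time.monotonic() - start
        if elapsed > threshold:
            # The loop could not wake this coroutine on time, so something blocked it.
            found.append(f"async thread was blocked for {elapsed:.2f} seconds")


async def demo() -> list:
    found: list = []
    watchdog = asyncio.create_task(block_watchdog(found))
    await asyncio.sleep(0.05)  # the watchdog is now in the middle of its sleep
    time.sleep(0.5)            # synchronous call: nothing else on the loop can run
    await asyncio.sleep(0.2)   # yield so the watchdog can wake up and report
    watchdog.cancel()
    return found


if __name__ == "__main__":
    for message in asyncio.run(demo()):
        print(message)
```

The synchronous `time.sleep` stands in for any blocking call made from a trigger's coroutine. The watchdog only notices the stall after the blocking call returns, which is why the reported duration covers the whole blocked period.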

What you think should happen instead

No response

How to reproduce

Run EcsRunTaskOperator in deferrable mode.

Anything else

I've traced the calling sequence of TaskDoneTrigger to a potentially blocking network call.

  1. EcsHook().async_conn and AwsLogsHook.async_conn
  2. AwsGenericHook.async_conn
  3. AwsGenericHook.get_client_type()
  4. AwsGenericHook.get_session()
  5. AwsGenericHook.conn_config
  6. BaseHook.get_connection()

Basically any use of async_conn in an async environment (for example in AwsBaseWaiterTrigger) results in blocking code execution.
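
The effect can be illustrated without Airflow or AWS. The sketch below is an assumption-laden stand-in, not the provider's code: `fake_get_connection` is a hypothetical replacement for a synchronous lookup such as `BaseHook.get_connection()`, and `ticks_during` counts how often a background task gets scheduled while the lookup is in flight. Calling the synchronous function directly from a coroutine starves the background task, while offloading it with `asyncio.to_thread` keeps the loop responsive.

```python
import asyncio
import time


def fake_get_connection() -> dict:
    """Hypothetical stand-in for a synchronous call such as BaseHook.get_connection()."""
    time.sleep(0.3)  # simulates a blocking database or network round trip
    return {"conn_id": "aws_default"}


async def blocking_fetch() -> dict:
    # Calls the synchronous function directly: the whole event loop waits for it.
    return fake_get_connection()


async def offloaded_fetch() -> dict:
    # Runs the synchronous function in a worker thread, leaving the loop free.
    return await asyncio.to_thread(fake_get_connection)


async def ticks_during(fetch_conn) -> int:
    """Count how often a background task runs while fetch_conn is in flight."""
    ticks = 0

    async def ticker() -> None:
        nonlocal ticks
        while True:
            await asyncio.sleep(0.01)
            ticks += 1

    task = asyncio.create_task(ticker())
    await asyncio.sleep(0)  # yield once so the ticker starts its first sleep
    await fetch_conn()
    task.cancel()
    return ticks


if __name__ == "__main__":
    print("ticks during direct call:   ", asyncio.run(ticks_during(blocking_fetch)))
    print("ticks during offloaded call:", asyncio.run(ticks_during(offloaded_fetch)))
```

The direct call should report no ticks, since the ticker never gets a turn while the loop is blocked, whereas the offloaded call should report many. Whether a thread offload, a wrapper such as asgiref's `sync_to_async`, or resolving the connection before entering the async code is the right fix for `async_conn` is left open here.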

Are you willing to submit PR?

  • Yes I am willing to submit a PR!
