Parallel tasks fail randomly #22255

@abdullahodibat

Apache Airflow version

2.2.4 (latest released)

What happened

Hello everyone,
I am creating 6 Airflow task instances to run in parallel. However, most of the time one or two of them fail, and the logs give little explanation of the reason.

    sim_num = 6
    sim_all_paths = []
    for i in range(sim_num):
        sim_single_path = ECSOperator(
            task_id=f"sim_single_path_{i}",
            task_definition_id="airflow-python-jobs",
            command=PythonCommand(
                module="jobs.inventory_optimization.sim_single_path",
            ),
            container_config=EC2ContainerConfig.x_large,
            execution_timeout=timedelta(hours=15),
            task_role="tai_main_role",
        )
        sim_all_paths.append(sim_single_path)

The error in the logs says:

    airflow.providers.amazon.aws.exceptions.ECSOperatorError: {'tasks': [], 'failures': [{'arn': 'arn:aws:ecs:eu-west-1:xxxx:container-instance/zzzzz', 'reason': 'RESOURCE:CPU'}], 'ResponseMetadata': {'RequestId': 'xxxx', 'HTTPStatusCode': 200, 'HTTPHeaders': {'x-amzn-requestid': 'xxxxx', 'content-type': 'application/x-amz-json-1.1', 'content-length': '146', 'date': 'Mon, 14 Mar 2022 16:18:11 GMT'}, 'RetryAttempts': 0}}

However, when I checked the container instance, the registered CPU and memory values were the same as the available values, and there were no errors in the task logs.
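For reference, the response above has an empty `tasks` list and one entry in `failures`, so no task was started at all. This would explain why the task logs are empty. A minimal sketch of how such a response could be summarized by failure reason is below. The function names `summarize_ecs_failures` and `has_resource_shortage` are hypothetical helpers for illustration, not part of Airflow or boto3, and the dict is a trimmed copy of the one in the error message.

```python
from collections import Counter


def summarize_ecs_failures(response):
    """Count the entries in an ECS RunTask-style response's 'failures' list by reason."""
    failures = response.get("failures", [])
    return Counter(failure.get("reason", "UNKNOWN") for failure in failures)


def has_resource_shortage(response):
    """Return True if any failure reason starts with 'RESOURCE:' (e.g. RESOURCE:CPU)."""
    return any(reason.startswith("RESOURCE:") for reason in summarize_ecs_failures(response))


# Trimmed copy of the response from the error message above.
response = {
    "tasks": [],
    "failures": [
        {
            "arn": "arn:aws:ecs:eu-west-1:xxxx:container-instance/zzzzz",
            "reason": "RESOURCE:CPU",
        }
    ],
}

reasons = summarize_ecs_failures(response)
shortage = has_resource_shortage(response)
print(dict(reasons), shortage)
```

As far as I understand, `RESOURCE:CPU` means that the CPU units requested by the task were not available on that container instance at the moment of placement, so the values seen afterwards may differ from the values at the time of the failure.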

Things I have already tried:

  1. Upgraded to Airflow 2.2.4.
  2. Added "AIRFLOW__CORE__KILLED_TASK_CLEANUP_TIME": "604800" to the Airflow environment variables.

Any idea how to tackle this issue?

Thanks in advance.

What you expected to happen

No response

How to reproduce

No response

Operating System

Linux

Versions of Apache Airflow Providers

No response

Deployment

Docker-Compose

Deployment details

No response

Anything else

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!
