Skip to content

KubernetesExecutor: Task stuck in queued state when using deprecated executor_config dict format #67643

@kunaljubce

Description

@kunaljubce

Under which category would you file this issue?

Providers

Apache Airflow version

3.3

What happened and how to reproduce it?

When we try to run a dag with executor_config={"KubernetesExecutor": {"config_file": "/some/path/kubeconfig.yaml"}}, the task gets stuck in a Queued state on the UI for a long time and often takes over 40 minutes to fail. Further inspecting the scheduler logs shows the below:

→ export KUBECONFIG=/Users/81045729/Documents/constant_variables/airflow/.build/.k8s-clusters/airflow-python-3.10-v1.30.13/.kubeconfig && kubectl logs -n airflow airflow-scheduler-58dcc9f6cf-qnbvj -c scheduler 2>&1 | grep -i "Invalid executor_config\|TypeError\|error.*executor\|execute_async\|run_next\|Cannot convert\|failed\|Error\|Failed" | tail -200000
2026-05-28T12:18:23.173735Z [error    ] Invalid executor_config for TaskInstanceKey(dag_id='test_k8s_executor_config_legacy', task_id='check_for_failure', run_id='manual__2026-05-28T12:18:22.589053+00:00', try_number=1, map_index=-1). Executor_config: {'KubernetesExecutor': {'config_file': '/some/path/kubeconfig.yaml'}} [airflow.providers.cncf.kubernetes.executors.kubernetes_executor.KubernetesExecutor] loc=kubernetes_executor.py:221
2026-05-28T12:18:23.174607Z [info     ] Received executor event with state failed for task instance TaskInstanceKey(dag_id='test_k8s_executor_config_legacy', task_id='check_for_failure', run_id='manual__2026-05-28T12:18:22.589053+00:00', try_number=1, map_index=-1) [airflow.jobs.scheduler_job_runner.SchedulerJobRunner] loc=scheduler_job_runner.py:1251
2026-05-28T12:18:23.179662Z [info     ] TaskInstance Finished: dag_id=test_k8s_executor_config_legacy, task_id=check_for_failure, run_id=manual__2026-05-28T12:18:22.589053+00:00, map_index=-1, ti_id=019e6e85-7d17-7605-aac9-7506f0cea68b, run_start_date=None, run_end_date=None, run_duration=None, state=queued, executor=KubernetesExecutor(parallelism=32), executor_state=failed, try_number=1, max_tries=0, pool=default_pool, queue=default, priority_weight=1, operator=PythonOperator, queued_dttm=2026-05-28 12:18:23.168826+00:00, scheduled_dttm=2026-05-28 12:18:23.158209+00:00,queued_by_job_id=3, pid=None [airflow.jobs.scheduler_job_runner.SchedulerJobRunner] loc=scheduler_job_runner.py:1340
Image

What you think should happen instead?

Since the TaskInstance Finished is reported by the scheduler almost instantly after triggering the DAG, the task should fail immediately on the UI, if possible, with a message to switch to using pod_override attribute inside executor_config.

Operating System

MacOS

Deployment

Virtualenv installation

Apache Airflow Provider(s)

cncf-kubernetes

Versions of Apache Airflow Providers

Issue not restricted to a specific provider version

Official Helm Chart version

Not Applicable

Kubernetes Version

No response

Helm Chart configuration

No response

Docker Image customizations

No response

Anything else?

To reproduce, you can create the below dag, place it in airflow-core/src/airflow/example_dags/, and deploy the airflow server with KubernetesExecutor as mentioned in contribution docs:

from airflow.providers.standard.operators.python import PythonOperator
from airflow.sdk import DAG
import pendulum

def useless_func():
    pass

with DAG(
    dag_id = "test_k8s_executor_config_legacy",
    schedule = None,
    start_date=pendulum.datetime(2021, 1, 1, tz="UTC"),
    catchup=False,
) as dag:
    
    run_this = PythonOperator(
        task_id = "check_for_failure",
        python_callable = useless_func,
        executor_config={"KubernetesExecutor": {"config_file": "/some/path/kubeconfig.yaml"}}
    )

    run_this

Once all commands have executed successfully, you should see the URL for the airflow UI in the console logs, looking like below (your port might be different to mine, verify from console logs):

KinD Cluster API server URL: http://localhost:44498
Connecting to localhost:34814. Num try: 1
Error when connecting to localhost:34814 : ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
Sleeping for 5 seconds.
Connecting to localhost:34814. Num try: 2
Error when connecting to localhost:34814 : ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
Sleeping for 5 seconds.
Connecting to localhost:34814. Num try: 3
Established connection to api server at http://localhost:34814/api/v2/monitor/health and it is healthy.
Airflow API server URL: http://localhost:34814 (admin/admin)                <<<<---- Airflow UI URL


NEXT STEP: You might now run tests or interact with airflow via shell (kubectl, pytest etc.) or k9s commands:

Copy and paste the URL on your browser, trigger the DAG and watch it go into queued state

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions