Feat/kubernetes executor callback support #67449

Draft
sjyangkevin wants to merge 7 commits into apache:main from sjyangkevin:feat/kubernetes-executor-callback-support

@sjyangkevin
Contributor

Implements supports_callbacks on KubernetesExecutor by running each ExecuteCallback workload as its own pod, alongside the existing task-pod pipeline. Gated on AIRFLOW_V_3_3_PLUS.

Approach

A callback pod reuses the existing task-pod mechanism: the same task_queue, watcher, adoption sweep, and event_buffer channel back to the scheduler. Two things differ:

  1. Pod annotation: callback pods carry callback_id in place of (dag_id, task_id, run_id, try_number, map_index)
  2. Workload key type: annotations_to_key() returns a WorkloadKey, which is a union of TaskInstanceKey and CallbackKey. Executor sites that need task-only fields narrow via isinstance(key, TaskInstanceKey).
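
To illustrate the key discrimination described above, here is a minimal, self-contained sketch. The `TaskInstanceKey` and `CallbackKey` classes are simplified stand-ins rather than Airflow's real definitions, and the annotation names, defaults, and body of `annotations_to_key` are assumptions made for illustration; the PR's actual implementation may differ.

```python
from __future__ import annotations

from typing import NamedTuple, Union


class TaskInstanceKey(NamedTuple):
    """Simplified stand-in for Airflow's TaskInstanceKey (not the real class)."""

    dag_id: str
    task_id: str
    run_id: str
    try_number: int = 1
    map_index: int = -1


class CallbackKey(NamedTuple):
    """Hypothetical stand-in: a callback is identified by its id alone."""

    callback_id: str


# The union that executor code passes around.
WorkloadKey = Union[TaskInstanceKey, CallbackKey]


def annotations_to_key(annotations: dict[str, str]) -> WorkloadKey:
    """Build a key from pod annotations (annotation names are assumed)."""
    # A callback pod carries callback_id instead of the task-identifying fields.
    if "callback_id" in annotations:
        return CallbackKey(annotations["callback_id"])
    return TaskInstanceKey(
        dag_id=annotations["dag_id"],
        task_id=annotations["task_id"],
        run_id=annotations["run_id"],
        try_number=int(annotations.get("try_number", "1")),
        map_index=int(annotations.get("map_index", "-1")),
    )


def describe(key: WorkloadKey) -> str:
    """Sites that need task-only fields narrow with isinstance first."""
    if isinstance(key, TaskInstanceKey):
        return f"task {key.dag_id}.{key.task_id} (try {key.try_number})"
    return f"callback {key.callback_id}"
```

Narrowing with `isinstance` keeps task-only attributes such as `try_number` out of the callback path, and a type checker can verify each branch.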

Was generative AI tooling used to co-author this PR?
  • Yes (Claude Code (Opus 4.7))

Generated-by: [Tool Name] following the guidelines


  • Read the Pull Request Guidelines for more information. Note: commit author/co-author name and email in commits become permanently public when merged.
  • For fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
  • When adding dependency, check compliance with the ASF 3rd Party License Policy.
  • For significant user-facing changes create newsfragment: {pr_number}.significant.rst, in airflow-core/newsfragments. You can add this file in a follow-up commit after the PR is created so you know the PR number.

sjyangkevin and others added 7 commits May 24, 2026 17:45
Run synchronous executor callbacks (e.g. deadline alerts) as Kubernetes
pods, using the same pod pipeline as task execution. Callbacks are
dispatched via annotation-based key discrimination in the watcher, and
their pod exit code maps to CallbackState.SUCCESS/FAILED.

Also extends execute_workload.py (task-sdk) to handle ExecuteCallback
workloads inside pods, making it the unified entrypoint for both tasks
and callbacks in containerised executors.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
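
The exit-code mapping mentioned in the commit message could look roughly like the sketch below. `CallbackState` here is a stand-in enum containing only the two states named above, its string values are assumed, and treating a missing exit code as a failure is also an assumption rather than something the PR states.

```python
from __future__ import annotations

from enum import Enum


class CallbackState(str, Enum):
    """Stand-in enum; the member values are assumptions."""

    SUCCESS = "success"
    FAILED = "failed"


def callback_state_from_exit_code(exit_code: int | None) -> CallbackState:
    """Map a callback pod's container exit code to a callback state."""
    # Assumption: no exit code (pod never started or vanished) counts as failure.
    if exit_code == 0:
        return CallbackState.SUCCESS
    return CallbackState.FAILED
```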
@boring-cyborg (bot) added the area:providers, area:task-sdk, and provider:cncf-kubernetes labels May 24, 2026
Contributor

@ferruzzi left a comment

The approach is solid and the test coverage looks good. The main issues concern backward compatibility. Look at how ECS (#63657) and Celery (#63888) handled these and follow their lead.


RUNNING_POD_LOG_LINES = 100
supports_ad_hoc_ti_run: bool = True
supports_callbacks: bool = AIRFLOW_V_3_3_PLUS

Contributor

This is clever, but I don't know if it actually works. In the other executors we used:

     if AIRFLOW_V_3_3_PLUS:
         supports_callbacks: bool = True

Which leaves supports_callbacks undefined rather than False.
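
The difference between the two forms can be shown with plain Python. This is a minimal sketch: `OldBaseExecutor` is a hypothetical base class that does not define `supports_callbacks` at all, and the flag is hard-coded to `False` to mimic an older core. Whether any supported Airflow base class actually lacks the attribute is not something this sketch establishes.

```python
AIRFLOW_V_3_3_PLUS = False  # pretend we are running against an older core


class OldBaseExecutor:
    """Hypothetical base class that predates supports_callbacks."""


class DirectAssignExecutor(OldBaseExecutor):
    # Always defines the attribute; on older versions it is explicitly False.
    supports_callbacks: bool = AIRFLOW_V_3_3_PLUS


class ConditionalExecutor(OldBaseExecutor):
    # Only defines the attribute when the flag is true; otherwise the
    # class body adds nothing and lookup falls through to the base class.
    if AIRFLOW_V_3_3_PLUS:
        supports_callbacks: bool = True


print("supports_callbacks" in vars(DirectAssignExecutor))
print(hasattr(ConditionalExecutor, "supports_callbacks"))
```

With the conditional form, code that reads the attribute on an older core needs `getattr(executor, "supports_callbacks", False)` or a base-class default; if the base class does define it, the conditional form simply inherits that default.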

from sqlalchemy import select
from urllib3.exceptions import HTTPError

from airflow.models.callback import CallbackKey

Contributor

Pretty sure this needs an if AIRFLOW_V_3_3_PLUS: wrapper, right?


from airflow.cli.cli_config import GroupCommand
from airflow.executors import workloads
from airflow.executors.workloads.types import WorkloadKey

Contributor

Pretty sure this needs a back-compat wrapper. Check the ECS executor, where they did:

     if AIRFLOW_V_3_3_PLUS:
         from airflow.executors.workloads.types import WorkloadKey as _EcsWorkloadKey
         WorkloadKey: TypeAlias = _EcsWorkloadKey
     else:
         WorkloadKey: TypeAlias = TaskInstanceKey

- key: TaskInstanceKey
- command: Sequence[str]
+ key: WorkloadKey
+ command: Sequence[Any]

Contributor

I liked the way the ECS Executor handled this; they did

  if AIRFLOW_V_3_3_PLUS:
      CommandType: TypeAlias = Sequence[str] | Sequence[ExecuteTask] | Sequence[ExecuteCallback]
  else:
      CommandType: TypeAlias = Sequence[str]

and then declared command as a CommandType instead of Sequence[Any].

Comment on lines -235 to -241
def queue_workload(self, workload: workloads.All, session: Session | None) -> None:
    from airflow.executors import workloads

    if not isinstance(workload, workloads.ExecuteTask):
        raise RuntimeError(f"{type(self)} cannot handle workloads of type {type(workload)}")
    ti = workload.ti
    self.queued_tasks[ti.key] = workload

Contributor

ECS and Celery both had to keep this for back-compat and added a note: "# TODO: Remove this once the minimum supported version is 3.3+, and defer to BaseExecutor.queue_workload". I'm reasonably certain that should be applied here as well, unless you know of a reason to drop it that I've missed.
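
One possible shape for a retained override is sketched below. Everything here is a stand-in: `ExecuteTask`, `ExecuteCallback`, `SketchExecutor`, and the `queued_callbacks` attribute are hypothetical simplifications, not the ECS, Celery, or Kubernetes executor code, so treat it as an illustration of the version-gated pattern only.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any

AIRFLOW_V_3_3_PLUS = True  # set to False to mimic an older core


@dataclass
class ExecuteTask:
    key: str  # stand-in for workload.ti.key


@dataclass
class ExecuteCallback:
    callback_id: str  # hypothetical identifier for the callback


@dataclass
class SketchExecutor:
    supports_callbacks: bool = AIRFLOW_V_3_3_PLUS
    queued_tasks: dict[str, Any] = field(default_factory=dict)
    queued_callbacks: dict[str, Any] = field(default_factory=dict)

    # TODO: Remove this once the minimum supported version is 3.3+,
    # and defer to BaseExecutor.queue_workload.
    def queue_workload(self, workload: Any, session: Any = None) -> None:
        if isinstance(workload, ExecuteTask):
            self.queued_tasks[workload.key] = workload
            return
        # Callbacks are accepted only when the running core supports them.
        if self.supports_callbacks and isinstance(workload, ExecuteCallback):
            self.queued_callbacks[workload.callback_id] = workload
            return
        raise RuntimeError(f"{type(self)} cannot handle workloads of type {type(workload)}")
```

On an older core the callback branch is skipped, so the method behaves like the removed lines above and rejects anything that is not an `ExecuteTask`.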
