feat: implement DagTaskGroupsExistence and DagTasksExistence endpoints#67832
Open
dzpan0 wants to merge 2 commits into
Adds two GET endpoints to the Execution API for batched existence checks against a Dag's tasks and task groups. Each takes a list of ids and returns them partitioned into 'existing' and 'missing' with a 200; a 404 is returned only when the Dag itself is missing. Passing an empty list works as a Dag existence probe. These let clients get accurate information about a Dag, even when the Dag has never been run. related: apache#40745. Co-authored-by: Diogo Callado <diogo.callado@tecnico.ulisboa.pt>
The new existence check and tests for ExternalTaskSensor use the newly added endpoints, so they are excluded for lower versions.
Description
Follow-up to #64394, which fixed `ExternalTaskSensor`'s `check_existence=True` for AF2. On AF3, the deferrable path still defers silently when the external Dag, task, or task group does not exist, because the existence check has no access to the metadata database.

This PR adds two Execution API endpoints that let the sensor verify external references in a single round trip, and wires the sensor to use them before deferring:

- `GET /dags/{dag_id}/tasks/existence?task_ids=...`
- `GET /dags/{dag_id}/task-groups/existence?task_group_ids=...`

Each returns its input partitioned into `existing` and `missing`. Passing an empty `task_ids` list works as a Dag-existence probe (a missing Dag surfaces as `DAG_NOT_FOUND`, which the sensor translates to `ExternalDagNotFoundError`). Missing tasks or task groups surface as `ExternalTaskNotFoundError` and `ExternalTaskGroupNotFoundError`, respectively, matching the AF2 path's behavior.

The Task SDK exposes them via two new `RuntimeTaskInstance` accessors: `get_dag_tasks_existence` and `get_dag_task_groups_existence`.

Why

Existing dag-level Execution API routes can't distinguish a Dag that doesn't exist from a Dag that exists but has no DagRuns: `GET /dag_runs/count` returns 0 and `GET /dag_runs/previous` returns None in both cases. `GET /dags/{dag_id}` settles the existence question on its own, but says nothing about specific tasks or task groups. These new endpoints fill that gap by combining the existence check with a per-id partition.

They answer both questions cheaply and explicitly, which is useful for any future operator or sensor that needs to validate references against an external Dag before deferring or scheduling work. The partitioned response shape (`existing`/`missing`) is also more informative than per-id checks when validating a list: callers get one round trip instead of N, and a single response covers all four outcomes (all present, all missing, partial overlap, Dag itself missing).

Testing
Unit tests added across the API route, datamodels, SDK client, supervisor handler, runtime accessor and sensor layers.
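
For reviewers, the partition contract described above can be illustrated with a minimal, self-contained sketch. It is not the code in this PR: `DagNotFound`, `ExistenceResult`, `check_existence`, and the in-memory `dags` mapping are hypothetical names used only for illustration, and preserving the input order in each list is an assumption.

```python
from dataclasses import dataclass


class DagNotFound(Exception):
    """Hypothetical stand-in for the DAG_NOT_FOUND (404) outcome."""


@dataclass(frozen=True)
class ExistenceResult:
    """Hypothetical response shape: the requested ids, partitioned."""

    existing: list[str]
    missing: list[str]


def check_existence(
    dags: dict[str, set[str]], dag_id: str, ids: list[str]
) -> ExistenceResult:
    """Partition `ids` into those known to `dag_id` and those that are not.

    `dags` maps a Dag id to the set of task (or task group) ids it contains;
    it stands in for whatever lookup the real endpoint performs.
    """
    if dag_id not in dags:
        # The Dag itself is missing: the only case that is not a 200.
        raise DagNotFound(dag_id)
    known = dags[dag_id]
    return ExistenceResult(
        existing=[i for i in ids if i in known],
        missing=[i for i in ids if i not in known],
    )


if __name__ == "__main__":
    dags = {"etl": {"extract", "load"}}
    # Partial overlap: one id exists, one does not.
    print(check_existence(dags, "etl", ["extract", "transform"]))
    # Empty list: acts as a Dag-existence probe and returns two empty lists.
    print(check_existence(dags, "etl", []))
```

Calling `check_existence(dags, "unknown_dag", [])` raises `DagNotFound`, which corresponds to the case the sensor translates to `ExternalDagNotFoundError`.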
related: #40745
Was generative AI tooling used to co-author this PR?
Claude (Opus 4.7) following the project guidelines.