Introduce BaseTaskInstanceDTO and duplicate it across core and task-sdk #67174
Merged
uranusjr merged 1 commit on May 19, 2026
Pull request overview
This PR introduces a duplicated BaseTaskInstanceDTO model to define the minimal TaskInstance schema shared between executor workloads (airflow-core) and execution-time communication in the Task SDK, while allowing each side’s TaskInstanceDTO to add its own extras. It also adds a new prek/pre-commit hook to enforce that the duplicated base definitions remain in sync.
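The base/subclass split described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: it uses standard-library dataclasses so it runs on its own (the real models may use a different base class), the base field names are illustrative rather than the actual field list, and `TaskInstanceKey` here is a hypothetical stand-in for the type in `airflow.models.taskinstancekey`. Only `external_executor_id` and the `key` property are named in the PR description.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import NamedTuple


class TaskInstanceKey(NamedTuple):
    """Hypothetical stand-in for the key type that only airflow-core can import."""

    dag_id: str
    task_id: str
    run_id: str
    try_number: int = 1
    map_index: int = -1


@dataclass
class BaseTaskInstanceDTO:
    """Minimal shared schema; this class body would be duplicated in core and the Task SDK."""

    # Illustrative fields only (assumption), not the PR's actual field list.
    dag_id: str
    task_id: str
    run_id: str
    try_number: int
    map_index: int = -1


@dataclass
class TaskInstanceDTO(BaseTaskInstanceDTO):
    """Core-side subclass carrying the executor-only extras."""

    external_executor_id: str | None = None

    @property
    def key(self) -> TaskInstanceKey:
        # Depends on a core-only type, which is why this cannot live in the shared base.
        return TaskInstanceKey(
            self.dag_id, self.task_id, self.run_id, self.try_number, self.map_index
        )
```

Under this layout, the SDK side would define its own `TaskInstanceDTO` subclass of an identical `BaseTaskInstanceDTO`, without the `key` property or the core-only import.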
Changes:
- Extract `BaseTaskInstanceDTO` from `TaskInstanceDTO` in `airflow-core`, and make the core `TaskInstanceDTO` a subclass that adds executor-only fields/behavior.
- Add a matching `BaseTaskInstanceDTO` (and SDK `TaskInstanceDTO` subclass) to the Task SDK execution-time workloads package.
- Add `check-task-instance-dto-sync` prek hook that AST-compares bases and annotated fields of `BaseTaskInstanceDTO` across core and task-sdk.
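One way such an AST comparison could work is sketched below, using only the standard-library `ast` module. It is not the hook's actual implementation: the helper names `extract_schema` and `in_sync` and the sample sources are hypothetical, and this version compares only base-class names and field annotations, ignoring default values and field order.

```python
import ast

CLASS_NAME = "BaseTaskInstanceDTO"


def extract_schema(source: str, class_name: str = CLASS_NAME) -> tuple[list[str], dict[str, str]]:
    """Return (base class names, {field name: annotation}) for class_name in source."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, ast.ClassDef) and node.name == class_name:
            bases = [ast.unparse(base) for base in node.bases]
            # Only annotated assignments at class level count as fields.
            fields = {
                stmt.target.id: ast.unparse(stmt.annotation)
                for stmt in node.body
                if isinstance(stmt, ast.AnnAssign) and isinstance(stmt.target, ast.Name)
            }
            return bases, fields
    raise ValueError(f"{class_name} not found")


def in_sync(core_source: str, sdk_source: str) -> bool:
    """True when both sources define the class with the same bases and annotated fields."""
    return extract_schema(core_source) == extract_schema(sdk_source)


# Hypothetical sample sources standing in for the two task.py files.
CORE_SRC = """
class BaseTaskInstanceDTO(BaseModel):
    task_id: str
    try_number: int
"""

SDK_DRIFTED_SRC = """
class BaseTaskInstanceDTO(BaseModel):
    task_id: str
"""

if __name__ == "__main__":
    print(in_sync(CORE_SRC, CORE_SRC))         # identical definitions
    print(in_sync(CORE_SRC, SDK_DRIFTED_SRC))  # SDK copy is missing a field
```

Comparing parsed syntax rather than raw text means comments, docstrings, and whitespace differences between the two copies would not trigger a failure, while a missing or retyped field would.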
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| task-sdk/src/airflow/sdk/execution_time/workloads/task.py | Adds BaseTaskInstanceDTO and SDK-side TaskInstanceDTO for execution-time workload schemas. |
| task-sdk/src/airflow/sdk/execution_time/workloads/__init__.py | Exposes TaskInstanceDTO from the new workloads package. |
| scripts/ci/prek/check_task_instance_dto_sync.py | New prek hook to ensure core/SDK BaseTaskInstanceDTO definitions don’t drift. |
| airflow-core/src/airflow/executors/workloads/task.py | Extracts shared fields into BaseTaskInstanceDTO; keeps executor-specific pieces in core TaskInstanceDTO. |
| .pre-commit-config.yaml | Registers the new check-task-instance-dto-sync hook and scopes it to the two task.py files. |
Why
Extract `BaseTaskInstanceDTO` from PR #65958 because, as pointed out in a comment on #65958, the use of `airflow.sdk.api.datamodels._generated.TaskInstance` in the Task SDK since Airflow 3.0 is inaccurate.

Additionally, this improvement should not be blocked by the AIP-108 vote.
What
Extract the executor/Task-SDK common fields of `TaskInstanceDTO` into `BaseTaskInstanceDTO` so the same minimal schema can be defined in both:

- `airflow-core/src/airflow/executors/workloads/task.py`
- `task-sdk/src/airflow/sdk/execution_time/workloads/task.py`

The classes are intentionally duplicated rather than shared, because the core `TaskInstanceDTO` subclass needs the executor-specific `external_executor_id` field and a `key` property that depends on `airflow.models.taskinstancekey`, which the Task SDK distribution cannot import.

A new prek hook `check-task-instance-dto-sync` AST-compares the bases and annotated fields of `BaseTaskInstanceDTO` in both files and fails when they drift, so the duplication stays in sync. The hook runs only when either `task.py` changes.

Was generative AI tooling used to co-author this PR?