Introduce BaseTaskInstanceDTO and duplicate it across core and task-sdk #67174

Merged
uranusjr merged 1 commit into apache:main from jason810496:task-sdk/feature/base-task-instance-dto
May 19, 2026

Conversation

@jason810496
Member

Why

Extract BaseTaskInstanceDTO from PR #65958. As pointed out in #65958 (comment), the Task SDK's use of airflow.sdk.api.datamodels._generated.TaskInstance since Airflow 3.0 is inaccurate.

Additionally, this improvement should not be blocked by the AIP-108 vote.

What

Extract the executor/Task-SDK common fields of TaskInstanceDTO into BaseTaskInstanceDTO so the same minimal schema can be defined in both:

  • airflow-core/src/airflow/executors/workloads/task.py
  • task-sdk/src/airflow/sdk/execution_time/workloads/task.py

The classes are intentionally duplicated rather than shared, because the core TaskInstanceDTO subclass needs the executor-specific external_executor_id field and a key property that depends on airflow.models.taskinstancekey, which the Task SDK distribution cannot import.
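
A minimal sketch of the layout described above, not the actual code in this PR. It uses `dataclasses` instead of the real model base class so it runs without Airflow installed. The field list and the `TaskInstanceKey` stand-in are assumptions chosen for illustration. Only `external_executor_id` and the `key` property are named in the description.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import NamedTuple


class TaskInstanceKey(NamedTuple):
    """Hypothetical stand-in for the core-only key type."""

    dag_id: str
    task_id: str
    run_id: str
    try_number: int
    map_index: int


@dataclass
class BaseTaskInstanceDTO:
    """Minimal schema that would be duplicated in core and the Task SDK.

    The fields below are illustrative assumptions, not the real field list.
    """

    dag_id: str
    task_id: str
    run_id: str
    try_number: int
    map_index: int = -1


@dataclass
class TaskInstanceDTO(BaseTaskInstanceDTO):
    """Core-side subclass: adds the executor-specific pieces."""

    external_executor_id: str | None = None

    @property
    def key(self) -> TaskInstanceKey:
        # Depends on a core-only type, which is why this lives in the subclass.
        return TaskInstanceKey(
            self.dag_id, self.task_id, self.run_id, self.try_number, self.map_index
        )
```

Keeping `key` on the subclass means the base class has no import that the Task SDK distribution cannot satisfy.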

A new prek hook check-task-instance-dto-sync AST-compares the bases and annotated fields of BaseTaskInstanceDTO in both files and fails when they drift, so the duplication stays in sync. The hook runs only when either task.py changes.
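
For readers unfamiliar with this kind of check, here is a rough sketch of an AST comparison of bases and annotated fields. It is not the hook added in this PR. The helper names `class_schema` and `in_sync` are hypothetical, and the sketch assumes only field names and annotation text are compared (defaults and docstrings are ignored). It needs Python 3.9+ for `ast.unparse`.

```python
import ast


def class_schema(source: str, class_name: str = "BaseTaskInstanceDTO"):
    """Return (bases, {field: annotation}) for the named class in source."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, ast.ClassDef) and node.name == class_name:
            bases = [ast.unparse(base) for base in node.bases]
            fields = {
                stmt.target.id: ast.unparse(stmt.annotation)
                for stmt in node.body
                # Only annotated assignments such as `task_id: str` count as fields.
                if isinstance(stmt, ast.AnnAssign) and isinstance(stmt.target, ast.Name)
            }
            return bases, fields
    raise ValueError(f"class {class_name!r} not found")


def in_sync(core_source: str, sdk_source: str) -> bool:
    """True when both copies declare the same bases and annotated fields."""
    return class_schema(core_source) == class_schema(sdk_source)


CORE = "class BaseTaskInstanceDTO(BaseModel):\n    task_id: str\n    try_number: int\n"
SDK = 'class BaseTaskInstanceDTO(BaseModel):\n    """SDK copy."""\n    task_id: str\n    try_number: int\n'
DRIFTED = "class BaseTaskInstanceDTO(BaseModel):\n    task_id: str\n    try_number: str\n"
```

Because the sources are only parsed and never imported, the check does not need either distribution to be importable from the other.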


Was generative AI tooling used to co-author this PR?

@boring-cyborg (bot) added the area:dev-tools, area:Executors-core, area:task-sdk, and backport-to-v3-2-test labels May 19, 2026
@jason810496 self-assigned this May 19, 2026
@jason810496 removed the backport-to-v3-2-test label May 19, 2026
@jason810496 requested a review from Copilot May 19, 2026 11:49
Contributor

Copilot AI left a comment

Pull request overview

This PR introduces a duplicated BaseTaskInstanceDTO model to define the minimal TaskInstance schema shared between executor workloads (airflow-core) and execution-time communication in the Task SDK, while allowing each side’s TaskInstanceDTO to add its own extras. It also adds a new prek/pre-commit hook to enforce that the duplicated base definitions remain in sync.

Changes:

  • Extract BaseTaskInstanceDTO from TaskInstanceDTO in airflow-core, and make the core TaskInstanceDTO a subclass that adds executor-only fields/behavior.
  • Add a matching BaseTaskInstanceDTO (and SDK TaskInstanceDTO subclass) to the Task SDK execution-time workloads package.
  • Add check-task-instance-dto-sync prek hook that AST-compares bases and annotated fields of BaseTaskInstanceDTO across core and task-sdk.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| task-sdk/src/airflow/sdk/execution_time/workloads/task.py | Adds BaseTaskInstanceDTO and SDK-side TaskInstanceDTO for execution-time workload schemas. |
| task-sdk/src/airflow/sdk/execution_time/workloads/__init__.py | Exposes TaskInstanceDTO from the new workloads package. |
| scripts/ci/prek/check_task_instance_dto_sync.py | New prek hook to ensure core/SDK BaseTaskInstanceDTO definitions don’t drift. |
| airflow-core/src/airflow/executors/workloads/task.py | Extracts shared fields into BaseTaskInstanceDTO; keeps executor-specific pieces in core TaskInstanceDTO. |
| .pre-commit-config.yaml | Registers the new check-task-instance-dto-sync hook and scopes it to the two task.py files. |
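
To illustrate the scoping mentioned in the last row, the sketch below shows how a file filter limited to the two task.py paths could behave. It assumes the hook's file filter is a Python regular expression searched against each changed path, as in pre-commit style configs. The pattern and the `hook_should_run` helper are hypothetical, not copied from the PR.

```python
import re

# Hypothetical pattern: matches exactly the two duplicated task.py files.
HOOK_FILES = re.compile(
    r"^(airflow-core/src/airflow/executors/workloads/task\.py"
    r"|task-sdk/src/airflow/sdk/execution_time/workloads/task\.py)$"
)


def hook_should_run(changed_files):
    """Return True when any changed path falls inside the hook's scope."""
    return any(HOOK_FILES.search(path) for path in changed_files)
```

Anchoring the pattern with `^` and `$` keeps unrelated files that happen to be named task.py from triggering the check.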

Comment thread airflow-core/src/airflow/executors/workloads/task.py
Member

@uranusjr uranusjr left a comment

This should work

@uranusjr uranusjr merged commit 16ebf0b into apache:main May 19, 2026
103 of 137 checks passed
3 participants