Fix missing task.duration metric for tasks that get skipped after starting. #67943

Open
myps6415 wants to merge 1 commit into apache:main from myps6415:fix-task-duration-skipped-61849

@myps6415 myps6415 commented Jun 3, 2026

Summary

task.duration (and its registry-derived legacy name dag.<dag_id>.<task_id>.duration) was emitted for tasks ending in SUCCESS or FAILED but missing entirely for SKIPPED tasks, despite the metric being documented as available for all terminal states. Reproducible with ShortCircuitOperator, BranchPythonOperator, and BashOperator with skip_on_exit_code (narrowed in #61849 by @shivaam).

Fix

The metric is emitted from finalize() in the Task SDK task_runner, gated on if ti.start_date and ti.end_date:. The SUCCESS and FAILED exception handlers set ti.end_date on the local RuntimeTaskInstance before constructing the outbound TaskState message, so the guard passes. The two SKIPPED handlers (AirflowSkipException, and the SKIPPED branch of DagRunTriggerException with skip_when_already_exists=True) set end_date only on the outbound TaskState message, leaving ti.end_date as None — so finalize()'s guard failed for skipped tasks.

Set ti.end_date on the local instance in both handlers, mirroring the AirflowFailException and _handle_current_task_success patterns. The TaskState message references ti.end_date to keep the local instance and the outbound message in sync.
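
The effect of the guard can be reproduced in isolation. The following is a minimal sketch, not the Task SDK code: FakeTaskInstance, finalize, and the two skip handlers are hypothetical stand-ins that model only the two fields the guard reads, and the outbound message is reduced to a plain dict.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class FakeTaskInstance:
    """Hypothetical stand-in for RuntimeTaskInstance (only the guarded fields)."""

    start_date: Optional[datetime] = None
    end_date: Optional[datetime] = None


def finalize(ti, emitted):
    # Same shape as the guard described above: without end_date nothing is emitted.
    if ti.start_date and ti.end_date:
        emitted.append(("task.duration", ti.end_date - ti.start_date))


def skip_handler_before_fix(ti, now):
    # end_date is placed only on the outbound message; the local instance is untouched.
    return {"state": "skipped", "end_date": now}


def skip_handler_after_fix(ti, now):
    # Set end_date locally first, then build the message from the same value.
    ti.end_date = now
    return {"state": "skipped", "end_date": ti.end_date}


start = datetime(2026, 6, 3, 12, 0, tzinfo=timezone.utc)
end = start + timedelta(seconds=5)

before, after = [], []

ti_old = FakeTaskInstance(start_date=start)
skip_handler_before_fix(ti_old, end)
finalize(ti_old, before)  # guard fails, list stays empty

ti_new = FakeTaskInstance(start_date=start)
msg = skip_handler_after_fix(ti_new, end)
finalize(ti_new, after)  # guard passes, one timing is recorded
```

Building the message from ti.end_date rather than from a second timestamp is what keeps the local instance and the outbound message in agreement.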

Test plan

  • New parametrized test_task_duration_metric_emitted_for_terminal_states covers success / skipped / failed — verifies stats.timing("task.duration", ...) is called with the correct tags and that the legacy dotted form is also emitted via the metrics_template.yaml registry.
  • Self-verified by stashing the production change: the [skipped] parametrize case fails without the fix and passes with it; [success] and [failed] pass in both cases.
  • Existing SKIPPED-related tests (test_run_basic_skipped, test_task_runner_calls_listeners_skipped, etc.) still pass — no regression.
  • ruff format / ruff check / mypy-task-sdk / prek run --from-ref upstream/main --stage pre-commit all green.
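
As a rough illustration of the first bullet, the check can be sketched with unittest.mock alone. This is not the PR's test: emit_duration is a hypothetical emitter, the tag keys are assumed for illustration, and a plain loop stands in for pytest parametrization.

```python
from datetime import datetime, timedelta, timezone
from types import SimpleNamespace
from unittest import mock


def emit_duration(ti, stats):
    """Hypothetical emitter modelled on the guard in finalize()."""
    if ti.start_date and ti.end_date:
        duration = ti.end_date - ti.start_date
        # Tagged form; the tag keys are an assumption made for this sketch.
        stats.timing(
            "task.duration",
            duration,
            tags={"dag_id": ti.dag_id, "task_id": ti.task_id},
        )
        # Legacy dotted form, written out by hand here instead of via a registry.
        stats.timing(f"dag.{ti.dag_id}.{ti.task_id}.duration", duration)


def check_state(state):
    stats = mock.MagicMock()
    start = datetime(2026, 6, 3, tzinfo=timezone.utc)
    ti = SimpleNamespace(
        dag_id="d",
        task_id="t",
        state=state,
        start_date=start,
        end_date=start + timedelta(seconds=3),
    )
    emit_duration(ti, stats)
    # Both the tagged and the legacy name must have been emitted.
    stats.timing.assert_any_call(
        "task.duration", timedelta(seconds=3), tags={"dag_id": "d", "task_id": "t"}
    )
    stats.timing.assert_any_call("dag.d.t.duration", timedelta(seconds=3))
    return stats.timing.call_count


results = {state: check_state(state) for state in ("success", "skipped", "failed")}
```

Once end_date is set on the local instance, emission no longer depends on the terminal state, which is why the same assertions hold for all three cases.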

Note: test_handle_trigger_dag_run_conflict[True-skipped] was already failing on upstream/main because of a mock-assertion mismatch: the test asserts mock.call.send(msg=TriggerDagRun(...)) with a keyword argument, but _handle_trigger_dag_run calls SUPERVISOR_COMMS.send(TriggerDagRun(...)) positionally. That failure is unrelated to this PR.
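
The kind of mismatch described in the note can be reproduced with unittest.mock alone. In this sketch, comms and payload are hypothetical stand-ins for SUPERVISOR_COMMS and TriggerDagRun(...). On a MagicMock created without a spec, a call recorded with a positional argument does not compare equal to an expected call written with a keyword argument.

```python
from unittest import mock

comms = mock.MagicMock()         # stand-in for SUPERVISOR_COMMS
payload = {"dag_id": "example"}  # stand-in for TriggerDagRun(...)

# The code under test passes the message positionally.
comms.send(payload)

# An expectation written positionally matches the recorded call...
positional_matches = mock.call.send(payload) in comms.mock_calls
# ...but the same expectation written as a keyword argument does not.
keyword_matches = mock.call.send(msg=payload) in comms.mock_calls

try:
    comms.send.assert_called_once_with(msg=payload)
    keyword_assert_passed = True
except AssertionError:
    keyword_assert_passed = False
```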

closes: #61849


Was generative AI tooling used to co-author this PR?
  • Yes — Claude Code (Opus 4.7)

Generated-by: Claude Code (Opus 4.7) following the guidelines

task.duration (and its registry-derived legacy name
dag.<dag_id>.<task_id>.duration) was emitted for tasks ending in
SUCCESS or FAILED but missing for SKIPPED tasks, despite being
documented as available for all terminal states.

The metric is emitted from finalize() in the Task SDK task_runner,
gated on `if ti.start_date and ti.end_date:`. The SUCCESS and FAILED
exception handlers set ti.end_date on the local RuntimeTaskInstance
before constructing the outbound TaskState message, so the guard
passes. The AirflowSkipException handler (and the SKIPPED branch of
DagRunTriggerException with skip_when_already_exists=True) set
end_date only on the outbound TaskState message, leaving
ti.end_date as None — so finalize()'s guard failed and the metric
was never emitted for skipped tasks.

Set ti.end_date on the local instance in both SKIPPED handlers,
mirroring the AirflowFailException and _handle_current_task_success
patterns. The TaskState message references ti.end_date to keep the
local instance and the outbound message in sync.

@ashb changed the title from "Fix missing task.duration metric for skipped tasks" to "Fix missing task.duration metric for tasks that get skipped after starting." on Jun 3, 2026
Development

Successfully merging this pull request may close these issues.

Metric dag.<dag_id>.<task_id>.duration is missing