Fix dagrun.duration.failed missing run_type tag on dagrun_timeout #64768

Open
mahirhiro wants to merge 2 commits into apache:main from mahirhiro:fix/dagrun-timeout-stats-tags

@mahirhiro

When a dag run fails due to dagrun_timeout, the DualStatsManager.timing call for dagrun.duration.failed used tags={"dag_id": dag_run.dag_id} instead of dag_run.stats_tags. This meant run_type was always absent from timeout-caused failures, making it impossible to filter the metric by run_type in monitoring queries.

The normal finish path in dagrun.py correctly uses self.stats_tags (which includes both dag_id and run_type). This aligns the timeout path to match.
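
The difference between the two tag sets can be illustrated with a minimal, self-contained sketch. It does not use Airflow's real classes: `FakeDagRun`, `RecordingStats`, and both `emit_*` helpers are hypothetical stand-ins, and only the metric name and the `stats_tags` property name are taken from the description above.

```python
from dataclasses import dataclass, field


@dataclass
class FakeDagRun:
    """Hypothetical stand-in for a DagRun; not Airflow's real model."""

    dag_id: str
    run_type: str

    @property
    def stats_tags(self) -> dict:
        # Mirrors the description above: both dag_id and run_type are included.
        return {"dag_id": self.dag_id, "run_type": self.run_type}


@dataclass
class RecordingStats:
    """Hypothetical metrics client that records calls instead of sending them."""

    calls: list = field(default_factory=list)

    def timing(self, name: str, value: float, tags: dict) -> None:
        self.calls.append((name, value, dict(tags)))


def emit_timeout_failure_before(stats, dag_run, duration):
    # Old timeout path: only dag_id is tagged, so run_type is lost.
    stats.timing("dagrun.duration.failed", duration, tags={"dag_id": dag_run.dag_id})


def emit_timeout_failure_after(stats, dag_run, duration):
    # Fixed timeout path: reuse the same tags as the normal finish path.
    stats.timing("dagrun.duration.failed", duration, tags=dag_run.stats_tags)


stats = RecordingStats()
run = FakeDagRun(dag_id="example_dag", run_type="scheduled")
emit_timeout_failure_before(stats, run, 61.0)
emit_timeout_failure_after(stats, run, 61.0)

before_tags = stats.calls[0][2]
after_tags = stats.calls[1][2]
print("run_type" in before_tags, "run_type" in after_tags)
```

In this sketch, a monitoring query that filters on `run_type` would never match the first emission, which is the symptom described above.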

closes: #64765


Was generative AI tooling used to co-author this PR?
  • Yes — Claude Code (claude-sonnet-4-6)

Generated-by: Claude Code (claude-sonnet-4-6) following the guidelines

When a dag run fails due to dagrun_timeout, the Stats.timing call for
dagrun.duration.failed used tags={"dag_id": dag_run.dag_id} instead of
dag_run.stats_tags. This meant run_type was always absent from timeout-
caused failures, making it impossible to filter the metric by run_type
in monitoring queries.

Aligns the timeout path with the normal finish path in dagrun.py which
correctly uses stats_tags.

closes: apache#64765
@mahirhiro mahirhiro requested review from XD-DENG and ashb as code owners April 6, 2026 15:32
@boring-cyborg

boring-cyborg bot commented Apr 6, 2026

Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about anything, please check our Contributors' Guide (https://github.com/apache/airflow/blob/main/contributing-docs/README.rst).
Here are some useful points:

  • Pay attention to the quality of your code (ruff, mypy and type annotations). Our prek-hooks will help you with that.
  • In case of a new feature, add useful documentation (in docstrings or in the docs/ directory). Adding a new operator? Check this short guide. Consider adding an example DAG that shows how users should use it.
  • Consider using Breeze environment for testing locally, it's a heavy docker but it ships with a working Airflow and a lot of integrations.
  • Be patient and persistent. It might take some time to get a review or get the final approval from Committers.
  • Please follow ASF Code of Conduct for all communication including (but not limited to) comments on Pull Requests, Mailing list and Slack.
  • Be sure to read the Airflow Coding style.
  • Always keep your Pull Requests rebased, otherwise your build might fail due to changes not related to your commits.

Apache Airflow is a community-driven project and together we are making it better 🚀.
In case of doubts contact the developers at:
Mailing List: dev@airflow.apache.org
Slack: https://s.apache.org/airflow-slack

@boring-cyborg boring-cyborg bot added the area:Scheduler including HA (high availability) scheduler label Apr 6, 2026

Copilot AI left a comment

Pull request overview

Aligns dagrun.duration.failed metric tagging for DagRun timeout failures with the normal DagRun completion path so run_type is consistently included.

Changes:

  • Emit dagrun.duration.failed on dagrun_timeout using dag_run.stats_tags (includes dag_id + run_type).
  • Add a unit regression test ensuring run_type is present in the emitted metric tags for timeout-caused failures.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| airflow-core/src/airflow/jobs/scheduler_job_runner.py | Uses dag_run.stats_tags when timing dagrun.duration.failed in the timeout failure path to include run_type. |
| airflow-core/tests/unit/jobs/test_scheduler_job.py | Adds a regression test asserting run_type is included in dagrun.duration.failed tags for timeout failures. |

Comment on lines +3395 to +3404
    with dag_maker(
        dag_id="test_scheduler_fail_dagrun_timeout_stats",
        dagrun_timeout=datetime.timedelta(seconds=60),
        schedule="@daily",
        session=session,
    ):
        EmptyOperator(task_id="dummy")

    dr = dag_maker.create_dagrun(start_date=timezone.utcnow() - datetime.timedelta(days=1))

Copilot AI Apr 9, 2026

dag_maker.create_dagrun(...) defaults run_type to DagRunType.MANUAL unless explicitly provided (see devel-common/src/tests_common/pytest_plugin.py:1118-1120), so schedule="@daily" here doesn’t make this DagRun “scheduled”. To avoid the test being misleading (and to better match the reported impact), consider either passing run_type=DagRunType.SCHEDULED and asserting the expected value, or dropping the schedule argument.
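
The point about the default can be shown with a small sketch. The `RunType` enum and the `create_dagrun` function below are hypothetical stand-ins written for illustration, not the real `dag_maker` fixture or `DagRunType`. They only model the behaviour described in the comment above, where the run type falls back to manual unless it is passed explicitly.

```python
from enum import Enum
from typing import Optional


class RunType(str, Enum):
    """Hypothetical stand-in for DagRunType."""

    MANUAL = "manual"
    SCHEDULED = "scheduled"


def create_dagrun(schedule: Optional[str] = None, run_type: Optional[RunType] = None) -> dict:
    """Hypothetical factory: run_type falls back to MANUAL regardless of schedule."""
    if run_type is None:
        run_type = RunType.MANUAL
    return {"schedule": schedule, "run_type": run_type}


# A schedule on the DAG does not change the default run type of the created run.
implicit = create_dagrun(schedule="@daily")
# Passing run_type explicitly is what makes the run "scheduled".
explicit = create_dagrun(run_type=RunType.SCHEDULED)
print(implicit["run_type"].value, explicit["run_type"].value)
```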

…SCHEDULED run type

Pass run_type=DagRunType.SCHEDULED explicitly to create_dagrun so the test
actually exercises the scheduled run type scenario. Also assert the tag value
rather than just its presence, and drop the schedule="@daily" arg which had
no effect on run_type.
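
A rough sketch of why asserting the tag value is stricter than asserting its presence, using only `unittest.mock`. The `stats` object here is a plain mock standing in for the stats client, and the emitted tags are made up for illustration; this is not the actual test in the PR.

```python
from unittest import mock

stats = mock.Mock()  # hypothetical stand-in for the stats client

# Simulate an emission that carries the wrong run_type value.
stats.timing("dagrun.duration.failed", 61.0, tags={"dag_id": "d", "run_type": "manual"})

# call_args unpacks into (positional args, keyword args) of the last call.
_, kwargs = stats.timing.call_args
tags = kwargs["tags"]

# A presence-only check passes even though the value is not the expected one.
presence_check_passes = "run_type" in tags
# A value check catches the mismatch.
value_check_passes = tags.get("run_type") == "scheduled"
print(presence_check_passes, value_check_passes)
```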
@kaxil kaxil requested a review from Copilot April 10, 2026 19:55

Copilot AI left a comment

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

Labels

area:Scheduler including HA (high availability) scheduler

Development

Successfully merging this pull request may close these issues.

dagrun.duration.failed metric missing run_type tag when failure caused by dagrun_timeout

2 participants