Fix `airflow dags clear` clearing the wrong day for non-UTC partitioned timetables #67717
Lee-W wants to merge 1 commit
- `airflow dags clear --partition-date-start/end` compared the UTC-parsed bounds straight against `DagRun.partition_date`, so non-UTC timetables cleared the wrong day (UTC+ zones dropped the start day, UTC- zones the end day).
- Convert the bounds through the timetable timezone into a half-open UTC range; add a public `CronMixin.timezone` accessor.
    query = query.where(DagRun.partition_date >= args.partition_date_start)
if args.partition_date_end is not None:
    query = query.where(DagRun.partition_date <= args.partition_date_end)
tt_tz = getattr(dag.timetable, "timezone", None) if dag.timetable.partitioned else None
`tt_tz` is resolved by probing for a `.timezone` attribute, which only `CronMixin`-based timetables have (this PR adds the property). `PartitionedAssetTimetable` (`timetables/simple.py:267`) is also `partitioned = True` but has no `.timezone`, so it silently takes the no-tz branch. That's correct today if asset-partition dates are genuinely UTC-anchored, but the dispatch is duck-typed: any future tz-aware partitioned timetable that doesn't expose `.timezone` will silently fall back to the UTC branch and reintroduce the exact off-by-one this PR fixes, with no error to flag it. Worth either putting the tz accessor on the partitioned-timetable contract so it's explicit which timetables are tz-aware, or branching on a known type.

Minor, related: the two day-bound blocks are nearly identical across the tz and no-tz paths and could share a `_day_bounds(label, tz)` helper to keep them from drifting.
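One possible shape for the shared helper, as a rough sketch only. The signature and return value here are assumptions (the comment names the helper but does not define it), and a missing timezone is treated as UTC to mirror the no-tz branch:

```python
from __future__ import annotations

import datetime

UTC = datetime.timezone.utc


def _day_bounds(
    label: datetime.date, tz: datetime.tzinfo | None
) -> tuple[datetime.datetime, datetime.datetime]:
    """Hypothetical helper: half-open UTC range [start, end) for one calendar day.

    The day is anchored at midnight in ``tz``; ``None`` means UTC.
    """
    zone = tz if tz is not None else UTC
    midnight = datetime.time.min
    start_local = datetime.datetime.combine(label, midnight, tzinfo=zone)
    end_local = datetime.datetime.combine(
        label + datetime.timedelta(days=1), midnight, tzinfo=zone
    )
    return start_local.astimezone(UTC), end_local.astimezone(UTC)


# Fixed UTC+8 offset used as a stand-in for a real timetable timezone.
plus_eight = datetime.timezone(datetime.timedelta(hours=8))
lower_utc, upper_utc = _day_bounds(datetime.date(2026, 2, 19), plus_eight)
utc_lower, utc_upper = _day_bounds(datetime.date(2026, 2, 19), None)
```

With one function serving both paths, the start bound would use the first element for its label and the end bound the second element for its label, so the tz and no-tz branches cannot drift apart.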
# Partitioned runs are stored as local-midnight UTC instants; compare at day
# granularity in the timetable's timezone rather than at the raw UTC instant.
if args.partition_date_start is not None:
    start_label = args.partition_date_start.date()
parsedate returns the parsed instant in UTC (naive input is read as UTC), so .date() takes the UTC calendar day and then re-anchors it to midnight in the timetable tz. For naive values that's intuitive, but a tz-aware CLI value can shift the day: --partition-date-start 2026-02-19T07:00:00+08:00 parses to 2026-02-18T23:00Z, and .date() yields 2026-02-18, not the user's local 2026-02-19. The help text says time-of-day is ignored, but not that the calendar day is read from the parsed (UTC) instant rather than re-projected into the timetable tz. A one-line note would make the as-typed behaviour explicit. Same applies to the end bound at line 182.
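The day shift can be checked with the standard library alone. In this sketch `datetime.fromisoformat` plus a conversion to UTC stands in for the parsing step; it is an assumption that this matches what `parsedate` produces for the same input:

```python
import datetime

UTC = datetime.timezone.utc

# Stand-in for the parsed CLI value (not Airflow's own parser).
typed = datetime.datetime.fromisoformat("2026-02-19T07:00:00+08:00")
as_utc = typed.astimezone(UTC)  # 2026-02-18T23:00:00+00:00

utc_label = as_utc.date()    # 2026-02-18: the day .date() sees after UTC normalisation
typed_label = typed.date()   # 2026-02-19: the calendar day the user typed
```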
if args.partition_date_end is not None:
    end_label = args.partition_date_end.date()
    # Half-open upper bound: include all of the end local calendar day.
    next_day = datetime.date(end_label.year, end_label.month, end_label.day) + datetime.timedelta(
end_label is already a datetime.date (from .date() above), so datetime.date(end_label.year, end_label.month, end_label.day) just rebuilds the same date. This can be next_day = end_label + datetime.timedelta(days=1), which is the simpler form the no-tz branch below already uses.
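A quick check of the equivalence, using an illustrative month-end date rather than anything from the PR:

```python
import datetime

end_label = datetime.date(2026, 2, 28)

# Form used in the diff: rebuild the date field by field, then add a day.
rebuilt = datetime.date(end_label.year, end_label.month, end_label.day) + datetime.timedelta(days=1)
# Suggested form: add the day directly.
direct = end_label + datetime.timedelta(days=1)
```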
Why
`airflow dags clear --partition-date-start/--partition-date-end` parsed the user-supplied dates as UTC and compared those bounds directly against `DagRun.partition_date`. For a timetable whose timezone is not UTC, that comparison is off by a day: UTC+ zones dropped the start day and UTC- zones dropped the end day. A user asking to clear a local-calendar day expects the runs of that local day, not a UTC-shifted window.
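The mismatch can be reproduced with plain `datetime` arithmetic. This is a minimal sketch, not the Airflow code: it assumes, per the comment in the diff, that partitioned runs are stored as local-midnight instants expressed in UTC, it uses fixed offsets (UTC+8 and UTC-5) in place of real timetable timezones, and `stored_partition_date` is a hypothetical helper.

```python
import datetime

UTC = datetime.timezone.utc


def stored_partition_date(local_day: datetime.date, tz: datetime.tzinfo) -> datetime.datetime:
    """Hypothetical: local midnight of local_day in tz, expressed as a UTC instant."""
    return datetime.datetime.combine(local_day, datetime.time.min, tzinfo=tz).astimezone(UTC)


days = [datetime.date(2026, 2, d) for d in (18, 19, 20)]
requested = datetime.datetime(2026, 2, 19, tzinfo=UTC)  # "2026-02-19" parsed as UTC

east = datetime.timezone(datetime.timedelta(hours=8))
west = datetime.timezone(datetime.timedelta(hours=-5))

# Old lower bound (partition_date >= start) in a UTC+ zone: 2026-02-19 is missing.
kept_by_start_east = [d for d in days if stored_partition_date(d, east) >= requested]
# Old upper bound (partition_date <= end) in a UTC- zone: 2026-02-19 is missing.
kept_by_end_west = [d for d in days if stored_partition_date(d, west) <= requested]
```

In both cases the requested local day falls just outside the raw UTC bound, which is the off-by-one described above.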
What
- `--partition-date-start date` is interpreted as local midnight in the timetable timezone, converted to UTC, and used as an inclusive lower bound (`partition_date >= lower_utc`).
- `--partition-date-end date` is interpreted as local midnight of the next day in the timetable timezone, converted to UTC, and used as an exclusive upper bound (`partition_date < upper_utc`). This half-open range keeps the whole requested end day inside the window.
- […] `_timezone` attribute.

Was generative AI tooling used to co-author this PR?
Generated-by: [Claude] following the guidelines