Apache Airflow version
3.1.2
What happened and how to reproduce it?
What happened
The REST endpoint that lists task instances supports start_date_* / end_date_* query parameters (start_date_gte, start_date_lte, end_date_gte, end_date_lte, etc.). For start_date and end_date, Airflow generates SQL predicates on COALESCE(task_instance.start_date OR end_date, now()) instead of comparing those columns directly.
That matches the intended semantics: when start_date or end_date is NULL (for example still-running work), treating the effective timestamp as now() lets those rows intersect a bounded time filter.
On deployments with large task_instance tables, Postgres often chooses a Parallel Seq Scan (or similar full-table-style scan of most of the heap) because WHERE COALESCE(column, now()) is generally hard to satisfy with btree indexes on start_date / end_date.
The observable effect is very slow LIST responses or client timeouts, especially for GET calls that use dags/~/dagRuns/~ alongside end_date_gte / end_date_lte.
Relevant layers (conceptual pointers):
airflow-core/src/airflow/api_fastapi/common/parameters.py — datetime_range_filter_factory; for filter_name in ("start_date", "end_date") uses func.coalesce(attr, func.now()); RangeFilter.to_orm() turns that into WHERE inequalities.
airflow-core/src/airflow/api_fastapi/core_api/routes/public/task_instances.py — list handler injects those dependencies (e.g. Depends(datetime_range_filter_factory("end_date", TI))).
How to reproduce
Prerequisites
- PostgreSQL-backed metadata DB.
task_instance wide enough that sequential scans dominate (typically many millions of rows).
Steps
- Invoke the REST list API with
end_date bounds. Example (adapt base URL, /api/ vs /public/, auth, bundle path as applicable):
GET /api/v2/dags/~/dagRuns/~/taskInstances?limit=1000&end_date_gte=2026-05-03T19:49:33Z&end_date_lte=2026-05-03T22:00:00Z&order_by=end_date
What you think should happen instead?
No response
Operating System
debian
Deployment
Astronomer
Apache Airflow Provider(s)
No response
Versions of Apache Airflow Providers
No response
Official Helm Chart version
Not Applicable
Kubernetes Version
No response
Helm Chart configuration
No response
Docker Image customizations
No response
Anything else?
No response
Are you willing to submit PR?
Code of Conduct
Apache Airflow version
3.1.2
What happened and how to reproduce it?
What happened
The REST endpoint that lists task instances supports
start_date_*/end_date_*query parameters (start_date_gte,start_date_lte,end_date_gte,end_date_lte, etc.). Forstart_dateandend_date, Airflow generates SQL predicates onCOALESCE(task_instance.start_date OR end_date, now())instead of comparing those columns directly.That matches the intended semantics: when
start_dateorend_dateisNULL(for example still-running work), treating the effective timestamp asnow()lets those rows intersect a bounded time filter.On deployments with large
task_instancetables, Postgres often chooses aParallel Seq Scan(or similar full-table-style scan of most of the heap) becauseWHERE COALESCE(column, now())is generally hard to satisfy with btree indexes onstart_date/end_date.The observable effect is very slow LIST responses or client timeouts, especially for
GETcalls that usedags/~/dagRuns/~alongsideend_date_gte/end_date_lte.Relevant layers (conceptual pointers):
airflow-core/src/airflow/api_fastapi/common/parameters.py—datetime_range_filter_factory; forfilter_namein("start_date", "end_date")usesfunc.coalesce(attr, func.now());RangeFilter.to_orm()turns that intoWHEREinequalities.airflow-core/src/airflow/api_fastapi/core_api/routes/public/task_instances.py— list handler injects those dependencies (e.g.Depends(datetime_range_filter_factory("end_date", TI))).How to reproduce
Prerequisites
task_instancewide enough that sequential scans dominate (typically many millions of rows).Steps
end_datebounds. Example (adapt base URL,/api/vs/public/, auth, bundle path as applicable):What you think should happen instead?
No response
Operating System
debian
Deployment
Astronomer
Apache Airflow Provider(s)
No response
Versions of Apache Airflow Providers
No response
Official Helm Chart version
Not Applicable
Kubernetes Version
No response
Helm Chart configuration
No response
Docker Image customizations
No response
Anything else?
No response
Are you willing to submit PR?
Code of Conduct