fix(api): Improve historical_metrics_data endpoint performance (#62021) #62152
Open
aksdevs wants to merge 1 commit into apache:main
…e#62021)

Two critical optimizations to reduce database queries and help the query optimizer:

1. Combine DagRun queries from 2 to 1
   - Before: 2 separate queries (one for run_type, one for state) with identical WHERE clauses
   - After: 1 query grouped by (run_type, state), pivoted in Python into separate counts
   - Impact: Eliminates 1 DB round trip, taking the endpoint from 4 to 3 total queries
2. Add explicit TaskInstance.dag_id filter before JOIN
   - Before: TI query only filtered dag_id indirectly via JOIN with DagRun
   - After: Added explicit TaskInstance.dag_id.in_(permitted_dag_ids) filter before JOIN
   - Impact: Enables the database optimizer to use the ti_dag_run index (dag_id, run_id) before the join
   - Critical: With millions of TaskInstances, this dramatically reduces query execution time

These optimizations specifically address the slow retrieval of historical metrics on large installations where TaskInstance counts are in the millions.

Updated test assertion from 4 to 3 queries to reflect the optimization.
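The "pivoted in Python" step can be illustrated with a minimal sketch. This is not the PR's actual code: the function name `pivot_dag_run_counts` and the sample rows are hypothetical, and the rows only assume the shape a query grouped by (run_type, state) would return.

```python
from collections import Counter


def pivot_dag_run_counts(rows):
    """Split (run_type, state, count) rows into per-run_type and per-state totals."""
    by_run_type = Counter()
    by_state = Counter()
    for run_type, state, count in rows:
        # Each grouped row contributes to both breakdowns.
        by_run_type[run_type] += count
        by_state[state] += count
    return dict(by_run_type), dict(by_state)


# Hypothetical rows, shaped like the result of one GROUP BY (run_type, state) query.
rows = [
    ("scheduled", "success", 5),
    ("scheduled", "failed", 2),
    ("manual", "success", 3),
]
by_run_type, by_state = pivot_dag_run_counts(rows)
```

Because every DagRun falls into exactly one (run_type, state) group, summing the grouped counts along either axis should give the same totals the two separate queries produced.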
viiccwen reviewed Feb 19, 2026
Comment on lines +62 to +65
    dag_run_date_filter = (
        func.coalesce(DagRun.start_date, current_time) >= start_date,
        func.coalesce(DagRun.end_date, current_time) <= func.coalesce(end_date, current_time),
    )
Contributor
IMO, this filter is a bottleneck. Wrapping the columns in coalesce makes the predicate non-SARGable and may cause a full table scan.
Maybe we should replace coalesce with boolean logic (or_ and IS NULL checks).
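One way to read this suggestion is sketched below with the standard-library `sqlite3` module instead of SQLAlchemy. The simplified `dag_run` table, the sample rows, and the parameter names `start`, `now`, and `upper` are assumptions for illustration only. The sketch checks that a predicate built from OR and IS NULL selects the same rows as the coalesce form. In SQLAlchemy the same shape could presumably be written with `or_` and `.is_(None)`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dag_run (id INTEGER PRIMARY KEY, start_date TEXT, end_date TEXT)")
conn.executemany(
    "INSERT INTO dag_run (id, start_date, end_date) VALUES (?, ?, ?)",
    [
        (1, "2026-01-01", "2026-01-02"),  # starts before the window
        (2, "2026-02-01", "2026-02-02"),  # fully inside the window
        (3, "2026-02-10", None),          # still running (no end_date)
        (4, None, None),                  # not started yet
        (5, "2026-02-05", "2026-03-01"),  # ends after the window
    ],
)

now = "2026-02-19"
end = None  # the caller supplied no end_date
# The right-hand coalesce only involves parameters, so it can be resolved in Python.
params = {"start": "2026-01-15", "now": now, "upper": end if end is not None else now}

coalesce_sql = """
    SELECT id FROM dag_run
    WHERE COALESCE(start_date, :now) >= :start
      AND COALESCE(end_date, :now) <= :upper
    ORDER BY id
"""
# Same logic with the bare columns exposed to the planner.
rewritten_sql = """
    SELECT id FROM dag_run
    WHERE (start_date >= :start OR (start_date IS NULL AND :now >= :start))
      AND (end_date <= :upper OR (end_date IS NULL AND :now <= :upper))
    ORDER BY id
"""
coalesce_ids = [row[0] for row in conn.execute(coalesce_sql, params)]
rewritten_ids = [row[0] for row in conn.execute(rewritten_sql, params)]
```

The sketch only demonstrates that the two predicates are logically equivalent. Whether the rewritten form actually uses an index depends on the database and its planner, since OR conditions can also limit index usage, so it would need to be confirmed with EXPLAIN on a realistic dataset.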
Two critical optimizations to reduce database queries and help the query optimizer:
1. Combine DagRun queries from 2 to 1
2. Add explicit TaskInstance.dag_id filter before JOIN
These optimizations specifically address the slow retrieval of historical metrics on large installations where TaskInstance counts are in the millions.
Updated test assertion from 4 to 3 queries to reflect the optimization.
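The explicit dag_id filter can be sketched with the standard-library `sqlite3` module. The two simplified tables, the sample rows, and the `permitted` list are hypothetical stand-ins, not Airflow's real schema. The sketch only shows that filtering the task instance side directly returns the same counts as filtering through the join alone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dag_run (dag_id TEXT, run_id TEXT, PRIMARY KEY (dag_id, run_id));
CREATE TABLE task_instance (dag_id TEXT, run_id TEXT, task_id TEXT, state TEXT);
CREATE INDEX ti_dag_run ON task_instance (dag_id, run_id);
""")
conn.executemany(
    "INSERT INTO dag_run VALUES (?, ?)",
    [("dag_a", "r1"), ("dag_b", "r1"), ("dag_c", "r1")],
)
conn.executemany(
    "INSERT INTO task_instance VALUES (?, ?, ?, ?)",
    [
        ("dag_a", "r1", "t1", "success"),
        ("dag_a", "r1", "t2", "failed"),
        ("dag_b", "r1", "t1", "success"),
        ("dag_c", "r1", "t1", "success"),  # not in the permitted set
    ],
)

permitted = ["dag_a", "dag_b"]  # hypothetical permitted_dag_ids
marks = ",".join("?" * len(permitted))

# Before: dag_id is restricted only on the dag_run side of the join.
join_only_sql = f"""
    SELECT ti.state, COUNT(*) FROM task_instance ti
    JOIN dag_run dr ON ti.dag_id = dr.dag_id AND ti.run_id = dr.run_id
    WHERE dr.dag_id IN ({marks})
    GROUP BY ti.state ORDER BY ti.state
"""
# After: the same restriction is also stated directly on task_instance.
explicit_sql = f"""
    SELECT ti.state, COUNT(*) FROM task_instance ti
    JOIN dag_run dr ON ti.dag_id = dr.dag_id AND ti.run_id = dr.run_id
    WHERE dr.dag_id IN ({marks}) AND ti.dag_id IN ({marks})
    GROUP BY ti.state ORDER BY ti.state
"""
join_only = conn.execute(join_only_sql, permitted).fetchall()
explicit = conn.execute(explicit_sql, permitted + permitted).fetchall()
```

The extra predicate is redundant for correctness because the join already equates the two dag_id columns. Its purpose is to give the planner a direct, indexable condition on the larger table, and the actual gain would have to be measured on the target database.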