branch-4.1: [improvement](executor) unify current query runtime statistics and expose task progress (#60567) #63130
Merged
…pose task progress (apache#60567)

**PR Summary**

- This PR unifies current-query runtime statistics onto the `BE -> FE` reporting pipeline, replacing the previous ad-hoc `RuntimeProfile` traversal path, and enriches `current_queries` with task-level progress plus broader resource metrics.
- The goal is to make current-query visibility more real-time and consistent with audit statistics while simplifying and consolidating FE proc/REST surfaces.

**What It Solves**

- Unifies statistics source: `QeProcessorImpl` now reads aggregated `TQueryStatistics` from `WorkloadRuntimeStatusMgr` instead of relying on the legacy `CurrentQueryInfoProvider` path.
- Improves progress observability: introduces `process_rows`, `total_tasks_num`, and `finished_tasks_num`, and exposes computed `Progress`.
- Expands runtime metrics coverage: `current_queries` now includes richer scan/cpu/memory/shuffle/spill/cache counters.
- Consolidates query views: `/current_queries` and `/current_query_stmts` now share the same statistics view; legacy per-query/per-fragment proc drill-down implementation is removed.

**Implementation Details**

- Protocol layer:
  - Extends `TQueryStatistics` with `process_rows`, `finished_tasks_num`, and `total_tasks_num`.
- BE collection/reporting:
  - Accumulates `process_rows` in the execution path.
  - Records `total_tasks_num` at pipeline task graph initialization and increments `finished_tasks_num` in real time when tasks close.
  - Mirrors task-progress counters into `QueryTaskController` so counters remain available even after `QueryContext` teardown.
  - Exports new fields in `ResourceContext::to_thrift_query_statistics`.
- FE aggregation/retention:
  - `WorkloadRuntimeStatusMgr` merges additional fields (including task progress) and refines timeout cleanup: remove query stats only when they are timed out and the query no longer exists in FE.
  - `QueryStatisticsItem` now carries `TQueryStatistics` as the unified data carrier for proc/REST.
- Presentation layer:
  - `CurrentQueryStatisticsProcDir` adds expanded columns and computes `Progress`.
  - `/rest/v2/manager/query/current_queries` in `QueryProfileAction` now serves the same unified stats view.
- Removes legacy classes: `CurrentQueryInfoProvider`, `CurrentQuerySqlProcDir`, `CurrentQueryFragmentProcNode`, and `CurrentQueryStatementsProcNode`.

```
*************************** 1. row ***************************
QueryId: e00b00b1155d4042-98862b60016a768a
ConnectionId: 394
Catalog: internal
Database: wzhtest
User: root
ExecTime: 20717
SqlHash: cf263b08302d8be436c97dd5e6f0d283
Statement: INSERT INTO test_query_progress_tb SELECT DISTINCT k, CONCAT(v, CAST(k AS STRING)) FROM test_query_progress_tb WHERE k % 2 = 0
ScanRows: 45400000 Rows
ScanBytes: 2.70 GB
ProcessRows: 75598123 Rows
CpuMs: 178336
MaxPeakMemoryBytes: 13.03 GB
CurrentUsedMemoryBytes: 8.69 GB
WorkloadGroupId: 1777125330381
ShuffleSendBytes: 0.00
ShuffleSendRows: 0 Rows
ScanBytesFromLocalStorage: 31.48 MB
ScanBytesFromRemoteStorage: 0.00
SpillWriteBytesToLocalStorage: 0.00
SpillReadBytesFromLocalStorage: 0.00
BytesWriteIntoCache: 0.00
TotalTasks: 74
FinishedTasks: 51
Progress: 68%
------------------------
-- first--
QueryId: e2b8c99658a94743-9ebbf0d036d83295
ConnectionId: 9
Catalog: hive_test
Database: tpch100_parquet
User: root
ExecTime: 6093
SqlHash: f8a30e4182d72cce3eff6cb385005b1f
Statement: select ... from supplier, lineitem l1, orders, nation ... limit 100
ScanRows: 621466194 Rows
ScanBytes: 5.37 GB
ProcessRows: 79079742 Rows
CpuMs: 31655
MaxPeakMemoryBytes: 2.32 GB
CurrentUsedMemoryBytes: 2.18 GB
WorkloadGroupId: 1777253545394
ShuffleSendBytes: 0.00
ShuffleSendRows: 0 Rows
ScanBytesFromLocalStorage: 0.00
ScanBytesFromRemoteStorage: 5.37 GB
SpillWriteBytesToLocalStorage: 0.00
SpillReadBytesFromLocalStorage: 0.00
BytesWriteIntoCache: 0.00
TotalTasks: 138
FinishedTasks: 49
Progress: 35%
--second--
QueryId: e2b8c99658a94743-9ebbf0d036d83295
ConnectionId: 9
Catalog: hive_test
Database: tpch100_parquet
User: root
ExecTime: 10807
SqlHash: f8a30e4182d72cce3eff6cb385005b1f
Statement: select ... from supplier, lineitem l1, orders, nation ... limit 100
ScanRows: 1102562592 Rows
ScanBytes: 9.20 GB
ProcessRows: 112176670 Rows
CpuMs: 53808
MaxPeakMemoryBytes: 3.13 GB
CurrentUsedMemoryBytes: 2.50 GB
WorkloadGroupId: 1777253545394
ShuffleSendBytes: 0.00
ShuffleSendRows: 0 Rows
ScanBytesFromLocalStorage: 0.00
ScanBytesFromRemoteStorage: 9.20 GB
SpillWriteBytesToLocalStorage: 0.00
SpillReadBytesFromLocalStorage: 0.00
BytesWriteIntoCache: 0.00
TotalTasks: 138
FinishedTasks: 65
Progress: 47%
```

None

- Test <!-- At least one of them must be included. -->
    - [x] Regression test
    - [x] Unit Test
    - [x] Manual test (add detailed scripts or steps below)
    - [ ] No need to test or manual test. Explain why:
        - [ ] This is a refactor/code format and no logic has been changed.
        - [ ] Previous test can cover this change.
        - [ ] No code files have been changed.
        - [ ] Other reason <!-- Add your reason? -->

- Behavior changed:
    - [ ] No.
    - [x] Yes. <!-- Explain the behavior change -->

- Does this need documentation?
    - [ ] No.
    - [x] Yes. <!-- Add document PR link here. eg: apache/doris-website#1214 -->

- [ ] Confirm the release note
- [ ] Confirm test cases
- [ ] Confirm document
- [ ] Add branch pick label <!-- Add branch pick label that this PR should merge into -->

---------

Co-authored-by: yiguolei <guolei@selectdb.com>
Co-authored-by: xuchenhao <419062425@qq.com>
Co-authored-by: xuchenhao <48084123+xuchenhao@users.noreply.github.com>
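
To make two of the rules in the description concrete, here is a minimal Python sketch of how `Progress` could be derived from `finished_tasks_num` and `total_tasks_num`, and of the refined timeout cleanup condition. It is not the code in this PR (the real logic lives in Java and C++). The function names, parameters, and the zero-total fallback are hypothetical and chosen only for illustration. Truncating integer division is an assumption too, though it agrees with the three sample rows above (51/74, 49/138, 65/138).

```python
def progress_percent(finished_tasks: int, total_tasks: int) -> str:
    """Format task progress as a truncated whole-number percentage.

    Hypothetical helper: truncation (not rounding) is assumed because it
    matches the sample output in the PR description.
    """
    if total_tasks <= 0:
        # Assumption: no task count has been reported yet, so show 0%
        # instead of dividing by zero.
        return "0%"
    # Clamp so a late or duplicated report can never exceed 100%.
    finished = min(max(finished_tasks, 0), total_tasks)
    return f"{finished * 100 // total_tasks}%"


def should_remove_stats(now_ms: int, last_report_ms: int,
                        timeout_ms: int, query_exists_in_fe: bool) -> bool:
    """Return True only when BOTH cleanup conditions from the PR hold.

    Hypothetical helper: stats are dropped when they have timed out AND
    the query is no longer known to the FE.
    """
    timed_out = (now_ms - last_report_ms) > timeout_ms
    return timed_out and not query_exists_in_fe


if __name__ == "__main__":
    # Values taken from the sample rows in the PR description.
    for finished, total in [(51, 74), (49, 138), (65, 138)]:
        print(finished, total, progress_percent(finished, total))
```

With the sample values, `progress_percent(51, 74)` gives `68%`, where rounding would give `69%`. Requiring both conditions in `should_remove_stats` means a long-running query that reports infrequently keeps its statistics for as long as the FE still tracks it.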
Contributor (Author)
run buildall
Contributor
skip buildall
yiguolei
approved these changes
May 11, 2026
Contributor
PR approved by at least one committer and no changes requested.
Contributor
PR approved by anyone and no changes requested.
Cherry-pick from #60567