Skip to content

[To dev/1.3] Fix kill query doesn't take effect bug#17358

Merged
JackieTien97 merged 5 commits intodev/1.3from
ty/kill
Mar 26, 2026
Merged

[To dev/1.3] Fix kill query doesn't take effect bug#17358
JackieTien97 merged 5 commits intodev/1.3from
ty/kill

Conversation

@JackieTien97
Copy link
Contributor

This pull request introduces improvements to query execution management and resource cleanup in the IoTDB data node, with a focus on more accurate tracking of query execution time, robust handling of killed or missing queries, and improved resource cleanup logic. The most significant changes include the addition of per-RPC execution time tracking, better error responses for killed queries, and refactoring of resource cleanup mechanisms.

Query Execution Time Tracking and Management

  • Added updateCurrentRpcStartTime(long startTime) method to the IQueryExecution interface and its implementations (QueryExecution, ConfigExecution) to enable accurate measurement of per-RPC execution time and total query execution time, including in-progress RPCs. Adjusted recordExecutionTime and getTotalExecutionTime accordingly. [1] [2] [3] [4]
  • Updated test stubs to implement the new updateCurrentRpcStartTime method.

Query Resource Cleanup and Error Handling

  • Improved resource cleanup logic by removing the now-unnecessary stopAndCleanup() method and ensuring that cleanup is triggered only when appropriate, such as when a client connection is inactive. Added a method to clear the coordinator context map if needed. [1] [2] [3] [4] [5]
  • Modified Coordinator.cleanupQueryExecution to remove the query execution from the map at the start of the cleanup process, ensuring proper resource release.

Client RPC Service Improvements

  • Enhanced error handling in ClientRPCServiceImpl for the fetchResults and fetchResultsV2 methods: now returns a specific error status and message when a query is not found (e.g., killed, timed out), instead of a generic response. [1] [2] [3] [4]

Code Quality and Testability

  • Added @TestOnly annotations to certain constructors in MPPQueryContext to clarify their intended usage and refactored constructor chaining for better maintainability. [1] [2]

Miscellaneous

  • Minor improvements and clarifications in comments and code organization related to result handle cleanup and metrics. [1] [2] [3]

These changes collectively improve the reliability, observability, and maintainability of query execution and resource management in the IoTDB data node.

@codecov
Copy link

codecov bot commented Mar 25, 2026

Codecov Report

❌ Patch coverage is 9.33333% with 68 lines in your changes missing coverage. Please review.
✅ Project coverage is 41.37%. Comparing base (0b67cb8) to head (e125abc).
⚠️ Report is 4 commits behind head on dev/1.3.

Files with missing lines Patch % Lines
.../db/queryengine/plan/execution/QueryExecution.java 0.00% 23 Missing ⚠️
...lan/analyze/schema/ClusterSchemaFetchExecutor.java 0.00% 15 Missing ⚠️
.../apache/iotdb/db/queryengine/plan/Coordinator.java 20.00% 12 Missing ⚠️
.../db/protocol/thrift/impl/ClientRPCServiceImpl.java 0.00% 11 Missing ⚠️
...yengine/plan/execution/config/ConfigExecution.java 12.50% 7 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             dev/1.3   #17358   +/-   ##
==========================================
  Coverage      41.36%   41.37%           
  Complexity       227      227           
==========================================
  Files           3594     3594           
  Lines         236637   236603   -34     
  Branches       28641    28631   -10     
==========================================
- Hits           97891    97887    -4     
+ Misses        138746   138716   -30     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR improves DataNode query lifecycle management to ensure killed/missing queries are handled deterministically, resource cleanup is more robust, and query execution time metrics better reflect per-RPC behavior (especially around fetchResults).

Changes:

  • Extend IQueryExecution with per-RPC timing (updateCurrentRpcStartTime) and update QueryExecution/ConfigExecution implementations accordingly.
  • Refactor query cleanup: remove stopAndCleanup() overload, adjust Coordinator.cleanupQueryExecution() to remove executions from the map earlier, and add stale-query cleanup driven from FragmentInstanceManager.
  • Improve fetchResults/fetchResultsV2 behavior when a query execution is missing (e.g., killed/cleaned up), returning an explicit error status instead of a “successful but empty” response.

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 7 comments.

Show a summary per file
File Description
iotdb-core/datanode/src/test/java/org/apache/iotdb/db/queryengine/execution/operator/MergeSortOperatorTest.java Updates test stub to satisfy new IQueryExecution interface and removes removed API usage.
iotdb-core/datanode/src/main/java/org/apache/iotdb/db/queryengine/plan/execution/config/ConfigExecution.java Adds per-RPC timing tracking fields/methods for config queries.
iotdb-core/datanode/src/main/java/org/apache/iotdb/db/queryengine/plan/execution/QueryExecution.java Adds per-RPC timing and adjusts cancellation/cleanup interactions with Coordinator.
iotdb-core/datanode/src/main/java/org/apache/iotdb/db/queryengine/plan/execution/IQueryExecution.java Removes stopAndCleanup() overload and adds updateCurrentRpcStartTime.
iotdb-core/datanode/src/main/java/org/apache/iotdb/db/queryengine/plan/Coordinator.java Changes cleanup ordering (remove from map first) and adds stale-query cleanup.
iotdb-core/datanode/src/main/java/org/apache/iotdb/db/queryengine/execution/fragment/FragmentInstanceManager.java Invokes coordinator stale-query cleanup on the existing timeout-cancel path.
iotdb-core/datanode/src/main/java/org/apache/iotdb/db/protocol/thrift/impl/ClientRPCServiceImpl.java Improves missing-query handling in fetch RPCs and wires per-RPC start time updates.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@sonarqubecloud
Copy link

@JackieTien97 JackieTien97 merged commit 4903f3c into dev/1.3 Mar 26, 2026
17 of 20 checks passed
@JackieTien97 JackieTien97 deleted the ty/kill branch March 26, 2026 12:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants