[Improvement](cte) Reduce cases where fragments in recursive CTEs are not released in a timely manner. #60313

BiteTheDDDDt · 2026-01-28T10:20:25Z

What problem does this PR solve?

The query_ctx might not be found in rerun_fragment, which could leave some fragments without a prompt notification to release.
Set _need_notify_close to false on cancel_query, so that the fragment does not keep waiting in wait_close.
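
One possible reading of the second point, as a minimal sketch rather than the actual Doris code: a fragment that normally blocks until it is told to close should stop blocking once the query is cancelled, because the notification may never arrive. The class `NotifiableFragmentSketch` and its methods are hypothetical; only the flag name `_need_notify_close` comes from the description above.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical stand-in for a fragment context; not the Doris class.
class NotifiableFragmentSketch {
public:
    // Normal path: the recursive CTE driver tells the fragment it may close.
    void notify_close() {
        std::lock_guard<std::mutex> lock(_mutex);
        _close_notified = true;
        _cv.notify_all();
    }

    // Cancel path: no notification is expected anymore, so clear the flag
    // and wake any waiter.
    void cancel() {
        std::lock_guard<std::mutex> lock(_mutex);
        _need_notify_close = false;
        _cv.notify_all();
    }

    // Returns true if the fragment may close, false if the timeout elapsed
    // while a notification was still required.
    bool wait_close(std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lock(_mutex);
        return _cv.wait_for(lock, timeout, [this] {
            return !_need_notify_close || _close_notified;
        });
    }

private:
    std::mutex _mutex;
    std::condition_variable _cv;
    bool _need_notify_close = true;
    bool _close_notified = false;
};
```

In this sketch a cancelled fragment returns from `wait_close` immediately, while an uncancelled one still waits for `notify_close`.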

Check List (For Author)

Test
- Regression test
- Unit Test
- Manual test (add detailed scripts or steps below)
- No need to test or manual test. Explain why:
  - This is a refactor/code format and no logic has been changed.
  - Previous test can cover this change.
  - No code files have been changed.
  - Other reason
Behavior changed:
- No.
- Yes.
Does this need documentation?
- No.
- Yes.

Check List (For Reviewer who merge this PR)

Confirm the release note
Confirm test cases
Confirm document
Add branch pick label

Thearas · 2026-01-28T10:20:33Z

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

What problem was fixed (it's best to include specific error reporting information). How it was fixed.
Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
What features were added. Why was this function added?
Which code was refactored and why was this part of the code refactored?
Which functions were optimized and what is the difference before and after the optimization?

BiteTheDDDDt · 2026-01-28T10:20:43Z

run buildall

Copilot

Pull request overview

This PR improves recursive CTE fragment rerun/release behavior by avoiding reliance on QueryContext lookup in FragmentMgr::rerun_fragment, reducing cases where fragments aren’t notified/released promptly.

Changes:

Update FragmentMgr::rerun_fragment to operate purely via the pipeline fragment context map rather than get_query_ctx().
Add SCOPED_ATTACH_TASK(_query_ctx.get()) inside several PipelineFragmentContext rerun-related methods to ensure proper task context attachment.
Adjust recursive CTE RPC rerun loop to continue across fragments and return a consolidated status.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

File	Description
be/src/runtime/fragment_mgr.cpp	Removes dependency on `QueryContext` lookup for rerun operations, using `_pipeline_map` directly.
be/src/pipeline/pipeline_fragment_context.cpp	Attaches task context within wait/release/rebuild rerun operations to support the new caller behavior.
be/src/pipeline/exec/rec_cte_source_operator.h	Changes rerun-fragment RPC error handling to continue across targets and return a final status.
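
The first change in the overview can be illustrated with a small, self-contained sketch. It assumes a map keyed by a plain integer fragment id and a simplified status type; `FragmentMgrSketch`, `RerunnableFragmentSketch`, `DispatchStatus`, and `RerunStage` are hypothetical names, and the behavior when the fragment is missing from the map is an assumption, not something the PR states.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-ins; the real code uses PRerunFragmentParams and Status.
enum class RerunStage { WaitClose, SetToRerun, Rebuild, Submit };

struct DispatchStatus {
    bool ok;
    std::string msg;
};

struct RerunnableFragmentSketch {
    std::string last_op;
    DispatchStatus wait_close() { last_op = "wait_close"; return {true, ""}; }
    DispatchStatus set_to_rerun() { last_op = "set_to_rerun"; return {true, ""}; }
    DispatchStatus rebuild() { last_op = "rebuild"; return {true, ""}; }
    DispatchStatus submit() { last_op = "submit"; return {true, ""}; }
};

class FragmentMgrSketch {
public:
    void add(int fragment_id, std::shared_ptr<RerunnableFragmentSketch> ctx) {
        _pipeline_map[fragment_id] = std::move(ctx);
    }

    // Looks the fragment up directly in the map; no query context lookup.
    DispatchStatus rerun_fragment(int fragment_id, RerunStage stage) {
        auto it = _pipeline_map.find(fragment_id);
        if (it == _pipeline_map.end()) {
            // Assumption: a missing fragment is reported as an error.
            return {false, "fragment " + std::to_string(fragment_id) + " not found"};
        }
        RerunnableFragmentSketch& ctx = *it->second;
        switch (stage) {
        case RerunStage::WaitClose:  return ctx.wait_close();
        case RerunStage::SetToRerun: return ctx.set_to_rerun();
        case RerunStage::Rebuild:    return ctx.rebuild();
        case RerunStage::Submit:     return ctx.submit();
        }
        // Reached only for an out-of-range stage value.
        return {false, "unknown rerun stage"};
    }

private:
    std::map<int, std::shared_ptr<RerunnableFragmentSketch>> _pipeline_map;
};
```

Every known stage returns from its own branch, so the trailing return only handles an unknown opcode.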

Copilot · 2026-01-28T10:26:00Z

be/src/runtime/fragment_mgr.cpp

+        return fragment_ctx->set_to_rerun();
+    } else if (stage == PRerunFragmentParams::rebuild) {
+        return fragment_ctx->rebuild(_thread_pool.get());
+    } else if (stage == PRerunFragmentParams::submit) {


rerun_fragment() no longer attaches a task context, and the submit stage now calls fragment_ctx->submit() without any SCOPED_ATTACH_TASK. Other stages (wait_close/set_to_rerun/rebuild) now attach inside the callee, but submit() does not, making thread context / signal task id / mem tracking inconsistent for the submit path. Consider attaching in this caller for the submit branch via SCOPED_ATTACH_TASK(fragment_ctx->get_query_ctx()) (or equivalent) before calling submit().

Suggested change

-    } else if (stage == PRerunFragmentParams::submit) {
+    } else if (stage == PRerunFragmentParams::submit) {
+        SCOPED_ATTACH_TASK(fragment_ctx->get_query_ctx());
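
To make the concern concrete, here is a minimal sketch of the general RAII pattern that a scoped-attach guard follows: the constructor binds the current thread to a task and the destructor restores the previous binding. Whether `SCOPED_ATTACH_TASK` is implemented this way in Doris is an assumption; `ScopedAttachSketch`, `QueryCtxSketch`, `g_current_task`, and `submit_sketch` are hypothetical names used only for illustration.

```cpp
#include <cassert>

// Hypothetical query context carrying only an id.
struct QueryCtxSketch {
    int id;
};

// Thread-local "current task", standing in for a real thread context.
thread_local const QueryCtxSketch* g_current_task = nullptr;

// RAII guard: attaches on construction, restores the previous task on exit.
class ScopedAttachSketch {
public:
    explicit ScopedAttachSketch(const QueryCtxSketch* ctx) : _previous(g_current_task) {
        g_current_task = ctx;
    }
    ~ScopedAttachSketch() { g_current_task = _previous; }

    ScopedAttachSketch(const ScopedAttachSketch&) = delete;
    ScopedAttachSketch& operator=(const ScopedAttachSketch&) = delete;

private:
    const QueryCtxSketch* _previous;
};

// Id of the task the calling thread is attached to, or -1 if none.
int current_task_id() {
    return g_current_task != nullptr ? g_current_task->id : -1;
}

// Hypothetical submit path: reports which task it observed while running.
int submit_sketch(const QueryCtxSketch& ctx, bool attach) {
    if (attach) {
        ScopedAttachSketch guard(&ctx);
        return current_task_id();
    }
    return current_task_id();
}
```

Without the guard the submit path in this sketch observes no task at all, which mirrors the inconsistency the comment describes for thread context and memory tracking.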

Copilot · 2026-01-28T10:26:01Z

be/src/runtime/fragment_mgr.cpp

-                print_id(query_id));
+        return Status::InvalidArgument("Unknown rerun fragment opcode: {}", stage);
    }
    return Status::OK();


This return Status::OK(); is unreachable because every branch above returns. Removing it would avoid dead code and make control flow clearer.

Copilot · 2026-01-28T10:26:01Z

be/src/pipeline/exec/rec_cte_source_operator.h

+                st = Status::InternalError(controller.ErrorText());
            }

            auto rpc_st = Status::create(result.status());
            if (!rpc_st.ok()) {


_send_rerun_fragments() now continues on RPC failures but overwrites st on each error, so the final returned status depends on the last failing fragment and may lose the original/root-cause error (and which fragment failed). Also, when controller.Failed() is true, result.status() is not meaningful; consider skipping the Status::create(result.status()) path for that iteration (e.g., continue) and preserving the first failure (or aggregating errors with fragment id/address) instead of overwriting.

Suggested change

-                st = Status::InternalError(controller.ErrorText());
-            }
-            auto rpc_st = Status::create(result.status());
-            if (!rpc_st.ok()) {
+                if (st.ok()) {
+                    st = Status::InternalError(controller.ErrorText());
+                }
+                // When the RPC controller reports failure, the result status is not reliable.
+                continue;
+            }
+            auto rpc_st = Status::create(result.status());
+            if (!rpc_st.ok() && st.ok()) {
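
The pattern behind this suggestion, continuing across all targets while keeping only the first failure, can be sketched in isolation. This is not the code of `_send_rerun_fragments()`; `RpcResultSketch`, `RpcStatusSketch`, and `send_all_sketch` are hypothetical, and including the target index in the message is one way to address the "which fragment failed" point.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical outcome of one rerun RPC.
struct RpcResultSketch {
    bool controller_failed;        // transport-level failure
    std::string controller_error;  // meaningful only if controller_failed
    bool status_ok;                // remote status, meaningful only otherwise
    std::string status_error;
};

struct RpcStatusSketch {
    bool ok;
    std::string msg;
};

// Visits every target, but reports the first failure instead of the last.
RpcStatusSketch send_all_sketch(const std::vector<RpcResultSketch>& results) {
    RpcStatusSketch st{true, ""};
    for (std::size_t i = 0; i < results.size(); ++i) {
        const RpcResultSketch& r = results[i];
        if (r.controller_failed) {
            if (st.ok) {
                st = RpcStatusSketch{false, "target " + std::to_string(i) + ": " + r.controller_error};
            }
            // The remote status is not reliable when the transport failed.
            continue;
        }
        if (!r.status_ok && st.ok) {
            st = RpcStatusSketch{false, "target " + std::to_string(i) + ": " + r.status_error};
        }
    }
    return st;
}
```

Aggregating all errors instead of keeping the first is an equally reasonable design; the sketch picks the simpler of the two options mentioned in the comment.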

BiteTheDDDDt · 2026-01-28T11:13:37Z

run buildall

hello-stephen · 2026-01-29T03:05:03Z

BE Regression && UT Coverage Report

Increment line coverage 75.00% (21/28) 🎉

Increment coverage report
Complete coverage report

Category	Coverage
Function Coverage	71.45% (25583/35804)
Line Coverage	54.08% (267372/494424)
Region Coverage	51.71% (222682/430664)
Branch Coverage	53.11% (95635/180057)

hello-stephen · 2026-01-29T03:21:52Z

BE Regression && UT Coverage Report

Increment line coverage 75.00% (21/28) 🎉

Increment coverage report
Complete coverage report

Category	Coverage
Function Coverage	71.45% (25583/35804)
Line Coverage	54.08% (267377/494424)
Region Coverage	51.72% (222730/430664)
Branch Coverage	53.12% (95652/180057)

BiteTheDDDDt · 2026-01-30T09:41:51Z

run buildall

Copilot AI review requested due to automatic review settings January 28, 2026 10:20

Copilot started reviewing on behalf of BiteTheDDDDt January 28, 2026 10:21

Copilot AI reviewed Jan 28, 2026

BiteTheDDDDt added the rec_cte label Jan 29, 2026

Commit 81583c0: Reduce cases where fragments in recursive CTEs are not released in a timely manner. update update

BiteTheDDDDt force-pushed the dev_0128 branch from 8f08d33 to 81583c0 January 30, 2026 09:40
