[refactor](fragment mgr) move report logic to pipeline fragment context to remove callback parameter from ctor#62500
Conversation
…xt to remove callback parameter from ctor
|
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
|
|
/review |
There was a problem hiding this comment.
No blocking findings after reviewing the full reporting flow refactor from FragmentMgr into PipelineFragmentContext.
Critical checkpoints:
- Goal / correctness: The goal is to move FE report submission logic into
PipelineFragmentContextso callers no longer inject a report callback. The patch accomplishes that, and the end-to-end behavior remains equivalent:send_report()still submits async work onFragmentMgr's thread pool, attaches the query mem tracker in the callback, reports to FE, refreshes periodic-report state only after a non-final report completes, and preserves recursive-CTE final-close behavior. - Scope / clarity: The change is focused and mostly mechanical. The constructor surface is reduced and the reporting helpers now live with the state they operate on, which improves locality without broad behavioral churn.
- Concurrency: This path is concurrent, but the refactor preserves the existing model. Periodic reporting is still serialized by
_disable_period_reportacquire/release CAS, async callbacks still run on the fragment manager thread pool, and no new lock nesting or heavy work under_task_mutexwas introduced. The existing lifetime protection also remains intact because the submitted closure captures a sharedctx. - Lifecycle management: No new cross-TU static initialization hazards were introduced. Object lifetime remains safe for the moved callback because the async task keeps the
PipelineFragmentContextalive until reporting completes. - Config changes: None.
- Compatibility / storage format: None.
- Parallel code paths: The two fragment creation paths in
FragmentMgr(exec_plan_fragmentandrerun_fragment) were updated consistently. - Special conditions / invariants: The existing report gating rules for success, cancel, periodic reporting, and external queries are preserved.
- Tests: No new tests were added. For this refactor that is tolerable, but the reporting / rerun lifecycle remains an area where regression coverage would still be valuable.
- Test result files: Not applicable.
- Observability: Existing logging and debug hooks were preserved.
- Transaction / persistence: Not involved.
- Data writes / atomicity: Not involved beyond status reporting; no new atomicity risk found.
- FE/BE variable propagation: Not involved.
- Performance: Neutral to slightly positive. The patch removes an indirection through an injected callback and keeps the async execution model unchanged.
- Other issues: None identified that rise to blocking severity.
Overall opinion: This is a reasonable, focused refactor and I did not find a correctness or concurrency issue that should block the PR.
|
run buildall |
|
run buildall |
|
run buildall |
BE Regression && UT Coverage ReportIncrement line coverage Increment coverage report
|
|
skip check_coverage |
|
/review |
There was a problem hiding this comment.
No blocking issues found.
Critical checkpoint conclusions:
- Goal / correctness: The PR moves fragment status reporting from
FragmentMgrintoPipelineFragmentContextand removes the callback plumbing from the constructor. The refactor preserves the report payload construction, retry/cancel behavior, and thesend_report()trigger points. I did not find a behavioral regression in the normal reporting path or in the recursive CTEWAIT_FOR_DESTROY/FINAL_CLOSEflow. - Scope / clarity: The change is small, focused, and mostly mechanical around the reporting path and constructor signature cleanup.
- Concurrency: Reporting still runs on
FragmentMgr's thread pool. The submitted lambda attaches the query mem tracker at callback entry, and it captures a strongctxbefore usingthis, so thePipelineFragmentContextstays alive through_coordinator_callback()andrefresh_next_report_time(). I did not find a new lock-order or deadlock issue. - Lifecycle / static init: No new cross-translation-unit static initialization risk. The added static strings are function-local statics.
- Config / compatibility: No new config, protocol, storage-format, or rolling-upgrade compatibility issue.
- Parallel paths / conditions: The existing special cases (
coord_addr.hostname == "external", cancel-on-RPC-failure, load vs non-load payload selection, periodic-report gating) are preserved in the moved code. - Test coverage: The updated tests only cover the constructor signature change. There is still no focused coverage for the moved async reporting path or for recursive CTE final-close reporting, so that is the main residual risk.
- Test outputs: No result files changed.
- Observability: Existing logs and identifiers remain sufficient for this refactor; no new observability gap was introduced.
- Transaction / persistence / data writes / FE-BE variable passing: Not applicable for this refactor.
- Performance: Net-neutral. The same thread-pool submission model is preserved without introducing extra locking or an obvious hot-path regression.
Residual risk:
- A dedicated unit or integration test around async report submission, including delayed or failed FE RPC, would make this refactor safer long-term, especially for recursive CTE close/rebuild paths.
|
PR approved by at least one committer and no changes requested. |
|
PR approved by anyone and no changes requested. |
…xt to remove callback parameter from ctor (apache#62500)
What problem does this PR solve?
Issue Number: close #xxx
Related PR: #xxx
Problem Summary:
Release note
None
Check List (For Author)
Test
Behavior changed:
Does this need documentation?
Check List (For Reviewer who merge this PR)