branch-4.1: [improvement](recycler) Avoid single-point read/write during sequentially reading key #62476#63123

Merged
yiguolei merged 1 commit into branch-4.1 from auto-pick-62476-branch-4.1 on May 11, 2026

@github-actions
Contributor

Cherry-picked from #62476

…ally reading key (#62476)

fix: #58459


This PR reduces point-read overhead in Cloud recycler rowset cleanup.
Previously, each scanned rowset could immediately trigger metadata
reads/writes to mark it as recycled or abort its related
transaction/job. The new flow records rowset keys during scanning, then
batch-processes recycled marks and deferred abort tasks in worker
batches. Deletion of prepare rowsets is also deferred, so the recycler
re-reads the latest metadata before deleting data.

This keeps the existing recycle safety semantics while reducing
per-rowset KV operations during large recycle scans.
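
To illustrate the difference between the two flows, here is a minimal sketch, not the recycler's actual code. `FakeMetaStore`, `make_store`, `mark_during_scan`, and `mark_after_scan` are hypothetical names, and the store is an in-memory stand-in that only counts transactions. The sketch assumes one transaction per batch and does not model worker threads or the deferred abort tasks.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical in-memory stand-in for the metadata KV store. It only counts
// transactions so the two flows can be compared; it is not the Doris API.
struct FakeMetaStore {
    std::map<std::string, std::string> rowsets;  // rowset key -> state
    std::size_t txn_count = 0;

    // One transaction that rewrites every listed key that still exists.
    void mark_recycled(const std::vector<std::string>& keys) {
        ++txn_count;
        for (const auto& key : keys) {
            auto it = rowsets.find(key);
            if (it != rowsets.end()) it->second = "recycled";
        }
    }
};

// Builds a store holding n rowsets in the "normal" state (illustrative data).
inline FakeMetaStore make_store(std::size_t n) {
    FakeMetaStore store;
    for (std::size_t i = 0; i < n; ++i) {
        store.rowsets["rowset_" + std::to_string(i)] = "normal";
    }
    return store;
}

// Previous flow as described: each scanned rowset triggers its own write.
// Only values are modified, so iterating the map while marking is safe.
inline std::size_t mark_during_scan(FakeMetaStore& store) {
    for (const auto& kv : store.rowsets) {
        store.mark_recycled(std::vector<std::string>{kv.first});
    }
    return store.txn_count;
}

// New flow (sketch): the scan only records keys; marking happens afterwards
// in batches of at most batch_size keys per transaction.
inline std::size_t mark_after_scan(FakeMetaStore& store, std::size_t batch_size) {
    assert(batch_size > 0);
    std::vector<std::string> recorded;
    for (const auto& kv : store.rowsets) recorded.push_back(kv.first);

    for (std::size_t begin = 0; begin < recorded.size(); begin += batch_size) {
        const std::size_t end = std::min(begin + batch_size, recorded.size());
        std::vector<std::string> batch(
                recorded.begin() + static_cast<std::ptrdiff_t>(begin),
                recorded.begin() + static_cast<std::ptrdiff_t>(end));
        store.mark_recycled(batch);
    }
    return store.txn_count;
}
```

With 10 rowsets and a batch size of 4, `mark_during_scan` issues 10 transactions in this toy model while `mark_after_scan` issues 3, and both leave every rowset marked as recycled.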

**Release mode test**
When recycling 10,000 rowsets with
`enable_mark_delete_rowset_before_recycle` and
`enable_abort_txn_and_job_for_delete_rowset_before_recycle` enabled,
processing time increased by approximately 10%.

**3514 ms -> 175 ms (mark) + 3811 ms (abort and recycle)**
github-actions bot requested a review from yiguolei as a code owner on May 11, 2026, 02:19
@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@hello-stephen
Contributor

run buildall

@hello-stephen
Contributor

Cloud UT Coverage Report

Increment line coverage 83.22% (238/286) 🎉

Increment coverage report
Complete coverage report

| Category | Coverage |
| --- | --- |
| Function Coverage | 78.11% (1848/2366) |
| Line Coverage | 64.83% (33216/51232) |
| Region Coverage | 65.29% (16432/25167) |
| Branch Coverage | 55.86% (8776/15710) |

@yiguolei yiguolei merged commit 043735e into branch-4.1 May 11, 2026
28 of 31 checks passed