branch-4.1: [refactor](cloud) Replace Tablet references with tabletId in CloudTabletRebalancer (#61233) #63844

Merged: yiguolei merged 2 commits into apache:branch-4.1 from mymeiyi:branch-4.1-pick-61233 on May 29, 2026

@mymeiyi (Contributor) commented May 28, 2026

Cherry-pick of #61233 into branch-4.1.

…letRebalancer (apache#61233)

- Replace all `Set<Tablet>` data structures with `Set<Long>` (tabletId)
in `CloudTabletRebalancer`
- When tablet metadata (tableId, partitionId, indexId, dbId) is needed,
look it up via `TabletInvertedIndex.getTabletMeta(tabletId)` instead of
holding Tablet object references
- When the actual `CloudReplica` is needed (e.g., in
`updateClusterToBeMap`, `migrateTablets`), navigate through the catalog
hierarchy: db → table → partition → index → tablet
- Update `InfightTask.pickedTablet` and `WarmupTabletTask.pickedTablet`
fields from `Tablet` to `long pickedTabletId`
- Update `isConflict()` and conflict-check helpers to use `TabletMeta`
instead of `CloudReplica`
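
As a rough illustration of the first two bullets, the sketch below keeps only tablet IDs per backend and resolves metadata on demand. `MetaSketch`, `IndexSketch`, and `RebalancerSketch` are hypothetical stand-ins written for this example; they are not the Doris `TabletMeta`, `TabletInvertedIndex`, or `CloudTabletRebalancer` classes, and only the `getTabletMeta(tabletId)` method name is taken from the description above.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for tablet metadata; only the four IDs named in the PR text.
final class MetaSketch {
    final long dbId;
    final long tableId;
    final long partitionId;
    final long indexId;

    MetaSketch(long dbId, long tableId, long partitionId, long indexId) {
        this.dbId = dbId;
        this.tableId = tableId;
        this.partitionId = partitionId;
        this.indexId = indexId;
    }
}

// Hypothetical stand-in for an inverted index: tabletId -> metadata.
final class IndexSketch {
    private final Map<Long, MetaSketch> metas = new ConcurrentHashMap<>();

    void put(long tabletId, MetaSketch meta) {
        metas.put(tabletId, meta);
    }

    void drop(long tabletId) {
        metas.remove(tabletId);
    }

    // Returns null when the tablet is unknown or has been dropped.
    MetaSketch getTabletMeta(long tabletId) {
        return metas.get(tabletId);
    }
}

// Hypothetical rebalancer state that stores IDs instead of tablet objects.
final class RebalancerSketch {
    // backendId -> IDs of the tablets currently assigned to that backend
    private final Map<Long, Set<Long>> beToTabletIds = new ConcurrentHashMap<>();
    private final IndexSketch index;

    RebalancerSketch(IndexSketch index) {
        this.index = index;
    }

    void assign(long beId, long tabletId) {
        beToTabletIds.computeIfAbsent(beId, k -> ConcurrentHashMap.newKeySet()).add(tabletId);
    }

    // Counts the tablets of one partition on a backend, looking metadata up per ID.
    // IDs whose metadata is gone are skipped.
    long countInPartition(long beId, long partitionId) {
        long count = 0;
        for (long tabletId : beToTabletIds.getOrDefault(beId, Collections.emptySet())) {
            MetaSketch meta = index.getTabletMeta(tabletId);
            if (meta != null && meta.partitionId == partitionId) {
                count++;
            }
        }
        return count;
    }
}
```

Because the sets hold only `Long` values, a tablet removed from the index is simply skipped on the next lookup, and the rebalancer state itself keeps no reference to any tablet object.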

Storing `Tablet` objects in long-lived `ConcurrentHashMap` structures
creates strong references that:
1. Prevent GC of dropped/replaced tablets, inflating heap usage
2. Risk serving stale tablet state after schema changes or drops
3. Hold unnecessarily large object graphs when only the tablet ID is
needed for set membership and counting
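
Point 2 can be illustrated with a minimal sketch, assuming a simplified catalog modeled as nested maps. `NodeSketch` and `CatalogSketch` are hypothetical names invented for this example, and the `String` payload merely stands in for a replica object. Resolving the db → table → partition → index → tablet path on every access means a dropped partition yields an empty result instead of a stale object.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical catalog node; each level of the hierarchy uses the same shape.
final class NodeSketch {
    final Map<Long, NodeSketch> children = new ConcurrentHashMap<>();
    volatile String replica; // only set on tablet-level nodes

    NodeSketch child(long id) {
        return children.computeIfAbsent(id, k -> new NodeSketch());
    }
}

// Hypothetical catalog: db -> table -> partition -> index -> tablet.
final class CatalogSketch {
    private final NodeSketch root = new NodeSketch();

    void addTablet(long dbId, long tableId, long partitionId, long indexId,
                   long tabletId, String replica) {
        root.child(dbId).child(tableId).child(partitionId)
            .child(indexId).child(tabletId).replica = replica;
    }

    void dropPartition(long dbId, long tableId, long partitionId) {
        NodeSketch db = root.children.get(dbId);
        NodeSketch table = (db == null) ? null : db.children.get(tableId);
        if (table != null) {
            table.children.remove(partitionId);
        }
    }

    // Walks the given ID path from the root; empty if any level is missing.
    Optional<String> findReplica(long... path) {
        NodeSketch node = root;
        for (long id : path) {
            node = node.children.get(id);
            if (node == null) {
                return Optional.empty();
            }
        }
        return Optional.ofNullable(node.replica);
    }
}
```

A caller that cached the replica object before `dropPartition` would still see it afterwards; a caller that holds only the IDs and calls `findReplica` each time does not.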

- [x] Updated `CloudTabletRebalancerTest` to use `Set<Long>` and `Long`
instead of mocked `Tablet` objects
- [ ] Verify FE compiles successfully
- [ ] Run cloud tablet rebalancer regression tests

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

Copilot AI review requested due to automatic review settings May 28, 2026 11:27

@mymeiyi requested a review from yiguolei as a code owner May 28, 2026 11:27

@hello-stephen (Contributor) commented

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@mymeiyi (Contributor, Author) commented May 28, 2026

run buildall

Copilot AI (Contributor) left a comment

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

@mymeiyi (Contributor, Author) commented May 28, 2026

run buildall

@hello-stephen (Contributor) commented

FE Regression Coverage Report

Increment line coverage 20.92% (41/196) 🎉
Increment coverage report
Complete coverage report

@mymeiyi (Contributor, Author) commented May 29, 2026

run nonConcurrent

@hello-stephen (Contributor) commented

FE Regression Coverage Report

Increment line coverage 6.02% (41/681) 🎉
Increment coverage report
Complete coverage report

@yiguolei merged commit 2a018ed into apache:branch-4.1 on May 29, 2026
28 of 31 checks passed