[feature](cloud) Support segment id list #65190
Draft
mymeiyi wants to merge 2 commits into
Issue Number: None
Related PR: None
Problem Summary: Cloud rowsets historically assumed physical segment file names were continuous from 0 to num_segments - 1. Partial update append, cleanup, warmup, cache recycle, snapshot, binlog, reader, delete bitmap, collection statistics, and index compaction paths all depended on that implicit naming and could mix up a segment's position in the rowset with its real physical segment id.

This change adds rowset segment_ids and next_segment_id metadata behind enable_segment_list, preserves legacy continuous ids when the list is absent, and introduces shared segment id helpers and the RowsetSegmentMetaView/RowsetSegmentView abstractions. It migrates the affected BE and Cloud paths to use rowset position for position-indexed metadata and the real segment id for physical files, cache keys, delete bitmap keys, and RowLocation-facing code.

It also removes unsafe convenience APIs that encoded the assumption that position equals segment id, clarifies cooldown upload naming and the merged next segment id calculation, and documents RowsetSegmentView as a non-owning view that must not cross async boundaries without an owning RowsetSharedPtr or copied values.
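The distinction between rowset position and physical segment id described above can be illustrated with a minimal sketch. The struct and function names here (`RowsetSegmentMetaSketch`, `segment_id_at`, `position_of`) are hypothetical and are not the helpers added by this PR; the sketch only assumes that an empty `segment_ids` list means a legacy rowset whose ids run from 0 to num_segments - 1.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for the rowset metadata fields named in the summary.
struct RowsetSegmentMetaSketch {
    int64_t num_segments = 0;
    // Physical segment ids in rowset order. Assumption: an empty list marks
    // a legacy rowset whose ids are 0 .. num_segments - 1.
    std::vector<int64_t> segment_ids;
};

// Map a rowset position to the physical segment id used for file names,
// cache keys, and delete bitmap keys.
inline int64_t segment_id_at(const RowsetSegmentMetaSketch& meta, int64_t pos) {
    if (pos < 0 || pos >= meta.num_segments) {
        throw std::out_of_range("segment position out of range");
    }
    if (meta.segment_ids.empty()) {
        return pos;  // legacy rowset: position equals segment id
    }
    return meta.segment_ids.at(static_cast<std::size_t>(pos));
}

// Reverse lookup: find the rowset position of a physical segment id,
// or std::nullopt if the rowset does not contain that id.
inline std::optional<int64_t> position_of(const RowsetSegmentMetaSketch& meta,
                                          int64_t segment_id) {
    if (meta.segment_ids.empty()) {
        if (segment_id >= 0 && segment_id < meta.num_segments) {
            return segment_id;
        }
        return std::nullopt;
    }
    for (std::size_t i = 0; i < meta.segment_ids.size(); ++i) {
        if (meta.segment_ids[i] == segment_id) {
            return static_cast<int64_t>(i);
        }
    }
    return std::nullopt;
}
```

In this sketch, position-indexed metadata would be addressed with `pos`, while anything that names a physical file or key would go through `segment_id_at(meta, pos)`, so the two values are never interchanged silently.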
Release note: None
- Test: Manual test
- Ran git diff --cached --check
- Attempted build-support/clang-format.sh for changed C++ files, but it failed because llvm@16 is not installed
- Did not run build or unit tests per request
- Behavior changed: Yes. When enable_segment_list is enabled for cloud rowsets, new writes can persist non-contiguous segment file ids while reads remain compatible with legacy rowsets.
- Does this need documentation: No
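As a rough illustration of the "Behavior changed" item above, the following sketch shows one hypothetical way a rowset could end up with non-contiguous segment ids when ids are handed out from a monotonically increasing next_segment_id and never reused. The type `SegmentIdAllocatorSketch` and its `drop` operation are assumptions made for illustration; the PR text does not describe how gaps actually arise or how the real writer allocates ids.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical writer-side bookkeeping; not the implementation in this PR.
struct SegmentIdAllocatorSketch {
    std::vector<int64_t> segment_ids;  // ids currently part of the rowset
    int64_t next_segment_id = 0;       // next unused physical id

    // Hand out a fresh id and record it at the end of the rowset.
    int64_t allocate() {
        int64_t id = next_segment_id++;
        segment_ids.push_back(id);
        return id;
    }

    // Remove a segment from the rowset. Its id is intentionally not reused,
    // which is what leaves a gap in segment_ids.
    void drop(int64_t id) {
        segment_ids.erase(
                std::remove(segment_ids.begin(), segment_ids.end(), id),
                segment_ids.end());
    }
};
```

Under these assumptions, allocating three segments, dropping the second, and allocating one more leaves the list holding ids 0, 2, and 3, while a reader that still sees an empty list would keep treating position as segment id.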
### What problem does this PR solve?
Issue Number: None
Related PR: None
Problem Summary: Segment-list rowsets are only written for cloud rowsets, while the binlog paths covered here are local-mode only. The existing binlog code intentionally uses contiguous segment indexes as file ids. Add comments around these paths to prevent future changes from incorrectly applying segment-list mapping to local binlog files.
### Release note
None
### Check List (For Author)
- Test: Manual test
- Ran git diff --check
- Behavior changed: No
- Does this need documentation: No
Contributor
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
Contributor
Author
run buildall
Contributor
TPC-H: Total hot run time: 29991 ms
Contributor
TPC-DS: Total hot run time: 173687 ms
Contributor
ClickBench: Total hot run time: 25.36 s
Contributor
FE Regression Coverage Report
Increment line coverage