[fix](compaction) Fix incorrect memory availability check in RowSourceBuffer during vertical compaction #63152
Merged
liutang123 merged 1 commit on May 14, 2026
…eBuffer during vertical compaction
Exception Log:
thread_mem_tracker_mgr.h:248] alloc large memory: 4294967296, not in query or load, this is just a warning, not prevent memory alloc, stacktrace:
0# doris::ThreadMemTrackerMgr::consume(long, int)
1# Allocator<false, false, false, DefaultMemoryAllocator>::realloc_impl(void*, unsigned long, unsigned long, unsigned long)
2# void doris::vectorized::PODArrayBase<2ul, 4096ul, Allocator<false, false, false, DefaultMemoryAllocator>, 16ul, 15ul>::reserve_for_next_size<>()
3# doris::vectorized::RowSourcesBuffer::append(std::vector<doris::vectorized::RowSource, std::allocator<doris::vectorized::RowSource> > const&)
4# doris::vectorized::VerticalHeapMergeIterator::next_batch(doris::vectorized::Block*)
5# doris::vectorized::VerticalBlockReader::_direct_next_block(doris::vectorized::Block*, bool*)
6# doris::vectorized::VerticalBlockReader::next_block_with_aggregation(doris::vectorized::Block*, bool*)
7# doris::Merger::vertical_compact_one_group(std::shared_ptr<doris::BaseTablet>, doris::ReaderType, doris::TabletSchema const&, bool, std::vector<unsigned int, std::allocator<unsigned int> > const&, doris::vectorized::RowSourcesBuffer*, std::vector<std::shared_ptr<doris::RowsetReader>, std::allocator<std::shared_ptr<doris::RowsetReader> > > const&, doris::RowsetWriter*, long, doris::Merger::Statistics*, std::vector<unsigned int, std::allocator<unsigned int> >, long, doris::CompactionSampleInfo*)
8# doris::Merger::vertical_merge_rowsets(std::shared_ptr<doris::BaseTablet>, doris::ReaderType, doris::TabletSchema const&, std::vector<std::shared_ptr<doris::RowsetReader>, std::allocator<std::shared_ptr<doris::RowsetReader> > > const&, doris::RowsetWriter*, long, long, doris::Merger::Statistics*)
9# doris::Compaction::merge_input_rowsets()
10# doris::CloudCompactionMixin::execute_compact_impl(long)
11# doris::CloudCompactionMixin::execute_compact()
12# doris::CloudCumulativeCompaction::execute_compact()
13# std::_Function_handler<void (), doris::CloudStorageEngine::_submit_cumulative_compaction_task(std::shared_ptr<doris::CloudTablet> const&)::$_2>::_M_invoke(std::_Any_data const&)
14# doris::ThreadPool::dispatch_thread()
15# doris::Thread::supervise_thread(void*)
16# ?
17# ?
Reason: PaddedPODArray's `allocated_bytes` includes pad_left and pad_right, which are NOT usable for storing elements.
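The mismatch can be illustrated with a minimal sketch. `PaddedBufferModel`, its fields, and the pad sizes below are hypothetical stand-ins chosen for illustration, not the Doris `PaddedPODArray` itself. The sketch only assumes what the reason above states: the allocated byte count includes left and right padding that cannot hold elements.

```cpp
#include <cstddef>

// Hypothetical, simplified model of a padded POD array. It is NOT the Doris
// PaddedPODArray; the names and pad sizes are assumptions for illustration.
struct PaddedBufferModel {
    static constexpr std::size_t kElementSize = 2;  // bytes per element (assumed)
    static constexpr std::size_t kPadLeft = 16;     // padding before the elements (assumed)
    static constexpr std::size_t kPadRight = 16;    // padding after the elements (assumed)

    std::size_t usable_bytes;  // bytes that can actually hold elements
    std::size_t used_bytes;    // bytes currently occupied by elements

    // Total bytes taken from the allocator, padding included.
    constexpr std::size_t allocated_bytes() const {
        return usable_bytes + kPadLeft + kPadRight;
    }
    // Number of elements the usable region can hold.
    constexpr std::size_t capacity() const { return usable_bytes / kElementSize; }
    // Number of elements currently stored.
    constexpr std::size_t size() const { return used_bytes / kElementSize; }
};

// Over-estimating check: counts the padding as free space.
constexpr bool fits_by_allocated_bytes(const PaddedBufferModel& b, std::size_t n) {
    return b.allocated_bytes() - b.used_bytes >= n * PaddedBufferModel::kElementSize;
}

// Element-based check: compares free element slots with the request.
constexpr bool fits_by_capacity(const PaddedBufferModel& b, std::size_t n) {
    return b.capacity() - b.size() >= n;
}

// A completely full buffer: no free element slot is left.
constexpr PaddedBufferModel kFull{4096, 4096};

// The byte-based check still reports room for 10 elements (32 pad bytes >= 20 bytes),
// while the capacity-based check correctly reports that nothing fits.
static_assert(fits_by_allocated_bytes(kFull, 10), "padding is miscounted as free space");
static_assert(!fits_by_capacity(kFull, 10), "no free element slots remain");
```

In this model, a caller relying on the byte-based check would append into a full buffer and trigger a reallocation instead of taking its spill path, which is consistent with the large allocation shown in the log.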
Contributor
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
Contributor
Author
/review
Contributor
Code review completed. I did not find any blocking issue in this PR.
Critical checkpoint conclusions:
- Goal and proof: The change targets RowSourcesBuffer spill-threshold correctness by using PODArray capacity/size rather than allocated bytes minus used bytes, which matches the actual reallocation condition. The added unit test exercises repeated appends, spill, flush, seek, and read-back.
- Scope: The code change is small and focused on the spill decision; no unrelated behavior changes observed.
- Concurrency: RowSourcesBuffer is used as per-compaction local state; this change adds no shared state or locking changes.
- Lifecycle/static initialization: No new static/global lifecycle concerns.
- Configuration: No new config item; existing vertical_compaction_max_row_source_memory_mb semantics are preserved.
- Compatibility/storage format: No serialized format or persistent metadata change.
- Parallel paths: The shared RowSourcesBuffer path is used by vertical compaction/segcompaction readers, so the fix applies at the common point.
- Tests: Unit coverage was added. I attempted `./run-be-ut.sh --run --filter=VerticalCompactionTest.TestRowSourcesBufferSpillThreshold`, but the local runner environment failed during setup because `thirdparty/installed/bin/protoc` is missing, before the test could execute.
- Observability/performance: No new expensive hot-path behavior beyond the existing spill branch; using capacity avoids the over-estimation that could cause unintended reallocation.
- Transaction/data correctness: No transaction, delete-bitmap, visible-version, or rowset lifecycle changes.
User focus: No additional user-provided review focus was specified.
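As a rough illustration of the spill decision discussed in this review, the sketch below spills before an append whenever the free element slots cannot hold the incoming batch. `SpillingBufferSketch`, its element-count limit, and the in-memory `_spilled` vector standing in for a spill file are all hypothetical; the real `RowSourcesBuffer` and the `vertical_compaction_max_row_source_memory_mb` setting are not modeled here.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sketch of a capacity-based spill decision. It is NOT the Doris
// RowSourcesBuffer; the class, its limit, and the in-memory "spill" are assumptions.
class SpillingBufferSketch {
public:
    explicit SpillingBufferSketch(std::size_t max_elements) : _max_elements(max_elements) {
        _buffer.reserve(max_elements);
    }

    // Append a batch, spilling first if the free element slots cannot hold it.
    void append(const std::vector<std::uint16_t>& batch) {
        const std::size_t used = _buffer.size();
        const std::size_t free_slots = used < _max_elements ? _max_elements - used : 0;
        if (free_slots < batch.size()) {
            spill();
        }
        // Note: a single batch larger than the limit still grows the buffer here.
        _buffer.insert(_buffer.end(), batch.begin(), batch.end());
    }

    std::size_t buffered() const { return _buffer.size(); }
    std::size_t spilled() const { return _spilled.size(); }
    std::size_t spill_count() const { return _spill_count; }

private:
    // Move buffered elements to the stand-in for a spill file and reset the buffer.
    void spill() {
        _spilled.insert(_spilled.end(), _buffer.begin(), _buffer.end());
        _buffer.clear();
        ++_spill_count;
    }

    std::size_t _max_elements;
    std::size_t _spill_count = 0;
    std::vector<std::uint16_t> _buffer;
    std::vector<std::uint16_t> _spilled;
};
```

Counting free slots in elements, rather than subtracting used bytes from allocated bytes, keeps the decision aligned with the point at which the underlying array would otherwise have to grow.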
Contributor
Author
run buildall
Contributor
TPC-H: Total hot run time: 29770 ms
Contributor
TPC-DS: Total hot run time: 171506 ms
Contributor
BE UT Coverage Report: Increment line coverage; Increment coverage report
Contributor
Author
run p0
Contributor
BE Regression && UT Coverage Report: Increment line coverage; Increment coverage report
Contributor
PR approved by at least one committer and no changes requested.
Contributor
PR approved by anyone and no changes requested.
github-actions Bot pushed a commit that referenced this pull request on May 14, 2026
…eBuffer during vertical compaction (#63152)
Exception Log:
```
thread_mem_tracker_mgr.h:248] alloc large memory: 4294967296, not in query or load, this is just a warning, not prevent memory alloc, stacktrace:
0# doris::ThreadMemTrackerMgr::consume(long, int)
1# Allocator<false, false, false, DefaultMemoryAllocator>::realloc_impl(void*, unsigned long, unsigned long, unsigned long)
2# void doris::vectorized::PODArrayBase<2ul, 4096ul, Allocator<false, false, false, DefaultMemoryAllocator>, 16ul, 15ul>::reserve_for_next_size<>()
3# doris::vectorized::RowSourcesBuffer::append(std::vector<doris::vectorized::RowSource, std::allocator<doris::vectorized::RowSource> > const&)
4# doris::vectorized::VerticalHeapMergeIterator::next_batch(doris::vectorized::Block*)
5# doris::vectorized::VerticalBlockReader::_direct_next_block(doris::vectorized::Block*, bool*)
6# doris::vectorized::VerticalBlockReader::next_block_with_aggregation(doris::vectorized::Block*, bool*)
7# doris::Merger::vertical_compact_one_group(std::shared_ptr<doris::BaseTablet>, doris::ReaderType, doris::TabletSchema const&, bool, std::vector<unsigned int, std::allocator<unsigned int> > const&, doris::vectorized::RowSourcesBuffer*, std::vector<std::shared_ptr<doris::RowsetReader>, std::allocator<std::shared_ptr<doris::RowsetReader> > > const&, doris::RowsetWriter*, long, doris::Merger::Statistics*, std::vector<unsigned int, std::allocator<unsigned int> >, long, doris::CompactionSampleInfo*)
8# doris::Merger::vertical_merge_rowsets(std::shared_ptr<doris::BaseTablet>, doris::ReaderType, doris::TabletSchema const&, std::vector<std::shared_ptr<doris::RowsetReader>, std::allocator<std::shared_ptr<doris::RowsetReader> > > const&, doris::RowsetWriter*, long, long, doris::Merger::Statistics*)
9# doris::Compaction::merge_input_rowsets()
10# doris::CloudCompactionMixin::execute_compact_impl(long)
11# doris::CloudCompactionMixin::execute_compact()
12# doris::CloudCumulativeCompaction::execute_compact()
13# std::_Function_handler<void (), doris::CloudStorageEngine::_submit_cumulative_compaction_task(std::shared_ptr<doris::CloudTablet> const&)::$_2>::_M_invoke(std::_Any_data const&)
14# doris::ThreadPool::dispatch_thread()
15# doris::Thread::supervise_thread(void*)
16# ?
17# ?
```
Reason: PaddedPODArray's `allocated_bytes` includes pad_left and pad_right, which are NOT usable for storing elements.
Co-authored-by: liutang123 <liulijia1029@google.com>
What problem does this PR solve?
Issue Number: close #xxx
Related PR: #xxx
Problem Summary:
Release note
None
Check List (For Author)
Test
Behavior changed:
Does this need documentation?
Check List (For Reviewer who merge this PR)