-
Notifications
You must be signed in to change notification settings - Fork 2k
[BugFix] fix cloud native pk index memory statistic leak #60566
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Signed-off-by: luohaha <18810541851@163.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR refactors PersistentIndexMemtable
to accurately track heap memory usage for keys by switching from std::string
to std::string_view
, adding a helper to detect heap-allocated strings, and updating memory accounting and tests.
- Replace
std::string
conversions withstd::string_view
in all mutating and lookup methods to avoid unnecessary allocations. - Introduce
is_string_heap_allocated
to detect when astd::string
uses heap memory and update_keys_heap_size
accordingly. - Update
memory_usage()
andclear()
to include heap key sizes, and add unit tests foris_string_heap_allocated
.
Reviewed Changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.
Show a summary per file
File | Description |
---|---|
be/src/util/string_util.h | Declare is_string_heap_allocated helper |
be/src/util/string_util.cpp | Define heap-allocation detection logic |
be/test/util/string_util_test.cpp | Add unit test for is_string_heap_allocated |
be/src/storage/lake/persistent_index_memtable.h | Rename _keys_size to _keys_heap_size |
be/src/storage/lake/persistent_index_memtable.cpp | Use string_view , update heap-size tracking, memory_usage() , and clear() |
[FE Incremental Coverage Report]✅ pass : 0 / 0 (0%) |
[Java-Extensions Incremental Coverage Report]✅ pass : 0 / 0 (0%) |
[BE Incremental Coverage Report]✅ pass : 19 / 20 (95.00%) file detail
|
Why I'm doing:
In the current cloud-native PK index memtable implementation, we only account for the memory usage of keys and values. However, since we're using a B-tree for storage, the memory consumption of the B-tree's internal nodes is not being accurately tracked.
The B-tree provides a bytes_used interface to retrieve its total memory usage. However, since keys use std::string, the heap memory allocated by these strings isn't included in the bytes_used count. We need to implement additional tracking for this memory.
What I'm doing:
This pull request refactors the
PersistentIndexMemtable
implementation inbe/src/storage/lake/persistent_index_memtable.cpp
to improve memory tracking and efficiency. The changes include replacingstd::string
withstd::string_view
for key handling, introducing a new method to check heap allocation for strings, and updating memory usage calculations.Refactoring for Memory Efficiency:
std::string
withstd::string_view
for keys in methods likeupsert
,insert
,erase
,erase_with_filter
,replace
, andget
to reduce unnecessary string allocations. ([[1]](https://github.com/StarRocks/starrocks/pull/60566/files#diff-2a0d7738bf81a53d3c0561225e77a7ddfb6f5aec5f831a03c4b0371715c04728L33-R38)
,[[2]](https://github.com/StarRocks/starrocks/pull/60566/files#diff-2a0d7738bf81a53d3c0561225e77a7ddfb6f5aec5f831a03c4b0371715c04728L54-R65)
,[[3]](https://github.com/StarRocks/starrocks/pull/60566/files#diff-2a0d7738bf81a53d3c0561225e77a7ddfb6f5aec5f831a03c4b0371715c04728L87-R91)
,[[4]](https://github.com/StarRocks/starrocks/pull/60566/files#diff-2a0d7738bf81a53d3c0561225e77a7ddfb6f5aec5f831a03c4b0371715c04728L113-R115)
,[[5]](https://github.com/StarRocks/starrocks/pull/60566/files#diff-2a0d7738bf81a53d3c0561225e77a7ddfb6f5aec5f831a03c4b0371715c04728L130-R135)
,[[6]](https://github.com/StarRocks/starrocks/pull/60566/files#diff-2a0d7738bf81a53d3c0561225e77a7ddfb6f5aec5f831a03c4b0371715c04728L179-R180)
)is_string_heap_allocated
inbe/src/util/string_util.cpp
to determine whether astd::string
uses heap memory, and updated_keys_heap_size
calculations accordingly. ([[1]](https://github.com/StarRocks/starrocks/pull/60566/files#diff-58984ee0cc1ce8771537ac8af37a9bb3447abc880f413364470f5bbfc989a265R51-R58)
,[[2]](https://github.com/StarRocks/starrocks/pull/60566/files#diff-987ba58807d358f61be1b8ea0c619122113532bdb5b574f922986a6a79eada42R79)
)PersistentIndexMemtable::memory_usage()
to use_keys_heap_size
and_map.bytes_used()
for more accurate memory usage tracking. ([be/src/storage/lake/persistent_index_memtable.cppL179-R180](https://github.com/StarRocks/starrocks/pull/60566/files#diff-2a0d7738bf81a53d3c0561225e77a7ddfb6f5aec5f831a03c4b0371715c04728L179-R180)
)_keys_size
to_keys_heap_size
and reset it in theclear()
method to ensure proper memory management. ([[1]](https://github.com/StarRocks/starrocks/pull/60566/files#diff-0fa7a9fa792819b27a25ea176c8c8154af0ed664f12dfd466c8ed781e304b2aeL81-R81)
,[[2]](https://github.com/StarRocks/starrocks/pull/60566/files#diff-2a0d7738bf81a53d3c0561225e77a7ddfb6f5aec5f831a03c4b0371715c04728R189)
)Testing Enhancements:
test_is_string_heap_allocated
inbe/test/util/string_util_test.cpp
to verify the behavior of theis_string_heap_allocated
function for both small and large strings. ([be/test/util/string_util_test.cppR107-R115](https://github.com/StarRocks/starrocks/pull/60566/files#diff-ca9144c8e70f0db720811da5c0d9413dc7a706795fc7ae268380b3cd55f669b0R107-R115)
)What type of PR is this:
Does this PR entail a change in behavior?
If yes, please specify the type of change:
Checklist:
Bugfix cherry-pick branch check: