[refactor](storage type) remove deepcopy from typeinfo#61856
[refactor](storage type) remove deepcopy from typeinfo#61856yiguolei merged 2 commits intoapache:masterfrom
Conversation
|
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
|
|
run buildall |
TPC-H: Total hot run time: 26925 ms |
TPC-DS: Total hot run time: 169217 ms |
|
run buildall |
TPC-H: Total hot run time: 26737 ms |
TPC-DS: Total hot run time: 168444 ms |
BE UT Coverage ReportIncrement line coverage Increment coverage report
|
BE Regression && UT Coverage ReportIncrement line coverage Increment coverage report
|
1 similar comment
BE Regression && UT Coverage ReportIncrement line coverage Increment coverage report
|
|
/review |
Code Review SummaryPR GoalRemove the Critical Checkpoint Conclusions1. Does the code accomplish the goal? Is there a test that proves it? 2. Is this modification as small, clear, and focused as possible? 3. Concurrency concerns? 4. Lifecycle management? 5. Configuration items added? 6. Incompatible changes (function symbols, storage formats)? 7. Functionally parallel code paths? 8. Special conditional checks? 9. Test coverage? 10. Observability? 11. Transaction/persistence? 12. Data writes and modifications? 13. FE-BE variable passing? 14. Performance analysis? 15. Other issues?
VerdictNo blocking issues found. The refactoring is correct, complete (no remaining callers), and the replacement implementations are semantically equivalent to the originals. The pre-existing dead members in |
There was a problem hiding this comment.
Pull request overview
Refactors BE storage type handling by removing TypeInfo::deep_copy and updating dependent components/tests to avoid relying on Arena-based deep copies.
Changes:
- Removes
deep_copyfrom theTypeInfointerface and deletes related implementations/traits. - Updates
IndexedColumnWriterto store the first-value encoded key directly (replacing the prior deep-copied raw value buffer). - Updates tests and bloom-filter index writing paths to no longer call
deep_copy.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| be/test/storage/storage_types_test.cpp | Removes deep-copy assertions and adjusts type tests accordingly. |
| be/src/storage/types.h | Removes TypeInfo::deep_copy and associated deep-copy helpers/implementations. |
| be/src/storage/index/indexed_column_writer.h | Replaces faststring first-value buffer with std::string for encoded first-key storage. |
| be/src/storage/index/indexed_column_writer.cpp | Encodes/stashes the first value key at page start and reuses it when building the value index. |
| be/src/storage/index/bloom_filter/bloom_filter_index_writer.h | Simplifies PK bloom filter writer construction by removing stored TypeInfo*. |
| be/src/storage/index/bloom_filter/bloom_filter_index_writer.cpp | Replaces TypeInfo::deep_copy usage with manual Slice copying into an Arena. |
Comments suppressed due to low confidence (1)
be/test/storage/storage_types_test.cpp:146
TEST(ArrayTypeTest, copy_and_equal)no longer exercises any copy/equality behavior after removing the deep-copy assertions; it currently only validates the resolved array itemTypeInfo. Rename the test to reflect its actual intent (e.g., type resolution) or add assertions that match the current test name.
TEST(ArrayTypeTest, copy_and_equal) {
bool bool_array[3] = {true, false, true};
bool null_signs[3] = {true, true, true};
common_test_array<FieldType::OLAP_FIELD_TYPE_BOOL>(CollectionValue(bool_array, 3, null_signs));
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| Slice new_value; | ||
| RETURN_IF_CATCH_EXCEPTION(_type_info->deep_copy(&new_value, v, _arena)); | ||
| new_value.size = v->size; | ||
| if (v->size > 0) { | ||
| new_value.data = _arena.alloc(v->size); | ||
| memcpy(new_value.data, v->data, v->size); | ||
| } else { | ||
| new_value.data = nullptr; | ||
| } |
There was a problem hiding this comment.
PrimaryKeyBloomFilterIndexWriterImpl::add_values used to wrap the deep-copy path in RETURN_IF_CATCH_EXCEPTION(...), which converts allocation/oom exceptions (e.g., from Arena::alloc) into a returned Status. The new manual copy path allocates via _arena.alloc without this wrapper, so std::bad_alloc/tracker exceptions can escape and violate the method contract (“return error when memory is exhausted”), potentially crashing the writer on OOM. Wrap the allocation+memcpy logic (or the whole loop body) in RETURN_IF_CATCH_EXCEPTION and return the resulting error Status on exception.
|
PR approved by at least one committer and no changes requested. |
|
PR approved by anyone and no changes requested. |
What problem does this PR solve?
Issue Number: close #xxx
Related PR: #xxx
Problem Summary:
Release note
None
Check List (For Author)
Test
Behavior changed:
Does this need documentation?
Check List (For Reviewer who merge this PR)