Skip to content

Find bad usage of storage in-memory metadata#106200

Open
Michicosun wants to merge 17 commits into
masterfrom
find_bad_usage_of_in_memory_metadata
Open

Find bad usage of storage in-memory metadata#106200
Michicosun wants to merge 17 commits into
masterfrom
find_bad_usage_of_in_memory_metadata

Conversation

@Michicosun
Copy link
Copy Markdown
Member

Changelog category (leave one):

  • Not for changelog (changelog entry is not required)

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

There is no guarantee that metadata will outlive the query if the pointer is dropped early. Let's find all the places here.

@clickhouse-gh
Copy link
Copy Markdown
Contributor

clickhouse-gh Bot commented May 31, 2026

Workflow [PR], commit [8cae653]

Summary:

job_name test_name status info comment
Stateless tests (arm_asan_ubsan, targeted) FAIL
Server died FAIL cidb
Stateless tests (amd_asan_ubsan, db disk, distributed plan, sequential) FAIL
01161_all_system_tables FAIL cidb
Scraping system tables FAIL cidb
Stateless tests (amd_tsan, sequential, 2/2) FAIL
01161_all_system_tables FAIL cidb
Stateless tests (amd_msan, WasmEdge, sequential, 2/2) FAIL
01161_all_system_tables FAIL cidb

AI Review

Summary

This PR changes getInMemoryMetadataPtr to return a lifetime-owning handle and updates metadata/index users so borrowed metadata subobjects are not taken from dead temporaries. The direct rvalue dereference issue from prior review is fixed, but the current MergeTree override still deep-copies table metadata on every lookup, including cache hits, so I am not ready to approve without either avoiding that copy or showing it is negligible.

Findings

⚠️ Majors

  • [src/Storages/MergeTree/MergeTreeData.cpp:10776] MergeTreeData::getInMemoryMetadataPtr still promises query-scoped metadata caching, but the returned handle owns a fresh StorageInMemoryMetadata copy on every call. That copy clones projections, key/index expressions, settings ASTs, and text-index metadata, so analyzer/planner/read paths lose sharing and pay repeated metadata clone cost even when the query metadata cache hits.
    Suggested fix: return the cached/base StorageMetadataPtr in the handle for normal lookups, and clone only at the specific text-index metadata mutation site, or provide focused before/after evidence that the extra copies are not material.
Performance & Safety

The lifetime wrapper now blocks direct temporary operator-> / operator* use, and the previous dangling range-for patterns I checked were rewritten to keep a local handle alive. The remaining risk is performance and cache semantics from cloning MergeTree metadata on every lookup.

Final Verdict

Status: ⚠️ Request changes

Minimum required action: remove the unconditional metadata clone in MergeTreeData::getInMemoryMetadataPtr, or add focused measurements proving the clone cost is acceptable on representative query-analysis/read paths.

@clickhouse-gh clickhouse-gh Bot added the pr-not-for-changelog This PR should not be mentioned in the changelog label May 31, 2026
Comment thread src/Storages/MergeTree/MergeTreeData.cpp
@Michicosun
Copy link
Copy Markdown
Member Author

The text index drops metadata when a reference to it is used.

@kssenii kssenii self-assigned this Jun 1, 2026
Comment thread src/Storages/StorageInMemoryMetadata.h
}();

return cache->emplace(this, IStorage::getInMemoryMetadataPtr(query_context, bypass_metadata_cache)).first->second;
return std::make_shared<StorageInMemoryMetadata>(*base);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This still deep-copies StorageInMemoryMetadata on every MergeTree metadata lookup, including cache hits. StorageInMemoryMetadata copy clones projections, key/index expressions, and ASTs, so analyzer/planner/read paths now pay that cost repeatedly and the query metadata cache no longer shares the object returned by getInMemoryMetadataPtr. If the goal is only lifetime ownership, return the cached/base StorageMetadataPtr inside the handle and clone only in the text-index mutation path; otherwise this needs focused before/after evidence that the extra copies are negligible.

@clickhouse-gh
Copy link
Copy Markdown
Contributor

clickhouse-gh Bot commented Jun 1, 2026

LLVM Coverage Report

Metric Baseline Current Δ
Lines 84.40% 84.40% +0.00%
Functions 92.40% 92.40% +0.00%
Branches 77.00% 77.00% +0.00%

Changed lines: Changed C/C++ lines covered by tests: 577/690 (83.62%) | Lost baseline coverage (was covered on master, now uncovered in this PR): 37 line(s) · Uncovered code

Full report · Diff report

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-not-for-changelog This PR should not be mentioned in the changelog

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants