Backport of ClickHouse/ClickHouse#104123 — fix DataLake getAllTableNames heavyweight path #1739

@il9ue

Summary

Antalya users on antalya-25.8 are exposed to the DataLake catalog OOM described in ClickHouse#92494: SHOW TABLES against a Glue catalog with thousands of tables triggers per-table S3 metadata fetches and can exhaust memory. Upstream addressed this via ClickHouse#94467/ClickHouse#97062; those landed on antalya-26.1 (via Altinity PR #1552) and antalya-26.3 (inherited from upstream history) but were not ported to antalya-25.8.

Additionally, all three branches have a related residual hole that upstream ClickHouse#94467/ClickHouse#97062 did not close: the typo-hint path through getAllTableNames() still triggers the same heavyweight per-table fetch on any query referencing a non-existent table in a DataLake database. I filed ClickHouse#104123 upstream for this; PR ClickHouse#104124 is pending merge.

This issue tracks porting the typo-hint fix (ClickHouse#104123) to all three maintained Antalya branches, and asks for direction on whether to also backport the larger ClickHouse#97062 fix to 25.8.

State across maintained Antalya branches (verified)

| Branch | ClickHouse#94467 / ClickHouse#97062 (SHOW TABLES fix) | getAllTableNames overridden? |
| --- | --- | --- |
| antalya-25.8 (tip e1a7eb14) | No. getLightweightTablesIterator is still the original heavyweight implementation (50-thread pool + per-table tryGetTableImpl) | No. Falls through to IDatabase::getAllTableNames |
| antalya-26.1 (tip 889ea2b0) | Yes, via Altinity PR #1552 (merged 2026-03-26) | No. Same fallthrough |
| antalya-26.3 (tip 12361002) | Yes, picked up natively from upstream history | No. Same fallthrough |

No prior issue, PR, or commit message indicates ClickHouse#97062 was deliberately deferred for 25.8. Flagging this before assuming.

Proposed: ports for all three branches

I plan to make the typo-hint fix available on all three Antalya branches — 25.8, 26.1, and 26.3 — once ClickHouse#104124 merges upstream.

Easy ports: antalya-26.1 and antalya-26.3

Mechanically, this is a cherry-pick of the upstream fix. The override is a five-line getAllTableNames() mirroring the existing getLightweightTablesIterator() override that PR #1552 brought into 26.1 (and that 26.3 inherits from upstream). The port also carries the regression tests under tests/integration/test_database_glue/.

I'll open one PR per branch — separately rather than bundled — to match the pattern set by PR #1552.

antalya-25.8 needs more work — please advise on scope

The typo-hint fix itself is straightforward to port to 25.8. The override returns Strings, and getCatalog()->getTables() already exists on 25.8 (it is used inside the heavyweight getLightweightTablesIterator). So the mechanical fix is a five-line override with no API dependencies.
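
To make the shape of that override concrete, here is a minimal, self-contained sketch. It is not the upstream patch: MockCatalog, DatabaseSketch, DataLakeDatabaseSketch, loadTableMetadata, and the metadata_fetches counter are hypothetical stand-ins invented for illustration, and the real signatures in ClickHouse may differ (for example, they may take a context argument). Only the names getAllTableNames, getCatalog, getTables, and Strings come from the description above.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

using Strings = std::vector<std::string>;

// Hypothetical stand-in for the data lake catalog client (e.g. Glue).
struct MockCatalog
{
    Strings names;
    std::size_t metadata_fetches = 0;

    // Cheap: one listing call, no per-table work.
    Strings getTables() const { return names; }

    // Expensive: stands in for the per-table S3 metadata fetch.
    void loadTableMetadata(const std::string &) { ++metadata_fetches; }
};

// Simplified stand-in for the base database class (assumption, not the real IDatabase).
class DatabaseSketch
{
public:
    explicit DatabaseSketch(std::shared_ptr<MockCatalog> catalog_) : catalog(std::move(catalog_)) {}
    virtual ~DatabaseSketch() = default;

    // Default path (the fallthrough): touches every table just to learn its name.
    virtual Strings getAllTableNames() const
    {
        Strings result;
        for (const auto & name : catalog->getTables())
        {
            catalog->loadTableMetadata(name);
            result.push_back(name);
        }
        return result;
    }

    std::shared_ptr<MockCatalog> getCatalog() const { return catalog; }

private:
    std::shared_ptr<MockCatalog> catalog;
};

// The fix in miniature: answer the name listing from the catalog listing alone.
class DataLakeDatabaseSketch : public DatabaseSketch
{
public:
    using DatabaseSketch::DatabaseSketch;

    Strings getAllTableNames() const override { return getCatalog()->getTables(); }
};
```

In this sketch the base class performs one simulated metadata fetch per table, while the override performs none, which is the property the typo-hint path needs when it only wants candidate names.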

However, porting only my fix to 25.8 leaves the much larger SHOW TABLES OOM unaddressed. The original ClickHouse#92494 failure mode (53 GB RSS at 10k+ tables) remains fully reproducible on 25.8 because ClickHouse#97062 was never backported there.

Backporting ClickHouse#97062 to 25.8 is not a mechanical cherry-pick. It involves an API reshape:

  • The getLightweightTablesIterator virtual on IDatabase changes return type from DatabaseTablesIteratorPtr to
    std::vector<LightWeightTableDetails>.
  • A new LightWeightTableDetails struct is introduced.
  • Caller sites in StorageSystemTables and StorageSystemCompletions are updated to consume the new return type.
  • The diff of PR #1552 (the Antalya 26.1 backport of #97062, "Improve catalog show tables query") touched ~10 files on 26.1 for this reshape.
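
The caller-side effect of the reshape listed above can be sketched roughly as follows. This is an illustration under assumptions, not code from ClickHouse#97062: the single name field on LightWeightTableDetails is hypothetical (the real struct's fields are not given here), and TablesIteratorSketch, collectNamesOld, and collectNamesNew are invented stand-ins for the iterator type and for call sites such as those in StorageSystemTables.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical field set; the real LightWeightTableDetails may carry more or different data.
struct LightWeightTableDetails
{
    std::string name;
};

// Simplified stand-in for the iterator-style return type used before the reshape.
class TablesIteratorSketch
{
public:
    explicit TablesIteratorSketch(std::vector<std::string> names_) : names(std::move(names_)) {}

    bool isValid() const { return pos < names.size(); }
    void next() { ++pos; }
    const std::string & name() const { return names[pos]; }

private:
    std::vector<std::string> names;
    std::size_t pos = 0;
};

using TablesIteratorSketchPtr = std::unique_ptr<TablesIteratorSketch>;

// Caller before the reshape: drives the iterator protocol step by step.
std::vector<std::string> collectNamesOld(TablesIteratorSketchPtr it)
{
    std::vector<std::string> out;
    for (; it && it->isValid(); it->next())
        out.push_back(it->name());
    return out;
}

// Caller after the reshape: consumes a plain vector of lightweight records.
std::vector<std::string> collectNamesNew(const std::vector<LightWeightTableDetails> & tables)
{
    std::vector<std::string> out;
    out.reserve(tables.size());
    for (const auto & table : tables)
        out.push_back(table.name);
    return out;
}
```

Both callers produce the same list of names, but every call site has to switch from the iterator loop to the vector loop, which is why a change to the virtual's return type fans out across several files.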

Three paths for 25.8 — I'd like guidance:

  1. Port my typo-hint fix only. Closes the typo-hint hole. The PR description would state plainly that the SHOW TABLES heavyweight path remains until ClickHouse#97062 is backported separately. Smallest change, ships fast, leaves the bigger bug alive.

  2. Backport ClickHouse#97062 first, then my fix on top. Closes both holes. Substantially larger change to 25.8 (10+ files, virtual signature changes). Requires careful review against any in-flight 25.8 work I can't see.

  3. Skip 25.8 entirely. If 25.8 is end-of-life soon, or if there's in-flight work that would conflict with the API reshape in ClickHouse#97062, this is the right call. I'd hold off on 25.8 and port to 26.1 / 26.3 only.

I'm comfortable doing any of the three; what's the preferred shape for 25.8 given your maintenance plans?

Timeline

I'll wait for ClickHouse#104124 to merge upstream before opening Altinity PRs, so the ports cherry-pick from finalized code. The 26.1 and 26.3 ports can move as soon as you confirm direction; the 25.8 port shape depends on your answer above.
