Skip to content

Antalya 26.1 Backport of #100452 - DataLakeCatalog: avoid full catalog read for UNKNOWN_TABLE typo hints#1583

Merged
zvonand merged 2 commits intoantalya-26.1from
backports/antalya-26.1/100452
Mar 28, 2026
Merged

Antalya 26.1 Backport of #100452 - DataLakeCatalog: avoid full catalog read for UNKNOWN_TABLE typo hints#1583
zvonand merged 2 commits intoantalya-26.1from
backports/antalya-26.1/100452

Conversation

@mkmkme
Copy link
Copy Markdown
Collaborator

@mkmkme mkmkme commented Mar 26, 2026

Changelog category (leave one):

  • Improvement

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

Avoid scanning the whole remote data lake catalog for “Maybe you meant …” table hints when show_data_lake_catalogs_in_system_tables is disabled (ClickHouse#100452 by @alsugiliazova).

Documentation entry for user-facing changes

...

CI/CD Options

Exclude tests:

  • Fast test
  • Integration Tests
  • Stateless tests
  • Stateful tests
  • Performance tests
  • All with ASAN
  • All with TSAN
  • All with MSAN
  • All with UBSAN
  • All with Coverage
  • All with Aarch64
  • All Regression
  • Disable CI Cache

Regression jobs to run:

  • Fast suites (mostly <1h)
  • Aggregate Functions (2h)
  • Alter (1.5h)
  • Benchmark (30m)
  • ClickHouse Keeper (1h)
  • Iceberg (2h)
  • LDAP (1h)
  • Parquet (1.5h)
  • RBAC (1.5h)
  • SSL Server (1h)
  • S3 (2h)
  • S3 Export (2h)
  • Swarms (30m)
  • Tiered Storage (2h)

@github-actions
Copy link
Copy Markdown

github-actions bot commented Mar 26, 2026

Workflow [PR], commit [50a5ed1]

@mkmkme
Copy link
Copy Markdown
Collaborator Author

mkmkme commented Mar 26, 2026

AI audit note: This review comment was generated by AI (claude-4.6-sonnet-medium-thinking).

Audit update for PR #1583 (DataLakeCatalog: avoid full catalog read for UNKNOWN_TABLE typo hints):

Confirmed defects:

No confirmed defects in reviewed scope.

Coverage summary:

  • Scope reviewed: DatabaseCatalog.cpp TableNameHints::getAllRegisteredNames() (7 added / 3 deleted lines), call-graph from UNKNOWN_TABLE exception paths through hint generation, getExtendedHintForTable cross-DB path, DatabaseNameHints sibling class, setting declaration and guard logic.
  • Categories failed: None.
  • Categories passed: guard ordering/null-safety, null-context path, normal-DB path unchanged, extended-hint path unaffected, setting=true path, DatabaseNameHints unaffected, thread safety (no new shared mutable state), C++ lifetime safety.
  • Assumptions/limits: Static analysis only; getAllTableNames(nullptr) null-context behavior is pre-existing and not audited here; getExtendedHintForTable hardcoding with_datalake_catalogs = false is pre-existing and out of scope.

One pre-existing design note (not a defect in this PR): getExtendedHintForTable unconditionally filters out datalake catalogs via GetDatabasesOptions{.with_datalake_catalogs = false} regardless of the show_data_lake_catalogs_in_system_tables setting, so cross-database datalake hints remain suppressed even when a user sets the flag to true. This asymmetry predates PR 1583 and would be a separate improvement if desired.

@alsugiliazova
Copy link
Copy Markdown
Member

PR #1583 CI Verification Report

CI Results Overview

Category Count
Success 55
Failure 2
Skipped 39
Total (result_pr.json) 96

Dedicated Regression Test: similar_table_names_hint

A dedicated regression test exists for this functionality at iceberg/tests/iceberg_engine/show_data_lake_catalogs_repro.py, scenario similar_table_names_hint. It is included in the iceberg_engine feature (loaded via feature.py line 77) and runs as part of the Iceberg regression suites.

The test:

  1. Creates multiple tables with similar names across namespaces in a DataLakeCatalog
  2. Verifies that with show_data_lake_catalogs_in_system_tables = 0, catalog tables are hidden from system.tables
  3. Verifies that SHOW TABLES FROM still works (explicit listing)
  4. Verifies that dropping a non-existent table with a similar name does not show typo hints when show_data_lake_catalogs_in_system_tables = 0 (i.e., the expensive getAllTableNames() is skipped)

Iceberg regression results:

Suite x86 aarch64
Iceberg 1 Pass (1 module ok, 761 ok scenarios) Pass
Iceberg 2 Pass Pass

The similar_table_names_hint test is part of the 761 passing scenarios in the Iceberg 1 suite. Both Iceberg regression suites passed on both platforms.

CI Failures

1. test_graceful_shutdown — Known Issue, Passed on Retry

Job: Integration tests (amd_binary, 2/5)

test_s3_cluster/test.py::test_graceful_shutdown failed (15.4s) but has retry_ok label — passed on retry. Known issue tracked in #1521.

Related to PR: No — Pre-existing known flaky test

2. test_move_after_processing[another_bucket-AzureQueue] — Unrelated Azure Queue Test

Job: Integration tests (arm_binary, distributed plan, 3/4)

1 failure out of 1,377 tests. Azure Queue storage processing test, completely unrelated to DataLakeCatalog hint logic.

Related to PR: No — Azure Queue test

3. Settings Regression (x86 + aarch64) — Snapshot Mismatch

Jobs: RegressionTestsRelease / Common (settings), RegressionTestsAarch64 / Common (settings)

Failed test: /settings/default values/iceberg_metadata_staleness_ms — snapshot baseline mismatch for a newly added setting (from PR #1575). Not related to this PR's change.

Related to PR: No — Snapshot drift from PR #1575's new setting

4. Swarms Regression (x86 + aarch64) — Pre-existing Node Failure Instability

x86: 3 failed + 1 errored scenarios (node failure: restart, network failure, disk space, restart clickhouse)
aarch64: 2 failed + 1 errored scenarios

Pre-existing instability on the antalya-26.1 branch, same pattern as other PRs.

Related to PR: No — Pre-existing Swarms node failure instability

5. AggregateFunctions (3) (aarch64) — Pre-existing Test Failure

Failed test: /aggregate functions/part 3/state/rankCorrState — aggregate function state test. Completely unrelated to DatabaseCatalog.cpp changes.

Related to PR: No — Pre-existing aggregate function issue

6. GrypeScan (-alpine) — CVE in Base Image

CVE-2026-2673 (High) in Alpine base image OpenSSL.

Related to PR: No — Base image vulnerability

Conclusion

Verdict: Ready to merge — No PR-related failures.

@alsugiliazova alsugiliazova added the verified Verified by QA label Mar 27, 2026
@zvonand zvonand merged commit c665924 into antalya-26.1 Mar 28, 2026
492 of 510 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants