Antalya 26.3: Fix view over iceberg #1759

Open
ianton-ru wants to merge 3 commits into antalya-26.3 from bugfix/antalya-26.3/view_over_iceberg

Conversation

@ianton-ru

Changelog category (leave one):

  • Bug Fix (user-visible misbehavior in an official stable release)

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

Fix incorrect behavior of views over Iceberg tables.

Documentation entry for user-facing changes

Resolves #1669: do not substitute constant NULL values for complex column names such as materialize(time).
Resolves #1629: the test now uses the ParquetReadRowGroups profile event instead of S3GetObject/AzureGetObject.
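
As an illustration of the second item, a test can read a per-query counter from `system.query_log` after flushing logs. This is a minimal sketch, not the code in this PR: the helper name `profile_event_query` is hypothetical, and it only builds the SQL text, assuming the standard `ProfileEvents` map column of `system.query_log`.

```python
def profile_event_query(query_id: str, event: str = "ParquetReadRowGroups") -> str:
    """Build a SELECT that reads one ProfileEvents counter for a finished query.

    Hypothetical helper for illustration; the real test may be structured differently.
    """
    # Basic guard so the values cannot break out of the SQL string literals.
    for value in (query_id, event):
        if "'" in value or "\\" in value:
            raise ValueError(f"unsupported character in {value!r}")
    return (
        f"SELECT ProfileEvents['{event}'] "
        "FROM system.query_log "
        f"WHERE query_id = '{query_id}' AND type = 'QueryFinish'"
    )


print(profile_event_query("my-query-id"))
```

The resulting statement would typically be run after `SYSTEM FLUSH LOGS`, so that the query's log entry is visible.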

CI/CD Options

Exclude tests:

  • Fast test
  • Integration Tests
  • Stateless tests
  • Stateful tests
  • Performance tests
  • All with ASAN
  • All with TSAN
  • All with MSAN
  • All with UBSAN
  • All with Coverage
  • All with Aarch64
  • All Regression
  • Disable CI Cache

Regression jobs to run:

  • Fast suites (mostly <1h)
  • Aggregate Functions (2h)
  • Alter (1.5h)
  • Benchmark (30m)
  • ClickHouse Keeper (1h)
  • Iceberg (2h)
  • LDAP (1h)
  • Parquet (1.5h)
  • RBAC (1.5h)
  • SSL Server (1h)
  • S3 (2h)
  • S3 Export (2h)
  • Swarms (30m)
  • Tiered Storage (2h)

@ianton-ru added the antalya, port-antalya (PRs to be ported to all new Antalya releases), and antalya-26.3 labels on May 8, 2026
@ianton-ru changed the title from "Bugfix/antalya 26.3/view over iceberg" to "Antalya 26.3: Fix view over iceberg" on May 8, 2026
@github-actions

github-actions Bot commented May 8, 2026

Workflow [PR], commit [bea30ee]

@ianton-ru ianton-ru mentioned this pull request May 8, 2026
27 tasks
@ianton-ru
Author

@codex review

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. Already looking forward to the next diff.

@ianton-ru
Author

Audit: PR #1759 — Antalya 26.3: Fix view over Iceberg

AI audit note: This review comment was generated by AI (audit-review skill).

Audit update for PR #1759 (Fix view over Iceberg table under experimental read optimization):

Diff scope: merge-base 69f51ce20c5e060e2dbad3ff67c318469d4ec010 (antalya-26.3) → altinity/bugfix/antalya-26.3/view_over_iceberg. Files: src/Storages/ObjectStorage/StorageObjectStorageSource.cpp, tests/integration/test_storage_iceberg_with_spark/test_read_constant_columns_optimization.py.

Confirmed defects

No confirmed defects in reviewed scope.

Coverage summary

  • Scope reviewed: StorageObjectStorageSource::createReader branch guarded by allow_experimental_iceberg_read_optimization: second pass over requested_columns_list that treats nullable columns missing from Parquet-derived file metadata as constant NULL, plus integration test updates (new view test; ProfileEvents switched to ParquetReadRowGroups and tightened count expectations).

  • Categories failed: (none).

  • Categories passed: constant-column vs missing-metadata correctness (new skip for synthetic materialize(...) names); string starts_with / ends_with checks (no new lifetime/iterator use); regression coverage for CREATE VIEW ... AS SELECT * FROM iceberg(...) with optimization on/off; test metric alignment assuming one row group per file for event counts.

  • Assumptions/limits: Only column names matching materialize( and ending with ) are excluded from forced constant-NULL injection; other planner-generated synthetic names could behave like the original bug unless covered elsewhere. Planner/view naming alignment with this pattern was not traced end-to-end in this pass (static audit only).
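
The limit described above can be illustrated with a small Python model of the name check. This is a sketch only: the actual check lives in C++ in `StorageObjectStorageSource.cpp`, the function name `looks_like_materialize` is hypothetical, and `identity(time)` is an invented example of another wrapper, not a name the planner is known to produce.

```python
def looks_like_materialize(name: str) -> bool:
    """Model of the heuristic: True for names of the form materialize(...)."""
    return name.startswith("materialize(") and name.endswith(")")


# Excluded from constant-NULL injection by the heuristic.
assert looks_like_materialize("materialize(time)")

# An ordinary column name remains eligible for injection.
assert not looks_like_materialize("time")

# A hypothetical different wrapper is not recognized: the residual gap.
assert not looks_like_materialize("identity(time)")
```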

Expanded review notes (methodology snapshot)

Call graph (in scope)

  1. Entry: object-storage read pipeline → StorageObjectStorageSource::createReader with Iceberg metadata available on object_info.

  2. When allow_experimental_iceberg_read_optimization holds and file metadata exists: first loop fills constant_columns* from metadata statistics; second loop walks requested columns (getNameInStorage() keys).

  3. For each nullable requested column absent from file metadata (and not prewhere/RLS): previously always injected constant NULL and erased from physical read — wrong for certain view-exposed synthetic columns.

  4. Change: if the name starts with materialize( and ends with ), skip the injection and leave the column to the normal format read/compute paths.
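
The second loop in steps 2 to 4 can be modeled roughly as follows. This is a simplified Python sketch of the control flow described in this audit, not the C++ implementation: `RequestedColumn`, `plan_second_pass`, and the `protected` set (standing in for prewhere/RLS columns) are hypothetical names, and the first loop over metadata statistics is not modeled.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Set, Tuple


@dataclass(frozen=True)
class RequestedColumn:
    name_in_storage: str  # models the getNameInStorage() key
    nullable: bool


def plan_second_pass(
    requested: Iterable[RequestedColumn],
    file_metadata_columns: Set[str],
    protected: Set[str],
    skip_injection: Callable[[str], bool],
) -> Tuple[List[str], List[str]]:
    """Split requested columns into (constant NULL, left for the normal read path)."""
    constant_null: List[str] = []
    normal_read: List[str] = []
    for col in requested:
        name = col.name_in_storage
        inject = (
            col.nullable
            and name not in file_metadata_columns
            and name not in protected      # prewhere/RLS columns are never replaced
            and not skip_injection(name)   # the new check added by this change
        )
        (constant_null if inject else normal_read).append(name)
    return constant_null, normal_read


requested = [
    RequestedColumn("id", nullable=False),
    RequestedColumn("extra", nullable=True),
    RequestedColumn("materialize(time)", nullable=True),
]
constant_null, normal_read = plan_second_pass(
    requested,
    file_metadata_columns={"id"},
    protected=set(),
    skip_injection=lambda n: n.startswith("materialize(") and n.endswith(")"),
)
print(constant_null)  # ['extra']
print(normal_read)    # ['id', 'materialize(time)']
```

Passing `skip_injection=lambda n: False` reproduces the earlier behavior in this model, where `materialize(time)` would also be replaced with a constant NULL.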

Transitions and invariant

  • Invariant at risk: “nullable + absent from parquet metadata ⇒ safe to substitute constant NULL” — violated for expressions wrapped as materialize(...) behind a view.

  • Mitigation: narrow string heuristic on storage column name.

Concurrency / C++ hazard classes

Per-request local maps and strings; no new locking or shared mutable state in the diff. No new UB classes identified (memory, overflow, races) tied to these edits.

Logical fault buckets (conceptual injection)

| Category                         | Status        | Outcome                                   |
| -------------------------------- | ------------- | ----------------------------------------- |
| Synthetic nullable column naming | Applied       | Mitigated for materialize(...)            |
| Alternate synthetic wrappers     | Not exercised | Potential residual gap                    |
| Integration test flake on events | Passed design | Depends on ParquetReadRowGroups semantics |
| Non-nullable missing columns     | N/A           | Earlier continue skips loop body          |

Collaborator

@mkmkme mkmkme left a comment

lgtm


Labels

antalya, antalya-26.3, bugfix, port-antalya (PRs to be ported to all new Antalya releases)
