Add commit info to partition exports table#1832

Open
arthurpassos wants to merge 6 commits into antalya-26.3 from expand-replicated-partition-exports-columns

@arthurpassos
Collaborator

Changelog category (leave one):

  • Improvement

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

Add Iceberg metadata file paths and data file paths to the system.replicated_partition_exports table to improve observability and debugging.

Documentation entry for user-facing changes

...

CI/CD Options

Exclude tests:

  • Fast test
  • Integration Tests
  • Stateless tests
  • Stateful tests
  • Performance tests
  • All with ASAN
  • All with TSAN
  • All with MSAN
  • All with UBSAN
  • All with Coverage
  • All with Aarch64
  • All Regression
  • Disable CI Cache

Regression jobs to run:

  • Fast suites (mostly <1h)
  • Aggregate Functions (2h)
  • Alter (1.5h)
  • Benchmark (30m)
  • ClickHouse Keeper (1h)
  • Iceberg (2h)
  • LDAP (1h)
  • Parquet (1.5h)
  • RBAC (1.5h)
  • SSL Server (1h)
  • S3 (2h)
  • S3 Export (2h)
  • Swarms (30m)
  • Tiered Storage (2h)

@arthurpassos added the antalya-26.3, port-antalya (PRs to be ported to all new Antalya releases), and antalya labels on May 25, 2026
@arthurpassos
Collaborator Author

@codex review

@github-actions

github-actions Bot commented May 25, 2026

Workflow [PR], commit [15a0608]


@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 0a9022d641

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +1900 to +1907
LOG_INFO(log,
"Export transaction {} already committed, skipping re-commit",
transaction_id);
return;
/// Empty commit info: paths produced by the original commit are not recoverable
/// here without re-reading the snapshot. ExportPartitionUtils::commit only
/// writes the commit_info znode when this struct is non-empty, so the original
/// committer's record (if any) is preserved.
return {};

P2: Return commit info for already-committed recovery path

When isExportPartitionTransactionAlreadyCommitted is true here, the function returns an empty ExportPartitionCommitInfo, which makes ExportPartitionUtils::commit skip writing <export-entry>/commit_info and only flip status to COMPLETED. In the common recovery scenario (snapshot committed previously, then retried before ZooKeeper was updated), this leaves committed_* fields permanently empty and contradicts the new observability behavior/documentation that expects a visible marker for already-committed retries.
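One way to read this suggestion is sketched below. The struct `CommitInfoSketch` and the helper `commitInfoForAlreadyCommitted` are hypothetical names invented for illustration, and the field layout is an assumption; only the sentinel string comes from the test assertion quoted in this review. The point is simply that a non-empty value makes the `empty()` check pass, so a `commit_info` record would be written on the recovery path.

```cpp
#include <string>
#include <vector>

// Hypothetical shape of the commit info struct; the real fields may differ.
struct CommitInfoSketch
{
    std::string metadata_file;
    std::vector<std::string> data_files;

    // Mirrors the idea that commit_info is only persisted when this is false.
    bool empty() const { return metadata_file.empty() && data_files.empty(); }
};

// Sentinel text as it appears in the test assertion quoted in this PR.
static const std::string already_committed_sentinel
    = "<committed in a previous run, paths unavailable>";

// Assumed recovery-path behavior: return a visible marker instead of `{}`.
CommitInfoSketch commitInfoForAlreadyCommitted()
{
    CommitInfoSketch info;
    info.metadata_file = already_committed_sentinel;
    return info;
}
```

Whether a sentinel is preferable to re-reading the snapshot for the real paths is a design choice the PR has to settle; this sketch only shows the minimal variant.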


Comment on lines +857 to +858
assert committed_metadata_file == "<committed in a previous run, paths unavailable>", (
f"Expected sentinel in committed_metadata_file for already-committed retry, got: {committed_metadata_file!r}"

P1: Fix sentinel expectation in post-publish failpoint test

This assertion is inconsistent with the implementation under iceberg_writes_post_publish_throw: that failpoint is ONCE, and the published catch path now returns real file paths (storage_metadata_name, manifest list, manifest file), so committed_metadata_file should be a metadata path, not the "previous run" sentinel. As written, the test will fail despite correct behavior and can block CI.


@arthurpassos
Collaborator Author

@codex review

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. Already looking forward to the next diff.


@ianton-ru

AI audit note: This review comment was generated by AI (gpt-5.3-codex).

Audit update for PR #1832 (Add commit info to partition exports table):

Confirmed defects

Medium: commit_info serialization failure can block COMPLETED transition

Impact: A successful destination commit (Iceberg snapshot or object-storage marker) can remain stuck in PENDING (or require later recovery) if commit-info JSON generation throws before ZooKeeper /status is flipped to COMPLETED.

Anchor: src/Storages/MergeTree/ExportPartitionUtils.cpp / ExportPartitionUtils::commit and ExportReplicatedMergeTreePartitionCommitInfoEntry::toJsonString

Trigger: Any exception thrown by ExportReplicatedMergeTreePartitionCommitInfoEntry::toJsonString (it explicitly enables std::ios::failbit exceptions) while destination_commit_info is non-empty.

Why defect: The new observability side-effect (JSON stringify) is now on the critical path of the state transition; a stringify exception aborts the function before the status update, changing behavior from “commit then mark completed” to “commit then throw”.

Fix direction (short): Make commit_info persistence best-effort: wrap toJsonString + tryMulti in try/catch and always fall back to status-only trySet on any exception.

Regression test direction (short): Add a failpoint (or targeted fault injection) that throws during commit_info serialization and assert the task still reaches COMPLETED.

Evidence

Serialization enables stream exceptions:

std::string toJsonString() const
{
    Poco::JSON::Object json;
    json.set("iceberg_metadata_file", iceberg_metadata_file);
    // ...
    std::ostringstream oss;
    oss.exceptions(std::ios::failbit);
    Poco::JSON::Stringifier::stringify(json, oss);
    return oss.str();
}

toJsonString is called before the status flip in the new “atomic commit_info + COMPLETED” multi-op; an exception here prevents reaching the fallback status-only set:

if (!destination_commit_info.empty())
{
    ExportReplicatedMergeTreePartitionCommitInfoEntry commit_info_entry { /* ... */ };
    const std::string commit_info_path = fs::path(entry_path) / "commit_info";

    Coordination::Requests ops;
    ops.emplace_back(zkutil::makeCreateRequest(commit_info_path, commit_info_entry.toJsonString(), zkutil::CreateMode::Persistent));
    ops.emplace_back(zkutil::makeSetRequest(status_path, completed_name, -1));
    // ...
    if (rc == Coordination::Error::ZOK)
    {
        LOG_INFO(log, "ExportPartition: Marked export as completed and persisted commit_info");
        return;
    }
    // fall through to status-only set on ZNODEEXISTS / other errors
}
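The "fix direction" above can be sketched as follows. This is a minimal model under stated assumptions, not the PR's code: `FakeKeeper` is a hypothetical in-memory stand-in for the coordination service (it is not the real `zkutil` API), and `markCompletedBestEffort` is an invented name. Serialization is injected as a callable so that a failure can be simulated.

```cpp
#include <functional>
#include <stdexcept>
#include <string>

// Hypothetical in-memory stand-in for the coordination service.
struct FakeKeeper
{
    std::string status = "PENDING";
    std::string commit_info;
    bool has_commit_info = false;

    // Models the atomic multi-op: create commit_info and set status together.
    // Returns false if commit_info already exists (the ZNODEEXISTS case).
    bool tryCreateCommitInfoAndComplete(const std::string & payload)
    {
        if (has_commit_info)
            return false;
        commit_info = payload;
        has_commit_info = true;
        status = "COMPLETED";
        return true;
    }

    // Models the status-only fallback.
    void setStatusCompleted() { status = "COMPLETED"; }
};

// Best-effort persistence: a failure to build or write commit_info
// must not prevent the transition to COMPLETED.
void markCompletedBestEffort(FakeKeeper & keeper, const std::function<std::string()> & serialize)
{
    try
    {
        if (keeper.tryCreateCommitInfoAndComplete(serialize()))
            return;
    }
    catch (const std::exception &)
    {
        // The observability data is lost here, but the state transition still happens below.
    }
    keeper.setStatusCompleted();
}
```

Passing a `serialize` callable that throws leaves `has_commit_info` false while `status` still becomes `COMPLETED`, which is the property the proposed regression test would assert.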

Coverage summary

| Item | Detail |
|---|---|
| Scope reviewed | ZK state machine around export task completion (processed/*, status, new commit_info), commit idempotency behavior for Iceberg and plain object storage, in-memory mirror → system.replicated_partition_exports, and integration tests added/updated in the patch. |
| Categories failed | Exception-safety / partial-update (new JSON serialization on completion critical path). |
| Categories passed | State-transition consistency (status + commit_info via tryMulti with status-only fallback), idempotency signaling (Iceberg “already committed” sentinel; object-storage marker path surfaced even if pre-existing), concurrency/interleaving (peer-written commit_info handled via ZNODEEXISTS → status-only set), best-effort ZK mirroring semantics (poll-time refresh; no extra ZK reads on system-table query). |
| Assumptions/limits | Audit is based on the PR’s public patch (.patch) and static reasoning (no local build/test execution). |

### Commit info columns

These columns surface paths produced by the destination storage during commit, so it is possible to inspect what was written without consulting the destination directly:


The description for the destination_file_paths column is missing.

/// files and reaching this point, the task still completes via the recovery
/// path but commit_info will be absent. Recovering commit_info from the
/// live Iceberg snapshot in that case is a possible future enhancement.
const std::string status_path = fs::path(entry_path) / "status";

The AI audit suggests that an exception in the block below can break the commit transaction.

Collaborator Author


I don't think that is entirely true. Here's what can happen:

Once we commit to Iceberg, we need to mark the export as completed in ZooKeeper. If the code that creates the commit info throws, we don't mark it as completed in ZooKeeper and it remains in the pending state. On the next scheduler tick, we'll try to commit it again and notice that it has already been committed. In that case, we just mark it as completed. It won't remain pending forever, as far as I can tell.
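The sequence described in this reply can be modeled with a small simulation. Everything here is a hypothetical sketch: `ExportTask`, `schedulerTick`, and the `fail_commit_info` flag are invented for illustration and only approximate the behavior discussed in this thread.

```cpp
#include <string>

// Hypothetical model of one export task; field names are illustrative only.
struct ExportTask
{
    bool committed_to_destination = false;  // e.g. the Iceberg snapshot was published
    std::string status = "PENDING";         // mirrors the ZooKeeper status node
    bool has_commit_info = false;           // whether commit_info was persisted
};

// One scheduler tick. `fail_commit_info` simulates an exception thrown
// while building commit info, after the destination commit succeeded.
void schedulerTick(ExportTask & task, bool fail_commit_info)
{
    if (task.status == "COMPLETED")
        return;

    if (task.committed_to_destination)
    {
        // Already-committed recovery: nothing is re-committed, only the status is flipped.
        task.status = "COMPLETED";
        return;
    }

    task.committed_to_destination = true;   // the destination commit succeeds
    if (fail_commit_info)
        return;                             // stands in for the exception: status stays PENDING

    task.has_commit_info = true;
    task.status = "COMPLETED";
}
```

In this model a task whose first tick fails after the destination commit stays `PENDING` for that tick and reaches `COMPLETED` on the next one, but with `has_commit_info` still false. That matches both observations in the thread: the task is not stuck forever, and the commit info is absent after recovery.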

Labels

antalya antalya-26.3 port-antalya PRs to be ported to all new Antalya releases

2 participants