Skip to content

[Kernel] [CatalogManaged] Committer, CommitPayload, CommitResponse, CommitFailedException APIs #4814

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

scottsand-db
Copy link
Collaborator

@scottsand-db scottsand-db commented Jun 24, 2025

🥞 Stacked PR

Use this link to review incremental changes.


Which Delta project/connector is this regarding?

  • Spark
  • Standalone
  • Flink
  • Kernel
  • Other (fill in here)

Description

This PR creates the Committer, CommitPayload, CommitResponse, CommitFailedException public APIs.

How was this patch tested?

Just new APIs.

Does this PR introduce any user-facing changes?

Yes.

@scottsand-db scottsand-db changed the title done first pass [Kernel] [CatalogManaged] Committer, CommitPayload, CommitResponse, CommitFailedException APIs Jun 24, 2025
@scottsand-db scottsand-db self-assigned this Jun 24, 2025
* The {@link Protocol} that was read at the beginning of the commit. Empty if a new table is
* being created.
*/
public final Optional<Protocol> readProtocolOpt;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you remind me why we need the old/new p and ms?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question! For example, to detect that the catalogManaged table feature has been enabled on the table.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And, certainly, you want the new Protocol and Metadata so you can send that to the catalog, and then it always knows what's the latets P & M, and it can give that back on the next read request to efficiently construct a ResolvedTable without having to go to the file system to resolve the latest P & M

@scottsand-db scottsand-db force-pushed the stack/kernel_catalog_managed_writes_1 branch from b603c47 to 072ede5 Compare June 25, 2025 21:18
@scottsand-db scottsand-db mentioned this pull request Jun 25, 2025
5 tasks
* from Kernel to {@link Committer} implementations, which use the information to perform
* catalog-specific or filesystem-specific commit operations.
*
* <p><strong>Important:</strong> The actions iterator should only be consumed once. Committer
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Engine implementations

}

/** All Delta actions (data and metadata) that are part of this commit. */
public CloseableIterator<Row> getFinalizedActions() {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mention again that this can only be consumed once

@scottsand-db scottsand-db force-pushed the stack/kernel_catalog_managed_writes_1 branch from 76c689f to 75bae78 Compare June 28, 2025 00:31
@scottsand-db scottsand-db mentioned this pull request Jul 2, 2025
5 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants