add Transaction.TableCommit() to extract pending changes #786

@laskoviymishka

Feature Request / Improvement

Currently, Transaction.Commit() is the only way to finalize a transaction: it both builds the update payload and sends it to the catalog. Multi-table atomic commits need the two steps separated. Callers must collect pending changes from several transactions and submit them together via CommitTransaction(), which requires a way to extract those changes without committing.

Motivation

A typical multi-table snapshot workflow:

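var commits []table.TableCommit

for _, tbl := range tables {
    tx := tbl.NewTransaction()
    if err := tx.AddFiles(ctx, files[tbl], snapshotProps, false); err != nil {
        return err
    }

    // Extract pending changes WITHOUT committing
    tc, err := tx.TableCommit()
    if err != nil {
        return err
    }
    commits = append(commits, tc)
}

// Atomic commit across all tables; fail cleanly if the catalog
// does not implement the multi-table interface
transCat, ok := cat.(catalog.TransactionalCatalog)
if !ok {
    return errors.New("catalog does not support multi-table transactions")
}
return transCat.CommitTransaction(ctx, commits)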

Without TableCommit(), callers would need to manually reconstruct the requirements and updates list, duplicating internal transaction logic.

Proposal

Add a method to Transaction:

// TableCommit returns a TableCommit representing the pending changes
// in this transaction, without actually committing. This is used for
// multi-table transactions where multiple commits are batched together.
func (t *Transaction) TableCommit() (TableCommit, error)

Behavior

  • Returns the current requirements and updates, bundled with the table identifier
  • Automatically appends an AssertTableUUID requirement (as Commit() does)
  • Does not mark the transaction as committed
  • Returns an error if the transaction was already committed via Commit()
  • Thread-safe (acquires the transaction mutex)
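
The behavior listed above can be sketched roughly as follows. This is a minimal, self-contained illustration, not the iceberg-go implementation: the Requirement, Update, TableCommit, and Transaction types here are simplified stand-ins, and their field names, the ErrAlreadyCommitted error, and the placeholder Commit() are assumptions made for the example.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// Hypothetical stand-ins for the real iceberg-go types; the field names
// are assumptions made for this sketch only.
type Requirement struct{ Kind, UUID string }
type Update struct{ Action string }

type TableCommit struct {
	Identifier   []string
	Requirements []Requirement
	Updates      []Update
}

// ErrAlreadyCommitted is an assumed error value for this sketch.
var ErrAlreadyCommitted = errors.New("transaction has already been committed")

type Transaction struct {
	mu        sync.Mutex
	ident     []string
	tableUUID string
	reqs      []Requirement
	updates   []Update
	committed bool
}

// TableCommit bundles the pending requirements and updates with the table
// identifier. It does not mark the transaction as committed.
func (t *Transaction) TableCommit() (TableCommit, error) {
	t.mu.Lock()
	defer t.mu.Unlock()

	if t.committed {
		return TableCommit{}, ErrAlreadyCommitted
	}

	// Copy the slices so the returned value is unaffected by later changes,
	// then append the table UUID assertion as Commit() would.
	reqs := make([]Requirement, 0, len(t.reqs)+1)
	reqs = append(reqs, t.reqs...)
	reqs = append(reqs, Requirement{Kind: "assert-table-uuid", UUID: t.tableUUID})

	return TableCommit{
		Identifier:   append([]string(nil), t.ident...),
		Requirements: reqs,
		Updates:      append([]Update(nil), t.updates...),
	}, nil
}

// Commit is a placeholder for the existing single-table path; here it only
// flags the transaction as committed.
func (t *Transaction) Commit() {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.committed = true
}

func main() {
	tx := &Transaction{
		ident:     []string{"db", "events"},
		tableUUID: "uuid-1",
		updates:   []Update{{Action: "add-snapshot"}},
	}

	tc, err := tx.TableCommit()
	fmt.Println(len(tc.Requirements), len(tc.Updates), err)

	tx.Commit()
	_, err = tx.TableCommit()
	fmt.Println(errors.Is(err, ErrAlreadyCommitted))
}
```

Copying the slices before returning keeps the internal state from being shared with the caller, which is one way to make the "read-only snapshot" semantics hold under concurrent use.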

Design Decisions

| Decision | Recommendation |
|---|---|
| Should TableCommit() mark the transaction as consumed? | No. The caller decides whether to use single-table Commit() or multi-table CommitTransaction(). Add MarkCommitted() for callers who want to prevent accidental double-use. |
| Should PostCommit hooks run? | No. The multi-table endpoint returns 204 with no metadata. Callers must LoadTable() afterwards and handle post-commit logic manually. |
| Can TableCommit() be called multiple times? | Yes. It's a read-only snapshot of pending state. Subsequent apply() calls may change the result. |
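
To make the repeated-call and MarkCommitted() decisions concrete, here is a small sketch. The pendingTx type is hypothetical and tracks only update names; it assumes that MarkCommitted() causes later TableCommit() and apply() calls to fail in the same way a prior Commit() would, which the proposal does not spell out.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// pendingTx is a hypothetical, stripped-down transaction that tracks only
// update names. It is not the iceberg-go Transaction type.
type pendingTx struct {
	mu        sync.Mutex
	updates   []string
	committed bool
}

var errCommitted = errors.New("transaction has already been committed")

// apply records one more pending update.
func (t *pendingTx) apply(update string) error {
	t.mu.Lock()
	defer t.mu.Unlock()
	if t.committed {
		return errCommitted
	}
	t.updates = append(t.updates, update)
	return nil
}

// TableCommit returns a copy of the pending updates and leaves the
// transaction open, so it can be called repeatedly.
func (t *pendingTx) TableCommit() ([]string, error) {
	t.mu.Lock()
	defer t.mu.Unlock()
	if t.committed {
		return nil, errCommitted
	}
	return append([]string(nil), t.updates...), nil
}

// MarkCommitted closes the transaction once the caller has submitted its
// changes through some other path (assumed semantics).
func (t *pendingTx) MarkCommitted() {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.committed = true
}

func main() {
	tx := &pendingTx{}
	_ = tx.apply("add-snapshot")
	first, _ := tx.TableCommit()

	// A later apply() changes what the next TableCommit() returns,
	// but not the snapshot already handed out.
	_ = tx.apply("set-snapshot-ref")
	second, _ := tx.TableCommit()
	fmt.Println(len(first), len(second))

	// After a successful multi-table commit, the caller opts in to
	// closing the transaction.
	tx.MarkCommitted()
	_, err := tx.TableCommit()
	fmt.Println(errors.Is(err, errCommitted))
}
```

Keeping MarkCommitted() as an explicit opt-in leaves the choice between Commit() and CommitTransaction() with the caller, in line with the first row of the table.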

Depends On
