Add branchRefName to TableAuditEvent #616
Open
cbb330 wants to merge 5 commits into
Table operations against named Iceberg branches are not currently observable: the existing TableAuditEvent records currentSnapshotId and currentSnapshotTimestampMs but has no signal for which branch ref was written. Add branchRefName to TableAuditEvent, populated in extractSnapshotInfo() from the existing snapshotRefs and jsonSnapshots fields on the request body. The committed branch is identified by matching the last snapshot in jsonSnapshots (Iceberg always appends new snapshots chronologically) to the ref that points to it in snapshotRefs. This correctly handles main branch commits, named branch commits where main did not advance, and the case where no main ref is present at all. This is safe to rely on: every request through putIcebergSnapshots has a non-empty jsonSnapshots (the repository layer gates snapshotRefs processing on jsonSnapshots being present), so the last-snapshot invariant holds for all reachable code paths.
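The matching rule described above can be sketched as follows. This is a minimal illustration, not the code in extractSnapshotInfo(): the class BranchRefSketch and the method findCommittedBranch are hypothetical names, and the inputs are assumed to be already parsed into snapshot IDs and a ref-name-to-snapshot-ID map rather than the raw jsonSnapshots and snapshotRefs values carried on the request body.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;

public class BranchRefSketch {

  /**
   * Returns the name of a ref that points at the last snapshot in the list.
   *
   * Assumption: snapshotIds is ordered oldest to newest, so the last entry is the
   * newly committed snapshot. If several refs point at that snapshot, which one is
   * returned depends on the map's iteration order; this sketch does not resolve ties.
   */
  public static Optional<String> findCommittedBranch(
      List<Long> snapshotIds, Map<String, Long> refs) {
    if (snapshotIds == null || snapshotIds.isEmpty() || refs == null) {
      return Optional.empty();
    }
    long lastId = snapshotIds.get(snapshotIds.size() - 1);
    return refs.entrySet().stream()
        .filter(e -> e.getValue() != null && e.getValue() == lastId)
        .map(Map.Entry::getKey)
        .findFirst();
  }

  public static void main(String[] args) {
    // Main branch commit: snapshot 3 is new and main points at it.
    System.out.println(findCommittedBranch(List.of(1L, 2L, 3L), Map.of("main", 3L)));

    // Named branch commit where main did not advance.
    System.out.println(
        findCommittedBranch(List.of(1L, 2L, 3L), Map.of("main", 2L, "feature", 3L)));

    // No main ref present at all.
    System.out.println(findCommittedBranch(List.of(7L), Map.of("feature", 7L)));
  }
}
```

The three calls in main correspond to the three cases listed in the description. No special-casing for main is needed because main is matched like any other ref.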
Problem
The Git for Data feature introduces named Iceberg branch refs, but table operation observability is blind to them. Today TableAuditEvent records currentSnapshotId and currentSnapshotTimestampMs for the main branch, with no signal for which branch a write went to. Dali and Grid Observability cannot answer "was this commit to main or a named branch?" without ACL-gated access to current table state.

Solution
Add branchRefName to TableAuditEvent, populated at commit time from the existing snapshotRefs and jsonSnapshots fields already present on IcebergSnapshotsRequestBody.

How the committed branch is identified: Iceberg appends snapshots chronologically, so the last entry in jsonSnapshots is always the newly committed snapshot. We find the ref in snapshotRefs whose snapshot-id matches it; that ref is the branch being written.

This correctly handles:
- Main branch commit: branchRefName = "main"
- Named branch commit where main did not advance: branchRefName = "<branch>"
- No main ref present at all: branchRefName = "<branch>", currentSnapshotId = null

Why the last-snapshot assumption is safe: Every request reaching putIcebergSnapshots has a non-empty jsonSnapshots, because the repository layer (doUpdateSnapshotsIfNeeded) gates all snapshotRefs processing on jsonSnapshots being present. A ref-only update is a silent no-op that never commits, so the last-snapshot invariant holds for all reachable code paths.

Changes
- TableAuditEvent.java: add String branchRefName
- TableAuditAspect.java: populate in extractSnapshotInfo() by matching the last snapshot to its ref; no separate pass, no special-casing for main
- IcebergSnapshotsApiHandlerAuditTest.java: two new tests (main commit sets "main", named branch commit sets the branch name); existing branch-only test updated to assert branchRefName
- TableAuditModelConstants.java: add branchRefName("main") to the three commit-path expected events

Depends on
#601 (adds tableProperties to TableAuditEvent; merge that first)