Core: Fix row ID assignment for EXISTING entry during a manifest merge#16263

Merged
amogh-jahagirdar merged 8 commits into
apache:mainfrom
amogh-jahagirdar:existing-row-id-carryover
May 12, 2026

Conversation

@amogh-jahagirdar
Contributor

@amogh-jahagirdar amogh-jahagirdar commented May 9, 2026

We observed cases where EXISTING entries carried over during a manifest merge do not keep their first row ID. Instead, they get a new first row ID assigned through the inheritance rules, which is unexpected and is a spec violation.

@github-actions github-actions Bot added the core label May 9, 2026
}
};
} else {
// data file's first_row_id is null when the manifest's first_row_id is null
Contributor Author

@amogh-jahagirdar amogh-jahagirdar May 9, 2026

I'm still checking if changing idAssigner like this for all cases is really the right way to fix this or if during manifest merging we should actually pass in our own id assignment transformer that preserves the existing case.

Here we're changing the manifest reader because

// data file's first_row_id is null when the manifest's first_row_id is null

isn't true in the context of reading merged manifests that have EXISTING entries which need to have first row IDs preserved but the merged manifest does not have a first row ID.

Contributor

The global change reads correctly. The else branch only runs when manifest.firstRowId is null, and in every such case the on-disk file.firstRowId is what should be preserved:

  • pre-v3 manifests: file.firstRowId is null in the data, identity preserves null → unchanged behavior
  • v3 EXISTING entries in a transient filtered/merged manifest: identity preserves the original assignment → the fix
  • v3 ADDED entries in a freshly-written manifest: file.firstRowId is null in the data (assigned at read time once the manifest gets a firstRowId), identity preserves null → unchanged behavior

I am also wondering about the reason for the previous behavior change. Maybe it is a defensive check covering: ManifestFiles.copyAppendManifest, the user-facing appendManifest() flow. A user can construct a ManifestFile externally with a non-null file-level first_row_id set on an ADDED entry — nothing in the public API stops that.

Under the previous reader-side clobber, the value was stripped before the writer saw it. Under Function.identity(), the value reaches the writer, but writer.add(entry) goes through wrapAppend and Delegates.suppressFirstRowId, so the wrapper's firstRowId() returns null and the bytes written for the ADDED entry have null on disk. The on-disk result is identical.

The reader-side clobber was redundant on this path; the writer-side suppression in MergingSnapshotProducer.add and GenericManifestEntry.wrapAppend carries the "no override from client" enforcement. The global change preserves that property.
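
The writer-side suppression described in this comment can be pictured with a small wrapper. This is only a sketch: `FileView`, `SimpleFile`, and the `suppressFirstRowId` method below are hypothetical stand-ins written for illustration, not Iceberg's `DataFile` or its actual wrapper, which covers many more methods.

```java
final class WriterSuppressionSketch {
  // Hypothetical, minimal view of a data file (assumption: not Iceberg's DataFile).
  interface FileView {
    String path();

    Long firstRowId();
  }

  static final class SimpleFile implements FileView {
    private final String path;
    private final Long firstRowId;

    SimpleFile(String path, Long firstRowId) {
      this.path = path;
      this.firstRowId = firstRowId;
    }

    @Override
    public String path() {
      return path;
    }

    @Override
    public Long firstRowId() {
      return firstRowId;
    }
  }

  // Wraps a file that is about to be written as ADDED and hides any
  // caller-supplied first row ID, so null is what the writer sees.
  static FileView suppressFirstRowId(FileView file) {
    return new FileView() {
      @Override
      public String path() {
        return file.path();
      }

      @Override
      public Long firstRowId() {
        return null;
      }
    };
  }
}
```

With a wrapper like this on the write path, a first row ID set externally on an ADDED entry never reaches disk, whether or not the reader cleared it first.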

Contributor Author

@amogh-jahagirdar amogh-jahagirdar May 11, 2026

Yeah @stevenzwu that's my analysis of all the call points as well. Fundamentally, I would expect the idTransformer that's set by default on the ManifestReader should do no transformation of the first row ID on the entry. That implicitly covers the 3 cases you mentioned.

And for the copyAppendManifest case, if we really want to be defensive, I think the right solution is that the public writer APIs Iceberg exposes (that produce the manifests) should suppress the firstRowId assignment on the ADDED entry. The read side should just consume what's there, in my opinion.

Contributor

@kevinjqliu kevinjqliu left a comment

LGTM

I ran the test on main and it failed. One nit: rewriteFiles is deprecated.

Comment thread core/src/test/java/org/apache/iceberg/TestRowLineageAssignment.java Outdated
Comment thread core/src/main/java/org/apache/iceberg/ManifestReader.java Outdated
}

@Test
public void testRewritePreservesExistingFileFirstRowIds() {
Contributor

Trying to understand the logic: it seems that during a manifest merge or filter, the manifest may have a null first_row_id while an existing file already has a first row ID that should be kept.

I tried the filtering case and the fix works. Maybe we can add test coverage for filtering as well.

// apply a row filter so the read happens through the "filtered manifest" path
reader.filterRows(Expressions.greaterThan("id", 0));

@amogh-jahagirdar amogh-jahagirdar marked this pull request as ready for review May 11, 2026 15:50
Comment thread core/src/main/java/org/apache/iceberg/ManifestReader.java Outdated
Contributor

@kevinjqliu kevinjqliu left a comment

@amogh-jahagirdar amogh-jahagirdar force-pushed the existing-row-id-carryover branch from e116014 to 3308d81 Compare May 11, 2026 16:52
@amogh-jahagirdar
Contributor Author

Cheng did a lot of debugging on this issue which surfaced this problem, marking him as co-author. Thank you @ChengJiX!

@kevinjqliu kevinjqliu added this to the Iceberg 1.11.0 milestone May 11, 2026
@kevinjqliu
Contributor

needs to rebase #16287 to unblock the "Kafka Connect CVE Scan" issue in CI

};
// Preserve the source entry’s first row ID even if the manifest hasn’t assigned one since it
// may be EXISTING
return Function.identity();
Contributor

@dramaticlly dramaticlly May 11, 2026

Curious: could we also add a direct unit test in TestManifestReader that covers both the if and else branches of idAssigner?

};
// Preserve the source entry’s first row ID even if the manifest hasn’t assigned one since it
// may be EXISTING
return Function.identity();
Contributor

I'm not sure about this change.

It looks like this case covers when the manifest's first_row_id is null. That should only happen when reading a snapshot from an older version without row IDs. In that case, the snapshot's first-row-id is null and we don't assign first_row_id to manifests or to data files. This is covered by Row Lineage for Upgraded Tables in the spec.

If first-row-id is assigned, then every manifest should have a first_row_id assigned. Then this branch shouldn't be triggering.

Is it possible that the problem is further up and a manifest is somehow missing a first_row_id?

Contributor

Okay, from going through the repro test, I think this is a legitimate case because manifest compaction is using a ManifestFile to read that doesn't have an assigned first_row_id. That means the assumption here is violated because we can have manifests that don't have an assigned first_row_id and we should still read and pass through the first_row_id from files.

I think the fix needs to distinguish between these cases. We need to have an idAssigner for committed manifests (this one) and an idAssigner used for uncommitted manifests that preserves the first_row_id like this does. We can't commit this fix because it breaks the defensive assignment we added for the v2/v1 snapshot case.

Member

+1 on two assigners.

I agree we should preserve the defensive mode for the committed-manifest path, but I don't think it should be a silent null assignment. If the invariant is "a committed v3 manifest with firstRowId != null shouldn't contain an entry with a non-null first_row_id that wasn't already there," we should encode that as a precondition / checkState and fail loudly. Silent re-assignment masks exactly the class of bug we're fixing here. If there's a real case where re-assigning is necessary, that seems like it should be opt-in behind a config flag rather than the default.
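
The "fail loudly" alternative could look roughly like the sketch below, which checks the entry instead of rewriting it. It reflects one reading of this suggestion (a committed manifest without a first_row_id should not contain files that carry one). `FileSketch` and `strictAssigner` are hypothetical names, and a plain `IllegalStateException` stands in for whatever precondition helper the project would actually use. The thread later notes that such a check was left out of the merged change.

```java
import java.util.function.Function;

final class StrictCommittedAssignerSketch {
  // Hypothetical stand-in for a data file; only the fields needed here.
  static final class FileSketch {
    final String path;
    final Long firstRowId;

    FileSketch(String path, Long firstRowId) {
      this.path = path;
      this.firstRowId = firstRowId;
    }
  }

  // For a committed manifest with no manifest-level first_row_id:
  // verify the expectation and fail, rather than silently nulling the value.
  static Function<FileSketch, FileSketch> strictAssigner() {
    return file -> {
      if (file.firstRowId != null) {
        throw new IllegalStateException(
            "Unexpected first_row_id "
                + file.firstRowId
                + " for "
                + file.path
                + " in a manifest without a first_row_id");
      }
      return file;
    };
  }
}
```

The trade-off raised in the replies applies directly: this turns a readable manifest into a failed read whenever the expectation is violated.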

Contributor Author

@amogh-jahagirdar amogh-jahagirdar May 11, 2026

Ok, I think I get the rationale for the defensive null assignment now. Preserving the entry made the upgrade test cases work because those will have null first row IDs anyway, but this is to be defensive against someone somehow producing older manifests with a first row ID?

In either case, I've updated to pass a committed flag to a package-private ManifestFiles.read API. If it's true and firstRowId is null, we do the null assignment; if not, we preserve the entry as it is.

@RussellSpitzer I think you're saying we should add a Preconditions check for the committed case that the firstRowId is indeed null. I can get behind that, that seemed to be what @rdblue was implying in our offline conversation, if that's our expectation we should have a precondition check.
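
As a rough illustration of the committed flag described in this comment, the sketch below models only the branch where the manifest has no first_row_id. `EntrySketch` and `assignerForNullManifestId` are hypothetical names, not the actual `ManifestReader` code, which mutates Iceberg entries rather than copying a stand-in object.

```java
import java.util.function.Function;

final class CommittedFlagSketch {
  // Hypothetical stand-in for a manifest entry; only the field relevant here.
  static final class EntrySketch {
    final Long firstRowId;

    EntrySketch(Long firstRowId) {
      this.firstRowId = firstRowId;
    }
  }

  // Models only the case where the manifest itself has no first_row_id.
  static Function<EntrySketch, EntrySketch> assignerForNullManifestId(boolean isCommitted) {
    if (isCommitted) {
      // Committed manifest without a first_row_id: defensively null out the entry.
      return entry -> new EntrySketch(null);
    }
    // Uncommitted (merged or filtered) manifest: keep whatever the entry carries.
    return Function.identity();
  }
}
```

An EXISTING entry with first row ID 125 read through the uncommitted path keeps 125, while the same entry read through the committed path comes back with null.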

Contributor Author

@amogh-jahagirdar amogh-jahagirdar May 11, 2026

Since I think we have a reliable mechanism to distinguish committed vs. uncommitted manifests, I left out Preconditions checks for now just to keep things simple. Let me know if people feel strongly about this. I think whatever expectations we know we have on the entry at different points should probably have a Preconditions check, but at the same time I don't want to be overly specific in our checks and cause needless failures when reading the manifests.

Member

I do, but I think I'm currently overruled by the majority. I don't like it when we change objects to "fix" them

Contributor

> I do, but I think I'm currently overruled by the majority. I don't like it when we change objects to "fix" them

I think this is reasonable. I'm okay with a precondition if that's what others think we should do. My argument for the opposite is that usually when we can read correctly, we should. We know that in the case where there is no manifest first_row_id that this should be null. Is it worth failing a read?

It might be. Russell is correct that it signals a problem somewhere. It probably doesn't matter much either way, since this should be rare.

Comment thread core/src/test/java/org/apache/iceberg/TestRowLineageAssignment.java Outdated
@Test
public void testRewritePreservesExistingFileFirstRowIds() {
table.newAppend().appendFile(FILE_A).appendFile(FILE_B).commit();
// FILE_A→0, FILE_B→125; nextRowId=225
Contributor

It would be helpful to have a comment in this test case that explains what is being triggered by setting the min merge count.

Contributor

Let's remove arrow characters. They're hard to read, at least in github.

Contributor

@dramaticlly dramaticlly May 11, 2026

We ran into this in the past as well. If I understand correctly, this forces a manifest merge at snapshot commit time, where ManifestMergeManager kicks in and reads this manifest to merge it with existing manifests by constructing a ManifestReader with firstRowId=null.
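
The numbers in the test comment (FILE_A at 0, FILE_B at 125, next row ID 225) follow from first row ID inheritance if FILE_A has 125 records and FILE_B has 100. Those record counts are inferred from the comment, not stated in the thread. The sketch below uses hypothetical names (`FileSketch`, `assign`) to work through that arithmetic and to show how an EXISTING file whose ID was cleared during a merge would pick up a different one.

```java
import java.util.List;

final class RowIdInheritanceSketch {
  // Hypothetical stand-in for a data file in a manifest.
  static final class FileSketch {
    final String name;
    final long recordCount;
    Long firstRowId; // null until assigned

    FileSketch(String name, long recordCount, Long firstRowId) {
      this.name = name;
      this.recordCount = recordCount;
      this.firstRowId = firstRowId;
    }
  }

  // Assigns first row IDs to files that lack one, starting at the manifest's
  // first_row_id, and returns the next unassigned row ID.
  static long assign(long manifestFirstRowId, List<FileSketch> files) {
    long next = manifestFirstRowId;
    for (FileSketch file : files) {
      if (file.firstRowId == null) {
        file.firstRowId = next;
        next += file.recordCount;
      }
    }
    return next;
  }
}
```

Starting from 0 with FILE_A (125 records) and FILE_B (100 records), `assign` gives FILE_A 0 and FILE_B 125 and returns 225. If FILE_A's ID is then cleared and the file is run through `assign` again with a manifest first row ID of 225, it ends up at 225 instead of 0, which is the kind of reassignment this PR is meant to prevent.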

Comment thread core/src/test/java/org/apache/iceberg/TestRowLineageAssignment.java
Comment thread core/src/main/java/org/apache/iceberg/ManifestReader.java Outdated
return Function.identity();
} else {
// data file's first_row_id is null when the manifest's first_row_id is null
// committed pre-v3 manifest with a null manifest-level firstRowId
Contributor

For pre-v3 manifest files, shouldn't the data file's first_row_id be null anyway? That means identity also works.

Contributor Author

So identity logically works because we expect the entry will be null, but my understanding from @rdblue @RussellSpitzer is that the intent of the previous logic is to defensively assign the values regardless for the older manifests.

Comment thread core/src/main/java/org/apache/iceberg/ManifestMergeManager.java Outdated
@amogh-jahagirdar amogh-jahagirdar force-pushed the existing-row-id-carryover branch from b7a2b7d to b4be442 Compare May 11, 2026 22:15
@amogh-jahagirdar amogh-jahagirdar force-pushed the existing-row-id-carryover branch from 17ac742 to d3d60f5 Compare May 11, 2026 22:32
@amogh-jahagirdar amogh-jahagirdar force-pushed the existing-row-id-carryover branch 3 times, most recently from 8623fec to b005c4f Compare May 11, 2026 22:44
@Override
protected ManifestReader<DataFile> newManifestReader(ManifestFile manifest) {
- return MergingSnapshotProducer.this.newManifestReader(manifest);
+ return newManifestReader(manifest, true);
Contributor Author

Preserves the existing behavior for newManifestReader. While the "isCommitted" is only passed through for the Data file manifest merging case, it's a bit more involved to just put it on that implementation rather than the base class. I think this is the cleanest way (and besides this is all private/package private classes).

Contributor

@rdblue rdblue May 12, 2026

I'm fine with this default, but we should probably start requiring isCommitted to be passed. Can we deprecate the old one and start using this exclusively?

I'm fine with this as a follow-up since we want to unblock the release.

@@ -781,6 +781,33 @@ public void testUpgradeAssignmentWithManifestCompaction(@TempDir File altLocatio
FILE_C.recordCount() + FILE_B.recordCount());
}
Member

While we are at it we should add in a test for the "set row_id" to null assignment we do. The fact that we could change the behavior (in your previous commit) and not fail anything in the test suite is an issue. Especially if we think that is an important behavior.

Contributor Author

@amogh-jahagirdar amogh-jahagirdar May 12, 2026

So there's a nuance to "change the behavior" which is why all the tests in the previous commit passed. Even with the previous commit, we ended up correctly setting row ID to null because the entries in older manifest wouldn't even have a first row ID to begin with, so the behavior was preserved for the main case we care about for older manifests, and there's quite a few tests there for the upgrade path.

The case we're talking about though is a case where we (somehow?) produce a pre-v3 manifest with first row IDs and we should ignore those by defensively setting those to null; in that specific case, there was a behavior change because we'd just keep those row IDs. I'll see about adding a test but I'm not sure if we really even expose the mechanisms to produce a V2 manifest with first row IDs.

Member

Yes, but if we think it's important to handle that use case, we should be testing it, imho.

Contributor Author

Ok, so I've added 2 tests to ManifestReader: 1 for the uncommitted read path, which preserves firstRowId (the same path already tested in TestRowLineageAssignment), and 1 for the committed manifest read path, where we expect to null out the entry's first row ID. Since we're now distinguishing between uncommitted and committed, we have a good abstraction boundary to test.

The committed test is something that would detect if the behavior to defensively null out entries changed (verified by commenting out the most recent changes we made, and going back to what we had before).

for (ManifestFile manifest : bin) {
try (ManifestReader<F> reader = newManifestReader(manifest)) {
boolean isCommitted =
manifest.snapshotId() != null && snapshotId() != manifest.snapshotId();
Contributor

This looks reasonable to me.
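
The check in the snippet above treats a manifest as committed when it carries a snapshot ID that differs from the ID of the snapshot being produced. A standalone sketch of that predicate is shown below. The method and parameter names are hypothetical, and the meaning given to a null or matching snapshot ID (a manifest that is new or still part of the in-progress operation) is an interpretation of the snippet rather than something the thread spells out.

```java
final class IsCommittedSketch {
  // manifestSnapshotId: snapshot ID recorded on the manifest, possibly null.
  // currentSnapshotId: ID of the snapshot currently being produced.
  static boolean isCommitted(Long manifestSnapshotId, long currentSnapshotId) {
    // Comparing Long with long unboxes the Long, so this is a numeric
    // comparison rather than a reference comparison.
    return manifestSnapshotId != null && manifestSnapshotId != currentSnapshotId;
  }
}
```

Under this reading, a manifest with a null snapshot ID or with the in-progress snapshot's ID takes the preserving path, and only manifests from earlier snapshots take the defensive path.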

@amogh-jahagirdar
Contributor Author

Ok, I'll go ahead and merge. There are some follow-up items, but I think the PR is in a good state for unblocking the release. Thanks everyone! @RussellSpitzer @stevenzwu @rdblue @mxm @dramaticlly @aihuaxu @kevinjqliu for reviewing

@amogh-jahagirdar amogh-jahagirdar merged commit 1cea23e into apache:main May 12, 2026
36 checks passed
8 participants