Core: Fix row ID assignment for EXISTING entry during a manifest merge#16263
Conversation
| } | ||
| }; | ||
| } else { | ||
| // data file's first_row_id is null when the manifest's first_row_id is null |
There was a problem hiding this comment.
I'm still checking if changing idAssigner like this for all cases is really the right way to fix this or if during manifest merging we should actually pass in our own id assignment transformer that preserves the existing case.
Here we're changing the manifest reader because
// data file's first_row_id is null when the manifest's first_row_id is null
isn't true in the context of reading merged manifests that have EXISTING entries which need to have first row IDs preserved but the merged manifest does not have a first row ID.
There was a problem hiding this comment.
The global change reads correctly. The else branch only runs when manifest.firstRowId is null, and in every such case the on-disk file.firstRowId is what should be preserved:
- pre-v3 manifests:
file.firstRowIdis null in the data, identity preserves null → unchanged behavior - v3 EXISTING entries in a transient filtered/merged manifest: identity preserves the original assignment → the fix
- v3 ADDED entries in a freshly-written manifest:
file.firstRowIdis null in the data (assigned at read time once the manifest gets afirstRowId), identity preserves null → unchanged behavior
I am also wondering about the reason for the previous behavior change. Maybe it is a defensive check covering: ManifestFiles.copyAppendManifest, the user-facing appendManifest() flow. A user can construct a ManifestFile externally with a non-null file-level first_row_id set on an ADDED entry — nothing in the public API stops that.
Under the previous reader-side clobber, the value was stripped before the writer saw it. Under Function.identity(), the value reaches the writer — but writer.add(entry) → wrapAppend → Delegates.suppressFirstRowId, so the wrapper's firstRowId() returns null and the bytes written for the ADDED entry have null on disk. Identical on-disk result.
The reader-side clobber was redundant on this path; the writer-side suppression in MergingSnapshotProducer.add and GenericManifestEntry.wrapAppend carries the "no override from client" enforcement. The global change preserves that property.
There was a problem hiding this comment.
Yeah @stevenzwu that's my analysis of all the call points as well. Fundamentally, I would expect the idTransformer that's set by default on the ManifestReader should do no transformation of the first row ID on the entry. That implicitly covers the 3 cases you mentioned.
And for the coppyAppendManifest case, if we really want to be defensive, I think the right solution is that the public writer APIs Iceberg exposes (that produce the manifests) should suppress the firstRowId assignment on the ADDED entry. The read side should just consume what's there in my opinion.
kevinjqliu
left a comment
There was a problem hiding this comment.
LGTM
i ran the test on main and it failed. one nit on using rewriteFiles as it is deprecated
| } | ||
|
|
||
| @Test | ||
| public void testRewritePreservesExistingFileFirstRowIds() { |
There was a problem hiding this comment.
Understand the logic a little bit: seems during merge or filter manifest, we may have null first_row_id at manifest while the existing file already has row_id which should be kept.
Tried for filtering case and the fix works. Maybe we can add the test coverage for filtering as well.
// apply a row filter so the read happens through the "filtered manifest" path
reader.filterRows(Expressions.greaterThan("id", 0));
kevinjqliu
left a comment
There was a problem hiding this comment.
LGTM
Looks like we need to backport this to 1.10.x as well
https://github.com/apache/iceberg/blob/1.10.x/core/src/main/java/org/apache/iceberg/ManifestReader.java#L407-L413
e116014 to
3308d81
Compare
|
Cheng did a lot of debugging on this issue which surfaced this problem, marking him as co-author. Thank you @ChengJiX! |
|
needs to rebase #16287 to unblock the "Kafka Connect CVE Scan" issue in CI |
| }; | ||
| // Preserve the source entry’s first row ID even if the manifest hasn’t assigned one since it | ||
| // may be EXISTING | ||
| return Function.identity(); |
There was a problem hiding this comment.
curious, maybe we can also add a direct UT in TestManifestReader to cover idAssigner on both if/else branch?
| }; | ||
| // Preserve the source entry’s first row ID even if the manifest hasn’t assigned one since it | ||
| // may be EXISTING | ||
| return Function.identity(); |
There was a problem hiding this comment.
I'm not sure about this change.
It looks like this case covers when the manifest's first_row_id is null. That should only happen when reading a snapshot from an older version without row IDs. In that case, the snapshot's first-row-id is null and we don't assign first_row_id to manifests or to data files. This is covered by Row Lineage for Upgraded Tables in the spec.
If first-row-id is assigned, then every manifest should have a first_row_id assigned. Then this branch shouldn't be triggering.
Is it possible that the problem is further up and a manifest is somehow missing a first_row_id?
There was a problem hiding this comment.
Okay, from going through the repro test, I think this is a legitimate case because manifest compaction is using a ManifestFile to read that doesn't have an assigned first_row_id. That means the assumption here is violated because we can have manifests that don't have an assigned first_row_id and we should still read and pass through the first_row_id from files.
I think the fix needs to distinguish between these cases. We need to have an idAssigner for committed manifests (this one) and an idAssigner used for uncommitted manifests that preserves the first_row_id like this does. We can't commit this fix because it breaks the defensive assignment we added for the v2/v1 snapshot case.
There was a problem hiding this comment.
+1 on two assigners.
I agree we should preserve the defensive mode for the committed-manifest path, but I don't think it should be a silent null assignment. If the invariant is "a committed v3 manifest with firstRowId != null shouldn't contain an entry with a non-null first_row_id that wasn't already there," we should encode that as a precondition / checkState and fail loudly. Silent re-assignment masks exactly the class of bug we're fixing here. If there's a real case where re-assigning is necessary, that seems like it should be opt-in behind a config flag rather than the default.
There was a problem hiding this comment.
Ok, I think I get the rationale for the defensive null assignment now. Preserving the entry made the upgrade test cases work because those will anyways have null first row IDs but this is to be defensive against someone producing older manifests with a first row ID somehow?
In either case, I've updated to pass in a committed flag to a pakcage private ManifestFiles.read API , and if it's true and firstRowId is null we do the null assignment, and if not, we preserve the entry as it is.
@RussellSpitzer I think you're saying we should add a Preconditions check for the committed case that the firstRowId is indeed null. I can get behind that, that seemed to be what @rdblue was implying in our offline conversation, if that's our expectation we should have a precondition check.
There was a problem hiding this comment.
Since I think we have a reliable mechanism to distinguish commited vs uncommited manifests , I left out Preconditions checks for now just to keep things simple. Let me know if people feel strongly about this. I think whatever expectations we know we have on the entry at different points probably should have a Precondition check but at the same time I don't want to be over specific on our checks and cause needless failures when reading the manifests.
There was a problem hiding this comment.
I do, but I think I'm currently overruled by the majority. I don't like it when we change objects to "fix" them
There was a problem hiding this comment.
I do, but I think I'm currently overruled by the majority. I don't like it when we change objects to "fix" them
I think this is reasonable. I'm okay with a precondition if that's what others think we should do. My argument for the opposite is that usually when we can read correctly, we should. We know that in the case where there is no manifest first_row_id that this should be null. Is it worth failing a read?
It might be. Russell is correct that it signals a problem somewhere. It probably doesn't matter much either way, since this should be rare.
| @Test | ||
| public void testRewritePreservesExistingFileFirstRowIds() { | ||
| table.newAppend().appendFile(FILE_A).appendFile(FILE_B).commit(); | ||
| // FILE_A→0, FILE_B→125; nextRowId=225 |
There was a problem hiding this comment.
It would be helpful to have a comment in this test case that explains what is being triggered by setting the min merge count.
There was a problem hiding this comment.
Let's remove arrow characters. They're hard to read, at least in github.
There was a problem hiding this comment.
We ran into this in the past as well, If I understand this correctly, this force trigger the manifest merge at the snapshot commit time where ManifestMergeManager kick in to reads this manifest to merge
it with existing manifests by constructs a ManifestReader with firstRowId=null.
| return Function.identity(); | ||
| } else { | ||
| // data file's first_row_id is null when the manifest's first_row_id is null | ||
| // committed pre-v3 manifest with a null manifest-level firstRowId |
There was a problem hiding this comment.
for pre-v3 manifest files, shouldn't data file's first_row_id be null anyway? that means identity also works.
There was a problem hiding this comment.
So identity logically works because we expect the entry will be null, but my understanding from @rdblue @RussellSpitzer is that the intent of the previous logic is to defensively assign the values regardless for the older manifests.
Co-authored-by: ChengJi <cheng.ji@databricks.com>
Co-authored-by: Russell Spitzer <russell.spitzer@GMAIL.COM>
b7a2b7d to
b4be442
Compare
17ac742 to
d3d60f5
Compare
8623fec to
b005c4f
Compare
| @Override | ||
| protected ManifestReader<DataFile> newManifestReader(ManifestFile manifest) { | ||
| return MergingSnapshotProducer.this.newManifestReader(manifest); | ||
| return newManifestReader(manifest, true); |
There was a problem hiding this comment.
Preserves the existing behavior for newManifestReader. While the "isCommitted" is only passed through for the Data file manifest merging case, it's a bit more involved to just put it on that implementation rather than the base class. I think this is the cleanest way (and besides this is all private/package private classes).
There was a problem hiding this comment.
I'm fine with this default, but we should probably start requiring isCommitted to be passed. Can we deprecate the old one and start using this exclusively?
I'm fine with this as a follow-up since we want to unblock the release.
b005c4f to
80e2a0f
Compare
| @@ -781,6 +781,33 @@ public void testUpgradeAssignmentWithManifestCompaction(@TempDir File altLocatio | |||
| FILE_C.recordCount() + FILE_B.recordCount()); | |||
| } | |||
There was a problem hiding this comment.
While we are at it we should add in a test for the "set row_id" to null assignment we do. The fact that we could change the behavior (in your previous commit) and not fail anything in the test suite is an issue. Especially if we think that is an important behavior.
There was a problem hiding this comment.
So there's a nuance to "change the behavior" which is why all the tests in the previous commit passed. Even with the previous commit, we ended up correctly setting row ID to null because the entries in older manifest wouldn't even have a first row ID to begin with, so the behavior was preserved for the main case we care about for older manifests, and there's quite a few tests there for the upgrade path.
The case we're talking about though is a case where we (somehow?) produce a pre-v3 manifest with first row IDs and we should ignore those by defensively setting those to null; in that specific case, there was a behavior change because we'd just keep those row IDs. I'll see about adding a test but I'm not sure if we really even expose the mechanisms to produce a V2 manifest with first row IDs.
There was a problem hiding this comment.
Yes but, if we think it's important to handle that use case we should be testing it imho
There was a problem hiding this comment.
Ok so I've added 2 tests to ManifestReader; 1 for testing the uncommitted read path which preserves firstRowId (same path already being tested in the TestRowLinegeAssignment) case, and 1 for testing the committed manifest read path where we expect to null out the entry's first row ID. Since we're now distinguishing between uncommitted and committed, we have a good abstraction boundary to test.
The committed test is something that would detect if the behavior to defensively null out entries changed (verified by commenting out the most recent changes we made, and going back to what we had before).
| for (ManifestFile manifest : bin) { | ||
| try (ManifestReader<F> reader = newManifestReader(manifest)) { | ||
| boolean isCommitted = | ||
| manifest.snapshotId() != null && snapshotId() != manifest.snapshotId(); |
There was a problem hiding this comment.
This looks reasonable to me.
|
Ok I'll go ahead and merge. There are some followup items but I think the PR is in a good state for unblocking the release. Thanks everyone! @RussellSpitzer @stevenzwu @rdblue @mxm @dramaticlly @aihuaxu @kevinjqliu for reviewing |
We observed some cases where when EXISTING entries are carried over in an operation as part of a manifest merge, the first row ID assignment is not preserved and instead gets a new first row ID based on the inheritance rules which is unexpected (and is a spec violation).