Conversation

@flyrain (Contributor) commented May 3, 2021

We need the same thing as here in the context of eventual consistency; otherwise, we hit an NPE like the one below. We need a fix even though it only surfaces as a WARN.

timestamp="2021-04-23T22:35:23,938+0000",level="WARN",threadName="pool-1-thread-1",appName="spark-driver",logger="org.apache.iceberg.SnapshotProducer",message="Failed to notify listeners",exception="java.lang.NullPointerException
	at org.apache.iceberg.MergingSnapshotProducer.updateEvent(MergingSnapshotProducer.java:377)
	at org.apache.iceberg.SnapshotProducer.notifyListeners(SnapshotProducer.java:329)
	at org.apache.iceberg.SnapshotProducer.commit(SnapshotProducer.java:324)
	at org.apache.iceberg.spark.source.SparkWrite.commitOperation(SparkWrite.java:236)
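
For context, the NPE comes from dereferencing the result of a snapshot lookup that can return null when the refreshed table metadata does not yet contain the just-committed snapshot. A minimal sketch of the failing line and the guarded shape this PR moves toward (names follow the diff snippets later in the thread; the surrounding method is omitted):

// Before: NPEs when refresh() does not yet see the newly committed snapshot
long sequenceNumber = ops.refresh().snapshot(snapshotId).sequenceNumber();

// After: guard against a null snapshot under eventual consistency
Snapshot justSaved = ops.refresh().snapshot(snapshotId);
long sequenceNumber = TableMetadata.INVALID_SEQUENCE_NUMBER;
if (justSaved == null) {
  LOG.warn("Failed to load committed snapshot: omitting sequence number from notifications");
} else {
  sequenceNumber = justSaved.sequenceNumber();
}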

github-actions bot added the core label May 3, 2021
  // consistency problems in refresh.
  LOG.warn("Failed to load committed snapshot, leave sequence number to 0");
} else {
  sequenceNumber = justSaved.sequenceNumber();
Contributor

I think the fix looks good, but I'm not sure we should default sequenceNumber to 0, as this may result in incorrect data for users who rely on this sequence number in the event. Without this change we probably would not send out the event at all, so if we don't want to wait until the data becomes consistent, I wonder if returning -1 would be better, since 0 may refer to an actual sequence number of the table. (In v1 all sequence numbers will be 0, so returning 0 for v1 would be better, but I'm not sure how easy it is to check the table version here.)
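
To make the concern concrete, a hypothetical consumer of the commit event could distinguish the two cases like the sketch below (illustrative only; the event and accessor names are assumed from this discussion): -1 can never be a real sequence number, while 0 can.

// Hypothetical listener: -1 unambiguously means "sequence number unknown",
// whereas 0 is also the real sequence number of every v1 table snapshot.
void onCommit(CreateSnapshotEvent event) {
  if (event.sequenceNumber() < 0) {
    // metadata could not be reloaded after the commit; sequence number is unknown
  } else {
    // a genuine sequence number of the committed snapshot
  }
}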

@flyrain (Contributor Author) May 4, 2021

Good catch. It makes sense to set it to -1, since the sequence number's initial value is 0 and it increases monotonically, as this line shows:

public long nextSequenceNumber() {
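
For reference, that method in TableMetadata is roughly the following (a sketch; the formatVersion and lastSequenceNumber field names are assumed here, but the key point from the discussion holds: v1 tables stay at the initial value 0, while v2 tables count up by one per commit):

public long nextSequenceNumber() {
  // v1 tables always report the initial value (0); v2 tables increment monotonically
  return formatVersion > 1 ? lastSequenceNumber + 1 : INITIAL_SEQUENCE_NUMBER;
}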

Posted a new commit to fix it.

@flyrain (Contributor Author) commented May 4, 2021

Can you take a look? @rdblue @aokolnychyi @RussellSpitzer @karuppayya @kbendick

@RussellSpitzer (Member)

I think this is a good fix to the NPE. But I do want to make sure we warn folks that an eventually consistent Catalog can break the history of an Iceberg table. This would only be acceptable for a catalog which guaranteed consistent reads during the lock phase.

My thinking is basically

Consistent - Read Your Writes  Catalog = Best
Eventually Consistent Reads - Consistent when locking/committing = Adequate but not great
Eventually consistent during lock/commit = Will cause lost writes, non-linearizable histories and weirdness. 

@kbendick (Contributor) commented May 6, 2021

> I think this is a good fix to the NPE. But I do want to make sure we warn folks that an eventually consistent Catalog can break the history of an Iceberg table. This would only be acceptable for a catalog which guaranteed consistent reads during the lock phase.
>
> My thinking is basically
>
> Consistent - Read Your Writes Catalog = Best
> Eventually Consistent Reads - Consistent when locking/committing = Adequate but not great
> Eventually consistent during lock/commit = Will cause lost writes, non-linearizable histories and weirdness.

Is this something that could come up with the DynamoDB-based lock? I agree this is a good fix to the NPE, but I have personally hit weird edge cases with other software that uses a DynamoDB table as a lock (such as Terraform). Ideally testing handles this and the DynamoDB / Glue catalog doesn't suffer from these issues. Possibly I need to review the DynamoDB table lock again, but it felt worth noting, since I've had weird consistency issues with Terraform table locks.

@flyrain (Contributor Author) commented May 6, 2021

Users should avoid an eventually consistent catalog if possible. If the Hive catalog is used, we normally expect HMS to be strongly consistent. That is NOT always the case; for example, HMS can be an eventually consistent system if it uses Cassandra as the backend DB. But I like the call-out from @RussellSpitzer and @kbendick: avoid it as much as possible.

However, we cannot remove the eventual-consistency-related code as long as Iceberg still supports the Hadoop catalog, which is based on an eventually consistent system, HDFS.

if (justSaved == null) {
  // The snapshot just saved may not be present if the latest metadata couldn't be loaded due to eventual
  // consistency problems in refresh.
  LOG.warn("Failed to load committed snapshot, leave its sequence number to an invalid number(-1)");
Contributor

As a possible follow-up, should we consider retrying this with a backoff (or does it already do that)? Or is that something we'd rather avoid getting into?

@flyrain (Contributor Author) May 8, 2021

There is a retry in the main part of the commit() method, but not for clean-up or notifyListeners. I'm not sure it's a good idea to introduce a retry for them; I don't see much downside beyond delaying the commit a bit, but it feels like over-engineering. It would be another PR anyway.
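
Purely to illustrate the idea being weighed here (a hypothetical follow-up, not part of this PR), a retry with backoff around the snapshot lookup could look like this sketch; ops and snapshotId follow the names in the diff:

// Hypothetical sketch only: retry the snapshot lookup a few times with exponential backoff
Snapshot justSaved = ops.refresh().snapshot(snapshotId);
for (int attempt = 1; attempt <= 2 && justSaved == null; attempt++) {
  try {
    Thread.sleep(100L << attempt); // 200ms, then 400ms
  } catch (InterruptedException e) {
    Thread.currentThread().interrupt();
    break;
  }
  justSaved = ops.refresh().snapshot(snapshotId);
}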

Contributor

No need for back-off. That is handled by the table operations when it tries to load the new metadata. All this needs to do is handle the case where the table metadata can't be loaded.

@kbendick (Contributor) left a comment

This looks good to me.

Left a comment about possibly retrying the refresh for the snapshot with the given snapshotId, perhaps as a follow-up. But I'm not sure of the implications of that, and ideally users are on consistent catalogs/metastores.

  long snapshotId = snapshotId();
- long sequenceNumber = ops.refresh().snapshot(snapshotId).sequenceNumber();
+ Snapshot justSaved = ops.refresh().snapshot(snapshotId);
+ long sequenceNumber = TableMetadata.INVALID_SEQUENCE_NUMBER;
Contributor

Well, I am not sure what the best way to react here is. We do expect the catalog to be consistent, so one option is to just log an error and not send an event, or to send some special event type.

cc @rdsr who implemented this part and @rdblue who uses the event-based system. What do you expect in this case?

Contributor

What is the situation where the snapshot would not be in metadata? If the snapshot wasn't committed, then I would expect to not send an event.

Contributor

Nevermind, I see the comment about not being able to load the metadata.

I'd probably prefer null, but -1 is fine so that we don't change a public API.

  Map<String, String> summary,
  List<ManifestFile> dataManifests) {
- this(io, INITIAL_SEQUENCE_NUMBER, snapshotId, parentId, timestampMillis, operation, summary, null);
+ this(io, TableMetadata.INITIAL_SEQUENCE_NUMBER, snapshotId, parentId, timestampMillis, operation, summary, null);
Contributor

This change looks unrelated to me. Can you remove it?

Contributor Author

Trying to piggy-back a change to consolidate INITIAL_SEQUENCE_NUMBER from multiple places. I can put it in another PR if it is preferred.
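
For context, the two sequence-number constants in play are roughly these (a sketch; the 0 and -1 values come from the discussion above, and the consolidation would keep a single definition on TableMetadata):

// Sketch of the constants on TableMetadata: 0 is the sequence number before any v2 commits;
// -1 is the sentinel this PR reports when the committed snapshot can't be reloaded.
static final long INITIAL_SEQUENCE_NUMBER = 0;
static final long INVALID_SEQUENCE_NUMBER = -1;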

Contributor

Yes, please do. We try to avoid unnecessary changes so that we don't cause unnecessary commit conflicts for people maintaining branches.

Contributor Author

Removed it in a new commit.

if (justSaved == null) {
  // The snapshot just saved may not be present if the latest metadata couldn't be loaded due to eventual
  // consistency problems in refresh.
  LOG.warn("Failed to load committed snapshot, leave its sequence number to an invalid number(-1)");
Contributor

I think we can improve the message; "leave its sequence number to an invalid number(-1)" isn't very clear. How about "Failed to load committed snapshot: omitting sequence number from notifications (not known)"

Contributor Author

Fixed it in a new commit.

@flyrain (Contributor Author) commented May 19, 2021

The test failures are unrelated.

@rdblue merged commit ba54f9c into apache:master on May 20, 2021
@rdblue (Contributor) commented May 20, 2021

Thanks, @flyrain!
