Improve error handling logic for effectively once #5271

Merged: 13 commits into apache:master on Oct 5, 2019

Conversation

jerrypeng
Contributor

Motivation

As part of solving #5218.

Modifications

When there are BK write errors, we need to fence the topic and reset highestSequencedPushed -> highestSequencedPersisted.
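
For illustration only, here is a minimal sketch of that reset, assuming the dedup state is kept as per-producer maps named after the fields mentioned above (this is not the actual MessageDeduplication code):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hedged sketch, not the real MessageDeduplication implementation: per-producer
// sequence-id state and the reset applied after a BK write failure.
class DedupStateSketch {
    private final Map<String, Long> highestSequencedPushed = new ConcurrentHashMap<>();
    private final Map<String, Long> highestSequencedPersisted = new ConcurrentHashMap<>();

    // Roll the "pushed" sequence ids back to what was actually persisted, so
    // producers can resend the failed messages without being flagged as
    // duplicates once the topic is un-fenced.
    synchronized void resetHighestSequenceIdPushed() {
        highestSequencedPushed.clear();
        highestSequencedPushed.putAll(highestSequencedPersisted);
    }
}
```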

@jerrypeng jerrypeng added the type/bug label Sep 24, 2019
@jerrypeng jerrypeng added this to the 2.4.2 milestone Sep 24, 2019
@jerrypeng jerrypeng self-assigned this Sep 24, 2019
```diff
@@ -125,7 +129,8 @@
 public class PersistentTopic extends AbstractTopic implements Topic, AddEntryCallback {
 
     // Managed ledger associated with the topic
-    protected final ManagedLedger ledger;
+    @VisibleForTesting
+    ManagedLedger ledger;
```
Contributor

could we retain the final?

Contributor Author

will change

```java
// close all producers
List<CompletableFuture<Void>> futures = Lists.newArrayList();
producers.forEach(producer -> futures.add(producer.disconnect()));
FutureUtil.waitForAll(futures);
```
Contributor

This will just return a new future that tracks all the futures in the list, without blocking (which is actually what we want).

To ensure we decrement only after all the connections are actually closed, we'd need to do something like:

```java
FutureUtil.waitForAll(futures).handle((v, ex) -> {
    decrementPendingWriteOpsAndCheck();
    return null;
});
```

Contributor Author

will change

```java
    callback.completed(new PersistenceException(exception), -1, -1);
}

long pending = pendingWriteOps.decrementAndGet();
```
Contributor

As above, call decrementPendingWriteOpsAndCheck() from the future callback instead of here

@jerrypeng
Contributor Author

rerun cpp tests

1 similar comment

```diff
@@ -272,17 +290,41 @@ private PersistentSubscription createPersistentSubscription(String subscriptionN
 
     @Override
     public void publishMessage(ByteBuf headersAndPayload, PublishContext publishContext) {
+        pendingWriteOps.incrementAndGet();
```
Contributor

We should add a comment here with the logic behind the "increment then check the fence status" operation, because it will not be evident to a reader here.
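
As an aside, here is a hedged sketch of that increment-then-check ordering (exception and method names are illustrative, not quoted from the merged code): because the counter is bumped before the fence check, the failure handler, which only un-fences once the counter drains, can never miss an in-flight publish.

```java
// Sketch only, not the exact PR code.
public void publishMessage(ByteBuf headersAndPayload, PublishContext publishContext) {
    // Increment BEFORE checking the fence status: the error-handling path
    // un-fences the topic only after this counter reaches zero, so a publish
    // that races with fencing is always either rejected here or counted there.
    pendingWriteOps.incrementAndGet();
    if (isFenced) {
        publishContext.completed(new TopicFencedException("fenced"), -1, -1);
        decrementPendingWriteOpsAndCheck();
        return;
    }
    ledger.asyncAddEntry(headersAndPayload, this, publishContext);
}
```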

@ivankelly ivankelly (Contributor) left a comment

Isn't there a fundamental problem here?

What if the client produces [M1, seq:1],[M2, seq:2],[M3, seq:3] asynchronously. M1 succeeds, M2 fails with a BK error, the managed ledger recovers from the error, then M3 hits the broker and is persisted. At this point, M2 can retry, but the message is lost because seq:2 is lower than seq:3.
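
To spell out why the retry of M2 is dropped, here is a purely illustrative stand-in for the broker-side dedup decision (not the actual MessageDeduplication API):

```java
// Illustrative only: a retry whose sequence id is not higher than the highest
// sequence id already pushed for that producer is treated as a duplicate and
// acknowledged without being persisted; this is exactly how M2 gets lost once
// M3 (seq:3) has gone through.
static boolean treatedAsDuplicate(long sequenceId, long highestSequencedPushed) {
    return sequenceId <= highestSequencedPushed;   // e.g. 2 <= 3 for M2's retry
}
```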

```java
List<CompletableFuture<Void>> futures = Lists.newArrayList();
producers.forEach(producer -> futures.add(producer.disconnect()));
FutureUtil.waitForAll(futures).handle((BiFunction<Void, Throwable, Void>) (aVoid, throwable) -> {
    decrementPendingWriteOpsAndCheck();
```
Contributor

Add a comment here that the write op being decremented is the one incremented in the call that eventually triggered addFailed. Otherwise it looks like you're decrementing for each producer closed.

@jiazhai
Member

jiazhai commented Sep 25, 2019

run integration tests

@merlimat
Contributor

> What if the client produces [M1, seq:1],[M2, seq:2],[M3, seq:3] asynchronously. M1 succeeds, M2 fails with a BK error, the managed ledger recovers from the error, then M3 hits the broker and is persisted. At this point, M2 can retry, but the message is lost because seq:2 is lower than seq:3.

The current guard against this scenario is that the managed ledger will reject all writes for a period of 10 seconds. In practical terms, this should avoid all races between threads (for non-blocking ops), though of course it is not a 100% guarantee.

The next step is to have the managed ledger stay in "error mode" after a write failure, until we manually set it back into normal mode, once all the pending ops are done and we have had the chance to reset the topic.

@jerrypeng
Contributor Author

rerun integration tests

2 similar comments

@ivankelly
Contributor

> What if the client produces [M1, seq:1],[M2, seq:2],[M3, seq:3] asynchronously. M1 succeeds, M2 fails with a BK error, the managed ledger recovers from the error, then M3 hits the broker and is persisted. At this point, M2 can retry, but the message is lost because seq:2 is lower than seq:3.
>
> The current guard against this scenario is that the managed ledger will reject all writes for a period of 10 seconds. In practical terms, this should avoid all races between threads (for non-blocking ops), though of course it is not a 100% guarantee.
>
> The next step is to have the managed ledger stay in "error mode" after a write failure, until we manually set it back into normal mode, once all the pending ops are done and we have had the chance to reset the topic.

Maybe we need to rebrand our "exactly-once" again from "effectively-once" to "probably-once".
I think there needs to be some cooperation with the client w.r.t. failures. A client could have a write pending while the error on a previous write occurs and is being handled, so when that write hits the broker it proceeds as normal, losing the previous write forever.

Maybe we should have some sort of epoch to represent the client <-> producer relationship? When an error occurs on a write, all subsequent writes from that epoch should fail. The error should be kicked back to the client, which should then have to re-establish its current position before proceeding.
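
A rough sketch of that epoch idea (entirely hypothetical; nothing like this exists in this PR or in the client API): a write failure advances the session epoch, so any write still in flight under the old epoch is rejected instead of being persisted out of order.

```java
// Hypothetical sketch of a client <-> producer epoch; not a Pulsar API.
class ProducerSessionSketch {
    private long currentEpoch = 0;

    // A write failure invalidates the current epoch, so every other write
    // already in flight from the same epoch will be rejected and the error
    // surfaced back to the client.
    synchronized long failAndAdvanceEpoch() {
        return ++currentEpoch;
    }

    // A write is accepted only if it was issued under the current epoch; after
    // a failure the client has to re-establish its position and a new epoch.
    synchronized boolean accept(long writeEpoch) {
        return writeEpoch == currentEpoch;
    }
}
```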

@merlimat
Contributor

In this PR, the sequence of handling errors will be as follows (see the sketch after this list):

  1. Get a write error
  2. "fence" topic
  3. disconnect all producers
    • Producers will fail to reconnect because of the topic state
    • This will discard any other publish request in the pipe
  4. Wait until all pending write ops are completed (with failure)
  5. "un-fence" the topic

@jerrypeng
Contributor Author

rerun integration tests

7 similar comments

@jerrypeng jerrypeng merged commit 8e95f43 into apache:master Oct 5, 2019
wolfstudy pushed a commit that referenced this pull request Nov 20, 2019
* Bug in Message Deduplication that may cause incorrect behavior

* add tests

* fix error message

* fix client backoff

* fix tests

* cleaning up

* Fix handling of BK write failures for message dedup

* tests and clean up

* refactoring code

* fixing bugs

* addressing comments

* add missing license header

(cherry picked from commit 8e95f43)
sijie pushed a commit that referenced this pull request Aug 6, 2020
…s unloaded (#7735)

### Motivation

When a topic is unloaded and moved to another broker, the producer for geo-replication often remains unclosed. Because of this, geo-replication is not possible on the broker to which the topic was moved and messages accumulate in the replication backlog.

```
18:56:55.166 [pulsar-io-21-6] ERROR o.a.pulsar.client.impl.ProducerImpl  - [persistent://xxx/yyy/zzz] [pulsar.repl.dc2] Failed to create producer: Producer with name 'pulsar.repl.dc2' is already connected to topic
```

When this issue occurs, the following log is output on the broker where the topic is unloaded.

```
17:14:36.424 [bookkeeper-ml-workers-OrderedExecutor-18-0] INFO  o.a.p.b.s.persistent.PersistentTopic - [persistent://xxx/yyy/zzz] Un-fencing topic...
```

Unloaded topics are usually fenced to prevent new clients from connecting. In this case, however, the producers reconnected to the topic because it had been unfenced, and the replicator was restarted.

I think this is due to #5271. If a topic is fenced to close or delete, we should not unfence it.

### Modifications

When closing or deleting the `PersistentTopic` instance, set the `isClosingOrDeleting` flag to true. If `isClosingOrDeleting` is true, do not unfence the topic unless closing or deleting fails.
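
For illustration, a small hedged sketch of that guard (the two flags follow the description above; the helper method is hypothetical):

```java
// Sketch only: never un-fence a topic that is in the middle of closing/deleting.
private volatile boolean isFenced = false;
private volatile boolean isClosingOrDeleting = false;

void unfenceTopicToResume() {
    // end of the write-error recovery path introduced in #5271
    if (!isClosingOrDeleting) {
        isFenced = false;
    }
}

CompletableFuture<Void> close() {
    isFenced = true;
    isClosingOrDeleting = true;
    return disconnectClientsAndCloseLedger()   // hypothetical helper
            .exceptionally(ex -> {
                // closing failed: allow the topic to serve traffic again
                isClosingOrDeleting = false;
                isFenced = false;
                return null;
            });
}
```
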
huangdx0726 pushed a commit to huangdx0726/pulsar that referenced this pull request Aug 24, 2020
…s unloaded (apache#7735)

lbenc135 pushed a commit to lbenc135/pulsar that referenced this pull request Sep 5, 2020
…s unloaded (apache#7735)

lbenc135 pushed a commit to lbenc135/pulsar that referenced this pull request Sep 5, 2020
…s unloaded (apache#7735)

lbenc135 pushed a commit to lbenc135/pulsar that referenced this pull request Sep 5, 2020
…s unloaded (apache#7735)
Labels
type/bug: The PR fixed a bug or issue reported a bug
4 participants