Fix #199: Do not stop to replicate when producer throws exception #220

nkurihar · 2017-02-17T18:45:21Z

Motivation

This closes #199

Modifications

When the replication producer catches an exception,
rewind the cursor and continue with readMoreEntries (if possible) rather than returning immediately.

A unit test that simulates #199 is also added.

Result

The replicator continues to replicate even when the producer throws an exception.
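The change described above can be illustrated with a toy model. This is a minimal sketch, not the actual PersistentReplicator code: the class `ToyReplicator`, its integer positions, and the `remoteSend` predicate are all hypothetical stand-ins, and the real replicator reads entries asynchronously and in batches.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical toy model; it does not use any Pulsar API.
final class ToyReplicator {
    private final List<String> entries;
    private final Predicate<String> remoteSend; // assumed: returns false when the send fails
    private int markDeletePosition = -1;        // index of the last acknowledged entry
    private int readPosition = 0;               // index of the next entry to read
    final List<String> replicated = new ArrayList<>();

    ToyReplicator(List<String> entries, Predicate<String> remoteSend) {
        this.entries = entries;
        this.remoteSend = remoteSend;
    }

    /** Reads and sends one entry; returns false when nothing is left to read. */
    boolean readMoreEntries() {
        if (readPosition >= entries.size()) {
            return false;
        }
        String entry = entries.get(readPosition++);
        if (remoteSend.test(entry)) {
            markDeletePosition = readPosition - 1;
            replicated.add(entry);
        } else {
            // Instead of returning and stalling, rewind so that the
            // failed entry is read again on the next call.
            readPosition = markDeletePosition + 1;
        }
        return true;
    }
}
```

In this model a caller simply keeps invoking `readMoreEntries()`; an entry whose send fails is retried on the next call instead of ending replication.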

merlimat

👍

saandrews · 2017-02-17T22:12:32Z

pulsar-broker/src/test/java/com/yahoo/pulsar/broker/service/ReplicatorTest.java

+            consumer1.receive(10);
+
+            // Restrict backlog quota limit to 1
+            admin1.namespaces().setBacklogQuota("pulsar/global/ns1", new BacklogQuota(1, policy));


Can we set up the test this way? If you want, you can create another test.

1. Limit the quota to 1 message.
2. Publish 10 messages.
3. Verify that receive times out after the first message is received (since the quota is full).
4. Increase the quota to 10.
5. Verify that we receive the remaining 9 messages.

@saandrews
I modified the test a little:

1. Limit the quota to 1.
2. Publish 1 message and wait for the backlog limit to take effect. ※
3. Publish 9 messages and verify that they stay pending.

※ If I produce 10 messages at once before the backlog limit takes effect, they will all be sent.

saandrews · 2017-02-17T22:13:39Z

...ar-broker/src/main/java/com/yahoo/pulsar/broker/service/persistent/PersistentReplicator.java

-                log.debug("[{}][{} -> {}] Message persisted on remote broker", replicator.topicName,
-                        replicator.localCluster, replicator.remoteCluster);
+                // cursor shoud be rewinded since it was incremented when readMoreEntries
+                replicator.cursor.rewind();


@merlimat Do you see any side effect if we rewind it for every exception? All failed pending messages would reach here and invoke rewind.

Rewind is a cheap operation: it just resets the readPosition to markDeletePosition + 1.
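To make that concrete, here is a minimal sketch of the rewind semantics described in this reply. `ToyCursor` is a hypothetical class, not the managed-ledger cursor itself, and positions are simplified to a single `long` for illustration.

```java
// Hypothetical toy cursor; positions are simplified to a single long.
final class ToyCursor {
    private long markDeletePosition = -1; // last acknowledged position
    private long readPosition = 0;        // next position to read

    /** Returns the current read position and advances past it. */
    long readNext() {
        return readPosition++;
    }

    /** Records that everything up to and including position is acknowledged. */
    void markDelete(long position) {
        markDeletePosition = position;
    }

    /** Resets the read position to the first unacknowledged position. */
    void rewind() {
        readPosition = markDeletePosition + 1;
    }

    long getReadPosition() {
        return readPosition;
    }
}
```

In this model, repeated `rewind()` calls with no reads in between are idempotent: each one is a single assignment that lands on the same position, which is why invoking it once per failed message is harmless here.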

rdhabalia

👍

* Fix #199: Do not stop to replicate when producer throws exception
* Fix log messages
* testReplicatorProducerClosing shoud be executed at last since it closes pulsar2/pulsar3
* Add unit test for replication resumption on backlog exceeded

* Fixed Lookup service.

fixes apache#220

fixes apache#220: when `computeIfAbsent` is used, a future completed with `consumerManagerFuture.complete(null)` is stored in `consumerTopicManagers`, so `getTopicConsumerManager` always gets the cached null future for that key, when it should call getTopic again.
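The caching pitfall described in that commit message can be reproduced with plain JDK classes. This is a minimal sketch under stated assumptions: `NullFutureCacheSketch`, its `loader` function, and the eviction shown in `getWithEviction` are hypothetical illustrations, not the code of the referenced commit, and the eviction is only one possible remedy.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical sketch; names do not come from the referenced project.
final class NullFutureCacheSketch {
    private final ConcurrentHashMap<String, CompletableFuture<String>> cache =
            new ConcurrentHashMap<>();
    private final Function<String, String> loader; // assumed: may return null

    NullFutureCacheSketch(Function<String, String> loader) {
        this.loader = loader;
    }

    /** Problematic variant: a future completed with null stays cached forever. */
    CompletableFuture<String> getCached(String key) {
        return cache.computeIfAbsent(key,
                k -> CompletableFuture.completedFuture(loader.apply(k)));
    }

    /** Possible remedy: evict a null result so the next call loads again. */
    CompletableFuture<String> getWithEviction(String key) {
        CompletableFuture<String> future = getCached(key);
        if (future.isDone() && !future.isCompletedExceptionally()
                && future.join() == null) {
            cache.remove(key, future);
        }
        return future;
    }
}
```

With a loader that returns null once and a value afterwards, `getCached` keeps returning the null future, while `getWithEviction` returns null only on the first call.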

Nozomi Kurihara added 3 commits February 17, 2017 10:26

Fix apache#199: Do not stop to replicate when producer throws exception

c4eeca5

Fix log messages

df454ab

testReplicatorProducerClosing shoud be executed at last since it clos…

b8a99ea

…es pulsar2/pulsar3

merlimat approved these changes Feb 17, 2017

merlimat assigned nkurihar Feb 17, 2017

merlimat added the type/bug label (The PR fixed a bug or issue reported a bug) Feb 17, 2017

merlimat added this to the 1.17 milestone Feb 17, 2017

merlimat requested review from rdhabalia and saandrews February 17, 2017 21:59

saandrews reviewed Feb 17, 2017

rdhabalia approved these changes Feb 17, 2017

Add unit test for replication resumption on backlog exceeded

c6afe6e

nkurihar force-pushed the fix_repl branch from b30d917 to c6afe6e Compare February 17, 2017 23:31

merlimat merged commit 6fd212a into apache:master Feb 17, 2017

hrsakai pushed a commit to hrsakai/pulsar that referenced this pull request Dec 10, 2020

Fixed tls connection issue (apache#220)

6e5c7d3

* Fixed Lookup service.

hangc0276 pushed a commit to hangc0276/pulsar that referenced this pull request May 26, 2021

update lookup cache when onload & unload (apache#265)

477ce6c

fixes apache#220

xiaotongwang1 mentioned this pull request Aug 4, 2021

Pulsar 2.7.0+ KOP 2.7.2.x getPartitionedTopicMetadata timeout #11532

Closed
