Recover cursor with correct readPosition and replay unackedMessages #446

rdhabalia · 2017-06-01T23:26:20Z

Motivation

Right now, the broker recovers a cursor's stored individualDeletedMessages when the cursor is initialized. However, it also sets the cursor's readPosition to markDeletePosition. If the distance between readPosition and the last acked message in individualDeletedMessages is very large, this can block message dispatching for some time: the dispatcher reads messages starting from readPosition and filters the already deleted messages out of each batch, which can take a significant amount of time. Also, the unacked messages are no longer part of the consumer's unacked-message list, so redelivery will not deliver them either.
Therefore, on cursor recovery the broker should:

set readPosition at the last acked message from individualDeletedMessages, if present
replay all unacked messages after recovery, so that the consumer can consume them again and gets a chance to ack them.

Modifications

set readPosition at the last acked message from individualDeletedMessages, if present
replay all unacked messages after recovery, so that the consumer can consume them again and gets a chance to ack them.

Result

It will fix the message delivery delay that occurs when a bundle is loaded on a broker. I think it should also fix #380.

merlimat · 2017-06-01T23:40:59Z

managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/ManagedCursorImpl.java

+     * @return
+     */
+    @Override
+    public Set<? extends Position> getNotDeletedMessages() {


I don't think we need to go down this route.

When reading, the managed cursor is already skipping all the individually deleted messages. We should just set the read position to be always on (mark-delete + 1) when recovering.

joefk · Jun 2, 2017

I think the issue is that skipping over the acked messages takes a lot of time, since the individually deleted ranges cover millions of messages.

rdhabalia · Jun 2, 2017

> When reading, the managed cursor is already skipping all the individually deleted messages.

Yes, that's correct: the cursor reads messages in chunks of 100, filters out the already deleted messages, and keeps going.
However, we recently had an issue with one subscription where a large number of messages sat between markDeletePosition and the last acked message in individuallyDeletedMessages, and only a few of them were unacked. So the cursor kept reading from markDeletePosition in chunks of 100 and filtering out most of the messages. readPosition kept advancing, but the consumer did not receive any messages for a long time.

Example: here is a snapshot of the cursor stats, in which individuallyDeletedMessages was large:

{ "markDeletePosition" : "402212931:36356", "readPosition" : "403263473:272", "waitingReadOp" : false, "pendingReadOps" : 1, "messagesConsumedCounter" : -385138, "cursorLedger" : -1, "cursorLedgerLastEntry" : -1, "individuallyDeletedMessages" : "[(402212931:36357‥402258887:15270], (402258887:15271‥402312619:6246], (402312619:6247‥402343507:21490], ....... ....... ....... ....... (405361786:172156‥405361786:172253], (405361786:172265‥405361786:172272]]",

merlimat · Jun 2, 2017

OK. In this case, then, we should move the read position ahead to the next available message.

Eg. :

Mark-delete: 1:0,

Individually deleted messages: [(1:1..1:10]]

In this case 1:1 was not acked.

If I try to read 5 messages, I should get a list with 1:1 and the readPosition should be moved to 1:11.

This could be implemented such that, if I read and some of the messages were filtered, we check for the next unacked message to move the read position for next time.

My point here is that this issue should be handled at the ManagedLedger layer. By returning the unacked messages to the application we're kind of leaking back the logic to the broker.
Also, it might not only happen for the shared subscription.
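The 1:0 / (1:1..1:10] example above can be sketched in a few lines. This is only an illustration of the idea, not the ManagedCursorImpl code: the class `SkipDeletedSketch` and all of its methods are hypothetical, positions are reduced to plain entry ids on a single ledger, and the deleted ranges are kept in a `TreeMap` rather than in the range set the managed ledger actually uses.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, simplified model of a cursor on a single ledger.
final class SkipDeletedSketch {
    // Individually deleted ranges as (lowerExclusive, upperInclusive]:
    // key = lower bound (exclusive), value = upper bound (inclusive).
    private final TreeMap<Long, Long> deleted = new TreeMap<>();
    private long readPosition;

    SkipDeletedSketch(long markDeleteEntry) {
        // On recovery, start reading right after the mark-delete position.
        this.readPosition = markDeleteEntry + 1;
    }

    void addDeletedRange(long lowerExclusive, long upperInclusive) {
        deleted.put(lowerExclusive, upperInclusive);
    }

    long readPosition() {
        return readPosition;
    }

    boolean isDeleted(long entryId) {
        // Only the range with the greatest lower bound below entryId can contain it.
        Map.Entry<Long, Long> range = deleted.lowerEntry(entryId);
        return range != null && entryId <= range.getValue();
    }

    // First entry id at or after entryId that is not individually deleted.
    long nextAvailable(long entryId) {
        Map.Entry<Long, Long> range = deleted.lowerEntry(entryId);
        while (range != null && entryId <= range.getValue()) {
            entryId = range.getValue() + 1;
            range = deleted.lowerEntry(entryId);
        }
        return entryId;
    }

    // Reads up to `count` entries, drops the deleted ones, and moves the
    // read position past the rest of the deleted range for the next read.
    List<Long> read(int count, long lastEntry) {
        List<Long> unacked = new ArrayList<>();
        long pos = readPosition;
        for (int i = 0; i < count && pos <= lastEntry; i++, pos++) {
            if (!isDeleted(pos)) {
                unacked.add(pos);
            }
        }
        readPosition = nextAvailable(pos);
        return unacked;
    }
}
```

With mark-delete at entry 0 and the single deleted range (1, 10], a first `read(5, 20)` returns only entry 1 and leaves the read position at entry 11, so the next read does not have to walk through entries 6 to 10.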

rdhabalia · Jun 2, 2017

> My point here is that this issue should be handled at the ManagedLedger layer. By returning the unacked messages to the application we're kind of leaking back the logic to the broker.

Yes, that's correct, it should be handled in the managed-ledger layer only. Fixed it.

jai1 · 2017-06-02T00:34:05Z

...n/java/com/yahoo/pulsar/broker/service/persistent/PersistentDispatcherMultipleConsumers.java

@@ -94,6 +94,8 @@ public PersistentDispatcherMultipleConsumers(PersistentTopic topic, ManagedCurso
        this.readBatchSize = MaxReadBatchSize;
        this.maxUnackedMessages = topic.getBrokerService().pulsar().getConfiguration()
                .getMaxUnackedMessagesPerSubscription();
+        this.cursor.getNotDeletedMessages().forEach(position -> this.messagesToReplay


Can we change it to getNotDeletedMessages(this.messagesToReplay) to reduce the garbage created by PositionImpl?

i.e. directly populate messagesToReplay instead of creating a temporary Set which will eventually be GC'ed.
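A rough sketch of what this suggestion amounts to, under simplifying assumptions: `NotDeletedSketch` and `collectNotDeleted` are hypothetical names, positions are plain entry ids on one ledger, and the deleted ranges are passed in as sorted (lowerExclusive, upperInclusive] pairs. The point is only that the method fills a collection owned by the caller instead of building and returning a temporary Set.

```java
import java.util.Collection;

// Hypothetical helper: adds the not-yet-deleted entry ids straight into a
// collection supplied by the caller, so no intermediate Set is allocated.
final class NotDeletedSketch {
    // deletedRanges must be sorted; each element is {lowerExclusive, upperInclusive}.
    static void collectNotDeleted(long markDeleteEntry, long[][] deletedRanges,
                                  Collection<Long> target) {
        long next = markDeleteEntry + 1;
        for (long[] range : deletedRanges) {
            // Entries from `next` up to the exclusive lower bound are not acked.
            for (long p = next; p <= range[0]; p++) {
                target.add(p);
            }
            next = Math.max(next, range[1] + 1);
        }
    }
}
```

A dispatcher would then call something like `collectNotDeleted(0, ranges, messagesToReplay)` once in its constructor. The per-position objects are still created in this sketch; only the temporary container goes away.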

merlimat

Change LGTM. Just a couple of minor details.

merlimat · 2017-06-02T16:13:05Z

managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/OpReadEntry.java

+     * @param position
+     * @return next availablePosition
+     */
+    private PositionImpl getNextAvailablePosition(PositionImpl position) {


Can this method be combined with cursor.getNextAvailablePosition()?

merlimat · 2017-06-02T16:50:20Z

managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/OpReadEntry.java

        if (log.isDebugEnabled()) {
            log.debug("[{}][{}] Read entries succeeded batch_size={} cumulative_size={} requested_count={}",
                    cursor.ledger.getName(), cursor.getName(), returnedEntries.size(), entries.size(), count);
        }
        List<Entry> filteredEntries = cursor.filterReadEntries(returnedEntries);
        entries.addAll(filteredEntries);

+        final Position nexReadPosition = getNextAvailablePosition(lastPosition);


We should try to avoid checking the range set if there are no filtered entries. There are some objects allocated in there when checking the ranges (which we should address at some point, though outside the scope of this change).
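One way to read this suggestion is a size comparison before the range lookup. The sketch below assumes that reading, with hypothetical names (`ReadPositionGuardSketch`, `nextReadPosition`) and with entry ids standing in for positions; the `rangeLookup` parameter stands in for whatever method consults the deleted-range set.

```java
import java.util.List;
import java.util.function.LongUnaryOperator;

// Hypothetical guard: consult the deleted-range set only when the filter
// actually dropped something from the batch that was just read.
final class ReadPositionGuardSketch {
    static long nextReadPosition(List<Long> returnedEntries, List<Long> filteredEntries,
                                 long lastPosition, LongUnaryOperator rangeLookup) {
        if (filteredEntries.size() == returnedEntries.size()) {
            // Nothing was filtered, so take the cheap path and skip the range check.
            return lastPosition + 1;
        }
        // Some entries were already deleted: find the next available position.
        return rangeLookup.applyAsLong(lastPosition + 1);
    }
}
```

If the position after an unfiltered batch happens to fall inside a deleted range, the following read filters those entries and takes the second branch, so the skip is only deferred by one read.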

merlimat

👍

Recover cursor with correct readPosition and replay unackedMessages

9ade24f

rdhabalia added the type/bug label Jun 1, 2017

rdhabalia added this to the 1.18 milestone Jun 1, 2017

rdhabalia self-assigned this Jun 1, 2017

merlimat reviewed Jun 1, 2017

jai1 reviewed Jun 2, 2017

rdhabalia force-pushed the readPos branch from 3a710da to 929aed2 on June 2, 2017 01:42

merlimat reviewed Jun 2, 2017

skip deleted messages for next read

7bbe632

rdhabalia force-pushed the readPos branch from 929aed2 to 7bbe632 on June 2, 2017 17:44

merlimat approved these changes Jun 2, 2017

merlimat merged commit 2e4475e into apache:master Jun 2, 2017

rdhabalia deleted the readPos branch June 21, 2017 18:51
