[fix][broker] Can't consume messages for a long time due to Entry Filter #17782
poorbarcode wants to merge 10 commits into apache:master
@poorbarcode Please provide a correct documentation label for your PR.
This PR should merge into the following branches:
if (!redeliveryTracker.hasRedeliveredEntry(entries)){
    return super.getNextConsumer();
}
if (redeliveryTracker instanceof InMemoryAndPreventCycleFilterRedeliveryTracker tracker){
There was a problem hiding this comment.
we should not use "instanceof"
we should leverage polymorphism
There was a problem hiding this comment.
this comment seems unresolved
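As an illustration of the polymorphism suggestion above, here is a minimal sketch. It assumes the tracker interface could expose a default method that non-pausing trackers inherit, so the dispatcher never needs a type check. All names ending in Sketch are hypothetical stand-ins, not Pulsar classes.

```java
// Hypothetical, simplified stand-in for the tracker interface discussed in the review.
interface RedeliveryTrackerSketch {
    // Default behavior: trackers that never pause consumers report zero.
    default int pausedConsumerCount() {
        return 0;
    }
}

// A tracker without pause support simply inherits the default.
class PlainTrackerSketch implements RedeliveryTrackerSketch {
}

// A tracker with pause support overrides the method.
class PausingTrackerSketch implements RedeliveryTrackerSketch {
    private final int paused;

    PausingTrackerSketch(int paused) {
        this.paused = paused;
    }

    @Override
    public int pausedConsumerCount() {
        return paused;
    }
}

class DispatcherSketch {
    // The caller asks the interface directly; no instanceof check is needed.
    static boolean allConsumersPaused(RedeliveryTrackerSketch tracker, int consumerCount) {
        return consumerCount > 0 && tracker.pausedConsumerCount() == consumerCount;
    }
}
```

With this shape, adding another tracker implementation would not require touching the dispatcher.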
}
if (redeliveryTracker instanceof InMemoryAndPreventCycleFilterRedeliveryTracker tracker){
    if (tracker.pausedConsumerCount() == consumerSet.size()){
        log.warn("No consumers are currently able to consume the first redelivery entry {}",
There was a problem hiding this comment.
I think this will be logged many times, and it is actually not a problem.
I am testing under heavy load so that I can say more
    return super.getNextConsumer();
} else {
    Consumer nextConsumer = null;
    while (true){
There was a problem hiding this comment.
this looks like an endless loop in some cases.
we cannot keep this thread busy forever
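One way to address the endless-loop concern is to bound the scan by the number of consumers. The sketch below is not the PR's code: consumers are represented as plain strings and the paused set is passed in, both assumptions made only for illustration.

```java
import java.util.List;
import java.util.Set;

class BoundedConsumerSelection {
    // Returns the index of the next non-paused consumer, starting the scan at
    // startIndex and visiting each consumer at most once. Returns -1 when every
    // consumer is paused (or the list is empty), so the caller can back off
    // instead of spinning.
    static int nextAvailable(List<String> consumers, Set<String> paused, int startIndex) {
        int n = consumers.size();
        for (int i = 0; i < n; i++) {
            int idx = Math.floorMod(startIndex + i, n);
            if (!paused.contains(consumers.get(idx))) {
                return idx;
            }
        }
        return -1;
    }
}
```

Because the loop runs at most consumers.size() iterations, the dispatcher thread cannot stay busy indefinitely; what to do on -1 (for example, rescheduling the read later) is left to the caller.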
    pausedConsumers.clear();
}

private static int comparePosition(Position pos1, Position pos2) {
There was a problem hiding this comment.
this method should go somewhere in Position or in some utility class
like comparePositionByEntryId
please check in the code if we already have something like that
There was a problem hiding this comment.
Already fixed. I didn't find a suitable utility method, so I added a new one.
There was a problem hiding this comment.
it seems you are not using the new utility method you introduced here
There was a problem hiding this comment.
Already fixed.
There was a problem hiding this comment.
this method is not used, we should drop it
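For reference, a position comparison of the kind discussed in this thread could look like the sketch below. It assumes a position is identified by a ledger id and an entry id and is ordered by ledger id first; PositionSketch and PositionComparators are hypothetical names, not the utility added in the PR.

```java
import java.util.Comparator;

// Hypothetical, minimal stand-in for a position (ledger id plus entry id).
final class PositionSketch {
    final long ledgerId;
    final long entryId;

    PositionSketch(long ledgerId, long entryId) {
        this.ledgerId = ledgerId;
        this.entryId = entryId;
    }
}

final class PositionComparators {
    // Orders by ledger id first, then by entry id within the same ledger.
    static final Comparator<PositionSketch> BY_LEDGER_THEN_ENTRY =
            Comparator.comparingLong((PositionSketch p) -> p.ledgerId)
                    .thenComparingLong(p -> p.entryId);

    // Negative if a is earlier than b, zero if equal, positive if later.
    static int compare(PositionSketch a, PositionSketch b) {
        return BY_LEDGER_THEN_ENTRY.compare(a, b);
    }
}
```

Keeping the comparator in one shared place avoids private copies such as the comparePosition method flagged above.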
    return false;
}
// Automatically becomes invalid after 1s, because users may use time to filter the Entry.
return System.currentTimeMillis() - pauseTime < 1000;
There was a problem hiding this comment.
this should be configurable, maybe next to ServiceConfiguration#dispatcherEntryFilterRescheduledMessageDelay ?
There was a problem hiding this comment.
Already fixed. Could you please check whether the documentation is suitable?
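A sketch of what a configurable pause expiry might look like, assuming the duration is passed in from configuration rather than hard-coded as 1000. The class name, the constructor parameter, and the injected clock are all hypothetical; the actual setting name used by the PR is not shown in this conversation.

```java
import java.util.function.LongSupplier;

class ConsumerPauseSketch {
    private final long pauseDurationMillis; // hypothetical configurable value
    private final LongSupplier clock;       // injected so the logic can be tested
    private long pauseTime = -1;            // -1 means "never paused"

    ConsumerPauseSketch(long pauseDurationMillis, LongSupplier clock) {
        this.pauseDurationMillis = pauseDurationMillis;
        this.clock = clock;
    }

    // Records the moment the consumer was paused.
    void pause() {
        pauseTime = clock.getAsLong();
    }

    // The pause expires automatically once pauseDurationMillis has elapsed,
    // because users may filter entries based on time.
    boolean isPaused() {
        if (pauseTime < 0) {
            return false;
        }
        return clock.getAsLong() - pauseTime < pauseDurationMillis;
    }
}
```

In production code the clock would typically be System::currentTimeMillis; injecting it keeps the expiry check deterministic in tests.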

void clear();

default boolean hasRedeliveredEntry(List<Entry> entries) {
There was a problem hiding this comment.
maybe the default implementation should be no-op, this way we don't add load to users who don't need this feature
There was a problem hiding this comment.
I feel that since this method exists, it should provide the correct logic, so it should not be no-op
There was a problem hiding this comment.
This method has already been moved into InMemoryAndPreventCycleFilterRedeliveryTracker, see #17782 (comment)
Fixes:
Motivation
When there are two consumers, users can specify the consumption behavior of each consumer with an Entry Filter.
Case: consumer_1 can consume 60% of the messages, consumer_2 can consume 60% of the messages, and there is a 10% intersection between consumer_1 and consumer_2.
If the filter returns FilterResult.RESCHEDULE for more than 10% of messages, then it is possible that a message that can only be consumed by consumer_1 keeps being redelivered to consumer_2, and a message that can only be consumed by consumer_2 keeps being redelivered to consumer_1. Then the problem occurs:
pulsar/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/AbstractBaseDispatcher.java
Lines 140 to 148 in 8441f67
You can reproduce the problem by running FilterEntryTest.testEntryFilterRescheduleMessageDependingOnConsumerSharedSubscription 20 times.
Modifications
When a message is redelivered to the same consumer more than 3 times, that consumer stops receiving this message for 1 second.
Since tracking the consumption of every message would cost too much memory, only the earliest message is tracked.
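The behavior described under Modifications can be sketched roughly as follows. This is an illustrative simplification, not the PR's InMemoryAndPreventCycleFilterRedeliveryTracker: consumers are plain strings, positions are a pair of longs, the current time is passed in explicitly, and advancing the tracked position after an acknowledgement is omitted.

```java
import java.util.HashMap;
import java.util.Map;

class EarliestEntryRedeliveryTrackerSketch {
    static final int MAX_REDELIVERIES = 3;  // "more than 3 times" in the description
    static final long PAUSE_MILLIS = 1000L; // "1 second" in the description

    // Only one position (the earliest seen) is tracked, to bound memory use.
    private long trackedLedgerId = -1;
    private long trackedEntryId = -1;
    private final Map<String, Integer> redeliveryCounts = new HashMap<>();
    private final Map<String, Long> pausedAt = new HashMap<>();

    // Records that the entry at (ledgerId, entryId) was rescheduled for consumer.
    void recordRedelivery(String consumer, long ledgerId, long entryId, long nowMillis) {
        if (trackedLedgerId < 0 || isBeforeTracked(ledgerId, entryId)) {
            // A new earliest entry replaces the tracked one; old state is dropped.
            trackedLedgerId = ledgerId;
            trackedEntryId = entryId;
            redeliveryCounts.clear();
            pausedAt.clear();
        } else if (ledgerId != trackedLedgerId || entryId != trackedEntryId) {
            return; // later entries are deliberately not tracked
        }
        int count = redeliveryCounts.merge(consumer, 1, Integer::sum);
        if (count > MAX_REDELIVERIES) {
            pausedAt.put(consumer, nowMillis);
        }
    }

    // A consumer is paused for PAUSE_MILLIS after exceeding the redelivery limit.
    boolean isPaused(String consumer, long nowMillis) {
        Long since = pausedAt.get(consumer);
        return since != null && nowMillis - since < PAUSE_MILLIS;
    }

    private boolean isBeforeTracked(long ledgerId, long entryId) {
        return ledgerId < trackedLedgerId
                || (ledgerId == trackedLedgerId && entryId < trackedEntryId);
    }
}
```

Tracking a single position keeps memory constant regardless of backlog size, at the cost of not detecting redelivery cycles on later entries until they become the earliest one.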
Documentation
doc-required (Your PR needs to update docs and you will update later)
doc-not-needed (Please explain why)
doc (Your PR contains doc changes)
doc-complete (Docs have been already added)
Matching PR in forked repository
PR in forked repository: