Fix consumer block problem: skip no entry #14543

Closed
lordcheng10 wants to merge 2 commits into apache:master from lordcheng10:fix_consumer_block

@lordcheng10 lordcheng10 commented Mar 3, 2022

Motivation

We hit the following problem: the broker stops pushing messages to consumers because it tries to read an entry that does not exist, and it keeps retrying the read of that same entry.

The stats command showed that some partitions had stopped pushing messages to consumers:
image

The error log is as follows:
00:00:10.094 [pulsar-io-4-122] INFO org.apache.pulsar.broker.service.persistent.PersistentDispatcherMultipleConsumers - [persistent://teg_onion_onion_common_data_gz/teg_onion_onion_common_data_gz/teg_onion_onion_common_data_gz-partition-159 / t_teg_onion_b_teg_onion_onion_common_data_gz_cg_onion_common_data_cos_svr_gz_001] Retrying read operation
00:00:28.320 [BookKeeperClientWorker-OrderedExecutor-21-0] ERROR org.apache.bookkeeper.client.PendingReadOp - Read of ledger entry failed: L15759573 E15389-E15389, Sent to [11.135.219.182:3181, 11.135.217.80:3181], Heard from [] : bitset = {}, Error = 'No such entry'. First unread entry is (-1, rc = null)
00:00:28.320 [BookKeeperClientWorker-OrderedExecutor-33-0] WARN org.apache.bookkeeper.mledger.impl.OpReadEntry - [teg_onion_onion_common_data_gz/teg_onion_onion_common_data_gz/persistent/teg_onion_onion_common_data_gz-partition-275][t_teg_onion_b_teg_onion_onion_common_data_gz_cg_onion_common_data_cos_svr_gz_001] read failed from ledger at position:15759573:15389
org.apache.bookkeeper.mledger.ManagedLedgerException$NonRecoverableLedgerException: No such entry
00:00:28.320 [BookKeeperClientWorker-OrderedExecutor-33-0] ERROR org.apache.pulsar.broker.service.persistent.PersistentDispatcherMultipleConsumers - [persistent://teg_onion_onion_common_data_gz/teg_onion_onion_common_data_gz/teg_onion_onion_common_data_gz-partition-275 / t_teg_onion_b_teg_onion_onion_common_data_gz_cg_onion_common_data_cos_svr_gz_001] Error reading entries at 15759573:15389 : No such entry, Read Type Normal - Retrying to read in 54.299 seconds

Running the bookkeeper command sh bin/bookkeeper shell ledger 15759573 showed that entryId=15389 does not exist in that ledger, and that the smallest entry ID present is 15558:
image

Monitoring showed that the consumption delay of some partitions kept rising:
image

Modifications

When a read fails because the entry does not exist, skip that entry instead of retrying it.
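
To make the idea concrete, here is a minimal, self-contained sketch of what "skip the entry" could mean. It is not the code in this PR and it does not use any Pulsar or BookKeeper API: `SkipMissingEntrySketch`, `Position`, and `nextReadable` are hypothetical names, and the set of existing entry IDs is assumed to be known already (in the report above it was read off the bookkeeper shell output). Given the position of a failed read, the sketch picks the smallest existing entry ID at or after it in the same ledger.

```java
import java.util.NavigableSet;
import java.util.Optional;
import java.util.TreeSet;

/** Hypothetical sketch; none of these names come from Pulsar or BookKeeper. */
final class SkipMissingEntrySketch {

    /** Minimal stand-in for a (ledgerId, entryId) read position. */
    static final class Position {
        final long ledgerId;
        final long entryId;

        Position(long ledgerId, long entryId) {
            this.ledgerId = ledgerId;
            this.entryId = entryId;
        }

        @Override
        public String toString() {
            return ledgerId + ":" + entryId;
        }
    }

    /**
     * After a "No such entry" failure at {@code failed}, returns the first
     * position in the same ledger whose entry exists, or empty if the ledger
     * holds no entry at or after the failed one (the caller would then have
     * to move on to another ledger, which this sketch does not model).
     */
    static Optional<Position> nextReadable(Position failed, NavigableSet<Long> existingEntryIds) {
        // ceiling() returns the smallest element >= the argument, or null.
        Long next = existingEntryIds.ceiling(failed.entryId);
        return next == null
                ? Optional.empty()
                : Optional.of(new Position(failed.ledgerId, next));
    }

    public static void main(String[] args) {
        // Only 15558 (the smallest surviving entry) comes from the report;
        // the other IDs are made up for illustration.
        NavigableSet<Long> existing = new TreeSet<>();
        for (long id = 15558L; id <= 15560L; id++) {
            existing.add(id);
        }
        Position failed = new Position(15759573L, 15389L);
        System.out.println(nextReadable(failed, existing)
                .map(Position::toString)
                .orElse("none")); // prints 15759573:15558
    }
}
```

Jumping straight to the next existing entry avoids retrying the same missing entry, but it also means the entries in the gap are never delivered, so a real change would need to decide whether that loss is acceptable.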

Documentation

Check the box below or label this PR directly (if you have committer privilege).

Need to update docs?

  • doc-required

    (If you need help on updating docs, create a doc issue)

  • no-need-doc

    (Please explain why)

  • doc

    (If this PR contains doc changes)

@github-actions github-actions bot added the doc-not-needed Your PR changes do not impact docs label Mar 3, 2022
@lordcheng10 lordcheng10 changed the title Fix the problem that the broker never pushes messages to the consumer Fix consumer block problem: skip no entry Mar 3, 2022
@lordcheng10 lordcheng10 closed this Mar 3, 2022