[pulsar-broker] add pending read subscription metrics to stats-internal #9788
Merged
rdhabalia requested review from merlimat, sijie, jiazhai, aahmed-se, codelipenghui and nkurihar March 4, 2021 00:52
codelipenghui approved these changes Mar 4, 2021
merlimat approved these changes Mar 4, 2021
codelipenghui pushed a commit that referenced this pull request Mar 10, 2021
…on (#9789)

### Motivation
We have frequently seen subscriptions get stuck on different topics: the broker does not dispatch messages even though the consumer has available permits and there are no pending reads (example #9788). It can happen due to a regression bug or an unknown issue when expiry runs. One workaround is to manually unload the topic and reload it, which is not feasible if this happens frequently to many topics. Alternatively, the broker should have the capability to discover such stuck subscriptions and unblock them.

The example below shows that the subscription has available permits > 0, there are no pending reads, and the cursor's read position is not moving forward, which builds the backlog until we unload the topic. It happens frequently for an unknown reason:

```
STATS-INTERNAL:
"sub1" : {
  "markDeletePosition" : "11111111:15520",
  "readPosition" : "11111111:15521",
  "waitingReadOp" : false,
  "pendingReadOps" : 0,
  "messagesConsumedCounter" : 115521,
  "cursorLedger" : 585099247,
  "cursorLedgerLastEntry" : 597,
  "individuallyDeletedMessages" : "[]",
  "lastLedgerSwitchTimestamp" : "2021-02-25T19:55:50.357Z",
  "state" : "Open",
  "numberOfEntriesSinceFirstNotAckedMessage" : 1,
  "totalNonContiguousDeletedMessagesRange" : 0,

STATS:
"sub1" : {
  "msgRateOut" : 0.0,
  "msgThroughputOut" : 0.0,
  "msgRateRedeliver" : 0.0,
  "msgBacklog" : 30350,
  "blockedSubscriptionOnUnackedMsgs" : false,
  "msgDelayed" : 0,
  "unackedMessages" : 0,
  "type" : "Shared",
  "msgRateExpired" : 0.0,
  "consumers" : [ {
    "msgRateOut" : 0.0,
    "msgThroughputOut" : 0.0,
    "msgRateRedeliver" : 0.0,
    "consumerName" : "C1",
    "availablePermits" : 723,
    "unackedMessages" : 0,
    "blockedConsumerOnUnackedMsgs" : false,
    "metadata" : { },
    "connectedSince" : "2021-02-25T19:55:50.358285Z",
```

![image](https://user-images.githubusercontent.com/2898254/109894631-ab62d980-7c42-11eb-8dcc-a1a5f4f5d14e.png)

### Modification
Add a capability in the broker to periodically check whether a subscription is stuck and unblock it if needed.
This check is controlled by a flag. For the initial release it can be disabled by default, and we can enable it by default in a later release.

### Result
It helps the broker handle stuck subscriptions and logs a message for later debugging.
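The stuck pattern described in the motivation above (backlog present, available permits > 0, no pending reads, read position not advancing) can be expressed as a small check over two `stats-internal` snapshots and one `stats` snapshot. This is only an illustrative sketch for operators working with the JSON output, not the broker's actual implementation: the function name `looks_stuck` and the two-snapshot approach are assumptions, and only field names that appear in the example above are used.

```python
from typing import Any, Mapping


def looks_stuck(
    stats_sub: Mapping[str, Any],
    internal_before: Mapping[str, Any],
    internal_after: Mapping[str, Any],
) -> bool:
    """Hypothetical heuristic: does a subscription match the stuck pattern?

    stats_sub       -- the subscription entry from the topic's stats
    internal_before -- the cursor entry from an earlier stats-internal snapshot
    internal_after  -- the cursor entry from a later stats-internal snapshot
    """
    # There is something to deliver.
    has_backlog = stats_sub.get("msgBacklog", 0) > 0
    # At least one consumer is able to receive messages.
    has_permits = any(
        c.get("availablePermits", 0) > 0 for c in stats_sub.get("consumers", [])
    )
    # The cursor is neither waiting on a read nor has reads in flight.
    no_pending_reads = (
        not internal_after.get("waitingReadOp", False)
        and internal_after.get("pendingReadOps", 0) == 0
    )
    # The read position did not move between the two snapshots.
    read_position_frozen = (
        internal_before.get("readPosition") == internal_after.get("readPosition")
    )
    return has_backlog and has_permits and no_pending_reads and read_position_frozen


# Values taken from the example output above; the same snapshot is passed
# twice to represent a read position that has not moved.
stats = {"msgBacklog": 30350, "consumers": [{"availablePermits": 723}]}
internal = {
    "readPosition": "11111111:15521",
    "waitingReadOp": False,
    "pendingReadOps": 0,
}
print(looks_stuck(stats, internal, internal))  # True
```

In practice the two `stats-internal` snapshots would be taken some interval apart, so that a momentarily idle cursor is not mistaken for a stuck one.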
eolivelli pushed a commit that referenced this pull request May 13, 2021
…on (#9789)
Motivation
We frequently see consumers get stuck while the broker does not dispatch messages even though it should. We need additional pending-read metrics for the topic for better debugging. For example, when a subscription is stuck, we want to know `pendingRead` and `pendingReplayRead`; for now, we have to verify them from a heap dump.
For example, below is a stuck subscription whose pending read does not show up in the metrics:
stats
stats-internal