fix: clear preferredReadReplica if broker shutdown #2108

dnwe · 2022-01-12T00:36:44Z

After Sarama had been given a preferred replica to consume from, it mistakenly latched onto that value and did not unset it when the preferred replica broker was shut down and left the cluster metadata.

Fetches continued to work for as long as that broker remained shut down, because they were now being sent to the Leader, which serviced them itself as it had no better preferred replica to point the client at.

However, consumption would then hang after the broker came back online, because the Leader would stop returning records in the FetchResponse and would instead return only the preferred replicaID, expecting the client to send its FetchRequests there. Because the partitionConsumer had latched the value of preferredReplica, it never dispatched to (re-)connect to the preferred replica; it just continued to send FetchRequests to the leader and received no records back.
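The failure mode described above can be illustrated with a small, self-contained model. This is only a sketch of the idea and not the code changed in this PR: `toyConsumer`, `noPreferredReplica`, `selectBroker`, and `onFetchResponse` are hypothetical names invented for illustration, and cluster metadata is reduced to a set of live broker IDs.

```go
package main

import "fmt"

// noPreferredReplica is a sentinel used by this sketch only; it means
// "no preferred replica is set, fetch from the leader".
const noPreferredReplica int32 = -1

// toyConsumer is a hypothetical stand-in for a partition consumer.
// It is NOT Sarama's partitionConsumer type.
type toyConsumer struct {
	leaderID         int32
	preferredReplica int32
}

// selectBroker returns the broker ID the next FetchRequest should go to.
// liveBrokers models the broker IDs present in the current cluster metadata.
func (c *toyConsumer) selectBroker(liveBrokers map[int32]bool) int32 {
	if c.preferredReplica != noPreferredReplica {
		if liveBrokers[c.preferredReplica] {
			return c.preferredReplica
		}
		// The preferred replica has left the metadata: clear the value
		// instead of leaving it latched, then fall back to the leader.
		c.preferredReplica = noPreferredReplica
	}
	return c.leaderID
}

// onFetchResponse handles the preferred replica ID carried in a response
// and reports whether the consumer must re-dispatch to a different broker.
func (c *toyConsumer) onFetchResponse(preferred int32) bool {
	if preferred != noPreferredReplica && preferred != c.preferredReplica {
		c.preferredReplica = preferred
		return true
	}
	// Same value as before (or none): nothing changes. If a stale value
	// were still latched here, the consumer would stay on the leader.
	return false
}

func main() {
	c := &toyConsumer{leaderID: 1, preferredReplica: 2}

	// Replica 2 is alive, so fetches go to it.
	fmt.Println(c.selectBroker(map[int32]bool{1: true, 2: true}))

	// Replica 2 has shut down and left the metadata: fall back to the leader.
	fmt.Println(c.selectBroker(map[int32]bool{1: true}))

	// Replica 2 is back and the leader points the client at it again.
	fmt.Println(c.onFetchResponse(2))
	fmt.Println(c.selectBroker(map[int32]bool{1: true, 2: true}))
}
```

In this model, if `selectBroker` did not clear `preferredReplica`, the later `onFetchResponse(2)` call would see an unchanged value and report that no re-dispatch is needed, which mirrors the hang described above.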

Contributes-to: #2090

Signed-off-by: Dominic Evans <dominic.evans@uk.ibm.com>

bai

👍🏼 lgtm

lizthegrey · 2022-01-19T00:08:22Z

Confirming, this fix solved our problem.

dnwe requested a review from bai as a code owner January 12, 2022 00:36

dnwe mentioned this pull request Jan 12, 2022

bug: follower fetch halts rather than switch to leader if replica is stopped/restarted #2090

Closed

chore: fix gosec and misspell warnings

1f754e2

dnwe added the fix label Jan 12, 2022

bai approved these changes Jan 12, 2022


dnwe merged commit a059adb into main Jan 12, 2022

dnwe deleted the dnwe/fix-consumer-from-follower branch January 12, 2022 09:29

This was referenced Jan 19, 2022

Update module github.com/Shopify/sarama to v1.34.1 - autoclosed amuraru/koperator#2

Closed

fix(deps): update module github.com/shopify/sarama to v1.31.0 shortlink-org/shortlink#3292

Merged

niamster mentioned this pull request Apr 7, 2022

v1.30.1 patched niamster/sarama#1

Closed

renovate bot mentioned this pull request Apr 19, 2022

fix(deps): update module github.com/shopify/sarama to v1.32.0 secustor/opentelemetry-meetup#41

Merged


niamster mentioned this pull request Apr 20, 2022

v1.30.1 patched DataDog/sarama#1

Closed

renovate bot mentioned this pull request Apr 20, 2022

Update module github.com/Shopify/sarama to v1.32.0 shift/vflow#13

Merged

