
Retry claiming partitions if the partition is already claimed. See #62 #68

Merged 1 commit into wvanbergen:master on Aug 17, 2015

Conversation

nemosupremo
Contributor

This should fix #62, and combined with the fix in PR #63, should also fix #60.

What I've opted to do is simply retry ClaimPartition 3 times, sleeping for 1 second if the error kazoo.ErrPartitionClaimedByOther occurs. If any other error occurs, we exit as normal.

The other changes (line 366, 371) were due to go fmt.
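
For illustration, here is a minimal sketch of that retry loop in Go. It is not the exact diff from this PR: the claimWithRetry helper and package name are hypothetical, and the ClaimPartition signature is assumed from the kazoo-go API of the time.

```go
package retrysketch

import (
	"time"

	kazoo "github.com/wvanbergen/kazoo-go"
)

// claimWithRetry is a hypothetical helper showing the approach described above:
// try ClaimPartition up to 3 times, sleeping 1 second whenever the partition is
// still claimed by another instance; success or any other error returns immediately.
func claimWithRetry(instance *kazoo.ConsumergroupInstance, topic string, partition int32) error {
	var err error
	for attempt := 0; attempt < 3; attempt++ {
		err = instance.ClaimPartition(topic, partition) // signature assumed, not taken from this repo's code
		if err != kazoo.ErrPartitionClaimedByOther {
			return err // nil on success, or an unrelated error we do not retry
		}
		time.Sleep(1 * time.Second) // give the previous owner time to release its claim
	}
	return err
}
```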

@wvanbergen
Owner

Ideally, I would like to use zookeeper watches instead of retrying. This way we can just wait until the partition becomes available, and resume execution then. What do you think?

This is the approach I took for my Ruby consumer.
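
For context, a rough sketch of what the watch-based idea could look like in Go, using the samuel/go-zookeeper client directly rather than the kazoo wrapper; the waitForPartitionRelease helper and the owner-node path are assumptions, not code from this repo.

```go
package watchsketch

import (
	"github.com/samuel/go-zookeeper/zk"
)

// waitForPartitionRelease blocks until the given partition-owner znode no longer
// exists, using a ZooKeeper exists-watch instead of a sleep-and-retry loop.
func waitForPartitionRelease(conn *zk.Conn, ownerPath string) error {
	for {
		exists, _, events, err := conn.ExistsW(ownerPath) // sets a one-shot watch on ownerPath
		if err != nil {
			return err
		}
		if !exists {
			return nil // nobody owns the partition anymore; safe to try claiming it
		}
		<-events // wait for the owner node to change (e.g. be deleted), then re-check
	}
}
```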

@wvanbergen
Owner

I've decided to go with the watch approach in the reimplementation (see #72). For now I'll merge this as a nice addition for the current version 👍

wvanbergen added a commit that referenced this pull request on Aug 17, 2015
Retry claiming partitions if the partition is already claimed. See #62
wvanbergen merged commit 7c09a42 into wvanbergen:master on Aug 17, 2015
caihua-yin pushed a commit to caihua-yin/kafka that referenced this pull request on Apr 19, 2016
This is a complementary fix for wvanbergen#68 (issue: wvanbergen#62),
to bridge the gap until the re-implementation (wvanbergen#72) is ready.

In my use case the message-consuming logic is sometimes slow, so even with
the 3 retries from pull #68 it is still easy to hit issue #62. Further
checking of the current logic in consumer_group.go:partitionConsumer() shows
that it may take as long as cg.config.Offsets.ProcessingTimeout before
ReleasePartition is called and the partition can be claimed by a new consumer
during a rebalance. So the maximum retry time is simply set to
cg.config.Offsets.ProcessingTimeout, which is 60s by default.

Verified the system including this fix under frequent rebalance operations;
the issue does not occur again.
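
A sketch of what bounding the retries by the processing timeout could look like, following the same hypothetical helper style as the earlier sketch; names and signatures are assumptions, not the actual commit.

```go
package retrysketch

import (
	"time"

	kazoo "github.com/wvanbergen/kazoo-go"
)

// claimWithinProcessingTimeout keeps retrying ClaimPartition for up to
// processingTimeout (cg.config.Offsets.ProcessingTimeout, 60s by default)
// instead of a fixed 3 attempts, sleeping 1 second between attempts.
func claimWithinProcessingTimeout(instance *kazoo.ConsumergroupInstance, topic string, partition int32, processingTimeout time.Duration) error {
	deadline := time.Now().Add(processingTimeout)
	var err error
	for {
		err = instance.ClaimPartition(topic, partition)
		if err != kazoo.ErrPartitionClaimedByOther || time.Now().After(deadline) {
			return err // success, an unrelated error, or we ran out of time
		}
		time.Sleep(1 * time.Second) // the previous owner may still be finishing its in-flight messages
	}
}
```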
Successfully merging this pull request may close these issues:
- Race condition in partition rebalance
- Panic: integer divide by zero in dividePartitionsBetweenConsumers