The on_assign callback for subscribe provides an empty partitions list #133
Comments
Setting group.id to None does not actually null it; it is converted to the string "None", which is a valid consumer group name. This is a bug in the Python wrapper. The empty assignment you see initially is probably a consequence of this: you might have multiple clients using this "None" group and not enough partitions for all joined consumers. Sounds plausible?
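To illustrate the pitfall, here is a minimal Python sketch of how stringifying every config value turns None into a usable group name. The helper naive_stringify is hypothetical and is not the wrapper's actual code; it only mimics the behaviour described above.

```python
def naive_stringify(conf):
    """Hypothetical helper: convert every config value with str()."""
    return {key: str(value) for key, value in conf.items()}


conf = {"group.id": None, "enable.auto.commit": False}
converted = naive_stringify(conf)

# None has become the four-character string "None", a perfectly valid group name.
print(repr(converted["group.id"]))  # prints 'None'
```

Every client configured this way would end up in the same group, which is consistent with some of them receiving no partitions.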
Yes, that is completely plausible - it would also explain why I've been unable to reproduce it in a dev setting (because I've only ever used a single client in dev). But, the restriction on
The easiest approach is to disable auto commits and use a unique group.id for each client. This gives you a short-lived consumer group with only one consumer.
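A minimal sketch of that suggestion, assuming the confluent_kafka Python client. The helper names, the "standalone" prefix, and the localhost:9092 broker address are placeholders for illustration and are not from this thread.

```python
import uuid


def unique_group_config(bootstrap_servers, prefix="standalone"):
    """Build a consumer config with a per-client group.id and auto commit off."""
    return {
        "bootstrap.servers": bootstrap_servers,
        # A random suffix keeps every client in its own single-member group.
        "group.id": f"{prefix}-{uuid.uuid4()}",
        "enable.auto.commit": False,
    }


def make_consumer(conf):
    """Create the consumer; imported lazily so the config helper works standalone."""
    from confluent_kafka import Consumer
    return Consumer(conf)


# Two clients built this way never share a group, so neither starves the other.
conf_a = unique_group_config("localhost:9092")
conf_b = unique_group_config("localhost:9092")
```

Because no offsets are committed, the throwaway groups should leave no committed offsets behind on the broker.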
Actually, a couple follow-up questions: Eventually, all of my clients get an Also, you mentioned that |
Hard to say what is going on there since I don't have a clear picture of your setup, but if you want to see what's going on behind the scenes I suggest you enable

subscribe() starts the high-level balanced consumer, joins the configured group.id, and waits for a partition assignment based on its subscribed topics. When an assignment is received it either calls assign() automatically or lets you do it in the on_assign callback.

assign() is the lower-level consumer that actually starts consuming the given set of partitions (after stopping any previous consumption). The group.id has no relevance for this call, since all it operates on is a definitive set of partitions that comes from somewhere (high-level consumer assignment, user manual assignment, etc.). You are free to call assign() at any time, more or less, to replace the current set of partitions being consumed.

Having said that, a group.id must still be configured for assign() to function, but this is due to internal implementation details. Unless you join the group (subscribe()) or commit offsets, the group.id is not actually used.
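The relationship between subscribe(), on_assign and assign() can be sketched as below for a client that tracks its own offsets. This is only an illustration: the stored dictionary, the helper names, and the FakeConsumer stand-in are assumptions, and the partition objects are stand-ins with the same topic, partition and offset attributes as confluent_kafka's TopicPartition.

```python
from types import SimpleNamespace


def apply_stored_offsets(partitions, stored):
    """Set .offset on each partition that has an entry in `stored`."""
    for p in partitions:
        key = (p.topic, p.partition)
        if key in stored:
            p.offset = stored[key]
    return partitions


def make_on_assign(stored):
    """Return a callback with the (consumer, partitions) signature."""
    def on_assign(consumer, partitions):
        # Start consuming exactly this set, at our own tracked offsets.
        consumer.assign(apply_stored_offsets(partitions, stored))
    return on_assign


class FakeConsumer:
    """Stand-in for demonstration only; records what assign() received."""
    def __init__(self):
        self.assigned = None

    def assign(self, partitions):
        self.assigned = partitions


stored = {("events", 0): 42}  # hypothetical custom offset store
parts = [
    SimpleNamespace(topic="events", partition=0, offset=-1001),
    SimpleNamespace(topic="events", partition=1, offset=-1001),
]
fake = FakeConsumer()
make_on_assign(stored)(fake, parts)
```

With a real consumer the callback would be passed as consumer.subscribe(["events"], on_assign=make_on_assign(stored)), where "events" is a placeholder topic name.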
None conf values are now converted to NULL (closes #133)
We do custom partition/offset tracking, so we run our consumers with enable.auto.commit=False and group.id=None and we call Consumer.subscribe with a callback set for on_assign. Usually, our initial call to subscribe results in an empty partitions parameter to the on_assign callback. We added retry logic around our subscribe calls, and we generally get a valid partitions list in on_assign within a minute or so.

From this, it seems like initially the Consumer isn't really ready to field subscribe calls with on_assign but eventually becomes ready (perhaps related to initial metadata requests running in the background?). Is there some way for users to determine this readiness and delay calls to subscribe? Our retries are a bit hacky and I'd like to remove them.

Or, is our usage horribly wrong, requiring a new approach to our offset management?