Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ISSUE-525 / 764: ruby-kafka does not support different topic subscriptions in the same consumer group #903

Merged
merged 3 commits into from
Jul 26, 2021

Conversation

eduardopoleoflipp
Copy link

There are 2 issues that currently prevent supporting this feature:

  1. The current RoundRobinAssignmentStrategy assumes all subscriptions are equal

We need a more generalized algorithm to perform assignments now that we have different subscriptions across the consumer group. This PR introduces a new additional strategy heavily inspired in the java kafka client RoundRobinAssignor that properly handle this case.

  1. The leader consumer thread assumes that all consumers subscription are equivalent

If different consumers were to have different topic subscription we can't rely on the in-memory @topics variable to produce correct assignments. Instead, we can leverage the subscription information provided by the cluster, and stored in@members, to determine the correct consumer subscriptions for a particular consumer group. Once we have the correct list of topics the downstream code in the assignor can then fetch the corresponding partitions from the cluster in preparation for the assignments.

@eduardopoleoflipp eduardopoleoflipp force-pushed the multiple_topic_subscriptions branch 2 times, most recently from 4c68314 to e3b681c Compare June 8, 2021 20:07
@mrnonz
Copy link

mrnonz commented Jun 17, 2021

Can't wait!!

@dasch
Copy link
Collaborator

dasch commented Jun 22, 2021

Would there be a reason to not always use the new assignment strategy? I'd prefer to have fewer options to confuse people.

@dasch
Copy link
Collaborator

dasch commented Jun 22, 2021

Looks good! I'd like to see if we could perhaps merge this with the existing strategy rather than adding a new one, if this is purely additional in terms of functionality and reliability.

@eduardopoleoflipp
Copy link
Author

I just wanted to be cautious since I'm not all too familiarized with the project but yes, I can merge this with the original strategy. It should be easy enough since this already passes all the original strategy unit tests. Thanks for taking time to review.

@eduardopoleoflipp eduardopoleoflipp force-pushed the multiple_topic_subscriptions branch 3 times, most recently from c8b2266 to d82f4b7 Compare July 15, 2021 18:39
@eduardopoleoflipp
Copy link
Author

Hey @dasch ! I incorporated your suggestions. There are a few failing tests but I have a strong suspicious that these are flaky tests. There are all of the type of:

Kafka::ConnectionError:
       Could not connect to any of the seed brokers:
       - kafka://localhost:9092: Connection error IOError: stream closed in another thread
      
Kafka::UnknownError:
       Kafka::UnknownError

Kafka::DeliveryFailed:
       Could not connect to any of the seed brokers:
       - kafka://localhost:9092: Connection error IOError: stream closed in another thread

I've pushed already 3 times and I'm always getting similar errors but for different specs. Also I find it hard to believe that my changes would break the connection to the broker.

@dasch
Copy link
Collaborator

dasch commented Jul 26, 2021

Yeah, unfortunately the test suite is really flake – I would love to have someone improve that :D

Copy link
Collaborator

@dasch dasch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good! Can you add an entry to the changelog?

Eduardo Poleo added 2 commits July 26, 2021 09:53
The existing assignment strategy assumes identical subscriptions
among consumers within the same consumer group. We need a more
general implementation that accounts for different topic subscriptions
to be able to perform correct assignments.The new implementation presented
here is heavily inspired by the Kafka java client RoundRobinAssignor
https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/consumer/RoundRobinAssignor.java
…iption information

The leader consumer thread needs to have knowledge about
all siblings consumer subscriptions in order to be perform
the correct partition assignments.
For consumers with different subscription this is not possible
through the local `@topics`. A way to solve this is by obtaining
this info from the cluster itself through the `@members` var.
@eduardopoleoflipp
Copy link
Author

Great! just added the entry to the changelog. Thank you so much!

@dasch
Copy link
Collaborator

dasch commented Jul 26, 2021

Awesome! I'd like to cut an alpha release and have you test that out in production, if that's okay?

@dasch dasch merged commit 7aadb07 into zendesk:master Jul 26, 2021
@eduardopoleoflipp
Copy link
Author

Great yes of course, we will!

@eduardopoleoflipp
Copy link
Author

@dasch I'll ping you on the dev slack channel once we've ascertained that everything is looking good on our system with the new changes.

@dasch
Copy link
Collaborator

dasch commented Jul 27, 2021

Great!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants