Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rewrite client to support high volume of topics #234

Merged
merged 29 commits into from
Nov 27, 2018
Merged

Rewrite client to support high volume of topics #234

merged 29 commits into from
Nov 27, 2018

Conversation

mnorkin
Copy link
Contributor

@mnorkin mnorkin commented Nov 20, 2018

Implement high performant metadata read from kafka cluster.

Previous implementation looped over all the topics, which it received from kafka cluster. Current implementation fetches only metadata by topic, so no loop over all topics is required.

  • Rewrote couple of tests to reduce dependencies between tests
  • Added more test to prove single instance multi topic publish/ consume/ group consume
  • Wrote a kafka testkit to dynamically create topics. Now there is no need to download, run kafka and create topics for tests to work - just npm run test
  • No API changes to the client

@oleksiyk
Copy link
Owner

It looks like its a breaking change as current version allows GroupConsumer to subscribe to multiple topics (via subscriptions array).

@mnorkin
Copy link
Contributor Author

mnorkin commented Nov 20, 2018

@oleksiyk sorry, was not sure about that than I was refactoring. Will update tomorrow.

Should I add test for GroupConsumer to check for multiple topics?

@oleksiyk
Copy link
Owner

@mnorkin Well actually it also breaks the Producer too, which allows to send messages to any topic.

@mnorkin
Copy link
Contributor Author

mnorkin commented Nov 21, 2018

@oleksiyk I've rewritten the client to not break API. I'll make some performance comparisons for high number of topics to provide some evidence why this PR is important.

@oleksiyk
Copy link
Owner

I'll make some performance comparisons for high number of topics to provide some evidence why this PR is important.

I do believe there is significant difference. However there must be tests added that verify that this PR really supports producing and consuming from multiple topics within the same instance of Producer, SimpleConsumer or GroupConsumer. Otherwise I won't be able to merge it.

@mnorkin
Copy link
Contributor Author

mnorkin commented Nov 21, 2018

I'm adding more tests

@mnorkin
Copy link
Contributor Author

mnorkin commented Nov 23, 2018

@oleksiyk updated the description, build is green 🎉

@@ -196,7 +201,7 @@ GroupConsumer.prototype._fullRejoin = function () {
});
})
.catch(function (err) {
self.client.error('Full rejoin attempt failed:', err);
self.client.error('Full rejoin attempt failed:', JSON.stringify(err));
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mnorkin why do you stringily error here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed

@mnorkin
Copy link
Contributor Author

mnorkin commented Nov 26, 2018

@oleksiyk any updates?

@oleksiyk
Copy link
Owner

@mnorkin Give me some time to check and try it, it is a big update and I don't want to break any existing code relying on this library.

buffer = self.protocol.write().ListGroupsRequest({
correlationId: correlationId,
clientId: self.options.clientId
}).result;

return Promise.map(_.values(self.brokerConnections), function (connection) {
return Promise.map(_.values(self.initialBrokers), function (connection) {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mnorkin Not sure if it can happen that brokerConnection won't include all of the initialBrokers? I'm asking because ListGroupsrequest must be sent to ALL brokers, or you won't receive all groups.

Copy link
Contributor Author

@mnorkin mnorkin Nov 27, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@oleksiyk How I understood, this command is executed only by groupAdmin and it does not contain any information about the topics, so no topology query here, hence only initialBrokers is present. It works if you know the the topology before hand, it does not work in cause you have more brokers, than initially defined.

However I believe this implementation is better than having all the topology read from the all brokers and freeze your application in case you have 100k topics

@oleksiyk oleksiyk merged commit 0a3072c into oleksiyk:master Nov 27, 2018
oleksiyk added a commit that referenced this pull request Nov 27, 2018
oleksiyk added a commit that referenced this pull request Nov 27, 2018
@oleksiyk
Copy link
Owner

@mnorkin Can you please make new PR, I accidentally merged this one while thinking I'm looking at other project page. Sorry about this.

@oleksiyk
Copy link
Owner

And can you please revert the ability to run tests locally, without Doker.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants