
Group and Txn metadata topics should be queried directly from the controller #7716

Closed

Conversation

viktorsomogyi
Contributor

GroupMetadataManager and TransactionStateManager should use a direct broker-to-controller channel to query the number of partitions instead of relying on ZooKeeper.

This change introduces a new class that always sends the request to the active controller. If the cached controller isn't available or is no longer the controller, it closes the connection and refreshes its view from the local metadataCache until it finds the active controller.

BrokerToControllerMetadataManager manages the request queue that is consumed by the request thread and also controls its lifecycle. The thread is created lazily, so it isn't started before there is an actual need for it. The public methods of this class are supposed to implement the high-level functions queried by various classes (in this case GroupMetadataManager and TransactionStateManager) and return a KafkaFuture, so that users of this class can work asynchronously over the blocking connection that BrokerToControllerRequestThread implements.
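To make that shape concrete, here is a minimal, hypothetical sketch of the queue / lazy-thread / KafkaFuture pattern described above. It is not the PR's code: `BrokerToControllerChannelSketch`, `PartitionCountRequest` and the `queryController` parameter are invented stand-ins for the real networking and controller-refresh logic.

```scala
import java.util.concurrent.LinkedBlockingQueue

import org.apache.kafka.common.KafkaFuture
import org.apache.kafka.common.internals.KafkaFutureImpl

// Hypothetical request item: the topic to look up plus the future the caller waits on.
case class PartitionCountRequest(topic: String, future: KafkaFutureImpl[Integer])

// Sketch only: `queryController` stands in for the blocking broker-to-controller round trip.
class BrokerToControllerChannelSketch(queryController: String => Int) {

  private val requestQueue = new LinkedBlockingQueue[PartitionCountRequest]()

  // Lazily created request thread: nothing is started until the first request arrives.
  private lazy val requestThread: Thread = {
    val runnable = new Runnable {
      override def run(): Unit = {
        while (!Thread.currentThread().isInterrupted) {
          val req = requestQueue.take()
          try req.future.complete(Int.box(queryController(req.topic)))
          catch { case e: Exception => req.future.completeExceptionally(e) }
        }
      }
    }
    val t = new Thread(runnable, "broker-to-controller-sketch")
    t.setDaemon(true)
    t.start()
    t
  }

  // Public API: enqueue the request and return a future so callers such as
  // GroupMetadataManager can work asynchronously over the blocking connection.
  def getPartitionCount(topic: String): KafkaFuture[Integer] = {
    val future = new KafkaFutureImpl[Integer]()
    requestThread // touch the lazy val so the thread exists before enqueueing
    requestQueue.put(PartitionCountRequest(topic, future))
    future
  }
}
```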

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

@viktorsomogyi
Contributor Author

@dhruvilshah3, @cmccabe, @ijuma I've published the broker-to-controller channel I've been working on. There are still a few things that I'd like to do (mainly on the testing side), hence the PR is a draft for now, but I wanted to get your opinion on whether this approach is in sync with what you're thinking of. (And if the implementation is in line with your thinking, we could get into a deeper review.)

@ijuma
Contributor

ijuma commented Nov 20, 2019

InterBrokerSendThread was meant to be a building block for this kind of thing. Having said that, was a decision made that we should make the call to the Controller for this instead of relying on the metadata cache? There are some issues with the latter and maybe this is the right approach as we move to a pull model, but I didn't see much discussion. cc @cmccabe @hachikuji

@viktorsomogyi
Contributor Author

Yea, I was looking at InterBrokerSendThread but I thought we'd be OK with a less complex solution. For instance, I think it's enough to use one queue instead of one per node, as that makes handling controller failovers easier. Also, a blocking call might be enough, similar to the controller-to-broker communication. Although if you think it would be better to do this with InterBrokerSendThread, I'm fine with rewriting that part (I don't expect it to add too much overhead).
Regarding the pull vs. metadata cache decision: I was inferring this from KIP-500's "New Controller APIs" section, as it says that in some cases we'd need a new API to replace an operation that was formerly done via ZooKeeper. It brings up ISR alteration as an example, but I think it will need to be applied to other protocols as well, such as broker bootstrapping and registration, log dir failure handling, and producer ID management. I can write a KIP too if you think this should be discussed more elaborately.

@viktorsomogyi
Contributor Author

Had a chat with @satishd yesterday and it seems like continuing with InterBrokerSendThread would indeed be better. Will update this PR with the change using IBST.

@viktorsomogyi
Contributor Author

retest this please

@viktorsomogyi viktorsomogyi marked this pull request as ready for review February 27, 2020 13:53
@viktorsomogyi
Contributor Author

@ijuma @cmccabe @hachikuji I rebased this and brought it up to date a bit. Would you please review it? Does this need more elaborate discussion, such as a KIP?

@viktorsomogyi
Contributor Author

retest this please

@abbccdda
Contributor

If this change adds or mutates JMX metrics, I think we should do a KIP.

@viktorsomogyi
Contributor Author

@abbccdda it shouldn't modify JMX metrics

   */
  private def getGroupMetadataTopicPartitionCount: Int = {
-    zkClient.getTopicPartitionCount(Topic.GROUP_METADATA_TOPIC_NAME).getOrElse(config.offsetsTopicNumPartitions)
+    controllerChannel.getPartitionCount(Topic.GROUP_METADATA_TOPIC_NAME).get
Contributor Author

This one needs to be more robust; right now it's prone to timeout-related errors. Working on this.
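One possible way to harden this, sketched here purely as an assumption rather than the PR's actual fix: bound the blocking `get` with a timeout and fall back to the configured partition count on failure. Both the idea that `getPartitionCount` returns a KafkaFuture and the use of `config.requestTimeoutMs` as the bound are assumptions for illustration.

```scala
import java.util.concurrent.TimeUnit
import scala.util.control.NonFatal

// Hypothetical hardening sketch within GroupMetadataManager; not the PR's final code.
private def getGroupMetadataTopicPartitionCount: Int = {
  try {
    // Assumes getPartitionCount returns a KafkaFuture; bound the blocking wait.
    controllerChannel.getPartitionCount(Topic.GROUP_METADATA_TOPIC_NAME)
      .get(config.requestTimeoutMs.longValue, TimeUnit.MILLISECONDS)
      .intValue
  } catch {
    case NonFatal(_) =>
      // Fall back to the configured default if the controller cannot be reached in time.
      config.offsetsTopicNumPartitions
  }
}
```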

@abbccdda left a comment
Contributor

Thanks for the PR. I have a high-level question: does the change take the controller broker's version into account, i.e. could the targeted controller answer the topic metadata request in all scenarios?

@@ -169,6 +169,8 @@ class KafkaServer(val config: KafkaConfig, time: Time = Time.SYSTEM, threadNameP
   var metadataCache: MetadataCache = null
   var quotaManagers: QuotaFactory.QuotaManagers = null
 
+  var controllerChannel: BrokerToControllerChannelManager = _
Contributor

What's the benefit of initializing with the default value here vs starting from null?
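For reference, a tiny sketch of what the two initialization styles mean in Scala (the class and field names here are hypothetical): `= _` assigns the type's default value, which for a reference type is null, so the two forms behave the same at runtime.

```scala
class InitSketch {
  var channelA: Object = _      // default-value initializer: null for reference (AnyRef) types
  var channelB: Object = null   // explicit null: equivalent at runtime for reference types
}
```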

      }
    }
  } catch {
    case e: Exception =>
Contributor

Should we be more strict here about exception handling? I don't think we should continue on every possible exception; let's brainstorm :)

Contributor Author

Of course, let's brainstorm! :)

My main goal with this was that ideally we should catch network-related exceptions when the request fails due to disconnect events. Do you have specific ideas about what to catch here?

Contributor

Yea, I agree. We could just stay with this approach as long as we don't swallow any other fatal exceptions.

Contributor Author

Tomorrow I'll write some test cases for this covering some specific scenarios.
I think, though, that we should usually try to handle most exceptions and reconnect to a controller when possible. On the other hand, we likely don't want to catch any non-Exception Throwables, as those are usually more serious cases (for instance OOM).
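A minimal sketch of that intent (the helper names are hypothetical placeholders, not the PR's code): catch Exceptions so the thread can refresh the controller and retry, while letting non-Exception Throwables such as OutOfMemoryError propagate.

```scala
// Hypothetical request-thread work loop; sendRequestToController and
// refreshActiveController are invented stand-ins for the PR's networking code.
object RequestLoopSketch {
  def sendRequestToController(): Unit = ()  // placeholder: may throw on disconnect
  def refreshActiveController(): Unit = ()  // placeholder: re-resolve the controller from metadata

  def doWork(): Unit = {
    try {
      sendRequestToController()
    } catch {
      case _: Exception =>
        // Recoverable failure (e.g. a disconnect): refresh the cached controller and retry later.
        refreshActiveController()
      // Non-Exception Throwables (e.g. OutOfMemoryError) are deliberately not caught here.
    }
  }
}
```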

@abbccdda
Contributor

abbccdda commented Mar 3, 2020

@hachikuji @cmccabe Could you also take a look?

@viktorsomogyi
Contributor Author

@hachikuji, @cmccabe I rebased my solution. Would you please take a look and review it, and suggest whether it's fine, whether it needs more tests, or how we can proceed with this?

@viktorsomogyi
Contributor Author

Hey folks, I'm closing this PR due to the lack of interest. If anyone is interested, please feel free to pick up the related Jira.
