KafkaSinkCluster: rack aware routing for fetch requests #1637

Merged
rukai merged 2 commits into main from kafka_rack_aware_routing
May 28, 2024

Conversation

rukai
Member

@rukai rukai commented May 27, 2024

Progress towards: #1526

With this PR, shotover should now route all requests to their correct rack.
To catch misconfigurations, or even shotover bugs, a shotover_out_of_rack_requests_count metric is introduced to count out-of-rack requests.
This matches the shotover_out_of_rack_requests_count metric used in CassandraSinkCluster.

Fetch requests can be routed to any replica of the partition; some replicas are in shotover's rack and some are outside it.
Previously, we just sent fetch requests to a random replica.
With this PR, we always send to a replica within shotover's rack, unless no such replica exists, in which case we fall back to any replica at all.
To make this routing cheap to perform at runtime, shotover's stored list of partition replica nodes is split into shotover_rack_replica_nodes and external_rack_replica_nodes fields.
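
A minimal sketch of the fetch routing described above, not the code from this PR. Only the field names shotover_rack_replica_nodes and external_rack_replica_nodes come from the description; the BrokerId alias, the split_by_rack and route_fetch helpers, the pick parameter (standing in for a random number), and the AtomicU64 counter (standing in for the shotover_out_of_rack_requests_count metric) are assumptions made for illustration. Treating a replica with no rack as external is also an assumption.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical stand-in for a broker id; the real type in shotover may differ.
type BrokerId = i32;

/// Replicas of one partition, split by rack once when metadata is stored
/// so that routing does not need to compare racks on every request.
struct PartitionReplicas {
    shotover_rack_replica_nodes: Vec<BrokerId>,
    external_rack_replica_nodes: Vec<BrokerId>,
}

/// Split a replica list by rack. Replicas without a rack are treated as
/// external here, which is an assumption of this sketch.
fn split_by_rack(replicas: &[(BrokerId, Option<&str>)], shotover_rack: &str) -> PartitionReplicas {
    let mut split = PartitionReplicas {
        shotover_rack_replica_nodes: Vec::new(),
        external_rack_replica_nodes: Vec::new(),
    };
    for (id, rack) in replicas {
        if *rack == Some(shotover_rack) {
            split.shotover_rack_replica_nodes.push(*id);
        } else {
            split.external_rack_replica_nodes.push(*id);
        }
    }
    split
}

/// Choose the destination for a fetch request.
/// `pick` is an arbitrary number (e.g. from an RNG) used to spread load.
/// `out_of_rack_requests` is incremented only when the fallback is taken.
fn route_fetch(
    replicas: &PartitionReplicas,
    pick: usize,
    out_of_rack_requests: &AtomicU64,
) -> Option<BrokerId> {
    let in_rack = &replicas.shotover_rack_replica_nodes;
    let external = &replicas.external_rack_replica_nodes;
    if !in_rack.is_empty() {
        // Preferred path: stay inside shotover's rack.
        Some(in_rack[pick % in_rack.len()])
    } else if !external.is_empty() {
        // Fallback: no in-rack replica exists, so any replica will do.
        out_of_rack_requests.fetch_add(1, Ordering::Relaxed);
        Some(external[pick % external.len()])
    } else {
        // No replicas known for this partition.
        None
    }
}
```

Because the split happens when the replica list is stored, the per-request work in this sketch is an emptiness check and an index operation, which is the kind of cheap runtime routing the description is aiming for.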

For all other request types, there is only one possible destination.
For these request types, shotover modifies the metadata response so that the client sends requests to the shotover node in the same rack as the destination, ensuring that no cross-rack routing occurs.
For example, MetadataResponse::controller_id is set to the shotover node in the rack of the controller broker:

// If broker has no rack - use the first shotover node
// If broker has rack - use the first shotover node with the same rack
// This is deterministic because the list of shotover nodes is sorted.
if let Some(shotover_node) = self.shotover_nodes.iter().find(|shotover_node| {
    controller_node
        .rack
        .as_ref()
        .map(|rack| rack == &shotover_node.rack)
        .unwrap_or(true)
}) {
    metadata.controller_id = shotover_node.broker_id;
}
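
For readers who want to try the selection rule in isolation, here is a self-contained sketch of the same lookup. The ShotoverNode and KafkaNode structs and the controller_id_for function are simplified, hypothetical stand-ins (the real shotover types carry more fields and may use different field types); only the find/map/unwrap_or logic mirrors the snippet above.

```rust
/// Hypothetical, simplified shotover node: just an id and a rack.
struct ShotoverNode {
    broker_id: i32,
    rack: String,
}

/// Hypothetical, simplified Kafka broker: the rack is optional.
struct KafkaNode {
    rack: Option<String>,
}

/// Return the broker id of the shotover node that clients should be told is
/// the controller, or None if no shotover node shares the controller's rack.
/// `shotover_nodes` is assumed to be sorted, so the result is deterministic.
fn controller_id_for(controller_node: &KafkaNode, shotover_nodes: &[ShotoverNode]) -> Option<i32> {
    shotover_nodes
        .iter()
        .find(|shotover_node| {
            controller_node
                .rack
                .as_ref()
                // Broker has a rack: require the shotover node to be in it.
                .map(|rack| rack == &shotover_node.rack)
                // Broker has no rack: the first shotover node matches.
                .unwrap_or(true)
        })
        .map(|shotover_node| shotover_node.broker_id)
}
```

With shotover nodes 0 (rack1), 1 (rack2) and 2 (rack2), a controller in rack2 maps to node 1, a controller with no rack maps to node 0, and a controller in a rack with no shotover node yields None, which corresponds to the if let above leaving controller_id untouched.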

TODO in follow up PR:

  • Detect out of rack requests for all other request types and call out_of_rack_requests.increment(1)

@rukai rukai force-pushed the kafka_rack_aware_routing branch from fff1bc9 to 0b7e736 Compare May 27, 2024 01:23

codspeed-hq bot commented May 27, 2024

CodSpeed Performance Report

Merging #1637 will not alter performance

Comparing rukai:kafka_rack_aware_routing (309e5c8) with main (65b280b)

Summary

✅ 37 untouched benchmarks

@rukai rukai marked this pull request as ready for review May 27, 2024 01:53
@rukai rukai requested a review from conorbros May 27, 2024 01:53
@rukai rukai merged commit 56b2922 into shotover:main May 28, 2024
41 checks passed
3 participants