Support for query heartbeat from coordinator that is shutting down #16322
@@ -141,8 +144,10 @@ public void registerQueryHeartbeat(String nodeId, BasicQueryInfo basicQueryInfo)
{
    requireNonNull(nodeId, "nodeId is null");
    requireNonNull(basicQueryInfo, "basicQueryInfo is null");
    Stream<InternalNode> activeOrShuttingDownCoordinators = concat(internalNodeManager.getCoordinators().stream(),
            internalNodeManager.getNodes(SHUTTING_DOWN).stream().filter(InternalNode::isCoordinator));
It might make sense to expose shutting-down coordinators through a separate method on InternalNodeManager, i.e. `getShuttingDownCoordinator` or something similar.
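A rough sketch of what such a helper could look like, for illustration only. `NodeState`, `Node`, `SimpleNodeManager`, and the plural name `getShuttingDownCoordinators` are hypothetical stand-ins, not Presto's actual `InternalNodeManager` API; only the filtering logic is taken from the diff above.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical stand-ins for the node types; not Presto's real classes.
enum NodeState { ACTIVE, SHUTTING_DOWN }

record Node(String id, boolean coordinator, NodeState state) {}

class SimpleNodeManager
{
    private final List<Node> nodes;

    SimpleNodeManager(List<Node> nodes)
    {
        this.nodes = List.copyOf(nodes);
    }

    // All known nodes currently in the given state.
    Set<Node> getNodes(NodeState state)
    {
        return nodes.stream()
                .filter(node -> node.state() == state)
                .collect(Collectors.toSet());
    }

    // Hypothetical helper: the same filter the diff applies inline,
    // moved behind a single named method.
    Set<Node> getShuttingDownCoordinators()
    {
        return getNodes(NodeState.SHUTTING_DOWN).stream()
                .filter(Node::coordinator)
                .collect(Collectors.toSet());
    }
}
```

With a helper like this, the call site would no longer need to filter the `SHUTTING_DOWN` nodes by `isCoordinator` itself.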
Also wondering: what value do we get from this check, where a coordinator sends its heartbeat to the RM and we verify whether it belongs to the known set of nodes? We don't do the same for the node heartbeat. @tdcmeehan Is there a specific reason for this check, or can we remove it entirely?
I don't see any reason other than a scenario where the RM receives a heartbeat from a coordinator that does not belong to the cluster, and I'm not sure whether that is possible. I will wait for Tim to comment.
It's more of a check to ensure that we're not making decisions based on heartbeats that are inconsistent with the discovery service. I would like to eventually add a similar check for node heartbeats.
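To make the check being discussed concrete, here is a minimal sketch under stated assumptions. `HeartbeatValidator`, `ClusterNode`, `State`, and `acceptsQueryHeartbeat` are hypothetical names, and the list passed to the constructor stands in for whatever the discovery service reports. It is not the resource manager's actual implementation, and it says nothing about how a rejected heartbeat is handled.

```java
import java.util.List;
import java.util.stream.Stream;

// Hypothetical, simplified model of the membership check; not Presto's classes.
final class HeartbeatValidator
{
    enum State { ACTIVE, SHUTTING_DOWN }

    record ClusterNode(String id, boolean coordinator, State state) {}

    // Assumed to reflect the nodes currently known through discovery.
    private final List<ClusterNode> discovered;

    HeartbeatValidator(List<ClusterNode> discovered)
    {
        this.discovered = List.copyOf(discovered);
    }

    private Stream<ClusterNode> coordinatorsIn(State state)
    {
        return discovered.stream()
                .filter(ClusterNode::coordinator)
                .filter(node -> node.state() == state);
    }

    // A query heartbeat is accepted only if the sender is a coordinator known
    // to discovery, whether it is active or shutting down.
    boolean acceptsQueryHeartbeat(String nodeId)
    {
        return Stream.concat(coordinatorsIn(State.ACTIVE), coordinatorsIn(State.SHUTTING_DOWN))
                .anyMatch(node -> node.id().equals(nodeId));
    }
}
```

In this sketch a heartbeat from a worker, or from a node that discovery does not know about, is not accepted, while one from a shutting-down coordinator is.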
Currently, a query heartbeat fails to register if it comes from a coordinator node that is shutting down. The fix is to treat a query heartbeat from a shutting-down coordinator as valid.
Test plan - unit test