Tribe client connects directly to client node over transport #16756
Comments
This seems related to some other issues around client nodes connecting to other client nodes: #16815, #3617, #16105. Looking at the linked issue though, it seems that client nodes do not even try to connect to other client nodes, while they should. The problem here seems to be the opposite: a connection attempt that may not be desirable. I am not sure about the proposal. Why shouldn't the tribe node connect to the client nodes that are part of the cluster? I think every node should be able to connect to any other node in the cluster. I understand that the tribe node is already a client of its own, and it doesn't need to connect to other client nodes for operations that involve data, but there are APIs, like the monitoring ones, that do need access to client nodes too. My reasoning goes along with this other comment.
Thanks @javanna, sounds good. It would be nice, though, if we also documented this behavior in the tribe node documentation. It will be helpful for admins out there who have to figure out which ports to open between the tribe node and the other nodes.
Reopening this ticket for a follow-up discussion. One side effect of the current behavior is that the tribe node log file fills up with heaps of exceptions like the one noted at the beginning of this issue. For instance, within a 16-hour period (< 1 day) with just 1 client node in a downstream cluster, the tribe node ends up logging 176 MB of log entries, pretty much filling up the log file with 21K instances of these exception stacks. While we do not intend to change the design whereby the tribe node tries to connect to all nodes in the cluster, it would be helpful if we could move these exceptions (logged when a tribe node attempts to connect to a client node) to the trace level. Thoughts?
@ppf2 do you mean the log line that's part of the description of this issue or some other log line?
Here you go :) We are seeing a ton of these indicating that the tribe node is trying to connect to a client node.
Thanks @ppf2! I am not sure we can change the log level only when the log line comes from a tribe node. That seems like working around the problem. I think every node should have access to all the other nodes instead, including the client ones. This specific log line comes from calling nodes info from the tribe node. The tribe node will gather the info from all the nodes, as simple as that. Another way to work around it would be to not use the tribe node for monitoring calls, or to filter out some of the nodes from this call (e.g. using node attributes).
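To illustrate the node-attribute workaround mentioned above, here is a minimal sketch that builds a nodes info URL restricted to nodes matching an attribute filter. It assumes the `attribute:value` node filter syntax accepted by the nodes APIs in Elasticsearch releases of that era; the helper name `nodes_info_url`, the base URL, and the choice of the `data` attribute are illustrative assumptions, so check the node specification docs for your version before relying on it.

```python
from urllib.parse import quote


def nodes_info_url(base_url, attr_name, attr_value):
    """Build a nodes info URL limited to nodes matching attr_name:attr_value.

    Hypothetical helper: it only assembles the URL and sends no request.
    The `attr:value` filter form is an assumption about the target
    Elasticsearch version.
    """
    # Keep '*' unescaped so wildcard filters can pass through unchanged.
    node_filter = f"{quote(attr_name, safe='*')}:{quote(attr_value, safe='*')}"
    return f"{base_url.rstrip('/')}/_nodes/{node_filter}"


# Example: ask the tribe node only for nodes flagged as data nodes,
# which would leave client nodes out of the call.
url = nodes_info_url("http://localhost:9200", "data", "true")
print(url)
```

Restricting the call this way only avoids the connection attempt for that request; it does not change which nodes the tribe node tries to connect to in general.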
What I previously provided are workarounds, assuming that the firewall config stays the same. But given that we removed support for the
What @javanna said. The tribe node should be able to connect to any node in the clusters it connects to. Agreed that it's confusing with the current way we treat client nodes as clients, but that's what we're changing...
With this issue, are we resolving whether tribe nodes in the federation cluster should connect to each other or not? We have a federation cluster with two tribe nodes, but no API output shows that the tribe nodes have connected to each other. Is this expected behaviour? Also, is there any documentation on scaling tribe nodes?
This is observed in a setup where the tribe node does not have firewall access over the transport port to a client node of the downstream cluster:
The message is benign since there is no reason for the tribe node to connect directly to a downstream cluster's client node. The behavior is likely due to how client nodes work in general: they connect to all nodes in the cluster, and since the tribe node is just a specialized client node, it behaves the same way. Perhaps we can add an exclusion for the tribe node so that it does not attempt to connect to the client nodes in the downstream clusters.
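For admins trying to confirm whether a firewall is what blocks the tribe node from a given node, a plain TCP reachability check against the transport port can help. The sketch below is not part of Elasticsearch; the function name `transport_port_reachable` and the example host name are hypothetical, and it assumes the default transport port 9300 (adjust if `transport.tcp.port` is set differently).

```python
import socket


def transport_port_reachable(host, port=9300, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened.

    This only tests network reachability (e.g. firewall rules); it does not
    speak the Elasticsearch transport protocol.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False


if __name__ == "__main__":
    # Hypothetical client node address; run this from the tribe node host.
    print(transport_port_reachable("client-node.example.internal"))
```

A `False` result from the tribe node host, combined with a `True` result from a host inside the downstream cluster, would point at the firewall rather than at the client node itself.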