The current load balancing in the ClusterClient is a very naive round-robin. Every API call is simply given the next connection in the list, and if that node is down, the next one in the list is used for the retry. No connection is ever removed from the list, either temporarily or permanently. A single synchronized index variable, shared by all callers, determines which connection to use.
In a multi-threaded application, an API call can fail even when there are plenty of valid connections. Because the index is shared, thread interleaving can cause a single thread to be handed the same "bad" node on every retry until its number of retries is exhausted.
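To make the failure mode concrete, here is a minimal sketch that replays one unlucky interleaving deterministically in a single thread. It is not the ClusterClient code: the `RoundRobin` class, the node names, and the `callerASucceeds` helper are all hypothetical stand-ins, assuming only what is described above (a connection list that never shrinks and one synchronized index shared by every caller).

```java
import java.util.Arrays;
import java.util.List;

public class RoundRobinInterleaving {

    /** Hypothetical stand-in for the connection list and its shared index. */
    static final class RoundRobin {
        private final List<String> nodes;
        private int index = 0;

        RoundRobin(List<String> nodes) {
            this.nodes = nodes;
        }

        // Each call is atomic, but the index is shared by all callers,
        // and a node is never removed from the list.
        synchronized String next() {
            String node = nodes.get(index);
            index = (index + 1) % nodes.size();
            return node;
        }
    }

    /**
     * Replays a fixed interleaving of two callers: caller A takes a
     * connection, then caller B takes one, and so on. Returns true if
     * caller A reaches a node other than downNode within maxAttempts.
     */
    static boolean callerASucceeds(RoundRobin rr, String downNode, int maxAttempts) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            String nodeForA = rr.next();
            if (!nodeForA.equals(downNode)) {
                return true;
            }
            // Caller B's request runs between A's attempts and consumes
            // the next slot, which here is the healthy node.
            rr.next();
        }
        return false;
    }

    public static void main(String[] args) {
        RoundRobin rr = new RoundRobin(Arrays.asList("node-down", "node-up"));
        boolean ok = callerASucceeds(rr, "node-down", 3);
        System.out.println("caller A succeeded: " + ok); // prints "caller A succeeded: false"
    }
}
```

With two nodes and two callers in lockstep, caller A lands on `node-down` on every attempt while caller B always gets `node-up`. Real scheduling is not this regular, but nothing in the shared-index design prevents such a run.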
Update on current state: Attempting to fix this unearths a number of issues. The work required is substantial: a significant amount of the underlying code, down to the original clients, would need to be redesigned or refactored. At this point the best suggestion is to use an external load balancer rather than the ClusterClient.