Description
The tracking issue for the Java side PR.
Prevent client ending up with two addresses to same node when node is behind a private network
See hazelcast/hazelcast#17844 for details.
Background:
The getOrConnect method tries to open a connection to an
address only once. It should never open a second connection
to a member. To achieve that, we use the remote address
of a connection as the source of truth. For members, the remote
address is the one in the member list.
If the member is using private addresses on a closed network,
the client cannot use that address in its config to connect to that member.
So the client side ends up with two addresses that identify a single member.
How the bug happens:
When a client is already connected to members, all connections
can be dropped temporarily. In that case, we trigger the
doConnectToCandidateCluster logic. It connects to a random address
provided in the config, which can be the public address of the member.
In the meantime, a periodic task, ConnectToAllClusterMembersTask,
tries to connect to all members in the latest known member list.
This task uses the remote address, which is the private address.
Both of them use getOrConnect, so doConnectToCandidateCluster
can open a second connection while a first one is already there.
Note that some of our other services rely on the single connection
per member logic.
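The failure mode can be reproduced with a deliberately simplified model. This is only a sketch, not the actual client code: the class AddressKeyedConnections, its fields, and both IP addresses are made up for illustration. It assumes that the lookup uses the address the caller asked for, while the connection is registered under the remote address reported by the member.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical, simplified model of address-keyed connection tracking.
// It is NOT the real Hazelcast client implementation.
class AddressKeyedConnections {
    // remote address reported by the member -> connection id
    private final Map<String, String> active = new ConcurrentHashMap<>();
    private int opened = 0;

    // Looks up by the address we were asked to connect to, but registers the
    // connection under the remote address the member reports about itself.
    String getOrConnect(String target, String remoteAddressReportedByMember) {
        String existing = active.get(target);
        if (existing != null) {
            return existing;
        }
        String connection = "conn-" + (++opened);
        active.put(remoteAddressReportedByMember, connection);
        return connection;
    }

    int openedCount() {
        return opened;
    }

    public static void main(String[] args) {
        String privateAddress = "10.0.0.5:5701";   // address in the member list (made up)
        String publicAddress = "203.0.113.7:5701"; // address in the client config (made up)

        AddressKeyedConnections connections = new AddressKeyedConnections();
        // Periodic task path: connects through the member-list (private) address.
        connections.getOrConnect(privateAddress, privateAddress);
        // Reconnect path: connects through the configured (public) address.
        // The lookup misses, so a second connection to the same member is opened.
        connections.getOrConnect(publicAddress, privateAddress);

        System.out.println("connections opened for one member: " + connections.openedCount());
    }
}
```

In this model the second call cannot find the existing connection because the public address never appears as a key, which is the mismatch described above.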
How to fix:
uuid will be used as the source of truth.
- We cannot know the uuid of a connection before opening the
connection. So we will allow opening connections in parallel from
the doConnectToCandidateCluster path. As soon as a response is returned
from the server (with the uuid), we check whether a connection to that
member is already there and, if so, close the new connection before it
is put in activeConnection.
- The periodic ConnectToAllClusterMembersTask will not try to connect to
an already connected member. It will use the uuid to decide that.
- These two paths will no longer share the getOrConnect method, to
simplify the logic. One will be based on the address and the other only
on the uuid, to make the intent clear.
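The three points above could be sketched roughly as follows. This is an illustration under assumptions, not the Java PR's code: UuidKeyedConnections, Connection, connectByAddress and connectToMember are hypothetical names, and the uuid that would normally come from the server's authentication response is passed in as a parameter to keep the example self-contained.

```java
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical sketch of uuid-keyed connection tracking; all names are assumptions.
class UuidKeyedConnections {
    static final class Connection {
        final String target;
        boolean closed;

        Connection(String target) {
            this.target = target;
        }

        void close() {
            closed = true;
        }
    }

    // member uuid -> the single active connection to that member
    private final ConcurrentMap<UUID, Connection> active = new ConcurrentHashMap<>();

    // Address-based path (reconnect): the uuid is only known after the
    // connection is opened, so duplicates are detected afterwards and closed.
    Connection connectByAddress(String target, UUID uuidFromServerResponse) {
        Connection fresh = new Connection(target);
        Connection existing = active.putIfAbsent(uuidFromServerResponse, fresh);
        if (existing != null) {
            fresh.close(); // a connection to this member already exists
            return existing;
        }
        return fresh;
    }

    // uuid-based path (periodic task): skip members that are already connected.
    Connection connectToMember(UUID memberUuid, String memberAddress) {
        Connection existing = active.get(memberUuid);
        if (existing != null) {
            return existing;
        }
        return connectByAddress(memberAddress, memberUuid);
    }

    int activeCount() {
        return active.size();
    }

    public static void main(String[] args) {
        UuidKeyedConnections connections = new UuidKeyedConnections();
        UUID member = UUID.randomUUID();
        // Both addresses are made up; they stand for the private and public address.
        Connection viaPrivate = connections.connectToMember(member, "10.0.0.5:5701");
        Connection viaPublic = connections.connectByAddress("203.0.113.7:5701", member);
        System.out.println("same connection: " + (viaPrivate == viaPublic));
        System.out.println("active connections: " + connections.activeCount());
    }
}
```

Using putIfAbsent makes the "check and register" step atomic, so two paths that finish authenticating at the same time still leave exactly one connection per uuid; the loser closes its own connection.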
Test:
I could not write a unit test because this requires an inaccessible
private network behind a public address.