[HZ-581] Address issues with hostnames #20014
Merged
ramizdundar
merged 86 commits into
hazelcast:master
from
ufukyilmaz:5.1/hostname-fix-part-1
Jan 11, 2022
ufukyilmaz
added
Team: Core
Source: Internal
PR or issue was opened by an employee
Module: Network I/O
labels
Dec 1, 2021
ufukyilmaz
changed the title
HZ-581 Hostname fixes part 1
[HZ-581] Hostname fixes part 1
Dec 6, 2021
kwart
reviewed
Jan 3, 2022
hazelcast/src/main/java/com/hazelcast/instance/impl/NodeContext.java
kwart
reviewed
Jan 5, 2022
hazelcast/src/main/java/com/hazelcast/internal/cluster/impl/TcpIpJoiner.java
Since it creates so much complexity without much gain.
Because if we close this connection after it starts to be used from one side, we lose some packets on the way.
…part-1 # Conflicts: # hazelcast/src/test/java/com/hazelcast/instance/TestNodeContext.java
ufukyilmaz
changed the title
[HZ-581] Hostname fixes
[HZ-581] Address issues with hostname
Jan 10, 2022
frant-hartm
changed the title
[HZ-581] Address issues with hostname
[HZ-581] Address issues with hostnames
Jan 10, 2022
Previously, the tests assumed that there was only one connection on each plane between the members, so they started to fail when we removed duplicate handling.
vbekiaris
approved these changes
Jan 11, 2022
sancar
approved these changes
Jan 11, 2022
kwart
approved these changes
Jan 11, 2022
Approving to have this in the BETA release.
We are still investigating some failures in the split-brain scenarios (reproducer in https://github.com/hazelcast/hazelcast-qe/tree/hostnames-bouncing/it/hostnames).
1 task
ufukyilmaz
pushed a commit
to ufukyilmaz/hazelcast
that referenced
this pull request
Jan 12, 2022
This reverts commit 16ec195.
6 tasks
ufukyilmaz
pushed a commit
to ufukyilmaz/hazelcast
that referenced
this pull request
Jan 12, 2022
This reverts commit 16ec195.
5 tasks
Labels
Add to Release Notes
Module: Network I/O
Source: Internal
PR or issue was opened by an employee
Team: Core
Type: Defect
This PR includes the connection manager changes that are a
prerequisite for resolving the hostname-related issues.
In this PR I handle the multiple addresses that refer to a member
in the connection manager layer, and I tried to make only the public
address of each member visible at the higher levels of the code,
except for the places where connection initiation uses addresses
taken from the config.
I also tried to keep the scope of this PR within the connection
manager layer and avoided changes that would ripple into other
layers such as the operation service.
The changes in this PR are:
LocalAddressRegistry is introduced to store Address-UUID and UUID-Addresses mappings.
Adjusting the connection managers to these UUID changes.
advanced network is being used, then use the corresponding protocol's
public address) as the remoteAddress of the connection, which previously
exposed the other target addresses outside the connection manager layer
(OperationRunnerImpl gets this remote address of the connection and sets
it as the caller address of the Operation, so they were exposed)
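To make the two mappings concrete, here is a minimal sketch of a registry that keeps Address-UUID and UUID-Addresses lookups in sync. It is an illustration only, not the LocalAddressRegistry added by this PR: the class name AddressRegistrySketch, its method names, and the use of plain strings for addresses are all assumptions.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration; not the LocalAddressRegistry from this PR.
// Addresses are plain "host:port" strings here (assumption).
final class AddressRegistrySketch {
    // Address -> UUID: any known address resolves to the owning member.
    private final Map<String, UUID> addressToUuid = new ConcurrentHashMap<>();
    // UUID -> Addresses: all addresses known to refer to that member.
    private final Map<UUID, Set<String>> uuidToAddresses = new ConcurrentHashMap<>();

    // Registers every given address for the member, updating both mappings together.
    synchronized void register(UUID uuid, String... addresses) {
        Set<String> known = uuidToAddresses.computeIfAbsent(uuid, k -> new LinkedHashSet<>());
        for (String address : addresses) {
            known.add(address);
            addressToUuid.put(address, uuid);
        }
    }

    // Returns the member UUID for an address, or null if the address is unknown.
    UUID uuidOf(String address) {
        return addressToUuid.get(address);
    }

    // Returns a read-only snapshot of the addresses registered for a member.
    synchronized Set<String> addressesOf(UUID uuid) {
        Set<String> known = uuidToAddresses.get(uuid);
        if (known == null) {
            return Collections.emptySet();
        }
        return Collections.unmodifiableSet(new LinkedHashSet<>(known));
    }
}
```

With a structure like this, code above the connection manager can resolve any alias (hostname or IP) to a single member identity, which is the property the description relies on.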
The most difficult part of the PR is determining the lifecycle of the
UUID->Address and Address->UUID entries. We must determine when to remove
these entries from our registry. They must be removed at some point because
they can become stale and because they consume resources.
The events that can leave these entries stale are as follows:
network address) can be changed. These are:
UUID than the previous member despite using the same network addresses
ClusterServiceImpl#reset triggers the UUID change on the member without any
change on the network address to which the member is bound.
This method is only called before the split brain merge happens.
host machine can use the UUID of a crashed member on a different address/es.
These two cases should be considered while determining the lifecycle of these
entries.
For now, UUID-Address entries registration takes place in:
For the members:
For the clients:
UUID-Address entries removal is performed:
For the members:
connections between members, and we must not clear these entries when
one of these connections is closed.)
For the clients:
closed (safe to remove as there is only one active connection).
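The removal rule for members (do not clear the entries while other connections to the same member are still open) can be sketched with a per-UUID connection counter. This is a hedged illustration, not the code in this PR: the class RefCountedRegistrySketch, its callbacks, and the assumption that entries are dropped exactly when the last connection closes are all hypothetical.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.UUID;

// Hypothetical illustration; names and removal policy are assumptions.
final class RefCountedRegistrySketch {
    private final Map<UUID, Set<String>> uuidToAddresses = new HashMap<>();
    private final Map<String, UUID> addressToUuid = new HashMap<>();
    // Number of currently open connections per remote UUID.
    private final Map<UUID, Integer> openConnections = new HashMap<>();

    // Called when a connection to the remote instance is established.
    synchronized void onConnectionOpened(UUID uuid, Set<String> addresses) {
        openConnections.merge(uuid, 1, Integer::sum);
        uuidToAddresses.computeIfAbsent(uuid, k -> new HashSet<>()).addAll(addresses);
        for (String address : addresses) {
            addressToUuid.put(address, uuid);
        }
    }

    // Called when a connection closes; entries are removed only with the last one.
    synchronized void onConnectionClosed(UUID uuid) {
        // Decrement the counter; returning null from the remapping drops the key.
        Integer remaining = openConnections.computeIfPresent(uuid, (k, n) -> n > 1 ? n - 1 : null);
        if (remaining != null) {
            return; // other connections to this member are still open
        }
        Set<String> addresses = uuidToAddresses.remove(uuid);
        if (addresses != null) {
            for (String address : addresses) {
                // Remove only if the address still maps to this UUID, so an entry
                // re-registered under a new UUID is not wiped by a stale close.
                addressToUuid.remove(address, uuid);
            }
        }
    }

    synchronized UUID uuidOf(String address) {
        return addressToUuid.get(address);
    }

    synchronized int size() {
        return addressToUuid.size();
    }
}
```

The conditional remove matters for the stale-entry cases listed above: if a restarted member reuses the same network address with a new UUID, a late close event for the old UUID leaves the new mapping intact.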
In testing:
We need to test the consistency of the entries in this LocalAddressRegistry
(there must be no stale entries inside), and we need to add tests that
guarantee that the entries inside do not leak.
Nightly test run on the latest commit: http://jenkins.hazelcast.com/job/ufuk-nightly-runner/26/
Checklist:
Team:, Type:, Source:, Module:) and Milestone set
Add to Release Notes or Not Release Notes content set
@Nonnull/@Nullable annotations
@since tags in Javadoc