Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ZOOKEEPER-3698: fixing NoRouteToHostException when starting large cluster locally #1228

Closed
wants to merge 1 commit into from

Conversation

symat
Copy link
Contributor

@symat symat commented Jan 20, 2020

When we tested RC 3.6.0, we had a problem of starting ZooKeeper cluster with large
number (11+) of ensemble members locally on mac. We found exceptions in the logs
when the new MultiAddress feature tries to filter the unreachable hosts from the
address list. This involves the calling of the InetAddress.isReachable method with
a default timeout of 500ms, which goes down to a native call in java and basically
try to do a ping (an ICMP echo request) to the host. Naturally, the localhost should
be always reachable.

The problem was that on mac we have the ICMP rate limit set to 250 by default.

In this patch we:

  • changed the reachability check behavior by disabling the check if there is only
    a single address provided (so we wouldn't be able to filter the unreachable
    addresses anyway).
  • added and documented a configuration parameter to disable the reachability check
    for testing. (default: enabled)
  • added and documented a configuration parameter to set the timeout for the
    reachability checks. (default: 1000ms)

Copy link
Contributor

@anmolnar anmolnar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice patch. Just a few comments.


Default value is **true**. By setting this parameter to 'false' you can disable the reachability checks.
Please note, disabling the reachability check will cause the cluster not to be able to reconfigure
itself properly during network problems, so the disabling is advised only during testing.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you please elaborate a little bit on this? Why will the cluster not able to reconfigure if the ping check is disabled?
On the other hand, does that mean that multi-address feature only works properly if hosts respond to ICMP Echo requests?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for checking! :)

The whole purpose of the multi-address feature is to always try to use an address which works. The current implementation is (in case of the leader election) always filters the address list using InetAddress.isReachable() calls to find out which is the working server address. This will cause ICMP calls (or TCP connections on port 7 (Echo) of the destination host), depending on the native implementation (see: https://docs.oracle.com/javase/7/docs/api/java/net/InetAddress.html#isReachable(int) )

So if the InetAddress.isReachable can not reach the host, then the current multi-address feature will not able to take the given address as a working one. Basically right now it can not distinguish between the case of a broken network link (when the whole node is unreachable) and the case of a disabled ICMP (when only the ICMP port and the port 7 is disabled in the firewall of the destination host). I am not an expert in cluster / firewall operation, so I can not tell how serious is this limitation.

One way to improve this could be to implement something like the ruok 4LW command for the server ports. Some simple request-response messages that only shows that the server is alive and listen on the given election / quorum port. Then we could use that instead of the ICMP calls. I think this would be a reasonable improvement, but maybe more like a separate task, out of the scope of 3.6.0.

What do you think?

(also: do you think I should extend the documentation, or you just wanted to elaborate here in the PR?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After thinking a bit more:

One other improvement can be to implement something like how the Learner is doing this right now (if I remember correctly, it basically starts to connect to all known Quorum ports in parallel, then keep the connection which is established first). However, it might be more tricky in case of the Leader Election protocol...

An other way would be just to try to establish a connection to the election addresses one-by-one, and go to the next one if the call fails. It would be slower, but we wouldn't rely on InetAddress.isReachable().

However, in both cases, it can be tricky to detect if the current election address become unavailable. This is an other edge case where we use InetAddress.isReachable(). (this is why we call the SendWorker.asyncValidateIfSocketIsStillReachable())

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@symat Agreed. All suggestions would be nice improvements, but we definitely need separate Jira / patch for it. isReachable() is suitable for now.

Copy link
Contributor

@eolivelli eolivelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.
It is exactly following what we discussed on JIRA

@eolivelli
Copy link
Contributor

@anmolnar are you okay with this patch?

…ster locally

When we tested RC 3.6.0, we had a problem of starting ZooKeeper cluster with large
number (11+) of ensemble members locally on mac. We found exceptions in the logs
when the new MultiAddress feature tries to filter the unreachable hosts from the
address list. This involves the calling of the InetAddress.isReachable method with
a default timeout of 500ms, which goes down to a native call in java and basically
try to do a ping (an ICMP echo request) to the host. Naturally, the localhost should
be always reachable.

The problem was that on mac we have the ICMP rate limit set to 250 by default.

In this patch we:
- changed the reachability check behavior by disabling the check if there is only
a single address provided (so we wouldn't be able to filter the unreachable
addresses anyway).
- added and documented a configuration parameter to disable the reachability check
for testing. (default: enabled)
- added and documented a configuration parameter to set the timeout for the
reachability checks. (default: 1000ms)
@symat
Copy link
Contributor Author

symat commented Jan 22, 2020

FYI: I just rebased to the current branch-3.6, as there was a conflict in the zookeeperAdmin.md file (as the recently pushed ZOOKEEPER-3482 changed that too)

Copy link
Contributor

@anmolnar anmolnar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 lgtm.

We would need this patch for master, but I don't think it's a problem for the merge script. Let's try.

asfgit pushed a commit that referenced this pull request Jan 23, 2020
…ster locally

When we tested RC 3.6.0, we had a problem of starting ZooKeeper cluster with large
number (11+) of ensemble members locally on mac. We found exceptions in the logs
when the new MultiAddress feature tries to filter the unreachable hosts from the
address list. This involves the calling of the InetAddress.isReachable method with
a default timeout of 500ms, which goes down to a native call in java and basically
try to do a ping (an ICMP echo request) to the host. Naturally, the localhost should
be always reachable.

The problem was that on mac we have the ICMP rate limit set to 250 by default.

In this patch we:
- changed the reachability check behavior by disabling the check if there is only
a single address provided (so we wouldn't be able to filter the unreachable
addresses anyway).
- added and documented a configuration parameter to disable the reachability check
for testing. (default: enabled)
- added and documented a configuration parameter to set the timeout for the
reachability checks. (default: 1000ms)

Author: Mate Szalay-Beko <szalay.beko.mate@gmail.com>

Reviewers: eolivelli@apache.org, andor@apache.org

Closes #1228 from symat/ZOOKEEPER-3698-branch-3.6
@asfgit asfgit closed this in b4c5a7f Jan 23, 2020
@anmolnar
Copy link
Contributor

Merged to master and branch-3.6
Thanks @symat !

junyoungKimGit pushed a commit to junyoungKimGit/zookeeper that referenced this pull request Feb 7, 2020
…ster locally

When we tested RC 3.6.0, we had a problem of starting ZooKeeper cluster with large
number (11+) of ensemble members locally on mac. We found exceptions in the logs
when the new MultiAddress feature tries to filter the unreachable hosts from the
address list. This involves the calling of the InetAddress.isReachable method with
a default timeout of 500ms, which goes down to a native call in java and basically
try to do a ping (an ICMP echo request) to the host. Naturally, the localhost should
be always reachable.

The problem was that on mac we have the ICMP rate limit set to 250 by default.

In this patch we:
- changed the reachability check behavior by disabling the check if there is only
a single address provided (so we wouldn't be able to filter the unreachable
addresses anyway).
- added and documented a configuration parameter to disable the reachability check
for testing. (default: enabled)
- added and documented a configuration parameter to set the timeout for the
reachability checks. (default: 1000ms)

Author: Mate Szalay-Beko <szalay.beko.mate@gmail.com>

Reviewers: eolivelli@apache.org, andor@apache.org

Closes apache#1228 from symat/ZOOKEEPER-3698-branch-3.6

(cherry picked from commit 8352f78)
Signed-off-by: Andor Molnar <andor@apache.org>
stickyhipp pushed a commit to stickyhipp/zookeeper that referenced this pull request Aug 19, 2020
…ster locally

When we tested RC 3.6.0, we had a problem of starting ZooKeeper cluster with large
number (11+) of ensemble members locally on mac. We found exceptions in the logs
when the new MultiAddress feature tries to filter the unreachable hosts from the
address list. This involves the calling of the InetAddress.isReachable method with
a default timeout of 500ms, which goes down to a native call in java and basically
try to do a ping (an ICMP echo request) to the host. Naturally, the localhost should
be always reachable.

The problem was that on mac we have the ICMP rate limit set to 250 by default.

In this patch we:
- changed the reachability check behavior by disabling the check if there is only
a single address provided (so we wouldn't be able to filter the unreachable
addresses anyway).
- added and documented a configuration parameter to disable the reachability check
for testing. (default: enabled)
- added and documented a configuration parameter to set the timeout for the
reachability checks. (default: 1000ms)

Author: Mate Szalay-Beko <szalay.beko.mate@gmail.com>

Reviewers: eolivelli@apache.org, andor@apache.org

Closes apache#1228 from symat/ZOOKEEPER-3698-branch-3.6

(cherry picked from commit 8352f78)
Signed-off-by: Andor Molnar <andor@apache.org>
RokLenarcic pushed a commit to RokLenarcic/zookeeper that referenced this pull request Aug 31, 2022
…ster locally

When we tested RC 3.6.0, we had a problem of starting ZooKeeper cluster with large
number (11+) of ensemble members locally on mac. We found exceptions in the logs
when the new MultiAddress feature tries to filter the unreachable hosts from the
address list. This involves the calling of the InetAddress.isReachable method with
a default timeout of 500ms, which goes down to a native call in java and basically
try to do a ping (an ICMP echo request) to the host. Naturally, the localhost should
be always reachable.

The problem was that on mac we have the ICMP rate limit set to 250 by default.

In this patch we:
- changed the reachability check behavior by disabling the check if there is only
a single address provided (so we wouldn't be able to filter the unreachable
addresses anyway).
- added and documented a configuration parameter to disable the reachability check
for testing. (default: enabled)
- added and documented a configuration parameter to set the timeout for the
reachability checks. (default: 1000ms)

Author: Mate Szalay-Beko <szalay.beko.mate@gmail.com>

Reviewers: eolivelli@apache.org, andor@apache.org

Closes apache#1228 from symat/ZOOKEEPER-3698-branch-3.6

(cherry picked from commit 8352f78)
Signed-off-by: Andor Molnar <andor@apache.org>
RokLenarcic pushed a commit to RokLenarcic/zookeeper that referenced this pull request Sep 3, 2022
…ster locally

When we tested RC 3.6.0, we had a problem of starting ZooKeeper cluster with large
number (11+) of ensemble members locally on mac. We found exceptions in the logs
when the new MultiAddress feature tries to filter the unreachable hosts from the
address list. This involves the calling of the InetAddress.isReachable method with
a default timeout of 500ms, which goes down to a native call in java and basically
try to do a ping (an ICMP echo request) to the host. Naturally, the localhost should
be always reachable.

The problem was that on mac we have the ICMP rate limit set to 250 by default.

In this patch we:
- changed the reachability check behavior by disabling the check if there is only
a single address provided (so we wouldn't be able to filter the unreachable
addresses anyway).
- added and documented a configuration parameter to disable the reachability check
for testing. (default: enabled)
- added and documented a configuration parameter to set the timeout for the
reachability checks. (default: 1000ms)

Author: Mate Szalay-Beko <szalay.beko.mate@gmail.com>

Reviewers: eolivelli@apache.org, andor@apache.org

Closes apache#1228 from symat/ZOOKEEPER-3698-branch-3.6

(cherry picked from commit 8352f78)
Signed-off-by: Andor Molnar <andor@apache.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants