New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix network binding for ipv4/ipv6 #12942

Closed
wants to merge 9 commits into
base: master
from

Conversation

Projects
None yet
5 participants
@rmuir
Contributor

rmuir commented Aug 17, 2015

Elasticsearch doesn't work well today with modern machines that might support both ipv4 and ipv6. The worst problem here that curl http://localhost... does not work e.g. on mac (#12906), because we aren't bound to any v6 address.

Equally bad, we only bind to loopback by default for 2.0, so if you want to connect to the network "for real" you might provide an interface name, but that always tends to do the wrong thing too (#12915), e.g. pick link local or some other useless address.

This is a compromise fix that tries to keep things simple:

  • we bind to multiple addresses when a specified host/interface name has multiple addresses. If you dont like this, then specify a single address.
  • we still only publish by default to one (this would require more work).
  • no changes for ipv6 multicast or anything like that yet.

The default for which address to publish when bound to multiple addresses is pretty simple, we prefer ipv4 by default (java.net.preferIPv4Stack) and I think we should until things like multicast are fixed to work correct over ipv6. Otherwise we prefer "real addresses" > site local > link local and so on.

Some things are still a bit messy, because real cleanups need to not be right before a beta release.

Closes #12906
Closes #12915

@dakrone

This comment has been minimized.

Member

dakrone commented Aug 17, 2015

So, an interesting byproduct of this is that if I have an ES node bind to the public interface with bin/elasticsearch -Des.network.host=_en0_ and then start up another with no options (so it binds to loopback interfaces only), the loopback one detects and joins a cluster with the public facing one.

I'm not sure if this is desired or not, I guess I would expect the loopback one only to be able to form clusters with other nodes bound to the loopback interface. What do you think?

@dakrone

This comment has been minimized.

Member

dakrone commented Aug 17, 2015

Also, this is an OSX-only issue, Linux doesn't have this loopback <-> non-loopback communication weirdness.

@rmuir

This comment has been minimized.

Contributor

rmuir commented Aug 17, 2015

That is not related to this change! That is #12914 and I deferred it because its Mac OS X specific, and because its not related to these changes.

@dakrone

This comment has been minimized.

Member

dakrone commented Aug 17, 2015

That is not related to this change!

Okay, I totally agree about the deferring it. Just wanted to make sure it was something we are aware of (which of course there is the other issue open for it).

bindHost = settings.get(GLOBAL_NETWORK_BINDHOST_SETTING, settings.get(GLOBAL_NETWORK_HOST_SETTING));
}
if (bindHost == null) {
bindHost = defaultValue2;

This comment has been minimized.

@dakrone

dakrone Aug 17, 2015

Member

Do you prefer the double-if? If not I guess you could do:

if (bindHost == null) {
  bindHost = settings.get(GLOBAL_NETWORK_BINDHOST_SETTING, settings.get(GLOBAL_NETWORK_HOST_SETTING, defaultValue2));
}
return resolveInetAddress(bindHost);

Instead

This comment has been minimized.

@rmuir

rmuir Aug 17, 2015

Contributor

I have changes coming in a commit. Basically it was too hard for me to see the various fallbacks for the default case (null).

/** Sorts an address by preference. This way code like publishing can just pick the first one */
static int sortKey(InetAddress address, boolean prefer_v4) {
int key = address.getAddress().length;
if (prefer_v4 == false) {

This comment has been minimized.

@dakrone

dakrone Aug 17, 2015

Member

Doesn't this also need to check if the address is actually an ipv6 address also?

This comment has been minimized.

@rmuir

rmuir Aug 17, 2015

Contributor

No. The length in bytes is enough.

This comment has been minimized.

@dakrone

dakrone Aug 17, 2015

Member

Ahh of course, I missed the length call, thanks

/** Returns system default for SO_REUSEADDR */
public static boolean defaultReuseAddress() {
return Constants.WINDOWS ? false : true;

This comment has been minimized.

@dakrone

dakrone Aug 17, 2015

Member

Can you add a comment for why reusing addresses is bad on Windows? (I don't have any idea why)

This comment has been minimized.

@rmuir

rmuir Aug 17, 2015

Contributor

Its not related to this change, I just documented what it does.

throw new BindTransportException("Failed to resolve publish address", e);
}
this.boundAddress = new BoundTransportAddress(new InetSocketTransportAddress(boundAddress), new InetSocketTransportAddress(publishAddress));
logger.info("Bound http to address [{}]", boundSocket.get());

This comment has been minimized.

@dakrone

dakrone Aug 17, 2015

Member

This logging is helpful, thanks for adding it!

@dakrone

This comment has been minimized.

Member

dakrone commented Aug 17, 2015

Left a few comments but generally LGTM. I tested this locally on OSX and Linux. Would be great to have someone manually test on Windows as well just for a sanity check.

@@ -50,16 +50,16 @@ h3. Indexing
Let's try and index some twitter like information. First, let's create a twitter user, and add some tweets (the @twitter@ index will be created automatically):
<pre>
curl -XPUT 'http://127.0.0.1:9200/twitter/user/kimchy' -d '{ "name" : "Shay Banon" }'
curl -XPUT 'http://localhost:9200/twitter/user/kimchy' -d '{ "name" : "Shay Banon" }'

This comment has been minimized.

@s1monw

s1monw Aug 17, 2015

Contributor

who is this shay banon guy after all?

}
}
}
InetAddress address = resolvePublishHostAddress(publishHost, "_local_");

This comment has been minimized.

@s1monw

s1monw Aug 17, 2015

Contributor

I am not sure where we use this else but would make it sense to move _local_ to a constant? Primarily I am curious what this means and if we can add some javadocs to the constant

This comment has been minimized.

@rmuir

rmuir Aug 17, 2015

Contributor

I was concurrently cleaning up this code, because it was still too much for me. Please see the latest commit.

static final boolean PREFER_V4 = Boolean.parseBoolean(System.getProperty("java.net.preferIPv4Stack", "true"));
/** Sorts an address by preference. This way code like publishing can just pick the first one */
static int sortKey(InetAddress address, boolean prefer_v4) {

This comment has been minimized.

@s1monw

s1monw Aug 17, 2015

Contributor

I like this sorting here - neat!

InetSocketAddress boundAddress = (InetSocketAddress) serverChannels.get(0).getLocalAddress();

This comment has been minimized.

@s1monw

s1monw Aug 17, 2015

Contributor

When I first looked at this I wondered if this is prone to concurrent modification exception? I wondered if we should make the serverChannels final and use a CopyOnWriteArrayList instead. Reading without sync would be safe. But then looking at the entire file I think we should just make the list final and clear it before we close each channel? the volatile confuses me and I think it's not needd

This comment has been minimized.

@rmuir

rmuir Aug 17, 2015

Contributor

All the concurrency in these methods is super confusing. Its doing really slow stuff like binding to ports, which should not happen all the time. I am happy to fix the concurrency, it will be to make everything synchronized: this is all nuts.

This comment has been minimized.

@s1monw

s1monw Aug 17, 2015

Contributor

I agree - we can do this in a follow up!

@s1monw

This comment has been minimized.

Contributor

s1monw commented Aug 17, 2015

Left mostly minor comments! This looks great to me! I will run tests with node.mode=network quickly as well on that.

@@ -146,7 +146,7 @@
// node id to actual channel
protected final ConcurrentMap<DiscoveryNode, NodeChannels> connectedNodes = newConcurrentMap();
protected final Map<String, ServerBootstrap> serverBootstraps = newConcurrentMap();
protected final Map<String, Channel> serverChannels = newConcurrentMap();
protected final Map<String, List<Channel>> serverChannels = newConcurrentMap();
protected final Map<String, BoundTransportAddress> profileBoundAddresses = newConcurrentMap();

This comment has been minimized.

@s1monw

s1monw Aug 17, 2015

Contributor

this must be a ConcurrentMap you use putIfAbsent

@s1monw s1monw added the blocker label Aug 17, 2015

@s1monw

This comment has been minimized.

Contributor

s1monw commented Aug 17, 2015

I ran tests with -Des.node.mode=network and they pass +1 to push here is my LGTM

thanks rob!

@clintongormley

This comment has been minimized.

Member

clintongormley commented Aug 18, 2015

Merged in b3a9abb and 68307aa

@rmuir rmuir referenced this pull request Aug 21, 2015

Closed

Fully support IPv6 #10535

@jpountz jpountz removed the v2.1.0 label Aug 21, 2015

jaymode added a commit to jaymode/elasticsearch that referenced this pull request Sep 23, 2015

expose all addresses that the transports are bound to
In elastic#12942, the NettyTransport and NettyHttpServerTransport were updated to allow for binding
to multiple addresses. However, the BoundTransportAddress holder only exposed the first address
that the transport was bound to and this object is used to populate the values returned to the user
via our APIs.

This change exposes all of the bound addresses in the BoundTransportAddress holder, which allows
for an accurate representation of all interfaces that elasticsearch is bound to and listening on.

jaymode added a commit that referenced this pull request Sep 23, 2015

expose all addresses that the transports are bound to
In #12942, the NettyTransport and NettyHttpServerTransport were updated to allow for binding
to multiple addresses. However, the BoundTransportAddress holder only exposed the first address
that the transport was bound to and this object is used to populate the values returned to the user
via our APIs.

This change exposes all of the bound addresses in the BoundTransportAddress holder, which allows
for an accurate representation of all interfaces that elasticsearch is bound to and listening on.

jaymode added a commit that referenced this pull request Sep 23, 2015

expose all addresses that the transports are bound to
In #12942, the NettyTransport and NettyHttpServerTransport were updated to allow for binding
to multiple addresses. However, the BoundTransportAddress holder only exposed the first address
that the transport was bound to and this object is used to populate the values returned to the user
via our APIs.

This change exposes all of the bound addresses in the BoundTransportAddress holder, which allows
for an accurate representation of all interfaces that elasticsearch is bound to and listening on.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment