New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

http: Fail initialization when any bind fails #14968

Closed
wants to merge 1 commit into
base: master
from

Conversation

Projects
None yet
7 participants
@laanwj
Copy link
Member

laanwj commented Dec 15, 2018

Currently the HTTP server initialization (HTTPBindAddresses) fails only when all bindings fail. So if multiple binds are specified (127.0.0.1 and ::1 by defeault) and one succeeds and the other fails, the latter is essentially ignored.

This commit changes the error behavior to fail if not all binds could be performed, which I think is more in line with how software normally handles this and what users expect.

Needs mention in release notes.

@laanwj laanwj added the RPC/REST/ZMQ label Dec 15, 2018

@luke-jr

This comment has been minimized.

Copy link
Member

luke-jr commented Dec 15, 2018

Doesn't binding 127.0.0.1 fail if ::1 is already bound, on some systems? (Because ::1 includes 127.0.0.1 in practice.)

@laanwj

This comment has been minimized.

Copy link
Member

laanwj commented Dec 15, 2018

Doesn't binding 127.0.0.1 fail if ::1 is already bound, on some systems? (Because ::1 includes 127.0.0.1 in practice.)

No, that's definitely not true. No localhost address or subnet should include each other.

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>

so:

127.0.0.1/8 (0:0:0:0:0:0:7f00:0001/96 if embedded in IPv6)
0:0:0:0:0:0:0:1/128

None of these overlaps.

You're mixing it up with ::0 / INADDR_ANY, which can be all-inclusive in some systems (sometimes depending on dual-stack settings).

@MeshCollider

This comment has been minimized.

Copy link
Member

MeshCollider commented Dec 15, 2018

Concept ACK

The error message here also needs updating for the new behaviour:

bitcoin/src/httpserver.cpp

Lines 393 to 396 in f128eca

if (!HTTPBindAddresses(http)) {
LogPrintf("Unable to bind any endpoint for RPC server\n");
return false;
}

@laanwj laanwj force-pushed the laanwj:2018_12_http_bind_error branch from f128eca to 8d9f29e Dec 16, 2018

@laanwj

This comment has been minimized.

Copy link
Member

laanwj commented Dec 16, 2018

@MeshCollider thanks, updated

@sipa

This comment has been minimized.

Copy link
Member

sipa commented Dec 16, 2018

utACK

@gmaxwell

This comment has been minimized.

Copy link
Member

gmaxwell commented Dec 16, 2018

utACK would also be nicer to get this in sooner rather than later, so we get more of a chance to find out that SpatulaBSD behaves like luke thought before this ends up in a release. :)

@Empact

This comment has been minimized.

Copy link
Member

Empact commented Dec 16, 2018

Concept ACK

@MarcoFalke

This comment has been minimized.

Copy link
Member

MarcoFalke commented Dec 16, 2018

Travis fails when creating the cache. Could be because of the default value of bind_to_localhost_only? Not sure if self.bind_to_localhost_only even makes sense after the recent changes to how rpc binds.

@laanwj

This comment has been minimized.

Copy link
Member

laanwj commented Dec 17, 2018

bind_to_localhost_only has to do with P2P, not RPC. IIRC RPC was always already bound to localhost only for the tests, apart from the bind test.

@laanwj

This comment has been minimized.

Copy link
Member

laanwj commented Dec 17, 2018

I guess this has to do with Travis not supporting (even local!) IPv6 at all, so the IPv6 bind fails and it doesn't start.

I suppose not being able to auto-bind on either IPv4 or IPv6 is valid if there is no such interface, at least if -rpcbind was not explicitly given.

@laanwj laanwj force-pushed the laanwj:2018_12_http_bind_error branch from 8d9f29e to 8a4b1c5 Dec 17, 2018

@laanwj

This comment has been minimized.

Copy link
Member

laanwj commented Dec 17, 2018

OK, per suggestion by @gmaxwell I've changed the logic. By default, if no explicit bind configuration is given, the EADDRNOTAVAIL error while binding is ignored (but not, say, EADDRINUSE). This is user-friendly as it means the default setting will auto-detect based on available IPv4/IPv6. In all other cases all errors are fatal.

This can be tested using linux network namespaces

ip netns add dumb
ip netns exec dumb ip addr add 127.0.0.1/8 dev lo # adds both IPv4 and IPv6
ip netns exec dumb ip addr del ::1/128 dev lo # remove IPv6
ip netns exec dumb ip link set lo up
ip netns exec dumb src/bitcoind -regtest

@laanwj laanwj force-pushed the laanwj:2018_12_http_bind_error branch from 8a4b1c5 to 247ce61 Dec 17, 2018

@laanwj

This comment has been minimized.

Copy link
Member

laanwj commented Dec 17, 2018

Sigh, that still didn't solve the Travis issue though. Locally running the tests with no IPv6 localhost worked fine for me now.

@laanwj laanwj force-pushed the laanwj:2018_12_http_bind_error branch from 247ce61 to a846175 Dec 17, 2018

}
}
if (num_fail != 0) {
// In case of an error, clean up listening sockets that succeeded to

This comment has been minimized.

@laanwj

laanwj Dec 17, 2018

Member

Needed this to prevent a leak and segfault in StopHTTPServer (because the owning evhttp is freed but the listeners are not).

@laanwj laanwj force-pushed the laanwj:2018_12_http_bind_error branch from 7c5e478 to 515288d Dec 17, 2018

http: Fail initialization when any bind fails
Currently the HTTP server initialization (`HTTPBindAddresses`) fails
only when *all* bindings fail. So if multiple binds are specified
(`127.0.0.1` and `::1` by defeault) and one succeeds and the other
fails, the latter is essentially ignored.

This commit changes the error behavior to fail *if not all* binds could
be performed, which I think is more in line with how software normally
handles this and what users expect.

@laanwj laanwj force-pushed the laanwj:2018_12_http_bind_error branch from bfc1899 to 7b5e400 Dec 17, 2018

@laanwj laanwj added this to the 0.18.0 milestone Dec 17, 2018

@luke-jr

This comment has been minimized.

Copy link
Member

luke-jr commented Dec 28, 2018

Unfortunately, evhttp_bind_socket_with_handle returns a 0/Success errno when listen fails. I've opened a report upstream to fix this, but obviously we can't rely on it now: libevent/libevent#738

@luke-jr

This comment has been minimized.

Copy link
Member

luke-jr commented Dec 28, 2018

One possible solution is to clone the evhttp code. I got it down to 65 lines: master...luke-jr:my_bind_socket_with_handle

@luke-jr

This comment has been minimized.

Copy link
Member

luke-jr commented Dec 28, 2018

At least on Travis, the failure isn't caused by listen()'s errno being lost.

Instead, what's happening is that getaddrinfo is returning EVUTIL_EAI_ADDRFAMILY. These errors are borrowed from the system's EAI_* constants, which [can] overlap with errno's constants, so it's not practical to just stuff them into errno from evhttp_bind_socket_with_handle... Opened a libevent issue for this: libevent/libevent#740

Added a somewhat-ugly hack on top of my branch: d2d51e7

@luke-jr

This comment has been minimized.

Copy link
Member

luke-jr commented Dec 28, 2018

Just noticed Travis is passing on this PR... did we lose some test coverage between 0.17 (where I'm testing it) and master?

@laanwj

This comment has been minimized.

Copy link
Member

laanwj commented Dec 28, 2018

Unfortunately, evhttp_bind_socket_with_handle returns a 0/Success errno when listen fails.

Strange, this worked for me, and it seems to work on Travis. As stated I've tested this (at least on Linux) in various ways.

An alternative solution here (which I initially wanted to do, but this seemed like a workable suggestion at the time if of course C code didn't mess up error handling everywhere) is to test which local interfaces are available, and only

  • try to bind to IPv4 if there's IPv4 localhost
  • try to bind to IPv6 if there's IPv6 localhost

This avoids having to check error codes, and could be more robust. I might give this a try. Though I'm not sure there's a good platform-independent way of doing this.

I do not think this is worth adding a forest of hacks cloning parts of code from libevent for. Better to leave it as-is then, it wasn't that bad.

@laanwj

This comment has been minimized.

Copy link
Member

laanwj commented Dec 28, 2018

WAIT: it's the bind in evhttp_bind_socket_with_handle failing that we're looking for here, not the listen. This seems to not touch the error code, so it can be retrieved succesfully:

	if ((fd = bind_socket(address, port, 1 /*reuse*/)) == -1)
		return (NULL);

I don't think we care about specific errno when listen fails, do we? This is always fatal.

@luke-jr

This comment has been minimized.

Copy link
Member

luke-jr commented Dec 28, 2018

With this backported to 0.17, Travis was indeed failing on getaddrinfo; no idea why it's passing on master. :/

@laanwj

This comment has been minimized.

Copy link
Member

laanwj commented Dec 28, 2018

Yes, tried in #15050, really strange. I think specific error code checking is a dead end, just too fragile.

Looking for a platform independent way now to check if there is a local IPv4 or local IPv6 interface without actually binding to it. libevent internally checks for a nonlocal interface in evutil_found_ifaddr but this is not what we want to know in this case.
Too bad the logic in evutil_check_ifaddrs is not exposed externally.

Looks like we have vaguely analogous logic in void Discover() in net.cpp, though for windows it falls back to 'look up localhost'.

@laanwj

This comment has been minimized.

Copy link
Member

laanwj commented Dec 28, 2018

Closing this for now.

@laanwj laanwj closed this Dec 28, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment