New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add option to open listener sockets in SO_REUSEPORT mode #244

Merged
merged 7 commits into from Jun 3, 2017

Conversation

Projects
None yet
2 participants
@blind-oracle

blind-oracle commented May 22, 2017

These commits add a listener.reuseport option (which is disabled by default) that enables sniproxy to open the listener sockets in SO_REUSEPORT mode to allow multiple processes to bind to the same ip:port pair.

We gain kernel-based load-balancing of incoming connections that are evenly distributed between several sniproxy instances without the use of any external http/tcp proxy.

@dlundquist

Interesting, I'm not able to look this over in depth now, but will in the next few days. How would this compare to forking off several child worker processes after the listening socket has been opened each running an event loop accepting new connections? My first thought is that it would be better to hide the multiprocess complexity than expose it to the user and hope they get it right.

I'm also curious what sort of scaling you were obtaining with a single process which instigated this improvement.

I look forward to hearing back, thank you!

@blind-oracle

This comment has been minimized.

blind-oracle commented May 23, 2017

How would this compare to forking off several child worker processes after the listening socket has been opened each running an event loop accepting new connections? My first thought is that it would be better to hide the multiprocess complexity than expose it to the user and hope they get it right.

Well, you can look on Nginx benchmarks that were performed when they've introduced this feature as a replacement for the same multiprocess paradigm that you propose: https://www.nginx.com/blog/socket-sharding-nginx-release-1-9-1/ (tl;dr they've gained 20% reduced latency, 13x reduced latency jitter and 2x increased request rate)

We can use nginx model to fork workers opening their sockets in SO_REUSEPORT to hide the complexity of running several sniproxy manually, but i didn't have time to investigate the internals of sniproxy to implement this, so did a quick modification only.

The kernel balances incoming connections between processess consistently using hash over 4-tuple (src ip-port, dst ip-port) and it works fine even for UDP sockets.

I'm also curious what sort of scaling you were obtaining with a single process which instigated this improvement

We're using sniproxy+nginx to proxy https/http connections to certain domains in a large wireless network to allow these domains to work before authentication and keep our ACLs small and static.

Our main problem was not the scalability of sniproxy, but the limit of 65k TCP ports that is imposed on single IP-address that sniproxy uses as the source to initiate it's tunnels. So we needed multiple instances of sniproxy to listen on the same ip:port pair and each instance was using different source IP address to overcome this limitation. We could've placed HAProxy or something similar between clients and sniproxy in TCP mode, but SO_REUSEPORT is far more efficient and simple.

As for the CPU limitation, we've seen about 45% usage of a single core (Xeon 26xx v3 i think) by a single sniproxy process handling ~150Mbit TLS traffic, so when the traffic increased we could've easily hit 100% ceiling in a single thread/process model. So now we have two sniproxy processess in SO_REUSEPORT mode handling 300Mbit divided evenly between them. Although the main bottleneck is not traffic itself, but the rate of TLS handshakes that sniproxy has to decode and process, but i don't have connections/sec numbers at hand right now...

@dlundquist

Thanks for getting back so quickly to my earlier questions. The Nginx article was quite interesting and suprirsing. Sorry it took so long to get back again.

I'm onboard with merging this feature, but still a little concerned with providing proper guidance on when to use it.

I ran into the same outbound ephemeral port range limitation with weighttp, and would be open to extending the configuration syntax allow specifying multiple backend source IPs to provide a more user friendly way of overcoming ephemeral port exhaustion.

@@ -113,6 +113,7 @@ improve performance.
.nf
listener 192.0.2.10:80 {
protocol http
reuseport yes

This comment has been minimized.

@dlundquist

dlundquist Jun 1, 2017

Owner

This is the first boolean configuration open we are introducing to the configuration syntax, do we want to use: true/false, yes/no, and/or on/off?

This comment has been minimized.

@blind-oracle

blind-oracle Jun 1, 2017

Agreed, i've added parse_boolean function.

@@ -105,6 +106,13 @@ struct Keyword listener_stanza_grammar[] = {
(int(*)(void *, char *))accept_listener_protocol,
NULL,
NULL},
#ifdef SO_REUSEPORT

This comment has been minimized.

@dlundquist

dlundquist Jun 1, 2017

Owner

Do we want to modify the configuration grammar on platforms without SO_REUSEPORT or reject a configuration with reuseport turned on when the platform doesn't support SO_REUSEPORT?

This comment has been minimized.

@blind-oracle

blind-oracle Jun 1, 2017

The problem is that the platform sniproxy was built on and the platform it's being run can be different. And SO_REUSEPORT is only checked by CPP at build time. I've reworked my commits a little to indicate that (if) sniproxy was build without SO_REUSEPORT support if the config file contains "reuseport" directive.

#ifdef SO_REUSEPORT
int
accept_listener_reuseport(struct Listener *listener, char *reuseport) {
if (strncasecmp(reuseport, "yes", strlen(reuseport)) == 0)

This comment has been minimized.

@dlundquist

dlundquist Jun 1, 2017

Owner

See above comments. I would suggest using ifdef around behavior rather than function definition. We could pull parse_boolean_argument() into the config.c so we can support on, yes, true, etc.

This comment has been minimized.

@blind-oracle

blind-oracle Jun 1, 2017

I had to add the function to the listener.c because the listener is included by config and not vice versa... Maybe it's not the best solution.

@@ -454,6 +466,17 @@ init_listener(struct Listener *listener, const struct Table_head *tables, struct
return result;
}
#ifdef SO_REUSEPORT

This comment has been minimized.

@dlundquist

dlundquist Jun 1, 2017

Owner

I would prefer if we could ifdef around the smallest possible section of code (like initiate_server_connect()):

#ifdef SO_REUSEPORT
	result = setsockopt(sockfd, SOL_SOCKET, SO_REUSEPORT, &on, sizeof(on));
#else
        result = -ENOSYS;
#endif

This comment has been minimized.

@blind-oracle
@@ -37,6 +37,9 @@ struct Listener {
/* Configuration fields */
struct Address *address, *fallback_address, *source_address;
const struct Protocol *protocol;
#ifdef SO_REUSEPORT

This comment has been minimized.

@dlundquist

dlundquist Jun 1, 2017

Owner

If we treat resuseport yes as a valid configuration option on unsupported platforms this and accept_listener_reuseport will not need ifdef guards.

Новгородов Игорь added some commits Jun 1, 2017

Новгородов Игорь
Новгородов Игорь
Новгородов Игорь
return 1;
#else
err("sniproxy was built without SO_REUSEPORT support")

This comment has been minimized.

@dlundquist

dlundquist Jun 1, 2017

Owner

CI build is failing: missing a semicolon here.

@@ -49,13 +50,15 @@ struct Listener {
SLIST_ENTRY(Listener) entries;
};
static int parse_boolean(char *boolean);

This comment has been minimized.

@dlundquist

dlundquist Jun 1, 2017

Owner

static function declarations belong in the .c file since they are not exported beyond the module (akin to private methods in Java).

parse_boolean(char *boolean) {
int i;
for (i = 0; i < 3; i++) {
if (strncasecmp(boolean, boolean_true[i], strlen(boolean)) == 0)

This comment has been minimized.

@dlundquist

dlundquist Jun 1, 2017

Owner

I think we want to strcasecmp here otherwise 'o' will match on since it is matched first and the config tokenizer guarantees the argument is a null terminated string.

@dlundquist

There are still a few minor issues and the CI build is failing. I'm seeing a mix of tabs and spaces, and would like to introduce a new test (starting with functional_test) to test the two sniproxy daemon deployment, in addition if move parse boolean into the config module we can introduce a config_test case.

I'm happy to make these changes and merge this myself, but will not get to it until this evening (assuming I have internet access).

Новгородов Игорь
@blind-oracle

This comment has been minimized.

blind-oracle commented Jun 1, 2017

I've cleaned up the things you've mentioned. Sorry for the not the best quality, didn't had much time to test it thoroughly.

@dlundquist dlundquist merged commit 579bef4 into dlundquist:master Jun 3, 2017

1 check failed

continuous-integration/travis-ci/pr The Travis CI build failed
Details

@dlundquist dlundquist referenced this pull request Dec 18, 2017

Merged

Feature ipv6_only #270

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment