[bdp] Maximum value limits throughput on high-latency connections #2400

Closed
euroelessar opened this Issue Oct 23, 2018 · 4 comments

euroelessar commented Oct 23, 2018

What version of gRPC are you using?

grpc-go 1.14 (but present on master as well)

What did you do?

Use server-side streaming (single response stream over single connection) and measure throughput.
The latency between two hosts is 100ms.

What did you expect to see?

Performance is limited by network stack or CPU, and is on par with iperf.

What did you see instead?

Throughput is capped at 40MB/s. Both CPU and network are under-utilized.

The gRPC network stack quickly reaches the 4MB window size limit and doesn't scale it any further.

Manually specifying both InitialWindowSize and InitialConnWindowSize, or modifying the grpc-go codebase to increase bdpLimit, resolves the performance issue, but neither approach scales well for busy or otherwise low-throughput networks.
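For reference, the client-side workaround looks roughly like this configuration sketch (the option names are from the grpc-go API; the address and the 16MB value are illustrative only, and the window should really be sized to the link's bandwidth-delay product):

```go
package main

import (
	"log"

	"google.golang.org/grpc"
)

func main() {
	// Fixing both window sizes up front disables grpc-go's BDP-based
	// autotuning and removes the 4MB ceiling. 16MB is an arbitrary
	// illustrative value, not a recommendation.
	conn, err := grpc.Dial(
		"example.com:50051", // placeholder address
		grpc.WithInsecure(),
		grpc.WithInitialWindowSize(16*1024*1024),     // per-stream flow-control window
		grpc.WithInitialConnWindowSize(16*1024*1024), // per-connection flow-control window
	)
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()

	// The server has matching options, grpc.InitialWindowSize and
	// grpc.InitialConnWindowSize, passed to grpc.NewServer.
}
```

The downside, as noted above, is that the window is then static: a value tuned for a 100ms cross-zone path is wasteful for connections within a metro.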

dfawley commented Oct 23, 2018

@euroelessar the 4MB window size limit is set based on standard TCP settings -- most TCP connections can't effectively utilize a larger window. What is the discrepancy between our out-of-the-box 40MB/s and what you're seeing with iperf or when you set Initial[Conn]WindowSize? What iperf tests are you running when you see these results?

euroelessar commented Oct 23, 2018

Currently the difference is 40MB/s vs 100MB/s for cross-zone traffic (≈150% delta), and 300MB/s vs 450MB/s for traffic within a metro (≈50% delta). But within a metro we're likely hitting a CPU bottleneck at 450MB/s, since iperf can do ~1.2GB/s over a single TCP connection.
It is worth noting that these numbers are for gRPC over a TLS connection.

Regarding iperf test:
server: $ iperf -s -p 6000
client: $ iperf -c ${remote_server} -p 6000 -P 1
output:

[  5] local $my port 6000 connected with $remote port 26460
[  5]  0.0-10.1 sec  1004 MBytes   838 Mbits/sec
[  4] local $my port 6000 connected with $local port 51102
[  4]  0.0-10.0 sec  10.7 GBytes  9.22 Gbits/sec
dfawley commented Oct 24, 2018

Looks like 4MB is definitely too conservative a default value - we are open to increasing it, or possibly reading the TCP settings and mirroring the kernel's receive window size when setting our flow control.

@euroelessar could you do a cat /proc/sys/net/ipv4/tcp_rmem so we can see what your TCP receive window size can scale to (on both the client and server if they are not homogeneous)? Thanks!

euroelessar commented Oct 24, 2018

Sure,

$ cat /proc/sys/net/ipv4/tcp_rmem
4096	135168	16777216