Commit 712ba88: merge b6aafd6 into 4d9f0f6 (xnuter, Jan 19, 2021).

Showing 58 changed files with 811 additions and 310 deletions.
`examples/README.md`: 98 additions, 308 deletions (large diff not rendered).

`examples/cpp-config.md`: 11 additions.

### Running draft-http-tunnel (C++)

Repository: https://github.com/cmello/draft-http-tunnel/

```bash
sudo cgcreate -t $USER:$USER -a $USER:$USER -g cpuset:cpptunnel
echo 2-3 > /sys/fs/cgroup/cpuset/cpptunnel/cpuset.cpus
echo 0 > /sys/fs/cgroup/cpuset/cpptunnel/cpuset.mems

cgexec -g cpuset:cpptunnel --sticky ./draft_http_tunnel --bind 0.0.0.0:8081 tcp --destination localhost:80 > /dev/null
```
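The same cpuset setup is repeated for every proxy in these examples, so it can be parameterized. `pin_cmds` below is a hypothetical helper (not part of the repository) that prints the commands for a given cgroup name and core range, assuming the `cgcreate`/`cgexec` tools from libcgroup used above:

```shell
# Hypothetical helper: prints the cpuset commands for a cgroup name
# and core range, mirroring the block above.
pin_cmds() {
  local name="$1" cores="$2"
  echo "sudo cgcreate -t \$USER:\$USER -a \$USER:\$USER -g cpuset:${name}"
  echo "echo ${cores} > /sys/fs/cgroup/cpuset/${name}/cpuset.cpus"
  echo "echo 0 > /sys/fs/cgroup/cpuset/${name}/cpuset.mems"
}
pin_cmds cpptunnel 2-3
```

Piping the output to `sh` (or copy-pasting it) reproduces the block above for any cgroup name.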
`examples/golang-config.md`: 13 additions.

### Running tcp-proxy (Golang)

Repository: https://github.com/jpillora/go-tcp-proxy/

```bash
cd ~/go/src/github.com/jpillora/go-tcp-proxy/cmd/tcp-proxy

sudo cgcreate -t $USER:$USER -a $USER:$USER -g cpuset:tcpproxy
echo 2-3 > /sys/fs/cgroup/cpuset/tcpproxy/cpuset.cpus
echo 0 > /sys/fs/cgroup/cpuset/tcpproxy/cpuset.mems

cgexec -g cpuset:tcpproxy --sticky ./tcp-proxy -l localhost:8111 -r localhost:80 > /dev/null
```
`examples/haproxy-config.md`: 48 additions.

### Setting up HAProxy

We need to specify a TCP frontend and a backend. It's important to turn off logging; otherwise it would flood the disk.
Also, HAProxy should only use cores #2 and #3:

```
global
    # disable logging
    log /dev/log local0 warning alert
    log /dev/log local1 warning alert
    chroot /var/lib/haproxy
    stats socket /run/haproxy/admin.sock mode 660 level admin expose-fd listeners
    stats timeout 30s
    user haproxy
    group haproxy
    # stick to cores 2 and 3
    nbproc 2
    cpu-map 1 2
    cpu-map 2 3
    daemon

frontend rserve_frontend
    bind *:8999
    mode tcp
    timeout client 1m
    default_backend rserve_backend

backend rserve_backend
    mode tcp
    option log-health-checks
    log global
    balance roundrobin
    timeout connect 10s
    timeout server 1m
    server rserve1 localhost:80
```

### Starting

```bash
sudo systemctl start haproxy
```

### Stopping
```bash
sudo systemctl stop haproxy
```
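To confirm that the `cpu-map` pinning took effect, you can inspect each worker's allowed-CPU list in procfs (a generic Linux check, not HAProxy-specific; `pidof` and the `/proc` status format are standard Linux tools):

```shell
# Each HAProxy process should report 2-3 here if cpu-map took effect.
for pid in $(pidof haproxy); do
  grep Cpus_allowed_list "/proc/${pid}/status"
done
```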
`examples/java-config.md`: 45 additions.

### Running NetCrusher (Java)

Repository: https://github.com/NetCrusherOrg/netcrusher-java/

Empirically, I found that you need to run two instances of NetCrusher to achieve higher density:

```bash
cd ~/netcrusher-core-0.10/bin

sudo cgcreate -t $USER:$USER -a $USER:$USER -g cpuset:javanio1
echo 2-3 > /sys/fs/cgroup/cpuset/javanio1/cpuset.cpus
echo 0 > /sys/fs/cgroup/cpuset/javanio1/cpuset.mems

cgexec -g cpuset:javanio1 --sticky ./run-tcp-crusher.sh localhost:8000 localhost:80
```

and
```bash
cd ~/netcrusher-core-0.10/bin

sudo cgcreate -t $USER:$USER -a $USER:$USER -g cpuset:javanio2
echo 2-3 > /sys/fs/cgroup/cpuset/javanio2/cpuset.cpus
echo 0 > /sys/fs/cgroup/cpuset/javanio2/cpuset.mems

cgexec -g cpuset:javanio2 --sticky ./run-tcp-crusher.sh localhost:8001 localhost:80
```

### Starting up

I used `tmux` to run and shut down the instances:

```bash
tmux new-session -d -s "java1" ./start-java1.sh # port 8000
tmux new-session -d -s "java2" ./start-java2.sh # port 8001
```
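The contents of `start-java1.sh`/`start-java2.sh` are not shown; a plausible sketch of the first one, wrapping the `cgexec` invocation above (hypothetical, not the author's actual script):

```shell
# Hypothetical start-java1.sh: wraps the cgexec invocation above so tmux
# can launch it. start-java2.sh would be identical with javanio2 / port 8001.
cat > start-java1.sh <<'EOF'
#!/bin/bash
cd ~/netcrusher-core-0.10/bin
exec cgexec -g cpuset:javanio1 --sticky ./run-tcp-crusher.sh localhost:8000 localhost:80
EOF
chmod +x start-java1.sh
```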

### Shutting down

```bash
tmux kill-session -t java1
tmux kill-session -t java2

# sometimes the JVM still managed to survive, so make sure it's killed
ps -ef | grep java | grep -v grep | awk '{print $2}' | xargs -r kill -9
```
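The kill one-liner works by extracting the PID column from `ps -ef` output; on a sample line (fabricated for illustration) it behaves like this:

```shell
# A fabricated ps -ef line, to show what the pipeline extracts:
sample='user      4242     1  0 10:00 pts/0    00:00:01 java -jar netcrusher.jar'
echo "$sample" | grep java | grep -v grep | awk '{print $2}'
# prints: 4242
```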
`examples/max-tps.md`: 171 additions.

- [Max TPS](#max-tps)
* [High-performance (C, C++, Rust)](#high-performance--c--c----rust-)
+ [Maximum rate achieved](#maximum-rate-achieved)
+ [Regular percentiles (p50,90,99)](#regular-percentiles--p50-90-99-)
+ [Tail latency (p99.9 and p99.99)](#tail-latency--p999-and-p9999-)
+ [Trimmed mean and standard deviation](#trimmed-mean-and-standard-deviation)
+ [CPU consumption](#cpu-consumption)
+ [Summary](#summary)
* [Memory-safe languages (Rust, Golang, Java, Python)](#memory-safe-languages--rust--golang--java--python-)
+ [Maximum rate achieved](#maximum-rate-achieved-1)
+ [Regular percentiles (p50,90,99)](#regular-percentiles--p50-90-99--1)
+ [Tail latency (p99.9 and p99.99)](#tail-latency--p999-and-p9999--1)
+ [Trimmed mean and standard deviation](#trimmed-mean-and-standard-deviation-1)
+ [CPU consumption](#cpu-consumption-1)
+ [Summary](#summary-1)
* [Total summary](#total-summary)
* [Conclusion](#conclusion)

<small><i><a href='http://ecotrust-canada.github.io/markdown-toc/'>Table of contents generated with markdown-toc</a></i></small>

### Max TPS

The load is generated without any rate limiting and with a concurrency setting of `100`.

#### High-performance (C, C++, Rust)

##### Maximum rate achieved

The most interesting question is how many RPS each solution can handle.

While Nginx is capable of handling `~60k` requests per second (impressive for just two cores!),
all three C/C++/Rust solutions are comparable (with C++ handling slightly more requests):

* C - 45k
* C++ - 48.8k
* Rust - 46k

![](./prom/max-baseline-c-cpp-rust-rps.png)

##### Regular percentiles (p50,90,99)

The results are somewhat mixed again. While C++ showed a better `p50`, its `p99` is worse.
At the `p90` level all three are close:

![](./prom/max-baseline-c-cpp-rust-p50-99.png)

##### Tail latency (p99.9 and p99.99)

For the tail latency, Rust is better than both C and C++:

![](./prom/max-baseline-c-cpp-rust-tail.png)

##### Trimmed mean and standard deviation

All three are nearly identical; however, C++ is a tiny bit better (see the table below for the numbers):

![](./prom/max-baseline-c-cpp-rust-mean.png)
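The `tm99` column in the summary tables is a trimmed mean: the average computed after discarding the top 1% of samples, which makes it robust to outliers. A toy illustration of the computation (the sample data and awk sketch are mine, not the benchmark's actual tooling):

```shell
# 99 samples of ~2 ms plus one 200 ms outlier: dropping the top 1%
# (here, a single sample) removes the outlier before averaging.
{ for i in $(seq 99); do echo 2; done; echo 200; } \
  | sort -n \
  | awk '{ v[NR] = $1 }
         END {
           keep = int(NR * 0.99)          # drop the top 1% of samples
           for (i = 1; i <= keep; i++) s += v[i]
           printf "tm99=%.1f\n", s / keep
         }'
# prints: tm99=2.0
```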

##### CPU consumption

CPU utilization is important here. What we want is to saturate the CPU as much as we can.

The baseline CPU utilization is 73% overall, which corresponds to 97% of the available cores (as cores 2 and 3 were not used).

![](./prom/max-baseline-c-cpp-rust-cpu.png)

| | CPU Utilization |
|---|---|
|Baseline |97% |
|C |90%|
|C++ |96%|
|Rust |93%|

This means that C++ managed to use more CPU and spent more time handling requests.
However, it's worth mentioning that `draft-http-tunnel` is implemented using callbacks, while the Rust solution is based on `tokio`,
which is a feature-rich framework and is much more flexible and extensible.

##### Summary

| | p50 | p90 | p99 | p99.9 | p99.99 | max | tm99 | stddev | rps (k) |
|---|---|---|---|---|---|---|---|---|---|
| Baseline | 1.8 | 2.9 | 3.4 | 3.9 | 9.0 | 202.6 | 1.6 | 1.2 | 60.4 |
| C (HAProxy) | 2.5 | 3.3 | 4.1 | 5.4 | 13.4 | 191.2 | 2.2 | 1.3 | 45.3 |
| C++ (draft-http-tunnel) | 1.9 | 3.2 | 5.3 | 9.2 | 18.9 | 205.4 | 2.0 | 1.2 | 48.8 |
| Rust (http-tunnel) | 2.4 | 3.2 | 3.9 | 5.2 | 10.8 | 202.5 | 2.2 | 1.3 | 46 |

#### Memory-safe languages (Rust, Golang, Java, Python)

##### Maximum rate achieved

Among the memory-safe languages, both Rust and Golang showed comparable throughput, while Java and Python were significantly behind:

* Rust - 46k
* Golang - 42.6k
* Java - 25.9k
* Python - 18.3k

![](./prom/max-rust-golang-java-python-rps.png)

##### Regular percentiles (p50,90,99)

Again, we can see that at the `p50`-`p90` level Golang is comparable to Rust,
but it quickly deviates at the `p99` level, adding almost two milliseconds.

Java and Python exhibit substantially higher latencies, and Java's `p99` latency is much worse than Python's:

![](./prom/max-rust-golang-java-python-p50-99.png)

##### Tail latency (p99.9 and p99.99)

Tail latency shows an even larger gap from Rust, and Java's is the worst of all four:

![](./prom/max-rust-golang-java-python-tail.png)

##### Trimmed mean and standard deviation

Golang's trimmed mean is comparable to Rust's, which is impressive.
Again, Java and Python are well behind both Rust and Golang:

![](./prom/max-rust-golang-java-python-mean.png)

##### CPU consumption

CPU utilization is important here as well: we want to saturate the CPU as much as we can.

As we can see, Rust does the best job of utilizing resources, while Golang, Java, and Python (in that order)
leave progressively more CPU idle.

![](./prom/max-rust-golang-java-python-cpu.png)

| | CPU Utilization |
|---|---|
|Rust |93%|
|Golang |84% |
|Java |74%|
|Python |65%|

##### Summary

| | p50 | p90 | p99 | p99.9 | p99.99 | max | tm99 | stddev | rps (k) |
|---|---|---|---|---|---|---|---|---|---|
| Rust (http-tunnel) | 2.4 | 3.2 | 3.9 | 5.2 | 10.8 | 202.5 | 2.2 | 1.3 | 46 |
| Tcp-Proxy (Golang) | 2.4 | 3.4 | 5.7 | 10.1 | 19.6 | 206.6 | 2.3 | 1.4 | 42.6 |
| NetCrusher (Java) | 3.1 | 6.9 | 15.6 | 53.2 | 89.1 | 2,850 | 3.7 | 12.3 | 25.9 |
| pproxy (Python) | 5.2 | 9.2 | 13.5 | 36.8 | 59.2 | 242.7 | 5.4 | 3.7 | 18.3 |

#### Total summary

| | p50 | p90 | p99 | p99.9 | p99.99 | max | tm99 | stddev | rps (k) |
|---|---|---|---|---|---|---|---|---|---|
| Baseline | 1.8 | 2.9 | 3.4 | 3.9 | 9.0 | 202.6 | 1.6 | 1.2 | 60.4 |
| C (HAProxy) | 2.5 | 3.3 | 4.1 | 5.4 | 13.4 | 191.2 | 2.2 | 1.3 | 45.3 |
| C++ (draft-http-tunnel) | 1.9 | 3.2 | 5.3 | 9.2 | 18.9 | 205.4 | 2.0 | 1.2 | 48.8 |
| Rust (http-tunnel) | 2.4 | 3.2 | 3.9 | 5.2 | 10.8 | 202.5 | 2.2 | 1.3 | 46 |
| Tcp-Proxy (Golang) | 2.4 | 3.4 | 5.7 | 10.1 | 19.6 | 206.6 | 2.3 | 1.4 | 42.6 |
| NetCrusher (Java) | 3.1 | 6.9 | 15.6 | 53.2 | 89.1 | 2,850 | 3.7 | 12.3 | 25.9 |
| pproxy (Python) | 5.2 | 9.2 | 13.5 | 36.8 | 59.2 | 242.7 | 5.4 | 3.7 | 18.3 |

#### Conclusion

The Rust solution is on par with the C/C++ solutions at all levels.
Golang is slightly worse, especially for tail latencies, but stays close to the high-performance languages.

NetCrusher and pproxy show much worse throughput and latency characteristics when a network service is under heavy load.
Notably, NetCrusher (Java) showed the worst max latency, measured in seconds:

![](./prom/max-rust-golang-java-python-max.png)

BTW, try to guess which line is Java on the memory consumption graph:

![](./prom/java-vs-others-memory.png)