poor NAT & networking performance #7857

Closed
hustcat opened this issue Sep 3, 2014 · 20 comments
Labels
area/networking exp/expert kind/enhancement status/more-info-needed

Comments

@hustcat

hustcat commented Sep 3, 2014

I use netperf to test network performance. Here are some results:
| network | packet size | Sum Trans Rate/s |
| --- | --- | --- |
| no docker | 1 | 742020 |
| Bridge+NAT | 1 | 213721 |
| Bridge only | 1 | 432079 |
| docker host | 1 | 674737 |

As we can see, NAT performance is very poor, and bridge-only also drops noticeably. Is there any way to improve performance while maintaining network isolation, similar to SR-IOV in KVM?
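For reference, a rough sketch of how the three Docker rows might map to run modes; the exact commands aren't shown above, and the image name `netperf-img` is made up:

```bash
# "docker host": share the host network namespace -- no bridge, no veth, no NAT
docker run -d --net=host netperf-img netserver -D -p 12865

# "Bridge+NAT": default bridge with a published port, traffic goes through iptables DNAT
docker run -d -p 12865:12865 netperf-img netserver -D -p 12865

# "Bridge only": default bridge, clients target the container IP directly, so no NAT
# (reaching the container IP from other hosts requires a route/bridged setup)
docker run -d --name ns netperf-img netserver -D -p 12865
docker inspect -f '{{.NetworkSettings.IPAddress}}' ns   # point netperf -H at this address
```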

@unclejack
Contributor

@hustcat How were you benchmarking?

@hustcat
Author

hustcat commented Sep 4, 2014

I run netserver on one machine, and run netperf on 4 other machines with 400 processes on each machine. This is the client script:

#!/bin/bash
# Start <proc_num> netperf TCP_RR clients, each on its own data port starting at <base_port>.
if [ $# -lt 2 ]; then
    echo "Usage: $0 <proc_num> <base_port>"
    exit 1
fi

num=$1
base=$2
port=$base
i=0
while [ $i -lt $num ]
do
    bin/netperf -H 10.x.x.x -p 12865 -l 300 -t TCP_RR -- -r 1,1 -P 0,$port &
    i=$((i + 1))
    port=$((i + base))
done
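For reference, the script would be invoked on each client machine roughly like this (the script name and base port are examples):

```bash
# 400 concurrent netperf TCP_RR clients, data ports starting at 20000
./netperf_client.sh 400 20000
```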

@unclejack changed the title from "The NAT performance is poor, Is there any way improve it?" to "poor NAT & networking performance" on Oct 10, 2014
@hustcat
Author

hustcat commented Nov 6, 2014

For bridge only, see #8277: the qdisc of the veth device becomes the bottleneck (see here).
Some test results:
| network_mode | Sum Trans Rate/s |
| --- | --- |
| no docker | 742020 |
| bridge only | 432079 |
| bridge only (veth txqueuelen=0) | 704440 |

As we can see, with the veth's txqueuelen set to zero, the performance loss of bridge-only is small.
But for NAT, the kernel's conntrack module becomes the bottleneck, and there seems to be no way to optimize that.
Overall, the bridge consumes a lot of CPU; MACVLAN is better. However, the kernel is getting support for a non-promiscuous bridge, see here.
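For anyone wanting to try the txqueuelen workaround above, a minimal sketch (the `veth*` naming pattern is an assumption about how Docker names the host-side interfaces):

```bash
#!/bin/bash
# Set txqueuelen to 0 on every host-side veth interface so the default
# pfifo_fast qdisc is bypassed (see #8277).
for dev in $(ip -o link show | awk -F': ' '{print $2}' | grep '^veth'); do
    ip link set dev "${dev%%@*}" txqueuelen 0
done
```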

@gdm85
Contributor

gdm85 commented Nov 8, 2014

Is there an issue already covering native support of MACVLAN in Docker? I've read how to accomplish it here, but would like to follow the issue that could unlock this feature.
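For context, the manual approach (which tools like pipework automate) looks roughly like this; the parent interface, address, and container name below are examples:

```bash
# Create a macvlan sub-interface of eth0 and move it into a running container's netns.
pid=$(docker inspect -f '{{.State.Pid}}' mycontainer)
mkdir -p /var/run/netns
ln -sf /proc/$pid/ns/net /var/run/netns/$pid

ip link add macvlan0 link eth0 type macvlan mode bridge
ip link set macvlan0 netns $pid
ip netns exec $pid ip addr add 192.168.1.50/24 dev macvlan0
ip netns exec $pid ip link set macvlan0 up
```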

@unclejack
Contributor

@hustcat Have you benchmarked NAT with bridge and veth txqueuelen=0? It would be interesting to see where that would fit in your benchmark above.

@hustcat
Author

hustcat commented Nov 23, 2014

@unclejack Yes, I've tested this; the result (243145/s) is better than NAT with the default veth txqueuelen, but it is still very poor, because the kernel's conntrack module (which NAT uses) becomes the bottleneck.

@thom4parisot

I have the same issue for a container running in KVM. I am still not sure why, as the other containers' connectivity is fine. For some reason I also cannot access this container through localhost: I have to explicitly use the IP of the docker0 network interface.

@jessfraz added the /system/networking, kind/enhancement, and exp/expert labels on Feb 26, 2015
@fulltopic

I've tested UDP bandwidth with a single netperf instance using the same configuration as your "no NAT" case, and the result is that the container gets about 2/3 of the host-to-host bandwidth. Is that reasonable?

| packet size | Container to Host | Bridge to Host |
| --- | --- | --- |
| 64 Byte | 80 Mbps | 146.3 Mbps |
| 1024 Byte | 1251 Mbps | 2239 Mbps |
| 8192 Byte | 4009 Mbps | 6355 Mbps |

CPU: 100%
Command: netperf -c -L 172.17.42.131 -H 172.17.42.170 -t UDP_STREAM -l 20 -T12,12 -P0 -- -r 1024,1024

Linux compute2 3.10.74-rt79 #2 SMP PREEMPT RT Fri May 29 15:30:35 CST 2015 x86_64 x86_64 x86_64 GNU/Linux

I already had txqueuelen = 0 set.

I found the cause: I had not enabled RPS.
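In case it helps others hitting the same thing, enabling RPS is just a sysfs write; the interface names and the CPU mask below are examples:

```bash
# Spread receive processing for eth0's rx-0 queue across CPUs 0-3 (mask 0xf).
echo f > /sys/class/net/eth0/queues/rx-0/rps_cpus

# The same knob exists on the host-side veth interface Docker creates for the container.
echo f > /sys/class/net/vethXXXX/queues/rx-0/rps_cpus
```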

@cpuguy83
Member

Is this actionable or is this just a side-effect of using veth?
Also, on your NAT test, are you certain traffic was not routing through the userland proxy?

@unclejack
Contributor

@cpuguy83 This is indeed the kind of performance you get through NAT. It is still a problem because that's the default and some resort to host network to get around this problem.

@priyadarsh

We are implementing microservices using Docker, and the poor network performance is something we found in our performance tests. For the time being, we are using host networking instead of the default bridge network. However, I am curious whether there is any plan to fix this issue. This ticket has been open for the past year with no updates.

@cpuguy83
Member

@priyadarsh How are you using the network?
Most people should not even notice performance issues with the network.

@priyadarsh

@cpuguy83 Hi. We have deployed a REST/JSON-based microservice as a Docker image, and things work fine when the response size is in KBs. However, there are cases where the response size exceeds 3 MB. In such cases, the download time is over a minute. We have tried gzipping the response, but to our surprise it took even more time. The difference in network latency between host and bridge networking is very evident with payloads of that size.

@cpuguy83
Member

@priyadarsh I'm more interested in how you are accessing these services.
Are they on the same host? By what means are they communicating?

@priyadarsh

@cpuguy83 The client consuming this service (via HTTP) is running on a different host and is not deployed as a Docker image.

@cpuguy83
Member

@priyadarsh Thank you. I would not expect the bridge interfaces or NAT to give such bad overhead.
Maybe it's something else? MTU?
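A quick way to check for an MTU mismatch, in case that's the culprit (interface names and the 1450 value are examples):

```bash
# Compare MTUs on the host uplink, the Docker bridge, and inside the container.
ip link show eth0    | grep -o 'mtu [0-9]*'
ip link show docker0 | grep -o 'mtu [0-9]*'
docker run --rm busybox ip link show eth0

# If the uplink MTU is lower (e.g. on an overlay or VPN), start the daemon to match:
#   docker daemon --mtu=1450
```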

@tactical-drone

docker-proxy seems to use a lot of CPU.

Therefore: if your CPU is slow, so will your networking be.

I really thought they could have achieved networking with some netfilter tricks instead of an executable that eats CPU. Unlucky.

@cpuguy83
Member

@pompomJuice docker-proxy is (or should be) only used for local traffic; it exists to facilitate hairpinning traffic back into the container.
If you are accessing the public host port from within the same host... basically, just don't do this.
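For completeness, the userland proxy can be turned off at the daemon level, in which case hairpin traffic is handled by iptables NAT instead of the docker-proxy process; a sketch for the Docker versions current at the time:

```bash
# Start the daemon without the userland proxy (docker-proxy).
# Traffic to a published port from the same host then relies on hairpin NAT
# rather than a per-port docker-proxy process.
docker daemon --userland-proxy=false
```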

@tactical-drone

Aah, I see.

Thanks @cpuguy83

asfgit pushed a commit to apache/aurora-packaging that referenced this issue Aug 2, 2016
The default bridge network is known to be slow (1) and potentially
flaky (2). Switching to host networking is a desperate attempt to reduce
flaps in our nightly package builds.

(1) moby/moby#7857
(2) moby/moby#11407

Reviewed at https://reviews.apache.org/r/50716/
@cpuguy83
Member

Docker 1.12 has support for macvlan and ipvlan (L2 and L3), which should give even better performance than bridge networking and do not require NAT.
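A minimal sketch of using the new driver on 1.12 (subnet, gateway, and parent interface are examples):

```bash
# Create a macvlan network attached to the host's eth0 and run a container on it.
docker network create -d macvlan \
    --subnet=192.168.1.0/24 --gateway=192.168.1.1 \
    -o parent=eth0 pub_net

docker run --rm --network=pub_net busybox ip addr
```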

Closing; I believe this solves the problem.
