Docker is incredibly slow to start many containers in parallel even on a many-core machine #42096
I believe I might be experiencing the same issue when launching many containers in parallel in a VM on Google Compute Engine (GCE). I have tried to reproduce the issue on multiple bare-metal and non-GCE virtual machines, but so far I can only reproduce it on GCE, where it occurs consistently on multiple different VMs. Inspired by @GunshipPenguin, I ran a very similar experiment:

```sh
for i in {1..100}
do
  time docker run --rm alpine /bin/true &
  sleep 1
done
```

The reason I pause for 1 second between invocations is so that the container schedule is "sustainable": since a single `docker run` normally completes well within a second, the daemon should be able to keep up with one launch per second. Plotting the response times (i.e., the elapsed time between calling `docker run` and the container exiting), I see a drastic increase for the later containers. If I reduce the sleep time in the aforementioned experiment, the drastic increase in response time happens for earlier containers as well; e.g., if I set the sleep time to 0.5 s, the 28th container is already affected by the high response time.
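For reference, a variant of the loop above that records each container's response time for later plotting could look like the sketch below. This is my addition, not the original commenter's script; the output file name and format are made up.

```sh
#!/bin/bash
# Sketch: launch one container per second and log each container's
# response time (index + elapsed seconds) so it can be plotted.
for i in $(seq 1 100); do
  (
    start=$(date +%s.%N)
    docker run --rm alpine /bin/true
    end=$(date +%s.%N)
    echo "$i $(echo "$end - $start" | bc)" >> response_times.txt
  ) &
  sleep 1
done
wait
```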
Do you see a big difference if you start the container with `--network=host`?
The response time of each container is roughly halved when I use `--network=host`. However, if I reduce the sleep time between consecutive container launches from 1 second to 0.5 seconds, I see the sudden increase in response time again. As before, some of the response times easily exceed 30 seconds. It seems to me that the network setup can therefore not be the only bottleneck. Upon investigating further, I found that when the response times increase drastically, there is often not a single container running, so Docker must be stuck somewhere else. As @GunshipPenguin suggested, Docker might be waiting for some lock to be released. I would be happy to investigate further if you have any suggestions.
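For comparison, the `--network=host` variant of the earlier loop differs only in the added flag (a sketch, assuming the same alpine image as above):

```sh
for i in {1..100}
do
  time docker run --rm --network=host alpine /bin/true &
  sleep 1
done
```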
I am experiencing the same problem on GCP. It happened in June as well, but at that time I reverted to a previous snapshot of my GCP image and it seemed to solve the problem. This time around, I reverted to the same image I snapshotted after that episode but am still seeing this behaviour. When I try to start with `--network=host`, I see the error:

```
docker: Error response from daemon: OCI runtime start failed: starting container: setting up network: creating interfaces from net namespace "/proc/7577/ns/net": cannot run with network enabled in root network namespace: unknown.
```

Docker version: Docker Engine - Community.
@iaindooley I've described a similar issue in #42817, but I'm only able to reproduce it on GCP, not on any other provider.
@jespada-bc In the end I just split up my monolithic instance into multiple GCP instances. Through trial and error, I found that Docker starts to bottleneck at about 15 containers, so in my case each GCP instance is 2-core/2 GB, which, for my application, is enough to run Linux plus 15 containers. I then scale the service up and down by spinning up new GCP instances, each adding a chunk of 15 workers to process the queue; when the queue goes back down, I shut those instances down. It was a real hassle to change over, but in the end it's actually a nicer system than the more monolithic scaling I was pursuing when I ran into the problem :)
I'm having the same problem and am also happy to help investigate. Steps to reproduce: launch several containers in parallel that each print a timestamp, then compare the first and last timestamps (a sketch of this kind of reproduction follows this comment). The difference between the first and last timestamps is 2-2.5 s. In contrast, running the same thing without Docker, the difference between the first and last timestamps is about 0.02 s.

It makes sense that Docker is doing a lot more than starting a single process, but 100x seems like a big difference.
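A minimal sketch of that kind of reproduction; the image and exact commands are my assumptions, not the original reporter's:

```sh
# Start 20 containers in parallel; print a host-side timestamp as each
# one finishes, then compare the first and last timestamps.
for i in $(seq 1 20); do
  ( docker run --rm alpine /bin/true; date +%s.%N ) &
done
wait

# Baseline without Docker: run /bin/true directly on the host.
for i in $(seq 1 20); do
  ( /bin/true; date +%s.%N ) &
done
wait
```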
I'm hitting the same problem, but after updating the kernel to 5.15-rc1 or later, containers on a physical machine start quickly. At commit d195d7aac09bddabc2c8326fb02fcec2b0a2de02 they already start quickly. `git log d195d7aac09b faa6a1f9de51` spans many commits, and I don't know which commit influences Docker's start time.
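One way to narrow down the responsible commit in that range is a kernel bisection. The sketch below is my suggestion rather than anything from the comment above, and the old/new endpoint assignment is an assumption (swap the marks if the behaviour turns out to be reversed):

```sh
# In a Linux kernel tree: bisect between the two revisions above,
# timing `docker run` on each candidate kernel.
git bisect start
git bisect new d195d7aac09b   # assumed endpoint: containers start quickly
git bisect old faa6a1f9de51   # assumed endpoint: startup is slow

# For each step: build and boot the candidate kernel, time a run...
time docker run --rm alpine /bin/true
# ...then mark the revision before continuing:
git bisect new    # startup was fast
# or: git bisect old
```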
After commit torvalds/linux@9d3684c24a52, containers start quickly.
Commit torvalds/linux@9d3684c24a52 fixes this issue, but I want to know why Docker starts containers quickly after this commit. I timed run.sh both after this commit and before it; even before this commit, Docker starts containers more quickly than on older kernels, though not as consistently.
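The original run.sh isn't shown above; a plausible reconstruction, matching the twenty `real ...` timings in the next comment, would be something like this (an assumption on my part):

```bash
#!/bin/bash
# Reconstruction (assumed): time 20 sequential container starts.
for i in $(seq 1 20); do
  time docker run --rm alpine /bin/true
done
```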
Before this commit, roughly one run in ten is slow:

```
real 0m0.315s
real 0m0.329s
real 0m0.318s
real 0m0.302s
real 0m0.314s
real 0m0.298s
real 0m0.306s
real 0m0.331s
real 0m0.308s
real 0m0.713s
real 0m0.314s
real 0m0.320s
real 0m0.288s
real 0m0.341s
real 0m0.311s
real 0m0.303s
real 0m0.324s
real 0m0.283s
real 0m0.309s
real 0m0.694s
```
Hi,
Before I create the container above, I also run:
Hope this helps, and I hope this issue can be solved soon.
This commit may fix this issue.
We are facing this issue during startup/shutdown on the bare-metal servers we use for shared hosting. The servers perform really well once everything is running, but starting and stopping the docker service is hell. We currently run around 600-700 containers, and a restart of the docker service literally takes hours. I am thinking of dropping the restart policies and instead using a custom startup/shutdown script for the docker service that gracefully stops each docker-compose project sequentially, rather than letting everything stop in an automatic parallel way.
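A sketch of the sequential shutdown approach described above; the compose project layout under /srv/projects is my assumption:

```sh
#!/bin/sh
# Stop each docker-compose project one at a time instead of letting
# the daemon stop hundreds of containers in parallel.
for project in /srv/projects/*/; do
  echo "Stopping $project"
  docker compose --project-directory "$project" stop
done
```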
You can try this PR: #44887.
Came across this after encountering the same issue: starting hundreds of containers is basically thrashing the system. Any reason why #44887, which was supposed to solve this, stalled?
#44887 stalled because its various iterations either did not fix this issue or fixed the issue by breaking something else.
I see, thanks for the response! But is it so uncommon to spawn tens of containers at the same time that this isn't prioritized? AFAICS, this isn't an edge case apart from the number of containers being started...?
I think my commit doesn't carry too much risk. Can my patch solve your problem?
Description
On a many-core machine (Ryzen 9 3950X in my case), starting 32 docker containers simultaneously results in the time needed to start any one of them increasing 10-20 fold.
Steps to reproduce the issue:
1. On a multi-core machine, start a bunch of docker containers running /bin/true (which does nothing but exit with status 0, so the time needed to run the command should be negligible) in parallel. This can be done with GNU Parallel; in this case I'm starting 32 with -j32, as I'm on a 32-thread machine. (A sketch of the commands follows this list.)
2. Immediately thereafter, in another terminal, run /bin/true in another docker container and measure the time it takes to complete.
3. After all containers from step 1 have exited, run the command from step 2 again and compare the times.
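The exact commands aren't preserved above; a sketch of what they plausibly looked like (the image name and invocation details are my assumptions):

```sh
# Step 1: start 32 containers in parallel with GNU Parallel.
seq 32 | parallel -j32 -N0 docker run --rm alpine /bin/true

# Step 2 (in another terminal, while step 1 is still running):
time docker run --rm alpine /bin/true

# Step 3: after everything from step 1 has exited, repeat step 2
# and compare the times.
time docker run --rm alpine /bin/true
```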
In my case, when running /bin/true with 32 other containers starting at the same time, the command took 10-20 times longer than when running it without the other containers starting at the same time.
Describe the results you received:
Time needed in step 2 was vastly greater than time needed in step 3.
Describe the results you expected:
Times in steps 2 and 3 are comparable, given that the container-creation processes should be scheduled on different cores.
Additional information you deem important (e.g. issue happens only occasionally):
Thinking `dockerd` may be serializing around a lock or something of the sort, I generated two off-CPU flame graphs of the `dockerd` process using tools from Brendan Gregg. They can be found zipped here: flamegraphs.zip. `offcpu-dockerd-parallel.svg` covers docker being used to start 32 containers in parallel, and `offcpu-dockerd-sequential.svg` covers docker being used to start 32 containers sequentially. As can be seen in the parallel case, `dockerd` spends a very large amount of time blocked in the `open` syscall (see the right-hand side of the graph), specifically when waiting for the other end of a FIFO to be opened. This is not present in the sequential case.
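For anyone wanting to reproduce such graphs: a sketch using bcc's offcputime and Brendan Gregg's FlameGraph scripts. The tool paths and the 60-second duration are my assumptions, and this is one common way to produce off-CPU flame graphs, not necessarily how the graphs above were made:

```sh
# Record 60 s of off-CPU stacks for dockerd, folded for flame graphs.
/usr/share/bcc/tools/offcputime -df -p "$(pgrep -x dockerd)" 60 > dockerd.stacks

# Render with FlameGraph (https://github.com/brendangregg/FlameGraph).
./flamegraph.pl --color=io --countname=us < dockerd.stacks > offcpu-dockerd.svg
```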
Running `lsof` on dockerd during the parallel test, I consistently saw that the FIFOs being waited on are `/run/docker/containerd/<container_id>/init-[stderr,stdout]`.

Some googling resulted in me stumbling upon #29369, which seems to be a very similar situation. This comment is of particular interest, as it revealed that containerd only allows starting 10 containers in parallel at once, putting any further requests in a queue. This might have provided an answer to this issue; however, repeating the reproduction steps above with 9 containers in step 1 resulted in a similar (albeit slightly less extreme) performance hit (~1.8 s vs ~0.7 s). Additionally, given that comment is 5 years old, I'm not sure if it's still accurate regarding containerd limits.
I'm not entirely sure if this is expected behaviour or not. It would of course be expected behaviour if there was a contended resource shared between all 32 container startups that needed to be locked around for sequential access. I'm however not familiar enough with the docker/containerd codebases to know if such a resource exists. If it does, I'd be very grateful if someone could enlighten me to its existence 😛
Output of `docker version`:

Output of `docker info`:

Additional environment details (AWS, VirtualBox, physical, etc.):
Running on a Debian 10 workstation. Output of `uname -a`: