Uses a lot of CPU when idle #5089
Here is a high resolution capture of CPU usage of
It should be 0.00% non-stop instead. |
I think this is mostly all related to the go runtime. |
@baryluk you can run |
another dump with docker also running:
|
Watching my system while idle and containerd not running anything, I see it periodically doing something as well. The amount of CPU is likely runtime related as @cpuguy83 said, but when I ran the profile over a period of 5 minutes (of inactivity), 25% was to the dialer, about 70% to runtime, and 1% to some That I am not sure about the dialer, whether that is related to the debug socket or something else going on. Either way, having those periodic tasks that run when not needed needs to be cleaned up and would give the runtime less to do. The GC scheduler has a similar periodic check where it wakes up and checks a flag, I'm not sure if there is a cleaner pattern in Go to trigger that on rather than sleep -> check -> sleep (see https://github.com/containerd/containerd/blob/master/gc/scheduler/scheduler.go#L273) |
Hi, I have the same issue on a fresh Ubuntu 21.04 installation. Containerd from official Ubuntu repository ( When using
|
I tried going through and disabling various plugins to see if there was anything obviously increasing the CPU. The overall CPU usage can be decreased by disabling some of the periodic functions or making them more efficient...
Disabling or fixing those doesn't do much for the mentioned syscalls. Given that the only things left running are the RPC servers and various channel listeners, the calls are likely from the Go runtime or Go libraries. I'm not seeing anything close to the high CPU usage mentioned, just constant low usage even with nothing running. We'll continue to look at ways to cut down on idle usage, but updating containerd to a version built with a newer version of Go will always help, as will disabling plugins you don't need. Some of those plugins can also be configured with longer intervals; the defaults today probably assume containers are actively being run and are not necessarily tuned for mostly idle usage. |
Some gdb's bt which may help
and
|
On startup the GC might run so fast that `gcTimeSum` is `0`; in that case the algorithm turns into an infinite loop that simply consumes CPU on a timer which fires without any interval. Closes: containerd#5089
BTW, I reported this bug almost 3 years ago, but recently I have had no issues on Debian testing. containerd is mostly idle and not using much CPU: 0.0-1.0% of one core. So it looks like it is still waking up a little periodically, despite no containers or user interactions. dockerd is running, but is idle. Looking at strace, it is doing some non-blocking epoll_pwait calls with zero timeout. Then it either tries to acquire some mutex with a futex wait timeout (randomized around 6ms), or sometimes goes to sleep for 10ms. So it still technically spins, and executes something 100 times per second. Instead of doing this, a better approach would be to create an eventfd, add it to the epoll file set, and then wait on everything with an infinite timeout. The other threads can wake that epoll using the eventfd, e.g. when they release some mutex (I do not know where that loop is or what it does exactly, but it looks like it participates in some cross-thread synchronization too). |
On Thu, 08 Feb 2024 22:33:34 +0100, Witold Baryluk wrote:
BTW, I reported this bug almost 3 years ago, but recently I have had no issues on Debian testing. containerd is mostly idle and not using much CPU: 0.0-1.0% of one core. So it looks like it is still waking up a little periodically, despite no containers or user interactions. dockerd is running, but is idle.
Looking at strace, it is doing some non-blocking epoll_pwait calls with zero timeout. Then it either tries to acquire some mutex with a futex wait timeout (randomized around 6ms), or sometimes goes to sleep for 10ms. So it still technically spins, and executes something 100 times per second.
Instead of doing this, a better approach would be to create an eventfd, add it to the epoll file set, and then wait on everything with an infinite timeout. The other threads can wake that epoll using the eventfd, e.g. when they release some mutex (I do not know where that loop is or what it does exactly, but it looks like it participates in some cross-thread synchronization too).
The issue is tricky.
It may or may not appear on startup, depending on how long the first GC cycle took. If the cycle took a measurable amount of time, the interval won't be zero and nobody encounters the issue.
I encountered it on a clean VM with a 100% reproduction rate.
So, the fix is available as: #9795
…--
wbr, Kirill
|
On startup the GC might run so fast that `gcTimeSum` is `0`; in that case the algorithm turns into an infinite loop that simply consumes CPU on a timer which fires without any interval. Use `5ms` as a fallback so that the interval is `245ms` in that case. Closes: containerd#5089 Signed-off-by: Kirill A. Korinsky <kirill@korins.ky>
After installing `containerd` (version 1.4.3~ds1-2, from Debian, on Debian testing), I noticed it is almost always at the top of the `top` output, sometimes using 10% CPU.
This is a fresh installation of `containerd` with nothing configured.
This is wasteful, and it prevents the CPU and the system from entering deeper sleep states, which is important on laptops, but also on desktops. It also interferes with very accurate time measurements of other programs.
Running `strace -f` on the running process shows that it is very frequently using futex and short sleeps of 20us.
Consider using `epoll` with a multi-second timeout, and possibly eventfd for futex waits (which can be `epoll`-ed on too), instead of explicit polling with no timeout and `nanosleep`.
Or maybe my diagnosis is wrong, as I see a lot of futex wakes, despite me doing absolutely nothing with `containerd`.
A ~3 second sample of `strace` is in the attachment.
containerd-strace.txt