[Linux] CPU idle time may go backward after suspension #2172

Open
giampaolo opened this issue Nov 15, 2022 · 0 comments

Comments

@giampaolo
Owner

Summary

  • OS: Linux (tested with 5.15.0)
  • Type: core

Description

When a Linux system is suspended, the system's idle time goes backwards. This Python script, launched before suspend and left running across wakeup, illustrates the problem:

import psutil
import time
import datetime

while 1:
    print(datetime.datetime.now(), psutil.cpu_percent())
    print(datetime.datetime.now(), psutil.cpu_times())
    time.sleep(1)

It prints the following (notice that idle, and also iowait, are lower right after wakeup):

2022-11-15 19:30:11.732441 12.6
2022-11-15 19:30:11.732597 scputimes(user=14667.97, nice=64.45, system=7086.26, idle=62218.47, iowait=79.98, irq=0.0, softirq=80.95, steal=0.0, guest=0.0, guest_nice=0.0)
# SYSTEM SUSPEND
# SYSTEM WAKEUP
2022-11-15 19:30:19.170989 100.0
2022-11-15 19:30:19.171226 scputimes(user=14668.67, nice=64.45, system=7086.91, idle=58733.86, iowait=76.24, irq=0.0, softirq=80.95, steal=0.0, guest=0.0, guest_nice=0.0)

Why does this happen?

The idle value printed above comes directly from the kernel: it is the 4th value on the cpu line of the /proc/stat file, so it's the kernel which is at fault here. Idle time is used to calculate the system busy time (see psutil source), which is why a backward jump in idle results in a 100% CPU usage value.
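To make the 100% figure concrete, here is a simplified sketch of a busy-percentage calculation applied to the two samples printed above. It is not psutil's actual code: `CpuTimes` and `busy_percent` are hypothetical names, and the sketch assumes that negative deltas are clamped to zero, that idle and iowait count as non-busy time, and that guest time is already included in user time.

```python
from collections import namedtuple

# Hypothetical stand-in for psutil's scputimes on Linux.
CpuTimes = namedtuple(
    "CpuTimes",
    "user nice system idle iowait irq softirq steal guest guest_nice",
)

# The two samples from the output above (before suspend, after wakeup).
before = CpuTimes(14667.97, 64.45, 7086.26, 62218.47, 79.98, 0.0, 80.95, 0.0, 0.0, 0.0)
after = CpuTimes(14668.67, 64.45, 7086.91, 58733.86, 76.24, 0.0, 80.95, 0.0, 0.0, 0.0)


def busy_percent(t1, t2):
    """Return the busy percentage between two samples (simplified sketch)."""
    # Assumption: a field that went backwards contributes a delta of 0.
    deltas = CpuTimes(*(max(0.0, b - a) for a, b in zip(t1, t2)))
    # Assumption: guest time is already accounted for in user/nice.
    total = sum(deltas) - deltas.guest - deltas.guest_nice
    # Idle and iowait are the only non-busy components.
    busy = total - deltas.idle - deltas.iowait
    if total <= 0:
        return 0.0
    return round(busy / total * 100, 1)


print(busy_percent(before, after))  # → 100.0
```

Because idle and iowait both decreased, their deltas collapse to zero, so the only elapsed time left in the total is the small user and system increment, all of which counts as busy.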

What to do?

Still not sure. psutil could keep track of the last idle time and, if idle time went backwards since the last call, add the difference back. There is a precedent for this in disk_io_counters and net_io_counters, see:

def disk_io_counters(perdisk=False, nowrap=True):
    """
    ...
    If *nowrap* is True it detects and adjust the numbers which overflow
    and wrap (restart from 0) and add "old value" to "new value" so that
    the returned numbers will always be increasing or remain the same,
    but never decrease.
    ...
    """

Those functions (disk / net IO) are somewhat different though, since they represent cumulative counters, whereas CPU timings have to do with... (passing) time. As such, perhaps it would not be correct to just add the missing time to idle time, without taking other CPU timings (user, system, etc.) into account.
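As a rough illustration of the "keep track of the last idle time" idea, the sketch below wraps a single raw reading so that the reported value never decreases. `MonotonicValue` is a hypothetical helper, not part of psutil, and it deliberately ignores the caveat above: it compensates one field in isolation and does nothing about user, system, or the other timings.

```python
class MonotonicValue:
    """Hypothetical helper: report a raw reading so that it never decreases."""

    def __init__(self):
        self._last_raw = None   # most recent raw reading
        self._offset = 0.0      # total amount "lost" to backward jumps
        self._last_out = None   # most recent value handed to the caller

    def update(self, raw):
        if self._last_raw is not None and raw < self._last_raw:
            # The raw value went backwards: remember how much was lost.
            self._offset += self._last_raw - raw
        self._last_raw = raw
        adjusted = raw + self._offset
        if self._last_out is not None:
            # Guard against floating-point rounding undoing the guarantee.
            adjusted = max(adjusted, self._last_out)
        self._last_out = adjusted
        return adjusted


idle = MonotonicValue()
first = idle.update(62218.47)    # reading before suspend
second = idle.update(58733.86)   # reading right after wakeup (went backwards)
third = idle.update(58734.86)    # one more second of idle time afterwards
print(first <= second <= third)  # → True
```

With this approach the time spent suspended is simply not counted, so the first delta after wakeup is roughly zero rather than negative; whether that is the right semantics for CPU timings is exactly the open question raised above.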

Leaving this open so that we can at least keep track of this (weird) kernel behavior on Linux.
