Improve CPU time resolution #1461

Open
VorpalBlade opened this issue Mar 13, 2019 · 7 comments
@VorpalBlade

The resolution of the CPU times seems to be artificially limited to 2 decimal places. For example:

In [1]: import subprocess
In [2]: import psutil
In [3]: p = psutil.Popen(["yes"], stdout=subprocess.DEVNULL)
< wait a bit here >
In [4]: print(p.cpu_times())
pcputimes(user=1.83, system=4.31, children_user=0.0, children_system=0.0)

In fact it seems impossible to get more than 2 decimals from psutil.

But using the bash built-in time command on Linux I get 3 decimal places:

$ time sleep 0.235

real	0m0.236s
user	0m0.001s
sys	0m0.000s

There doesn't seem to be a good reason to artificially limit the precision of returned values (in decimal even!), when they might not be printed, but used by a program, such as for benchmarking.

@giampaolo
Owner

giampaolo commented Mar 13, 2019

Uhm... we collect that value from /proc/pid/stat (a raw tick count, which looks like 117), then we divide it by CLOCK_TICKS, which is 100, hence the reason why this has 2 digits of precision. No idea how the time command manages to do better than that, but it looks worth investigating.
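
To illustrate why the tick-based value cannot carry more than two decimals, here is a minimal sketch of the conversion described above. It is not psutil's actual code: the helper names `parse_cpu_ticks` and `ticks_to_seconds` and the sample stat line are made up for illustration, and the field positions (utime and stime as fields 14 and 15) follow the Linux proc(5) layout.

```python
import os


def parse_cpu_ticks(stat_line):
    """Return (utime, stime) in clock ticks from a /proc/<pid>/stat line."""
    # The command name (field 2) is wrapped in parentheses and may contain
    # spaces, so only split the text that follows the last ')'.
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state), so fields 14 and 15 are rest[11], rest[12].
    return int(rest[11]), int(rest[12])


def ticks_to_seconds(ticks, clock_ticks=None):
    """Convert clock ticks to seconds; defaults to the system tick rate."""
    if clock_ticks is None:
        clock_ticks = os.sysconf("SC_CLK_TCK")  # typically 100 on Linux
    return ticks / clock_ticks


# Hypothetical stat line, truncated after the fields used here.
sample = "1234 (yes) R 1 1234 1234 0 -1 4194304 100 0 0 0 183 431 0 0 20 0 1 0"
utime, stime = parse_cpu_ticks(sample)
print(ticks_to_seconds(utime, 100), ticks_to_seconds(stime, 100))
```

With 100 ticks per second, the smallest representable step is 0.01 s, no matter how the result is formatted afterwards.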

@VorpalBlade
Author

Interestingly it appears that the bash builtin time command has higher resolution than the program /usr/bin/time (the latter only reports two decimals as well).

@giampaolo
Owner

Interesting. We should look at the source code of these tools.

@wiggin15
Collaborator

This page is a good resource that explains why bash's built-in "time" is more precise than /usr/bin/time, and which calls these tools use to get the results: https://gist.github.com/linse/3737870
Unfortunately it looks like these tools measure the time difference between the start of the call and end of the call, instead of returning information about a specific process.

@VorpalBlade
Author

VorpalBlade commented Mar 21, 2019

getrusage would only work for child processes (which is in fact my use case), but it would not work for multiple concurrent child processes (which is also my use case). It could work if you fork a separate child process to measure each process you want to run I guess, but that seems annoying and complicated.

Having maxrss as a measurement of the peak memory usage (which appears to be in getrusage) would also be pretty handy though.

An alternative that would work for me would be a tiny C wrapper that executes the program and reports the getrusage() result at the end back to the python program, and I might end up writing that instead. Would not be a general solution that would fit into psutil though.
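
As a possible variation on the wrapper idea, a rough sketch in Python rather than C: `os.wait4` (Unix only) reaps one specific child and returns that child's own rusage, so it might cover the concurrent-children case without a separate helper binary. This is an assumption about what would fit the use case, not something psutil provides; `run_and_measure` is a hypothetical name.

```python
import os
import subprocess
import sys


def run_and_measure(argv):
    """Run argv to completion; return (exit_code, rusage) for that child."""
    proc = subprocess.Popen(argv, stdout=subprocess.DEVNULL)
    # os.wait4 waits for this specific pid and returns its resource usage,
    # so other children running at the same time are accounted separately.
    _, status, usage = os.wait4(proc.pid, 0)
    if os.WIFEXITED(status):
        code = os.WEXITSTATUS(status)
    else:
        code = -os.WTERMSIG(status)
    # The child is already reaped; record that so Popen does not wait again.
    proc.returncode = code
    return code, usage


code, usage = run_and_measure([sys.executable, "-c", "pass"])
print(code, usage.ru_utime, usage.ru_stime, usage.ru_maxrss)
```

Note that `ru_maxrss` is reported in kilobytes on Linux and in bytes on macOS, and that this blocks until the child exits, so measuring several children in parallel would need one waiting thread per child or a similar arrangement.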

@giampaolo
Owner

giampaolo commented Mar 23, 2019

@wiggin15 thanks a lot for finding that out. The difference is indeed getrusage:

import psutil, resource
p = psutil.Process()
# warm up
for x in range(10000000):
    pass
for x in range(5):
    a = p.cpu_times()
    b = resource.getrusage(resource.RUSAGE_SELF)
    print(a.user, a.system)
    print(b.ru_utime, b.ru_stime)

Which prints:

0.41 0.02
0.416417 0.020187
0.41 0.02
0.416537 0.020187
0.41 0.02
0.416585 0.020187
0.41 0.02
0.41662699999999997 0.020187
0.41 0.02
0.416695 0.020187

Some considerations:

  1. We can use getrusage for current process only (os.getpid()), else use the less precise platform-specific implementation. I think it makes sense to add this exception (the first one in psutil) because the current process is more important than others (the class signature itself is psutil.Process(pid=os.getpid())).

  2. getrusage is a POSIX standard, so Linux is not the only one which can benefit from this.

  3. We can use resource.getrusage(resource.RUSAGE_CHILDREN) to enhance precision for the whole named tuple:

>>> psutil.Process().cpu_times()
pcputimes(user=0.1, system=0.01, children_user=0.0, children_system=0.0)
  4. Extra: os.times() also provides only 2 decimal places of precision, so technically this could be landed in Python as well, probably as a doc fix which mentions the more precise alternative.
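
The os.times() point can be checked with a quick side-by-side comparison. This is only a sketch: `burn` is a throwaway helper to accumulate some CPU time, and the exact numbers (and the tick rate behind os.times()) depend on the platform.

```python
import os
import resource


def burn(n=2_000_000):
    """Spend some user CPU time so the counters are non-trivial."""
    total = 0
    for i in range(n):
        total += i
    return total


burn()
coarse = os.times()                              # tick-based
fine = resource.getrusage(resource.RUSAGE_SELF)  # microsecond fields
print("os.times: ", coarse.user, coarse.system)
print("getrusage:", fine.ru_utime, fine.ru_stime)
```

Both calls report the same counters for the current process, so the values should agree up to the coarser resolution of os.times().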

@giampaolo giampaolo added the unix label Mar 23, 2019
@giampaolo
Owner

Putting this here for now and I'll get back to it later:

diff --git a/psutil/__init__.py b/psutil/__init__.py
index ab2ed349..b3680bfb 100644
--- a/psutil/__init__.py
+++ b/psutil/__init__.py
@@ -37,6 +37,10 @@ try:
     import pwd
 except ImportError:
     pwd = None
+try:
+    import resource
+except ImportError:
+    resource = None
 
 from . import _common
 from ._common import deprecated_method
@@ -226,6 +230,7 @@ POWER_TIME_UNLIMITED = _common.POWER_TIME_UNLIMITED
 POWER_TIME_UNKNOWN = _common.POWER_TIME_UNKNOWN
 _TOTAL_PHYMEM = None
 _LOWEST_PID = None
+_THIS_PID = os.getpid()
 
 # Sanity check in case the user messed up with psutil installation
 # or did something weird with sys.path. In this case we might end
@@ -1152,7 +1157,21 @@ class Process(object):
         On macOS and Windows children_user and children_system are
         always set to 0.
         """
-        return self._proc.cpu_times()
+        if self.pid == _THIS_PID and resource is not None:
+            # better precision
+            t = resource.getrusage(resource.RUSAGE_SELF)
+            utime, stime = t.ru_utime, t.ru_stime
+            children_utime, children_stime = 0, 0
+            if hasattr(resource, "RUSAGE_CHILDREN"):
+                t = resource.getrusage(resource.RUSAGE_CHILDREN)
+                children_utime, children_stime = t.ru_utime, t.ru_stime
+            else:
+                t = self._proc.cpu_times()
+                children_utime, children_stime = t[2], t[3]
+            return _common.pcputimes(
+                utime, stime, children_utime, children_stime)
+        else:
+            return self._proc.cpu_times()
 
     @memoize_when_activated
     def memory_info(self):
