[Linux] process_iter ridiculously slow (like 240x slower than ps) #1751

Closed
petersilva opened this issue May 6, 2020 · 5 comments
Comments

@petersilva

Platform

  • Ubuntu 16.04, 18.04
  • psutil version: 5.4.2
  • Python version: 3.6.9

Bug description

Python's psutil.process_iter runs about 240 times slower than ps for the same process list (timings below), which makes it unusable here.

sarra@ddsr1:/tmp$ time ps eaux | wc -l
1632

real	0m0.276s
user	0m0.139s
sys	0m0.131s
sarra@ddsr1:/tmp$
sarra@ddsr1:/tmp$ time python3 p.py

real	1m15.800s
user	0m28.375s
sys	0m38.346s
sarra@ddsr1:/tmp$ cat p.py

import psutil
for proc in psutil.process_iter( ['pid','cmdline','name', 'username' ] ):
     p = proc.as_dict()

sarra@ddsr1:/tmp$ 

@petersilva petersilva added the bug label May 6, 2020
@petersilva
Author

Oh, and I tried selecting fields; it made no difference.

@petersilva
Author

The following is also an order of magnitude faster:


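import psutil
import pwd
import os.path


for pid in psutil.pids():
    p = {}
    p['pid'] = pid

    # Skip processes that exited after psutil.pids() listed them.
    pidir = '/proc/%d' % pid
    if not os.path.exists(pidir):
        continue
    # /proc/<pid>/cmdline holds the arguments separated by NUL bytes.
    with open(pidir + '/cmdline', 'r') as cf:
        p['cmdline'] = cf.read().split('\x00')
    # loginuid is read here but not yet mapped to a username.
    with open(pidir + '/loginuid', 'r') as luf:
        uid = luf.read()
    p['name'] = p['cmdline'][0]
    print('.', end='')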

It looks like psutil gathers many data structures that the user is not asking for, and fetching them is very
expensive, resulting in a performance penalty of several hundredfold.

@giampaolo
Owner

You're using it wrong. It should be:

import psutil
for proc in psutil.process_iter( ['pid','cmdline','name', 'username' ] ):
-      p = proc.as_dict()
+      proc.info
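
For context on why the change matters: when `attrs` is passed to `process_iter()`, psutil fetches only those attributes and stores them on each yielded `Process` as the `info` dict. Calling `as_dict()` with no arguments does not reuse that dict and instead collects every attribute psutil supports. The sketch below times both patterns side by side. The helper names `via_info`, `via_as_dict` and `timed` are illustrative and not part of psutil, and the timings will vary with the machine and the number of processes.

```python
import time

import psutil

ATTRS = ['pid', 'cmdline', 'name', 'username']


def via_info():
    """Read the dict that process_iter() already populated for ATTRS."""
    return [proc.info for proc in psutil.process_iter(ATTRS)]


def via_as_dict():
    """Re-query every process; as_dict() without attrs fetches all fields."""
    rows = []
    for proc in psutil.process_iter(ATTRS):
        try:
            rows.append(proc.as_dict())
        except psutil.NoSuchProcess:
            # The process exited between iteration and the call.
            pass
    return rows


def timed(func):
    """Return (elapsed_seconds, result) for a zero-argument callable."""
    start = time.perf_counter()
    result = func()
    return time.perf_counter() - start, result


if __name__ == '__main__':
    for func in (via_info, via_as_dict):
        elapsed, rows = timed(func)
        print('%-12s %5d processes  %.3fs' % (func.__name__, len(rows), elapsed))
```

If only a subset is needed outside of iteration, `proc.as_dict(attrs=ATTRS)` also restricts the work to the listed fields.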

@petersilva
Author

Thanks!

@Scylla2020

Scylla2020 commented Jun 16, 2024

@giampaolo I already had it like that in my case on Windows and it's still super slow: getting 6 attributes takes about 12 seconds for 180 processes. Any other ideas? The slowness is in the psutil.process_iter step. I'm curious how other tools like Process Hacker can handle many fields and thousands of processes super fast.
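
One way to narrow this down is to time each attribute separately, since the cost of individual fields can differ a lot between platforms. The following is only a diagnostic sketch: `time_attrs` is a hypothetical helper, the six attribute names are placeholders (the comment does not say which ones were requested), and it makes no claim about which field is slow on Windows.

```python
import time

import psutil


def time_attrs(attrs):
    """Return {attr: seconds} for fetching each attribute across all processes."""
    # Warm psutil's internal Process cache so the first attribute
    # is not charged for creating the Process objects.
    for _ in psutil.process_iter():
        pass

    timings = {}
    for attr in attrs:
        start = time.perf_counter()
        # Requesting one attribute at a time isolates its cost.
        for proc in psutil.process_iter([attr]):
            proc.info  # already populated by process_iter()
        timings[attr] = time.perf_counter() - start
    return timings


if __name__ == '__main__':
    # Placeholder attribute list; substitute the fields actually in use.
    wanted = ['pid', 'name', 'username', 'cmdline', 'cpu_times', 'memory_info']
    ranked = sorted(time_attrs(wanted).items(), key=lambda kv: -kv[1])
    for attr, seconds in ranked:
        print('%-12s %.3fs' % (attr, seconds))
```

If one attribute dominates the total, dropping it or fetching it only for the processes of interest is a reasonable first experiment.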
