task id / pthread_self() id #186

Closed
giampaolo opened this issue May 23, 2014 · 8 comments
Comments

@giampaolo
Owner

From tiran79 on July 12, 2011 12:28:25

On Linux and possibly other platforms with a /proc interface, 
psutil.get_threads() returns thread information with the kernel's task id 
(TID). However Python threads are using a different type of identifier from 
pthread_self(). 

I'm interested in getting per thread CPU utilization. psutil gives me that 
information but I can't map the data to Python threads because POSIX thread ids 
can't be mapped to TIDs. At least I haven't found a simple way yet.

The attached patch implements gettid() (a Linux specific syscall) for psutil. 
With gettid() a thread can get its own CPU utilization.
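
As a rough illustration of "a thread can get its own CPU utilization", here is a minimal sketch that does not use the attached patch. It assumes Python 3.8 or later, where `threading.get_native_id()` returns the kernel TID on Linux, and it reads the times straight from `/proc/self/task/<tid>/stat`. The function name `own_thread_cpu_times` is made up for this example.

```python
import os
import sys
import threading


def own_thread_cpu_times():
    """Return (user_seconds, system_seconds) for the calling thread (Linux only)."""
    if not sys.platform.startswith("linux"):
        raise NotImplementedError("this sketch relies on Linux's /proc/self/task")
    # Kernel task id of the calling thread, i.e. the value gettid() returns.
    tid = threading.get_native_id()
    with open(f"/proc/self/task/{tid}/stat") as f:
        stat = f.read()
    # The second field (comm) may contain spaces or parentheses, so split
    # after the last ')' and index the remaining fields from there.
    fields = stat.rsplit(")", 1)[1].split()
    ticks_per_second = os.sysconf("SC_CLK_TCK")
    utime = int(fields[11]) / ticks_per_second  # stat field 14: utime
    stime = int(fields[12]) / ticks_per_second  # stat field 15: stime
    return utime, stime
```

Called from inside a worker thread, it reports that worker's own counters rather than the totals for the whole process.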

Attachment: gettid.patch

Original issue: http://code.google.com/p/psutil/issues/detail?id=186

@giampaolo
Owner Author

From g.rodola on July 12, 2011 04:38:59

How do you intend to map threads exactly? 
The thread.get_ident() doc [1] says:

> Return the ‘thread identifier’ of the current thread. This is a nonzero integer. 
> Its value has no direct meaning; it is intended as a magic cookie to be used e.g. 
> to index a dictionary of thread-specific data.

...hence I'm not sure you can use it to map python threads.

[1] http://docs.python.org/library/thread.html#thread.get_ident

@giampaolo
Owner Author

From tiran79 on July 13, 2011 05:06:05

Python uses pthread_self() as thread identifier on Linux and other pthread 
platforms. On Linux pthreads are built on top of clone(). Cloned processes 
share the same PID but have a different TID (task id). The /proc kernel 
interface just exposes the low level kernel tasks and TIDs but not the pthread 
identifier. The __NR_gettid syscall returns the thread local TID.
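
The difference between the two identifiers can be seen from Python itself. This is a small sketch assuming Python 3.8 or later, which added `threading.get_native_id()` next to `threading.get_ident()`; the helper names `thread_ids` and `ids_from_worker` are hypothetical.

```python
import os
import threading


def thread_ids():
    """Collect the identifiers visible to the calling thread."""
    return {
        "pid": os.getpid(),                       # shared by all threads of the process
        "ident": threading.get_ident(),           # Python's opaque thread identifier
        "native_id": threading.get_native_id(),   # OS-level id (the TID on Linux)
    }


def ids_from_worker():
    """Run thread_ids() in a second thread and return what it saw."""
    result = {}
    worker = threading.Thread(target=lambda: result.update(thread_ids()))
    worker.start()
    worker.join()
    return result
```

Both threads report the same `pid` but different `native_id` values, and on Linux the `native_id` values are the ones listed under `/proc/<pid>/task`.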

On Windows, psutil's get_threads() and Python's threading API are using the 
same thread idents. thread_nt.h implements PyThread_get_thread_ident() with  
GetCurrentThreadId().

>>> p = psutil.Process(os.getpid())
>>> p.get_threads()
[thread(id=548, user_time=0.6875, system_time=0.1875), thread(id=2656, 
user_time=0.0, system_time=0.0)]
>>> threading.enumerate()
[<Thread(Thread-1, started daemon 2656)>, <_MainThread(MainThread, started 548)>]

I can't test psutil on other platforms.

We have hooks in our application that are run inside the thread whenever a 
thread starts and terminates. I'm working on a PEP with a similar interface for 
Python. In the meantime people could monkey patch 
threading.Thread.__bootstrap(), too. I'd like to have gettid (or similar) in 
psutil because I find it tedious to have a C extension module just for the one 
function. ctypes isn't a good option because the __NR_gettid syscall number is 
architecture specific. X86_64 and X86 have different numbers; ARM, PPC etc. too.
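
To make the ctypes objection concrete, here is a sketch of what such a wrapper might look like. The syscall table is deliberately incomplete and is keyed by machine name and pointer width; the function names are invented for the example, and every architecture not listed would need its own entry.

```python
import ctypes
import os
import platform
import struct
import sys
import threading

# gettid syscall numbers keyed by (machine, pointer width in bits).
# Incomplete on purpose: ARM, PPC and others need their own entries.
GETTID_SYSCALL_NUMBERS = {
    ("x86_64", 64): 186,
    ("i386", 32): 224,
    ("i686", 32): 224,
    ("aarch64", 64): 178,
}


def gettid_via_ctypes():
    """Return the calling thread's TID by invoking the raw gettid syscall."""
    if not sys.platform.startswith("linux"):
        raise NotImplementedError("gettid is a Linux-specific syscall")
    key = (platform.machine(), struct.calcsize("P") * 8)
    if key not in GETTID_SYSCALL_NUMBERS:
        raise NotImplementedError(f"no gettid syscall number known for {key}")
    libc = ctypes.CDLL(None, use_errno=True)
    libc.syscall.restype = ctypes.c_long
    tid = libc.syscall(ctypes.c_long(GETTID_SYSCALL_NUMBERS[key]))
    if tid < 0:
        errno = ctypes.get_errno()
        raise OSError(errno, os.strerror(errno))
    return tid


def agrees_with_stdlib():
    """Cross-check against threading.get_native_id() (Python 3.8+)."""
    return gettid_via_ctypes() == threading.get_native_id()
```

The lookup table is the fragile part: a wrong number silently invokes a different syscall, which is why the sketch refuses to guess on unknown platforms.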

@giampaolo
Owner Author

From g.rodola on July 13, 2011 05:51:29

AFAICT both GetCurrentThreadId() and syscall( __NR_gettid ) refer to the 
current process (os.getpid()), hence they cannot be used in the Process class, 
which is supposed to work with *any* process/pid.

Also, it's not clear to me what you would achieve by exposing gettid() alone. 
Could you provide a practical example of how you would retrieve the CPU times 
of a thread by using gettid()/psutil?

Perhaps I'm misunderstanding you but I have a feeling this is not in the realm 
of problems which should be dealt with by base psutil.

@giampaolo
Owner Author

From tiran79 on July 13, 2011 06:38:39

Ah, you didn't get my initial use case. I'm sorry for the misunderstanding.

psutil 0.3 gives me detailed CPU usage information for each thread of a 
process. I'd like to map the data to Python threads for the current Python 
process in order to find CPU intensive threads. All threads in our application 
have meaningful names.

On Windows it's trivial to map the CPU usage information to Python threads 
because psutil's thread information and threading.enumerate() share equal 
thread identifiers. For Linux I need the information from gettid() for each thread.

Practical example:
On Linux I monkey patch the threading.Thread class so that every thread stores 
its TID in the threading.Thread instance. With the additional information I can 
now map Python's threading.Threads to psutil's thread infos.
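
A sketch of that mapping, under the assumption of Python 3.8 or later, where every started `threading.Thread` already carries its kernel id as `native_id`, so no monkey patching is needed. The `ThreadInfo` tuple is a stand-in that mirrors the fields in the psutil output quoted earlier, and `cpu_times_by_thread_name` is a hypothetical helper.

```python
import threading
from collections import namedtuple

# Stand-in for the per-thread records psutil returns; the field names
# follow the output quoted earlier in this thread.
ThreadInfo = namedtuple("ThreadInfo", ["id", "user_time", "system_time"])


def cpu_times_by_thread_name(os_threads):
    """Map Python thread names to OS-level thread records.

    os_threads is any iterable of objects with an `id` attribute holding
    the OS thread id. Records with no matching Python thread are skipped.
    """
    names_by_tid = {t.native_id: t.name for t in threading.enumerate()}
    return {
        names_by_tid[info.id]: info
        for info in os_threads
        if info.id in names_by_tid
    }
```

With psutil installed, the argument would be `psutil.Process().threads()` (the current name of `get_threads()`), giving a dictionary such as `{"MainThread": pthread(id=..., ...)}` keyed by the meaningful thread names.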

Everything except gettid can be implemented in Python easily. Therefore I suggest that 
psutil implements a function that returns the get_threads() specific thread id 
for each platform. On Linux it's gettid, on Windows it's thread.get_ident(). I 
don't have access to BSD to test it there.

@giampaolo
Owner Author

From g.rodola on July 13, 2011 06:43:43

What about processes != os.getpid()?

@giampaolo
Copy link
Owner Author

From tiran79 on July 13, 2011 06:51:48

My proposal and use case is restricted to introspection of the current process.

For other processes there isn't a (simple) way to access Python's thread 
metadata from threading.enumerate() either.

@giampaolo
Owner Author

From g.rodola on July 13, 2011 07:15:07

Then it's not something which is up to psutil, imo.

@giampaolo
Owner Author

From g.rodola on July 29, 2011 07:08:04

Status: WontFix
