process.create_time() and hash(process) are not immutable on linux #711

Closed
adrianheron opened this issue Nov 13, 2015 · 8 comments

@adrianheron

I've run into two related issues using psutil on Linux (Ubuntu 14.04.3):

  1. Because of the way Process.create_time() is calculated, it changes if the system time changes (commonly from ntp corrections, but in my case I noticed it because the clock changed during the boot sequence). This may or may not be a big deal depending on what you want the creation time for.
  2. The documentation states that hash(process) is calculated from the pid and the create_time, so if the create_time changes the hash also changes. This is a big deal since it defeats the point of a hash.

The problem does not appear to be present on Windows, presumably due to the way create_time is calculated there.

To reproduce:

  1. Pick a process, e.g. pid 12345.
  2. Start Python from the command line (the values shown are only examples):

       >>> import psutil
       >>> p = psutil.Process(12345)
       >>> p.create_time()
       1447426717.05
       >>> hash(p)
       10728053

  3. Change the date: date -s "Fri Nov 13 10:09:44 EST 2015"
  4. Start Python again (restarting matters, because psutil caches the BASE_TIME value it uses to calculate create_time):

       >>> import psutil
       >>> p = psutil.Process(12345)
       >>> p.create_time()
       <creation time will be different>
       >>> hash(p)
       <hash will be different>

I do not know what the solution to calculating create_time is. However, the hash could be made immutable by using the "start time in jiffies since system boot" value (values[19] in

    return (float(values[19]) / CLOCK_TICKS) + bt

) directly rather than the full create_time, since that appears to be constant.
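
As a rough illustration of the calculation quoted above, here is a minimal sketch. It is not psutil's actual code: the helper names, the synthetic stat line, and the boot-time numbers are made up for the example. It assumes the /proc/<pid>/stat layout from proc(5), where starttime is field 22, which becomes index 19 once the pid and the parenthesised command name are stripped.

```python
def starttime_ticks(stat_line: str) -> int:
    """Return starttime (clock ticks since boot) from a /proc/<pid>/stat line."""
    # The command name (field 2) is wrapped in parentheses and may itself
    # contain spaces or ')', so split only after the last ')'.
    values = stat_line[stat_line.rfind(")") + 2:].split()
    # values[0] is field 3 (state), so field 22 (starttime) is values[19].
    return int(values[19])


def create_time(stat_line: str, boot_time: float, clock_ticks: int) -> float:
    """Wall-clock creation time: the tick count is stable, boot_time is not."""
    return starttime_ticks(stat_line) / clock_ticks + boot_time


# Synthetic stat line (hypothetical values): field n holds the number n for n >= 4.
fields = ["12345", "(my proc)", "S"] + [str(n) for n in range(4, 53)]
stat_line = " ".join(fields)

ticks = starttime_ticks(stat_line)
before = create_time(stat_line, boot_time=1000.0, clock_ticks=100)
# Pretend the wall clock was stepped forward by 60 s, which moves the derived boot time.
after = create_time(stat_line, boot_time=1060.0, clock_ticks=100)

print(ticks, before, after)
```

The tick count is the same in both calls; only the boot-time term moves, and the creation time moves with it. On a live Linux system the stat line would come from /proc/<pid>/stat and the tick rate from os.sysconf("SC_CLK_TCK").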

@giampaolo
Owner

I am not at home right now, so I cannot take a look at the code, but if you say the base time value is cached and you need to restart Python in order to reproduce the issue, then I would argue there's no issue.

@adrianheron
Author

It became a problem for me because I was sending pids between different
processes and comparing hashes to verify their identity, which would fail
if the clock changed between when the different processes started. While
it might be an edge case for most people, I would say that a hash has to be
consistent under all circumstances to be called a hash.

@giampaolo
Owner

I understand the problem, but that is a critical part of the code which I'm not willing to change for what I consider a very rare edge case. I might have agreed if the problem occurred within the same Python process (that is why time.monotonic() was introduced as a replacement for time.time()), but here you're talking about starting two different Python processes and changing the system date/clock in between. time.monotonic() does not give you that kind of guarantee, and neither should psutil IMHO.
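
For context on the time.monotonic() comparison, here is a small sketch of the in-process case it was designed for. The variable names are illustrative only. Per the Python documentation, the reference point of time.monotonic() is undefined, so only the difference between two calls is meaningful.

```python
import time

wall_start = time.time()       # follows the system clock, so it can jump
mono_start = time.monotonic()  # cannot go backwards

time.sleep(0.01)

# If the system clock were stepped during the sleep, wall_elapsed could be
# wrong or even negative; mono_elapsed would be unaffected.
wall_elapsed = time.time() - wall_start
mono_elapsed = time.monotonic() - mono_start

print(f"wall: {wall_elapsed:.4f}s  monotonic: {mono_elapsed:.4f}s")
```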

@giampaolo
Owner

...also, I'm kinda curious why you're relying on hash(). Could you describe your use case?

@adrianheron
Author

My use case is roughly as follows:

  1. Process A is providing a service and launches at startup
  2. Process B starts some time later and registers with A by providing its pid and hash to A
  3. Process A looks up Process B via psutil.Process(pid) and compares the hash to make sure the returned Process object is actually Process B and not some other process

Due to the boot process the clock would change between steps 1 and 2, causing the hash comparison in step 3 to fail.

It could be argued that the hash check isn't really necessary. I primarily put it in because I don't fully understand how pids get recycled (it seems to be handled differently on each platform), and this seemed like an easy check.

I've made a workaround to compute the hash using "jiffies since boot" rather than create_time and that has resolved the issue for me.

I don't agree with your justifications about why it's ok for a hash to not be constant, but I understand that your time is limited. If you like I could submit a pull request, or if you decide not to fix it perhaps a warning in the documentation is in order?
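
A minimal sketch of what such a workaround could look like, assuming Linux and the /proc/<pid>/stat layout from proc(5). This is not the code actually used in the workaround; make_token, verify_token, and the fake reader below are hypothetical names introduced for illustration.

```python
from typing import Callable, Tuple

Token = Tuple[int, int]  # (pid, starttime in clock ticks since boot)


def read_proc_stat(pid: int) -> str:
    """Read /proc/<pid>/stat (Linux only)."""
    with open(f"/proc/{pid}/stat") as f:
        return f.read()


def starttime_ticks(stat_line: str) -> int:
    # Split after the last ')' so a command name with spaces cannot shift fields.
    values = stat_line[stat_line.rfind(")") + 2:].split()
    return int(values[19])  # field 22 of the stat line


def make_token(pid: int, reader: Callable[[int], str] = read_proc_stat) -> Token:
    """Process B calls this on its own pid and sends the result to A."""
    return (pid, starttime_ticks(reader(pid)))


def verify_token(token: Token, reader: Callable[[int], str] = read_proc_stat) -> bool:
    """Process A calls this to check that the pid still refers to the same process."""
    pid, ticks = token
    try:
        return starttime_ticks(reader(pid)) == ticks
    except (FileNotFoundError, ProcessLookupError):
        return False  # process is gone


# Demonstration with a fake /proc table, so the sketch runs anywhere.
def fake_stat(pid: int, ticks: int) -> str:
    return " ".join([str(pid), "(svc)", "S"] + ["0"] * 18 + [str(ticks)] + ["0"] * 30)


table = {4242: fake_stat(4242, 777)}


def fake_reader(pid: int) -> str:
    if pid not in table:
        raise FileNotFoundError(pid)
    return table[pid]


token = make_token(4242, fake_reader)
ok_same = verify_token(token, fake_reader)
table[4242] = fake_stat(4242, 999)  # same pid, different start: recycled
ok_recycled = verify_token(token, fake_reader)
del table[4242]
ok_gone = verify_token(token, fake_reader)
print(ok_same, ok_recycled, ok_gone)
```

Because the token never includes the boot time, a clock change between the two processes starting does not affect the comparison. The tick counter restarts at every boot, so a token of this kind is only meaningful within a single boot.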

@giampaolo
Owner

> Due to the boot process the clock would change between steps 1 and 2, causing the hash comparison in step 3 to fail.

You don't explain why the system clock changes. Are you changing it? If so, why? And why between steps 1 and 2? Why can't you launch your application(s) after you change the system clock?

> It could be argued that the hash check isn't really necessary, I primarily put it in because I don't well understand how pids get recycled (it seems to be handled differently on each platform) and this seemed like an easy check.

Using hash() is a good idea and the Process class already supports that natively, but not if you change the system clock in between. Another thing you may wanna do is use pid + cmdline.

I'm still skeptical about this, also because apparently it can be fixed only on Linux, which seems to be the only platform where "jiffies" are retrievable. All other platform implementations rely on the system clock and are subject to system clock updates.
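
The pid + cmdline idea could be sketched roughly as follows. The helper names and the FakeProcess class are hypothetical; only Process.pid, Process.cmdline(), NoSuchProcess, and AccessDenied are psutil's own. This check is weaker than one based on start time: a recycled pid running the same command line would compare equal.

```python
from typing import Tuple

Identity = Tuple[int, Tuple[str, ...]]


def identity_of(proc) -> Identity:
    """Build a (pid, cmdline) identity from any object with .pid and .cmdline()."""
    return (proc.pid, tuple(proc.cmdline()))


def still_same_process(pid: int, expected: Identity) -> bool:
    """Look the pid up again and compare against a previously recorded identity."""
    import psutil  # third-party; imported lazily so identity_of has no dependency

    try:
        return identity_of(psutil.Process(pid)) == expected
    except (psutil.NoSuchProcess, psutil.AccessDenied):
        return False


class FakeProcess:
    """Stand-in for psutil.Process so the sketch runs without psutil installed."""

    def __init__(self, pid, argv):
        self.pid = pid
        self._argv = list(argv)

    def cmdline(self):
        return list(self._argv)


registered = identity_of(FakeProcess(4242, ["python", "service_b.py"]))
same = identity_of(FakeProcess(4242, ["python", "service_b.py"])) == registered
recycled = identity_of(FakeProcess(4242, ["sleep", "60"])) == registered
print(registered, same, recycled)
```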

@adrianheron
Author

The system clock changes as part of the Linux boot process; I don't change it myself. From what I saw, it starts off with some default date every time (something like Jan 1st 1980) and then gets updated to the correct time at some point. Since "Process A" launches during boot, it gets that strange time. I tried various upstart options to change when the process launches, but none of them seemed to make much difference.

I tried to reproduce the problem on Windows by manually changing the clock, but because it uses an entirely different method to get the process creation time, it seems to be immune to this issue.

@giampaolo
Owner

Looking back at this, I still consider this a marginal issue. Closing this out.
