Change stat data types to match the related kernel data types #403

Closed
vykulakov opened this issue Jul 27, 2021 · 4 comments

Comments

@vykulakov
Contributor

vykulakov commented Jul 27, 2021

The man page for /proc/[pid]/stat specifies the data type that should be used to parse each value in this stat file:

(1) pid  %d
(2) comm  %s
(3) state  %c
(4) ppid  %d
(5) pgrp  %d
(6) session  %d
(7) tty_nr  %d
(8) tpgid  %d
(9) flags  %u
(10) minflt  %lu
(11) cminflt  %lu
(12) majflt  %lu
(13) cmajflt  %lu
(14) utime  %lu
(15) stime  %lu
(16) cutime  %ld
(17) cstime  %ld
(18) priority  %ld
(19) nice  %ld
(20) num_threads  %ld
(21) itrealvalue  %ld
(22) starttime  %llu
(23) vsize  %lu
(24) rss  %ld
(25) rsslim  %lu
(26) startcode  %lu  [PT]
(27) endcode  %lu  [PT]
(28) startstack  %lu  [PT]
(29) kstkesp  %lu  [PT]
(30) kstkeip  %lu  [PT]
(31) signal  %lu
(32) blocked  %lu
(33) sigignore  %lu
(34) sigcatch  %lu
(35) wchan  %lu  [PT]
(36) nswap  %lu
(37) cnswap  %lu
(38) exit_signal  %d  (since Linux 2.1.22)
(39) processor  %d  (since Linux 2.2.8)
(40) rt_priority  %u  (since Linux 2.5.19)
(41) policy  %u  (since Linux 2.5.19)
(42) delayacct_blkio_ticks  %llu  (since Linux 2.6.18)
(43) guest_time  %lu  (since Linux 2.6.24)
(44) cguest_time  %ld  (since Linux 2.6.24)
(45) start_data  %lu  (since Linux 3.3)  [PT]
(46) end_data  %lu  (since Linux 3.3)  [PT]
(47) start_brk  %lu  (since Linux 3.3)  [PT]
(48) arg_start  %lu  (since Linux 3.5)  [PT]
(49) arg_end  %lu  (since Linux 3.5)  [PT]
(50) env_start  %lu  (since Linux 3.5)  [PT]
(51) env_end  %lu  (since Linux 3.5)  [PT]
(52) exit_code  %d  (since Linux 3.5)  [PT]

Here things like %d and %lu are scanf(3) format specifiers (according to the man page). The specifiers used above are:

  • %d - for int;
  • %u - for unsigned int;
  • %ld - for long;
  • %lu - for unsigned long;
  • %llu - for unsigned long long;
  • %s - matches a sequence of non-white-space characters;
  • %c - matches a sequence of characters whose length is specified by the maximum field width (default 1).

But in the code we have the following (just a few examples):

UTime uint
STime uint
CUTime uint
CSTime uint

So instead of parsing UTime into an unsigned long (as the man page specifies), the library tries to parse it into a uint, which may be only 32 bits wide on machines with a 32-bit architecture. As a result, we may get value-out-of-range errors. With CUTime it is even worse: the man page specifies a signed long, but the library uses uint, so any negative value may raise an error too.
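To illustrate the two failure modes, here is a minimal sketch. It is not the library's parsing code: `parseUint32`, `isRangeErr`, and `isSyntaxErr` are hypothetical helpers, and the 32-bit width is passed to strconv explicitly to mimic what uint would be on a 32-bit machine.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
)

// parseUint32 parses s as an unsigned value limited to 32 bits, mimicking
// Go's uint on a 32-bit architecture. Hypothetical helper for illustration.
func parseUint32(s string) (uint64, error) {
	return strconv.ParseUint(s, 10, 32)
}

// isRangeErr reports whether err is strconv's out-of-range error.
func isRangeErr(err error) bool {
	return errors.Is(err, strconv.ErrRange)
}

// isSyntaxErr reports whether err is strconv's syntax error.
func isSyntaxErr(err error) bool {
	return errors.Is(err, strconv.ErrSyntax)
}

func main() {
	// A utime-like counter that no longer fits in 32 bits.
	_, err := parseUint32("4294967296")
	fmt.Println(err, isRangeErr(err))

	// A negative cutime-like value cannot be parsed as unsigned at all.
	_, err = parseUint32("-1")
	fmt.Println(err, isSyntaxErr(err))
}
```

The first call fails only because of the width, the second fails at any width because the target type is unsigned.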

Such errors have already been encountered in real use. See #401 for details.

I propose fixing the field types according to the list above. This will break compatibility with existing code, but since the library is still pre-1.0, it can be done easily. Additionally, client_golang converts such values to float64, so the migration may go unnoticed there.

Useful links:

@vykulakov
Contributor Author

Well, I made a mistake in my conclusion. C only specifies the minimum size of data types such as int and long, so the actual sizes depend on the implementation and ultimately on the machine architecture.

You may check Wikipedia and other resources for details:

As you may guess, Go does the same for its int and uint data types, so they may take 4 or 8 bytes, just as long and unsigned long do in C.
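A quick way to check this on a given machine is sketched below. It relies only on the standard library constants `strconv.IntSize` and `bits.UintSize`; the wrapper functions `intBits` and `uintBits` are hypothetical names added for illustration.

```go
package main

import (
	"fmt"
	"math/bits"
	"strconv"
)

// intBits returns the width of Go's int on the platform this was built for.
func intBits() int { return strconv.IntSize }

// uintBits returns the width of Go's uint on the platform this was built for.
func uintBits() int { return bits.UintSize }

func main() {
	// Both values are 32 on 32-bit targets and 64 on 64-bit targets.
	fmt.Println("int bits:", intBits(), "uint bits:", uintBits())
}
```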

As a result, the existing code that parses the stat fields works correctly, except for the CUTime and CSTime fields: both should be signed (long in C, or just int in Go) instead of the current uint.
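A minimal sketch of what the signed variant could look like. The `childTimes` struct and `parseChildTimes` function are hypothetical and cover only these two fields; they are not the library's actual struct or parser.

```go
package main

import "fmt"

// childTimes holds only the two fields discussed here, typed as int so that
// they match the signed long (%ld) described in the man page.
// Hypothetical type for illustration.
type childTimes struct {
	CUTime int
	CSTime int
}

// parseChildTimes scans two whitespace-separated signed integers from s.
func parseChildTimes(s string) (childTimes, error) {
	var ct childTimes
	_, err := fmt.Sscan(s, &ct.CUTime, &ct.CSTime)
	return ct, err
}

func main() {
	// A negative value is accepted because the target fields are signed.
	ct, err := parseChildTimes("-1 7")
	fmt.Println(ct, err)
}
```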

@vykulakov
Contributor Author

I'll prepare a fix for those two fields.

SuperQ pushed a commit that referenced this issue Aug 30, 2021
* Fix data types for CUTime and CSTime stat fields #403

These two stat fields (CUTime and CSTime) in the /proc/[pid]/stat file should have the signed long data type according to the documentation. But currently in the code their data type is just unsigned int. This commit fixes it and adds more tests.

See for details:
* https://man7.org/linux/man-pages/man5/proc.5.html
* https://man7.org/linux/man-pages/man3/scanf.3.html

Signed-off-by: Vyacheslav Kulakov <kulakov.home@gmail.com>
remijouannet pushed a commit to remijouannet/procfs that referenced this issue Oct 20, 2022
…etheus#404)

@rexagod
Contributor

rexagod commented Mar 22, 2024

ACK, I believe this can be closed now.

@vykulakov
Contributor Author

All seems to be fine so far, closing the issue.
