Can we read /proc/PID/cmdline to get more information? #121

Open
lars-t-hansen opened this issue Oct 30, 2023 · 0 comments
Labels
question Further information is requested

Comments

@lars-t-hansen
Collaborator

Currently we grab the command field from the /proc/PID/stat output. This field contains no command options, flags, or arguments, and the kernel truncates it (to 15 characters, if I remember correctly: TASK_COMM_LEN is 16 bytes including the terminating NUL), so we don't even get the entire executable name. Because it contains no options, on some types of systems (the UiO ML nodes are like this) every job in the log is going to be python or java, which is far from ideal.
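To make the limitation concrete, here is a minimal Python sketch of pulling the command field out of a /proc/PID/stat line. It is not sonar's actual code; `parse_comm` is a hypothetical name and the sample line is made up. The field sits between the first `(` and the last `)`, since the name itself may contain parentheses.

```python
def parse_comm(stat_line: str) -> str:
    """Extract the command (comm) field from one /proc/PID/stat line.

    The field is wrapped in parentheses and may itself contain spaces or
    parentheses, so take everything between the first "(" and the last ")".
    """
    start = stat_line.index("(") + 1
    end = stat_line.rindex(")")
    return stat_line[start:end]


# Made-up sample line, shortened to the first few fields.
sample = "4242 (my (odd) name) S 1 4242 4242 0"
print(parse_comm(sample))
```

Whatever the job's arguments were, this field only ever yields the (possibly truncated) executable name.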

Now, /proc/PID/cmdline has more information (if the process itself has not redacted it). Unfortunately, it may not be safe to read /proc/PID/cmdline. A coworker alerted me to this, and https://rachelbythebay.com/w/2014/10/27/ps/ goes into some detail. Basically, if the process whose command line you want is in uninterruptible sleep, the process asking for the information goes to sleep too and never comes out of it: you can kill the latter process, but the zombie will supposedly hang around until reboot. Another report: moby/moby#15204. Most reports I find are old (8-10 years). I don't know how much of a problem this is, as it seems related to memory-constrained containers that are stuck because their memory limits have been exceeded, but I could see that being an issue on HPC systems.
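For reference, /proc/PID/cmdline holds the arguments as NUL-separated byte strings, usually with a trailing NUL. A minimal Python sketch of decoding it follows; `split_cmdline` is a hypothetical name and the sample bytes are made up. Actually reading the file is deliberately left out here, since that read is the step that might hang.

```python
def split_cmdline(raw: bytes) -> list[str]:
    """Split the raw contents of /proc/PID/cmdline into arguments."""
    args = raw.split(b"\0")
    # Drop the empty piece produced by the trailing NUL. An empty file
    # (for example, a kernel thread) yields an empty argument list.
    if args and args[-1] == b"":
        args.pop()
    # A process can put arbitrary bytes here, so decode leniently.
    return [a.decode("utf-8", errors="replace") for a in args]


# Made-up sample contents.
print(split_cmdline(b"python3\x00train.py\x00--verbose\x00"))
```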

One could imagine making sonar resilient against this by forking off a process to read the command line and having the parent time out if no response arrives quickly, but for one thing we wouldn't want to fork off one process per process we're monitoring. Another mechanism might be to fork off a single process to get all the command lines; if it hangs, then oh well: we may get the information on the next sonar run, and sonarlog can clean things up. Obviously this is also not ideal.
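The single-helper idea could look roughly like the Python sketch below. It is an illustration only, not a proposal for sonar's actual implementation: `collect_cmdlines`, `HELPER_SOURCE`, and the 2-second default timeout are all assumptions. The one important detail is that on timeout the parent tries to kill the helper but does not wait for it, because waiting on a helper stuck in uninterruptible sleep would hang the parent as well.

```python
import json
import subprocess
import sys

# Hypothetical helper program: reads /proc/PID/cmdline for every PID given
# on its command line and prints one JSON object mapping PID -> arguments.
HELPER_SOURCE = r"""
import json, sys
out = {}
for pid in sys.argv[1:]:
    try:
        with open(f"/proc/{pid}/cmdline", "rb") as f:
            raw = f.read()
    except OSError:
        continue  # process has exited, or access was denied
    args = raw.split(b"\0")
    if args and args[-1] == b"":
        args.pop()
    out[pid] = [a.decode("utf-8", "replace") for a in args]
print(json.dumps(out))
"""


def collect_cmdlines(pids, timeout=2.0, helper_source=HELPER_SOURCE):
    """Return {pid: [args]}, or None if the helper failed or timed out."""
    proc = subprocess.Popen(
        [sys.executable, "-c", helper_source, *map(str, pids)],
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
    )
    try:
        stdout, _ = proc.communicate(timeout=timeout)
    except subprocess.TimeoutExpired:
        # Try to kill the helper, but do NOT wait for it: if it is stuck
        # in uninterruptible sleep, waiting would block the parent too.
        proc.kill()
        proc.stdout.close()
        return None
    if proc.returncode != 0:
        return None
    return {int(pid): args for pid, args in json.loads(stdout).items()}
```

A caller would treat `None` as "no command lines this round" and fall back to the /proc/PID/stat command field. Note that `subprocess.run(..., timeout=...)` is avoided on purpose, since on timeout it kills the child and then waits for it.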
