Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

Already on GitHub? Sign in to your account

pmatop crashes when it encounters pid >= 1000000 #58

Closed
sklochkov opened this Issue Dec 16, 2015 · 12 comments

Comments

Projects
None yet
3 participants

When kernel.pid_max is set to a high value (> 1000000) and a process with pid >= 1000000 appears, pmatop segfaults with the following stack.
#0 __strncpy_sse2_unaligned () at ../sysdeps/x86_64/multiarch/strcpy-sse2-unaligned.S:296
#1 0x000000000040bb69 in strncpy (__len=255, __src=, __dest=0x7f94cda7ceb8 "") at /usr/include/bits/string3.h:120
#2 update_task (dp=0x62cdc0 <descs.9608>, rp=0x1acde20, name=, pid=1033589, task=0x7f94cda7ce60) at photoproc.c:32
#3 photoproc (tasks=, taskslen=) at photoproc.c:152
#4 0x0000000000402e6e in main (argc=, argv=) at atop.c:577

Contributor

lberk commented Dec 16, 2015

Hi,

Thanks for reporting this issue! I'm working to reproduce this now, however, I'm weary that the large PID might be a red herring here. From the output, it appears that the 'name' var is empty, which in the context of the update_task function:

strsep(&name, " "); /* remove process identifier prefix */
strncpy(task->gen.cmdline, name, CMDLEN);

Means, the return of strsep would be NULL, and name would be empty (running through this with gdb and

set name = ""

lead to a similar SEGV. Regardless of whether or not the large PID is causing this issue (which I'm still looking into), it appears we need to be more careful with our assumptions of the name being passed to update_task.

Hello Iberk,
The large pid does not seem to be red herring. I took my time to reproduce this issue on a fresh system, and pmatop worked fine until a process with pid > 1000000 appeared. After it happened, it began crashing every time.

Contributor

fche commented Dec 16, 2015

strsep sets name to NULL if it couldn't find the space delimiter, which could happen if the delimiter wasn't printed because the preceding/following number was too large.

Contributor

fche commented Dec 16, 2015

http://paste.fedoraproject.org/301833/94002145/ <-- untested, to fix the proc-pmda side of the problem

Hi fche,

Looks like you are right. I used strace to monitor all data that gets readen by pmatop. The pid that casuses core dump in this case is 1099396. This is how the line in question looks.

1099396-bash\0\32\263_<...>

Indeed, there is no space between pid and process name.

http://paste.fedoraproject.org/301833/94002145/ <-- untested, to fix the proc-pmda side of the problem

I'll give this patch a try.

Contributor

fche commented Dec 16, 2015

http://paste.fedoraproject.org/301836/50294335 <-- untested, to fix the pcp-atop side of the problem

Thank you very much, the patches work.

Contributor

lberk commented Dec 16, 2015

Thanks @fche for the patch.

@sklochkov would you mind providing a tiny archive so that we can test for this in the future? Assuming you still have a setup which includes a PID >= 1000000

If you used a pcp-atop specific config file (such as http://paste.fedoraproject.org/301879/50298891/ )
and then ran a command similar to (assuming you saved the config in /tmp/atop-config )...

pmlogger -c /tmp/atop-config -t 1s -T10seconds /tmp/atop-large-pid

This should produce atop-large-pid.{0,index,meta} files.
If you could send the resulting files to me ( lberk -at- redhat.com ) that would be really helpful.

If you could send the resulting files to me ( lberk -at- redhat.com ) that would be really helpful.

I've sent these files to your email address.

Contributor

lberk commented Dec 16, 2015

Got them, thank you very much for taking the time to produce/send them!

Contributor

lberk commented Jan 4, 2016

I'm going to close out this issue for now. The pmdaproc fix and QA is upstream, I've also merged the pmatop specific patch to my tree, which will be sent out for updates soon.

Thanks again for the report/fixes.

@lberk lberk closed this Jan 4, 2016

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment