child terminated by signal: 11: Segmentation fault #62

Closed
Taeung opened this issue Feb 8, 2017 · 21 comments

@Taeung
Collaborator

Taeung commented Feb 8, 2017

Hi Namhyung!

I ran into an error in the situation below.
(I tested on the latest master branch.)

$ cat hello.c
#include <stdio.h>

void main()
{
	printf("hello\n");
}
$ gcc -pg -o hello hello.c
$ uftrace record hello 
child terminated by signal: 11: Segmentation fault

I looked into it a bit by debugging uftrace and found several things.
First, the segmentation fault happened in libmcount-fast.so:

$ tail -f /var/log/kern.log
... (omitted) ...

Feb  8 22:50:55 taeung-ThinkPad-X1-Carbon-3rd kernel: [427108.691576] hello[1245]: segfault at 7ffd3e049fd8 ip 00007ff209b03e0f sp 00007ffd3e049fe0 error 6 in libmcount-fast.so[7ff209aff000+15000]
...

Also, poll() in cmd-record.c returned POLLHUP, so read_record_mmap() was never called, not even once.

$ cat cmd-record.c
...
	while (!uftrace_done) {
		struct pollfd pollfd = {
			.fd = pfd[0],
			.events = POLLIN,
		};
		int ret;

		ret = poll(&pollfd, 1, 1000);
		if (ret < 0 && errno == EINTR)
			continue;
		if (ret < 0)
			pr_err("error during poll");

		if (pollfd.revents & POLLIN)
			read_record_mmap(pfd[0], opts->dirname, opts->bufsize);

		if (pollfd.revents & (POLLERR | POLLHUP))
			break;
	}
...

So uftrace.data has no task.txt, as shown below, but the info file still tells us the following:

$ ls uftrace.data
hello.sym  info

$ cat uftrace.data/info 
Ftrace![binary header bytes]exename:/home/taeung/workspace/perf-test/test_hello
build_id:d456fe41bf063dc8db3135b8d8d58723ad3dd23c
exit_status:139
cmdline:/home/taeung/git/uftrace/uftrace record test_hello 
cpuinfo:lines=2
cpuinfo:nr_cpus=4 / 4 (online/possible)
cpuinfo:desc=Intel(R) Core(TM) i7-5500U CPU @ 2.40GHz
meminfo:1.6 / 7.4 GB (free / total)
osinfo:lines=3
osinfo:kernel=Linux 4.5.0-rc4+
osinfo:hostname=taeung-ThinkPad-X1-Carbon-3rd
osinfo:distro="Ubuntu 16.04 LTS"
usageinfo:lines=6
usageinfo:systime=0.004000
usageinfo:usrtime=0.000000
usageinfo:ctxsw=3 / 1 (voluntary / involuntary)
usageinfo:maxrss=10736
usageinfo:pagefault=0 / 2214 (major / minor)
usageinfo:iops=0 / 0 (read / write)
loadinfo:0.31 / 0.30 / 0.27

And here is the output of runtest.py:

Test case             pg             finstrument-fu
--------------------: O0 O1 O2 O3 Os O0 O1 O2 O3 Os
001 basic           : OK OK OK OK OK OK OK OK OK OK
...
016 alloca          : OK OK OK OK OK OK NG NG NG NG
...
052 nested_func     : OK OK OK OK OK NG NG NG NG NG
...
058 arg_int         : OK OK OK OK OK SK SK SK SK SK
059 arg_str         : OK OK OK OK OK SK SK SK SK SK
060 arg_fmt         : OK OK OK OK OK SK SK SK SK SK
061 arg_plt         : OK OK OK OK OK OK OK OK OK OK
062 arg_char        : OK OK OK OK OK SK SK SK SK SK
063 retval          : OK OK OK OK OK SK SK SK SK SK
064 trigger_trace   : OK OK OK OK OK OK OK OK OK OK
065 arg_order       : OK OK OK OK OK SK SK SK SK SK
066 no_demangle     : OK OK OK OK OK OK OK OK OK OK
067 report_diff     : OK OK OK OK OK OK OK OK OK OK
068 filter_time_A   : OK OK OK OK OK SK SK SK SK SK
...
079 replay_kernel_D : SK SK SK SK SK SK SK SK SK SK
080 replay_kernel_D2: SK SK SK SK SK SK SK SK SK SK
081 kernel_depth    : SK SK SK SK SK SK SK SK SK SK
...
103 dump_kernel     : SK SK SK SK SK SK SK SK SK SK
104 graph_kernel    : SK SK SK SK SK SK SK SK SK SK
...
124 exception       : NG SG SG SG SG OK OK OK OK OK
125 report_range    : OK OK OK OK OK OK OK OK OK OK
...

Doesn't libmcount-fast.so handle this error?
I haven't found the root cause yet.

@namhyung
Owner

namhyung commented Feb 8, 2017

Could you please show me the backtrace using gdb with a coredump?

@Taeung
Collaborator Author

Taeung commented Feb 8, 2017

I got a core file and loaded it in gdb, but the function names are not shown.
Should I do it another way to see function names in the backtrace? 😢

$ gdb ./uftrace ./core 
GNU gdb (Ubuntu 7.11-0ubuntu1) 7.11
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./uftrace...done.

warning: core file may not match specified executable file.
[New LWP 17633]
Core was generated by `/home/taeung/workspace/perf-test/hello'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007f23836df62d in ?? ()
(gdb) bt
#0  0x00007f23836df62d in ?? ()
#1  0x0000000000000000 in ?? ()

@namhyung
Owner

namhyung commented Feb 8, 2017

Core was generated by `/home/taeung/workspace/perf-test/hello'.

It's not uftrace that got the segfault, it's hello.

@namhyung
Owner

namhyung commented Feb 8, 2017

It's also strange that the normal testcases succeed but the hello fails.

@Taeung
Collaborator Author

Taeung commented Feb 9, 2017

Hmm.. I think the hello program itself is fine:

$ cat hello.c
#include <stdio.h>

void main()
{
	printf("hello\n");
}
$ gcc -pg -o hello hello.c
$ ./hello
hello

But if this error is related to mcount(), could the core file be generated by the hello program?
uftrace preloads libmcount, so the mcount() calls in the hello program go to the mcount() in libmcount.
Is that why the core file was generated by the hello program?
Or am I wrong? 😢

And if I run uftrace as below:

$ date && uftrace record ./hello
Thu Feb  9 09:39:32 KST 2017
child terminated by signal: 11: Segmentation fault

a segfault log entry for libmcount-fast.so shows up at the same time:

$ tail -f /var/log/kern.log
...
Feb  9 09:39:32 taeung-ThinkPad-X1-Carbon-3rd kernel: [436056.514009] hello[19435]: segfault at 7ffe40242ff8 ip 00007f8fc7133e0f sp 00007ffe40243000 error 6 in libmcount-fast.so[7f8fc712f000+15000]

Is my reasoning wrong?

@namhyung
Owner

namhyung commented Feb 9, 2017

Your reasoning is correct. So you should use hello (instead of uftrace) with gdb to get the symbols:

$ gdb hello core

@Taeung
Collaborator Author

Taeung commented Feb 9, 2017

Okay 😄

$ gdb ./hello ./core
Reading symbols from ./hello...(no debugging symbols found)...done.
[New LWP 21461]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `hello'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007f55c16c0623 in mcount () from /usr/local/lib/libmcount-fast.so
(gdb) bt
#0  0x00007f55c16c0623 in mcount () from /usr/local/lib/libmcount-fast.so
#1  0x00007f55c16b4e15 in mcount_entry (parent_loc=0x7f55c16b4e0c <mcount_entry+12>, 
    child=140722745200760, regs=0x0) at /home/taeung/git/uftrace/libmcount/mcount.c:483
#2  0x00007f55c16c064d in mcount () from /usr/local/lib/libmcount-fast.so
#3  0x00007f55c16b4e15 in mcount_entry (parent_loc=0x7f55c16b4e0c <mcount_entry+12>, 
    child=140722745200904, regs=0x7ffc91bd4318)
    at /home/taeung/git/uftrace/libmcount/mcount.c:483
#4  0x00007f55c16c064d in mcount () from /usr/local/lib/libmcount-fast.so
#5  0x00007f55c16b4e15 in mcount_entry (parent_loc=0x7f55c16b4e0c <mcount_entry+12>, 
    child=140722745201048, regs=0x7ffc91bd4318)
    at /home/taeung/git/uftrace/libmcount/mcount.c:483
#6  0x00007f55c16c064d in mcount () from /usr/local/lib/libmcount-fast.so
#7  0x00007f55c16b4e15 in mcount_entry (parent_loc=0x7f55c16b4e0c <mcount_entry+12>, 
    child=140722745201192, regs=0x7ffc91bd4318)
    at /home/taeung/git/uftrace/libmcount/mcount.c:483
#8  0x00007f55c16c064d in mcount () from /usr/local/lib/libmcount-fast.so
#9  0x00007f55c16b4e15 in mcount_entry (parent_loc=0x7f55c16b4e0c <mcount_entry+12>, 
    child=140722745201336, regs=0x7ffc91bd4318)
    at /home/taeung/git/uftrace/libmcount/mcount.c:483
#10 0x00007f55c16c064d in mcount () from /usr/local/lib/libmcount-fast.so
#11 0x00007f55c16b4e15 in mcount_entry (parent_loc=0x7f55c16b4e0c <mcount_entry+12>, 
    child=140722745201480, regs=0x7ffc91bd4318)
    at /home/taeung/git/uftrace/libmcount/mcount.c:483
#12 0x00007f55c16c064d in mcount () from /usr/local/lib/libmcount-fast.so
#13 0x00007f55c16b4e15 in mcount_entry (parent_loc=0x7f55c16b4e0c <mcount_entry+12>, 
    child=140722745201624, regs=0x7ffc91bd4318)
    at /home/taeung/git/uftrace/libmcount/mcount.c:483
#14 0x00007f55c16c064d in mcount () from /usr/local/lib/libmcount-fast.so
#15 0x00007f55c16b4e15 in mcount_entry (parent_loc=0x7f55c16b4e0c <mcount_entry+12>, 
    child=140722745201768, regs=0x7ffc91bd4318)
    at /home/taeung/git/uftrace/libmcount/mcount.c:483
#16 0x00007f55c16c064d in mcount () from /usr/local/lib/libmcount-fast.so
#17 0x00007f55c16b4e15 in mcount_entry (parent_loc=0x7f55c16b4e0c <mcount_entry+12>, 
    child=140722745201912, regs=0x7ffc91bd4318)
    at /home/taeung/git/uftrace/libmcount/mcount.c:483
#18 0x00007f55c16c064d in mcount () from /usr/local/lib/libmcount-fast.so
#19 0x00007f55c16b4e15 in mcount_entry (parent_loc=0x7f55c16b4e0c <mcount_entry+12>, 
    child=140722745202056, regs=0x7ffc91bd4318)
    at /home/taeung/git/uftrace/libmcount/mcount.c:483
#20 0x00007f55c16c064d in mcount () from /usr/local/lib/libmcount-fast.so
#21 0x00007f55c16b4e15 in mcount_entry (parent_loc=0x7f55c16b4e0c <mcount_entry+12>, 
    child=140722745202200, regs=0x7ffc91bd4318)
    at /home/taeung/git/uftrace/libmcount/mcount.c:483
#22 0x00007f55c16c064d in mcount () from /usr/local/lib/libmcount-fast.so
#23 0x00007f55c16b4e15 in mcount_entry (parent_loc=0x7f55c16b4e0c <mcount_entry+12>, 
    child=140722745202344, regs=0x7ffc91bd4318)
    at /home/taeung/git/uftrace/libmcount/mcount.c:483
#24 0x00007f55c16c064d in mcount () from /usr/local/lib/libmcount-fast.so
#25 0x00007f55c16b4e15 in mcount_entry (parent_loc=0x7f55c16b4e0c <mcount_entry+12>, 
    child=140722745202488, regs=0x7ffc91bd4318)
    at /home/taeung/git/uftrace/libmcount/mcount.c:483
#26 0x00007f55c16c064d in mcount () from /usr/local/lib/libmcount-fast.so
---Type <return> to continue, or q <return> to quit---q
Quit
(gdb) info 0
Undefined info command: "0".  Try "help info".
(gdb) info f 0
Stack frame at 0x7ffc913d5020:
 rip = 0x7f55c16c0623 in mcount; saved rip = 0x7f55c16b4e15
 called by frame at 0x7ffc913d5070
 Arglist at 0x7ffc913d4fd8, args: 
 Locals at 0x7ffc913d4fd8, Previous frame's sp is 0x7ffc913d5020
 Saved registers:
  rdx at 0x7ffc913d4ff8, rsi at 0x7ffc913d5000, rdi at 0x7ffc913d5008, rip at 0x7ffc913d5018
(gdb) info f 1
Stack frame at 0x7ffc913d5070:
 rip = 0x7f55c16b4e15 in mcount_entry (/home/taeung/git/uftrace/libmcount/mcount.c:483); 
    saved rip = 0x7f55c16c064d
 called by frame at 0x7ffc913d50b0, caller of frame at 0x7ffc913d5020
 source language c.
 Arglist at 0x7ffc913d5060, args: parent_loc=0x7f55c16b4e0c <mcount_entry+12>, 
    child=140722745200760, regs=0x0
 Locals at 0x7ffc913d5060, Previous frame's sp is 0x7ffc913d5070
 Saved registers:
  rbx at 0x7ffc913d5040, rbp at 0x7ffc913d5060, r12 at 0x7ffc913d5048, r13 at 0x7ffc913d5050,
  r14 at 0x7ffc913d5058, rip at 0x7ffc913d5068
(gdb) info f 2
Stack frame at 0x7ffc913d50b0:
 rip = 0x7f55c16c064d in mcount; saved rip = 0x7f55c16b4e15
 called by frame at 0x7ffc913d5100, caller of frame at 0x7ffc913d5070
 Arglist at 0x7ffc913d5068, args: 
 Locals at 0x7ffc913d5068, Previous frame's sp is 0x7ffc913d50b0
 Saved registers:
  rax at 0x7ffc913d5068, rcx at 0x7ffc913d5080, rdx at 0x7ffc913d5088, rsi at 0x7ffc913d5090,
  rdi at 0x7ffc913d5098, r8 at 0x7ffc913d5078, r9 at 0x7ffc913d5070, rip at 0x7ffc913d50a8
(gdb) info f 3
Stack frame at 0x7ffc913d5100:
 rip = 0x7f55c16b4e15 in mcount_entry (/home/taeung/git/uftrace/libmcount/mcount.c:483); 
    saved rip = 0x7f55c16c064d
 called by frame at 0x7ffc913d5140, caller of frame at 0x7ffc913d50b0
 source language c.
 Arglist at 0x7ffc913d50f0, args: parent_loc=0x7f55c16b4e0c <mcount_entry+12>, 
    child=140722745200904, regs=0x7ffc91bd4318
 Locals at 0x7ffc913d50f0, Previous frame's sp is 0x7ffc913d5100
 Saved registers:
  rbx at 0x7ffc913d50d0, rbp at 0x7ffc913d50f0, r12 at 0x7ffc913d50d8, r13 at 0x7ffc913d50e0,
  r14 at 0x7ffc913d50e8, rip at 0x7ffc913d50f8
(gdb) info f 4
Stack frame at 0x7ffc913d5140:
 rip = 0x7f55c16c064d in mcount; saved rip = 0x7f55c16b4e15
 called by frame at 0x7ffc913d5190, caller of frame at 0x7ffc913d5100
 Arglist at 0x7ffc913d50f8, args: 
 Locals at 0x7ffc913d50f8, Previous frame's sp is 0x7ffc913d5140
 Saved registers:
  rax at 0x7ffc913d50f8, rcx at 0x7ffc913d5110, rdx at 0x7ffc913d5118, rsi at 0x7ffc913d5120,
  rdi at 0x7ffc913d5128, r8 at 0x7ffc913d5108, r9 at 0x7ffc913d5100, rip at 0x7ffc913d5138
(gdb) info f 5
Stack frame at 0x7ffc913d5190:
 rip = 0x7f55c16b4e15 in mcount_entry (/home/taeung/git/uftrace/libmcount/mcount.c:483); 
    saved rip = 0x7f55c16c064d
 called by frame at 0x7ffc913d51d0, caller of frame at 0x7ffc913d5140
 source language c.
 Arglist at 0x7ffc913d5180, args: parent_loc=0x7f55c16b4e0c <mcount_entry+12>, 
    child=140722745201048, regs=0x7ffc91bd4318
 Locals at 0x7ffc913d5180, Previous frame's sp is 0x7ffc913d5190
 Saved registers:
  rbx at 0x7ffc913d5160, rbp at 0x7ffc913d5180, r12 at 0x7ffc913d5168, r13 at 0x7ffc913d5170,
  r14 at 0x7ffc913d5178, rip at 0x7ffc913d5188
(gdb) quit

Is the segfault related to mcount_entry() at libmcount/mcount.c:483?

@namhyung
Owner

namhyung commented Feb 9, 2017

It looks like a recursion. Did you install the correct version? Note that the test program uses the just-compiled binaries.

@Taeung
Collaborator Author

Taeung commented Feb 9, 2017

The correct version? Which version?
I tested on your master branch.
I also tested v0.6.1 and v0.6, each with 'make clean && make'.

$ git reset --hard upstream/master
$ make clean && make -j4
$ ./uftrace record hello
child terminated by signal: 11: Segmentation fault
$ git reset --hard v0.6.1
HEAD is now at 5d9881b uftrace: Bump up version to 0.6.1
$ make clean && make -j4
$ ./uftrace record hello
child terminated by signal: 11: Segmentation fault
$ git reset --hard v0.6
HEAD is now at ee64ffe uftrace: Bump up version to 0.6
$ make clean && make -j4
$ ./uftrace record hello
child terminated by signal: 11: Segmentation fault

But the segfault still remains. I think the root cause might be system-wide.
Is that a wrong guess? Would the segfault disappear if I rebooted the computer?
Or,
how should I investigate the segfault? Should I use the -g option when compiling libmcount?
I don't know exactly how to debug libmcount-fast.so.. 😢

@namhyung
Owner

namhyung commented Feb 9, 2017

Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x00007f55c16c0623 in mcount () from /usr/local/lib/libmcount-fast.so

This comes from /usr/local/lib so I guess you installed a faulty version. I'd like to see the result after you run sudo make install.

@namhyung
Owner

namhyung commented Feb 9, 2017

If you want to run the compiled (not installed) version, you need to pass the -L . option to tell uftrace to load libmcount*.so from the current directory.

@namhyung
Owner

@Taeung ping!

@Taeung
Collaborator Author

Taeung commented Feb 10, 2017

Sorry I'm late.

I ran sudo make install and then checked libmcount*.so in /usr/local/lib.

$ sudo make install && date
  GEN      version.h
  CC       uftrace.o
  CC       cmd-recv.o
  CC       cmd-dump.o
  CC       cmd-graph.o
  CC       cmd-report.o
  CC       cmd-replay.o
  CC       cmd-info.o
  CC       cmd-live.o
  CC       cmd-record.o
  CC       utils/session.o
  CC       utils/rbtree.o
  CC       utils/data-file.o
  CC       utils/pager.o
  CC       utils/demangle.o
  CC       utils/filter.o
  CC       utils/fstack.o
  CC       utils/debug.o
  CC       utils/kernel.o
  CC       utils/symbol.o
  CC       utils/utils.o
  CC       arch/x86_64/cpuinfo.o
  CC       arch/x86_64/regs.o
  FLAGS:   * new build flags or cross compiler
  CC FPIC  libtraceevent/event-parse.o
  CC FPIC  libtraceevent/event-plugin.o
  CC FPIC  libtraceevent/trace-seq.o
  CC FPIC  libtraceevent/parse-filter.o
  CC FPIC  libtraceevent/parse-utils.o
  CC FPIC  libtraceevent/kbuffer-parse.o
  LINK     libtraceevent/libtraceevent.a
  LINK     uftrace
  CC FPIC  libmcount/mcount.op
  CC FPIC  libmcount/record.op
  CC FPIC  libmcount/plthook.op
  CC FPIC  utils/symbol.op
  CC FPIC  utils/debug.op
  CC FPIC  utils/rbtree.op
  CC FPIC  utils/filter.op
  CC FPIC  utils/demangle.op
  CC FPIC  utils/utils.op
  CC FPIC  arch/x86_64/mcount-support.op
  CC FPIC  arch/x86_64/regs.op
  ASM      arch/x86_64/mcount.op
  ASM      arch/x86_64/fentry.op
  ASM      arch/x86_64/plthook.op
  LINK     arch/x86_64/entry.op
  LINK     libmcount/libmcount.so
  CC FPIC  libmcount/mcount-fast.op
  CC FPIC  libmcount/record-fast.op
  CC FPIC  libmcount/plthook-fast.op
  LINK     libmcount/libmcount-fast.so
  CC FPIC  libmcount/mcount-single.op
  CC FPIC  libmcount/record-single.op
  CC FPIC  libmcount/plthook-single.op
  LINK     libmcount/libmcount-single.so
  CC FPIC  libmcount/mcount-fast-single.op
  CC FPIC  libmcount/record-fast-single.op
  CC FPIC  libmcount/plthook-fast-single.op
  LINK     libmcount/libmcount-fast-single.so
  CC FPIC  libmcount/mcount-nop.op
  LINK     libmcount/libmcount-nop.so
  INSTALL  uftrace
  INSTALL  libmcount
  INSTALL  bash-completion
  GEN      uftrace.1
  GEN      uftrace-record.1
  GEN      uftrace-replay.1
  GEN      uftrace-live.1
  GEN      uftrace-report.1
  GEN      uftrace-recv.1
  GEN      uftrace-info.1
  GEN      uftrace-dump.1
  GEN      uftrace-graph.1
  INSTALL  man-pages
Fri Feb 10 10:45:41 KST 2017
$ ll /usr/local/lib | grep libmcount
-rwxr-xr-x  1 root root    398688 2017-02-10 10:45:41 libmcount-fast-single.so*
-rwxr-xr-x  1 root root    399352 2017-02-10 10:45:41 libmcount-fast.so*
-rwxr-xr-x  1 root root      9496 2017-02-10 10:45:41 libmcount-nop.so*
-rwxr-xr-x  1 root root    464576 2017-02-10 10:45:41 libmcount-single.so*
-rwxr-xr-x  1 root root    465136 2017-02-10 10:45:41 libmcount.so*

@namhyung
Owner

@Taeung hmm.. do you still see the segfault with the current version?

@Taeung
Collaborator Author

Taeung commented Feb 10, 2017

No, it disappeared!

When I ran sudo make install and the libmcount*.so files were replaced with new ones, the segfault problem disappeared..
I think the root cause is that I had used the -pg option when compiling uftrace earlier.
A couple of days ago, I tried to run uftrace on the uftrace tool itself.
I hit the segfault then and recompiled uftrace without -pg, but the segfault still remained.

As you said, I was using faulty libmcount*.so files that had been compiled with -pg. That's why the segfault remained even though I recompiled without -pg on the current master branch..

What do you think?
And I think you are right that the segfault is due to a recursion.

But have you already fixed the recursion problem?
uftrace could trace itself a few days ago.

@namhyung
Owner

The problem is in libmcount*.so, not the uftrace tool itself. After rebuilding (but not installing), you ran a uftrace built without -pg, but the libmcount*.so it loaded were still the ones built with -pg.

It's currently not possible to trace libmcount*.so, but tracing uftrace itself is possible..

@namhyung
Owner

I hope github provides a way to fold long code output..

@Taeung
Collaborator Author

Taeung commented Feb 10, 2017

I understand!
But I wonder: even though uftrace wasn't tracing itself, why did the segfault happen with uftrace record ./hello?

$ git diff
diff --git a/Makefile b/Makefile
index 330522b..c5c006d 100644
--- a/Makefile
+++ b/Makefile
@@ -60,7 +60,7 @@ INSTALL = install
 
 export ARCH CC AR LD RM srcdir objdir
 
-COMMON_CFLAGS := -O2 -g -D_GNU_SOURCE $(CFLAGS) $(CPPFLAGS)
+COMMON_CFLAGS := -O2 -pg -g -D_GNU_SOURCE $(CFLAGS) $(CPPFLAGS)
 COMMON_CFLAGS +=  -iquote $(srcdir) -iquote $(objdir) -iquote $(srcdir)/arch/$(ARCH)
 #CFLAGS-DEBUG = -g -D_GNU_SOURCE $(CFLAGS_$@)
 COMMON_LDFLAGS := -lelf -lrt -ldl -pthread $(LDFLAGS)
$ make clean && sudo make install
  CLEAN    uftrace
  CLEAN    test
  CLEAN    man-pages
  CLEAN    libtraceevent
[sudo] password for taeung: 
  GEN      version.h
  CC       uftrace.o
  CC       cmd-recv.o
  CC       cmd-dump.o
  CC       cmd-graph.o
  CC       cmd-report.o
...
$ date
Fri Feb 10 11:22:26 KST 2017

$ ll /usr/local/lib | grep libmcount
-rwxr-xr-x  1 root root    394256 2017-02-10 11:22:25 libmcount-fast-single.so*
-rwxr-xr-x  1 root root    394800 2017-02-10 11:22:25 libmcount-fast.so*
-rwxr-xr-x  1 root root    464624 2017-02-10 11:22:25 libmcount-single.so*
-rwxr-xr-x  1 root root      9504 2017-02-10 11:22:25 libmcount-nop.so*
-rwxr-xr-x  1 root root    465056 2017-02-10 11:22:25 libmcount.so*
$ uftrace record ~/workspace/perf-test/hello
child terminated by signal: 11: Segmentation fault

@Taeung
Collaborator Author

Taeung commented Feb 10, 2017

Ah... when I use libmcount*.so compiled without -pg, the problem doesn't occur.

I think we shouldn't use libmcount*.so compiled with -pg, whatever uftrace traces (uftrace itself or another program).
Is that right?

@namhyung
Owner

Right.

If libmcount*.so is compiled with -pg, gcc will add a call to mcount() in mcount(), hence the recursion. :)

@Taeung
Collaborator Author

Taeung commented Feb 10, 2017

I understand!
Thank you!! 😄
