Segmentation Fault on x86_64 CentOS on Intel Xeon #268

Closed
alk opened this issue Aug 22, 2015 · 10 comments


@alk alk commented Aug 22, 2015

Originally reported on Google Code with ID 265

What steps will reproduce the problem?
1. I have a number-crunching app built with perftools 1.6 and libunwind-0.99-beta on
x86_64 (Intel Xeon).
2. It's compiled and run on CentOS 5.5:
GNU C Library stable release version 2.5, by Roland McGrath et al.
Compiled by GNU CC version 4.1.2 20080704 (Red Hat 4.1.2-48).
Compiled on a Linux 2.6.9 system on 2010-07-27.
Available extensions:
        The C stubs add-on version 2.1.2.
        crypt add-on version 2.1 by Michael Glad and others
        GNU Libidn by Simon Josefsson
        GNU libio by Per Bothner
        NIS(YP)/NIS+ NSS modules 0.19 by Thorsten Kukuk
        Native POSIX Threads Library by Ulrich Drepper et al
        BIND-8.2.3-T5B
        RT using linux kernel aio
Thread-local storage support included.
3. This is the stack trace when it SIGSEGVs:
#0  access_mem (as=0x2aaab0e368a0, addr=26, val=0x7ffffffc9b00, write=0, arg=0x7ffffffca200) at x86_64/Ginit.c:164
#1  0x00002aaab0c2dc0d in dwarf_get (c=0x7ffffffca200, rs=<value optimized out>) at ../include/tdep/libunwind_i.h:137
#2  apply_reg_state (c=0x7ffffffca200, rs=<value optimized out>) at dwarf/Gparser.c:766
#3  0x00002aaab0c2e1d7 in _ULx86_64_dwarf_find_save_locs (c=0x7ffffffca200) at dwarf/Gparser.c:849
#4  0x00002aaab0c2e3f9 in _ULx86_64_dwarf_step (c=0x2aaab0e368a0) at dwarf/Gstep.c:35
#5  0x00002aaab0c3094a in _ULx86_64_step (cursor=0x2aaab0e368a0) at x86_64/Gstep.c:42
#6  0x00002aaaaaab5331 in ?? () from /home/apopov/google-perftools-1.6/lib/libprofiler.so.0
#7  0x0000000000000000 in ?? ()

What is the expected output? What do you see instead?
Random behavior.

What version of the product are you using? On what operating system?
google-perftools 1.6, libunwind 0.99-beta, CentOS 5.5 64bit on Intel Xeon

Please provide any additional information below.
I use it by calling ProfilerStart()/ProfilerStop() and ProfilerFlush().
The places where it segfaults are nowhere near those functions.

Reported by long404 on 2010-08-27 12:50:51


@alk alk commented Aug 22, 2015

Reported by `` on 2012-01-18 21:56:12


@alk alk commented Aug 22, 2015

This looks the same as issue 57.  We never figured it out then, unfortunately.  It looks
like a libunwind bug, though it's hard to say for sure.

Can you try recompiling libunwind with -g -O0?  That will give more information in
the stacktrace at failure.  You may want to do the same with perftools as well.

You may want to bring this up with the libunwind folks as well, so they can look into
it concurrently.

Reported by csilvers on 2010-08-28 03:19:55

  • Labels added: Type-Defect, Priority-Medium


@alk alk commented Aug 22, 2015

I'll do that. If I get some free time this week, I'll also try to look into it myself.
I'll post the trace again once I have libunwind and perftools compiled with debug info, and
I'll take it up with the libunwind guys as well.

Reported by long404 on 2010-08-30 07:02:03


@alk alk commented Aug 22, 2015

This is the backtrace with -g and -O0:

Program received signal SIGSEGV, Segmentation fault.
0x00002aaab0c38c2a in access_mem (as=0x2aaab0e40980, addr=114, val=0x7fffffee8c80, write=0, arg=0x7fffffee9120) at x86_64/Ginit.c:164
(gdb) bt
#0  0x00002aaab0c38c2a in access_mem (as=0x2aaab0e40980, addr=114, val=0x7fffffee8c80, write=0, arg=0x7fffffee9120) at x86_64/Ginit.c:164
#1  0x00002aaab0c350dd in dwarf_get (c=0x7fffffee9120, loc=..., val=0x7fffffee8c80) at ../include/tdep/libunwind_i.h:137
#2  0x00002aaab0c34f38 in apply_reg_state (c=0x7fffffee9120, rs=0x2aaab0e44718) at dwarf/Gparser.c:766
#3  0x00002aaab0c35461 in _ULx86_64_dwarf_find_save_locs (c=0x7fffffee9120) at dwarf/Gparser.c:849
#4  0x00002aaab0c360f1 in _ULx86_64_dwarf_step (c=0x7fffffee9120) at dwarf/Gstep.c:35
#5  0x00002aaab0c399c1 in _ULx86_64_step (cursor=0x7fffffee9120) at x86_64/Gstep.c:42
#6  0x00002aaaaaab7ac1 in GetStackTraceWithContext (result=0x7fffffee9938, max_depth=63, skip_count=-1, ucp=0x7fffffee9bc0) at src/stacktrace_libunwind-inl.h:114
#7  0x00002aaaaaab4937 in CpuProfiler::prof_handler (sig=27, signal_ucontext=0x7fffffee9bc0, cpu_profiler=0x2aaaaacbf980) at src/profiler.cc:280
#8  0x00002aaaaaab563f in ProfileHandler::SignalHandler (sig=27, sinfo=0x7fffffee9cf0, ucontext=0x7fffffee9bc0) at src/profile-handler.cc:439
#9  <signal handler called>
#10 0x0000003186c7299c in _int_malloc () from /lib64/libc.so.6
#11 0x00002aaaace886c2 in std::_Rb_tree<atpco::ATPCO_Object const*, std::pair<atpco::ATPCO_Object const* const, match_engine::Record3Node const*>, std::_Select1st<std::pair<atpco::ATPCO_Object const* const, match_engine::Record3Node const*> >, std::less<atpco::ATPCO_Object const*>, std::allocator<std::pair<atpco::ATPCO_Object const* const, match_engine::Record3Node const*> > >::end (this=Cannot access memory at address 0x52
) at /usr/lib/gcc/x86_64-redhat-linux/4.1.2/../../../../include/c++/4.1.2/bits/stl_tree.h:604

Reported by long404 on 2010-08-30 14:00:56


@alk alk commented Aug 22, 2015

Definitely suspicious that the segfault is happening while in malloc.  My guess is that
access_mem is calling malloc for some reason, and malloc isn't re-entrant, so you're
getting this crash.

If there's any way you can run with -fno-omit-frame-pointer rather than using libunwind,
that may be more reliable for you here.  Of course, adding in frame pointers will make
your code slower, which may (or may not) defeat the purpose of your cpu profiling.
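
For readers unfamiliar with the re-entrancy concern above, here is a minimal sketch in C of the general pattern. It is not the perftools or libunwind code, and whether access_mem really calls malloc is only a guess in this thread. The names on_sigprof, start_sampling, stop_sampling, and churn_until_ticks are hypothetical. The point it illustrates: a SIGPROF handler can fire while the interrupted thread is inside malloc, so the handler itself should stick to async-signal-safe work (here, bumping a sig_atomic_t counter).

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700
#endif
#include <assert.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <sys/time.h>

/* Written only by the handler; sig_atomic_t keeps the access signal-safe. */
static volatile sig_atomic_t prof_ticks = 0;

/* Hypothetical handler: it only bumps a counter. It does not call malloc,
 * printf, or anything else that is not async-signal-safe, because the
 * interrupted code may be in the middle of malloc itself. */
static void on_sigprof(int sig) {
    (void)sig;
    prof_ticks++;
}

/* Installs the handler and starts a CPU-time timer firing every 1 ms.
 * Returns 0 on success, -1 on failure. */
static int start_sampling(void) {
    struct sigaction sa;
    struct itimerval tv;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigprof;
    sa.sa_flags = SA_RESTART;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGPROF, &sa, NULL) != 0) return -1;

    tv.it_interval.tv_sec = 0;
    tv.it_interval.tv_usec = 1000;
    tv.it_value = tv.it_interval;
    return setitimer(ITIMER_PROF, &tv, NULL);
}

/* Disarms the timer so no further SIGPROF signals are delivered. */
static void stop_sampling(void) {
    struct itimerval off;
    memset(&off, 0, sizeof off);
    setitimer(ITIMER_PROF, &off, NULL);
}

/* Allocation-heavy loop, so that samples can land inside malloc/free.
 * Runs until `wanted` ticks were seen or `max_iters` is reached. */
static long churn_until_ticks(int wanted, long max_iters) {
    long iters = 0;
    assert(wanted > 0);
    while (prof_ticks < wanted && iters < max_iters) {
        void *volatile p = malloc(64 + (size_t)(iters % 1024));
        if (p == NULL) break;
        free(p);
        iters++;
    }
    return iters;
}
```

Calling start_sampling(), then churn_until_ticks(5, 2000000000L), then stop_sampling() runs the loop while samples arrive. Because the handler never allocates, it stays safe even when a sample interrupts malloc; a handler that allocated (directly or through an unwinder) could corrupt the heap or deadlock in the same situation.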

Reported by csilvers on 2010-08-30 22:16:14


@alk alk commented Aug 22, 2015

Thanks csilvers, I'll keep it in mind when I start looking into it.
I did compile with frame pointers (I was hoping to give you the trace without libunwind).
It segfaults a lot more often that way, but the stack seems to get corrupted and I
can't extract a stack trace. I'll try a few more times with frame pointers, and if something
comes out I'll post the trace.

cheers,
sasho

Reported by long404 on 2010-08-31 09:16:30


@alk alk commented Aug 22, 2015

Did you ever manage to get any more info about this?

Reported by csilvers on 2011-01-10 03:03:24


@alk alk commented Aug 22, 2015

Unfortunately, no. I hope I'll get some "free" time next month (Feb) and will probably
look into it.

Reported by long404 on 2011-01-12 17:18:13


@alk alk commented Aug 22, 2015

See issue 321: I'll bet this is a libunwind bug, since the stack trace looks the same.
Try libunwind from the git repository. Hopefully they'll put out a new release soon,
so the distributions will actually pick it up.

Reported by evanj@csail.mit.edu on 2011-03-31 14:12:30


@alk alk commented Aug 22, 2015

Thanks for the report!  I'm going to close this as a dup of 321.  If you find out it's
something different, feel free to reopen.

Reported by csilvers on 2011-09-01 01:15:02

  • Status changed: Duplicate
  • Merged into: #321


@alk alk closed this Aug 22, 2015