Running valgrind throws General Protection Fault on dl-misc.c #1295

Closed
saschanaz opened this Issue Oct 30, 2016 · 10 comments

saschanaz commented Oct 30, 2016

  • A brief description

Running make; valgrind ./tracegen -M 32 -N 32 -F 1 throws a General Protection Fault on WSL, but not on a native Ubuntu 16.04 install.

  • Expected results: The command should pass without error
  • Actual results (with terminal output if applicable)
==742== Memcheck, a memory error detector
==742== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==742== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==742== Command: ./tracegen -M 32 -N 32 -F 1
==742==
==742==
==742== Process terminating with default action of signal 11 (SIGSEGV): dumping core
==742==  General Protection Fault
==742==    at 0x40117B1: _dl_name_match_p (dl-misc.c:288)
==742==
==742== HEAP SUMMARY:
==742==     in use at exit: 0 bytes in 0 blocks
==742==   total heap usage: 2 allocs, 2 frees, 1,064 bytes allocated
==742==
==742== All heap blocks were freed -- no leaks are possible
==742==
==742== For counts of detected and suppressed errors, rerun with: -v
==742== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Segmentation fault (core dumped)
  • Your Windows build number

Insider build 14955

  • Steps / All commands required to reproduce the error from a brand new installation
  1. Download and unzip https://1drv.ms/u/s!ArDhQNmhsUj02YdmxTJYPm69bWj4cg
  2. make;valgrind ./tracegen -M 32 -N 32 -F 1
  3. Observe that a segmentation fault occurs
  • Strace of the failing command

https://gist.github.com/SaschaNaz/cdf4081a03ed13e97090862aff98dd28

  • Required packages and commands to install
  1. sudo apt-get install valgrind

Contributor

aseering commented Oct 30, 2016

Hi @saschanaz -- thanks for reporting this! It sounds like a specific instance of the issue discussed in #120. You might want to follow up there. In particular, it would be interesting to know whether the workaround on that thread (compiling Valgrind from source rather than installing the Ubuntu binary package) works for you.

saschanaz commented Oct 30, 2016

Hi @aseering, unfortunately building from source doesn't fix it, but it does produce a different stack trace:

==23483== Memcheck, a memory error detector
==23483== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==23483== Using Valgrind-3.12.0 and LibVEX; rerun with -h for copyright info
==23483== Command: ./tracegen -M 32 -N 32 -F 1
==23483==
==23483==
==23483== Process terminating with default action of signal 11 (SIGSEGV): dumping core
==23483==  General Protection Fault
==23483==    at 0x400A476: do_lookup_x (dl-lookup.c:423)
==23483==    by 0xFFEFFF247: ???
==23483==    by 0x400A94E: _dl_lookup_symbol_x (dl-lookup.c:829)
==23483==    by 0x2: ???
==23483==    by 0x403059F: ???
==23483==
==23483== HEAP SUMMARY:
==23483==     in use at exit: 0 bytes in 0 blocks
==23483==   total heap usage: 2 allocs, 2 frees, 1,064 bytes allocated
==23483==
==23483== All heap blocks were freed -- no leaks are possible
==23483==
==23483== For counts of detected and suppressed errors, rerun with: -v
==23483== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Segmentation fault (core dumped)

PS: I just did an apt-get upgrade and now I'm getting a more specific one:

==1881== Memcheck, a memory error detector
==1881== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==1881== Using Valgrind-3.12.0 and LibVEX; rerun with -h for copyright info
==1881== Command: ./tracegen -M 32 -N 32 -F 1
==1881==
==1881==
==1881== Process terminating with default action of signal 11 (SIGSEGV): dumping core
==1881==  General Protection Fault
==1881==    at 0x4009B05: check_match (dl-lookup.c:92)
==1881==    by 0x4E49FF7: ??? (in /lib/x86_64-linux-gnu/libc-2.23.so)
==1881==    by 0x400A47A: do_lookup_x (dl-lookup.c:423)
==1881==    by 0x400A94E: _dl_lookup_symbol_x (dl-lookup.c:829)
==1881==    by 0x400F8F5: _dl_fixup (dl-runtime.c:111)
==1881==    by 0x4017752: _dl_runtime_resolve_avx (dl-trampoline.h:112)
==1881==    by 0x400E2D: main (tracegen.c:100)
==1881==
==1881== HEAP SUMMARY:
==1881==     in use at exit: 0 bytes in 0 blocks
==1881==   total heap usage: 2 allocs, 2 frees, 1,064 bytes allocated
==1881==
==1881== All heap blocks were freed -- no leaks are possible
==1881==
==1881== For counts of detected and suppressed errors, rerun with: -v
==1881== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Segmentation fault (core dumped)

Contributor

aseering commented Oct 30, 2016

Thanks @saschanaz for the quick turnaround! That error actually looks different to me. The previous error looked like Valgrind itself was segfaulting, whereas this one looks like Valgrind is intercepting a segfault in your application. That implies to me that it may have actually identified a bug in your application.

The stack trace is in a function that performs dynamic loading. My first guesses would be that you're either trying to look up a symbol in a ".so" file that is corrupted in some way, or you have some other sort of memory error: possibly a buffer overflow on the stack, which Valgrind wouldn't catch, or possibly a use-after-free() that, in this run, happens after the relevant block has been re-allocated, so Valgrind wouldn't flag it as a use of unallocated memory. Could either of those be the case here? (I admit, I haven't looked closely at your code; I figure you know more about it than I do.)
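To illustrate the use-after-free case described above: once a freed block has been handed out again, a stale pointer to it refers to live, allocated memory, so an access through it no longer looks like a use of unallocated memory. The sketch below only checks whether the allocator returns the same address for a same-sized request; it never dereferences the stale pointer. The function name freed_block_is_reused is made up for this illustration, and whether the address is reused at all depends on the allocator and on the tool the program runs under.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper, for illustration only.
 * Returns 1 if a second malloc() of the same size was given the address
 * that was just freed, 0 if it was given a different address, and -1 if
 * an allocation failed. The freed memory itself is never accessed. */
int freed_block_is_reused(size_t size) {
    void *first = malloc(size);
    if (first == NULL) {
        return -1;
    }
    /* Save the address as an integer before free(); the pointer value
     * itself must not be used once the block has been released. */
    uintptr_t first_address = (uintptr_t)first;
    free(first);

    void *second = malloc(size);
    if (second == NULL) {
        return -1;
    }
    int reused = ((uintptr_t)second == first_address);
    free(second);
    return reused;
}
```

If this returns 1, any pointer kept from the first allocation would now alias the second block, which is the situation the comment describes.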

Also -- do you know whether Valgrind runs this application correctly on a real Ubuntu Linux system? I'm not sitting at one right now so I haven't checked. But Valgrind basically runs your entire application in an emulated virtual environment; I have hit cases in the past where that emulation was not complete and, as a result, Valgrind failed to run a binary that worked just fine when running natively.

saschanaz commented Oct 30, 2016

@aseering The code is a "cachelab" assignment template from CMU, so I don't think there is any big issue in it. And yes, it runs fine on a real Ubuntu 16.04 install running in a VM.

Contributor

aseering commented Oct 30, 2016

@saschanaz -- hm... Yeah, I don't see anything wrong with that code. I'm not sure what's going on here. I'm just another WSL user; I think someone from the WSL team might know more?

For what it's worth, per #120, Valgrind has been known not to work on WSL for a while now; my assumption is that it depends on some Linux kernel functionality that's somewhere in the WSL team's backlog.

misenesi added the bug label Nov 8, 2016

misenesi commented Nov 8, 2016

I have opened an internal bug for this.

Grauniad commented Nov 20, 2016

Hi @misenesi, do we know when there will be a fix for this, or whether there is a workaround?

misenesi commented Dec 9, 2016

@Grauniad It is one of the items on my todo list, but I am not sure yet when I will get to it. If anyone more familiar with valgrind could identify the source of the issue, or the difference between Ubuntu and WSL, I could get the fix in quicker.

Grauniad commented Dec 15, 2016

Hi @misenesi, based on the investigation here (https://github.com/Grauniad/valgrind), I'm fairly sure the issue is with the si_code being raised when a process attempts to access unmapped memory.

Logs / repro binary here: https://github.com/Grauniad/valgrind/tree/master/logs

I think #120 is the same issue.
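For readers who want to see the si_code in question, here is a minimal sketch (not the code from the linked investigation) that touches an unmapped page and records the si_code delivered with the resulting SIGSEGV. The names probe_unmapped_si_code, record_si_code, recover_point and observed_si_code are made up for this illustration. On a native Linux kernel the reported value should be SEGV_MAPERR; comparing that with what another environment reports is one way to look for the kind of difference described above.

```c
#define _DEFAULT_SOURCE  /* for MAP_ANONYMOUS on glibc */
#include <setjmp.h>
#include <signal.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Illustrative names only; none of these come from the issue thread. */
static sigjmp_buf recover_point;
static volatile sig_atomic_t observed_si_code;

/* SIGSEGV handler: remember si_code, then jump back past the faulting
 * store. Jumping out of a handler is a common trick for synchronous
 * faults like this one, but it is not something to do in general code. */
static void record_si_code(int signo, siginfo_t *info, void *context) {
    (void)signo;
    (void)context;
    observed_si_code = info->si_code;
    siglongjmp(recover_point, 1);
}

/* Maps one page, unmaps it again, writes to it, and stores the si_code
 * of the resulting SIGSEGV in *si_code_out.
 * Returns 0 on success, -1 if setup failed or no fault occurred. */
int probe_unmapped_si_code(int *si_code_out) {
    struct sigaction action, previous;
    memset(&action, 0, sizeof action);
    action.sa_sigaction = record_si_code;
    action.sa_flags = SA_SIGINFO;  /* ask for the siginfo_t argument */
    sigemptyset(&action.sa_mask);

    long page_size = sysconf(_SC_PAGESIZE);
    if (page_size <= 0) {
        return -1;
    }
    void *page = mmap(NULL, (size_t)page_size, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED) {
        return -1;
    }
    /* After munmap() the address is no longer backed by any mapping. */
    if (munmap(page, (size_t)page_size) != 0) {
        return -1;
    }
    if (sigaction(SIGSEGV, &action, &previous) != 0) {
        return -1;
    }

    if (sigsetjmp(recover_point, 1) == 0) {
        *(volatile char *)page = 1;  /* expected to fault */
        sigaction(SIGSEGV, &previous, NULL);
        return -1;                   /* no fault: nothing to report */
    }

    /* Reached via siglongjmp() from the handler. */
    sigaction(SIGSEGV, &previous, NULL);
    *si_code_out = observed_si_code;
    return 0;
}
```

Calling probe_unmapped_si_code from a small main and printing the result next to SEGV_MAPERR and SEGV_ACCERR from <signal.h> shows which code the kernel reports.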

misenesi commented Dec 30, 2016

@Grauniad thank you for your investigation, it is deeply appreciated! I have checked in a fix for this.
