Use Linux madvise(MADV_DONTDUMP) to exclude ASan shadow regions from core dumps #345

Closed
ramosian-glider opened this issue Sep 1, 2015 · 17 comments

6 participants
@ramosian-glider
Member

commented Sep 1, 2015

Originally reported on Google Code with ID 345

AddressSanitizer maps huge regions to support its state tracking, so core dumps from
ASan-managed processes are very large on 32-bit and unmanageably large on 64-bit.
Core dumps are disabled by default, which prevents these dumps from being
generated, but in some cases it would be very useful to get a manageable dump from
an ASan-enabled process. On Linux 3.4 and later, the madvise system call accepts the
MADV_DONTDUMP advice value to exclude a region from being written to a core file. The attached
proof-of-concept patch uses it to exclude the ASan shadow ranges. A test
program using the patched libsanitizer generates core files that, although larger than
those of an ASan-free build, are quite manageable (~151M core for a trivial crash program).
This test was done with the libsanitizer that ships with gcc-4.9, but the change should apply
equally to the clang libsanitizer.
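
As a rough illustration of the idea (not the attached patch itself), the sketch below reserves an anonymous region and marks it with MADV_DONTDUMP. The helper names `map_scratch_region` and `exclude_from_core_dump` are hypothetical and exist only for this example; the region stands in for a shadow range.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: reserve a large anonymous region that stands in
 * for a shadow range. MAP_NORESERVE avoids reserving swap for pages that
 * are never touched. Returns NULL on failure. */
void *map_scratch_region(size_t len) {
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Hypothetical helper: ask the kernel to leave [addr, addr + len) out of
 * core dumps. Returns 0 on success, or -1 with errno set (EINVAL on
 * kernels older than 3.4, which do not know MADV_DONTDUMP). */
int exclude_from_core_dump(void *addr, size_t len) {
    return madvise(addr, len, MADV_DONTDUMP);
}
```

Because the mark lives on the mapping itself, it applies no matter which signal later causes the kernel to write a core file.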

Reported by google.8eaf7cd8e5128d8191fe@spamgourmet.com on 2014-09-23 03:00:44


- _Attachment: [0001-asan-madvise.patch](https://storage.googleapis.com/google-code-attachments/address-sanitizer/issue-345/comment-0/0001-asan-madvise.patch)_
@ramosian-glider

Member Author

commented Sep 1, 2015

Hm, this looks much better than disabling core dumping completely.

Reported by eugenis@google.com on 2014-09-23 13:43:43

@ramosian-glider

Member Author

commented Sep 1, 2015

Did you try ASAN_OPTIONS=unmap_shadow_on_exit=1?

Reported by konstantin.s.serebryany on 2014-09-23 16:30:08

@ramosian-glider

Member Author

commented Sep 1, 2015

Regarding ASAN_OPTIONS=unmap_shadow_on_exit=1:

No, I did not previously try that.  It's not listed on the Wiki page of known flags
<https://code.google.com/p/address-sanitizer/wiki/Flags> and I did not notice it while
reading the libsanitizer source to implement the madvise patch.  However, now that
you pointed it out, I tried it and it does not seem to do what I want.  I see that
it is supported in both gcc-4.8 and gcc-4.9.  To test it, I used a program where main()
calls abort(), to simulate a program which dies due to failing an internal consistency
check, as opposed to dying because AddressSanitizer found a memory misuse.  I compiled
it with both gcc-4.8, which uses the stock libsanitizer, and with gcc-4.9, which has
my proof of concept madvise patch applied.

When using the stock libsanitizer of gcc-4.8 with ASAN_OPTIONS=unmap_shadow_on_exit=1,disable_core=0
./abort48, the program goes to 100% CPU in system mode and ultimately generates an
apparently 14T core (actual size per du: 3.4M).  I also tried using a colon to separate
the options, with the same result.  The core took about 2 minutes to write, despite
its very small effective size.
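
The gap between the apparent size and the size reported by du comes from the core being a sparse file. The following sketch, which assumes Linux (where `st_blocks` counts 512-byte units), shows how the two numbers can be read for any file; the helper names are hypothetical.

```c
#define _GNU_SOURCE
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Apparent size: the byte length that ls -l shows. Returns -1 on error. */
long long apparent_size(const char *path) {
    struct stat st;
    if (stat(path, &st) != 0) return -1;
    return (long long)st.st_size;
}

/* Allocated size: roughly what du shows. Returns -1 on error. */
long long allocated_size(const char *path) {
    struct stat st;
    if (stat(path, &st) != 0) return -1;
    return (long long)st.st_blocks * 512;
}

/* Create a file that is all hole: large apparent size, almost nothing
 * allocated. path_template must end in "XXXXXX" (see mkstemp). */
int make_sparse_file(char *path_template, off_t size) {
    int fd = mkstemp(path_template);
    if (fd < 0) return -1;
    int rc = ftruncate(fd, size);
    close(fd);
    return rc;
}
```

A core whose apparent size is in the terabytes can therefore occupy only a few megabytes on disk, although the kernel may still spend a long time walking the address ranges it skips.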

When using the locally patched libsanitizer of gcc-4.9 with ASAN_OPTIONS=disable_core=0
./abort49, the program dumps an apparently 48M core (actual size per du: 2.4M) and
exits almost instantly.

I also tried a test program which writes to *(int*)nullptr, to trigger an AddressSanitizer
trap.  In that case, I needed to add abort_on_error=1, otherwise no core file was generated
after AddressSanitizer trapped the SIGSEGV.  This is a bit counterintuitive, since
an unsanitized program would have dumped core on a null pointer write, but an AddressSanitizer-instrumented
program requires both disable_core=0 *and* abort_on_error=1 to dump core on a null
pointer write.  I expected abort_on_error=1 was only needed if I wanted to abort on
errors found specific to AddressSanitizer (redzone, malloc/delete mismatch, etc.) and
that regular errors were affected only by disable_core.  Using abort_on_error=1 here
also generates a core file that is recorded as an ABRT (gdb says "Program terminated
with signal SIGABRT, Aborted."), which while technically true, is misleading since
the abort happened in response to a SIGSEGV.  The frame where the SIGSEGV happened
seems to be well recorded, so the core is still usable for debugging.

- Using ASAN_OPTIONS=disable_core=0 ./segv48, I get an AddressSanitizer report, no core file, and immediate return to shell.
- Using ASAN_OPTIONS=disable_core=0:abort_on_error=1 ./segv48, I get the long stall and 14T core file, returning to the shell after about 2 minutes.
- Using ASAN_OPTIONS=disable_core=0:unmap_shadow_on_exit=1 ./segv48, I get no core file and an immediate return to shell.
- Using ASAN_OPTIONS=disable_core=0:abort_on_error=1:unmap_shadow_on_exit=1 ./segv48, I get an abort and immediate small core file.

Thus, I understand what you were hoping to see when you suggested unmap_shadow_on_exit=1,
but it does not solve the problem fully, since it only works on scenarios where AddressSanitizer
triggered the abort, but not on scenarios where the program called abort() on its own
(nor scenarios with less common core-generating signals, such as SIGQUIT).  Marking
the tables as not-dumpable ensures that they are not written regardless of why the
kernel writes a core dump.
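
For readers who want to reproduce the two failure modes without ASan in the picture, here is a minimal sketch. It is not the original abort48/segv48 source, which was not posted; the function names are hypothetical. Each crash runs in a child process so the parent can report the terminating signal.

```c
#include <signal.h>
#include <stdlib.h>
#include <sys/resource.h>
#include <sys/wait.h>
#include <unistd.h>

/* First case: the program fails its own consistency check. */
void crash_by_abort(void) {
    abort();
}

/* Second case: a write through a null pointer. The pointer is read
 * through a volatile variable so the compiler cannot assume it is null
 * and rewrite the store. */
void crash_by_null_write(void) {
    int *volatile target = NULL;
    *target = 1;
}

/* Run `crash` in a child and return the signal that killed it, 0 if it
 * exited normally, or -1 if fork/waitpid failed. */
int signal_that_kills(void (*crash)(void)) {
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        /* Assumption for this sketch: suppress core files so only the
         * signal is observed. Remove these two lines to get real dumps. */
        struct rlimit no_core = {0, 0};
        setrlimit(RLIMIT_CORE, &no_core);
        crash();
        _exit(0);
    }
    int status = 0;
    if (waitpid(pid, &status, 0) != pid) return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

In an uninstrumented build the first case ends with SIGABRT and the second with SIGSEGV, which is the baseline the ASan-instrumented runs above are being compared against.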

Reported by google.8eaf7cd8e5128d8191fe@spamgourmet.com on 2014-09-24 22:41:28

@ramosian-glider

Member Author

commented Sep 1, 2015

Makes sense. Please send a patch to llvm-commits@ (see https://code.google.com/p/address-sanitizer/wiki/HowToContribute).

Note that we do not #include system headers in asan_rtl.cc.
You will need to create a separate function similar to FlushUnneededShadowMemory in
lib/sanitizer_common/sanitizer_posix_libcdep.cc.

The change will also need a test in test/asan/TestCases/Linux.

Thanks for the detailed explanation!

Reported by konstantin.s.serebryany on 2014-09-24 22:54:52

@ramosian-glider

Member Author

commented Sep 1, 2015

Can somebody please summarize whether there is something actionable for me to do or not
(I am assigned as the owner)?

Reported by dvyukov@google.com on 2014-09-26 02:16:28

@ramosian-glider

Member Author

commented Sep 1, 2015

Reported by konstantin.s.serebryany on 2014-09-26 04:45:34

@ramosian-glider

Member Author

commented Sep 1, 2015

I am not a regular committer on llvm or gcc.  I posted the original attachment as a
demonstration for how to implement the change, with the hope that it would be useful
for other one-off users and as a reference for whoever picks up the change to merge
it into libsanitizer.  I do not expect to have enough free time soon to complete all
the steps required to merge the change.

Reported by google.8eaf7cd8e5128d8191fe@spamgourmet.com on 2014-09-27 00:42:27

@ramosian-glider

Member Author

commented Sep 1, 2015

I was wondering if we could do this for both asan and tsan, so I made this patch. It would
be great if someone could comment on it.

Reported by nischayn22 on 2014-12-03 10:12:02


- _Attachment: [common_madvise_fix.patch](https://storage.googleapis.com/google-code-attachments/address-sanitizer/issue-345/comment-8/common_madvise_fix.patch)_
@ramosian-glider

Member Author

commented Sep 1, 2015

Sure, the approach makes sense. 
You are more than welcome to contribute the full patch:
  - add a run-time flag use_madv_dontdump, on by default
  - add a test which tests use_madv_dontdump=1, use_madv_dontdump=0, and default
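
One way such a test could observe the effect without producing a core file is to read the VmFlags line in /proc/self/smaps, where Linux 3.8 and later report "dd" for mappings excluded from core dumps. The sketch below is only an assumption about how a check might look, not the test that was eventually committed; the function names are hypothetical, and the `use_madv_dontdump` parameter is a plain C stand-in for the proposed run-time flag.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

/* Returns 1 if the mapping containing addr carries the "dd" flag,
 * 0 if it does not, and -1 if this cannot be determined (no smaps,
 * or a kernel without VmFlags). */
int has_dontdump_flag(const void *addr) {
    FILE *f = fopen("/proc/self/smaps", "r");
    if (!f) return -1;
    char line[512];
    int in_target = 0, result = -1;
    unsigned long target = (unsigned long)(uintptr_t)addr;
    while (fgets(line, sizeof line, f)) {
        unsigned long start, end;
        if (sscanf(line, "%lx-%lx ", &start, &end) == 2) {
            /* Header line of a mapping: "start-end perms ..." */
            in_target = (target >= start && target < end);
        } else if (in_target && strncmp(line, "VmFlags:", 8) == 0) {
            result = strstr(line + 8, " dd") != NULL;
            break;
        }
    }
    fclose(f);
    return result;
}

/* Map a scratch region, optionally apply MADV_DONTDUMP, and report the
 * flag state as seen by the kernel. Returns -1 on setup failure. */
int check_scratch_region(int use_madv_dontdump) {
    size_t len = (size_t)1 << 20;
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    if (p == MAP_FAILED) return -1;
    if (use_madv_dontdump && madvise(p, len, MADV_DONTDUMP) != 0) {
        munmap(p, len);
        return -1;
    }
    int flag = has_dontdump_flag(p);
    munmap(p, len);
    return flag;
}
```

Checking the mapping flags keeps the test fast, at the cost of depending on procfs details rather than on an actual dump.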

Reported by konstantin.s.serebryany on 2014-12-03 17:12:22

@ramosian-glider

Member Author

commented Sep 1, 2015

I've filed a patch for this at http://reviews.llvm.org/D7294

Reported by tetra2005x on 2015-01-30 14:59:10

@ramosian-glider

Member Author

commented Sep 1, 2015

Adding Project:AddressSanitizer as part of GitHub migration.

Reported by ramosian.glider on 2015-07-30 09:06:34

  • Labels added: ProjectAddressSanitizer
@ghost


commented Dec 1, 2015

This was committed long ago; could you close?

@RoelVdP


commented Aug 26, 2016

How is this supposed to work? I tried all sorts of combinations with mysqld, but either the dump is very large or it is too small (13 or 14MB) and does not contain suitable information. I tried things like:

export ASAN_OPTIONS=disable_core=0:abort_on_error=1:unmap_shadow_on_exit=1
export ASAN_OPTIONS=disable_core=0:abort_on_error=1
export ASAN_OPTIONS=abort_on_error=1:use_madv_dontdump=1

@yugr


commented Aug 29, 2016

@chefmax

Collaborator

commented Aug 29, 2016

@RoelVdP What GCC/Clang version do you use?

@RoelVdP


commented Aug 30, 2016

@chefmax thanks.

$ cat /etc/redhat-release
CentOS Linux release 7.2.1511 (Core)
$ gcc --version | grep ^gcc
gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-4)

@chefmax

Collaborator

commented Aug 30, 2016

Oh, it seems that your GCC is too old. The madvise change was implemented in clang at the beginning of 2015, so I suspect you'll need at least GCC 6+.
