Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[compiler-rt] Avoid generating coredumps when piped to a tool #83701

Conversation

arichardson
Copy link
Member

@arichardson arichardson commented Mar 3, 2024

I was trying to debug why ninja check-compiler-rt was taking so long
to run on my system and after some debugging it turned out that most of
the time was being spent generating core dumps.

On many current Linux systems, coredumps are no longer dumped in the CWD
but instead piped to a utility such as systemd-coredumpd that stores
them in a deterministic location. This can be done by setting the
kernel.core_pattern sysctl to start with a '|'. However, when using such
a setup the kernel ignores a coredump limit of 0 (since there is no file
being written) and we can end up piping many gigabytes of data to
systemd-coredumpd which causes the test suite to freeze for a long time.
While most piped coredump handlers do respect the crashing processes'
RLIMIT_CORE, this is notable not the case for Debian's systemd-coredump
due to a local patch that changes sysctl.d/50-coredump.conf to ignore
the specified limit and instead use RLIM_INFINITY
(https://salsa.debian.org/systemd-team/systemd/-/commit/64599ffe44f0d).

Fortunately there is a workaround: the kernel recognizes the magic value
of 1 for RLIMIT_CORE to disable coredumps when piping. One byte is also
too small to generate any coredump, so it effectively behaves as if we
had set the value to zero.

The alternative to using RLIMIT_CORE=1 would be to use prctl() with the
PR_SET_DUMPABLE flag, however that also prevents ptrace(), so makes it
impossible to attach a debugger.

Fixes: #45797

Created using spr 1.3.4
@llvmbot
Copy link
Collaborator

llvmbot commented Mar 3, 2024

@llvm/pr-subscribers-compiler-rt-sanitizer

Author: Alexander Richardson (arichardson)

Changes

I was trying to debug why ninja check-compiler-rt was taking so long
to run on my system and after some debugging it turned out that most of
the time was being spent generating core dumps.

On many current Linux systems, coredumps are no longer dumped in the CWD
but instead piped to a utility such as systemd-coredumpd that stores
them in a deterministic location. This can be done by setting the
kernel.core_pattern sysctl to start with a '|'. However, when using such
a setup the kernel ignores a coredump limit of 0 (since there is no file
being written) and we can end up piping many gigabytes of data to
systemd-coredumpd which causes the test suite to freeze for a long time.

Fortunately there is a workaround: the kernel recognizes the magic value
of 1 for RLIMIT_CORE to disable coredumps when piping. One byte is also
too small to generate any coredump, so it effectively behaves as if we
had set the value to zero.

Fixes: #45797


Full diff: https://github.com/llvm/llvm-project/pull/83701.diff

1 Files Affected:

  • (modified) compiler-rt/lib/sanitizer_common/sanitizer_posix_libcdep.cpp (+20-1)
diff --git a/compiler-rt/lib/sanitizer_common/sanitizer_posix_libcdep.cpp b/compiler-rt/lib/sanitizer_common/sanitizer_posix_libcdep.cpp
index e88e654eec5a1c..7472a256870c16 100644
--- a/compiler-rt/lib/sanitizer_common/sanitizer_posix_libcdep.cpp
+++ b/compiler-rt/lib/sanitizer_common/sanitizer_posix_libcdep.cpp
@@ -103,9 +103,28 @@ static void setlim(int res, rlim_t lim) {
 }
 
 void DisableCoreDumperIfNecessary() {
+  rlimit rlim;
+  CHECK_EQ(0, getrlimit(RLIMIT_CORE, &rlim));
   if (common_flags()->disable_coredump) {
-    setlim(RLIMIT_CORE, 0);
+#  ifdef __linux__
+    // On Linux, if the kernel.core_pattern sysctl starts with a '|' (i.e. it
+    // is being piped to a coredump handler such as systemd-coredumpd), the
+    // kernel ignores RLIMIT_CORE (since we aren't creating a file in the file
+    // system) except for the magic value of 1, that disables coredumps when
+    // piping. 1 byte is also too small for any kind of valid core dump, so it
+    // also disables coredumps if kernel.core_pattern creates files directly.
+    rlim.rlim_cur = Min<rlim_t>(1, rlim.rlim_max);
+#  else
+    rlim.rlim_cur = 0;
+#  endif
+  } else {
+    // Set the limit to 1GB to avoid blocking the system for multiple minutes
+    // while dumping all (potentially huge) mappings.
+    // We are only setting the soft limit, so if it's really wanted, users can
+    // override it to generate a massive core dump.
+    rlim.rlim_cur = Min<rlim_t>(1024 * 1024 * 1024, rlim.rlim_max);
   }
+  CHECK_EQ(0, setrlimit(RLIMIT_CORE, &rlim));
 }
 
 bool StackSizeIsUnlimited() {

Created using spr 1.3.4
@MaskRay
Copy link
Member

MaskRay commented Mar 4, 2024

How about PR_SET_DUMPABLE? #83703 (comment)

arichardson added a commit to arichardson/upstream-llvm-project that referenced this pull request Mar 7, 2024
I was trying to debug why `ninja check-compiler-rt` was taking so long
to run on my system and after some debugging it turned out that most of
the time was being spent generating core dumps.

On many current Linux systems, coredumps are no longer dumped in the CWD
but instead piped to a utility such as systemd-coredumpd that stores
them in a deterministic location. This can be done by setting the
kernel.core_pattern sysctl to start with a '|'. However, when using such
a setup the kernel ignores a coredump limit of 0 (since there is no file
being written) and we can end up piping many gigabytes of data to
systemd-coredumpd which causes the test suite to freeze for a long time.

Fortunately there is a workaround: the kernel recognizes the magic value
of 1 for RLIMIT_CORE to disable coredumps when piping. One byte is also
too small to generate any coredump, so it effectively behaves as if we
had set the value to zero.

Fixes: llvm#45797

Pull Request: llvm#83701
arichardson added a commit to arichardson/upstream-llvm-project that referenced this pull request Mar 7, 2024
On Linux, if the kernel.core_pattern sysctl starts with a '|' (i.e. it
is being piped to a coredump handler such as systemd-coredumpd), the
kernel ignores RLIMIT_CORE (since we aren't creating a file in the file
system) except for the magic value of 1, that disables coredumps when
piping. 1 byte is also too small for any kind of valid core dump, so it
also disables coredumps if kernel.core_pattern creates files directly.

See llvm#83701 and
llvm#45797

Pull Request: llvm#83703
@arichardson
Copy link
Member Author

Thanks for the suggestions. Will address shortly and will add more comments explaining this behaviour and that it's needed to avoid overloading the system (at least for Debian due to an incorrect systemd-coredumpd patch).

@MaskRay
Copy link
Member

MaskRay commented Mar 8, 2024

Since PR_SET_DUMPABLE has the downside of not working with PTRACE_ATTACH, I agree that setting RLIMIT_CORE is better, and Vitaly's code suggestion looks good.

arichardson added a commit that referenced this pull request Mar 9, 2024
On many current Linux systems, coredumps are no longer dumped in the CWD
but instead piped to a utility such as systemd-coredumpd that stores
them in a deterministic location. This can be done by setting the
kernel.core_pattern sysctl to start with a '|'. However, when using such
a setup the kernel ignores a coredump limit of 0 (since there is no file
being written) and we can end up piping many gigabytes of data to
systemd-coredumpd which causes the test suite to freeze for a long time.
While most piped coredump handlers do respect the crashing processes'
RLIMIT_CORE, this is notable not the case for Debian's systemd-coredump
due to a local patch that changes sysctl.d/50-coredump.conf to ignore
the specified limit and instead use RLIM_INFINITY
(https://salsa.debian.org/systemd-team/systemd/-/commit/64599ffe44f0d).

Fortunately there is a workaround: the kernel recognizes the magic value
of 1 for RLIMIT_CORE to disable coredumps when piping. One byte is also
too small to generate any coredump, so it effectively behaves as if we
had set the value to zero.

The alternative to using RLIMIT_CORE=1 would be to use prctl() with the
PR_SET_DUMPABLE flag, however that also prevents ptrace(), so makes it
impossible to attach a debugger.

See #83701 and
#45797

Reviewed By: MaskRay

Pull Request: #83703
Copy link
Collaborator

@vitalybuka vitalybuka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@arichardson arichardson merged commit 27e5312 into main Mar 13, 2024
3 of 4 checks passed
@arichardson arichardson deleted the users/arichardson/spr/compiler-rt-avoid-generating-coredumps-when-piped-to-a-tool branch March 13, 2024 18:32
@antmox
Copy link
Contributor

antmox commented Mar 14, 2024

Hello, some bots are broken, probably due to this commit:
https://lab.llvm.org/buildbot/#/builders/185/builds/6312
https://lab.llvm.org/buildbot/#/builders/179/builds/9606
Could you please take a look?

// The alternative to using RLIMIT_CORE=1 would be to use prctl() with the
// PR_SET_DUMPABLE flag, however that also prevents ptrace(), so makes it
// impossible to attach a debugger.
rlim.rlim_cur = Min<rlim_t>(SANITIZER_LINUX ? 1 : 0, rlim.rlim_max);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if rlim.rlim_max == 0? The test in corelimit.cpp might fail on Linux?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@arichardson should this rather be:
rlim.rlim_cur = Min<rlim_t>(SANITIZER_LINUX ? 1 : 0, rlim.rlim_cur);

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, will land a fixed version tomorrow

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, looking at this again it was intended. If rlim_max is zero and we request 1 the syscall will fail with an error, so the min() call should look at rlim_max rather than rlim_cur. Will add a comment.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

disable_coredump=1 does not disable coredump on systemd
6 participants