Prevent overwrite of signal mask when installing ANR handler #520
Goal
Fixes a bug reported in #502 where the process crashes with SIGPIPE if the network connection is lost during a request and ANR detection is enabled.
Design
The following may be useful for further reading: https://www.gnu.org/software/libc/manual/html_node/Signal-Handling.html#Signal-Handling
Our current implementation installs a SIGQUIT handler to record when an ANR occurred. While this handler is being installed we update the thread mask to block any concurrent SIGQUIT signals. After the handler is installed, the thread mask is updated to unblock SIGQUIT, so that we can handle incoming signals once more.
However, the current implementation overwrites all pre-existing values in the thread mask, because it uses the SIG_SETMASK option when calling pthread_sigmask. This replaces the mask with one containing only SIGQUIT, meaning that any signals that were previously blocked, such as SIGPIPE, are no longer blocked.
As a result the app crashes when writing to a broken network socket: normally SIGPIPE would not be delivered and the write would fail with an errno instead of terminating the process. Other signals may also be affected by this, although we have not yet observed it.
Proposed Fix
We should use SIG_BLOCK instead of SIG_SETMASK. SIG_BLOCK sets the thread mask to the union of the existing values and the set parameter, so the previous mask values are still in place after the handler is installed.
An alternative approach would be to ignore SIGPIPE using sigaction, but this would not address the underlying issue.
Tests
I first verified the behaviour by following the instructions in this example app: https://github.com/fxdemolisher/bugsnag-android-crash-minimal-repro
I then modified the example app to use an artefact with this changeset from mavenLocal, and verified that the app no longer crashed. I also verified that a SIGSEGV and ANR could still be reported as normal.