Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prevent overwrite of signal mask when installing ANR handler #520

Merged
merged 2 commits into from Jul 10, 2019

Conversation

fractalwrench
Copy link
Contributor

@fractalwrench fractalwrench commented Jul 10, 2019

Goal

Fixes a bug reported in #502 where the process crashes with SIGPIPE if the network connection is lost during a request and ANR detection is enabled.

Design

The following may be useful for further reading: https://www.gnu.org/software/libc/manual/html_node/Signal-Handling.html#Signal-Handling

Our current implementation installs a SIGQUIT handler to record when an ANR occurred. When this handler is installed we update the thread mask to block any concurrent SIGQUIT signals. After the handler is installed, the thread mask is updated to unblock SIGQUIT, so that we can handle incoming signals once more.

However, one apparent issue with the current implementation is that it overwrites all pre-existing values in the thread mask, by using the SIG_SETMASK option when calling pthread_sigmask. This has the effect of setting only SIGQUIT in the thread mask, meaning any previously default values such as SIGPIPE are no longer blocked.

This has the obvious effect of crashing the app when writing to a broken network socket, as typically SIGPIPE would be ignored and return an errno instead, rather than terminating the process. Other signals may also be affected by this, which we have not yet observed.

Proposed Fix

We should use SIG_BLOCK instead of SIG_SETMASK. SIG_BLOCK sets the thread mask to a union of the existing values and the set parameter, meaning that after the handler is installed the previous mask values will be restored.

An alternative approach would be to ignore SIGPIPE using sigaction but this would not address the underlying issue.

Tests

I first verified the behaviour by following the instructions in this example app: https://github.com/fxdemolisher/bugsnag-android-crash-minimal-repro

I then modified the example app to use an artefact with this changeset from mavenLocal, and verified that the app no longer crashed. I also verified that a SIGSEGV and ANR could still be reported as normal.

…QUIT handler

the ANR implementation installs a SIGQUIT handler that used SIG_SETMASK to set the thread mask,
which overwrites the previous mask values. Using SIG_BLOCK instead creates a union of the two sets,
meaning SIGQUIT is still blocked while the handler is installed, along with any pre-existing signals
in the mask. This fixes a crash in the ANR implementation where a SIGPIPE was raised when writing to
a broken network socket, crashing the process.
Copy link
Contributor

@kattrali kattrali left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the ideal solution! Nice catch.

@fractalwrench fractalwrench merged commit 50327da into master Jul 10, 2019
@fractalwrench fractalwrench deleted the fix-sigpipe-crash branch July 10, 2019 12:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants