Sentry causes process to quit due to signal SIGPIPE #863

Closed
1 of 3 tasks
daniel-falk opened this issue Jul 9, 2023 · 6 comments

@daniel-falk

Description

I have a complex program running on a custom Linux distribution (OpenEmbedded). The program involves a lot of GStreamer code. I'm trying to add Sentry logging to this process, but if I call sentry_init in my program, the program starts exiting sporadically. It exits with no log messages, and no crash or message is sent to Sentry.

When does the problem happen

One failure case causes the application to exit with return code 141, which typically means that SIGPIPE was not handled. This happens when GStreamer fails to write data to a network socket, which raises SIGPIPE. GStreamer automatically installs a signal handler to catch this signal whenever it does any network I/O, i.e. after sentry_init is called.

For some reason, this signal seems to go unhandled whenever I have called sentry_init. Does Sentry remove other signal handlers in some way, even if they are added after the call to the init method? Perhaps Sentry also adds a signal handler for SIGPIPE to deal with its own communications?

I have also tried calling sentry_options_set_backend(options, NULL), since I figured the crash handling might be more aggressive about signal handlers, but this made no difference.

How can I solve this, or further debug the issue?

  • During build
  • During run-time
  • When capturing a hard crash

Environment

  • OS: Linux (Open Embedded / YOCTO)
  • Compiler: gcc
  • Sentry native from main / d30e96d

Steps To Reproduce

Log output
No output, only exit code 141 is set.

@daniel-falk
Author

If it is a data transport issue, is there any way I can pause the transport? Currently I get around this by calling sentry_cleanup before using GStreamer, and starting Sentry again when I'm not using GStreamer. This does, however, cause me to lose all my breadcrumbs and crash handling. If I could pause only the data transmission while using GStreamer, then breadcrumbs would still persist and any logged errors or crashes would be sent the next time I enable data transport.

@supervacuus
Collaborator

Hi, @daniel-falk! Can I ask which sentry-native backend you are using in your scenario? This would be useful as a baseline because breakpad doesn't handle SIGPIPE on POSIX at all, while crashpad only installs a terminate-handler instead of a crash-handler.

If you want to test whether the transport is the cause, you can turn off the transport in the build (pass -DSENTRY_TRANSPORT=none to cmake), or, just for testing, you can set the transport to NULL before sentry_init().
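The run-time variant of that test could look roughly like the sketch below. It assumes the sentry-native header is available as sentry.h; the helper name is hypothetical and the DSN is whatever the application already uses.

```c
#include <sentry.h>

/*
 * Hypothetical helper: initializes the SDK with no HTTP transport, so that
 * the transport can be ruled in or out as the source of the SIGPIPE.
 */
int init_sentry_without_transport(const char *dsn)
{
    sentry_options_t *options = sentry_options_new();
    sentry_options_set_dsn(options, dsn);

    /* NULL transport: for testing only, events will not be sent. */
    sentry_options_set_transport(options, NULL);

    /* sentry_init() takes ownership of options and returns 0 on success. */
    return sentry_init(options);
}
```

If the sporadic exits stop with this configuration, the problem is likely somewhere in the transport path rather than in the crash backend.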

It is possible that libcurl is configured in your OE build so that the request path triggers a SIGPIPE to handle requests. This configuration is currently not supported by the Native SDK, and should trigger an error in sentry_init(). In your case it could be that the GStreamer and libcurl SIGPIPE handlers conflict. Still, since it is a very rare configuration: can you send me the output of curl --version from your deployment target?

@daniel-falk
Author

Hi @supervacuus,

Thanks for the quick answer!

I will try with -DSENTRY_TRANSPORT=none or sentry_options_set_transport(options, NULL); and see if that makes it work.

We are also building curl from source. We do not use any specific build options related to that. Do you have any more information you could point me to on how that works? I was not aware this was an option in curl, but after a quick look it seems that curl by default installs signal handlers for SIGPIPE unless you explicitly tell it not to with curl_easy_setopt(curl, CURLOPT_NOSIGNAL, 1L); (per the link, it defaults to 0).

Here's the output from curl --version:

curl 8.0.1-DEV (aarch64-unknown-linux-gnu) libcurl/8.0.1-DEV OpenSSL/1.0.2i
Release-Date: [unreleased]
Protocols: dict file ftp ftps gopher gophers http https imap imaps mqtt pop3 pop3s rtsp smb smbs smtp smtps telnet tftp
Features: alt-svc AsynchDNS HSTS HTTPS-proxy IPv6 Largefile NTLM NTLM_WB SSL threadsafe TLS-SRP UnixSockets

I will see if I can dig up any more about curl's usage of pipes.

@supervacuus
Collaborator

If you use the defaults and have AsynchDNS enabled, no SIGPIPE will come from libcurl directly (the signal setup is compile-time disabled in that case). I would still test if disabling the transport changes anything concerning an unhandled SIGPIPE, so we can isolate the issue further.

Please also let me know whether you are using breakpad or crashpad as a crash backend.

@daniel-falk
Author

Sorry for the slow reply. When digging into it more I found the issue: we were compiling the Sentry client against a newer version of OpenSSL (1.1.1t) and running it on the target with an older version of OpenSSL (1.0.2i). After correcting the build step to use the same versions of curl and OpenSSL as the target device, the problem went away.

For future reference:

> Please also let me know whether you are using breakpad or crashpad as a crash backend.

We have not explicitly set this, so I believe breakpad is the default for Linux?

Setting sentry_options_set_transport(options, NULL); also made the problem stop, since it does not trigger the use of curl.

For anyone looking for information about AsynchDNS, here is some info. The selected option is visible in the config.log file, in our case:

configure:38248: checking whether to enable the threaded resolver
configure:38264: result: yes
configure:38269: checking whether to use POSIX threads for threaded resolver
configure:38285: result: auto
  resolver:         POSIX threaded

@supervacuus
Collaborator

Hi, @daniel-falk! Happy to hear you found a solution to the problem. When you sent me the curl configuration, I was sure the issue could no longer be in libcurl. Curious to see that a mismatch in OpenSSL versions led to a SIGPIPE (so it seems it was an actual SIGPIPE and not an abused event handler, as was used in libcurl before AsynchDNS was introduced many moons ago).

Thanks for the info regarding the sentry backend; although, given you found a solution to the problem, it is no longer relevant. Yes, breakpad is the default on Linux, and breakpad doesn't handle SIGPIPE (which is why you didn't get a crash report).
