Skip to content

fix(native): Respect shutdown timeout in daemon#1691

Merged
jpnurmi merged 2 commits intomasterfrom
jpnurmi/feat/native-timeout
Apr 30, 2026
Merged

fix(native): Respect shutdown timeout in daemon#1691
jpnurmi merged 2 commits intomasterfrom
jpnurmi/feat/native-timeout

Conversation

@jpnurmi
Copy link
Copy Markdown
Collaborator

@jpnurmi jpnurmi commented Apr 30, 2026

Pass the configured SDK shutdown timeout through the native crash context so the daemon uses the same limit as the SDK during transport shutdown.

This is likely the first thing users will try to adjust when it comes to large attachment uploads on shutdown. While a full-screen game may want a relatively high timeout to allow as many crash reports as possible to get uploaded right away, in a desktop environment, it may feel less nice if a GUI app window is stuck in a crashed state for a long time...

As for tests, add a shutdown-timeout argument for those tests that most likely need a longer 10s timeout under sanitizers, while keeping most native tests on the production default (2s).

jpnurmi and others added 2 commits April 30, 2026 17:41
Pass the configured SDK shutdown timeout through the native crash context so
the daemon uses the same limit as the SDK during transport shutdown.

Add a sentry_example argument for tests that need a longer timeout under
sanitizers while keeping most native tests on the production default.

Co-Authored-By: OpenAI Codex <noreply@openai.com>
@jpnurmi jpnurmi force-pushed the jpnurmi/feat/native-timeout branch from 0c8a59c to 619a5df Compare April 30, 2026 15:43
Comment on lines +3431 to 3432
// envelopes (crash envelope + logs envelope, etc.).
int rv = sentry__transport_shutdown(
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: The default shutdown timeout for the crash daemon was reduced from 10s to 2s, which may cause crash reports to fail sending on slow networks.
Severity: HIGH

Suggested Fix

Revert the default shutdown timeout for the crash daemon to 10,000ms to ensure reliable delivery of crash envelopes in various network conditions. Alternatively, if the 2,000ms default is desired, this breaking change should be clearly documented for users, advising them to manually set sentry_options_set_shutdown_timeout if they require a longer timeout for reliability.

Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent. Verify if this is a real issue. If it is, propose a fix; if not, explain why it's
not valid.

Location: src/backends/native/sentry_crash_daemon.c#L3431-L3432

Potential issue: The crash daemon's transport shutdown timeout has been reduced from a
hardcoded 10,000ms to the SDK's default of 2,000ms. This change is silent for users who
have not explicitly configured `sentry_options_set_shutdown_timeout`. The shorter
2-second window significantly increases the likelihood that crash envelopes will fail to
be delivered, particularly in production environments with slow or high-latency
networks. The previous 10-second timeout was intended for reliability beyond just
accommodating slower sanitizer builds, which are handled separately in test
infrastructure.

Did we get this right? 👍 / 👎 to inform future reviews.

Copy link
Copy Markdown
Collaborator Author

@jpnurmi jpnurmi Apr 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it's a difficult balance. If I remember correctly, it was originally the same 2 seconds but was bumped up to 10 seconds specifically for TSan and ASan tests, so now we're only using it for a small subset of tests when running with sanitizers.

While we do generally want most envelopes to go through at crash time, it can feel a bit awkward if an app window is stuck on the screen for 10 seconds, which can easily happen with large attachments.

Copy link
Copy Markdown
Collaborator Author

@jpnurmi jpnurmi Apr 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's the commit where it was bumped up from 2 to 10 seconds: e7ef468. Unfortunately, logs for the preceding CI runs have expired, so we cannot see which exact tests failed with ASan or TSan back then.

@jpnurmi jpnurmi requested review from mujacica and tustanivsky April 30, 2026 16:13
@jpnurmi jpnurmi merged commit a5dccfd into master Apr 30, 2026
64 checks passed
@jpnurmi jpnurmi deleted the jpnurmi/feat/native-timeout branch April 30, 2026 17:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants