fix(native): cache parent process handle to survive OOM crashes on Windows #1603

Merged: jpnurmi merged 6 commits into master from jpnurmi/fix/native-oom on Mar 31, 2026
jpnurmi (Collaborator) commented Mar 27, 2026

On Windows, the crash daemon called OpenProcess() on each polling iteration to check whether the parent was still alive. During OOM, OpenProcess() fails because kernel pool memory is exhausted, so the daemon incorrectly concluded that the parent had exited and shut down before it could capture the crash.

Fix: open the parent process handle once at daemon startup and reuse it for liveness checks via WaitForSingleObject().
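The fix described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the names `parent_watch_open`, `parent_is_alive`, `parent_watch_close`, and `g_parent_handle` are hypothetical, and treating only `WAIT_OBJECT_0` as "parent exited" is an assumption of the sketch. The Win32 calls (`OpenProcess`, `WaitForSingleObject`, `CloseHandle`) are the standard ones.

```c
#include <stdbool.h>

#ifdef _WIN32
#include <windows.h>

/* Hypothetical module state: the parent handle, opened exactly once. */
static HANDLE g_parent_handle = NULL;

/* Call once at daemon startup, while the system is still healthy. */
bool parent_watch_open(DWORD parent_pid)
{
    /* SYNCHRONIZE is the access right WaitForSingleObject() requires. */
    g_parent_handle = OpenProcess(SYNCHRONIZE, FALSE, parent_pid);
    return g_parent_handle != NULL;
}

/* Liveness check for the polling loop; it opens no new handle. */
bool parent_is_alive(void)
{
    if (!g_parent_handle) {
        return false;
    }
    /* A zero timeout returns immediately. A process handle becomes
       signaled (WAIT_OBJECT_0) only after the process has terminated.
       Assumption of this sketch: any other result counts as alive. */
    return WaitForSingleObject(g_parent_handle, 0) != WAIT_OBJECT_0;
}

/* Release the cached handle when the daemon shuts down. */
void parent_watch_close(void)
{
    if (g_parent_handle) {
        CloseHandle(g_parent_handle);
        g_parent_handle = NULL;
    }
}
#endif /* _WIN32 */
```

Because the handle already exists, the per-iteration check no longer depends on OpenProcess() succeeding while the system is out of memory.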

Tested with sentry-unreal.

Fixes: #1590

jpnurmi changed the title from "Jpnurmi/fix/native oom" to "[WIP] native: OOM" on Mar 27, 2026
jpnurmi force-pushed the jpnurmi/fix/native-oom branch from 3d8a973 to 895add7 on March 27, 2026 18:29
jpnurmi changed the title from "[WIP] native: OOM" to "native: fix OOM on Windows" on Mar 27, 2026
jpnurmi force-pushed the jpnurmi/fix/native-oom branch 5 times, most recently from 8b84370 to 56da312 on March 30, 2026 13:00
jpnurmi added a commit to getsentry/sentry-unreal that referenced this pull request Mar 30, 2026
jpnurmi and others added 3 commits March 30, 2026 22:12
Add `oom` argument to the example app that triggers out-of-memory by
allocating in a loop until the OS kills the process. Add integration
tests for both native and crashpad backends, skipped on Linux (OOM
killer sends uncatchable SIGKILL), ASAN, and Valgrind.

Refs: #1590

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Raise an exception (Windows) or write to an invalid address (Unix)
when malloc fails, matching how UE terminates on OOM.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

During OOM, OpenProcess fails because kernel pool memory is exhausted,
causing the daemon to incorrectly conclude the parent is dead. Open the
handle once at startup and reuse it for liveness checks.

Fixes: #1590

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
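One of the commit messages above describes crashing when malloc fails: raising an exception on Windows and writing to an invalid address on Unix. A minimal sketch of that idea is below. The helper names `crash_on_oom` and `alloc_or_crash` are hypothetical, and the exception code `STATUS_NO_MEMORY` is an assumption, since the commit message does not say which code the example app raises.

```c
#include <stdlib.h>

#ifdef _WIN32
#include <windows.h>
#endif

/* Hypothetical helper: crash deliberately so the crash backend
   sees an out-of-memory failure as a regular crash. */
static void crash_on_oom(void)
{
#ifdef _WIN32
    /* Assumption: STATUS_NO_MEMORY as the exception code. */
    RaiseException(STATUS_NO_MEMORY, EXCEPTION_NONCONTINUABLE, 0, NULL);
#else
    /* Writing to an invalid address triggers a fatal signal. */
    volatile char *invalid = (volatile char *)1;
    *invalid = 0;
#endif
    abort(); /* fallback so this function never returns */
}

/* Hypothetical wrapper: return memory or crash, never NULL. */
void *alloc_or_crash(size_t size)
{
    void *ptr = malloc(size);
    if (!ptr) {
        crash_on_oom();
    }
    return ptr;
}
```

An `oom` mode in the example app could call such a wrapper in a loop until allocation fails. As the first commit message notes, on Linux the OOM killer sends an uncatchable SIGKILL, which is why the integration tests are skipped there.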
jpnurmi force-pushed the jpnurmi/fix/native-oom branch from 22ab77e to 5e12871 on March 30, 2026 20:15
jpnurmi force-pushed the jpnurmi/fix/native-oom branch from 5e12871 to b6d7dc4 on March 30, 2026 20:23
jpnurmi changed the title from "native: fix OOM on Windows" to "fix(native): cache parent process handle to survive OOM crashes on Windows" on Mar 31, 2026
jpnurmi marked this pull request as ready for review on March 31, 2026 06:44
jpnurmi requested review from mujacica and tustanivsky on March 31, 2026 06:47
jpnurmi and others added 2 commits March 31, 2026 08:51
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
jpnurmi merged commit a848058 into master on Mar 31, 2026
52 checks passed
jpnurmi deleted the jpnurmi/fix/native-oom branch on March 31, 2026 08:28
Linked issue: Native backend doesn't capture out-of-memory errors

3 participants