test: stabilize flaky tests by jpnurmi · Pull Request #1614 · getsentry/sentry-native

jpnurmi · 2026-03-31T10:18:32Z

Fixes two distinct issues with assert_timestamp in the test suite:

1. Hard-coded 10s threshold too tight for busy CI

assert_timestamp checked that timestamps were less than 10 seconds old, but tests can exceed that in busy CI environments (e.g. test_breakpad_oom_stdout took 10.6s on the Windows ClangCL CI runner):

FAILED tests/test_integration_stdout.py::test_breakpad_oom_stdout - assert 10.609089 < 10

Replace the arbitrary threshold with a before/after approach: an autouse fixture records the test start time, and the assertion verifies the timestamp falls between test start and now. This eliminates timing sensitivity entirely — it doesn't matter how long the test takes, while still verifying that the timestamp is sensible.

2. Host-device clock skew on Android

The test binary runs on the Android device but pytest runs on the host, so their clocks can differ. Even though CI syncs the emulator clock before test execution, the clock can drift during the run (e.g. NTP correction). Measure the offset per test via adb shell date +%s and adjust timestamps accordingly.

Other changes

Extract a shared adb() helper into tests/__init__.py to replace duplicated helpers across test files.
Revert flaky inproc & breakpad OOM stdout tests added in a848058 (keep native & crashpad OOM tests).

`assert_timestamp` checked that timestamps were less than 10 seconds old, but tests can exceed that in busy CI environments (e.g. `test_breakpad_oom_stdout` took 10.6s on Windows CI). Replace the hard-coded threshold with a before/after approach: an autouse fixture records the test start time, and the assertion verifies the timestamp falls between test start and now. This eliminates timing sensitivity entirely — it doesn't matter how long the test takes, while still verifying that the timestamp is sensible. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

The test binary runs on the Android device but pytest runs on the host, so their clocks can differ. Measure the offset once per session via `adb shell date +%s` and expose `tests.now()` that adjusts for it. Also extract a shared `adb()` helper into `tests/__init__.py` to replace duplicated helpers in test_dotnet_signals and test_inproc_stress. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

The "in the future" check compared device timestamps against raw host time, while the "in the past" check correctly used the offset-adjusted test_start. Apply the clock offset to both bounds. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

The emulator clock can drift after initial sync (e.g. NTP correction), so measuring the offset once at session start is not reliable enough. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

When ANDROID_API is set but no device is connected, `adb shell date` returns empty output. Fall back to zero offset instead of crashing. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

is_android from conditions.py is the raw ANDROID_API env string, not a bool. When the skipif expression evaluates to a string, pytest tries to eval it as Python code causing a SyntaxError. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

The OOM tests added in a848058 cause flaky failures (empty output, proxy test interference). Keep the native & crashpad OOM tests. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

supervacuus · 2026-03-31T14:22:02Z

Revert flaky inproc & breakpad OOM stdout tests added in a848058 (keep native & crashpad OOM tests).

Just saw this:

_______________________________ test_native_oom _______________________________

cmake = <bound method CMake.compile of <tests.cmake.CMake object at 0x000001DFE7C2CA70>>
httpserver = <HTTPServer host=localhost port=49541>

    @pytest.mark.skipif(not has_oom, reason="OOM test unreliable in this environment")
    def test_native_oom(cmake, httpserver):
        """Test OOM crash capture with native backend"""
        tmp_path = cmake(["sentry_example"], {"SENTRY_BACKEND": "native"})
    
        httpserver.expect_request("/api/123456/envelope/").respond_with_data("OK")
    
        run_crash(
            tmp_path,
            "sentry_example",
            ["log", "stdout", "oom"],
            env=dict(os.environ, SENTRY_DSN=make_dsn(httpserver)),
        )
    
        time.sleep(2)
    
        run(
            tmp_path,
            "sentry_example",
            ["log", "no-setup"],
            env=dict(os.environ, SENTRY_DSN=make_dsn(httpserver)),
        )
    
>       assert len(httpserver.log) >= 1
E       assert 0 >= 1
E        +  where 0 = len([])
E        +    where [] = <HTTPServer host=localhost port=49541>.log

tests\test_integration_native.py:120: AssertionError

here: https://github.com/getsentry/sentry-native/actions/runs/23800241648/job/69358394729#step:30:36751
and here: https://github.com/getsentry/sentry-native/actions/runs/23800368814/job/69358858226#step:30:42018

Maybe the OOM tests are generally flaky.

jpnurmi · 2026-03-31T14:25:20Z

getsentry/sentry-native/actions/runs/23800241648/job/69358394729#step:30:36751

Maybe the OOM tests are generally flaky.

D*mn, even with native. Sorry about that! We've been running similar tests with Crashpad in sentry-unreal without any issues. 🙁

Edit: or maybe the native daemon just needs more tweaks to work reliably under memory pressure... ==> native skipped for now

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

supervacuus · 2026-03-31T14:46:33Z

D*mn, even with native. Sorry about that! We've been running similar tests with Crashpad in sentry-unreal without any issues. 🙁

no need to be sorry, i am actually happy when test fail sometimes... not all tests are regression tests, but our red/green fetish completely eliminated any nuance.

Edit: or maybe the native daemon just needs more tweaks to work reliably under memory pressure... ==> native skipped for now

i am not sure this is backend specific, at least not in the handling sense (but could imagine in the timing sense).

supervacuus · 2026-03-31T14:46:57Z

this is also a new one: https://github.com/getsentry/sentry-native/actions/runs/23802849311/job/69367984681?pr=1614#step:30:410

choco install returns 0 even when the install fails. Add a sccache --version check to catch installation failures early. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Chocolatey can fail due to transient network issues. Add continue-on-error to the CI step, and guard sccache calls in conftest.py with shutil.which so tests still run without it. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

This reverts commit 76173ec.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Autofix Details

Bugbot Autofix prepared a fix for the issue found in the latest run.

✅ Fixed: sccache continue-on-error won't degrade gracefully without cmake guard
- cmake_configure now only sets sccache compiler launcher options when both USE_SCCACHE is set and the sccache binary is available in PATH.

Or push these changes by commenting:

@cursor push fef1bdf1ab

Preview (fef1bdf1ab)

diff --git a/tests/cmake.py b/tests/cmake.py
--- a/tests/cmake.py
+++ b/tests/cmake.py
@@ -130,7 +130,8 @@
     __tracebackhide__ = True
 
     options = dict(options)
-    if os.environ.get("USE_SCCACHE"):
+    has_sccache = os.environ.get("USE_SCCACHE") and shutil.which("sccache")
+    if has_sccache:
         options.update(
             {
                 "CMAKE_C_COMPILER_LAUNCHER": "sccache",

_{This Bugbot Autofix run was free. To enable autofix for future PRs, go to the Cursor dashboard.}

The native backend's daemon sends the crash envelope directly via HTTP during crash processing, so there's no need to restart the app. Wait for the daemon's request instead of restarting and hoping for a resend. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Move has_sccache to conditions.py and use it in cmake.py to skip compiler launcher options when sccache is missing. The Ninja generator is still used when USE_SCCACHE is set regardless. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Add a restart run inside the wait block as a safety net to send any remaining crash data the daemon may not have sent. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

^{Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.}

jpnurmi · 2026-03-31T15:47:41Z

Great, now the whole infra broke. What a day... 😅

Cloning into 'kcov'...
remote: Internal Server Error
fatal: unable to access 'https://github.com/SimonKagstrom/kcov.git/': The requested URL returned error: 500

fetch https://dl-cdn.alpinelinux.org/alpine/v3.21/main/x86_64/APKINDEX.tar.gz
WARNING: updating https://dl-cdn.alpinelinux.org/alpine/v3.21/main: temporary error (try again later)

The reinit stress tests spin 8 threads in tight loops, which causes extreme slowdown under valgrind. Adding sched_yield() lets the OS scheduler (and valgrind's thread serializer) make progress without throttling the test's throughput. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

jpnurmi and others added 2 commits March 31, 2026 12:15

jpnurmi force-pushed the jpnurmi/test/flaky-timestamps branch 2 times, most recently from fe944fa to 891cabb Compare March 31, 2026 11:19

cursor Bot reviewed Mar 31, 2026

View reviewed changes

Comment thread tests/assertions.py Outdated

jpnurmi requested review from JoshuaMoelans and mujacica March 31, 2026 11:35

mujacica approved these changes Mar 31, 2026

View reviewed changes

jpnurmi and others added 3 commits March 31, 2026 13:46

test: measure device clock offset per test instead of per session

bc6d62a

The emulator clock can drift after initial sync (e.g. NTP correction), so measuring the offset once at session start is not reliable enough. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

test: handle missing Android device in clock offset measurement

615dc67

When ANDROID_API is set but no device is connected, `adb shell date` returns empty output. Fall back to zero offset instead of crashing. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

sentry Bot reviewed Mar 31, 2026

View reviewed changes

Comment thread tests/conftest.py

jpnurmi and others added 2 commits March 31, 2026 15:04

test: catch KeyError when ANDROID_HOME is unset

7631797

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

test: revert inproc & breakpad OOM stdout tests

3f7fdf7

The OOM tests added in a848058 cause flaky failures (empty output, proxy test interference). Keep the native & crashpad OOM tests. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

jpnurmi changed the title ~~test: stabilize flaky timestamp assertions~~ test: stabilize flaky tests Mar 31, 2026

test: skip native OOM test until crash handling is fixed

76173ec

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

jpnurmi requested a review from supervacuus March 31, 2026 14:34

jpnurmi and others added 4 commits March 31, 2026 16:48

test: verify sccache installation in CI

72c9fd5

choco install returns 0 even when the install fails. Add a sccache --version check to catch installation failures early. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Revert "test: skip native OOM test until crash handling is fixed"

98d6874

This reverts commit 76173ec.

test: stabilize native OOM test

3b3a1b9

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

supervacuus approved these changes Mar 31, 2026

View reviewed changes

cursor Bot reviewed Mar 31, 2026

View reviewed changes

Comment thread .github/workflows/ci.yml

jpnurmi and others added 2 commits March 31, 2026 17:17

cursor Bot reviewed Mar 31, 2026

View reviewed changes

Comment thread tests/test_integration_native.py

test: restore restart fallback in test_native_oom

9efaa45

Add a restart run inside the wait block as a safety net to send any remaining crash data the daemon may not have sent. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

cursor Bot reviewed Mar 31, 2026

View reviewed changes

Comment thread tests/cmake.py

jpnurmi merged commit bcd4633 into master Mar 31, 2026
53 checks passed

jpnurmi deleted the jpnurmi/test/flaky-timestamps branch March 31, 2026 17:21

BernhardMarconato pushed a commit to elgatosf/sentry-native that referenced this pull request Apr 21, 2026

test: stabilize flaky tests (getsentry#1614)

5e83545

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

test: stabilize flaky tests#1614

test: stabilize flaky tests#1614
jpnurmi merged 17 commits into
masterfrom
jpnurmi/test/flaky-timestamps

jpnurmi commented Mar 31, 2026 •

edited

Loading

Uh oh!

Uh oh!

Uh oh!

supervacuus commented Mar 31, 2026 •

edited

Loading

Uh oh!

jpnurmi commented Mar 31, 2026 •

edited

Loading

Uh oh!

supervacuus commented Mar 31, 2026

Uh oh!

supervacuus commented Mar 31, 2026

Uh oh!

cursor Bot left a comment •

edited

Loading

Uh oh!

Uh oh!

Uh oh!

cursor Bot left a comment

Uh oh!

Uh oh!

jpnurmi commented Mar 31, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Uh oh!

Conversation

jpnurmi commented Mar 31, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

1. Hard-coded 10s threshold too tight for busy CI

2. Host-device clock skew on Android

Other changes

Uh oh!

Uh oh!

Uh oh!

supervacuus commented Mar 31, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jpnurmi commented Mar 31, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

supervacuus commented Mar 31, 2026

Uh oh!

supervacuus commented Mar 31, 2026

Uh oh!

cursor Bot left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

cursor Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

jpnurmi commented Mar 31, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

jpnurmi commented Mar 31, 2026 •

edited

Loading

supervacuus commented Mar 31, 2026 •

edited

Loading

jpnurmi commented Mar 31, 2026 •

edited

Loading

cursor Bot left a comment •

edited

Loading