
Increase fudge factor for sleep tests#3411

Merged
A5rocks merged 1 commit into python-trio:main from A5rocks:less-flaky
Mar 21, 2026

Conversation

@A5rocks (Contributor) commented Mar 19, 2026

Fixes #1664

Just encountered this again in https://github.com/python-trio/trio/actions/runs/23277899067/job/67684688370. I've attached the relevant logs below so I can find this PR if it happens again. Note to future me: the fudge factor should not be increased any further! If we do in fact have an issue with our sleeping, it seems likely to me it's something like "we accidentally slept twice."

Instead, next time, we should just increase TARGET (or better yet, maybe there's some configuration we can do to make our GitHub-provided runners less bad...)

  _________________________________ test_sleep __________________________________
  
      @slow
      async def test_sleep() -> None:
          async def sleep_1() -> None:
              await sleep_until(_core.current_time() + TARGET)
      
          await check_takes_about(sleep_1, TARGET)
      
          async def sleep_2() -> None:
              await sleep(TARGET)
      
  >       await check_takes_about(sleep_2, TARGET)
  
  C:\hostedtoolcache\windows\Python\3.14.3\x86-freethreaded\Lib\site-packages\trio\_tests\test_timeouts.py:75: 
  _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
  
  f = <function test_sleep.<locals>.sleep_2 at 0x10AD7B00>, expected_dur = 1.0
  
      async def check_takes_about(f: Callable[[], Awaitable[T]], expected_dur: float) -> T:
          start = time.perf_counter()
          result = await outcome.acapture(f)
          dur = time.perf_counter() - start
          print(dur / expected_dur)
          # 1.5 is an arbitrary fudge factor because there's always some delay
          # between when we become eligible to wake up and when we actually do. We
          # used to sleep for 0.05, and regularly observed overruns of 1.6x on
          # Appveyor, and then started seeing overruns of 2.3x on Travis's macOS, so
          # now we bumped up the sleep to 1 second, marked the tests as slow, and
          # hopefully now the proportional error will be less huge.
          #
        # We also allow durations that are a hair shorter than expected. For
          # example, here's a run on Windows where a 1.0 second sleep was measured
          # to take 0.9999999999999858 seconds:
          #   https://ci.appveyor.com/project/njsmith/trio/build/1.0.768/job/3lbdyxl63q3h9s21
          # I believe that what happened here is that Windows's low clock resolution
          # meant that our calls to time.monotonic() returned exactly the same
          # values as the calls inside the actual run loop, but the two subtractions
          # returned slightly different values because the run loop's clock adds a
          # random floating point offset to both times, which should cancel out, but
          # lol floating point we got slightly different rounding errors. (That
          # value above is exactly 128 ULPs below 1.0, which would make sense if it
          # started as a 1 ULP error at a different dynamic range.)
  >       assert (1 - 1e-8) <= (dur / expected_dur) < 1.5
  E       assert (1.6301656000000548 / 1.0) < 1.5
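The comment in the log above claims the anomalous Windows measurement, 0.9999999999999858 seconds, is exactly 128 ULPs below 1.0. That claim is easy to check directly (a quick standalone sketch, not part of the PR):

```python
import math

# Step down from 1.0 one ULP at a time, 128 times. Near 1.0 (from below)
# an ULP is 2**-53, so 128 steps is exactly 2**-46.
x = 1.0
for _ in range(128):
    x = math.nextafter(x, 0.0)

print(x)                   # 0.9999999999999858
print(x == 1.0 - 2**-46)   # True
```

This matches the value quoted from the Appveyor run, supporting the "1 ULP error at a different dynamic range" explanation.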

@A5rocks changed the title from "Increase fudge factor for sleeping" to "Increase fudge factor for sleep tests" on Mar 19, 2026
codecov bot commented Mar 19, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 100.00000%. Comparing base (370c2e5) to head (124a9b7).
⚠️ Report is 5 commits behind head on main.

Additional details and impacted files
@@               Coverage Diff               @@
##                 main        #3411   +/-   ##
===============================================
  Coverage   100.00000%   100.00000%           
===============================================
  Files             128          128           
  Lines           19424        19424           
  Branches         1318         1318           
===============================================
  Hits            19424        19424           
  Files with missing lines            Coverage Δ
  src/trio/_tests/test_timeouts.py    100.00000% <100.00000%> (ø)

@jakkdl (Member) left a comment

I don't love 2.0 as a factor, but I suppose even if we do sleep twice that still wouldn't hit 2.0 exactly.
I'm almost tempted to have a different fudge factor if it's run in CI vs run on local computer, but not sure it's worth the hassle.
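The CI-vs-local idea above could be as small as an environment check. A hypothetical sketch (not in the PR; the variable name and the 3.0 value are invented, and GitHub Actions is assumed to export CI=true as it documents):

```python
import os

# Hypothetical: use a looser fudge factor on CI runners, which are far
# noisier than a local machine. GitHub Actions sets CI=true by default.
FUDGE_FACTOR = 3.0 if os.environ.get("CI") else 1.5
```

Whether the extra branch is worth the maintenance hassle is exactly the open question in the comment above.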

@A5rocks (Contributor, Author) commented Mar 21, 2026

I agree that it would be nice to have some sort of customization here (e.g. measure how much variability there is, or maybe rerun with a longer sleep if the run is too variable?). I guess that can be a follow-up for if this test fails again.
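One way the rerun-with-a-longer-sleep idea might look is a retrying variant of the helper. This is a hypothetical synchronous sketch (trio's real check_takes_about is async; the retry logic, function name, and defaults here are all invented for illustration):

```python
import time

def check_takes_about_retrying(f, expected_dur, fudge=2.0, retries=2):
    # Hypothetical variant: if a run overshoots the fudge factor, retry
    # with a doubled target, so the runner's fixed wakeup latency becomes
    # a proportionally smaller share of the measurement.
    for attempt in range(retries + 1):
        start = time.perf_counter()
        f(expected_dur)
        dur = time.perf_counter() - start
        ratio = dur / expected_dur
        if (1 - 1e-8) <= ratio < fudge or attempt == retries:
            return ratio
        expected_dur *= 2  # longer sleep -> smaller proportional error

ratio = check_takes_about_retrying(time.sleep, 0.05)
```

The tradeoff is that a flaky runner now makes the test slower instead of red, which may or may not be what CI should do.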

@A5rocks A5rocks merged commit d0c5e6d into python-trio:main Mar 21, 2026
44 checks passed
@A5rocks A5rocks deleted the less-flaky branch March 21, 2026 21:37


Development

Successfully merging this pull request may close these issues.

trio.tests.test_timeouts.py::test_sleep fails intermittently on Windows
