
Increase fudge factor for sleep tests#3411

Merged
A5rocks merged 1 commit into python-trio:main from A5rocks:less-flaky
Mar 21, 2026

Conversation

@A5rocks (Contributor) commented Mar 19, 2026

Fixes #1664

Just encountered this again in https://github.com/python-trio/trio/actions/runs/23277899067/job/67684688370. I've attached the relevant logs below so I can find this PR if it happens again. Note to future me: the fudge factor should not be increased any further! If we do in fact have an issue with our sleeping, it seems likely to me it's something like "we accidentally slept twice."

Instead, next time, we should just increase TARGET (or better yet, maybe there's some configuration we can do to make our GitHub-provided runners less bad...)

  _________________________________ test_sleep __________________________________
  
      @slow
      async def test_sleep() -> None:
          async def sleep_1() -> None:
              await sleep_until(_core.current_time() + TARGET)
      
          await check_takes_about(sleep_1, TARGET)
      
          async def sleep_2() -> None:
              await sleep(TARGET)
      
  >       await check_takes_about(sleep_2, TARGET)
  
  C:\hostedtoolcache\windows\Python\3.14.3\x86-freethreaded\Lib\site-packages\trio\_tests\test_timeouts.py:75: 
  _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
  
  f = <function test_sleep.<locals>.sleep_2 at 0x10AD7B00>, expected_dur = 1.0
  
      async def check_takes_about(f: Callable[[], Awaitable[T]], expected_dur: float) -> T:
          start = time.perf_counter()
          result = await outcome.acapture(f)
          dur = time.perf_counter() - start
          print(dur / expected_dur)
          # 1.5 is an arbitrary fudge factor because there's always some delay
          # between when we become eligible to wake up and when we actually do. We
          # used to sleep for 0.05, and regularly observed overruns of 1.6x on
          # Appveyor, and then started seeing overruns of 2.3x on Travis's macOS, so
          # now we bumped up the sleep to 1 second, marked the tests as slow, and
          # hopefully now the proportional error will be less huge.
          #
        # We also allow durations that are a hair shorter than expected. For
          # example, here's a run on Windows where a 1.0 second sleep was measured
          # to take 0.9999999999999858 seconds:
          #   https://ci.appveyor.com/project/njsmith/trio/build/1.0.768/job/3lbdyxl63q3h9s21
          # I believe that what happened here is that Windows's low clock resolution
          # meant that our calls to time.monotonic() returned exactly the same
          # values as the calls inside the actual run loop, but the two subtractions
          # returned slightly different values because the run loop's clock adds a
          # random floating point offset to both times, which should cancel out, but
          # lol floating point we got slightly different rounding errors. (That
          # value above is exactly 128 ULPs below 1.0, which would make sense if it
          # started as a 1 ULP error at a different dynamic range.)
  >       assert (1 - 1e-8) <= (dur / expected_dur) < 1.5
  E       assert (1.6301656000000548 / 1.0) < 1.5
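The comment in the log above claims the anomalous Windows measurement, 0.9999999999999858 seconds, is exactly 128 ULPs below 1.0. That claim is easy to check directly (a quick standalone sketch, not part of the PR):

```python
import math

# Step down from 1.0 one ULP at a time, 128 times. Near 1.0 (from below)
# an ULP is 2**-53, so 128 steps is exactly 2**-46.
x = 1.0
for _ in range(128):
    x = math.nextafter(x, 0.0)

print(x)                   # 0.9999999999999858
print(x == 1.0 - 2**-46)   # True
```

This matches the value quoted from the Appveyor run, supporting the "1 ULP error at a different dynamic range" explanation.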

@A5rocks changed the title from "Increase fudge factor for sleeping" to "Increase fudge factor for sleep tests" on Mar 19, 2026
codecov bot commented Mar 19, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 100.00000%. Comparing base (370c2e5) to head (124a9b7).
⚠️ Report is 5 commits behind head on main.

Additional details and impacted files
@@               Coverage Diff               @@
##                 main        #3411   +/-   ##
===============================================
  Coverage   100.00000%   100.00000%           
===============================================
  Files             128          128           
  Lines           19424        19424           
  Branches         1318         1318           
===============================================
  Hits            19424        19424           
  Files with missing lines            Coverage Δ
  src/trio/_tests/test_timeouts.py    100.00000% <100.00000%> (ø)

@jakkdl (Member) left a comment

I don't love 2.0 as a factor, but I suppose even if we do sleep twice that still wouldn't hit 2.0 exactly.
I'm almost tempted to have a different fudge factor if it's run in CI vs run on local computer, but not sure it's worth the hassle.
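The CI-vs-local idea above could be as small as an environment check. A hypothetical sketch (not in the PR; the variable name and the 3.0 value are invented, and GitHub Actions is assumed to export CI=true as it documents):

```python
import os

# Hypothetical: use a looser fudge factor on CI runners, which are far
# noisier than a local machine. GitHub Actions sets CI=true by default.
FUDGE_FACTOR = 3.0 if os.environ.get("CI") else 1.5
```

Whether the extra branch is worth the maintenance hassle is exactly the open question in the comment above.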

@A5rocks (Contributor, Author) commented Mar 21, 2026

I agree that it would be nice to have some sort of customization here (e.g. measure how much variability there is, or maybe rerun with a longer sleep if the run is too variable?). I guess that can be a follow-up for if this test fails again.
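One way the rerun-with-a-longer-sleep idea might look is a retrying variant of the helper. This is a hypothetical synchronous sketch (trio's real check_takes_about is async; the retry logic, function name, and defaults here are all invented for illustration):

```python
import time

def check_takes_about_retrying(f, expected_dur, fudge=2.0, retries=2):
    # Hypothetical variant: if a run overshoots the fudge factor, retry
    # with a doubled target, so the runner's fixed wakeup latency becomes
    # a proportionally smaller share of the measurement.
    for attempt in range(retries + 1):
        start = time.perf_counter()
        f(expected_dur)
        dur = time.perf_counter() - start
        ratio = dur / expected_dur
        if (1 - 1e-8) <= ratio < fudge or attempt == retries:
            return ratio
        expected_dur *= 2  # longer sleep -> smaller proportional error

ratio = check_takes_about_retrying(time.sleep, 0.05)
```

The tradeoff is that a flaky runner now makes the test slower instead of red, which may or may not be what CI should do.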

@A5rocks A5rocks merged commit d0c5e6d into python-trio:main Mar 21, 2026
44 checks passed
@A5rocks A5rocks deleted the less-flaky branch March 21, 2026 21:37


Development

Successfully merging this pull request may close these issues.

trio.tests.test_timeouts.py::test_sleep fails intermittently on Windows
