TimerTest.TestApproximateWaitTimes occasionally fails #549

Open
AtnNn opened this Issue Mar 28, 2013 · 14 comments

@AtnNn
Member
AtnNn commented Mar 28, 2013

It failed 3 out of 10000 times:

src/unittest/timer_test.cc:21: Failure
Expected: (abs(diff - wait_array[i][j] * (1000LL * 1000LL))) < (2 * (1000LL * 1000LL)), actual: 2554311 vs 2000000
src/unittest/timer_test.cc:21: Failure
Expected: (abs(diff - wait_array[i][j] * (1000LL * 1000LL))) < (2 * (1000LL * 1000LL)), actual: 2109949 vs 2000000
src/unittest/timer_test.cc:21: Failure
Expected: (abs(diff - wait_array[i][j] * (1000LL * 1000LL))) < (2 * (1000LL * 1000LL)), actual: 5450565 vs 2000000
@coffeemug
Member

Is this in next or 1.4.x? Also, what was the environment? I believe timers are implemented very differently on Linux and OS X.

@AtnNn
Member
AtnNn commented Apr 3, 2013

It was on next (7d9e5bb) on newton (ubuntu oneiric).

@AtnNn
Member
AtnNn commented Jul 18, 2013

This test fails 95% of the time on my OSX desktop.

@anatol
anatol commented Oct 23, 2013

Users report that it fails occasionally on Arch Linux: https://aur.archlinux.org/packages/rethinkdb/

@larkost
Collaborator
larkost commented Sep 24, 2014

This is pretty regularly failing on some of my MacOS runs, so it is still an outstanding issue.

@larkost
Collaborator
larkost commented Nov 6, 2014

Putting the name unit.TimerTest in here so I can find it next time.

@larkost
Collaborator
larkost commented Jan 20, 2015

@gchpaco seems to have fixed this with #3610

@larkost larkost closed this Jan 20, 2015
@AtnNn AtnNn modified the milestone: backlog, 1.16 Jan 23, 2015
@anatol
anatol commented Jan 30, 2015

I see the same error with the 1.16.0 release:

[----------] 1 test from TimerTest
[ RUN      ] TimerTest.TestApproximateWaitTimes
src/unittest/timer_test.cc:26: Failure
Expected: (llabs(diff - wait_array[i][j] * (1000LL * 1000LL))) < (std::max(diff / 4, static_cast<int64_t>(2 * (1000LL * 1000LL)))), actual: 2062616 vs 2000000
@danielmewes
Member

Thanks for reporting, @anatol.
It seems like the test is still too strict. Re-opening.

@danielmewes danielmewes reopened this Jan 30, 2015
@danielmewes danielmewes modified the milestone: 1.16.x, 1.16 Jan 30, 2015
@danielmewes danielmewes modified the milestone: 1.16.x, 2.0.x Apr 14, 2015
@danielmewes danielmewes modified the milestone: 2.0.x, 2.2-polish Sep 1, 2015
@AtnNn AtnNn self-assigned this Nov 13, 2015
@AtnNn
Member
AtnNn commented Nov 13, 2015

The test isn't too strict; it's broken: it compares relative time against absolute time.

@AtnNn
Member
AtnNn commented Nov 13, 2015

It's not broken the way I thought; I was just reading it wrong.

However, it has the weird behaviour of testing that nap(1) sleeps for 1±2ms but for nap(40) it expects approximately 40±10ms.

In @anatol's failure, nap(1) sleeps for 2.06ms and in #3610 (comment), there seems to always be a difference of less than 4ms.
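
For reference, the asymmetry described above follows directly from the assertion quoted in the 1.16.0 failure output. The following is a minimal sketch of that tolerance rule, not the code in timer_test.cc: the names `allowed_error_ns`, `wait_is_acceptable` and `kNanosPerMilli` are made up for illustration, and only the expression inside `std::max` is taken from the failure message.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <cstdlib>

// Hypothetical constant: nanoseconds per millisecond, written the same way
// as the (1000LL * 1000LL) factor in the failure message.
constexpr int64_t kNanosPerMilli = 1000LL * 1000LL;

// Allowed error for a measured wait of diff_ns nanoseconds: 25% of the
// measured time, but never less than 2ms. This mirrors
// std::max(diff / 4, static_cast<int64_t>(2 * (1000LL * 1000LL))).
int64_t allowed_error_ns(int64_t diff_ns) {
    assert(diff_ns >= 0);
    return std::max(diff_ns / 4, static_cast<int64_t>(2 * kNanosPerMilli));
}

// True if the measured wait is close enough to the requested wait.
// expected_ms is the requested wait in milliseconds.
bool wait_is_acceptable(int64_t diff_ns, int64_t expected_ms) {
    int64_t error_ns = std::llabs(diff_ns - expected_ms * kNanosPerMilli);
    return error_ns < allowed_error_ns(diff_ns);
}
```

With this rule a measured wait of about 1ms is allowed 2ms of error, because the 2ms floor dominates, while a measured wait of about 40ms is allowed 10ms, because the 25% term dominates.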

@AtnNn
Member
AtnNn commented Nov 13, 2015

I've removed the relative error tolerance of 25%, increased the absolute error tolerance from 2ms to 5ms, and added a maximum of 2ms for the average error. The changes are in branch atnnn/timertest.
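
A rough sketch of what a check along those lines could look like, for readers without the branch at hand. This is not the code from atnnn/timertest: the function name, the constants, and the choice to average absolute errors (rather than signed ones) are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Hypothetical constants based on the numbers in the comment above.
constexpr int64_t kNanosPerMilli = 1000LL * 1000LL;
constexpr int64_t kMaxErrorNs = 5 * kNanosPerMilli;         // per-sample limit
constexpr int64_t kMaxAverageErrorNs = 2 * kNanosPerMilli;  // limit on the mean

// errors_ns holds (measured wait - requested wait) in nanoseconds, one entry
// per timed wait. Returns true if every sample is within 5ms of its target
// and the mean absolute error is below 2ms.
bool waits_within_tolerance(const std::vector<int64_t> &errors_ns) {
    assert(!errors_ns.empty());
    int64_t total_ns = 0;
    for (int64_t error_ns : errors_ns) {
        int64_t abs_error_ns = std::llabs(error_ns);
        if (abs_error_ns >= kMaxErrorNs) {
            return false;  // a single wait was off by 5ms or more
        }
        total_ns += abs_error_ns;
    }
    // Integer division is precise enough at nanosecond resolution.
    int64_t mean_ns = total_ns / static_cast<int64_t>(errors_ns.size());
    return mean_ns < kMaxAverageErrorNs;
}
```

Under these assumptions a single outlier such as the 2062616ns error reported above no longer fails on its own as long as the other waits are accurate enough to keep the mean under 2ms, while an error like 5450565ns still fails outright.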

@anatol
anatol commented Nov 13, 2015

I am using a VM to build Arch packages and run their tests. This might explain why the CPU scheduler jitter is so large.

@danielmewes danielmewes modified the milestone: 2.2.x, 2.2-polish, 2.3, 2.3-polish Nov 13, 2015
@danielmewes danielmewes modified the milestone: 2.3-polish, subsequent Apr 7, 2016