
Test of sleep has flaked in CI #4010

Closed · rk-for-zulip opened this issue Mar 28, 2020 · 6 comments · Fixed by #4061

Comments

@rk-for-zulip (Contributor)

In a recent PR with no changes to either async.js or async-test.js, one of the tests for "sleep" flaked, mysteriously sleeping for only 999 milliseconds rather than 1000.

 FAIL  src/utils/__tests__/async-test.js
  ● sleep (real) › waits for approximately the right number of milliseconds
    expect(received).toBeLessThanOrEqual(expected)
    Expected: <= 999
    Received:    1000
      45 |     // `sleep` should sleep for the specified number of milliseconds (and not,
      46 |     // for example, that many seconds or microseconds).
    > 47 |     expect(expectedMs).toBeLessThanOrEqual(actualMs);
         |                                ^
      48 |     expect(actualMs).toBeLessThan(10 * expectedMs); // [α]
      49 | 
      50 |     // [α] In theory, we can't be sure of this test; the time between the
      at _callee2$ (src/utils/__tests__/async-test.js:47:32)
      at tryCatch (node_modules/regenerator-runtime/runtime.js:62:40)
      at Generator.invoke [as _invoke] (node_modules/regenerator-runtime/runtime.js:296:22)
      at Generator.prototype.(anonymous function) [as next] (node_modules/regenerator-runtime/runtime.js:114:21)
      at tryCatch (node_modules/regenerator-runtime/runtime.js:62:40)
      at invoke (node_modules/regenerator-runtime/runtime.js:152:20)
      at node_modules/regenerator-runtime/runtime.js:162:13
      at tryCallOne (node_modules/promise/lib/core.js:37:12)
      at node_modules/promise/lib/core.js:123:15
      at flush (node_modules/asap/raw.js:50:29)

[Screenshot of the CI failure attached: Screenshot_20200328_153252]
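For context, the pieces under discussion presumably look roughly like the following sketch. The `sleep` implementation here is an assumption, and the test body is reconstructed from the excerpt and the names in the failure output rather than copied from the repo:

```js
// Assumed shape of `sleep` in src/utils/async.js.
const sleep = ms => new Promise(resolve => setTimeout(resolve, ms));

describe('sleep (real)', () => {
  test('waits for approximately the right number of milliseconds', async () => {
    const expectedMs = 1000;
    const start = Date.now();
    await sleep(expectedMs);
    const actualMs = Date.now() - start;

    // `sleep` should sleep for the specified number of milliseconds (and not,
    // for example, that many seconds or microseconds).
    expect(expectedMs).toBeLessThanOrEqual(actualMs);
    expect(actualMs).toBeLessThan(10 * expectedMs);
  });
});
```

With `expectedMs` passed as the first argument, Jest labels it "received" and the measured duration "expected", which is why the failure message above reads backwards; that point comes up again later in the thread.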

@agrawal-d (Member)

Can you please share a link to the PR? This is an interesting situation!

@rk-for-zulip (Contributor, Author)

Can you please share a link to the PR? This is an interesting situation!

It is, isn't it! The PR in question was #3958.

@agrawal-d (Member) commented Mar 30, 2020

Okay, so luckily it appears to be a one-off situation. It may be because we use Date.now() to measure the time deltas here, which relies on the system clock.
Maybe this article is relevant: https://www.sitepoint.com/discovering-the-high-resolution-time-api/
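A minimal sketch of the distinction that article draws, assuming a plain Node environment: `Date.now()` reads the wall clock, which the OS can adjust, while `performance.now()` (exposed in Node via `perf_hooks`) is monotonic.

```js
// Wall clock: can be adjusted by NTP or the user, so deltas measured with it
// aren't guaranteed to be well behaved.
const startWall = Date.now();

// Monotonic high-resolution clock: deltas can't be skewed by clock changes.
const { performance } = require('perf_hooks');
const startMono = performance.now();

setTimeout(() => {
  console.log(`wall-clock delta: ${Date.now() - startWall} ms`);
  console.log(`monotonic delta:  ${(performance.now() - startMono).toFixed(1)} ms`);
}, 1000);
```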

@chrisbobbe (Contributor)

Okay, so luckily it appears to be a one-off situation.

This is partly lucky, and partly very unlucky. 🙂 A "flake" like this, where the test sometimes fails when nothing is actually wrong, reduces people's confidence in the tests, and makes it more tempting for people to ignore failures instead of learning from them.

What's definitely lucky is that, as I understand it, Ray and Greg have a lot of experience with some pretty bad test flakes on projects before they were employed at Zulip, so avoiding them in Zulip has been a priority. @ray-kraesig added the use of Lolex, in #3886, for most tests involving time, to avoid exactly this kind of flake.

As this test's "(real)" label suggests, in contrast to the other test labeled "(ideal)", it looks like it doesn't use Lolex; it actually waits roughly 1000 milliseconds when instructed to wait 1000 milliseconds. The reason for keeping a non-Lolex sleep test is best explained in the commit message for 019cc09e3:

The original test remains in place, to test that sleep also has
reasonably robust behavior in the presence of a non-ideal clock.

Apparently, on that failure, the actual sleep time was 999 milliseconds, which is surprising to me; I'd considered the instructed number of milliseconds to be an inclusive lower bound on the actual number of milliseconds. But now that I'm looking, I haven't found documentation to confirm that.
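For comparison, a Lolex-driven test along the lines of the "(ideal)" one might look roughly like this (a sketch, not the actual test in the repo): the fake clock is advanced explicitly, so the test involves no real waiting and no wall-clock jitter.

```js
const lolex = require('lolex');

const sleep = ms => new Promise(resolve => setTimeout(resolve, ms)); // as sketched above

describe('sleep (ideal)', () => {
  let clock;
  beforeEach(() => { clock = lolex.install(); }); // replaces setTimeout with a fake
  afterEach(() => { clock.uninstall(); });

  test('resolves only once the fake clock has advanced far enough', async () => {
    let resolved = false;
    sleep(1000).then(() => { resolved = true; });

    clock.tick(999);
    await Promise.resolve(); // flush pending promise callbacks
    expect(resolved).toBe(false);

    clock.tick(1);
    await Promise.resolve();
    expect(resolved).toBe(true);
  });
});
```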

@Maskedman99 (Contributor) commented Apr 21, 2020

PR #4045 failed Travis CI due to this (my PR failing again while resolving conflicts :( )
https://travis-ci.org/github/zulip/zulip-mobile/builds/677579703
commit 309b7ff

rk-for-zulip added a commit to rk-for-zulip/zulip-mobile that referenced this issue Apr 21, 2020
Sometimes, `setTimeout(1000)` will fire after only 999 milliseconds.
This appears to be due to an upstream bug in Node.js itself [1].

Closes zulip#4010.

[1] nodejs/node#26578
gnprice added a commit to rk-for-zulip/zulip-mobile that referenced this issue Apr 24, 2020
When a test fails, Jest's error messages lean quite hard on this
expectation (namely, that the value passed to `expect()` is the received
value and the matcher's argument is the expected value). Here's a failure
from zulip#4010:

  ● sleep (real) › waits for approximately the right number of milliseconds

    expect(received).toBeLessThanOrEqual(expected)

    Expected: <= 999
    Received:    1000

(Plus the "received" and "1000" are highlighted in red, and the
"expected" and "999" in green.)

The truth is that we expected a value >= 1000 and received 999,
so having it shown backward comes out pretty confusing.
Flip the arguments so Jest shows them the right way around.
@gnprice (Member) commented Apr 24, 2020

Ray found the cause of this -- here's his commit message for the fix (which I just merged):

async: Fudge test value to defeat flake in `sleep` test.

Sometimes, `setTimeout(1000)` will fire after only 999 milliseconds.
This appears to be due to an upstream bug in Node.js itself [1].

Closes #4010.

[1] https://github.com/nodejs/node/issues/26578

As I wrote on the PR: it's reassuring, in a way, that this turns out to be a specific bug in Node with an understood cause that puts an upper bound on its effect. Less reassuring that the maintainers seem pretty blasé about whether to do anything about it. :-/
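Putting the two commit messages together, the fixed assertion presumably looks something like this sketch; the exact fudge value and wording in the real fix may differ.

```js
const sleep = ms => new Promise(resolve => setTimeout(resolve, ms)); // as sketched above

test('waits for approximately the right number of milliseconds', async () => {
  const expectedMs = 1000;
  const fudgeMs = 10; // hypothetical value, chosen only to illustrate the idea

  const start = Date.now();
  await sleep(expectedMs);
  const actualMs = Date.now() - start;

  // Tolerate setTimeout firing slightly early (nodejs/node#26578), and keep the
  // measured value as the first argument so Jest labels "expected"/"received"
  // the right way around.
  expect(actualMs).toBeGreaterThanOrEqual(expectedMs - fudgeMs);
  expect(actualMs).toBeLessThan(10 * expectedMs);
});
```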
