Add Lolex for tests involving time #3886

rk-for-zulip · 2020-02-10T02:08:08Z

Separated out from #3727, with some improvements.

rk-for-zulip · 2020-02-11T21:38:50Z

Previous test breakage (only mostly due to late refactor) is now fixed.

gnprice

Thanks @ray-kraesig! This will be an excellent upgrade for our ability to write good tests for code that relies on timers. Comments below.

.eslintrc.yaml

src/__tests__/metatests.js

gnprice · 2020-02-18T01:12:17Z

src/utils/__tests__/async-test.js

+    // If this stalls, Jest will report failure.
+    expect(await expectation).toBeTrue();


Hmm interesting! What does this mean -- how does Jest decide that it's stalled, vs. that it just hasn't completed yet (but later will)?

There's a configurable timeout, defaulting to 5000ms. [1] [2]

Cool, thanks.

Just "stalls" doesn't communicate that to me. How about "If this never completes, Jest will time out and report failure."?

(Expanded somewhat more than even that, including a documentation link.)

src/utils/__tests__/async-test.js

gnprice · 2020-02-18T01:29:34Z

src/utils/__tests__/async-test.js

+      // cannot be guaranteed: arbitrary delays are possible
+      // expect(durationMs).toBeLessThan(10 * expectedMs);


This seems like a regrettable loss in test coverage: it means that if our sleep acquired a bug where, say, we were confused about units and always slept for as many seconds as we'd intended milliseconds, it'd still pass the test. Is there a reason that switching to Lolex means we need to sacrifice our test coverage of that failure mode?

It's true that a sleep in real life can't guarantee a timely wakeup. That's why the test only asks that the interval be within 10x of the target, which I think has been enough to make flakes nonexistent even on the very messy real systems that Travis CI runs our tests in. When we move to a spiffy mock setup where we have total control of the (virtual) environment, what I'd expect to happen is that the behavior only becomes more predictable, not less -- so that we could tighten this to like 1.01 * expectedMs, and it would pass every time. That would be a very appealing kind of thing to be able to do when trying to test tricky timer-using code.

That would be reasonable... so long as about half of this comment is in the code near the 10x test. Especially the bit where a 10x test is specifically intended to guard against a unit-scaling bug, rather than just being an unusually-large fudge factor.

(Done.)

Well, I think the origin of it really is as a large fudge factor. 🙂 It had been 10%, and in 935c1f5 we loosened it to 10x because that had flaked.

The possible unit-scaling bug is just a nice example of how some code like this plausibly could actually have a bug that this test with a large fudge factor would catch.

src/__tests__/metatests.js

.eslintrc.yaml

rk-for-zulip · 2020-02-20T00:26:43Z

The new version has an additional final commit (lolex: Wrap setTimeout and setInterval to error on overflow.) which is, in retrospect, probably a little too intrusive. If there are issues with it, I'd rather drop it than let it block the rest of this; we can relegate improvements to followup work if desired.

gnprice · 2020-02-25T01:25:19Z

src/utils/__tests__/async-test.js

+    // [α] In theory, we can't be sure of this test; the time between the
+    // specified timeout expiring and a timeslice actually becoming available
+    // may be arbitrarily long. (To simulate this with Lolex, one can set


Is this true with Lolex? It seems like one of the great advantages of systematically mocking out the clock as well as timers, provided the mock implementation delivers this, is that we could expect the timers to behave ideally.

Which effectively means that for timer-intensive code like this, we'd have written much more of a true unit test than was ever possible with real clocks.

Is this true with Lolex?

It's not – hence the trailing parenthetical. That could stand to be made more clear, though.

It seems like one of the great advantages of systematically mocking out the clock as well as timers, provided the mock implementation delivers this, is that we could expect the timers to behave ideally.

Alas, "ideally" contrasts with "in reality", making this simultaneously one of the great disadvantages of mocking out the clock. We don't have any simple way to ensure that our code behaves properly under real-world conditions like load-induced greater-than-epsilon timeslice delays.

Which effectively means that for timer-intensive code like this, we'd have written much more of a true unit test than was ever possible with real clocks.

That's true, but it's not an unalloyed good. 😞 An integration test may well be more informative than a unit test.

gnprice · 2020-02-25T01:27:10Z

src/utils/__tests__/async-test.js

+    const FUDGE_FACTOR = 1.01; // see discussion of [α], below
+    lolex.advanceTimersByTime(expectedMs * FUDGE_FACTOR);


The FUDGE_FACTOR here seems like it makes this test loose in a way that it isn't now. I still don't see why any tradeoff like that should be a consequence of adopting a nice spiffy mock setup, where we get to control the clock's behavior in a way we didn't have before.

Quite the opposite: it's intended to replicate the looseness that is present now. (Clarifying the preceding would also clarify this, I expect.)

I suppose we could have a separate test that's "sharp", so to speak. ~~I'm not sure it would be terribly informative, though.~~ (EDIT: I have been convinced otherwise of that last.)

gnprice · 2020-02-25T01:34:27Z

src/utils/__tests__/async-test.js

+    // `sleepPromise` should have set up its timer before returning control.
+    expect(lolex.getTimerCount()).toBe(1);


Is there something this tests about the function's behavior beyond what's tested below? This feels to me like it's only poking into the specifics of the function's implementation. (What if there were two timers involved? Not super likely for this particular function, given the simplicity of its spec... but I don't see how it'd be the caller's business if there were.)

This was intended not as a test of sleep's behavior, but as a metatest of sleepPromise's behavior (resulting from an actual failure in its original implementation).

It's probably not necessary.

gnprice · 2020-02-25T01:38:11Z

src/utils/__tests__/async-test.js

+    const sleepAsync = async () => {
+      const start = Date.now();
+      await sleep(expectedMs);
+      const actualMs = Date.now() - start;
+
+      return actualMs;
+    };
+    const sleepPromise: Promise<number> = sleepAsync();


I like this simplification, thanks!

Can we make this another step more straightforward from here? I think it'd be helpful to basically inline this single-use function, like so:

const start = Date.now(); const sleepPromise: Promise<void> = sleep(expectedMs); // ... await sleepPromise; const actualMs = Date.now() - start;

That's very much less straightforward, not more – it mixes the two control paths, and in subtly fragile ways.

In particular, sleepPromise will hang unless you advance the timer before awaiting on it, and doing so would make the computation of const actualMs [0] meaningless.

gnprice · 2020-02-25T01:46:12Z

src/__tests__/metatests.js

+  test('timers fail on overflow', () => {
+    expect(() => {
+      setTimeout(() => {}, 10e10);
+    }).toThrow();
+
+    expect(() => {
+      setInterval(() => {}, 10e10);
+    }).toThrow();
+  });


Why 10e10 in these tests and not something close to the required threshold, like 3e9 or 2**31?

I like 10e10 more than 3e9. That's about it, really.

It's probably worth adding tests that are just barely on either side of the limit, as well as versions with NaN and Infinity.

gnprice · 2020-02-25T01:47:42Z

src/__tests__/aux/lolex.js

+    // In Node.js, a timeout or interval of greater than INT32_MAX is reduced to
+    // exactly 1. (In browsers, and probably in React, it is instead truncated
+    // to an int32 value.)


"in React" is a confusing expression here -- React also runs in browsers :-) and I think also in Node. I think you probably mean "React Native"?

... yes. 🤦‍♂️

gnprice · 2020-02-25T01:50:19Z

src/utils/__tests__/async-test.js

+    // If this stalls, Jest will report failure.
+    expect(await expectation).toBeTrue();


Cool, thanks.

Just "stalls" doesn't communicate that to me. How about "If this never completes, Jest will time out and report failure."?

gnprice

Thanks @ray-kraesig for the revisions!

The added check in the new commit looks fine to me. Some comments below on that and the others. I think, zooming out a bit, the main concern I have at this point is:

In master, the sleep test is quite simple, just a few straightforward lines, for testing a simple function.
With these changes, it gets a lot more complex -- many times longer, with several additional wrinkles. It also becomes imprecise on a check that the existing version makes precisely (and doesn't materially tighten the other, quite loose, existing check.)
That doesn't seem like a sacrifice I'd expect to have to make in order to use this nice mocking library. Sure, a few lines of setup, and perhaps a line here and there in the test to drive the mocks forward; but greater control ought to mean the rest of the test logic just gets simpler and more precise, not the reverse of those.

At present, it's not possible (or at least not easy) to test any code which relies on `Date.now()` for timing purposes. Timer functions such as `setInterval()` are mockable with Jest's existing mock-timer functionality, but its mock timers do not include a mock time. [0] Fortunately, there is work underway to have Jest v25^H^H26 use the Lolex time-mocking library for its fake timer implementation, which will do exactly that! Unfortunately, we need that capability now, rather than in the indefinite future, so we'll jump the gun a bit. To ease conversion of tests once Lolex is available natively, we use a small shim taken from the code on Jest's current master branch which implements Jest's timer-control functions (not projected to change) in terms of Lolex. Tests for this shim -- which may also serve, in part, as examples for Zulip contributors who would like to use Lolex -- are included. (Additionally, inform Jest that files present in src/__tests__/aux are not supposed to contain executable tests.) [0] S2E1, 1967-09-15. GitHub-PR: #3886

gnprice · 2020-02-25T02:09:26Z

Most of these commits LGTM, including the main one, so I've gone and merged them! Thanks again.

dd53b0d eslint: Extend test-file import permissions to subdirectories.
2faad06 eslint: Disallow imports from tests directories.
6d1d1df jest: Add Lolex for proper time mocking.
6dea133 queueMarkAsRead: Adapt to use Lolex in tests.

That leaves two commits in progress: the sleep test, and the newest commit on top.

rk-for-zulip · 2020-02-25T23:20:46Z

The previous muddled Lolex-based sleep test has been replaced with a sharp Lolex-based sleep test. This only tests sleep's behavior under an ideal clock; rather than trying to write a(n improved) Lolex-based test for less-ideal clocks, the original code from master has been left in place, continuing to use the real clock for this purpose.

An optional final refactoring commit has also been included.

gnprice

Thanks @ray-kraesig for these revisions! I like the new structure of a unit test plus an integration test (the latter being what we have now), and the new shape of the unit test.

Just one item below, which I think will be easy to fix.

gnprice · 2020-02-27T01:33:12Z

src/utils/__tests__/async-test.js

+
+  test('waits for exactly the right number of milliseconds', async () => {
+    const expectedMs = 1000;
+    const sleepPromise: Promise<number> = sleepMeasure(expectedMs);


I think this line appears in a commit where sleepMeasure hasn't yet 🙂

Maybe put the "factor out sleepMeasure" commit earlier, with just one callsite, and then add this new callsite right after that? Or other sequences would be good too.

Restored(?) to what I thought it was originally: sleepMeasure created in this commit, within the unit-test's describe block.

src/__tests__/aux/lolex.js

"Lolex" !== "lolex".

The second `expect` test is theoretically unsound, and the reason for a fudge-factor of 10, specifically (as opposed to other possibly- interesting magic numbers like 2 or 137.036) is not obvious. Add a justification and explanation. Additionally, rename `durationMs` to more clearly contrast with `expectedMs`.

Test that `sleep` has ideal behavior in the presence of an ideal clock. The original test remains in place, to test that `sleep` also has reasonably robust behavior in the presence of a non-ideal clock.

Lolex follows Node's behavior in (silently!) limiting timer durations to a maximum of INT32_MAX (2^31 - 1) milliseconds (or approximately 24.86 days), substituting a timeout of 1ms on overflow. Browsers have subtly different behavior here, as they truncate to 32 bits, rather than substituting. React Native's JavaScript engine probably follows browser behavior rather than Node's, but it hardly matters -- if we issue a timer with a duration over the limit, even in tests, we're almost certainly not going to get the expected behavior. Wrap Lolex's overrides of setTimeout and setInterval to complain if a timer is ever created with an incompatibly-long timeout duration.

Factor `sleepMeasure` out of both the "ideal" and "real" tests for `sleep`.

gnprice · 2020-02-28T02:38:38Z

And merged. Thanks again!

rk-for-zulip requested a review from gnprice February 10, 2020 02:08

rk-for-zulip mentioned this pull request Feb 10, 2020

makeBackoffMachine: Improve backoff sleep logic by making it local. #3841

Merged

rk-for-zulip force-pushed the lolex-shim branch 2 times, most recently from 744a98b to 219bacb Compare February 11, 2020 21:37

gnprice reviewed Feb 18, 2020

View reviewed changes

rk-for-zulip force-pushed the lolex-shim branch 2 times, most recently from 191eaf8 to 6634ecd Compare February 20, 2020 00:19

gnprice reviewed Feb 25, 2020

View reviewed changes

rk-for-zulip force-pushed the lolex-shim branch from 6634ecd to f611e1b Compare February 25, 2020 23:06

gnprice reviewed Feb 27, 2020

View reviewed changes

gnprice mentioned this pull request Feb 27, 2020

Presence heartbeat (v2) #3727

Merged

rk-for-zulip added 5 commits February 27, 2020 16:12

tests: Fix casing typo.

e853a69

"Lolex" !== "lolex".

tests: Add Lolex-based unit test of sleep.

019cc09

Test that `sleep` has ideal behavior in the presence of an ideal clock. The original test remains in place, to test that `sleep` also has reasonably robust behavior in the presence of a non-ideal clock.

tests [nfc]: Unify old and new tests' sleep calls.

16c1687

Factor `sleepMeasure` out of both the "ideal" and "real" tests for `sleep`.

rk-for-zulip force-pushed the lolex-shim branch from f611e1b to 16c1687 Compare February 28, 2020 00:58

gnprice merged commit 16c1687 into zulip:master Feb 28, 2020

rk-for-zulip deleted the lolex-shim branch March 3, 2020 22:50

chrisbobbe mentioned this pull request Mar 31, 2020

Test of sleep has flaked in CI #4010

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add Lolex for tests involving time #3886

Add Lolex for tests involving time #3886

rk-for-zulip commented Feb 10, 2020

rk-for-zulip commented Feb 11, 2020

gnprice left a comment

gnprice Feb 18, 2020

rk-for-zulip Feb 20, 2020

gnprice Feb 25, 2020

rk-for-zulip Feb 25, 2020

gnprice Feb 18, 2020

rk-for-zulip Feb 20, 2020

gnprice Feb 25, 2020

rk-for-zulip commented Feb 20, 2020

gnprice Feb 25, 2020

rk-for-zulip Feb 25, 2020

gnprice Feb 25, 2020

rk-for-zulip Feb 25, 2020 •

edited

Loading

gnprice Feb 25, 2020

rk-for-zulip Feb 25, 2020

gnprice Feb 25, 2020

rk-for-zulip Feb 25, 2020

gnprice Feb 25, 2020

rk-for-zulip Feb 25, 2020

gnprice Feb 25, 2020

rk-for-zulip Feb 25, 2020

gnprice Feb 25, 2020

gnprice left a comment

gnprice commented Feb 25, 2020

rk-for-zulip commented Feb 25, 2020 •

edited

Loading

gnprice left a comment

gnprice Feb 27, 2020

rk-for-zulip Feb 28, 2020

gnprice commented Feb 28, 2020

		// If this stalls, Jest will report failure.
		expect(await expectation).toBeTrue();

		// cannot be guaranteed: arbitrary delays are possible
		// expect(durationMs).toBeLessThan(10 * expectedMs);

		const FUDGE_FACTOR = 1.01; // see discussion of [α], below
		lolex.advanceTimersByTime(expectedMs * FUDGE_FACTOR);

		// `sleepPromise` should have set up its timer before returning control.
		expect(lolex.getTimerCount()).toBe(1);

Add Lolex for tests involving time #3886

Add Lolex for tests involving time #3886

Conversation

rk-for-zulip commented Feb 10, 2020

rk-for-zulip commented Feb 11, 2020

gnprice left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

rk-for-zulip commented Feb 20, 2020

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

rk-for-zulip Feb 25, 2020 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

gnprice left a comment

Choose a reason for hiding this comment

gnprice commented Feb 25, 2020

rk-for-zulip commented Feb 25, 2020 • edited Loading

gnprice left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

gnprice commented Feb 28, 2020

rk-for-zulip Feb 25, 2020 •

edited

Loading

rk-for-zulip commented Feb 25, 2020 •

edited

Loading