Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Lolex for tests involving time #3886

Merged
merged 5 commits into from
Feb 28, 2020
Merged

Conversation

rk-for-zulip
Copy link
Contributor

Separated out from #3727, with some improvements.

@rk-for-zulip
Copy link
Contributor Author

Previous test breakage (only mostly due to late refactor) is now fixed.

Copy link
Member

@gnprice gnprice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @ray-kraesig! This will be an excellent upgrade for our ability to write good tests for code that relies on timers. Comments below.

.eslintrc.yaml Outdated Show resolved Hide resolved
.eslintrc.yaml Outdated Show resolved Hide resolved
.eslintrc.yaml Outdated Show resolved Hide resolved
.eslintrc.yaml Outdated Show resolved Hide resolved
src/__tests__/metatests.js Outdated Show resolved Hide resolved
Comment on lines 43 to 44
// If this stalls, Jest will report failure.
expect(await expectation).toBeTrue();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm interesting! What does this mean -- how does Jest decide that it's stalled, vs. that it just hasn't completed yet (but later will)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a configurable timeout, defaulting to 5000ms. [1] [2]

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, thanks.

Just "stalls" doesn't communicate that to me. How about "If this never completes, Jest will time out and report failure."?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Expanded somewhat more than even that, including a documentation link.)

src/utils/__tests__/async-test.js Outdated Show resolved Hide resolved
Comment on lines 30 to 31
// cannot be guaranteed: arbitrary delays are possible
// expect(durationMs).toBeLessThan(10 * expectedMs);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like a regrettable loss in test coverage: it means that if our sleep acquired a bug where, say, we were confused about units and always slept for as many seconds as we'd intended milliseconds, it'd still pass the test. Is there a reason that switching to Lolex means we need to sacrifice our test coverage of that failure mode?

It's true that a sleep in real life can't guarantee a timely wakeup. That's why the test only asks that the interval be within 10x of the target, which I think has been enough to make flakes nonexistent even on the very messy real systems that Travis CI runs our tests in. When we move to a spiffy mock setup where we have total control of the (virtual) environment, what I'd expect to happen is that the behavior only becomes more predictable, not less -- so that we could tighten this to like 1.01 * expectedMs, and it would pass every time. That would be a very appealing kind of thing to be able to do when trying to test tricky timer-using code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would be reasonable... so long as about half of this comment is in the code near the 10x test. Especially the bit where a 10x test is specifically intended to guard against a unit-scaling bug, rather than just being an unusually-large fudge factor.

(Done.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, I think the origin of it really is as a large fudge factor. 🙂 It had been 10%, and in 935c1f5 we loosened it to 10x because that had flaked.

The possible unit-scaling bug is just a nice example of how some code like this plausibly could actually have a bug that this test with a large fudge factor would catch.

src/__tests__/metatests.js Outdated Show resolved Hide resolved
.eslintrc.yaml Outdated Show resolved Hide resolved
@rk-for-zulip rk-for-zulip force-pushed the lolex-shim branch 2 times, most recently from 191eaf8 to 6634ecd Compare February 20, 2020 00:19
@rk-for-zulip
Copy link
Contributor Author

The new version has an additional final commit (lolex: Wrap setTimeout and setInterval to error on overflow.) which is, in retrospect, probably a little too intrusive. If there are issues with it, I'd rather drop it than let it block the rest of this; we can relegate improvements to followup work if desired.

Comment on lines 43 to 45
// [α] In theory, we can't be sure of this test; the time between the
// specified timeout expiring and a timeslice actually becoming available
// may be arbitrarily long. (To simulate this with Lolex, one can set
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this true with Lolex? It seems like one of the great advantages of systematically mocking out the clock as well as timers, provided the mock implementation delivers this, is that we could expect the timers to behave ideally.

Which effectively means that for timer-intensive code like this, we'd have written much more of a true unit test than was ever possible with real clocks.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this true with Lolex?

It's not – hence the trailing parenthetical. That could stand to be made more clear, though.

It seems like one of the great advantages of systematically mocking out the clock as well as timers, provided the mock implementation delivers this, is that we could expect the timers to behave ideally.

Alas, "ideally" contrasts with "in reality", making this simultaneously one of the great disadvantages of mocking out the clock. We don't have any simple way to ensure that our code behaves properly under real-world conditions like load-induced greater-than-epsilon timeslice delays.

Which effectively means that for timer-intensive code like this, we'd have written much more of a true unit test than was ever possible with real clocks.

That's true, but it's not an unalloyed good. 😞 An integration test may well be more informative than a unit test.

Comment on lines 32 to 33
const FUDGE_FACTOR = 1.01; // see discussion of [α], below
lolex.advanceTimersByTime(expectedMs * FUDGE_FACTOR);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The FUDGE_FACTOR here seems like it makes this test loose in a way that it isn't now. I still don't see why any tradeoff like that should be a consequence of adopting a nice spiffy mock setup, where we get to control the clock's behavior in a way we didn't have before.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quite the opposite: it's intended to replicate the looseness that is present now. (Clarifying the preceding would also clarify this, I expect.)

I suppose we could have a separate test that's "sharp", so to speak. I'm not sure it would be terribly informative, though. (EDIT: I have been convinced otherwise of that last.)

Comment on lines 29 to 30
// `sleepPromise` should have set up its timer before returning control.
expect(lolex.getTimerCount()).toBe(1);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there something this tests about the function's behavior beyond what's tested below? This feels to me like it's only poking into the specifics of the function's implementation. (What if there were two timers involved? Not super likely for this particular function, given the simplicity of its spec... but I don't see how it'd be the caller's business if there were.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was intended not as a test of sleep's behavior, but as a metatest of sleepPromise's behavior (resulting from an actual failure in its original implementation).

It's probably not necessary.

Comment on lines 20 to 27
const sleepAsync = async () => {
const start = Date.now();
await sleep(expectedMs);
const actualMs = Date.now() - start;

return actualMs;
};
const sleepPromise: Promise<number> = sleepAsync();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this simplification, thanks!

Can we make this another step more straightforward from here? I think it'd be helpful to basically inline this single-use function, like so:

  const start = Date.now();
  const sleepPromise: Promise<void> = sleep(expectedMs);

  // ...

  await sleepPromise;
  const actualMs = Date.now() - start;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's very much less straightforward, not more – it mixes the two control paths, and in subtly fragile ways.

In particular, sleepPromise will hang unless you advance the timer before awaiting on it, and doing so would make the computation of const actualMs [0] meaningless.

Comment on lines 179 to 204
test('timers fail on overflow', () => {
expect(() => {
setTimeout(() => {}, 10e10);
}).toThrow();

expect(() => {
setInterval(() => {}, 10e10);
}).toThrow();
});
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why 10e10 in these tests and not something close to the required threshold, like 3e9 or 2**31?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like 10e10 more than 3e9. That's about it, really.

It's probably worth adding tests that are just barely on either side of the limit, as well as versions with NaN and Infinity.

Comment on lines 53 to 55
// In Node.js, a timeout or interval of greater than INT32_MAX is reduced to
// exactly 1. (In browsers, and probably in React, it is instead truncated
// to an int32 value.)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"in React" is a confusing expression here -- React also runs in browsers :-) and I think also in Node. I think you probably mean "React Native"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... yes. 🤦‍♂️

Comment on lines 43 to 44
// If this stalls, Jest will report failure.
expect(await expectation).toBeTrue();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, thanks.

Just "stalls" doesn't communicate that to me. How about "If this never completes, Jest will time out and report failure."?

Copy link
Member

@gnprice gnprice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @ray-kraesig for the revisions!

The added check in the new commit looks fine to me. Some comments below on that and the others. I think, zooming out a bit, the main concern I have at this point is:

  • In master, the sleep test is quite simple, just a few straightforward lines, for testing a simple function.
  • With these changes, it gets a lot more complex -- many times longer, with several additional wrinkles. It also becomes imprecise on a check that the existing version makes precisely (and doesn't materially tighten the other, quite loose, existing check.)
  • That doesn't seem like a sacrifice I'd expect to have to make in order to use this nice mocking library. Sure, a few lines of setup, and perhaps a line here and there in the test to drive the mocks forward; but greater control ought to mean the rest of the test logic just gets simpler and more precise, not the reverse of those.

gnprice pushed a commit that referenced this pull request Feb 25, 2020
At present, it's not possible (or at least not easy) to test any code
which relies on `Date.now()` for timing purposes. Timer functions such
as `setInterval()` are mockable with Jest's existing mock-timer
functionality, but its mock timers do not include a mock time. [0]

Fortunately, there is work underway to have Jest v25^H^H26 use the
Lolex time-mocking library for its fake timer implementation, which
will do exactly that! Unfortunately, we need that capability now,
rather than in the indefinite future, so we'll jump the gun a bit.

To ease conversion of tests once Lolex is available natively, we use a
small shim taken from the code on Jest's current master branch which
implements Jest's timer-control functions (not projected to change)
in terms of Lolex.

Tests for this shim -- which may also serve, in part, as examples for
Zulip contributors who would like to use Lolex -- are included.

(Additionally, inform Jest that files present in src/__tests__/aux are
not supposed to contain executable tests.)

[0] S2E1, 1967-09-15.

GitHub-PR: #3886
@gnprice
Copy link
Member

gnprice commented Feb 25, 2020

Most of these commits LGTM, including the main one, so I've gone and merged them! Thanks again.

dd53b0d eslint: Extend test-file import permissions to subdirectories.
2faad06 eslint: Disallow imports from tests directories.
6d1d1df jest: Add Lolex for proper time mocking.
6dea133 queueMarkAsRead: Adapt to use Lolex in tests.

That leaves two commits in progress: the sleep test, and the newest commit on top.

@rk-for-zulip
Copy link
Contributor Author

rk-for-zulip commented Feb 25, 2020

The previous muddled Lolex-based sleep test has been replaced with a sharp Lolex-based sleep test. This only tests sleep's behavior under an ideal clock; rather than trying to write a(n improved) Lolex-based test for less-ideal clocks, the original code from master has been left in place, continuing to use the real clock for this purpose.

An optional final refactoring commit has also been included.

Copy link
Member

@gnprice gnprice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @ray-kraesig for these revisions! I like the new structure of a unit test plus an integration test (the latter being what we have now), and the new shape of the unit test.

Just one item below, which I think will be easy to fix.


test('waits for exactly the right number of milliseconds', async () => {
const expectedMs = 1000;
const sleepPromise: Promise<number> = sleepMeasure(expectedMs);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this line appears in a commit where sleepMeasure hasn't yet 🙂

Maybe put the "factor out sleepMeasure" commit earlier, with just one callsite, and then add this new callsite right after that? Or other sequences would be good too.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Restored(?) to what I thought it was originally: sleepMeasure created in this commit, within the unit-test's describe block.

src/__tests__/aux/lolex.js Show resolved Hide resolved
@gnprice gnprice mentioned this pull request Feb 27, 2020
"Lolex" !== "lolex".
The second `expect` test is theoretically unsound, and the reason for
a fudge-factor of 10, specifically (as opposed to other possibly-
interesting magic numbers like 2 or 137.036) is not obvious. Add a
justification and explanation.

Additionally, rename `durationMs` to more clearly contrast with
`expectedMs`.
Test that `sleep` has ideal behavior in the presence of an ideal
clock.

The original test remains in place, to test that `sleep` also has
reasonably robust behavior in the presence of a non-ideal clock.
Lolex follows Node's behavior in (silently!) limiting timer durations
to a maximum of INT32_MAX (2^31 - 1) milliseconds (or approximately
24.86 days), substituting a timeout of 1ms on overflow. Browsers have
subtly different behavior here, as they truncate to 32 bits, rather
than substituting.

React Native's JavaScript engine probably follows browser behavior
rather than Node's, but it hardly matters -- if we issue a timer with
a duration over the limit, even in tests, we're almost certainly not
going to get the expected behavior.

Wrap Lolex's overrides of setTimeout and setInterval to complain if a
timer is ever created with an incompatibly-long timeout duration.
Factor `sleepMeasure` out of both the "ideal" and "real" tests for
`sleep`.
@gnprice
Copy link
Member

gnprice commented Feb 28, 2020

And merged. Thanks again!

@rk-for-zulip rk-for-zulip deleted the lolex-shim branch March 3, 2020 22:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants