
CI builds sometimes only run a subset of tests #2249

Closed
robertknight opened this issue Jun 22, 2020 · 11 comments
@robertknight

We've seen an issue in CI, which Lyza and I have also experienced locally, where builds occasionally "succeed" but only run a fraction of the tests (typically 300-500, it seems).

When this happens, codecov will typically report a sudden unexpected drop in code coverage.

Original report: #2244 (comment)

robertknight added a commit that referenced this issue Jun 22, 2020
Update the software used to run tests in CI. This _may_ help with
#2249.
@robertknight robertknight self-assigned this Jun 22, 2020
@robertknight

I noticed this in the CI logs for a recent master build. Probably unrelated, but I'll note it anyway:

[0622/134713.594272:INFO:CONSOLE(342905)] "Uncaught (in promise) TypeError: Cannot read property 'references' of undefined", source: http://localhost:9876/absolute/tmp/92430908e258191cffbe541e795258ee.browserify.js?352a30a79692f7b6bedaf8759b1b434258a231ce (342905)

@robertknight

robertknight commented Jun 22, 2020

Example of a build of the master branch where this happened: https://jenkins.hypothes.is/job/client/job/master/1986/console

Looking through the 29 most recent builds on Jenkins, this issue happened in two of them: #1986 and #1982. It would be interesting to look at the Chrome browser logs for those builds, but unfortunately we only keep logs for the most recent build.

robertknight added a commit that referenced this issue Jun 23, 2020
During CI runs, but not local development, log which individual tests
have executed. This may help with debugging #2249.
@robertknight

We updated the version of Alpine used on Jenkins CI and I don't think we've seen the problem there since, although Lyza did report that it happened on her system.

#2263 has modified the CI build so that it logs which tests have executed, which should help debugging if the issue happens again.

Reproducing the issue on my system may be more difficult now because I updated my laptop and the new one is faster.

One thing I probably will try before closing this as "maybe fixed - wait until it happens again so I can continue debugging" is to write a script that runs the test suite in a loop many times and checks whether the issue occurred. It may also be worth checking whether it happens in a resource-constrained environment (e.g. in a Docker container with reduced CPU limits).

@robertknight

I was able to reproduce this issue locally with the following script:

import re
import subprocess
import sys

max_iterations = 100
for i in range(0, max_iterations):
    print(f"Running test iteration {i+1} of {max_iterations}")

    proc = subprocess.run(
        "node_modules/.bin/gulp test",
        shell=True,
        stdout=subprocess.PIPE,
        check=True,
    )
    stdout = proc.stdout.decode()

    # Parse the total test count from Karma's summary line.
    tests_run_match = re.search("TOTAL: ([0-9]+) SUCCESS", stdout)

    if not tests_run_match:
        print(f"Tests failed on run {i+1}")
        print(stdout)
        sys.exit(1)

    expected_tests = 2439
    tests_run = int(tests_run_match.group(1))
    if tests_run < expected_tests:
        print(f"Only ran {tests_run} instead of {expected_tests}")
        print(stdout)
        sys.exit(1)

Output:

[I]  ~/h/client > python test-loop.py
Running test iteration 1 of 100
Running test iteration 2 of 100
Running test iteration 3 of 100
Running test iteration 4 of 100
Running test iteration 5 of 100
Running test iteration 6 of 100
Only ran 386 instead of 2439
[10:25:22] Using gulpfile ~/hypothesis/client/gulpfile.js
[10:25:22] Starting 'test'...
[10:25:22] Starting 'build-css'...
[10:25:23] Finished 'build-css' after 1.08 s
[10:25:23] Starting '<anonymous>'...

START:
13 07 2020 10:25:49.786:INFO [framework.browserify]: bundle built
13 07 2020 10:25:49.861:INFO [karma-server]: Karma v5.1.0 server started at http://0.0.0.0:9876/
13 07 2020 10:25:49.862:INFO [launcher]: Launching browsers ChromeHeadless_Custom with concurrency unlimited
13 07 2020 10:25:49.870:INFO [launcher]: Starting browser ChromeHeadless
13 07 2020 10:25:50.228:INFO [Chrome Headless 83.0.4103.0 (Mac OS 10.15.5)]: Connected on socket H3Vs5mT9j5XekMHCAAAA with id 52259759
13 07 2020 10:25:51.849:WARN [web-server]: 404: /undefined

Finished in 1.369 secs / 0.466 secs @ 10:25:52 GMT+0100 (British Summer Time)

SUMMARY:
✔ 386 tests completed
TOTAL: 386 SUCCESS
[10:25:52] Finished '<anonymous>' after 29 s
[10:25:52] Finished 'test' after 30 s

robertknight added a commit that referenced this issue Jul 13, 2020
Add a script to assist with debugging #2249 by running the test suite
repeatedly and checking the output.
@robertknight

By adding logging to karma-mocha I was able to get more details about what happens when the test run ends early. The tests stop after sinon throws an exception about a timer being created with setTimeout and later cleared using cancelAnimationFrame:

Error: Uncaught Error: Cannot clear timer: timer created with setTimeout() but cleared with cancelAnimationFrame() (http://localhost:9876/absolute/Users/robert/hypothesis/client/node_modules/sinon/pkg/sinon.js?d78d9a206cd94a49c9bef2a301b191dbb41b7e9f:4757)
LOG: 'karma-mocha-debug: completed 386 of 2439 tests'
LOG: 'karma-mocha-debug: completed 386 of 2439 tests'
LOG: 'karma-mocha-debug: event counts', Object{testStart: 386, testEnd: 386, testFail: 1, testSkipped: 0}
LOG: 'karma-mocha-debug: event counts', Object{testStart: 386, testEnd: 386, testFail: 1, testSkipped: 0}

By modifying the sinon package I was able to get the thrown Error to include stack traces showing where the timer was created and cleared:

'Error: Uncaught Error: Cannot clear timer: timer created with setTimeout() but cleared with cancelAnimationFrame()create stack: Error: Allocating timer
    at addTimer (absolute/Users/robert/hypothesis/client/node_modules/sinon/pkg/sinon.js?1eae6c1e37094c69110615551da0c52234b202c3:4606:17)
    at Object.setTimeout (absolute/Users/robert/hypothesis/client/node_modules/sinon/pkg/sinon.js?1eae6c1e37094c69110615551da0c52234b202c3:5059:20)
    at target.<computed> (absolute/Users/robert/hypothesis/client/node_modules/sinon/pkg/sinon.js?1eae6c1e37094c69110615551da0c52234b202c3:4885:38)
    at r (absolute/var/folders/b5/m2_j1bd12rs3kvtc4vsbb1vc0000gn/T/20aae16cd07513c23c903c93ad2c5408.browserify.js?4de63f6500c582e57fa7ce15c235b88c0a5deda4:124517:1145)
clear stack: Error: clearStack
    at clearTimer (absolute/Users/robert/hypothesis/client/node_modules/sinon/pkg/sinon.js?1eae6c1e37094c69110615551da0c52234b202c3:4759:25)
    at Object.cancelAnimationFrame (absolute/Users/robert/hypothesis/client/node_modules/sinon/pkg/sinon.js?1eae6c1e37094c69110615551da0c52234b202c3:5160:20)
    at target.<computed> (absolute/Users/robert/hypothesis/client/node_modules/sinon/pkg/sinon.js?1eae6c1e37094c69110615551da0c52234b202c3:4885:38)
    at r (http://localhost:9876/absolute/var/folders/b5/m2_j1bd12rs3kvtc4vsbb1vc0000gn/T/20aae16cd07513c23c903c93ad2c5408.browserify.js?4de63f6500c582e57fa7ce15c235b88c0a5deda4:124517:1121) (http://localhost:9876/absolute/Users/robert/hypothesis/client/node_modules/sinon/pkg/sinon.js?1eae6c1e37094c69110615551da0c52234b202c3:4772)'
LOG: 'karma-mocha-debug: completed 389 of 2439 tests'

The function name in the browserify bundle is unfortunately minified, and the bundle isn't normally present on the file system after the test run. However, by modifying karma-browserify it was possible to save the bundle to a file just after it is generated.

Navigating to the line and column specified in this frame:

http://localhost:9876/absolute/var/folders/b5/m2_j1bd12rs3kvtc4vsbb1vc0000gn/T/20aae16cd07513c23c903c93ad2c5408.browserify.js?4de63f6500c582e57fa7ce15c235b88c0a5deda4:124517:1121

shows that the timer is being cleared from this minified function:

r=function(){clearTimeout(u),l&&cancelAnimationFrame(t),setTimeout(n)}

After looking at the surrounding code to figure out which package it probably comes from, I think it is the done callback here: https://github.com/preactjs/preact/blob/aea0726b661732aada223093fc14dcbf80771b7c/hooks/src/index.js#L306. Both the cancelAnimationFrame and setTimeout locations are in that package. In the original source code, though, the setTimeout timer ID is not even captured, so there is no way it can be accidentally passed to cancelAnimationFrame later.
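For reference, the pattern at the linked preact location looks roughly like this (a paraphrased sketch based on reading the linked source, not an exact copy; the timeout duration is illustrative):

```javascript
// Paraphrased sketch of preact's afterNextFrame helper: schedule `callback`
// on the next animation frame, with a setTimeout fallback in case no frame
// arrives, and have `done` cancel whichever timer is still pending.
function afterNextFrame(callback) {
  const done = () => {
    clearTimeout(timeout);
    if (raf !== undefined) cancelAnimationFrame(raf);
    // Note: this timer ID is not captured, so this code cannot pass it
    // to cancelAnimationFrame later.
    setTimeout(callback);
  };
  const timeout = setTimeout(done, 100);
  let raf;
  if (typeof requestAnimationFrame === 'function') {
    raf = requestAnimationFrame(done);
  }
}
```

If this reading is right, the minified `r` above corresponds to `done`, and its clearTimeout/cancelAnimationFrame pair is what trips sinon's check.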

Timer IDs are integers, and in the case of sinon's fake timers they are allocated sequentially from 1. Timer IDs allocated by the native setTimeout function are also integers. In Chrome at least they seem to start from a low value (e.g. if you open http://example.com in a new tab and run setTimeout a few times, you'll see a sequence of small integers). I wonder if we have a scenario where a timer is allocated using a native function while sinon fake timers are not active, and then later cleared at a time when fake timers are active.

@robertknight

It looks like requestAnimationFrame maintains its own timer IDs that are sequentially allocated from 1, and that behaviour is mandated by the spec.

In sinon, it seems that setTimeout and requestAnimationFrame both use the same global counter.
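To illustrate the hazard, here is a minimal sketch (hypothetical code, not sinon's implementation) of a fake clock where setTimeout and requestAnimationFrame share one ID counter, so an ID minted by one API can be mistaken for the other's:

```javascript
// Hypothetical fake clock sharing one ID counter across both timer APIs,
// mimicking the shared-counter behaviour described above.
function makeFakeClock() {
  let nextId = 1;
  const timers = new Map(); // id -> { type, fn }
  return {
    setTimeout(fn, ms) {
      timers.set(nextId, { type: 'timeout', fn });
      return nextId++;
    },
    requestAnimationFrame(fn) {
      timers.set(nextId, { type: 'raf', fn });
      return nextId++;
    },
    cancelAnimationFrame(id) {
      const timer = timers.get(id);
      if (timer && timer.type !== 'raf') {
        // Mirrors the sinon error seen in the logs above.
        throw new Error(
          'Cannot clear timer: timer created with setTimeout() but cleared with cancelAnimationFrame()'
        );
      }
      timers.delete(id);
    },
  };
}

// Native requestAnimationFrame allocates its own IDs from 1 (per the spec),
// so an ID obtained from the *native* rAF while fake timers were inactive
// can collide with a fake setTimeout ID once fake timers are installed:
const clock = makeFakeClock();
const timeoutId = clock.setTimeout(() => {}, 0); // fake ID 1
const nativeRafId = 1; // a plausible ID from the native rAF counter
// clock.cancelAnimationFrame(nativeRafId) now hits the setTimeout timer
// and throws the error above.
```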

robertknight added a commit that referenced this issue Jul 31, 2020
Add a script to assist with debugging #2249 by running the test suite
repeatedly and checking the output.
@robertknight

The problem where only a subset of the tests runs, but the test suite exits without any apparent failure, can be reproduced on my system using a test that triggers an uncaught exception some time after it has run:

diff --git a/src/sidebar/util/test/dom-test.js b/src/sidebar/util/test/dom-test.js
index 7d4511d50..29502d410 100644
--- a/src/sidebar/util/test/dom-test.js
+++ b/src/sidebar/util/test/dom-test.js
@@ -24,6 +24,14 @@ describe('sidebar/util/dom', () => {
     });
   });
 
+  describe('TESTING', () => {
+    it('should blow up', () => {
+      setTimeout(() => {
+        throw new Error('Fail!');
+      }, 0);
+    });
+  });
+
   describe('listen', () => {
     const createFakeElement = () => ({
       addEventListener: sinon.stub(),

And then running `gulp test --grep 'util'`.

The output in this case contains no hint of an error:

START:
07 08 2020 10:21:27.917:INFO [framework.browserify]: bundle built
07 08 2020 10:21:27.943:INFO [karma-server]: Karma v5.1.0 server started at http://0.0.0.0:9876/
07 08 2020 10:21:27.943:INFO [launcher]: Launching browsers ChromeHeadless_Custom with concurrency unlimited
07 08 2020 10:21:27.949:INFO [launcher]: Starting browser ChromeHeadless
07 08 2020 10:21:28.366:INFO [Chrome Headless 85.0.4182.0 (Mac OS 10.15.6)]: Connected on socket Fh2nCLU9QX7Fg5ZSAAAA with id 83866735

Finished in 0.14 secs / 0.06 secs @ 10:21:28 GMT+0100 (British Summer Time)

SUMMARY:
✔ 152 tests completed

But if the broken test is commented out, the output is:

START:
07 08 2020 10:22:10.731:INFO [framework.browserify]: bundle built
07 08 2020 10:22:10.755:INFO [karma-server]: Karma v5.1.0 server started at http://0.0.0.0:9876/
07 08 2020 10:22:10.755:INFO [launcher]: Launching browsers ChromeHeadless_Custom with concurrency unlimited
07 08 2020 10:22:10.759:INFO [launcher]: Starting browser ChromeHeadless
07 08 2020 10:22:11.110:INFO [Chrome Headless 85.0.4182.0 (Mac OS 10.15.6)]: Connected on socket 1IVdcw_VvDmHefVcAAAA with id 61684393

Finished in 0.598 secs / 0.239 secs @ 10:22:12 GMT+0100 (British Summer Time)

SUMMARY:
✔ 426 tests completed

@robertknight

The way our test environment works is that Karma loads all the test code into a mostly empty web page and then runs Mocha using a test-runner adapter provided by the karma-mocha package. When karma-mocha runs Mocha, it sets up a custom reporter that translates Mocha test events into Karma test runner events.

By adding some instrumentation to the createMochaReporterConstructor function in node_modules/karma-mocha/lib/adapter.js I've learned that what happens in the above scenario is that:

  1. The tests start running.
  2. The buggy test added above runs, but doesn't report any error, since the setTimeout callback runs after the test completes.
  3. Some later async test runs, with test start and test end events emitted, and is marked as having completed successfully. It is only when the "test end" event is emitted by mocha that karma-mocha reports the test status back to Karma.
  4. Just after the test in step (3) finishes, a test fail event is emitted for it. karma-mocha adds metadata to the test object marking it as a failure, but since the test end event has already been emitted, this failure is never reported back to Karma.
  5. An "end" event for the whole test run is emitted by mocha after the failure, which causes karma-mocha to report to Karma that the test run has finished. At this point Karma exits, having only run a subset of the tests, but reports that they were all successful.
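The sequence above can be sketched as follows (an assumed simplification of the adapter's reporting logic, not karma-mocha's actual code), showing why a failure that arrives after "test end" never reaches Karma:

```javascript
// A result is sent to Karma exactly once, when the "test end" event fires,
// so later mutations of the test object are lost.
function makeAdapter(karma) {
  return {
    onTestEnd(test) {
      karma.result({ description: test.description, success: !test.failed });
    },
    onFail(test, err) {
      // Marks the test as failed, but if onTestEnd already ran for this
      // test, the failure never reaches Karma.
      test.failed = true;
      test.error = err;
    },
    onRunEnd() {
      karma.complete();
    },
  };
}

// Replaying the event order from steps 3-5 above:
const results = [];
let completed = false;
const adapter = makeAdapter({
  result: r => results.push(r),
  complete: () => { completed = true; },
});

const test = { description: 'some async test', failed: false };
adapter.onTestEnd(test);                         // step 3: reported as passing
adapter.onFail(test, new Error('late failure')); // step 4: too late, dropped
adapter.onRunEnd();                              // step 5: run "succeeds"
```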

@robertknight

Looking at hypothesis/karma-mocha@25f9a6f, it seems that karma-mocha expects an uncaught exception to always be followed by a "test end" event, which does not happen when the exception is thrown in between tests, after the most recent test completed successfully. At this point I'm not clear whether karma-mocha's expectation that a "fail" event is always followed by a "test end" event is wrong, or whether mocha's behavior in this situation is incorrect.

robertknight added a commit that referenced this issue Aug 7, 2020
Add a script to assist with debugging #2249 by running the test suite
repeatedly and checking that the tests completed successfully and that
the expected number of tests were run.
robertknight added a commit that referenced this issue Aug 10, 2020
Due to an issue with the interaction between mocha and karma-mocha [1],
uncaught exceptions or promise rejections that occur in between tests
can cause the test run to stop immediately yet incorrectly report that
all tests executed successfully and hence CI runs / Jenkins builds
incorrectly report their status.

This commit works around the issue until the problem is resolved in
mocha / karma-mocha by adding an `afterEach` hook which will re-throw
any uncaught exceptions and ensure that mocha notices them and reports
the failure correctly to Karma.

[1] See comments on #2249
robertknight added a commit that referenced this issue Aug 10, 2020
Fix a test that was failing in a way that previously went unreported: the
failure occurred only after the test function returned, because of a
missing `await` call, and was then hidden by the issue with uncaught
exceptions between tests described in #2249.
@robertknight

Update: The problem is now partly addressed by a workaround. The issue that causes the uncaught exceptions to occur in between tests still exists, but at least the test run will fail now rather than incorrectly reporting that the tests all ran successfully.
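The workaround can be sketched like this (an assumed shape; the names and wiring are illustrative, not the project's actual test bootstrap code):

```javascript
// Queue uncaught errors and unhandled rejections that surface between
// tests, then re-throw them from an afterEach hook so mocha attributes the
// failure to a test and reports it to Karma.
const pendingErrors = [];

// Install listeners on the global object (window in the browser).
function trapUncaughtErrors(target = globalThis) {
  target.addEventListener?.('error', e => pendingErrors.push(e.error));
  target.addEventListener?.('unhandledrejection', e => pendingErrors.push(e.reason));
}

// Registered with mocha as afterEach(rethrowPendingErrors).
function rethrowPendingErrors() {
  if (pendingErrors.length > 0) {
    throw pendingErrors.shift();
  }
}
```

Re-throwing inside `afterEach` means the failure surfaces within mocha's normal test lifecycle, so the usual fail and "test end" events fire and the run is correctly reported as failed.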

@robertknight

I haven't seen this issue happen in some time, so I'm going to close this for now.
