
CI builds sometimes only run a subset of tests #2249

Closed
robertknight opened this issue Jun 22, 2020 · 11 comments
@robertknight

We've seen an issue in CI, which Lyza and I have also experienced locally, where builds occasionally "succeed" but only run a fraction of the tests (typically 300-500, it seems).

When this happens, codecov will typically report a sudden unexpected drop in code coverage.

Original report: #2244 (comment)

robertknight added a commit that referenced this issue Jun 22, 2020
Update the software used to run tests in CI. This _may_ help with
#2249.
@robertknight robertknight self-assigned this Jun 22, 2020
@robertknight

I noticed this in the CI logs for a recent master build. Probably unrelated, but I'll note it anyway:

[0622/134713.594272:INFO:CONSOLE(342905)] "Uncaught (in promise) TypeError: Cannot read property 'references' of undefined", source: http://localhost:9876/absolute/tmp/92430908e258191cffbe541e795258ee.browserify.js?352a30a79692f7b6bedaf8759b1b434258a231ce (342905)

@robertknight

robertknight commented Jun 22, 2020

Example of a build of the master branch where this happened: https://jenkins.hypothes.is/job/client/job/master/1986/console

Looking through the 29 most recent builds on Jenkins, this issue happened in two of them: #1986 and #1982. It would be interesting to look at the Chrome browser logs for those builds, but unfortunately we only keep logs for the most recent build.

robertknight added a commit that referenced this issue Jun 23, 2020
During CI runs, but not local development, log which individual tests
have executed. This may help with debugging #2249.
@robertknight

We updated the version of Alpine used on Jenkins CI and I don't think we've seen the problem there since, although Lyza did report that it happened on her system.

#2263 has modified the CI build so that it logs which tests have executed, which should help debugging if the issue happens again.

Reproducing the issue on my system may be more difficult now because I updated my laptop and the new one is faster.

One thing I probably will try before closing this as "maybe fixed - wait until it happens again so I can continue debugging" is to write a script that runs the test suite in a loop many times and checks whether the issue occurred. It may also be worth checking whether it happens in a resource-constrained environment (e.g. in a Docker container with reduced CPU limits).

@robertknight

I was able to reproduce this issue locally with the following script:

import re
import subprocess
import sys

max_iterations = 100
for i in range(0, max_iterations):
    print(f"Running test iteration {i+1} of {max_iterations}")

    proc = subprocess.run(
        "node_modules/.bin/gulp test",
        shell=True,
        stdout=subprocess.PIPE,
        check=True,
    )
    stdout = proc.stdout.decode()

    # Parse the total test count from Karma's summary line.
    tests_run_match = re.search("TOTAL: ([0-9]+) SUCCESS", stdout)

    if not tests_run_match:
        print(f"Tests failed on run {i+1}")
        print(stdout)
        sys.exit(1)

    expected_tests = 2439
    tests_run = int(tests_run_match.group(1))
    if tests_run < expected_tests:
        print(f"Only ran {tests_run} instead of {expected_tests}")
        print(stdout)
        sys.exit(1)

Output:

[I]  ~/h/client > python test-loop.py
Running test iteration 1 of 100
Running test iteration 2 of 100
Running test iteration 3 of 100
Running test iteration 4 of 100
Running test iteration 5 of 100
Running test iteration 6 of 100
Only ran 386 instead of 2439
[10:25:22] Using gulpfile ~/hypothesis/client/gulpfile.js
[10:25:22] Starting 'test'...
[10:25:22] Starting 'build-css'...
[10:25:23] Finished 'build-css' after 1.08 s
[10:25:23] Starting '<anonymous>'...

START:
13 07 2020 10:25:49.786:INFO [framework.browserify]: bundle built
13 07 2020 10:25:49.861:INFO [karma-server]: Karma v5.1.0 server started at http://0.0.0.0:9876/
13 07 2020 10:25:49.862:INFO [launcher]: Launching browsers ChromeHeadless_Custom with concurrency unlimited
13 07 2020 10:25:49.870:INFO [launcher]: Starting browser ChromeHeadless
13 07 2020 10:25:50.228:INFO [Chrome Headless 83.0.4103.0 (Mac OS 10.15.5)]: Connected on socket H3Vs5mT9j5XekMHCAAAA with id 52259759
13 07 2020 10:25:51.849:WARN [web-server]: 404: /undefined

Finished in 1.369 secs / 0.466 secs @ 10:25:52 GMT+0100 (British Summer Time)

SUMMARY:
✔ 386 tests completed
TOTAL: 386 SUCCESS
[10:25:52] Finished '<anonymous>' after 29 s
[10:25:52] Finished 'test' after 30 s

robertknight added a commit that referenced this issue Jul 13, 2020
Add a script to assist with debugging #2249 by running the test suite
repeatedly and checking the output.
@robertknight

By adding logging to karma-mocha I was able to get more details about what happens when the test run ends early. The tests stop after sinon throws an exception about a timer being created with setTimeout and later cleared using cancelAnimationFrame:

Error: Uncaught Error: Cannot clear timer: timer created with setTimeout() but cleared with cancelAnimationFrame() (http://localhost:9876/absolute/Users/robert/hypothesis/client/node_modules/sinon/pkg/sinon.js?d78d9a206cd94a49c9bef2a301b191dbb41b7e9f:4757)
LOG: 'karma-mocha-debug: completed 386 of 2439 tests'
LOG: 'karma-mocha-debug: completed 386 of 2439 tests'
LOG: 'karma-mocha-debug: event counts', Object{testStart: 386, testEnd: 386, testFail: 1, testSkipped: 0}
LOG: 'karma-mocha-debug: event counts', Object{testStart: 386, testEnd: 386, testFail: 1, testSkipped: 0}

By modifying the sinon package I was able to get the thrown Error to include stack traces showing where the timer was created and cleared:

'Error: Uncaught Error: Cannot clear timer: timer created with setTimeout() but cleared with cancelAnimationFrame()create stack: Error: Allocating timer
    at addTimer (absolute/Users/robert/hypothesis/client/node_modules/sinon/pkg/sinon.js?1eae6c1e37094c69110615551da0c52234b202c3:4606:17)
    at Object.setTimeout (absolute/Users/robert/hypothesis/client/node_modules/sinon/pkg/sinon.js?1eae6c1e37094c69110615551da0c52234b202c3:5059:20)
    at target.<computed> (absolute/Users/robert/hypothesis/client/node_modules/sinon/pkg/sinon.js?1eae6c1e37094c69110615551da0c52234b202c3:4885:38)
    at r (absolute/var/folders/b5/m2_j1bd12rs3kvtc4vsbb1vc0000gn/T/20aae16cd07513c23c903c93ad2c5408.browserify.js?4de63f6500c582e57fa7ce15c235b88c0a5deda4:124517:1145)
clear stack: Error: clearStack
    at clearTimer (absolute/Users/robert/hypothesis/client/node_modules/sinon/pkg/sinon.js?1eae6c1e37094c69110615551da0c52234b202c3:4759:25)
    at Object.cancelAnimationFrame (absolute/Users/robert/hypothesis/client/node_modules/sinon/pkg/sinon.js?1eae6c1e37094c69110615551da0c52234b202c3:5160:20)
    at target.<computed> (absolute/Users/robert/hypothesis/client/node_modules/sinon/pkg/sinon.js?1eae6c1e37094c69110615551da0c52234b202c3:4885:38)
    at r (http://localhost:9876/absolute/var/folders/b5/m2_j1bd12rs3kvtc4vsbb1vc0000gn/T/20aae16cd07513c23c903c93ad2c5408.browserify.js?4de63f6500c582e57fa7ce15c235b88c0a5deda4:124517:1121) (http://localhost:9876/absolute/Users/robert/hypothesis/client/node_modules/sinon/pkg/sinon.js?1eae6c1e37094c69110615551da0c52234b202c3:4772)'
LOG: 'karma-mocha-debug: completed 389 of 2439 tests'

The function name in the browserify bundle is unfortunately minified, and the bundle isn't normally present on the file system after the test run. However, by modifying karma-browserify it was possible to save the bundle to a file just after it is generated.

Navigating to the line and column specified in this frame:

http://localhost:9876/absolute/var/folders/b5/m2_j1bd12rs3kvtc4vsbb1vc0000gn/T/20aae16cd07513c23c903c93ad2c5408.browserify.js?4de63f6500c582e57fa7ce15c235b88c0a5deda4:124517:1121

shows that the timer is being cleared from this minified function:

r=function(){clearTimeout(u),l&&cancelAnimationFrame(t),setTimeout(n)}

After looking at the surrounding code to figure out which package it probably comes from, I think it is the done callback here: https://github.com/preactjs/preact/blob/aea0726b661732aada223093fc14dcbf80771b7c/hooks/src/index.js#L306. Both the cancelAnimationFrame and setTimeout locations are in that package. In the original source code, though, the setTimeout timer ID is not even captured, so there is no way it can be accidentally passed to cancelAnimationFrame later.
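For reference, the pattern at the linked preact location looks roughly like this (a paraphrased sketch based on reading the linked source, not an exact copy; the timeout duration is illustrative):

```javascript
// Paraphrased sketch of preact's afterNextFrame helper: schedule `callback`
// on the next animation frame, with a setTimeout fallback in case no frame
// arrives, and have `done` cancel whichever timer is still pending.
function afterNextFrame(callback) {
  const done = () => {
    clearTimeout(timeout);
    if (raf !== undefined) cancelAnimationFrame(raf);
    // Note: this timer ID is not captured, so this code cannot pass it
    // to cancelAnimationFrame later.
    setTimeout(callback);
  };
  const timeout = setTimeout(done, 100);
  let raf;
  if (typeof requestAnimationFrame === 'function') {
    raf = requestAnimationFrame(done);
  }
}
```

If this reading is right, the minified `r` above corresponds to `done`, and its clearTimeout/cancelAnimationFrame pair is what trips sinon's check.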

Timer IDs are integers, and in the case of sinon's fake timers they are allocated sequentially from 1. Timer IDs allocated by the native setTimeout function are also integers. In Chrome at least they seem to start from a low value (e.g. if you open http://example.com in a new tab and run setTimeout a few times, you'll see a sequence of small integers). I wonder if we have a scenario where a timer is allocated using a native function while sinon fake timers are not active, and then later cleared at a time when fake timers are active.

@robertknight

It looks like requestAnimationFrame maintains its own timer IDs that are sequentially allocated from 1, and that behaviour is mandated by the spec.

In sinon, it seems that setTimeout and requestAnimationFrame both use the same global counter.
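To illustrate the hazard, here is a minimal sketch (hypothetical code, not sinon's implementation) of a fake clock where setTimeout and requestAnimationFrame share one ID counter, so an ID minted by one API can be mistaken for the other's:

```javascript
// Hypothetical fake clock sharing one ID counter across both timer APIs,
// mimicking the shared-counter behaviour described above.
function makeFakeClock() {
  let nextId = 1;
  const timers = new Map(); // id -> { type, fn }
  return {
    setTimeout(fn, ms) {
      timers.set(nextId, { type: 'timeout', fn });
      return nextId++;
    },
    requestAnimationFrame(fn) {
      timers.set(nextId, { type: 'raf', fn });
      return nextId++;
    },
    cancelAnimationFrame(id) {
      const timer = timers.get(id);
      if (timer && timer.type !== 'raf') {
        // Mirrors the sinon error seen in the logs above.
        throw new Error(
          'Cannot clear timer: timer created with setTimeout() but cleared with cancelAnimationFrame()'
        );
      }
      timers.delete(id);
    },
  };
}

// Native requestAnimationFrame allocates its own IDs from 1 (per the spec),
// so an ID obtained from the *native* rAF while fake timers were inactive
// can collide with a fake setTimeout ID once fake timers are installed:
const clock = makeFakeClock();
const timeoutId = clock.setTimeout(() => {}, 0); // fake ID 1
const nativeRafId = 1; // a plausible ID from the native rAF counter
// clock.cancelAnimationFrame(nativeRafId) now hits the setTimeout timer
// and throws the error above.
```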

robertknight added a commit that referenced this issue Jul 31, 2020
Add a script to assist with debugging #2249 by running the test suite
repeatedly and checking the output.
@robertknight

The problem where only a subset of the tests runs, but the test suite exits without any apparent failure, can be reproduced on my system using a test that triggers an uncaught exception some time after it has run:

diff --git a/src/sidebar/util/test/dom-test.js b/src/sidebar/util/test/dom-test.js
index 7d4511d50..29502d410 100644
--- a/src/sidebar/util/test/dom-test.js
+++ b/src/sidebar/util/test/dom-test.js
@@ -24,6 +24,14 @@ describe('sidebar/util/dom', () => {
     });
   });
 
+  describe('TESTING', () => {
+    it('should blow up', () => {
+      setTimeout(() => {
+        throw new Error('Fail!');
+      }, 0);
+    });
+  });
+
   describe('listen', () => {
     const createFakeElement = () => ({
       addEventListener: sinon.stub(),

And then running `gulp test --grep 'util'`.

The output in this case contains no hint of an error:

START:
07 08 2020 10:21:27.917:INFO [framework.browserify]: bundle built
07 08 2020 10:21:27.943:INFO [karma-server]: Karma v5.1.0 server started at http://0.0.0.0:9876/
07 08 2020 10:21:27.943:INFO [launcher]: Launching browsers ChromeHeadless_Custom with concurrency unlimited
07 08 2020 10:21:27.949:INFO [launcher]: Starting browser ChromeHeadless
07 08 2020 10:21:28.366:INFO [Chrome Headless 85.0.4182.0 (Mac OS 10.15.6)]: Connected on socket Fh2nCLU9QX7Fg5ZSAAAA with id 83866735

Finished in 0.14 secs / 0.06 secs @ 10:21:28 GMT+0100 (British Summer Time)

SUMMARY:
✔ 152 tests completed

But if the broken test is commented out, the output is:

START:
07 08 2020 10:22:10.731:INFO [framework.browserify]: bundle built
07 08 2020 10:22:10.755:INFO [karma-server]: Karma v5.1.0 server started at http://0.0.0.0:9876/
07 08 2020 10:22:10.755:INFO [launcher]: Launching browsers ChromeHeadless_Custom with concurrency unlimited
07 08 2020 10:22:10.759:INFO [launcher]: Starting browser ChromeHeadless
07 08 2020 10:22:11.110:INFO [Chrome Headless 85.0.4182.0 (Mac OS 10.15.6)]: Connected on socket 1IVdcw_VvDmHefVcAAAA with id 61684393

Finished in 0.598 secs / 0.239 secs @ 10:22:12 GMT+0100 (British Summer Time)

SUMMARY:
✔ 426 tests completed

@robertknight

The way our test environment works is that Karma loads all the test code into a mostly empty web page and then runs Mocha using a test-runner adapter provided by the karma-mocha package. When karma-mocha runs Mocha, it sets up a custom reporter that translates Mocha test events into Karma test runner events.

By adding some instrumentation to the createMochaReporterConstructor function in node_modules/karma-mocha/lib/adapter.js I've learned that what happens in the above scenario is that:

  1. The tests start running.
  2. The buggy test added above runs, but doesn't report any error, since the setTimeout callback runs after the test completes.
  3. Some later async test runs, with test start and test end events emitted, and is marked as having completed successfully. It is only when the "test end" event is emitted by mocha that karma-mocha reports the test status back to Karma.
  4. Just after the test in step (3) finishes, a test fail event is emitted for it. karma-mocha adds metadata to the test object marking it as a failure, but since the test end event has already been emitted, this failure is never reported back to Karma.
  5. An "end" event for the whole test run is emitted by mocha after the failure, which causes karma-mocha to report to Karma that the test run has finished. At this point Karma exits, having only run a subset of the tests, but reports that they were all successful.
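The sequence above can be sketched as follows (an assumed simplification of the adapter's reporting logic, not karma-mocha's actual code), showing why a failure that arrives after "test end" never reaches Karma:

```javascript
// A result is sent to Karma exactly once, when the "test end" event fires,
// so later mutations of the test object are lost.
function makeAdapter(karma) {
  return {
    onTestEnd(test) {
      karma.result({ description: test.description, success: !test.failed });
    },
    onFail(test, err) {
      // Marks the test as failed, but if onTestEnd already ran for this
      // test, the failure never reaches Karma.
      test.failed = true;
      test.error = err;
    },
    onRunEnd() {
      karma.complete();
    },
  };
}

// Replaying the event order from steps 3-5 above:
const results = [];
let completed = false;
const adapter = makeAdapter({
  result: r => results.push(r),
  complete: () => { completed = true; },
});

const test = { description: 'some async test', failed: false };
adapter.onTestEnd(test);                         // step 3: reported as passing
adapter.onFail(test, new Error('late failure')); // step 4: too late, dropped
adapter.onRunEnd();                              // step 5: run "succeeds"
```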

@robertknight

Looking at hypothesis/karma-mocha@25f9a6f, it seems that karma-mocha expects an uncaught exception to always be followed by a "test end" event, which does not happen when the exception is thrown in between tests, after the most recent test completed successfully. At this point I'm not clear whether karma-mocha's expectation that a "fail" event is always followed by a "test end" event is wrong, or whether mocha's behavior in this situation is incorrect.

robertknight added a commit that referenced this issue Aug 7, 2020
Add a script to assist with debugging #2249 by running the test suite
repeatedly and checking that the tests completed successfully and that
the expected number of tests were run.
robertknight added a commit that referenced this issue Aug 10, 2020
Due to an issue with the interaction between mocha and karma-mocha [1],
uncaught exceptions or promise rejections that occur in between tests
can cause the test run to stop immediately yet incorrectly report that
all tests executed successfully and hence CI runs / Jenkins builds
incorrectly report their status.

This commit works around the issue until the problem is resolved in
mocha / karma-mocha by adding an `afterEach` hook which will re-throw
any uncaught exceptions and ensure that mocha notices them and reports
the failure correctly to Karma.

[1] See comments on #2249
robertknight added a commit that referenced this issue Aug 10, 2020
Fix a test that was failing in a way that previously went unreported: the
failure occurred only after the test function returned, because of a
missing `await` call, and was then hidden by the issue with uncaught
exceptions between tests described in #2249.
@robertknight

Update: The problem is now partly addressed by a workaround. The issue that causes the uncaught exceptions to occur in between tests still exists, but at least the test run will fail now rather than incorrectly reporting that the tests all ran successfully.
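The workaround can be sketched like this (an assumed shape; the names and wiring are illustrative, not the project's actual test bootstrap code):

```javascript
// Queue uncaught errors and unhandled rejections that surface between
// tests, then re-throw them from an afterEach hook so mocha attributes the
// failure to a test and reports it to Karma.
const pendingErrors = [];

// Install listeners on the global object (window in the browser).
function trapUncaughtErrors(target = globalThis) {
  target.addEventListener?.('error', e => pendingErrors.push(e.error));
  target.addEventListener?.('unhandledrejection', e => pendingErrors.push(e.reason));
}

// Registered with mocha as afterEach(rethrowPendingErrors).
function rethrowPendingErrors() {
  if (pendingErrors.length > 0) {
    throw pendingErrors.shift();
  }
}
```

Re-throwing inside `afterEach` means the failure surfaces within mocha's normal test lifecycle, so the usual fail and "test end" events fire and the run is correctly reported as failed.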

@robertknight

I haven't seen this issue happen in some time, so I'm going to close this for now.
