Split tests for parallelization #947
This is very interesting, but at the same time it looks like it accounts mostly for browsers only. I would like to see if it's possible to extend it to Node environments, using some sort of async mechanism instead of browser tabs. On this idea of browser tabs: is it a bad idea to use WebWorkers? With the advent of the next major version (2.0) we will be able to limit our browser support, and most of the remaining browsers will support this API.
It is definitely targeted at browsers, though it could be used in other environments. Since the core concept is just to split tests, you could theoretically do that in different processes of any sort, not just browsers (though there may be better solutions in those cases). I've considered WebWorkers; the primary issue there, however, is that they have no DOM access (amongst other things). I could see this being a plugin for QUnit, since it doesn't necessarily apply to all use cases, but I think to properly support it we would need to expose some lower-level constructs, such as how tests get queued.
I think the worker model is a good one to shoot for: implementation-agnostic with respect to Web Worker or setTimeout or sibling window or iframe window or Node.js child process or whatever. The single shared DOM as common test fixture is holding QUnit back, and while I don't see it going away, we can and should enable suites that abandon it to gain the commensurate performance benefits. To that end, I'd rather see a parallelization count (e.g., …).
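For illustration, such a worker-count knob might look like the following (a hypothetical sketch; `parallel` is not an existing QUnit config option):

```js
// Hypothetical: opt in to N concurrent workers; 1 would preserve
// today's sequential behavior as the default.
QUnit.config.parallel = 4;
```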
This window/iframe parallelization is probably already covered by qunit-composite. Maybe for tests that need the DOM API we might end up with something similar. I am not sure, but maybe ember-cli projects also run different test HTMLs in parallel as well?
The core of it maybe, but the worker model I want is generic and inherently self-balancing, rather than manually pre-partitioned.
Not sure I completely follow this "worker model" you're thinking of. Would this simply amount to scheduling the tests in worker groups? The "worker" which then handles that group is non-specific (e.g., web worker, child process, etc.)? Or is it more along the lines of: you have a number of workers and those handle individual tests as they become available to accept a new test?
This is the big thing I am looking for as well with this idea.
Yes. Once we have it in place, non-browser parallelism (or thread-leveraging, or tab-distributed, or …) is just a matter of implementing the appropriate workers.
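For concreteness, a minimal sketch of such a self-balancing pool, assuming each worker exposes a `runTest( test )` method returning a Promise (names are illustrative, not existing QUnit API):

```js
// Each worker pulls the next test as soon as it finishes its current
// one, so the load balances itself without pre-partitioning.
async function runPool( queue, workers ) {
    async function drain( worker ) {
        let test;
        while ( ( test = queue.shift() ) !== undefined ) {
            await worker.runTest( test );
        }
    }
    await Promise.all( workers.map( drain ) );
}

// Sequential mode is then just the same pool with a single worker:
// runPool( tests, [ inProcessWorker ] );
```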
I think this would make sense for a lot of use cases. However, I think any use case involving DOM would require the iframe approach (which may not necessarily be a bad thing). I can't think of a good solution for allowing QUnit to communicate across multiple tabs or browser instances.

That said, my primary concern with this approach is determinism. Since we won't really have control over the order in which the workers become available, reproducing failures of non-atomic tests might become difficult.

A secondary concern with this is that it would preclude sharing state across tests. In general, I know that sharing state is bad, but I also know that there are certain patterns which could benefit from it. For instance, in the Ember community there has been talk of using a single instance of an application across tests for performance reasons.
Rare indeed is the test that requires a complete DOM as opposed to a container div, but any such examples are free to forgo parallelism, implement synchronization (e.g., a document-level semaphore), or utilize multiple windows as you suggest.
Right, opting in to parallel execution requires more discipline in test definition. But the sequential mode (i.e., single-worker) will always remain available.
I see no reason why that would be precluded.
That's definitely right, but I am not sure how practical it is. There will always be cases where the test suite fails, and reproducing the failure is the only way I am aware of to deterministically get closer to fixing the problem. Implementing a worker model that drains the queue would require some way of reproducing the same ordering (and possibly concurrency), which seems like an unnecessarily difficult problem to tackle. Implementing the original proposal would yield worse utilization (since not all tests take the same time to execute), but it would be a very simple model, possibly extensible with test weights to gain better efficiency.

I think the same model works well for Node projects too (I am not really developing production Node systems, so please correct me here). Multi-container CI setups could run different batches in different containers. There is definitely some extra work here, because somebody has to make sure that all processes running different batches pass in order for the whole suite to pass, but that's done by CI providers anyway (and trivial to implement).
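The pre-partitioned model described here is indeed only a few lines; an illustrative sketch (not existing QUnit API), where `batch` and `batches` would come from URL or CLI parameters:

```js
// Deterministically select this process's share of the suite. Because
// the split depends only on the indices, a failing batch can be
// re-run in isolation to reproduce the failure.
function selectBatch( tests, batch, batches ) {
    return tests.filter( ( test, i ) => i % batches === batch );
}

// e.g. the third of ten processes runs:
// const myTests = selectBatch( allTests, 2, 10 );
```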
Finally circling back to this. After rereading the discussion and thinking about it, I'm on board with the worker model. In order to deal with the reproducibility problem, we can simply track the order in which tests are executed by the various workers. We'll then report that in the results.

Additionally, we have now refactored the queue for tests in QUnit to make it much cleaner, which should make it much easier to implement more complex scheduling algorithms.

The primary remaining hurdle I see is to define an API for actually defining tests that can run in parallel and an API for defining "workers" that run the tests.

@gibson042 I know it's been a looong time since the last discussion on this, but if you had any ideas at the time of a potential API, I'd love to hear it.
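A sketch of that order tracking, using existing QUnit callbacks (assuming `testId` is exposed in the `testStart` details, as in recent QUnit versions):

```js
// Record the order in which tests actually start across workers, and
// report it at the end so a failing interleaving can be replayed as a
// fixed sequence with a single worker.
const executionOrder = [];

QUnit.testStart( function( details ) {
    executionOrder.push( details.testId );
} );

QUnit.done( function() {
    console.log( "Execution order:", executionOrder.join( "," ) );
} );
```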
That would work 😊
I've put together a gist with a proposal for the potential API: https://gist.github.com/trentmwillis/c8c9a8e1dcf85b9afa8fbfc4d8a4c5b1 Take a look and feel free to leave comments either on the gist or here.
Is it possible to load the JavaScript but not execute the tests, and communicate just test identifiers, not JS objects (say we have some unique identifier for each test)? That way the controlling process and all workers can each have their own independent copy of the objects; we can split with better granularity (per test instead of per file, for example) but avoid the hard problem of transferring JS objects around. (It is possible I am getting something completely wrong :))
Thank you for mentioning me. I have not intentionally abandoned QUnit, I just ran out of spare time to work on it (there are about 100 QUnit notifications in my inbox, waiting for me to get to "someday"), so as a result I really don't know how it has progressed over the past year.

Regardless, though, my general position on this point hasn't changed. I always imagined that there would be a single queue of tests instead of files, though I hadn't considered the challenges of sharing host objects. Still, I think it's possible... imagine something like this iframeWorker implementation, where the worker presented to QUnit exists in the correct realm and uses postMessage to manage the "true" worker (which loads QUnit just like the parent window, but then turns it off and retrieves/runs tests directly):

```js
let workerKey = QUnit.urlParams.iframeWorkerKey;
let controller = workerKey && window.parent;
if ( !controller ) {

    // Controller side. Bikesheddable name and interface (callback vs. promise).
    QUnit.registerWorkerFactory(() => new Promise(function(ready, fail) {
        const worker = {
            runTest(testId, assert) {
                const testPromise = new Promise(function(resolve, reject) {
                    testPromises.set(testId, { resolve, reject, assert });
                });
                const done = assert.async();
                workerWindow.postMessage({ key: workerKey, testId }, "*");
                return testPromise.then(
                    v => { done(); return v; },
                    v => { done(); throw v; }
                );
            }
        };
        let testPromises = new Map();
        let workerWindow;
        const workerKey = String(Math.random()).replace("0", window.performance.now());

        // setUrl: helper (not shown) returning the current page URL with
        // the given params added. The iframe must be attached to load.
        const iframe = document.createElement("iframe");
        iframe.src = setUrl({ iframeWorkerKey: workerKey });
        document.body.appendChild(iframe);

        window.addEventListener("message", function( evt ) {
            // Only process messages from our iframe.
            if ( !evt.data || typeof evt.data !== "object" || evt.data.key !== workerKey ) {
                return;
            }
            // The first message communicates worker initialization.
            if ( !workerWindow ) {
                workerWindow = evt.source;
                ready(worker);
                return;
            }
            // Check for errors.
            if ( evt.data.error ) {
                testPromises.get(evt.data.testId).reject(new Error(evt.data.error));
                return;
            }
            // Proxy assertions, sent as [ method, ...args ] tuples.
            if ( evt.data.assertion ) {
                const { assert } = testPromises.get(evt.data.testId);
                const [ method, ...args ] = evt.data.assertion;
                assert[method](...args);
                return;
            }
            // Conclude.
            testPromises.get(evt.data.testId)[evt.data.promiseAction](evt.data.result);
        });
    }));
    return;
}

// Worker side. Take advantage of a new (bikesheddable) interface to
// control QUnit in the worker iframe.
QUnit.convertToWorker();
window.addEventListener("message", function( evt ) {
    // Only process messages from our parent.
    if ( !evt.data || typeof evt.data !== "object" || evt.data.key !== workerKey ) {
        return;
    }
    // Take advantage of a new (bikesheddable) interface to get and run tests.
    const testId = evt.data.testId;
    const test = QUnit.getTest(testId);
    if ( !test ) {
        controller.postMessage({ key: workerKey, testId, error: "test not found" }, "*");
        return;
    }
    runTest(test, getAssertProxy(controller, workerKey, testId));
});
// Register ourselves (the message must carry our key so the
// controller's message filter accepts it).
controller.postMessage({ key: workerKey }, "*");
```
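The worker side above leaves `getAssertProxy` (and `runTest`) undefined; a minimal sketch of the proxy, assuming assertions travel as `[ method, ...args ]` tuples and that all arguments are structured-cloneable:

```js
// A stand-in "assert" whose calls are serialized back to the
// controller window rather than recorded locally.
function getAssertProxy( controller, key, testId ) {
    const proxy = {};
    for ( const method of [ "ok", "equal", "deepEqual", "strictEqual" ] ) {
        proxy[ method ] = ( ...args ) => {
            controller.postMessage( { key, testId, assertion: [ method, ...args ] }, "*" );
        };
    }
    return proxy;
}
```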
@mariokostelac yes. I hadn't really considered that approach, since it potentially means a large amount of overhead for large test suites, but it could be possible. It also aligns with the example @gibson042 just provided (thanks for that!). If we don't have to recreate state/context and can just duplicate it by loading it into each worker, then test-by-test parallelization should work. However, I see two additional hurdles:
The first hurdle I think is solvable, but I don't really see a path forward on the second point. Let me know what y'all think!
I think the answer is for worker plugin authors (i.e., "us") to scrape the relevant information by whatever means makes sense for the master–worker pair (maybe URL for page–page as in my example, DOM selection for page–WebWorker, CLI args for Node–Node, etc.). And it won't work for every single test, but that's OK because this is just an optimization... we will always have the host runner, and if necessary can add worker selection metadata to avoid sending certain tests into the parallel queue.

P.S. As for the client interface, I'd like to avoid new surface area like the `registerWorkerFactory` in my example.
Great points. We'll have to note any possible limitations and edge cases, but I think the approach outlined here will provide a better user experience than my suggestion above. I believe this gives us a good starting point for an initial implementation.
Wanted to follow up, as this is something that my team was interested in adding to QUnit. Is QUnit still interested in adding this as core functionality? Opting in via some flag seems like the most straightforward way of adding this without unintended consequences.
I think there is still interest in this. I prototyped it out at one point, but never got around to tidying it up. In my mind, if we do add this, then the default should essentially be running in parallel with 1 worker. I don't think there should be two different sets of logic.
@trentmwillis makes sense to me. Would you be free to go over an RFC I have to add this feature?
@gabrielcsapo feel free to post anything here. We don't really have a formal RFC process.
A rough idea for how a "qunit parallel node" plugin could work. (I'd prefer this be a plugin, but we will most likely need to add a few things in core to make it work, which we can track together under this ticket.) As an end-user, you'd enable the plugin when invoking the QUnit CLI.

In the main process, the plugin would use a new core hook to override the way tests are spawned. The signature of this hook could be that it is given something to run and then returns an EventEmitter. The default implementation would do what we do today, in-process, with the only difference being that we'd formalise the use of an EventEmitter between this and the main EventEmitter used by the current reporter(s). The plugin could then use the hook to instead spawn a subprocess with the same Node.js and QUnit paths and make that run the test(s) in question. The plugin could also add a special environment variable to the subprocess to help its copy in the child process understand that it is a child process, and thus set up an IPC channel with the main one to send the expected event emitter data. (A sketch of what this could look like follows below.)

In terms of granularity, I'm not sure what the right answer is: by file, or by moduleId/testId? By file seems the simplest. Doing so might not run at optimal speed, though, if some modules are much larger than others. On the other hand, it might actually be faster in practice, because it would mean the workers only need to process one test file, instead of executing all test files only to have the filter run a small subset of them.

If we go for splitting by file, then the core hook would be placed before the CLI loads files, and would default to just requiring the file. The plugin would then spawn the subprocess the same as the parent, but with an environment variable to tell its child that it is a child, and the files argument replaced with just the given one.

If we go for splitting by moduleId or testId, then the hook would be much later in the CLI process, around the time where we would normally start executing tests. The hook would then allow the plugin to decide whether to run the tests, e.g. based on random sampling, or hashing, or based on having organised the full set of registered tests into some buckets and determining whether or not the current process should run a given one. Whatever it wants to do. The plugin would then need to communicate its decision on what to run to its child, e.g. by sharing the hash through an env variable, or by sending a JSON list of module IDs or test IDs to the child process early on. (The child process's copy of the plugin would e.g. read this from stdin or some such and block until done, and then let the QUnit CLI do the rest, forwarding the events back up through IPC.)

Thoughts?
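A rough sketch of that main-process side; `QUnit.onSpawn`, the environment variable name, and the message shape are all assumptions, not existing QUnit API:

```js
const { fork } = require( "child_process" );
const EventEmitter = require( "events" );

// Hypothetical core hook: given a test file, return an emitter that
// relays the usual reporter events for it.
QUnit.onSpawn( function( file ) {
    const emitter = new EventEmitter();

    // Re-run the same CLI against just this file, flagging the child
    // so its copy of the plugin knows to forward events over IPC.
    const child = fork( process.argv[ 1 ], [ file ], {
        env: Object.assign( {}, process.env, { QUNIT_PARALLEL_CHILD: "1" } )
    } );

    // The child's plugin sends { event, data } tuples (runStart,
    // testEnd, runEnd, ...) over the IPC channel.
    child.on( "message", ( msg ) => emitter.emit( msg.event, msg.data ) );

    return emitter;
} );
```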
Recently, I've seen a lot of interest in and discussion on running JS tests in parallel. Most of these rely on faking concurrency through async behavior; while that is definitely an interesting approach, I am not convinced it is the proper solution for browser-based tests, which often rely on the `document` and other global/shared state. That said, I think finding a way to parallelize tests would be hugely beneficial, especially for larger code bases.

I'd like to propose that QUnit bake in support for splitting tests into groups at runtime. This would then allow users to parallelize test runs by using multiple, independent instances of their test page.
I imagine this could be based on URL params, like so: `?batches=10&batch=1`.
This would split the tests into 10 different batches and then run the first batch. You can then adjust the `batch` parameter to run the other groups in other browser instances/tabs.

As an initial implementation, this could be super simple and just break the tests up into equally sized batches. Then, as an enhancement, we could standardize a test output that can be fed back into QUnit to allow weighted splitting, to ensure the batches run in approximately equal amounts of time (see the sketch below).
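The weighted enhancement could be a simple greedy "longest processing time first" assignment, with per-test durations taken from a previous run's standardized output (an illustrative sketch; test IDs key into a `durations` map):

```js
// Assign the slowest remaining test to the currently lightest batch,
// which keeps total batch runtimes approximately equal.
function weightedBatches( tests, durations, batchCount ) {
    const batches = Array.from( { length: batchCount }, () => ( { total: 0, tests: [] } ) );
    const sorted = tests.slice().sort(
        ( a, b ) => ( durations[ b ] || 0 ) - ( durations[ a ] || 0 )
    );
    for ( const test of sorted ) {
        const lightest = batches.reduce( ( min, b ) => ( b.total < min.total ? b : min ) );
        lightest.tests.push( test );
        lightest.total += durations[ test ] || 0;
    }
    return batches;
}
```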
The primary benefit of this would be to allow parallelization, but in doing so it would also help identify non-atomic tests in the same way as the recently introduced test randomization feature does.
For full disclosure, this would benefit a recently implemented Testem feature to support running multiple test pages in parallel.