[Feature] Run tests in a random order #9297

humbienri · 2021-10-04T16:40:15Z

For your test runner, playwright-test, please add the ability to run individual tests in random order. I consider this one of the true acid tests of how well a test automation suite and its overall architecture hold up. It is hard to stay watchful for dependencies inadvertently built in between tests; a command-line switch such as --random would guard against that without spending a whole lot of eyeball time on it.

Thanks for the hard work you've been putting in so far.


mxschmitt · 2021-10-04T16:41:35Z

Linking #7040

pavelfeldman · 2021-10-05T03:14:44Z

This is unlikely to appear due to the reasons described in #7040. Leaving it open to collect voices. Once again, in every project I worked on, including Chromium, this has been enabled and then inevitably disabled. While great in theory, it never actually worked.

dimkin-eu · 2021-10-07T13:40:22Z

Maybe a simple solution: run them in reverse order? It's not random, but dependent tests will still fail.

humbienri · 2021-10-20T22:34:25Z

@pavelfeldman, it never worked or it was never used? Or are you saying that when the tests were run in random order, there would be failures and so folks would decide to just run them in order?

pavelfeldman · 2021-10-20T22:41:18Z

It is less about 'in order' and more about repeatability of the tests. The cost of non-determinism is just too high.

Meemaw · 2022-04-08T02:08:32Z

@pavelfeldman we have very uneven execution times between tests. Coupled with sharding, this results in some shards finishing in < 2 minutes and others taking 15 minutes.

This happens because most of the "expensive" tests are in the same file, which falls into the same shard.

Randomising/shuffling tests would partially mitigate this, and help us big time, unless you have any other ideas on how to solve this problem.

Meemaw · 2022-04-15T20:13:58Z

cc @mxschmitt in case you have any ideas on how to tackle the ☝️ problem.

Meemaw · 2022-04-19T17:33:21Z

Wondering if you would be open to contributions for this feature? We'll have to add it somehow, and we really wouldn't want to fork playwright for a feature I think could be beneficial to others.

zyulyaev · 2022-06-22T21:12:13Z

Given the fullyParallel option, which IMO contradicts your statement in #7040, I don't see any reason not to allow running tests in random order. We have a very similar issue to the one @Meemaw described. Our tests depend on feature flag values, and in some environments a lot of tests are skipped because the features are disabled. So the load is distributed extremely unevenly between the shards.

zyulyaev · 2022-06-23T12:42:17Z

I created a patch using patch-package as a hotfix to this issue. Really hope to see this as a native feature though.

AlexDaniel · 2022-07-04T16:07:15Z

> @pavelfeldman we have very uneven execution times between tests. Coupled with sharding, this results in some shards finishing in < 2 minutes and others taking 15 minutes.

Same. This is not strictly a problem with sharding, but with parallelization in general. Some tests can run longer than others, and not everything is strictly bound by CPU. To achieve the best wall clock time you have to take into consideration how fast/slow each test runs.

Back in the day, the prove tool for Perl 5 tests had a "slow" option to run slow tests first. This was based on the results of the previous run. I wish something like this existed in Playwright too; then it could optimize both the parallelization and the sharding. Note that there are many options for handling the stats file – you can either commit it to the repo from time to time, or use CI features to cache it from the previous run.
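
To illustrate the "slow tests first" idea, here is a minimal sketch of a greedy assignment that uses durations recorded in a previous run. All names (`Shard`, `assignSlowFirst`, `previousRun`) and the default duration for tests without stats are assumptions made for this example; nothing here is a Playwright API or the way prove implements it.

```typescript
// Hypothetical types and names for illustration; not a Playwright API.
interface Shard {
  tests: string[];
  totalMs: number;
}

// Greedy "slowest first" assignment: sort tests by their previously recorded
// duration (slowest first) and give each one to the currently lightest shard.
// Assumes shardCount >= 1.
function assignSlowFirst(
  tests: readonly string[],
  durationsMs: Record<string, number>,
  shardCount: number,
  defaultMs = 1000, // assumed duration for tests with no recorded stats
): Shard[] {
  const shards: Shard[] = Array.from(
    { length: shardCount },
    (): Shard => ({ tests: [], totalMs: 0 }),
  );
  const cost = (name: string): number => durationsMs[name] ?? defaultMs;
  const slowestFirst = [...tests].sort((a, b) => cost(b) - cost(a));

  for (const name of slowestFirst) {
    // Find the shard with the smallest accumulated duration so far.
    let lightest = shards[0];
    for (const shard of shards) {
      if (shard.totalMs < lightest.totalMs) lightest = shard;
    }
    lightest.tests.push(name);
    lightest.totalMs += cost(name);
  }
  return shards;
}

// Made-up durations standing in for a stats file from the previous run.
const previousRun: Record<string, number> = {
  "checkout.spec": 900000,
  "login.spec": 60000,
  "search.spec": 45000,
  "profile.spec": 30000,
};
console.log(assignSlowFirst(Object.keys(previousRun), previousRun, 2));
```

With these made-up numbers the one expensive file ends up alone on a shard while the three cheap ones share the other. The result is fully deterministic for a given stats file, which sidesteps the repeatability concern raised earlier in the thread.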

Meemaw · 2022-07-05T07:45:25Z

> I created a patch using patch-package as a hotfix to this issue. Really hope to see this as a native feature though.

@zyulyaev does this also shuffle tests from within a single file across multiple shards? We have most of our "expensive" tests in a single file, and when we tried patching Playwright to shuffle the tests, could only get it shuffled on the group/file level.

zyulyaev · 2022-07-11T09:35:17Z

> > I created a patch using patch-package as a hotfix to this issue. Really hope to see this as a native feature though.
>
> @zyulyaev does this also shuffle tests from within a single file across multiple shards? We have most of our "expensive" tests in a single file, and when we tried patching Playwright to shuffle the tests, could only get it shuffled on the group/file level.

@Meemaw I believe so. We have a very similar issue to yours, and we've been successfully using this patch for several weeks now.

conanbatt · 2022-08-10T21:48:06Z

Pure randomness to distribute slow tests seems like the wrong solution: it makes the original issue less likely but still possible, and thus frequent.

There is always a trade-off between pure entropy and load-balancing the test length. For speed you would be absolutely deterministic, relying on information from previous runs. Something in the middle would be to tag tests as slow, or with different priorities, so there are multiple buckets.

BrianEdwardHoover · 2022-08-30T14:16:58Z

I'd love to see the ability to selectively enable randomness via a cli flag. Please reconsider this feature.

jeffcasavant · 2022-08-30T14:24:17Z

Regarding the determinism/repeatability issue:

It totally makes sense why folks end up turning it off again. If you have a failure flushed out by running the tests in random order, you're unlikely to see it again because the order will be different next time the tests run.

So there's not much value add unless you can make it repeatable. I think that's solvable.

What if we did the following:

1. Use a deterministic random number generator
2. In the normal case when running in a randomized order use a random seed, but print it out
3. In the rerun case, allow the user to pass in the seed so they can get the same test order again and reproduce the failure
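
The three steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: `mulberry32` is a small, well-known public-domain PRNG chosen here for brevity, and `shuffleWithSeed` and `planRun` are hypothetical helper names, not anything Playwright provides.

```typescript
// Hypothetical helpers for illustration; not part of Playwright.

// mulberry32: a small deterministic PRNG returning floats in [0, 1).
function mulberry32(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Fisher-Yates shuffle driven by the seeded generator; returns a new array.
function shuffleWithSeed<T>(items: readonly T[], seed: number): T[] {
  const rand = mulberry32(seed);
  const result = [...items];
  for (let i = result.length - 1; i > 0; i--) {
    const j = Math.floor(rand() * (i + 1));
    const tmp = result[i];
    result[i] = result[j];
    result[j] = tmp;
  }
  return result;
}

// Use the caller's seed when rerunning; otherwise pick one and report it.
function planRun(
  tests: readonly string[],
  seed?: number,
): { seed: number; order: string[] } {
  const usedSeed = seed ?? Math.floor(Math.random() * 0x100000000);
  return { seed: usedSeed, order: shuffleWithSeed(tests, usedSeed) };
}

const tests = ["login", "checkout", "search", "profile"];

// Normal case: a fresh seed is chosen and printed alongside the order.
const firstRun = planRun(tests);
console.log(`Randomized with seed ${firstRun.seed}:`, firstRun.order);

// Rerun case: passing the printed seed reproduces the same order.
const rerun = planRun(tests, firstRun.seed);
console.log("Rerun order:", rerun.order);
```

The important property is that the order depends only on the test list and the seed, so a failure flushed out by one ordering can be replayed as long as the seed was logged.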

lforst · 2023-04-26T07:41:59Z

I think this would be really cool. An option to provide a seed would be really pragmatic and really useful.

In our case, we have around 7000 tests, many of which are skipped in many configurations. As a result, certain shards run all of their tests while other shards run only a very small number. We would like to even out this imbalance a bit.

Slow shard: https://github.com/getsentry/sentry-javascript/actions/runs/4786799183/jobs/8535829572?pr=7934#step:10:58
Fast shard: https://github.com/getsentry/sentry-javascript/actions/runs/4786799183/jobs/8535831595?pr=7934#step:10:58

ltsuda · 2023-04-26T12:17:55Z

I don't use playwright for work related projects so I'm just adding these pytest plugins to add more information/context and see if it helps or something.

https://github.com/pytest-dev/pytest-randomly
https://github.com/pytest-dev/pytest-order

xyrilj · 2023-10-24T18:52:49Z

This would be very helpful. In a similar boat as @Meemaw

john-griffin · 2023-12-24T21:33:42Z

RSpec is good at this. It generates a random seed for each test run order and outputs it. You can then plug the seed in when rerunning to reproduce the sort order.

https://rspec.info/features/3-12/rspec-core/command-line/randomization/

PsiKai · 2024-02-21T16:22:17Z

Plus one to the deterministic random number generator based on a seed, especially since several other testing frameworks offer this type of feature.

muhqu · 2024-05-15T09:18:17Z

I've raised a PR to add a --sharding-seed parameter to randomly distribute test groups to shards in a deterministic way. #30817
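
The thread does not show how the PR implements this, so the following is only one possible way to get a seed-dependent yet deterministic mapping from test groups to shards: hash the seed together with the group name. The helper names (`hashString`, `shardFor`, `distributeGroups`) and the FNV-1a-style hash over UTF-16 code units are assumptions made for this sketch.

```typescript
// Hypothetical helpers for illustration; not the implementation from the PR.

// FNV-1a-style 32-bit hash computed over the string's UTF-16 code units.
function hashString(text: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash >>> 0;
}

// Map a group to a shard index in [0, shardCount) based on seed and name.
function shardFor(group: string, seed: string, shardCount: number): number {
  return hashString(`${seed}:${group}`) % shardCount;
}

// Build the per-shard lists; every group lands in exactly one shard.
function distributeGroups(
  groups: readonly string[],
  seed: string,
  shardCount: number,
): string[][] {
  const shards: string[][] = Array.from({ length: shardCount }, (): string[] => []);
  for (const group of groups) {
    shards[shardFor(group, seed, shardCount)].push(group);
  }
  return shards;
}

const groups = ["a.spec.ts", "b.spec.ts", "c.spec.ts", "d.spec.ts", "e.spec.ts"];
console.log(distributeGroups(groups, "seed-1", 2));
console.log(distributeGroups(groups, "seed-2", 2));
```

Each shard can compute its own list independently from the same seed, with no coordination between CI jobs. The trade-off is that a hash-based split does not guarantee equal group counts per shard; it only breaks the link between file order and shard membership.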

pavelfeldman added the P3-collecting-feedback label Oct 5, 2021

mxschmitt added the feature-test-runner Playwright test specific issues label Oct 5, 2021

mxschmitt mentioned this issue Mar 4, 2022

[Feature] Randomise test execution order #12512

Closed
