[Feature] Run tests in a random order #9297

Open
humbienri opened this issue Oct 4, 2021 · 22 comments
Labels
feature-test-runner Playwright test specific issues P3-collecting-feedback

Comments

@humbienri

humbienri commented Oct 4, 2021

For your test runner, playwright-test, please add the ability to run individual tests in random order. I consider this one of the true acid tests of a test automation suite and its overall architecture. It is hard to stay watchful for inadvertently built-in dependencies between tests, and a command-line switch such as --random would act as a gate against that without spending much human attention on it.

Thanks for the hard work you've been putting in so far.

@mxschmitt
Member

Linking #7040

@pavelfeldman
Member

This is unlikely to appear due to the reasons described in #7040. Leaving it open to collect voices. Once again, in every project I worked on, including Chromium, this has been enabled and then inevitably disabled. While great in theory, it never actually worked.

mxschmitt added the feature-test-runner (Playwright test specific issues) label Oct 5, 2021
@dimkin-eu

Maybe a simple solution: run them in reverse order? Not random, but dependent tests will still fail.

@humbienri
Author

@pavelfeldman, it never worked or it was never used? Or are you saying that when the tests were run in random order, there would be failures, and so folks would decide to just run them in order?

@pavelfeldman
Member

It is less about 'in order' and more about repeatability of the tests. The cost of non-determinism is just too high.

@Meemaw

Meemaw commented Apr 8, 2022

@pavelfeldman we have very uneven execution times between tests. Coupled with sharding, this results in some shards finishing in < 2 minutes and others taking 15 minutes.

This happens because most of the "expensive" tests are in the same file, which falls into the same shard.

Randomising/shuffling tests would partially mitigate this, and help us big time, unless you have any other ideas on how to solve this problem.

@Meemaw

Meemaw commented Apr 15, 2022

cc @mxschmitt in case you have any ideas how to tackle the ☝️ problem.

@Meemaw

Meemaw commented Apr 19, 2022

Wondering if you would be open to contributions for this feature? We'll have to add it somehow, and we really wouldn't want to fork playwright for a feature I think could be beneficial to others.

@zyulyaev

zyulyaev commented Jun 22, 2022

Given the fullyParallel option, which IMO contradicts your statement in #7040, I don't see any reason not to allow running tests in random order. We have a very similar issue to the one @Meemaw described. Our tests depend on feature flag values, and in some environments a lot of tests are skipped because the features are disabled. So the load is distributed extremely unevenly between the shards.

@zyulyaev

I created a patch using patch-package as a hotfix to this issue. Really hope to see this as a native feature though.

@AlexDaniel

> @pavelfeldman we have very uneven execution times between tests. Coupled with sharding, this results in some shards finishing in < 2 minutes and others taking 15 minutes.

Same. This is not strictly a problem with sharding, but with parallelization in general. Some tests can run longer than others, and not everything is strictly bound by CPU. To achieve the best wall clock time you have to take into consideration how fast/slow each test runs.

Back in the day, the prove tool for Perl 5 tests had a slow option to run slow tests first. This was based on the results of the previous run. I wish something like this existed in Playwright too; then it would be able to optimize both parallelization and sharding. Note that there are many options for handling the stats file: you can either commit it to the repo from time to time, or use CI features to cache it from the previous run.
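
A minimal sketch of the "slow tests first" idea, assuming the previous run saved a simple map of test title to duration. The `DurationStats` shape and the `slowestFirst` helper are hypothetical names for illustration; nothing here is part of Playwright's API, and loading the stats file from disk or a CI cache is left out.

```typescript
// Hypothetical shape of the stats saved by the previous run:
// test title -> duration in milliseconds.
type DurationStats = Record<string, number>;

// Order tests so that the slowest known tests start first.
// Assumption: tests with no recorded duration are treated as slowest,
// so a new and possibly expensive test is not left for the very end.
function slowestFirst(tests: readonly string[], stats: DurationStats): string[] {
  const durationOf = (title: string): number =>
    stats[title] ?? Number.MAX_SAFE_INTEGER;
  // Array.prototype.sort is stable, so ties keep their original order.
  return [...tests].sort((a, b) => durationOf(b) - durationOf(a));
}

const previousRun: DurationStats = { login: 1200, checkout: 45000, search: 8000 };
console.log(slowestFirst(["login", "checkout", "search", "profile"], previousRun));
```

Starting the longest tests first tends to shorten wall clock time because short tests can fill the gaps left on other workers near the end of the run.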

@Meemaw

Meemaw commented Jul 5, 2022

> I created a patch using patch-package as a hotfix to this issue. Really hope to see this as a native feature though.

@zyulyaev does this also shuffle tests from within a single file across multiple shards? We have most of our "expensive" tests in a single file, and when we tried patching Playwright to shuffle the tests, could only get it shuffled on the group/file level.

@zyulyaev

zyulyaev commented Jul 11, 2022

> > I created a patch using patch-package as a hotfix to this issue. Really hope to see this as a native feature though.
>
> @zyulyaev does this also shuffle tests from within a single file across multiple shards? We have most of our "expensive" tests in a single file, and when we tried patching Playwright to shuffle the tests, could only get it shuffled on the group/file level.

@Meemaw I believe so. We have a very similar issue to yours, and we've been successfully using this patch for several weeks now.

@conanbatt

Pure randomness to distribute slow tests seems like the wrong solution: it makes the original issue less likely but still possible, and thus frequent.

There is always a trade-off between pure entropy and load-balancing by test length. For speed you would be absolutely deterministic, relying on information from previous runs. Something in the middle would be to tag tests as slow, or with different priorities, so there are multiple buckets.
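
The fully deterministic end of that trade-off could look like the sketch below: a greedy "longest test first" assignment that always gives the next slowest test to the least loaded shard. It assumes durations from a previous run are available; `TimedTest`, `Shard` and `balanceShards` are hypothetical names, not Playwright APIs.

```typescript
interface TimedTest {
  title: string;
  durationMs: number; // assumed to come from a previous run
}

interface Shard {
  tests: string[];
  totalMs: number;
}

// Greedy balancing: sort by duration descending, then repeatedly hand the
// next test to whichever shard currently has the smallest total.
function balanceShards(tests: readonly TimedTest[], shardCount: number): Shard[] {
  if (!Number.isInteger(shardCount) || shardCount < 1) {
    throw new RangeError("shardCount must be a positive integer");
  }
  const shards: Shard[] = Array.from({ length: shardCount }, () => ({
    tests: [] as string[],
    totalMs: 0,
  }));
  const bySlowest = [...tests].sort((a, b) => b.durationMs - a.durationMs);
  for (const test of bySlowest) {
    // On a tie the earlier shard wins, which keeps the result deterministic.
    const target = shards.reduce((least, s) => (s.totalMs < least.totalMs ? s : least));
    target.tests.push(test.title);
    target.totalMs += test.durationMs;
  }
  return shards;
}

const timings: TimedTest[] = [
  { title: "export report", durationMs: 600 },
  { title: "checkout", durationMs: 500 },
  { title: "search", durationMs: 300 },
  { title: "login", durationMs: 200 },
  { title: "profile", durationMs: 100 },
  { title: "logout", durationMs: 100 },
];
console.log(balanceShards(timings, 2));
```

The greedy approach is not guaranteed to be optimal, but it is simple and gives the same assignment for the same input, which keeps runs repeatable.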

@BrianEdwardHoover

I'd love to see the ability to selectively enable randomness via a CLI flag. Please reconsider this feature.

@jeffcasavant

Regarding the determinism/repeatability issue:

It totally makes sense why folks end up turning it off again. If you have a failure flushed out by running the tests in random order, you're unlikely to see it again because the order will be different next time the tests run.

So there's not much value add unless you can make it repeatable. I think that's solvable.

What if we did the following:

  • Use a deterministic random number generator
  • In the normal case when running in a randomized order use a random seed, but print it out
  • In the rerun case, allow the user to pass in the seed so they can get the same test order again and reproduce the failure
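
The three steps above can be sketched as follows. This is only an illustration under stated assumptions: `mulberry32` is a small, well-known seedable generator swapped in for the purpose, and `seededShuffle` and `resolveSeed` are hypothetical helper names. None of this is an existing Playwright option or API.

```typescript
// mulberry32: a small deterministic 32-bit PRNG. The same seed always
// produces the same sequence of numbers in [0, 1).
function mulberry32(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Fisher-Yates shuffle driven by the seeded generator. Returns a new array.
function seededShuffle<T>(items: readonly T[], seed: number): T[] {
  const random = mulberry32(seed);
  const result = [...items];
  for (let i = result.length - 1; i > 0; i--) {
    const j = Math.floor(random() * (i + 1));
    [result[i], result[j]] = [result[j], result[i]];
  }
  return result;
}

// Normal case: pick a random seed and print it.
// Rerun case: the user passes the printed seed back in.
function resolveSeed(userSeed?: number): number {
  const seed = userSeed ?? Math.floor(Math.random() * 0x100000000);
  console.log(`Randomized with seed ${seed}`);
  return seed;
}

const testTitles = ["login", "checkout", "search", "profile", "logout"];
const runSeed = resolveSeed();
console.log(seededShuffle(testTitles, runSeed));
```

Rerunning with `resolveSeed(<printed seed>)` yields the same order again, which is what makes an order-dependent failure reproducible.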

@lforst

lforst commented Apr 26, 2023

I think this would be really cool. An option to provide a seed would be really pragmatic and really useful.

In our case, we have around 7000 tests, many of which are skipped in many configurations, so certain shards run all of their tests while others run only a very small number. We would like to even out this imbalance a bit.

Slow shard: https://github.com/getsentry/sentry-javascript/actions/runs/4786799183/jobs/8535829572?pr=7934#step:10:58
Fast shard: https://github.com/getsentry/sentry-javascript/actions/runs/4786799183/jobs/8535831595?pr=7934#step:10:58

@ltsuda
Contributor

ltsuda commented Apr 26, 2023

I don't use playwright for work related projects so I'm just adding these pytest plugins to add more information/context and see if it helps or something.

https://github.com/pytest-dev/pytest-randomly
https://github.com/pytest-dev/pytest-order

@xyrilj

xyrilj commented Oct 24, 2023

This would be very helpful. In a similar boat as @Meemaw

@john-griffin

RSpec is good at this. It generates a random seed for the test order on each run and prints it. You can then pass that seed in when rerunning to reproduce the same order.

https://rspec.info/features/3-12/rspec-core/command-line/randomization/

@PsiKai

PsiKai commented Feb 21, 2024

Plus one to the deterministic random number generator based on a seed, also because several other testing frameworks already offer this type of feature.

@muhqu
Contributor

muhqu commented May 15, 2024

I've raised a PR to add a --sharding-seed parameter to randomly distribute test groups to shards in a deterministic way. #30817
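
One possible way to distribute test groups across shards deterministically from a seed is sketched below. This is not taken from the PR, whose implementation may differ; `fnv1a` and `assignShards` are hypothetical helpers, and the hash runs over UTF-16 code units for simplicity.

```typescript
// FNV-1a style 32-bit hash over UTF-16 code units (illustrative only).
function fnv1a(text: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash >>> 0;
}

// Order groups by a hash of "seed:group", then deal them out round-robin.
// The same seed and the same groups always give the same assignment.
function assignShards(
  groups: readonly string[],
  shardCount: number,
  seed: string,
): string[][] {
  if (!Number.isInteger(shardCount) || shardCount < 1) {
    throw new RangeError("shardCount must be a positive integer");
  }
  const shards: string[][] = Array.from({ length: shardCount }, () => [] as string[]);
  const key = (group: string): number => fnv1a(`${seed}:${group}`);
  const ordered = [...groups].sort(
    // Fall back to the group name so hash collisions still sort consistently.
    (a, b) => key(a) - key(b) || (a < b ? -1 : a > b ? 1 : 0),
  );
  ordered.forEach((group, index) => {
    shards[index % shardCount].push(group);
  });
  return shards;
}

const specFiles = ["auth.spec.ts", "cart.spec.ts", "search.spec.ts", "admin.spec.ts"];
console.log(assignShards(specFiles, 2, "my-seed"));
```

Changing the seed reshuffles which groups land together, while keeping it fixed makes every CI rerun use the same split.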
