[Feature] Run tests in a random order #9297
Comments
Linking #7040 |
This is unlikely to appear due to the reasons described in #7040. Leaving it open to collect voices. Once again, in every project I worked on, including Chromium, this has been enabled and then inevitably disabled. While great in theory, it never actually worked. |
Maybe a simple solution: run them in reverse order? It's not random, but dependent tests would still fail. |
@pavelfeldman, it never worked or it was never used? Or are you saying that when the tests were run in random order, there would be failures, and so folks would decide to just run them in order? |
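To illustrate the reverse-order idea, here is a minimal sketch, not tied to Playwright's runner. The names `OrderTest`, `runInOrder`, and the two toy tests are hypothetical; the second test silently relies on state written by the first.

```typescript
// Hypothetical toy suite: "reads user" has a hidden dependency on "creates user".
type SharedState = Map<string, string>;

interface OrderTest {
  name: string;
  run: (state: SharedState) => boolean; // returns true when the test passes
}

const suite: OrderTest[] = [
  {
    name: "creates user",
    run: (state) => {
      state.set("user", "alice");
      return true;
    },
  },
  {
    name: "reads user",
    run: (state) => state.get("user") === "alice",
  },
];

// Runs the tests in the given order against fresh shared state
// and returns the names of the tests that failed.
function runInOrder(tests: readonly OrderTest[]): string[] {
  const state: SharedState = new Map();
  const failures: string[] = [];
  for (const test of tests) {
    if (!test.run(state)) {
      failures.push(test.name);
    }
  }
  return failures;
}

const forwardFailures = runInOrder(suite);
const reverseFailures = runInOrder([...suite].reverse());
console.log({ forwardFailures, reverseFailures });
```

Because the reversed order is fixed, a failure found this way reproduces on every run, which sidesteps the repeatability concern raised in this thread.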
It is less about 'in order' and more about repeatability of the tests. The cost of non-determinism is just too high. |
@pavelfeldman we have very uneven execution times between tests. Coupled with sharding, this results in some shards finishing in < 2 minutes and others taking 15 minutes. This happens because most of the "expensive" tests are in the same file, which falls into a single shard. Randomising/shuffling tests would partially mitigate this and help us big time, unless you have any other ideas on how to solve this problem. |
cc @mxschmitt in case you have any ideas how to tackle ☝️ problem. |
Wondering if you would be open to contributions for this feature? We'll have to add it somehow, and we really wouldn't want to fork playwright for a feature I think could be beneficial to others. |
Given |
I created a patch using patch-package as a hotfix to this issue. Really hope to see this as a native feature though. |
Same. This is not strictly a problem with sharding, but with parallelization in general. Some tests can run longer than others, and not everything is strictly bound by CPU. To achieve the best wall clock time you have to take into consideration how fast/slow each test runs. Back in the day, |
@zyulyaev does this also shuffle tests from within a single file across multiple shards? We have most of our "expensive" tests in a single file, and when we tried patching Playwright to shuffle the tests, could only get it shuffled on the group/file level. |
@Meemaw I believe so. We have a very similar issue to yours, and we've been successfully using this patch for several weeks now. |
Pure randomness to distribute slow tests seems like the wrong solution: it makes the original issue less likely but still possible, and thus still frequent. There is always a trade-off between pure entropy and load-balancing by test length. For speed you would be absolutely deterministic, relying on timing information from previous runs. Something in the middle would be to tag tests as slow, or to assign different priorities, so that there are multiple buckets. |
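As a rough sketch of the deterministic end of that trade-off: assuming per-test durations from a previous run are available, a greedy "longest test first" assignment puts each test on the currently lightest shard. `TimedTest`, `Shard`, and `balanceShards` are hypothetical names, and this is not how Playwright assigns shards.

```typescript
// Hypothetical types; durations are assumed to come from an earlier run's report.
interface TimedTest {
  name: string;
  seconds: number;
}

interface Shard {
  tests: TimedTest[];
  totalSeconds: number;
}

// Greedy longest-processing-time-first assignment:
// sort tests by duration (descending), then give each one to the lightest shard.
function balanceShards(tests: readonly TimedTest[], shardCount: number): Shard[] {
  if (!Number.isInteger(shardCount) || shardCount < 1) {
    throw new RangeError("shardCount must be a positive integer");
  }
  const shards: Shard[] = Array.from(
    { length: shardCount },
    (): Shard => ({ tests: [], totalSeconds: 0 }),
  );
  // Tie-break on name so the result is stable from run to run.
  const sorted = [...tests].sort(
    (a, b) => b.seconds - a.seconds || a.name.localeCompare(b.name),
  );
  for (const test of sorted) {
    let lightest = shards[0] as Shard;
    for (const shard of shards) {
      if (shard.totalSeconds < lightest.totalSeconds) {
        lightest = shard;
      }
    }
    lightest.tests.push(test);
    lightest.totalSeconds += test.seconds;
  }
  return shards;
}
```

The result is fully repeatable for a given set of timings. The greedy rule is a heuristic, so it does not always find the best possible split, and it only helps if the recorded durations stay roughly representative.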
I'd love to see the ability to selectively enable randomness via a cli flag. Please reconsider this feature. |
Regarding the determinism/repeatability issue: It totally makes sense why folks end up turning it off again. If you have a failure flushed out by running the tests in random order, you're unlikely to see it again because the order will be different next time the tests run. So there's not much value add unless you can make it repeatable. I think that's solvable. What if we did the following:
|
I think this would be really cool. An option to provide a seed would be really pragmatic and really useful. In our case, we have around 7000 tests where a lot of tests are skipped for a lot of configurations and certain shards will run all tests and other shards run only a very small number. We would like to distribute this imbalance a bit. Slow shard: https://github.com/getsentry/sentry-javascript/actions/runs/4786799183/jobs/8535829572?pr=7934#step:10:58 |
I don't use playwright for work related projects so I'm just adding these pytest plugins to add more information/context and see if it helps or something. https://github.com/pytest-dev/pytest-randomly |
This would be very helpful. In a similar boat as @Meemaw |
Rspec is good at this. It generates a random seed for each test run order and outputs it. You can then plug the seed in when rerunning to reproduce the sort order. https://rspec.info/features/3-12/rspec-core/command-line/randomization/ |
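A minimal sketch of that seed-and-replay approach, assuming nothing about Playwright's internals: `mulberry32`, `seededShuffle`, and `pickSeed` are illustrative helpers rather than an existing API. The shuffle is a Fisher-Yates pass driven by a small seeded generator, so the same seed always yields the same order.

```typescript
// Illustrative helpers only; none of these exist in Playwright.

// Small deterministic 32-bit PRNG (mulberry32 style) returning floats in [0, 1).
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Fisher-Yates shuffle on a copy of the input, driven by the seeded generator.
function seededShuffle<T>(items: readonly T[], seed: number): T[] {
  const result = [...items];
  const random = mulberry32(seed);
  for (let i = result.length - 1; i > 0; i--) {
    const j = Math.floor(random() * (i + 1));
    const tmp = result[i] as T;
    result[i] = result[j] as T;
    result[j] = tmp;
  }
  return result;
}

// Use the requested seed if given, otherwise generate one to print for later replay.
function pickSeed(requested?: number): number {
  return requested ?? Math.floor(Math.random() * 0x100000000);
}

const seed = pickSeed(); // pass a number, e.g. pickSeed(1234), to replay a run
console.log(`Randomized with seed ${seed}`);
console.log(seededShuffle(["login", "checkout", "search", "profile"], seed));
```

Printing the seed on every run is what makes a failure reproducible: rerunning with the printed value restores the exact order that exposed it.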
Plus one to the deterministic random number generator based on a seed. Also, for the fact that several other testing frameworks offer this type of feature. |
I've raised a PR to add a |
For your test runner, playwright-test, please add the ability to run the individual tests in a random order. I consider this one of the true acid tests of how well a test automation suite and its overall architecture hold up. It can be rather hard to stay watchful for dependencies inadvertently built in between tests, and being able to specify a command line switch such as --random would be a gate to prevent that without spending a whole lot of eyeball time on it. Thanks for the hard work you've been putting in so far.