[Feature] Split shards via test timing data #17969

Open
ffluk3 opened this issue Oct 10, 2022 · 5 comments

Comments

@ffluk3
Contributor

ffluk3 commented Oct 10, 2022

Playwright has some great default behavior around sharding tests across multiple workers. It would be super helpful if sharding could take in a --timing-data file, similar to CircleCI's test-splitting logic, so that Playwright can internally split tests along timing boundaries. This would let a run complete as fast as possible by splitting tests more evenly, by duration, across multiple machines or workers.

@michaelhays

This is the one thing that has made me hesitate to switch from Cypress to Playwright, despite the many advantages.

Cypress Cloud solves this with their Smart Orchestration (specifically, the "Load Balancing" strategy).

@kp-abhishek-agrawal

This is one of the major limitations of Playwright.

@PsiKai

PsiKai commented Nov 27, 2023

I would love to see this implemented in Playwright. It is a fairly common feature among other test platforms and tooling.

@ofirpardo-artlist

I'd also love to see this feature. I thought of creating a PR with the following additions:

  • Add a '--timing-file' flag, e.g. --timing-file="report.json"
  • The file will have the duration and ID of each test
  • Based on the durations, we can build sets whose total durations are as close to each other as possible, then assign tests to groups by test ID.
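
One way to build such sets is a greedy "longest test first" pass: sort tests by duration, then keep handing the next test to whichever shard currently has the least work. The sketch below is only an illustration of that idea, not the approach used in any PR and not Playwright API; `TestTiming`, `Shard`, and `balanceByDuration` are hypothetical names, and the timing entries are assumed to have already been read from the timing file.

```typescript
// Hypothetical helper, not part of Playwright: greedy "longest first" balancing.
interface TestTiming {
  id: string;         // test ID as recorded in the timing file
  durationMs: number; // duration taken from a previous run
}

interface Shard {
  totalMs: number;    // sum of the durations assigned so far
  testIds: string[];  // IDs of the tests assigned to this shard
}

function balanceByDuration(timings: TestTiming[], shardCount: number): Shard[] {
  if (!Number.isInteger(shardCount) || shardCount < 1) {
    throw new Error("shardCount must be a positive integer");
  }
  const shards: Shard[] = Array.from(
    { length: shardCount },
    (): Shard => ({ totalMs: 0, testIds: [] }),
  );
  // Place the slowest tests first so the short ones can fill the gaps.
  const sorted = [...timings].sort((a, b) => b.durationMs - a.durationMs);
  for (const timing of sorted) {
    // Pick the shard with the smallest total so far (first one wins on ties).
    const lightest = shards.reduce((min, s) => (s.totalMs < min.totalMs ? s : min));
    lightest.testIds.push(timing.id);
    lightest.totalMs += timing.durationMs;
  }
  return shards;
}
```

This greedy pass does not guarantee the best possible split, but it is simple and tends to land close when there are many more tests than shards.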

So far I've tried using the JSON reporter and extracting this data from it, which seems to work, but it's a bit messy because of the 'infinite' number of nested suites that you need to go over. I'm not sure whether I'd want to generate a separate, smaller timing file from the JSON report, or just use the JSON report file and extract the data from it in the Playwright runner.
I've also thought of creating a completely new reporter which would only include the test name, test ID, and duration.
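
For illustration, the nested suites can be flattened with a small stack-based walk. This is a rough sketch under assumptions: the `Report*` interfaces below are a minimal, hand-written guess at the parts of the JSON report that matter here (suites containing specs and further suites, each spec carrying an `id` and per-run `duration` values), so check them against a real report before relying on them. `TimingEntry` and `collectTimings` are hypothetical names.

```typescript
// Minimal shapes ASSUMED for the JSON report; only the fields used below.
interface ReportResult { duration: number }
interface ReportTest { results: ReportResult[] }
interface ReportSpec { id: string; title: string; tests: ReportTest[] }
interface ReportSuite { title: string; specs?: ReportSpec[]; suites?: ReportSuite[] }
interface Report { suites: ReportSuite[] }

// The smaller "name, ID, duration" record mentioned above.
interface TimingEntry { id: string; title: string; durationMs: number }

function collectTimings(report: Report): TimingEntry[] {
  const entries: TimingEntry[] = [];
  // Explicit stack instead of recursion, so nesting depth does not matter.
  const stack: ReportSuite[] = [...report.suites];
  while (stack.length > 0) {
    const suite = stack.pop();
    if (suite === undefined) break;
    for (const spec of suite.specs ?? []) {
      // Sum every recorded result for this spec (this includes retries).
      let durationMs = 0;
      for (const test of spec.tests) {
        for (const result of test.results) durationMs += result.duration;
      }
      entries.push({ id: spec.id, title: spec.title, durationMs });
    }
    // Queue the child suites for the same treatment.
    stack.push(...(suite.suites ?? []));
  }
  return entries;
}
```

Summing all results is a design choice; taking only the last result, or an average over several runs, may predict future durations better.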

Based on my testing so far, we are able to get a pretty good balance for any number of shards.
I believe more details are required, but please let me know whether this is something that could be implemented; otherwise I'll just create a custom external solution instead.

Thanks.

@ofirpardo-artlist

ofirpardo-artlist commented Apr 8, 2024

A small-scale example. Current implementation:
[3 images not preserved]

New balance based on the JSON report:
[2 images not preserved]

In most cases, since the shards run in parallel, what matters is when the last shard finishes running. Having 1 shard at 35 seconds and 1 shard at 1.5 minutes is as good as having 2 shards running for 1.5 minutes.
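
To make that arithmetic concrete, here is a tiny sketch using the numbers above (35 seconds and 1.5 minutes, i.e. 90 seconds). The function names `wallClockSeconds` and `machineSeconds` are made up for this example.

```typescript
// Wall-clock time of a parallel sharded run: the slowest shard decides.
function wallClockSeconds(shardSeconds: number[]): number {
  return shardSeconds.length === 0 ? 0 : Math.max(...shardSeconds);
}

// Total machine time spent across all shards.
function machineSeconds(shardSeconds: number[]): number {
  return shardSeconds.reduce((sum, s) => sum + s, 0);
}
```

With these, `wallClockSeconds([35, 90])` and `wallClockSeconds([90, 90])` both give 90, even though `machineSeconds` differs (125 versus 180). The goal of balancing is therefore to lower the slowest shard, not to make every shard take equally long.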

How the new shard balancing works:

  • You point to a JSON report file
  • From the report, we extract the duration of each test by ID
  • Based on the total number of shards, we map the tests by duration
  • Test groups are created from the new mapping
  • Any new tests that are not recorded in the report are split evenly, as in the current implementation

This currently works for any number of shards.
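
The last step in the list, handling tests with no recorded duration, could look roughly like the sketch below. It assumes the timed tests have already been grouped per shard and uses plain round-robin as a stand-in for "split evenly"; this is not Playwright's actual splitting logic, and `addUnrecordedTests` is a hypothetical name.

```typescript
// Hypothetical fallback: spread tests that have no recorded duration
// across the already-balanced groups, one at a time (round-robin).
function addUnrecordedTests(groups: string[][], unrecordedIds: string[]): string[][] {
  if (groups.length === 0) {
    throw new Error("at least one shard group is required");
  }
  // Copy each group so the caller's arrays are left untouched.
  const result = groups.map((group) => [...group]);
  unrecordedIds.forEach((id, index) => {
    // index % result.length is always a valid position in result.
    result[index % result.length]?.push(id);
  });
  return result;
}
```

Once the new tests have run, their durations show up in the next report and they can be balanced like any other test.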


6 participants