Fix e2e interferences, and establish robust parallelism #1694

jotaen4tinypilot · 2023-11-30T14:45:57Z

Merging over the grouped e2e tests uncovered an interference problem in our e2e test setup.

Update: we disabled concurrent tests locally to alleviate the problem for local development for the time being. The topic in itself is still valid, however.

Problem

Our Playwright setup is this:

On CI, we only use a single Playwright worker. Therefore, Playwright will run each spec file one after the other, in alphabetic order.
Locally, however, we ~~run~~ had been running the files in parallel, which is the default behaviour, in case workers wasn’t set (i.e., undefined).

Due to the new grouping and our usage of the beforeEach mechanism, the timing behaviour of the tests happened to change, compared to before:

When the security-dialog tests are running, they alter the server state and toggle on the auth requirement. The about-dialog tests are executed in parallel, and can now randomly fail, since Playwright happens to attempt to open the about dialog during a time when the auth requirement is active (caused by one of the concurrently ran security tests). Therefore, Playwright fails to open the about dialog in the first place, as it’s stuck on the login page.

This problem isn’t new, but so far the timing was in our favour, so we were lucky enough to not run into this. Now, with the different timing behaviour, the e2e tests can fail locally.

Again: due to us only using a single worker on CI, this problem ~~only manifests~~ was only manifesting itself locally, not on CI.

Solution

As we continue to add more and more e2e tests (which I think is terrific), I suggest we fix our e2e test setup, and make it robust against these kinds of issues. Specifically, I think we should establish a deliberate mechanism for parallelism:

If we perform sequential scenarios that alter server state (such as the security tests), we should make sure that no other test can run in parallel. That is because server state acts like a singleton, and is shared across all tests.
- These kinds of test also must clean up the server state afterwards, otherwise we might see erratic failures in subsequent tests.
For all other, “read-only” tests (i.e., ones that don’t alter server state), we should embrace parallelism as much as possible, to take advantage of lower execution times. I think we should be able to do that on CI and locally likewise, so I’d suspect we don’t need to differentiate here like we do now.

The text was updated successfully, but these errors were encountered:

Until we have addressed #1694, I’d suggest to disable parallel e2e test execution locally, otherwise there can be erratic failures in Pro (due to the security tests interfering with other tests). On my local machine, this changes e2e execution speed from ~20s to ~26s. On CI, it obviously doesn’t change. <a data-ca-tag href="https://codeapprove.com/pr/tiny-pilot/tinypilot/1695"><img src="https://codeapprove.com/external/github-tag-allbg.png" alt="Review on CodeApprove" /></a> Co-authored-by: Jan Heuermann <jan@jotaen.net>

mtlynch · 2023-11-30T15:29:00Z

Yeah, this is a tricky one. It looks like Playwright has more granular parallelism controls than we've been using, but it doesn't look like we can say "run everything in parallel, but make sure nothing else is running while this parallelism-unfriendly test is running."

The way I approached this on PicoShare is that each Playwright session has its own, independent in-memory SQLite database. Before each test, Playwright POSTs to a dev-only route to say it wants a per-session database. The server assigns a cookie in response, and then whenever the server receives subsequent requests with that cookie, it matches the request to the corresponding in-memory SQLite database.

That system works okay, but it gets a little hacky when we have to share backend state between two browser sessions.

jotaen4tinypilot · 2023-11-30T15:33:12Z

Another potential approach that I stumbled across is to divide the tests into different test projects, where each project can have its own configuration.

I’m not sure, however, what the best way is to run these projects independently. E.g., we could use the --project CLI flag, but then we’d have to issue multiple playwright invocations.

I haven’t looked much into that, so maybe this could be a viable approach. There might also be a bunch of other options to research.

jotaen4tinypilot added bug Something isn't working enhancement New feature or request labels Nov 30, 2023

jotaen4tinypilot mentioned this issue Nov 30, 2023

Disable parallel e2e test execution locally #1695

Merged

jotaen4tinypilot changed the title ~~Fix e2e interdependencies and parallelism~~ Fix e2e interferences, and establish robust parallelism Nov 30, 2023

mtlynch mentioned this issue Nov 30, 2023

Make it easier to run a single playwright test #1696

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix e2e interferences, and establish robust parallelism #1694

Fix e2e interferences, and establish robust parallelism #1694

jotaen4tinypilot commented Nov 30, 2023 •

edited

mtlynch commented Nov 30, 2023

jotaen4tinypilot commented Nov 30, 2023 •

edited

Fix e2e interferences, and establish robust parallelism #1694

Fix e2e interferences, and establish robust parallelism #1694

Comments

jotaen4tinypilot commented Nov 30, 2023 • edited

Problem

Solution

mtlynch commented Nov 30, 2023

jotaen4tinypilot commented Nov 30, 2023 • edited

jotaen4tinypilot commented Nov 30, 2023 •

edited

jotaen4tinypilot commented Nov 30, 2023 •

edited