Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix e2e interferences, and establish robust parallelism #1694

Open
jotaen4tinypilot opened this issue Nov 30, 2023 · 2 comments
Open

Fix e2e interferences, and establish robust parallelism #1694

jotaen4tinypilot opened this issue Nov 30, 2023 · 2 comments
Labels
bug Something isn't working enhancement New feature or request

Comments

@jotaen4tinypilot
Copy link
Contributor

jotaen4tinypilot commented Nov 30, 2023

Merging over the grouped e2e tests uncovered an interference problem in our e2e test setup.

Update: we disabled concurrent tests locally to alleviate the problem for local development for the time being. The topic in itself is still valid, however.

Problem

Our Playwright setup is this:

  • On CI, we only use a single Playwright worker. Therefore, Playwright will run each spec file one after the other, in alphabetic order.
  • Locally, however, we run had been running the files in parallel, which is the default behaviour, in case workers wasn’t set (i.e., undefined).

Due to the new grouping and our usage of the beforeEach mechanism, the timing behaviour of the tests happened to change, compared to before:

When the security-dialog tests are running, they alter the server state and toggle on the auth requirement. The about-dialog tests are executed in parallel, and can now randomly fail, since Playwright happens to attempt to open the about dialog during a time when the auth requirement is active (caused by one of the concurrently ran security tests). Therefore, Playwright fails to open the about dialog in the first place, as it’s stuck on the login page.

This problem isn’t new, but so far the timing was in our favour, so we were lucky enough to not run into this. Now, with the different timing behaviour, the e2e tests can fail locally.

Again: due to us only using a single worker on CI, this problem only manifests was only manifesting itself locally, not on CI.

Solution

As we continue to add more and more e2e tests (which I think is terrific), I suggest we fix our e2e test setup, and make it robust against these kinds of issues. Specifically, I think we should establish a deliberate mechanism for parallelism:

  • If we perform sequential scenarios that alter server state (such as the security tests), we should make sure that no other test can run in parallel. That is because server state acts like a singleton, and is shared across all tests.
    • These kinds of test also must clean up the server state afterwards, otherwise we might see erratic failures in subsequent tests.
  • For all other, “read-only” tests (i.e., ones that don’t alter server state), we should embrace parallelism as much as possible, to take advantage of lower execution times. I think we should be able to do that on CI and locally likewise, so I’d suspect we don’t need to differentiate here like we do now.
@jotaen4tinypilot jotaen4tinypilot added bug Something isn't working enhancement New feature or request labels Nov 30, 2023
@jotaen4tinypilot jotaen4tinypilot changed the title Fix e2e interdependencies and parallelism Fix e2e interferences, and establish robust parallelism Nov 30, 2023
jotaen4tinypilot added a commit that referenced this issue Nov 30, 2023
Until we have addressed
#1694, I’d suggest to
disable parallel e2e test execution locally, otherwise there can be
erratic failures in Pro (due to the security tests interfering with
other tests).

On my local machine, this changes e2e execution speed from ~20s to ~26s.
On CI, it obviously doesn’t change.
<a data-ca-tag
href="https://codeapprove.com/pr/tiny-pilot/tinypilot/1695"><img
src="https://codeapprove.com/external/github-tag-allbg.png" alt="Review
on CodeApprove" /></a>

Co-authored-by: Jan Heuermann <jan@jotaen.net>
@mtlynch
Copy link
Contributor

mtlynch commented Nov 30, 2023

Yeah, this is a tricky one. It looks like Playwright has more granular parallelism controls than we've been using, but it doesn't look like we can say "run everything in parallel, but make sure nothing else is running while this parallelism-unfriendly test is running."

The way I approached this on PicoShare is that each Playwright session has its own, independent in-memory SQLite database. Before each test, Playwright POSTs to a dev-only route to say it wants a per-session database. The server assigns a cookie in response, and then whenever the server receives subsequent requests with that cookie, it matches the request to the corresponding in-memory SQLite database.

That system works okay, but it gets a little hacky when we have to share backend state between two browser sessions.

@jotaen4tinypilot
Copy link
Contributor Author

jotaen4tinypilot commented Nov 30, 2023

Another potential approach that I stumbled across is to divide the tests into different test projects, where each project can have its own configuration.

I’m not sure, however, what the best way is to run these projects independently. E.g., we could use the --project CLI flag, but then we’d have to issue multiple playwright invocations.

I haven’t looked much into that, so maybe this could be a viable approach. There might also be a bunch of other options to research.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants