Make BDD end2end tests faster #8095
Labels: component: performance, component: tests, priority: 3 - wishlist
The Behaviour Driven Development / gherkin / `.feature` based end2end tests have proven to be useful as a low-barrier-to-entry way to get test coverage for new features and fixes. While there are plans (#3319) to reduce our reliance on these tests (and some progress toward that scattered about the place), as long as we still rely on them it would be great if we could make them run faster somehow. I think the average time for them is around 1s per test, which makes the full test suite quite slow.

This will probably involve a lot of tedious investigation into what types of patterns are slowest and what options there are to speed them up. It may turn out to be largely intractable, and we'll have to live with the tests as-is (until we replace them).
I've looked at the IPC mechanism and regex matching in the past and don't believe they are slow; a lot of the slowness seems to be in the browser under test itself (;_;) (for example, opening a new window seems to show up as a common slow step), but don't let that vague assertion bias your investigation. Doing a profile of the test process should help get more visibility on that.
Speeding up the browser itself is out of scope for this investigation, so if you see an opportunity to do that please raise a separate issue for it.
One opportunity to speed things up that I've seen is to batch commands where possible, instead of running them one by one. Opening tabs all at once in each step of a feature file with just four tests saved about 25% of the total run time (see the example below). For instance, instead of doing:
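(a sketch using qutebrowser-style steps with illustrative paths, not the exact original snippet)

```gherkin
# Each step opens one tab and implicitly waits for the page to finish
# loading before the next step runs.
When I open data/numbers/1.txt
And I open data/numbers/2.txt in a new tab
And I open data/numbers/3.txt in a new tab
And I open data/numbers/4.txt in a new tab
```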
We might be able to do:
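(again a sketch, not a tested snippet)

```gherkin
# The remaining tabs are opened in a single chained command, with one
# explicit wait for the last page instead of an implicit wait per tab.
When I open data/numbers/1.txt
And I run :open -t about:blank?2.txt ;; open -t about:blank?3.txt ;; open -t about:blank?4.txt
And I wait until about:blank?4.txt is loaded
```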
With the "wait for" step being optional: for example, currently with "And I open" you don't need to specify your own wait line; you only have to do that when you run raw commands like `And I run :open -t about:blank?3.txt`.

This may increase the flakiness of tests, since we don't have a way of strictly specifying command ordering (see #3007), so some thought should be put into 1) how to highlight ordering issues if a test fails, 2) how to disable the parallelization per test (with a fixture, or even just by deleting the "in parallel" line), and 3) how to disable it for a whole test run. In practice, though, I suspect there are a few common cases where it would work fine.
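To make 2) concrete, one purely hypothetical shape such an "in parallel" marker could take (no syntax like this exists today):

```gherkin
Scenario: Opening several pages
    # Hypothetical opt-in marker; deleting this line would force the
    # steps below to run strictly in order again, e.g. while debugging
    # a suspected ordering failure.
    Given the following steps may run in parallel
    When I open data/numbers/1.txt
    And I open data/numbers/2.txt in a new tab
    And I open data/numbers/3.txt in a new tab
```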
Another thing I saw in the same example below is that opening about:blank tabs seems to be faster than opening tiny files from the local webserver. I'm not sure whether that's expected or whether it's something to do with the webserver itself.
Changing the `"----> found it"` log message to `f"----> found it ({elapsed_timer.elapsed()}) ({match.message})"` could help highlight slow pieces to be investigated, as could pytest's `--durations=<num>` option.

Example
This is a small BDD scenario I was looking at today. I tested batching up the tab-open commands, and compared opening files from the web server vs. opening about:blank. The times I got across four runs of each attempt were:
The initial version of the file was:
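(a sketch of the shape of the file rather than the exact original; the real file had four such scenarios)

```gherkin
Feature: Tab batching experiment

    Scenario: Opening webserver pages one by one
        When I open data/numbers/1.txt
        And I open data/numbers/2.txt in a new tab
        And I open data/numbers/3.txt in a new tab
```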
And with about:blank:
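(sketch)

```gherkin
    # Raw :open commands don't get an implicit wait, so each one needs
    # its own "wait until ... is loaded" line. The ?N suffix just makes
    # each about:blank URL unique.
    Scenario: Opening about:blank pages one by one
        When I run :open about:blank?1.txt
        And I wait until about:blank?1.txt is loaded
        And I run :open -t about:blank?2.txt
        And I wait until about:blank?2.txt is loaded
        And I run :open -t about:blank?3.txt
        And I wait until about:blank?3.txt is loaded
```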
And back to loading from the webserver but batched (using `;;` to chain `:open` commands), and then about:blank again, also batched; sketches of both variants below.
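First the batched webserver variant (a sketch, assuming `(port)` is the placeholder the test harness substitutes with the local webserver's port):

```gherkin
    Scenario: Opening webserver pages batched
        When I open data/numbers/1.txt
        # Chain the remaining opens into one command and wait once for
        # the last page instead of after every tab.
        And I run :open -t http://localhost:(port)/data/numbers/2.txt ;; open -t http://localhost:(port)/data/numbers/3.txt
        And I wait until data/numbers/3.txt is loaded
```

And the batched about:blank variant:

```gherkin
    Scenario: Opening about:blank pages batched
        When I run :open about:blank?1.txt ;; open -t about:blank?2.txt ;; open -t about:blank?3.txt
        And I wait until about:blank?3.txt is loaded
```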