Boost: Replace brittle E2E delays with network-level synchronization #48066

Open
LiamSarsfield wants to merge 6 commits into trunk from fix/boost/e2e-flaky-test-stability

Contributor

@LiamSarsfield LiamSarsfield commented Apr 13, 2026

Proposed changes

Addresses persistent Boost E2E flakiness by replacing timing-dependent assertions with deterministic network-based synchronization, fixing a real async bug, and improving transient state handling.

  • Fix missing await in toggleModule(): The expectNoticeToBeVisible() call was not awaited, so tests proceeded before the toggle had settled. Other methods in the same file (addCornerstonePage, togglePrerenderOption) correctly use await.
  • Add waitForScoreRefreshRequest() helper: Waits for the actual POST /speed-scores/refresh network request instead of checking transient UI state or sleeping. Follows the existing waitForResponse pattern from chooseFreePlan().
  • Rework score-auto-refresh tests: Replace setTimeout delays and Loading… heading checks with waitForResponse on the score refresh endpoint. The debounce test retains its timing delays (structurally necessary to verify "nothing happened yet") but uses the network listener to confirm the refresh eventually fires.
  • Rework speed-score refresh test: Replace .toPass() wrapper on transient Loading… heading with waitForResponse before asserting scores are visible.
  • Add .toPass() retry blocks for LCP transient states: The pending optimization state can transition too fast for a plain .toBeVisible() — wrapping in .toPass() with explicit timeouts handles this correctly.
  • Increase .toPass() timeouts in concatenate/image-guide: 10s to 20s for CI environments where WordPress page rendering is slower.
  • Boost connection timeout: Increase Refresh button visibility check from 20s to 40s (matches the configured actionTimeout).
  • Normalize timeout values to literal style: Convert all 19 multiplication-style timeouts (e.g., 60 * 1000) to literal millisecond values (e.g., 60000) across 5 files to align with the monorepo convention used by all other plugins and e2e-commons.
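The network-level synchronization described above can be sketched roughly as follows. This is a minimal illustration and not the PR's actual helper: `PageLike` and `ResponseLike` are hypothetical stand-ins for the small slice of Playwright's `Page` and `Response` used here, `runAndWaitForScoreRefresh` is an invented name, and the URL fragment is taken from the description. The important detail is that the wait is registered before the action that triggers the request.

```typescript
// Hypothetical minimal shapes mirroring the parts of Playwright's Page/Response used below.
interface ResponseLike {
	url(): string;
	request(): { method(): string };
}
interface PageLike {
	waitForResponse( predicate: ( response: ResponseLike ) => boolean ): Promise< ResponseLike >;
}

// Matches the score refresh call named in the PR description.
function isScoreRefresh( response: ResponseLike ): boolean {
	return (
		response.request().method() === 'POST' &&
		response.url().includes( '/speed-scores/refresh' )
	);
}

// Register the listener first, then run the action. Promise.all attaches handlers
// to both promises, so neither is left dangling if the other one fails.
async function runAndWaitForScoreRefresh(
	page: PageLike,
	action: () => Promise< void >
): Promise< ResponseLike > {
	const [ response ] = await Promise.all( [ page.waitForResponse( isScoreRefresh ), action() ] );
	return response;
}
```

A test step would pass the toggle or button click as `action` and make its UI assertions only after the returned promise resolves.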

What was reverted from the initial approach

  • auth.setup.ts timeout increase — removed per reviewer feedback. This is shared infrastructure used by all E2E suites, and other suites don't fail there.

Does this pull request change what data or activity we track or use?

No

Testing instructions

  • Monitor the E2E CI runs on this PR — specifically the Modules suite (score-auto-refresh, speed-score)
  • Optionally run locally: cd projects/plugins/boost/tests/e2e && pnpm run test:e2e

…le delays

Address the two main root causes of Boost E2E flakiness (auth setup timeouts
and connection flow timeouts) along with several test-level timing issues
identified from analysis of 100 recent trunk CI failures.
github-actions bot added the [Plugin] Boost, [Tests] Includes Tests, and E2E Tests labels Apr 13, 2026
Contributor

github-actions bot commented Apr 13, 2026

Thank you for your PR!

When contributing to Jetpack, we have a few suggestions that can help us test and review your patch:

  • ✅ Include a description of your PR changes.
  • ✅ Add a "[Status]" label (In Progress, Needs Review, ...).
  • ✅ Add testing instructions.
  • ✅ Specify whether this PR includes any changes to data or privacy.
  • ✅ Add changelog entries to affected projects

This comment will be updated as you work on your PR and make changes. If you think that some of those checks are not needed for your PR, please explain why you think so. Thanks for cooperation 🤖


Follow this PR Review Process:

  1. Ensure all required checks appearing at the bottom of this PR are passing.
  2. Make sure to test your changes on all platforms that it applies to. You're responsible for the quality of the code you ship.
  3. You can use GitHub's Reviewers functionality to request a review.
  4. When it's reviewed and merged, you will be pinged in Slack to deploy the changes to WordPress.com simple once the build is done.

If you have questions about anything, reach out in #jetpack-developers for guidance!


Boost plugin:

No scheduled milestone found for this plugin.

If you have any questions about the release process, please ask in the #jetpack-releases channel on Slack.

github-actions bot added the [Status] Needs Author Reply label Apr 13, 2026

jp-launch-control bot commented Apr 13, 2026

Code Coverage Summary

This PR did not change code coverage!

That could be good or bad, depending on the situation. Everything covered before, and still is? Great! Nothing was covered before? Not so great. 🤷

Full summary · PHP report · JS report

LiamSarsfield removed the [Status] Needs Author Reply label Apr 13, 2026
LiamSarsfield requested review from a team and kraftbj April 13, 2026 15:20
github-actions bot added the [Status] Needs Author Reply label Apr 13, 2026
LiamSarsfield added [Status] Needs Review and removed [Status] Needs Author Reply labels Apr 13, 2026
@LiamSarsfield
Contributor Author

@manzoorwanijk I remember you mentioning this and had it in my todo 📝

@LiamSarsfield
Contributor Author

Note on the debounce test change (score-auto-refresh.test.ts:76-83): replacing the hard-coded setTimeout(1000) + immediate assertion with toBeVisible({ timeout: 5000 }) intentionally loosens the timing bound. The preceding step still asserts that loading has not started at the 2.1s mark after the first toggle (verifying the debounce reset), so the behavioral contract is preserved. The follow-up just confirms loading eventually starts rather than asserting it appears within a narrow window that's fragile in CI. Precise debounce timing is better validated in a unit test on the debounce function itself.
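The debounce reset discussed here can be illustrated in isolation, which is also roughly what a unit test on the debounce function might look like. This sketch is an assumption for illustration only: `ManualClock` and `createDebouncer` are hypothetical names and do not reflect Boost's actual implementation. The 2000 ms delay mirrors the 2-second debounce referenced in the tests.

```typescript
// Hypothetical manual clock so timing can be checked without real sleeps.
class ManualClock {
	private now = 0;
	private nextId = 1;
	private timers: { id: number; at: number; fn: () => void }[] = [];

	setTimeout( fn: () => void, ms: number ): number {
		const id = this.nextId++;
		this.timers.push( { id, at: this.now + ms, fn } );
		return id;
	}

	clearTimeout( id: number ): void {
		this.timers = this.timers.filter( timer => timer.id !== id );
	}

	// Advance time and fire every timer that has become due, earliest first.
	advance( ms: number ): void {
		this.now += ms;
		const due = this.timers
			.filter( timer => timer.at <= this.now )
			.sort( ( a, b ) => a.at - b.at );
		this.timers = this.timers.filter( timer => timer.at > this.now );
		due.forEach( timer => timer.fn() );
	}
}

// Trailing-edge debounce: each call cancels the pending timer and starts a new one.
function createDebouncer( clock: ManualClock, delayMs: number, fn: () => void ): () => void {
	let pending: number | undefined;
	return () => {
		if ( pending !== undefined ) {
			clock.clearTimeout( pending );
		}
		pending = clock.setTimeout( () => {
			pending = undefined;
			fn();
		}, delayMs );
	};
}

const clock = new ManualClock();
let refreshCount = 0;
const requestRefresh = createDebouncer( clock, 2000, () => {
	refreshCount++;
} );

requestRefresh(); // first module toggle at t = 0
clock.advance( 1000 );
requestRefresh(); // second toggle at t = 1000 resets the timer
clock.advance( 1100 ); // t = 2100: the first timer would have fired here, but it was cancelled
const firedAtFirstDeadline = refreshCount;
clock.advance( 900 ); // t = 3000: 2000 ms after the second toggle
const firedAfterReset = refreshCount;
```

With an injected clock, the "nothing has happened yet" check and the "refresh eventually fires" check are both exact, which is hard to achieve against a real browser in CI.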

Contributor

Copilot AI left a comment

Pull request overview

Reduces recurring Jetpack Boost E2E flakiness by increasing key UI wait timeouts and replacing brittle fixed delays with retry/polling-style assertions.

Changes:

  • Increase timeouts for critical visibility checks in shared auth setup and Boost connection flow.
  • Replace hard-coded sleeps with “wait until state appears” assertions and add retry blocks for transient UI states.
  • Extend .toPass() timeouts in a few slower CI-sensitive assertions.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| tools/e2e-commons/setup-specs/auth.setup.ts | Extends wp.com “Howdy, user” visibility timeout during shared auth verification. |
| projects/plugins/boost/tests/e2e/lib/pages/jetpack-boost-page.ts | Extends post-connection “Refresh” button visibility timeout. |
| projects/plugins/boost/tests/e2e/specs/modules/speed-score.test.ts | Adds retry logic to catch transient loading state after “Refresh”. |
| projects/plugins/boost/tests/e2e/specs/modules/score-auto-refresh.test.ts | Replaces some fixed delays with state-based waits around debounce/refresh behavior. |
| projects/plugins/boost/tests/e2e/specs/lcp-optimization/lcp-optimization.test.ts | Adds retries and longer waits for async analysis/pending transitions. |
| projects/plugins/boost/tests/e2e/specs/image-guide/image-guide.test.ts | Extends .toPass() timeout for script presence check. |
| projects/plugins/boost/tests/e2e/specs/concatenate/concatenate.test.ts | Extends .toPass() timeouts for concatenation/exclusion assertions. |
| projects/plugins/boost/changelog/fix-boost-e2e-flaky-test-stability | Adds a changelog entry file for the stability improvements. |
Comments suppressed due to low confidence (1)

projects/plugins/boost/changelog/fix-boost-e2e-flaky-test-stability:4

  • This changelog entry file only contains headers (and an optional Comment) but no actual entry text after the blank line. Changelogger treats everything after the blank line as the entry body, so this will produce an empty changelog entry (or may fail validation). Add a short, user-facing entry line after the blank line (and keep/omit the Comment header as appropriate).
Significance: patch
Type: fixed
Comment: Improve E2E test stability by increasing timeouts and replacing brittle hard-coded delays.

Comment thread projects/plugins/boost/tests/e2e/specs/modules/speed-score.test.ts Outdated
Comment thread projects/plugins/boost/tests/e2e/specs/modules/score-auto-refresh.test.ts Outdated
Member

@manzoorwanijk manzoorwanijk left a comment

I don't see this PR doing anything more than increasing the timeouts, which is not the right fix in my humble opinion. I think we should find the root causes as I pointed out (just a guess) in the inline comments.

      'User is logged in and username is visible'
  )
-     .toBeVisible();
+     .toBeVisible( { timeout: 40 * 1000 } );
Member

This setup is used by all other tests, but they don't fail. I don't think we should need to increase the timeout here.

Member

I think the problem might be these hard-coded wait times, without a deterministic way to wait for something real. For example, when a toggle is clicked, it must be making some kind of API call or something which we should wait for instead of a random 1 second wait.

Comment on lines +30 to +36
- await test.step( 'Wait for score refresh after 2 second delay and verify score is visible', async () => {
-     await new Promise( resolve => setTimeout( resolve, 2100 ) );
+ await test.step( 'Wait for score refresh to start after debounce and verify score is visible', async () => {
+     // The score refresh triggers after a 2-second debounce. Rather than relying on a
+     // hard-coded delay, wait for the loading state to appear (indicating the refresh started),
+     // then wait for the score to become visible again.
+     await expect(
+         jetpackBoostPage.page.getByRole( 'heading', { name: 'Loading…' } ),
+         'Score refresh should start after debounce'
+     ).toBeVisible( { timeout: 10 * 1000 } );
Member

Same here, adding a randomly increased timeout is just a band-aid for the problem. We should instead use waitForResponse or something similar here after the toggle is clicked.

- Fix missing await in toggleModule() causing tests to proceed before toggle settled
- Add waitForScoreRefreshRequest() to wait for the actual API request instead of sleeping
- Rework score-auto-refresh and speed-score tests to use waitForResponse on /speed-scores/refresh
- Revert shared auth.setup.ts timeout change (not needed per reviewer feedback)
LiamSarsfield changed the title from "Boost: Fix flaky E2E tests" to "Boost: Replace brittle E2E delays with network-level synchronization" Apr 14, 2026
github-actions bot added [Status] Needs Author Reply and removed [Status] Needs Review labels Apr 14, 2026
LiamSarsfield added [Status] Needs Review and removed [Status] Needs Author Reply labels Apr 14, 2026
github-actions bot added [Status] Needs Author Reply and removed [Status] Needs Review labels Apr 14, 2026
@LiamSarsfield
Contributor Author

@manzoorwanijk Thanks for the feedback, I pushed a rework that takes a different approach per your suggestions.

  • Uses page.waitForResponse() on POST /speed-scores/refresh instead of setTimeout delays or transient UI checks. This is deterministic and proves the debounce fired.
  • Fixes a missing await on expectNoticeToBeVisible() in toggleModule() — tests were proceeding before the toggle settled.
  • Removes the unnecessary 1-second sleep in the debounce test.
  • Reverts the auth.setup.ts change as you suggested.

Worth noting: The waitForResponse approach does tie us to an internal endpoint, which isn't pure E2E practice, but we're only using it as a sync point (UI assertions still happen through expectScoreToBeVisible()). The endpoint is versioned and stable so the coupling risk is low.
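The missing `await` bug mentioned above has a simple shape, sketched here with plain promises. The function names are hypothetical stand-ins and not the actual page-object methods. The point is only that an async check that is started but not awaited lets the caller continue before the check has finished.

```typescript
const events: string[] = [];

// Hypothetical stand-in for an async assertion such as waiting for a notice to appear.
async function expectNoticeVisible(): Promise< void > {
	await new Promise< void >( resolve => setTimeout( resolve, 10 ) );
	events.push( 'notice visible' );
}

// Buggy: the assertion is started but not awaited, so this resolves early.
async function toggleWithoutAwait(): Promise< void > {
	events.push( 'clicked toggle' );
	expectNoticeVisible();
}

// Fixed: this only resolves once the assertion has settled.
async function toggleWithAwait(): Promise< void > {
	events.push( 'clicked toggle' );
	await expectNoticeVisible();
}

async function demo(): Promise< string[] > {
	await toggleWithoutAwait();
	events.push( 'next step (buggy path)' ); // recorded before 'notice visible'
	await new Promise< void >( resolve => setTimeout( resolve, 20 ) ); // let the stray check finish

	await toggleWithAwait();
	events.push( 'next step (fixed path)' ); // recorded after 'notice visible'
	return events;
}
```

In the buggy path the next step is recorded before the notice check completes, which matches the description of tests proceeding before the toggle settled.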

Align with monorepo convention where all other plugins and e2e-commons
use literal millisecond values (e.g., 60000) instead of multiplication
expressions (e.g., 60 * 1000). Converts all 19 instances across 5 files.
LiamSarsfield added [Status] Needs Review and removed [Status] Needs Author Reply labels Apr 14, 2026
Member

@manzoorwanijk manzoorwanijk left a comment

TBH, I am still not happy with how we are trying to solve this. Let us aim to get rid of all these band-aid timeouts, which are not the right fix. I have added some inline suggestions/ideas; let us aim for those. Please feel free to reach out if you need help getting this through.

Comment on lines 108 to +113
  await page.getByRole( 'link', { name: 'Go to Jetpack Boost' } ).click();

  await expect(
      page.getByTestId( 'critical-css-meta' ),
      'Critical CSS meta information should be visible'
- ).toBeVisible( { timeout: 60 * 1000 } );
+ ).toBeVisible( { timeout: 60000 } );
Member

What does the above link click do? Does it result in some navigation or an API call? If yes, then it's better to wait for that navigation instead of a random timeout.

Comment on lines 125 to +130
  await page.getByRole( 'button', { name: 'Regenerate' } ).click();

  await expect(
      page.getByTestId( 'critical-css-meta' ),
      'Critical CSS meta information should be visible'
- ).toBeVisible( { timeout: 60 * 1000 } );
+ ).toBeVisible( { timeout: 60000 } );
Member

Same here, what does the button click do? Does it make an API call or something? Maybe wait for that response instead of the timeout?

Comment on lines 138 to +142
  await page.getByRole( 'button', { name: 'Go back' } ).click();
  await expect(
      page.getByTestId( 'critical-css-meta' ),
      'Critical CSS meta information should be visible'
- ).toBeVisible( { timeout: 60 * 1000 } );
+ ).toBeVisible( { timeout: 60000 } );
Member

Same here. These timeouts are just band-aids. Let us get rid of all of them and fix the root cause.

Comment on lines 76 to +80
  await jetpackBoostPage.visit();
  await expect(
      page.getByTestId( 'critical-css-meta' ),
      'Critical CSS meta information should be visible'
- ).toBeVisible( { timeout: 60 * 1000 } );
+ ).toBeVisible( { timeout: 60000 } );
Member

I guess the .visit() results in navigation, which we should wait for, instead of adding a timeout to the assertion below it.

Comment on lines 61 to +66
  await jetpackBoostPage.visit();

  await expect(
      page.getByTestId( 'critical-css-meta' ),
      'Critical CSS meta information should be visible'
- ).toBeVisible( { timeout: 60 * 1000 } );
+ ).toBeVisible( { timeout: 60000 } );
Member

Likewise here about getting rid of the timeout and waiting for the navigation instead.

Comment on lines 116 to +123
      await page.goto( '/' );
  } );

  await expect( async () => {
      // jQuery is enqueued by a helper plugin.
      const count = await page.locator( '#jquery-core-js' ).count();
      expect( count, 'jQuery should not be concatenated' ).toBeGreaterThan( 0 );
- } ).toPass( { timeout: 10000 } );
+ } ).toPass( { timeout: 20000 } );
Member

Since page.goto() results in navigation, we should wait for that navigation to finish, and maybe also wait for some UI element to be visible before making the assertions, instead of these timeouts.

Comment on lines 85 to +95
      await page.goto( '/' );
  } );

  await expect( async () => {
      // e2e-script-one and e2e-script-two are enqueued by a helper plugin. When concatenation is enabled,
      // they should be concatenated into a single script.
      const count = await page
          .locator( '[data-handles*="e2e-script-one"][data-handles*="e2e-script-two"]' )
          .count();
      expect( count, 'JS Concatenation occurs when module is active' ).toBeGreaterThan( 0 );
- } ).toPass( { timeout: 10000 } );
+ } ).toPass( { timeout: 20000 } );
Member

Same here, let us get rid of all these timeouts please.

Comment on lines 55 to +63
  await admin.visitAdminPage( 'admin.php', 'page=jetpack-boost' );
  await expect(
      page.locator( '.jb-critical-css-progress' ),
      'Critical CSS generation progress indicator should be visible'
  ).toBeVisible();
  await expect(
      page.getByTestId( 'critical-css-meta' ),
      'Critical CSS meta information should be visible'
- ).toBeVisible( { timeout: 4 * 60 * 1000 } );
+ ).toBeVisible( { timeout: 240000 } );
Member

Visiting the page, waiting for the navigation (which admin.visitAdminPage already does IIRC), then waiting for some UI element to be visible if it's loaded client-side, then making the assertion is the right approach instead of the timeout.

@manzoorwanijk
Member

> The waitForResponse approach does tie us to an internal endpoint, which isn't pure E2E practice

Not sure I understand what is not pure E2E practice here.

@kraftbj
Contributor

kraftbj commented Apr 16, 2026

Took a close look at this. A few thoughts on the timeout discussion and the broader approach.

The score/debounce test changes are solid. The waitForResponse pattern is the right move, and the missing await on expectNoticeToBeVisible() was a real bug -- probably the single biggest source of flakiness here. Nice catch.

On the "band-aid timeouts" feedback -- I think there's a distinction worth drawing. The critical-css and concatenate timeout changes aren't new timeouts. They're reformatting pre-existing values from 60 * 1000 to 60000 (and bumping concatenate's .toPass() from 10s to 20s). The suggestion to "wait for navigation instead" doesn't quite apply here because navigation is already being waited for -- page.goto(), visit(), and admin.visitAdminPage() all wait for the load event. These timeouts are waiting for async background processes after navigation (Critical CSS generation, concatenation processing). Playwright's toBeVisible({ timeout }) polls the DOM until the element appears -- it's not a blind sleep. That said, if there are specific API responses or polling endpoints these tests could key off of, that'd be a genuine improvement (probably follow-up scope though).
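The difference between a polling assertion and a fixed sleep can be sketched as follows. `pollUntil` is a hypothetical helper written for illustration and is not how Playwright implements its assertions. It only shows that a timeout on a polling check is an upper bound, not a delay that is always paid in full.

```typescript
// Hypothetical polling helper: re-checks a condition until it holds or the timeout elapses.
async function pollUntil(
	condition: () => boolean | Promise< boolean >,
	options: { timeoutMs: number; intervalMs: number }
): Promise< number > {
	const start = Date.now();
	while ( true ) {
		if ( await condition() ) {
			return Date.now() - start; // resolves as soon as the condition is met
		}
		if ( Date.now() - start >= options.timeoutMs ) {
			throw new Error( `Condition not met within ${ options.timeoutMs } ms` );
		}
		await new Promise< void >( resolve => setTimeout( resolve, options.intervalMs ) );
	}
}

// Simulated background work that finishes after roughly 50 ms.
let ready = false;
setTimeout( () => {
	ready = true;
}, 50 );

// Resolves shortly after the work completes, well before the 2000 ms ceiling.
const elapsed = pollUntil( () => ready, { timeoutMs: 2000, intervalMs: 10 } );
```

A fixed sleep of the same length would always cost the full duration and would still fail if the work took longer, which is the distinction drawn in the paragraph above.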

A few things that still need work:

  1. The debounce test no longer tests debounce. The original toggled two modules with a timing gap to prove the debounce timer resets. The new version toggles one module and waits for a refresh -- which is functionally identical to the "Enabling module should refresh scores" test above it. The test name says "debounce between multiple module toggle" but only one toggle happens. This needs the two-module flow restored, using network waits instead of setTimeout.

  2. waitForScoreRefreshRequest waits for the response, not the request. The method uses page.waitForResponse() but the name and JSDoc say "request." This matters because it couples the assertion to backend latency (a new flake source). If the goal is "prove the debounce fired and the client initiated a refresh," page.waitForRequest() would be more appropriate. At minimum the naming/docs should match the implementation.

  3. The .toPass() wrapping .toBeVisible() in LCP tests is redundant. Playwright's toBeVisible({ timeout: 20000 }) already auto-retries. The .toPass() wrapper adds no functional value and makes timeout errors harder to parse (nested error messages).

  4. The speed-score refresh test dropped the loading UI verification. The old test checked that "Loading..." appeared after clicking Refresh. The new one only waits for the network request. A regression where the request fires but the UI never updates would pass silently.

  5. Minor: orphaned promise risk. If toggleModule() throws before refreshRequestPromise is awaited, the dangling promise produces an unhandled rejection warning in CI that obscures the real error. A .catch(() => {}) right after creation would prevent the noise.
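The orphaned promise concern in item 5 can be reproduced with plain promises. In this sketch `waitForRefresh` and `toggleModule` are hypothetical stand-ins for the network wait and the page-object method. It assumes the wait eventually rejects (for example on timeout) after the action has already thrown.

```typescript
// Hypothetical stand-in for a pending network wait that eventually times out.
function waitForRefresh(): Promise< string > {
	return new Promise< string >( ( _resolve, reject ) => {
		setTimeout( () => reject( new Error( 'Timed out waiting for refresh' ) ), 20 );
	} );
}

// Hypothetical stand-in for the action, simulated here as failing first.
async function toggleModule(): Promise< void > {
	throw new Error( 'Toggle not found' );
}

async function step(): Promise< string > {
	const refreshPromise = waitForRefresh();
	// Mark the promise as handled so a later rejection is not reported as unhandled.
	// Awaiting refreshPromise below would still reject as usual if the wait fails.
	refreshPromise.catch( () => {} );

	await toggleModule(); // throws here, so refreshPromise is never awaited
	return refreshPromise;
}
```

With the no-op `catch` in place, the only error surfaced is the one from the action itself. Without it, the later timeout rejection would be reported separately as an unhandled rejection.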

@manzoorwanijk
Member

> The suggestion to "wait for navigation instead" doesn't quite apply here because navigation is already being waited for -- page.goto(), visit(), and admin.visitAdminPage() all wait for the load event. These timeouts are waiting for async background processes after navigation (Critical CSS generation, concatenation processing). Playwright's toBeVisible({ timeout }) polls the DOM until the element appears -- it's not a blind sleep. That said, if there are specific API responses or polling endpoints these tests could key off of, that'd be a genuine improvement (probably follow-up scope though).

I don’t agree with that. If there is some API call we can wait for, or if the elements load asynchronously, we can wait for those to be visible instead. The point is to be more deterministic rather than adding a random timeout, which can still be flaky when those requests or elements take longer than that. Playwright is good at handling timeouts automatically, and by default, retries/waits 3 times if the element or response is not available yet.

Labels

E2E Tests, [Plugin] Boost, [Status] Needs Review, [Tests] Includes Tests

4 participants