[lexical-playground] Chore: Audit and de-flake the e2e suite (remove all @flaky tags) by etrepum · Pull Request #8585 · facebook/lexical

etrepum · 2026-05-28T19:54:05Z

Description

Cleans up the Playwright e2e suite: removes stale browser workarounds, audits every @flaky test, fixes the ones that are genuinely flaky, and removes all @flaky tags. No library/product code is changed — this is test-suite maintenance only.

1. Drop stale Firefox workarounds (follow-up to the selection fix in #8582)

Remove the Firefox test.skip on "Move left from last node in RTL" (Bug: Selection movement incorrect for RTL going back from last node in paragraph, #7775); it passes on the current Firefox.
Collapse three dead Firefox/WebKit assertion branches in Navigation.spec whose values were already identical to the default. Other Firefox branches were verified still-needed (inline-node caret positions, selectAll root selection, decorator focus) or left alone because they encode IS_WINDOWS-specific behavior that can't be verified from Linux.

2. Audit every @flaky test (32 total)
Ran the suite against the CI-equivalent setup (static playground build on :4000 + collab websocket on :1234, chromium + firefox × {rich-text, plain-text, rich-text-with-collab}, --retries=0, --repeat-each 10–50). Note: an initial run against the live Vite dev server on :3000 showed ~80 failures, all of them environmental noise from on-the-fly compilation under load, so a built static server is required to get representative results.

26 tests never failed across ~40–80 runs each → tags removed.
6 tests genuinely flaked → root-caused and fixed (below).

3. Fix the genuinely-flaky tests

Collab setup (whole-suite, dominant cause): exposeLexicalEditor waited for the right frame's connect button with the default 5s timeout, and under parallel load a split-view iframe occasionally fails to connect or to boot/activate collab at all (~1/60, chromium+collab). The readiness check now retries with a page.reload() between attempts (3×, 15s each), so a transient setup hiccup recovers instead of failing the test. This hardens every collab test.
ClearFormatting mention typeahead: the test pressed Enter while the menu still showed results for a partial query (@Lu also matches "Agent Kallus", which sorts earlier and is highlighted first), so it selected the wrong mention. It now waits for "Luke Skywalker" to be the aria-selected option.
Toolbar image insert: the image renders behind React.Suspense (fallback={null}) and only appears after the asset loads, which can exceed the 5s assert under load. Now waits for .editor-image img before asserting.
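
The retry-with-reload pattern described for the collab setup can be sketched generically. This is a minimal illustration, not the code in utils/index.mjs: `retryWithRecovery`, `check`, and `recover` are hypothetical names, and the defaults (3 attempts, 15s each) simply mirror the numbers quoted above.

```javascript
// Hypothetical helper (not the PR's implementation): run a readiness check,
// and on failure run a recovery step before trying again.
async function retryWithRecovery(
  check,
  recover,
  {attempts = 3, timeoutMs = 15000} = {},
) {
  let lastError;
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      // `check` receives the per-attempt timeout and must reject
      // if readiness is not reached within it.
      return await check(timeoutMs);
    } catch (error) {
      lastError = error;
      // Only recover when another attempt will follow.
      if (attempt < attempts) {
        await recover();
      }
    }
  }
  // Every attempt failed: surface the most recent error to the test.
  throw lastError;
}
```

In a Playwright test, `check` would presumably wrap the wait on the right frame's connect button and `recover` would call `page.reload()`. Keeping the recovery step out of the final attempt means a persistent failure still fails the test with the original error rather than being masked.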

Result: no @flaky tags remain in the e2e suite, and the collab test harness is materially more robust.

The diff is large but almost entirely Prettier re-indentation: removing the {tag: '@flaky'} options argument turns 3-arg test() calls back into 2-arg calls and de-indents the bodies. git diff -w shows the real changes are only the tag/workaround removals plus the helper edits in utils/index.mjs.

Test plan

Automated — Playwright, against the CI-equivalent static build (:4000) + collab server (:1234), --retries=0:

Before

ClearFormatting "default styling of hashtags and mentions": ~25% failure on firefox + collab (wrong mention selected).
Toolbar "Insert image caption + table": intermittent empty image decorator.
Collab tests: ~1/60 setup failures (exposeLexicalEditor timeout waiting for the connect button) under --repeat-each stress; 3 failures in a 180-run chromium+collab sample.

After

ClearFormatting: 30/30 firefox + collab.
Collab boot-stall fix: 200/200 chromium + collab and 120/120 firefox + collab (--repeat-each=50/30, --retries=0).
Full @flaky set across all 6 CI configs: stable; playwright test --list parses all 2124 tests.
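
As a rough sanity check on the 200/200 figure, the arithmetic below estimates how likely a fully clean sample would be if the old failure rate (3 in 180) were still in effect. It assumes independent runs with a constant failure rate, which failures caused by parallel load may not satisfy, so treat it as an estimate only; `probabilityAllPass` is a hypothetical name.

```javascript
// Probability that `runs` independent runs all pass when each run
// fails with probability `failureRate` (assumption: runs are independent).
function probabilityAllPass(failureRate, runs) {
  return Math.pow(1 - failureRate, runs);
}

const oldRate = 3 / 180; // before the fix: 3 failures in a 180-run sample
const cleanRuns = 200; // after the fix: 200/200 chromium + collab

// Chance of seeing 200 clean runs if nothing had changed.
console.log(probabilityAllPass(oldRate, cleanRuns).toFixed(3)); // prints "0.035"
```

Under those assumptions, a 200-run clean streak would occur only about 3.5% of the time at the old rate, which is consistent with the fix having reduced the stall rate.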

vercel · 2026-05-28T19:54:10Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
lexical	Ready	Preview, Comment	May 29, 2026 5:12am
lexical-playground	Ready	Preview, Comment	May 29, 2026 5:12am

@flaky

Audited all 32 @flaky e2e tests against the CI flaky-job configuration: static playground build served on :4000, collab websocket on :1234, chromium + firefox x {rich-text, plain-text, rich-text-with-collab}, --repeat-each=10 --retries=0 --workers=4, plus a --repeat-each=20 deep collab pass. Each test got ~40-80 runs. (An initial run against the live vite dev server on :3000 showed ~80 failures, but that was environmental noise from on-the-fly compilation under load; against the CI-equivalent static build only the tests below ever failed.)

Removed @flaky from 26 tests that never failed across the entire audit.

Kept @flaky (still intermittently fail, fixed separately):
- ClearFormatting: Should preserve the default styling of hashtags and mentions
- TextFormatting: Regression facebook#2523 can toggle format across a decorator
- Toolbar: Insert image caption + table
- Tables: Select multiple merged cells (selection expands to a rectangle)
- Tables: Can align text using Table selection
- ListsCopyAndPaste: Copy and paste of partial list items into the list

The large diff is prettier re-indenting test bodies after the options argument was dropped; there are no semantic changes to any test.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

@flaky

…se, collab connect wait)

Investigated the still-@flaky tests. Root causes and fixes:
- exposeLexicalEditor (collab setup): the wait for the right frame's ".action-button.connect" to read "Disconnect" used the default 5s expect timeout. Under parallel load the shared y-websocket connect exceeds 5s, which was the dominant source of collab @flaky failures across the whole suite. Bumped that wait (and the editor visibility check) to 30s.
- ClearFormatting "Should preserve the default styling of hashtags and mentions": waited for any "@luke" typeahead item then pressed Enter. While "@luke" is still being typed the partial query "@lu" also matches "Agent Kallus" (kal-LU-s), which sorts earlier and is highlighted, so Enter selected it. Now wait for "Luke Skywalker" to be the aria-selected option. Verified 30/30 firefox+collab (was ~25% failure). Tag removed.
- Toolbar "Insert image caption + table": the image renders behind React.Suspense (fallback={null}) and only appears after the asset loads, which can exceed assertHTML's 5s window under load. Wait for ".editor-image img" (30s) before asserting. Tag removed.

The remaining 4 @flaky tests (TextFormatting facebook#2523, Tables "Select multiple merged cells", Tables "Can align text using Table selection", ListsCopyAndPaste "partial list items into the list") only fail via a rarer shared collab right-iframe boot stall (~1/60, chromium+collab only), which the 30s connect wait reduces but does not eliminate; left tagged pending a harness-level fix.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
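
The partial-query problem in the ClearFormatting fix can be reproduced in isolation. The sketch below assumes a case-insensitive substring match that preserves list order; the playground's actual matching rule may differ, and the two-name list is a hypothetical subset containing only the names mentioned in the commit message.

```javascript
// Hypothetical subset of the mention list: just the two names
// mentioned in the commit message, in alphabetical order.
const names = ['Agent Kallus', 'Luke Skywalker'];

// Assumed matching rule: case-insensitive substring match, list order kept.
function matchMentions(query, candidates) {
  const needle = query.toLowerCase();
  return candidates.filter((name) => name.toLowerCase().includes(needle));
}

// While "@luke" is still being typed, the menu briefly sees "lu",
// which also matches the "lu" inside "Kallus".
console.log(matchMentions('lu', names)); // ['Agent Kallus', 'Luke Skywalker']

// Once the full query has arrived, only the intended option remains.
console.log(matchMentions('luke', names)); // ['Luke Skywalker']
```

Because the first match is the highlighted one, pressing Enter as soon as any item appears can pick "Agent Kallus". Waiting for the intended option to become the selected one removes the dependence on typing speed.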

@flaky

…alls; untag remaining @flaky tests

The remaining @flaky tests (TextFormatting facebook#2523, Tables "Select multiple merged cells", Tables "Can align text using Table selection", ListsCopyAndPaste "partial list items into the list") only failed via a shared, rare (~1/60, chromium+collab) collab-setup stall: under parallel load one split-view iframe occasionally fails to boot / activate collab within the timeout, so its ".action-button.connect" toolbar button never appears and initialize() fails. This affects every collab test, not these four specifically.

exposeLexicalEditor now retries the collab-frame readiness check, reloading the page between attempts (up to 3×, 15s each), so a transient boot/connect hiccup during setup recovers instead of failing the test. Validated under stress against the static build on :4000: 200/200 chromium+collab and 120/120 firefox+collab with retries=0 (previously ~3 failures in a comparable sample).

With the stall handled, all four tests are stable, so their @flaky tags are removed. No @flaky tags remain in the e2e suite.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

@flaky

Removing the last @flaky tags made the flaky CI jobs run `playwright test --grep "@flaky"`, which now matches zero tests and errors with "No tests found". Pass --pass-with-no-tests on the flaky invocations so the jobs are a no-op success, and still run normally if a @flaky test is re-introduced later.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

@flaky

All @flaky tags were removed in facebook#8585 and the suite has been de-flaked, so the machinery for splitting "flaky" tests out of CI is dead weight that only invites re-adding flaky tests. Remove it:
- call-e2e-all-tests.yml: drop the (already `if: false`) `flaky` job.
- call-e2e-test.yml: drop the `flaky` input, `continue-on-error`, the `--grep`/`--grep-invert "@flaky"` (+ `--pass-with-no-tests`) toggles, and the `flaky` segment of the artifact name. Steps now just run the suite.
- package.json: drop `--grep-invert "@flaky"` from the *-ci-* e2e scripts.

No test carried an actual `@flaky` tag, so this is behavior-preserving.

[tests] Chore: Clean up e2e tests

685c46f

meta-cla Bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label May 28, 2026

vercel Bot deployed to Preview – lexical-playground May 28, 2026 19:54 View deployment

vercel Bot deployed to Preview – lexical May 28, 2026 19:55 View deployment

etrepum added the extended-tests Run extended e2e tests on a PR label May 28, 2026

etrepum marked this pull request as ready for review May 28, 2026 21:48

etrepum requested review from acywatson, fantactuka, ivailop7, potatowagon and zurfyx as code owners May 28, 2026 21:48

etrepum and others added 3 commits May 28, 2026 14:59

vercel Bot deployed to Preview – lexical-playground May 28, 2026 23:52 View deployment

vercel Bot deployed to Preview – lexical May 28, 2026 23:53 View deployment

etrepum changed the title ~~[tests] Chore: Clean up e2e tests~~ [lexical-playground] Chore: Audit and de-flake the e2e suite (remove all @flaky tags) May 28, 2026

vercel Bot deployed to Preview – lexical-playground May 29, 2026 05:12 View deployment

vercel Bot deployed to Preview – lexical May 29, 2026 05:12 View deployment

potatowagon approved these changes May 29, 2026


zurfyx approved these changes May 29, 2026


etrepum added this pull request to the merge queue May 29, 2026

Merged via the queue into facebook:main with commit 13fc148 May 29, 2026
45 checks passed

etrepum mentioned this pull request May 30, 2026

[lexical-playground] Chore: De-flake e2e timing- and history-dependent tests #8595

Merged

etrepum merged 5 commits into facebook:main from etrepum:e2e-cleanup

3 participants
