
[lexical-playground] Chore: Audit and de-flake the e2e suite (remove all @flaky tags) #8585

Merged
etrepum merged 5 commits into facebook:main from etrepum:e2e-cleanup
May 29, 2026

Conversation

@etrepum
Collaborator

@etrepum etrepum commented May 28, 2026

Description

Cleans up the Playwright e2e suite: removes stale browser workarounds, audits every @flaky test, fixes the ones that are genuinely flaky, and removes all @flaky tags. No library/product code is changed — this is test-suite maintenance only.

1. Drop stale Firefox workarounds (follow-up to the selection fix in #8582)

  • Remove the Firefox test.skip on "Move left from last node in RTL" (Bug: Selection movement incorrect for RTL going back from last node in paragraph, #7775); it passes on the current Firefox.
  • Collapse three dead Firefox/WebKit assertion branches in Navigation.spec whose values were already identical to the default. Other Firefox branches were verified still-needed (inline-node caret positions, selectAll root selection, decorator focus) or left alone because they encode IS_WINDOWS-specific behavior that can't be verified from Linux.

2. Audit every @flaky test (32 total)
Ran the suite against the CI-equivalent setup (static playground build on :4000 + collab websocket on :1234, chromium + firefox × {rich-text, plain-text, rich-text-with-collab}, --retries=0, --repeat-each 10–50). Note: running against the live dev server inflates flakiness massively, so a built static server is required to get representative results.

  • 26 tests never failed across ~40–80 runs each → tags removed.
  • 6 tests genuinely flaked → root-caused and fixed (below).
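
As a rough aid for reading these run counts, the sketch below computes how likely a rare flake is to stay hidden in an audit of a given size. It assumes runs are independent and uses the ~1/60 collab setup stall rate quoted in this PR; the function name is illustrative and not part of the suite.

```javascript
// Probability that a test with a given per-run failure rate shows zero
// failures across `runs` independent runs. Independence is an assumption.
function probabilityOfCleanAudit(failureRate, runs) {
  return Math.pow(1 - failureRate, runs);
}

// ~1/60 is the collab setup stall rate reported in this PR.
const stallRate = 1 / 60;
for (const runs of [40, 80, 200]) {
  const clean = probabilityOfCleanAudit(stallRate, runs);
  console.log(
    `${runs} runs: ${(clean * 100).toFixed(1)}% chance of seeing no failure`,
  );
}
```

Under these assumptions a ~1/60 failure goes unseen in about half of all 40-run audits and about a quarter of 80-run audits, but only a few percent of 200-run samples, which is consistent with validating the stall fix on a 200-run stress sample.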

3. Fix the genuinely-flaky tests

  • Collab setup (whole-suite, dominant cause): exposeLexicalEditor waited for the right frame's connect button with the default 5s timeout, and under parallel load a split-view iframe occasionally fails to connect or to boot/activate collab at all (~1/60, chromium+collab). The readiness check now retries with a page.reload() between attempts (3×, 15s each), so a transient setup hiccup recovers instead of failing the test. This hardens every collab test.
  • ClearFormatting mention typeahead: the test pressed Enter while the menu still reflected a partial query (@Lu also matches "Agent Kallus", which is highlighted first), so the wrong mention was selected. It now waits for "Luke Skywalker" to be the aria-selected option before pressing Enter.
  • Toolbar image insert: the image renders behind React.Suspense (fallback={null}) and only appears after the asset loads, which can exceed the 5s assert under load. Now waits for .editor-image img before asserting.
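
The mention fix above boils down to "wait for the intended option to be highlighted, then press Enter". A minimal sketch of that pattern follows, assuming `page` is a Playwright Page. The function name and the `[role="option"]` selector are assumptions for illustration, not the suite's actual helper or markup.

```javascript
// Sketch only: `page` is assumed to be a Playwright Page. The selector is a
// guess at the typeahead markup and may differ from the playground's.
async function selectHighlightedMention(page, fullName, timeoutMs = 5000) {
  // Match only the option that is currently highlighted AND shows the
  // expected name, so a partial-query match cannot satisfy the wait.
  const highlighted = page.locator('[role="option"][aria-selected="true"]', {
    hasText: fullName,
  });
  await highlighted.waitFor({state: 'visible', timeout: timeoutMs});
  // Enter confirms whichever option is highlighted, which is now the right one.
  await page.keyboard.press('Enter');
}
```

A call such as `await selectHighlightedMention(page, 'Luke Skywalker')` would replace a bare Enter press after typing the query.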

Result: no @flaky tags remain in the e2e suite, and the collab test harness is materially more robust.

The diff is large but almost entirely Prettier re-indentation: removing the {tag: '@flaky'} options argument turns 3-arg test() calls back into 2-arg calls and de-indents the bodies. git diff -w shows the real changes are only the tag/workaround removals plus the helper edits in utils/index.mjs.
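
To make the two call shapes concrete, here is a toy stand-in for a test registry. It is hypothetical and is not Playwright's implementation; `createRunner` and `grep` are names invented for this sketch. It only illustrates how dropping the options argument changes the call from three arguments to two, and why a tag filter then stops matching.

```javascript
// Hypothetical mini registry, for illustration only.
function createRunner() {
  const registry = [];
  return {
    // Accepts both shapes: test(title, body) and test(title, details, body).
    test(title, detailsOrBody, maybeBody) {
      const hasDetails = typeof detailsOrBody !== 'function';
      const details = hasDetails ? detailsOrBody ?? {} : {};
      const body = hasDetails ? maybeBody : detailsOrBody;
      // A tag may be given as a single string or as an array of strings.
      const tags = [].concat(details.tag ?? []);
      registry.push({title, tags, body});
    },
    // Select tests whose title or tags match, similar in spirit to --grep.
    grep(pattern) {
      return registry.filter(({title, tags}) =>
        pattern.test([title, ...tags].join(' ')),
      );
    },
  };
}

const runner = createRunner();
// Before: three arguments, with the tag carried in the options object.
runner.test('Insert image caption + table', {tag: '@flaky'}, async () => {});
// After: two arguments, no tag.
runner.test('Can align text using Table selection', async () => {});

console.log(runner.grep(/@flaky/).map((t) => t.title));
```

Once every test uses the two-argument form, a filter on `@flaky` selects nothing, which is the situation the later `--pass-with-no-tests` commit addresses.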

Test plan

Automated — Playwright, against the CI-equivalent static build (:4000) + collab server (:1234), --retries=0:

Before

  • ClearFormatting "default styling of hashtags and mentions": ~25% failure on firefox + collab (wrong mention selected).
  • Toolbar "Insert image caption + table": intermittent empty image decorator.
  • Collab tests: ~1/60 setup failures (exposeLexicalEditor timeout waiting for the connect button) under --repeat-each stress; 3 failures in a 180-run chromium+collab sample.

After

  • ClearFormatting: 30/30 firefox + collab.
  • Collab boot-stall fix: 200/200 chromium + collab and 120/120 firefox + collab (--repeat-each=50/30, --retries=0).
  • Full @flaky set across all 6 CI configs: stable; playwright test --list parses all 2124 tests.


@meta-cla meta-cla Bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label May 28, 2026
@etrepum etrepum added the extended-tests Run extended e2e tests on a PR label May 28, 2026
@etrepum etrepum marked this pull request as ready for review May 28, 2026 21:48
etrepum and others added 3 commits May 28, 2026 14:59
Audited all 32 @flaky e2e tests against the CI flaky-job configuration:
static playground build served on :4000, collab websocket on :1234,
chromium + firefox x {rich-text, plain-text, rich-text-with-collab},
--repeat-each=10 --retries=0 --workers=4, plus a --repeat-each=20 deep
collab pass. Each test got ~40-80 runs.

(An initial run against the live vite dev server on :3000 showed ~80
failures, but that was environmental noise from on-the-fly compilation
under load; against the CI-equivalent static build only the tests below
ever failed.)

Removed @flaky from 26 tests that never failed across the entire audit.

Kept @flaky (still intermittently fail, fixed separately):
- ClearFormatting: Should preserve the default styling of hashtags and mentions
- TextFormatting: Regression facebook#2523 can toggle format across a decorator
- Toolbar: Insert image caption + table
- Tables: Select multiple merged cells (selection expands to a rectangle)
- Tables: Can align text using Table selection
- ListsCopyAndPaste: Copy and paste of partial list items into the list

The large diff is prettier re-indenting test bodies after the options
argument was dropped; there are no semantic changes to any test.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…se, collab connect wait)

Investigated the still-@flaky tests. Root causes and fixes:

- exposeLexicalEditor (collab setup): the wait for the right frame's
  ".action-button.connect" to read "Disconnect" used the default 5s expect
  timeout. Under parallel load the shared y-websocket connect exceeds 5s, which
  was the dominant source of collab @flaky failures across the whole suite.
  Bumped that wait (and the editor visibility check) to 30s.

- ClearFormatting "Should preserve the default styling of hashtags and
  mentions": waited for any "@luke" typeahead item then pressed Enter. While
  "@luke" is still being typed the partial query "@lu" also matches
  "Agent Kallus" (kal-LU-s), which sorts earlier and is highlighted, so Enter
  selected it. Now wait for "Luke Skywalker" to be the aria-selected option.
  Verified 30/30 firefox+collab (was ~25% failure). Tag removed.

- Toolbar "Insert image caption + table": the image renders behind
  React.Suspense (fallback={null}) and only appears after the asset loads,
  which can exceed assertHTML's 5s window under load. Wait for ".editor-image
  img" (30s) before asserting. Tag removed.
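
The image wait described in the last item can be sketched as follows, assuming `page` is a Playwright Page. The `.editor-image img` selector and the 30s budget come from the commit message; the function name is illustrative and not the helper actually added to the suite.

```javascript
// Sketch only: waits for the lazily rendered image before any HTML assertion.
async function waitForEditorImage(page, timeoutMs = 30000) {
  // The image sits behind React.Suspense with a null fallback, so nothing
  // is in the DOM until the asset has loaded.
  const image = page.locator('.editor-image img').first();
  await image.waitFor({state: 'visible', timeout: timeoutMs});
  return image;
}
```

Calling this before the HTML assertion moves the slow asset load out of the assertion's shorter 5s window.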

The remaining 4 @flaky tests (TextFormatting facebook#2523, Tables "Select multiple
merged cells", Tables "Can align text using Table selection",
ListsCopyAndPaste "partial list items into the list") only fail via a rarer
shared collab right-iframe boot stall (~1/60, chromium+collab only), which the
30s connect wait reduces but does not eliminate; left tagged pending a
harness-level fix.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…alls; untag remaining @flaky tests

The remaining @flaky tests (TextFormatting facebook#2523, Tables "Select multiple
merged cells", Tables "Can align text using Table selection", ListsCopyAndPaste
"partial list items into the list") only failed via a shared, rare
(~1/60, chromium+collab) collab-setup stall: under parallel load one split-view
iframe occasionally fails to boot / activate collab within the timeout, so its
".action-button.connect" toolbar button never appears and initialize() fails.
This affects every collab test, not these four specifically.

exposeLexicalEditor now retries the collab-frame readiness check, reloading the
page between attempts (up to 3×, 15s each), so a transient boot/connect hiccup
during setup recovers instead of failing the test.
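
A minimal sketch of this retry-with-reload pattern, assuming `page` is a Playwright Page and that the right split-view iframe is named "right". The function name is hypothetical, and the real helper in utils/index.mjs may check more than button visibility (for example, the connected state).

```javascript
// Sketch only: retry the collab-frame readiness check, reloading between
// attempts. Defaults mirror the "up to 3x, 15s each" described above.
async function waitForCollabFrame(page, {attempts = 3, timeoutMs = 15000} = {}) {
  let lastError = new Error('no readiness attempts were made');
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      // The iframe name is an assumption about the split-view markup.
      const connect = page
        .frameLocator('iframe[name="right"]')
        .locator('.action-button.connect');
      await connect.waitFor({state: 'visible', timeout: timeoutMs});
      return attempt; // Ready: report which attempt succeeded.
    } catch (error) {
      lastError = error;
      // Reload only if another attempt will follow.
      if (attempt < attempts) {
        await page.reload();
      }
    }
  }
  // Every attempt failed, so surface the last failure to the test.
  throw lastError;
}
```

Bounding the retries keeps a genuinely broken setup failing loudly, while a one-off boot stall costs one reload instead of a failed test.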

Validated under stress against the static build on :4000: 200/200 chromium+collab
and 120/120 firefox+collab with retries=0 (previously ~3 failures in a
comparable sample). With the stall handled, all four tests are stable, so their
@flaky tags are removed. No @flaky tags remain in the e2e suite.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@etrepum etrepum changed the title from "[tests] Chore: Clean up e2e tests" to "[lexical-playground] Chore: Audit and de-flake the e2e suite (remove all @flaky tags)" May 28, 2026
Removing the last @flaky tags made the flaky CI jobs run
`playwright test --grep "@flaky"`, which now matches zero tests and errors
with "No tests found". Pass --pass-with-no-tests on the flaky invocations so
the jobs are a no-op success, and still run normally if a @flaky test is
re-introduced later.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@etrepum etrepum added this pull request to the merge queue May 29, 2026
Merged via the queue into facebook:main with commit 13fc148 May 29, 2026
45 checks passed
etrepum pushed a commit to etrepum/lexical that referenced this pull request May 30, 2026
All @flaky tags were removed in facebook#8585 and the suite has been de-flaked, so the
machinery for splitting "flaky" tests out of CI is dead weight that only invites
re-adding flaky tests. Remove it:

- call-e2e-all-tests.yml: drop the (already `if: false`) `flaky` job.
- call-e2e-test.yml: drop the `flaky` input, `continue-on-error`, the
  `--grep`/`--grep-invert "@flaky"` (+ `--pass-with-no-tests`) toggles, and the
  `flaky` segment of the artifact name. Steps now just run the suite.
- package.json: drop `--grep-invert "@flaky"` from the *-ci-* e2e scripts.

No test carried an actual `@flaky` tag, so this is behavior-preserving.