
fix(e2e): unblock nightly-deploy-moddable cell #204

Merged
ottovlotto merged 1 commit into main from becca/e2e-moddable-cell-fixes
May 22, 2026

Conversation

@ottovlotto
Collaborator

Summary

Three test-infra fixes for the first-night failure on the moddable cell (issue #202). The CLI itself worked correctly on the failing nightly — exit 0, deploy completed, registry stamped. The cell failed because:

  • vitest hung ~30 s after junit was written, killed by the action runner → exit 1 → "attempt 1 failed";
  • nick-fields/retry re-ran the setup; attempt 2's gh repo create paritytech/e2e-cli-moddable-<run_id> collided with attempt 1's leftover repo ("Name already exists") → real failure.

1. Drop the un-unref'd 30 s setTimeout that caused the vitest hang

src/utils/connection.ts::timeoutAfter and the duplicate in e2e/cli/helpers/chain.ts::getTestClient both create a setTimeout(..., 30_000) inside Promise.race. When connectPaseo() wins the race, the loser timer is not cancelled — it stays pending and keeps Node's event loop alive until it fires. The CLI's scheduleHardExit() in process-guard.ts papers over this in production; the vitest harness has no equivalent and was hanging past its 5 s teardownTimeout, surfacing as:

close timed out after 5000ms
Tests closed successfully but something prevents Vite server from exiting

Fix: call .unref() on the timer in connection.ts, and remove the redundant wrapper in chain.ts (it layered a second 30 s timeout on top of the one already inside getConnection()).
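A minimal sketch of the pattern, assuming a Node.js runtime. The helper names and signatures here are illustrative and are not copied from connection.ts; only the idea of unref'ing the losing timer comes from the fix described above.

```typescript
// Hypothetical helper: rejects after `ms`, but never keeps the process alive on its own.
function timeoutAfter(ms: number, label: string): Promise<never> {
  return new Promise<never>((_resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`${label} timed out after ${ms} ms`)),
      ms,
    );
    // In Node.js, unref() lets the event loop exit while this timer is still pending.
    // The cast and optional call keep the sketch harmless where timers are plain numbers.
    (timer as unknown as { unref?: () => void }).unref?.();
  });
}

// Races the real work against the timeout. If `work` wins, the losing timer
// stays scheduled but no longer blocks process exit.
async function withTimeout<T>(work: Promise<T>, ms: number, label: string): Promise<T> {
  return await Promise.race([work, timeoutAfter(ms, label)]);
}
```

One caveat with this approach: an unref'd timer only fires if something else keeps the process alive, so it works as a guard around live I/O rather than as a standalone delay. Clearing the timer once the race settles would be an alternative design.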

2. Make setupModdableFixture retry-safe

Probe gh repo view paritytech/<repoName> before gh repo create. If the repo already exists (retry within the same run, or leftover from a crashed earlier run), reuse it: git remote add origin … && git push -u origin main --force. Mirrors what the CLI itself logs on re-run (using existing origin (...)). Force-push is safe — fixture content is identical between attempts.

gh auth setup-git is invoked before the raw git push so HTTPS auth via GH_TOKEN flows through gh's credential helper (which gh repo create --push does implicitly).
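The probe-then-reuse flow could be sketched roughly as follows. This is not the actual setupModdableFixture code: the `Run` type, `FixtureRepo` shape, and `ensureFixtureRepo` name are hypothetical, the `--private` visibility is an assumption, and the remote URL is left for the caller to supply. The command runner is injected so the flow can be exercised without gh or git installed.

```typescript
// Returns true when the command exits 0 (hypothetical abstraction over a process spawner).
type Run = (cmd: string, args: string[]) => boolean;

interface FixtureRepo {
  slug: string;      // owner/name of the run-scoped repo
  remoteUrl: string; // HTTPS remote URL, supplied by the caller (assumption)
}

function ensureFixtureRepo(run: Run, repo: FixtureRepo): "created" | "reused" {
  const must = (cmd: string, args: string[]): void => {
    if (!run(cmd, args)) throw new Error(`${cmd} ${args.join(" ")} failed`);
  };

  // Probe first. This sketch treats any non-zero exit from `gh repo view` as "missing".
  if (!run("gh", ["repo", "view", repo.slug])) {
    // Visibility flag is an assumption; --source/--push create and push in one step.
    must("gh", ["repo", "create", repo.slug, "--private", "--source", ".", "--push"]);
    return "created";
  }

  // Repo already exists: wire up gh's credential helper, then reuse it.
  must("gh", ["auth", "setup-git"]);
  must("git", ["remote", "add", "origin", repo.remoteUrl]);
  must("git", ["push", "-u", "origin", "main", "--force"]);
  return "reused";
}
```

A real runner might wrap `spawnSync` from `node:child_process` and return `status === 0`; in a test, a fake runner that records its calls is enough to check the ordering.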

3. Cleanup post-step in the workflow

Add an if: always() && matrix.cell == 'nightly-deploy-moddable' step that runs gh repo delete paritytech/e2e-cli-moddable-${{ github.run_id }} --yes || true after artefacts are uploaded. It runs on success, failure, and cancellation. The weekly e2e-cleanup.yml cron stays as the backstop, but per-run cleanup means we don't accumulate one orphan per failed nightly.
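As a workflow fragment, the cleanup step might look roughly like this. The step name and the secret name are placeholders, not taken from the actual workflow; the token would need permission to delete repositories (the delete_repo scope for gh).

```yaml
- name: Delete per-run fixture repo   # hypothetical step name
  if: always() && matrix.cell == 'nightly-deploy-moddable'
  env:
    GH_TOKEN: ${{ secrets.E2E_REPO_ADMIN_TOKEN }}   # hypothetical secret name
  # "|| true" keeps a missing repo (already deleted, never created) from failing the job.
  run: gh repo delete paritytech/e2e-cli-moddable-${{ github.run_id }} --yes || true
```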

Test plan

  • pnpm format:check — clean
  • pnpm lint:license — clean
  • pnpm build — clean
  • pnpm test — all unit tests pass (one telemetry test is a known flake under suite-wide contention, passes in isolation)
  • nightly-deploy-moddable cell passes on the first attempt (verified via workflow_dispatch trigger after merge)
  • Subsequent scheduled nightly is green and Nightly E2E failure: 2026-05-22 #202 auto-closes

Closes #202

Three test-infra fixes for the first-night failure on the moddable cell:

1. Drop un-unref'd 30s setTimeout in `connection.ts::timeoutAfter` (and the
   redundant duplicate in `chain.ts::getTestClient`). The timer was the
   loser in `Promise.race` after the chain client connected, but stayed
   pending and kept the event loop alive for ~30s past test completion.
   The CLI's `scheduleHardExit` papers over it in production; the e2e
   harness has no equivalent and was hanging past vitest's 5s
   teardownTimeout, surfacing as `close timed out after 5000ms / Tests
   closed successfully but something prevents Vite server from exiting`.

2. Make `setupModdableFixture` retry-safe. Probe `gh repo view` before
   create; if the repo already exists (retry within the same run, or
   leftover from a crashed prior run), reuse it via origin-add +
   force-push instead of crashing on "Name already exists". Mirrors the
   CLI's own "using existing origin (...)" path and lets
   `nick-fields/retry` actually retry the cell without colliding on the
   run-scoped repo name.

3. Add `if: always()` cleanup post-step to the workflow that deletes the
   per-run paritytech/e2e-cli-moddable-<run_id> repo. Runs on success,
   failure, and cancellation. The weekly e2e-cleanup.yml cron stays as a
   backstop, but per-run cleanup means retries start clean and we don't
   accumulate one orphan per failed nightly.

Closes #202
@github-actions
Contributor

Dev build ready — try this branch:

curl -fsSL https://raw.githubusercontent.com/paritytech/playground-cli/main/install.sh | VERSION=dev/becca/e2e-moddable-cell-fixes bash

@github-actions
Contributor

E2E Test Pass · ✅ PASS

Tag: e2e-ci-pr · Branch: becca/e2e-moddable-cell-fixes · Commit: 0836c15 · Run logs

| Cell | Result | Time |
| --- | --- | --- |
| pr-deploy-cdm | ✅ PASS | 2m06s |
| pr-init-session | ✅ PASS | 1m45s |
| pr-deploy-foundry | ✅ PASS | 0m38s |
| pr-deploy-frontend | ✅ PASS | 3m30s |
| pr-install | ✅ PASS | 0m43s |
| pr-preflight | ✅ PASS | 1m26s |
| pr-mod | ✅ PASS | 2m02s |
| ${{ matrix.cell }} | ⏭️ SKIP | 0m00s |
| ${{ matrix.cell }} | ⏭️ SKIP | 0m00s |

Sentry traces: view spans for this run

@ottovlotto
Collaborator Author

Verified the moddable fix via workflow_dispatch on this branch (run 26298463347):

  • nightly-deploy-moddable passed in 111 s on the first attempt (no retry; the "Surface failure detail" step was skipped, confirming the test cell exited cleanly within vitest's 5 s teardown budget).
  • Cleanup post-step ran successfully — the per-run repo was deleted.
  • All other nightly-only cells (nightly-mod-miss, nightly-deploy-hardhat, nightly-deploy-multi, nightly-chaos-rpc, nightly-chaos-sigint, nightly-diagnostic, nightly-rejections, Init cold-start smoke test) also green.

pr-init-session and pr-install were cancelled within ~20–50 s of starting ("The operation was canceled": a GH runner cancellation, not a test failure). The pull_request CI on this same branch (run 26293100972) had all pr-* cells green, so the cancellation here was environmental.

@ottovlotto ottovlotto merged commit 17890c5 into main May 22, 2026
35 of 37 checks passed
@ottovlotto ottovlotto deleted the becca/e2e-moddable-cell-fixes branch May 22, 2026 16:42
