
ci: pre-deploy smoke gate (Option C) so a JS-dead render can't reach isamples.org #225

Merged
rdhyee merged 2 commits into isamplesorg:main from rdhyee:smoke-gate-option-c on May 15, 2026

@rdhyee rdhyee commented May 15, 2026

Problem

The deploy workflow runs quarto render and ships whatever docs/ it produces. Neither code review nor pytest --collect-only ever loads the rendered page in a browser, so a render that "succeeds" but yields a JS-dead explorer (DuckDB-WASM never inits, Cesium blank, search returns nothing) deploys to isamples.org anyway. This is the failure class behind past "we reviewed it and it still broke" incidents.

What this adds

  • tests/test_smoke.py: fundamental-liveness gate. Single fresh browser context, one navigation, then poll for readiness. There is deliberately no reload loop, because rapid reloads exhaust the DuckDB-WASM worker and produce false failures. Four unambiguous assertions:
    1. DuckDB-WASM initialized (SESAR facet count populated)
    2. Cesium canvas attached (globe actually drew)
    3. A world search via the visible #searchSubmitBtn returns results
    4. No uncaught JS exception and no regression-fingerprint console error
  • quarto-pages.yml: smoke step inserted between quarto render and Deploy, serving the rendered docs/ locally. Fail-closed: a smoke failure fails the job, so Deploy is skipped and a broken render never reaches production. A trap reaps the static server on exit, which is needed because GitHub's bash -e aborts the script at the first failing command.
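
The poll-for-readiness approach described above can be sketched in plain Python. This is a minimal illustration and not the code in tests/test_smoke.py: `poll_until`, `probe`, and the timing parameters are hypothetical names, and the real test presumably relies on Playwright's own waiting rather than a hand-rolled loop.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def poll_until(probe: Callable[[], Optional[T]], timeout_s: float,
               interval_s: float = 0.25) -> T:
    """Call probe() until it returns a truthy value or the deadline passes.

    The page is never reloaded; the same state is simply re-checked.
    Raises TimeoutError if readiness is not reached in time.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        value = probe()
        if value:
            return value
        if time.monotonic() >= deadline:
            raise TimeoutError(f"not ready after {timeout_s}s")
        time.sleep(interval_s)


if __name__ == "__main__":
    # Hypothetical probe: becomes ready on the third check.
    state = {"checks": 0}

    def fake_probe() -> Optional[str]:
        state["checks"] += 1
        return "ready" if state["checks"] >= 3 else None

    print(poll_until(fake_probe, timeout_s=2.0, interval_s=0.01))
```

A single deadline with repeated cheap checks keeps the failure mode unambiguous: either the signal appears, or a TimeoutError fails the test.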

Validation

  • Passes the current known-good build in ~15s (fast, and no false failure on a healthy build).
  • Raises TimeoutError on a rendered-but-JS-dead page → pytest fails → step fails → Deploy skipped (fail-closed confirmed).

Notes

  • The gate executes on push-to-main (the deploy trigger), so it self-validates on its first real deploy after merge. Worst case of a false-fail is a blocked deploy, not a broken site.
  • Follow-up (Option A, separate): post-deploy check vs live isamples.org (cache-busted) as a backstop for prod-only data/CDN issues.

🤖 Generated with Claude Code

rdhyee and others added 2 commits May 15, 2026 16:06
The deploy workflow runs `quarto render` and ships whatever docs/ it
produces; nothing ever loads the rendered page in a browser, so a render
that "succeeds" but yields a page where DuckDB-WASM never inits, Cesium
never draws, or search returns nothing has historically deployed anyway
(the failure class behind past "reviewed and still broke" incidents).

Adds tests/test_smoke.py: single fresh context, one navigation,
poll-for-readiness (no reload loop — rapid reloads exhaust the
DuckDB-WASM worker and false-fail). Asserts four unambiguous liveness
signals: facet query populated, Cesium canvas attached, a world search
returns results, no uncaught JS exception / regression-fingerprint
console error.

Wires it into quarto-pages.yml between render and Deploy, serving the
rendered docs/ locally. Fail-closed: smoke failure fails the job and the
Deploy step is skipped. trap-reaps the static server under `bash -e`.

Validated both directions: passes the known-good build in ~15s; raises
TimeoutError on a rendered-but-JS-dead page.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…g#225

- Scope _FATAL_CONSOLE to same-origin scripts: a third-party
  console.error (Cesium CDN, injected extension) can no longer block a
  deploy; pageerror stays the unconditional hard signal for uncaught
  app exceptions.
- Cesium check now also asserts non-zero canvas dimensions, catching
  the "widget mounted but globe never sized" case without flaky pixel
  readback.
- Search-result wait 60s -> 90s, aligned with the perf test budget, so
  a slow CI cold DuckDB-WASM query + remote parquet fetch doesn't
  false-fail a healthy build.

Re-validated: passes the known-good build (~20s).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

rdhyee commented May 15, 2026

How to test this gate / what to look for

Run it exactly as CI will

git fetch origin smoke-gate-option-c && git checkout smoke-gate-option-c
python3 -m http.server 8080 --directory docs &
ISAMPLES_BASE_URL=http://localhost:8080 pytest tests/test_smoke.py -s -q
kill %1

Expected (healthy build): SMOKE OK — search result: '50+ results for "pottery"' then 1 passed in ~15–20s.

Prove it's fail-closed (catches a break)

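cp docs/explorer.html /tmp/explorer.bak
# BSD/macOS sed syntax; with GNU sed (Linux) use `sed -i` without the empty ''
sed -i '' "s/data-value='SESAR'/data-value='BROKEN'/g" docs/explorer.html
# the static server on :8080 must still be running (restart it if you already ran `kill %1`)
ISAMPLES_BASE_URL=http://localhost:8080 pytest tests/test_smoke.py -s -q   # -> 1 failed (TimeoutError on readiness)
cp /tmp/explorer.bak docs/explorer.html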

What each assertion proves

| Signal | Proves | Pass criterion |
| --- | --- | --- |
| SESAR facet count | DuckDB-WASM initialized & ran a query | text like (4,389,231) within 90s |
| Cesium canvas | Globe actually drew, not just a container | canvas attached + non-zero clientWidth/Height |
| World search | Visible #searchSubmitBtn → query path alive | #searchResults shows …results… with a digit, ≤90s |
| pageerror / same-origin fatal console | No uncaught JS exception | zero captured |
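
The two text-based pass criteria above could be expressed roughly as follows. This is a sketch under assumptions: the patterns are inferred only from the examples quoted in this comment (`(4,389,231)` and `50+ results for "pottery"`), and the function names are hypothetical rather than taken from tests/test_smoke.py.

```python
import re

# Assumed shape of a populated facet count, e.g. "(4,389,231)".
FACET_COUNT = re.compile(r"\(\d[\d,]*\)")


def facet_populated(facet_text: str) -> bool:
    """True when the facet label carries a parenthesised number."""
    return FACET_COUNT.search(facet_text) is not None


def search_has_results(results_text: str) -> bool:
    """True when the text mentions 'results' and contains at least one digit."""
    has_word = "results" in results_text.lower()
    has_digit = re.search(r"\d", results_text) is not None
    return has_word and has_digit


if __name__ == "__main__":
    print(facet_populated("SESAR (4,389,231)"))
    print(search_has_results('50+ results for "pottery"'))
```

Note that this sketch would also accept a string such as "0 results"; whether the real test distinguishes that case is not stated here.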

Review sanity-checks

  • Fail-closed wiring: in quarto-pages.yml the smoke step sits before Deploy 🚀; pytest is the last command, so a non-zero exit aborts the job (GitHub's bash -eo pipefail) and Deploy never runs.
  • No false-fails: console check is scoped to same-origin scripts (Cesium CDN / browser extensions can't block a deploy); pageerror is the unconditional hard signal.
  • Self-validation: the gate only executes on push-to-main (the deploy trigger), so its first real run is the post-merge deploy. A false-fail there blocks a deploy but does not break the live site.
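
The same-origin scoping of the console check can be illustrated with a small pure function. This is a hedged sketch, not the project's `_FATAL_CONSOLE` logic: `is_fatal_console` and its arguments are hypothetical, the regression-fingerprint matching is omitted because its patterns are not shown in this PR, and origins are compared without normalising default ports.

```python
from urllib.parse import urlsplit


def origin(url: str) -> tuple:
    """Return (scheme, host, port) for a URL; the port is None when not explicit."""
    parts = urlsplit(url)
    return (parts.scheme, parts.hostname, parts.port)


def is_fatal_console(msg_type: str, script_url: str, page_url: str) -> bool:
    """True only for console errors emitted by a script on the page's own origin.

    Third-party sources (a CDN, an injected extension) are ignored, so they
    cannot block a deploy. Uncaught exceptions are assumed to be handled
    separately via the pageerror signal.
    """
    if msg_type != "error":
        return False
    if not script_url:
        return False  # no source location, so it cannot be attributed to the app
    return origin(script_url) == origin(page_url)


if __name__ == "__main__":
    page = "http://localhost:8080/explorer.html"
    print(is_fatal_console("error", "http://localhost:8080/app.js", page))
    print(is_fatal_console("error", "https://cdn.example.com/Cesium.js", page))
```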

CI-only difference from the local run: CI first does pip install pytest playwright && playwright install --with-deps chromium and trap-reaps the background server. Otherwise identical.
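
For readers who prefer to see the "serve docs/, run the check, always reap the server" pattern in Python rather than shell, here is a rough analogue. It is an illustration only: the workflow itself uses a shell trap, and `serve_directory` and `fetch` are hypothetical helpers built on the standard library's http.server.

```python
import contextlib
import functools
import http.client
import http.server
import pathlib
import tempfile
import threading
from urllib.parse import urlsplit


@contextlib.contextmanager
def serve_directory(directory: str):
    """Serve `directory` on a free local port and always shut down on exit."""
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    # Port 0 asks the OS for any free port.
    server = http.server.ThreadingHTTPServer(("127.0.0.1", 0), handler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        yield f"http://127.0.0.1:{server.server_address[1]}"
    finally:
        # Runs whether the body passed or raised, like a shell trap on EXIT.
        server.shutdown()
        server.server_close()
        thread.join()


def fetch(base_url: str, path: str):
    """Return (status, body) for a GET against the local server."""
    parts = urlsplit(base_url)
    conn = http.client.HTTPConnection(parts.hostname, parts.port, timeout=5)
    try:
        conn.request("GET", path)
        resp = conn.getresponse()
        return resp.status, resp.read()
    finally:
        conn.close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as docs:
        pathlib.Path(docs, "index.html").write_text("<h1>ok</h1>")
        with serve_directory(docs) as base_url:
            print(fetch(base_url, "/index.html")[0])
```

The `finally` block plays the role of the trap: the server is stopped even when the check inside the `with` body fails.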

@rdhyee rdhyee merged commit d5091df into isamplesorg:main May 15, 2026
1 check passed