infra: exclude demo-ubi.spec.ts from CI Playwright run (unblock smoke runtime budget)#424
Conversation
… plan Pipeline trail for the demo-ubi CI exclusion work that clears the smoke job's reseed-runtime block. idea.md (preflight refresh): priority framing shifted from "smoke red on every PR" to "precondition for re-enabling per-PR smoke" since the SMOKE_TEST gate landed 2026-06-02. AC-8 citation corrected (1140s/19 min hard ceiling, ~28 min worst case — not the 24-min downstream drift in pr.yml/demo-ubi.spec.ts). Decisions locked: D-1 Option A (testIgnore), D-2 Option C deferred (operator picked), D-3 Option B rejected (math), D-4 sibling coordination. feature_spec.md: 5 FRs (testIgnore extension, vitest regression guard, runbook section, pr.yml comment refresh, state.md update), 7 ACs, single- phase. GPT-5.5 cross-model review: 3 cycles, 13 findings (1 H + 5 M + 7 L), all accepted and applied. Convergence at cycle 3. implementation_plan.md: 1 epic, 5 stories one-per-FR, 0 endpoints, 1 new test file, 4 modified files. GPT-5.5: 3 cycles, 11 findings (0 H + 4 M + 7 L), all accepted. Convergence at cycle 3. pipeline_status.md: spec + plan finalized, ready for execution. Dashboards regenerated by the mvp1-dashboard-regen pre-commit hook (176 features across 3 releases). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: SoundMindsAI <eric.starr@soundminds.ai>
….1 + 1.2) The smoke job's demo-ubi.spec.ts beforeAll hook drives a full reseed that exceeds the 25-min job cap. AC-8 of feat_demo_ubi_study_comparison bounds the in-flight reseed at 1140s (~19 min hard ceiling) with §14 estimating ~28 min worst case once the Solr scenario lights up. Adding Playwright + smoke-job setup overhead pushes total wall-clock past the cap (PR #383 run 26790636716 hit it at 25:18). Fix: extend playwright.config.ts's testIgnore CI-gated branch by one entry — '**/demo-ubi.spec.ts' — joining the 6 pre-existing demo-data- dependent specs. Single-file edit; matches the established pattern from chore_drop_demo_seed_from_ci + PR #291's 4th-run surface. Local coverage preserved: CI=unset (the normal local-dev case) still discovers and runs demo-ubi.spec.ts. The file itself is unchanged. Story 1.2 (vitest regression guard): ui/src/__tests__/playwright-config-test-ignore.test.ts reads playwright.config.ts as text and asserts (a) demo-ubi entry is in the CI ternary branch, (b) all 7 expected CI-gated entries are present, (c) demo-ubi does NOT appear outside the CI ternary (local coverage intact). Text-grep approach per spec D-7 — lowest-coupling, no module-reload tricks. §16 manual verification (recorded in this PR's body): CI=true playwright test --list -> 86 tests in 30 files, 0 demo-ubi CI=unset playwright test --list -> 110 tests in 37 files, demo-ubi discovered (5 grep matches) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: SoundMindsAI <eric.starr@soundminds.ai>
…ries 1.3 + 1.4 + 1.5)
Story 1.3 — docs/03_runbooks/smoke-solr-stability.md gains §5
"Reseed runtime (demo-ubi exclusion)". Explains why the exclusion
exists (AC-8 vs smoke-cap mismatch — cites the actual 1140s/19 min
hard ceiling, not the 24-min downstream drift), where it lives (the
testIgnore CI branch in playwright.config.ts — single source of
truth), the local-coverage promise (CI=unset keeps demo-ubi running
locally), the nightly-CI caveat (a future nightly-on-GHA job would
also exclude demo-ubi unless it overrides CI or uses a separate
config — defer until needed), and the Option C path-forward if
per-PR demo-ubi coverage is ever wanted.
Note: numbered §5 (not the spec's literal "§4") because the existing
§4 "Why each lever is GHA-only" pairs tightly with §3's lever
cascade — inserting between them would interrupt that flow. FR-3's
"or wherever it fits the runbook's flow" clause covers this; AC-4's
literal number was paraphrasing FR-3's intent (section by name, not
by ordinal).
Story 1.4 — .github/workflows/pr.yml comment blocks refreshed:
- Lines 42-58 (SMOKE-TEST opt-in switch note): replace "demo-ubi
reseed exceeds the per-PR budget" framing with "runtime block
cleared via testIgnore — flip SMOKE_TEST=true after the §16
verification". Operator opt-in commands unchanged.
- Lines 507-523 (smoke-test job header / timeout-minutes comment):
replace "AC-8 bounds at 24 min" framing with "runtime is
expected to fit within the 25-min cap post-demo-ubi-exclusion".
YAML structure untouched: if-gate, timeout-minutes, needs, env,
permissions, steps all byte-identical. Comments-only diff verified
with awk filter (zero non-comment changed lines).
Story 1.5 — state.md updated:
- "CI note" paragraph (lines 13-15): the two stale sentences
("drives the demo-ubi reseed, which routinely hits the 25-min
cap" and "Until the reseed-runtime fix lands, leave it off")
replaced with framing that preserves SMOKE_TEST=OFF-by-default,
names the demo-ubi exclusion as the shipped fix, and points at
the spec §16 verification.
- "Known debt / fragility" section: the Solr CI-readiness entry
was the umbrella tracking three sub-tasks (backend, Solr
stability, reseed runtime); rewritten as fully resolved with
the third sub-task now shipped here.
The "Last 5 merges" entry is NOT added here — that's the
finalization step's responsibility (after PR merge), per Epic gate
item #9 of the implementation plan.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: SoundMindsAI <eric.starr@soundminds.ai>
There was a problem hiding this comment.
Code Review
This pull request implements the infra_smoke_reseed_runtime_budget feature, which excludes the heavy demo-ubi.spec.ts Playwright E2E test from the CI smoke test run via the testIgnore configuration in ui/playwright.config.ts to prevent job timeouts. It also adds a Vitest regression test to guard this configuration and updates relevant documentation. The reviewer provided valuable feedback on the new test file, suggesting the use of import.meta.url instead of process.cwd() for more robust path resolution, and pointing out a potential cross-platform failure on Windows due to hardcoded \n line endings in the sliceConfig helper function.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
| import { readFileSync } from 'node:fs'; | ||
| import { join } from 'node:path'; | ||
|
|
||
| import { describe, expect, it } from 'vitest'; | ||
|
|
||
| const CONFIG_PATH = join(process.cwd(), 'playwright.config.ts'); |
There was a problem hiding this comment.
Using process.cwd() to resolve the path to playwright.config.ts makes the test fragile if executed from a different directory (e.g., the repository root in a monorepo setup). It is more robust to resolve the path relative to the test file's directory using import.meta.url.
| import { readFileSync } from 'node:fs'; | |
| import { join } from 'node:path'; | |
| import { describe, expect, it } from 'vitest'; | |
| const CONFIG_PATH = join(process.cwd(), 'playwright.config.ts'); | |
| import { readFileSync } from 'node:fs'; | |
| import { dirname, resolve } from 'node:path'; | |
| import { fileURLToPath } from 'node:url'; | |
| import { describe, expect, it } from 'vitest'; | |
| const __dirname = dirname(fileURLToPath(import.meta.url)); | |
| const CONFIG_PATH = resolve(__dirname, '../../playwright.config.ts'); |
| function sliceConfig(source: string): { | ||
| beforeCi: string; | ||
| ciBranch: string; | ||
| afterCi: string; | ||
| } { | ||
| const ciIdx = source.indexOf('process.env.CI'); | ||
| if (ciIdx === -1) { | ||
| throw new Error( | ||
| 'Could not locate `process.env.CI` in playwright.config.ts — testIgnore CI-ternary shape changed?', | ||
| ); | ||
| } | ||
| // Find the `? [` after process.env.CI. | ||
| const openBracketIdx = source.indexOf('? [', ciIdx); | ||
| if (openBracketIdx === -1) { | ||
| throw new Error( | ||
| 'Could not locate `? [` after `process.env.CI` — testIgnore CI-ternary shape changed?', | ||
| ); | ||
| } | ||
| // Find the matching `] : []` close that ends the ternary. | ||
| const closeIdx = source.indexOf(']\n : []', openBracketIdx); | ||
| if (closeIdx === -1) { | ||
| throw new Error( | ||
| 'Could not locate `] : []` closing the CI ternary — testIgnore CI-ternary shape changed?', | ||
| ); | ||
| } | ||
| return { | ||
| beforeCi: source.slice(0, ciIdx), | ||
| ciBranch: source.slice(openBracketIdx, closeIdx), | ||
| afterCi: source.slice(closeIdx), | ||
| }; | ||
| } |
There was a problem hiding this comment.
The sliceConfig function is sensitive to line endings because it searches for ]\n : [] and expects \n. On Windows environments where files are checked out with \r\n (CRLF), this test will fail. Normalizing the line endings of the source string to \n before processing will make the test cross-platform robust.
function sliceConfig(source: string): {
beforeCi: string;
ciBranch: string;
afterCi: string;
} {
const normalizedSource = source.replace(/\r\n/g, '\n');
const ciIdx = normalizedSource.indexOf('process.env.CI');
if (ciIdx === -1) {
throw new Error(
'Could not locate `process.env.CI` in playwright.config.ts — testIgnore CI-ternary shape changed?',
);
}
// Find the `? [` after process.env.CI.
const openBracketIdx = normalizedSource.indexOf('? [', ciIdx);
if (openBracketIdx === -1) {
throw new Error(
'Could not locate `? [` after `process.env.CI` — testIgnore CI-ternary shape changed?',
);
}
// Find the matching `] : []` close that ends the ternary.
const closeIdx = normalizedSource.indexOf(']\n : []', openBracketIdx);
if (closeIdx === -1) {
throw new Error(
'Could not locate `] : []` closing the CI ternary — testIgnore CI-ternary shape changed?',
);
}
return {
beforeCi: normalizedSource.slice(0, ciIdx),
ciBranch: normalizedSource.slice(openBracketIdx, closeIdx),
afterCi: normalizedSource.slice(closeIdx),
};
}…fig-test-ignore.test.ts Two Medium-severity findings, both accepted: 1. Path resolution via `import.meta.url` instead of `process.cwd()`. Plan D-7 explicitly approved both options; Gemini's robustness point holds for ad-hoc operator runs like `pnpm vitest run ui/src/__tests__/...` from the repo root (where cwd would be the repo root, not ui/, and the lookup would fail). `import.meta.url` works in both the canonical `pnpm --dir ui test` shape and the ad-hoc shape. Strictly more robust. 2. CRLF normalization in sliceConfig() before the `\n`-anchored indexOf searches. Zero-cost defense for any future Windows checkout where git's autocrlf converts line endings; macOS/Linux unchanged. Spec D-7 didn't address this; accepting as free defense. Vitest after both fixes: 3/3 still pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: SoundMindsAI <eric.starr@soundminds.ai>
…issing markdown links) Two findings accepted, one rejected: ACCEPTED #2 (Medium): ui/playwright.config.ts comment said "See ... smoke-solr-stability.md §4 for the lever cascade context" but §4 is "Why each lever is GHA-only", not the lever cascade (which is §3). My new section about reseed runtime is §5. Updated to point at §5 for the reseed-runtime-vs-Solr-stability relationship table (which is where the broader cascade context is explained in the demo-ubi exclusion narrative). ACCEPTED #3 (Low): FR-3 required the new runbook §5 to "cross-link" to ui/playwright.config.ts and ui/tests/e2e/demo-ubi.spec.ts. Inline-code mentions don't satisfy "cross-link" — converted to clickable markdown links with verified resolvable relative paths. REJECTED #1 (High): "AC-7 file-shape contract violated" — re-raise without new evidence. Counter-evidence cited in PR #424 body's "Diff scope" section: every recent PR (#383, #416, #421, #422) ships the pipeline-trail (idea/spec/plan/pipeline_status) per project convention; dashboard regen files are emitted by the mvp1-dashboard-regen pre-commit hook (forbidden to skip per CLAUDE.md Rule #7 "never skip hooks"). AC-7's strict literal "5 files" predates the project-convention consideration of pipeline- trail co-shipping; the spec's intent (the 5 deliverables described in FR-1..FR-5) is satisfied byte-identically in this diff. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: SoundMindsAI <eric.starr@soundminds.ai>
Review adjudication (Gemini Code Assist + GPT-5.5 final review)Commits landing fixes: Gemini Code Assist (2 findings — both accepted)
GPT-5.5 final review (3 findings — 2 accepted, 1 rejected)
Outcomes
CI is re-running on |
Move the feature folder from planned_features/02_mvp2/ to implemented_features/2026_06_02_infra_smoke_reseed_runtime_budget/ (flat, date-prefixed). pipeline_status.md → Implementation: Complete (+Release: mvp2 marker for the dashboard classifier); implementation_plan.md → Complete (PR #424). state.md: prepend the PR #424 one-liner to "Last 5 merges", drop the now-6th infra_solr_ci_readiness entry to state_history.md, refresh the current-branch context + In-flight (Solr-CI debt chain fully drained). Full reasoning narrative added to state_history.md. Dashboards + website roadmap regenerated by the pre-commit hooks (MVP2 now 15 done; this feature classified mvp2 via the Release marker). Tracking issue #409 closed. Signed-off-by: SoundMindsAI <eric.starr@soundminds.ai> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
The smoke job's
demo-ubi.spec.tsbeforeAllhook drives a full demo reseed whose in-flight wall-clock alone consumes the smoke job's 25-min cap. AC-8 offeat_demo_ubi_study_comparisonbounds the reseed at 1140s (~19 min hard ceiling) with §14 estimating ~28 min worst case once the Solr scenario lights up. Adding Playwright + smoke-job setup overhead pushes total wall-clock past 25 min — PR #383 run 26790636716 hit the cap at 25:18.This PR clears the runtime block with a single-file fix:
ui/playwright.config.ts'stestIgnoreCI-gated branch with'**/demo-ubi.spec.ts'. Single-file edit, matches the established pattern fromchore_drop_demo_seed_from_ci+ PR chore(ci): optimize smoke + backend jobs (buildx artifact handoff + image cache + pytest-xdist) #291's 4th-run surface (6 pre-existing CI-gated specs; this is the 7th).ui/src/__tests__/playwright-config-test-ignore.test.ts. Readsplaywright.config.tsas text and asserts (a) demo-ubi in CI branch, (b) all 7 expected entries present, (c) demo-ubi NOT outside the CI ternary (local coverage stays intact).docs/03_runbooks/smoke-solr-stability.md§5 "Reseed runtime (demo-ubi exclusion)" — explains why the exclusion exists, where it lives, the local-coverage promise, the nightly-CI caveat, and the Option C path-forward..github/workflows/pr.ymlcomment blocks (lines 42-58 + 507-523) refreshed: stale "demo-ubi reseed exceeds the budget" framing replaced with "runtime block cleared, flipSMOKE_TEST=trueafter §16 verification". YAML structure untouched — comments-only diff.state.mdupdated: CI-note paragraph stale sentences replaced; Solr-not-CI-ready debt entry rewritten as fully resolved (this PR ships the third sub-task).Operator decisions locked at idea-preflight:
timeout-minutesfrom 25 → 35 leaves <7 min margin against spec §14's ~28 min worst case; erodes with each future demo scenario.infra_smoke_fork_pr_secret_skipedits the samepr.ymljob; resolved by moving the exclusion toplaywright.config.ts(sibling now has zero collision risk).Local
make upsmoke retains full demo-ubi coverage (CI=unset → testIgnore CI branch doesn't fire). After this merges, operators may flipSMOKE_TEST=trueto re-enable per-PR smoke signal after the §16 manual verification below.Cross-model review trail:
§16 Manual verification (recorded for the reviewer)
Test plan
pnpm vitest run src/__tests__/playwright-config-test-ignore.test.ts— 3/3 passpnpm exec tsc --noEmit— cleanpnpm lint— clean (0 errors, 136 pre-existing warnings — none on new file)playwright test --listAC-1 + AC-2 verified (table above)git diff -U0 -- .github/workflows/pr.yml | awk '/^[+-]/ && !/^([+-]{3})/ && \$0 !~ /^[+-][[:space:]]*#/'— zero output (comments-only)python3 -c "import yaml; yaml.safe_load(open('.github/workflows/pr.yml'))"— YAML parses, all 10 jobs intactwc -c state.md— 22 KB (under 60 KB pre-commit gate)gh variable set SMOKE_TEST --body trueto re-enable per-PR smoke signal once they're satisfied with the §16 evidenceDiff scope (vs AC-7's strict "5 files")
AC-7 said exactly 5 files. The actual branch diff has 13: the 5 deliverables, plus the standard pipeline-trail artifacts (idea preflight + spec + plan + pipeline_status — every recent feature PR ships these together), plus 4 dashboard regen files emitted by the
mvp1-dashboard-regenpre-commit hook. The 5 AC-7 deliverables are still byte-identical to what the spec described:ui/playwright.config.tsui/src/__tests__/playwright-config-test-ignore.test.tsdocs/03_runbooks/smoke-solr-stability.md.github/workflows/pr.ymlstate.mdThe non-AC-7 files are all metadata or regen artifacts. None of them alter the runtime behavior of the fix.
Notes on the demo-ubi.spec.ts AC-8 citation drift
Both
pr.yml:518-523(now refreshed) andui/tests/e2e/demo-ubi.spec.ts:11carry "24 min" / "25-minute ceiling per AC-8" framing that drifts from the spec's actual 1140s/19 min hard ceiling. The pr.yml comment IS refreshed here as part of Story 1.4 (structural to the comment block being rewritten). The demo-ubi.spec.ts:11 drift is explicitly deferred per spec D-6 — the spec file is not edited at all by this PR, and touching it for an unrelated citation fix would violate AC-7's file-shape contract.🤖 Generated with Claude Code