fix(uat): address results explorer review followups#255
Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 9f03a17302
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
|
|
||
| _stage_playwright_fixture_dir( | ||
| source_dir=output_dir, | ||
| fixture_dir=playwright_fixture_dir or _default_playwright_fixture_dir(), |
There was a problem hiding this comment.
Use per-run Playwright fixture directory
run_explorer_smoke() now rewrites a single shared fixture mount (results-explorer/test-fixtures/.generated/data) when playwright_fixture_dir is not provided, and both the CLI/orchestrator call this default path. If two UAT sweeps run concurrently in the same checkout, one run can overwrite the other run’s fixture link/copy before Playwright executes, causing flaky failures or validating the wrong dataset.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Follow-up sweep actioned this PR review comment.
Disposition: fixed
Branch: `chore/pr-review-followups-since-2026-05-05`
Summary:
Disposition: fixed
Evidence: Current explorer_smoke.py still defaulted to the shared fixture path when callers omitted playwright_fixture_dir. I changed it to use log_dir/playwright-fixtures/data, pass that via E2E_FIXTURE_DIR, assign a per-run E2E_PORT, and prevent Playwright server reuse for fixture overrides. Regression coverage in test_explorer_smoke.py now checks the per-run default, env wiring, and server reuse guard.
Verification: uv run -- python -m pytest tests/uat/test_explorer_smoke.py -m fast -q passed, uv run -- ruff check tests/uat/phases/explorer_smoke.py tests/uat/test_explorer_smoke.py passed, node --check results-explorer/scripts/serve-browser-tests.mjs passed, npm --prefix results-explorer run typecheck passed, and git diff --check passed.
Future sweeps skip comments that already have this marker reply.
Address remaining items from the post-#255 review: - preflight: derive AUTOMATED_LOCAL_PLATFORMS from docker_assets so the bring-up script and the preflight allowlist share a single source of truth. Adds a regression test that pins both sides to the same set. - spec: document the matrix-summary TSV column-order bump that PR #247 introduced (`submit_terminal_state` between `result_path` and `validator_status`). - ops doc: warn that `make uat-explorer-smoke` runs a full `npm ci` + `npm run build` per invocation and is not locally cached. The disk_budget_table.tsv `peak_database_gib=0.0` clarification is deferred until PR #245 lands; nothing to annotate on develop yet. Co-authored-by: Joe Harris <57046+joeharris76@users.noreply.github.com> Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
The function was extended in #255 to flag both genuine status downgrades AND newly-derived rows missing from the checked-in matrix. Rename to `static_matrix_drift` so the name reflects the broader semantics; expand the docstring to enumerate the three drift kinds it detects. Also lock the contract between `_default_playwright_fixture_dir()` in explorer_smoke.py and `serve-browser-tests.mjs` with a unit test that asserts both ends agree on the `test-fixtures/.generated/data` mount. Surfaced as nits during the review of #255. Co-authored-by: Joe Harris <57046+joeharris76@users.noreply.github.com>
* fix(pr-followup): PR #245 comment 3201562602 — **<sub><sub> Path: tests/uat/_cli.py * fix(pr-followup): PR #245 comment 3201562610 — **<sub><sub> Path: tests/uat/phases/preflight.py * fix(pr-followup): PR #246 comment 3202118174 — **<sub><sub> Path: _project/audits/screenshots/results-explorer-hierarchy-usability-data-audit-20260507/screenshot-manifest.json * fix(pr-followup): PR #255 comment 3202201883 — **<sub><sub> Path: tests/uat/phases/explorer_smoke.py * fix(pr-followup): PR #256 comment 3202212814 — **<sub><sub> Path: results-explorer/src/pages/BenchmarkIndex.tsx * fix(pr-followup): PR #266 comment 3204995419 — **<sub><sub> Path: results-explorer/src/lib/facetDisplay.ts * fix(pr-followup): PR #267 comment 3205001733 — **<sub><sub> Path: _project/scripts/skill_sync_lock_audit.py --------- Co-authored-by: Joe Harris <57046+joeharris76@users.noreply.github.com>
No description provided.