Skip to content

fix/results explorer disk budget resume plan#245

Merged
joeharris76 merged 2 commits intodevelopfrom
fix/results-explorer-disk-budget-resume-plan
May 7, 2026
Merged

fix/results explorer disk budget resume plan#245
joeharris76 merged 2 commits intodevelopfrom
fix/results-explorer-disk-budget-resume-plan

Conversation

@joeharris76
Copy link
Copy Markdown
Owner

  • chore(uat): capture disk budget resume evidence
  • feat(uat): estimate disk budget and resume sweeps

@joeharris76 joeharris76 enabled auto-merge (squash) May 7, 2026 12:53
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 93fd89ae38

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread tests/uat/_cli.py
print(f"[preflight warn] {warning}", file=sys.stderr)
if result.aborted:
print(f"[preflight] ABORT: {result.abort_reason}", file=sys.stderr)
return 0
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Return nonzero when preflight reports an abort

preflight_main always returns 0 even when run_preflight sets aborted=True, so automation cannot detect a failed preflight and may continue into execute/sweep steps despite an explicit free-space abort message. This only appears when python -m tests.uat._cli preflight is used, but in that path the command currently signals success for failure conditions.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

    Follow-up sweep actioned this PR review comment.

    Disposition: fixed
    Branch: `chore/pr-review-followups-since-2026-05-05`

    Summary:
    Disposition: fixed

Evidence: Current tests/uat/_cli.py still returned success after a preflight abort, while execute_main already returned 2 for the same condition. I changed standalone preflight_main to return 2 on abort and added focused dispatch coverage in tests/uat/test_cli_dispatch.py.

Verification: uv run -- python -m pytest tests/uat/test_cli_dispatch.py -q passed, 13 tests. uv run -- ruff check tests/uat/_cli.py tests/uat/test_cli_dispatch.py passed.

    Future sweeps skip comments that already have this marker reply.

warnings: list[str] = []
aborted = False
abort_reason: str | None = None
disk_budget_summary = estimate_disk_budget_summary(disk_budget_config) if disk_budget_config is not None else None
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Guard advisory disk-budget errors in preflight

The new disk-budget summary is documented as advisory, but this direct call can raise (for example, malformed TSV headers or unparsable numeric fields) and crash preflight before PreflightResult is returned. That turns a non-gating estimate into a hard stop for all sweeps, which contradicts the intended behavior where only the free-space cutoff should abort execution.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

    Follow-up sweep actioned this PR review comment.

    Disposition: fixed
    Branch: `chore/pr-review-followups-since-2026-05-05`

    Summary:
    Disposition: fixed

Evidence: Current preflight.py still had the advisory disk-budget call on the preflight path; it now converts estimator exceptions into warnings. Added test_preflight.py coverage for a malformed TSV header proving preflight returns non-aborted with no summary.

Verification: uv run -- python -m pytest tests/uat/test_preflight.py -q -n 0 passed; uv run -- ruff check tests/uat/phases/preflight.py tests/uat/test_preflight.py passed.

    Future sweeps skip comments that already have this marker reply.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 93fd89ae38

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread tests/uat/_cli.py
print(f"[preflight warn] {warning}", file=sys.stderr)
if result.aborted:
print(f"[preflight] ABORT: {result.abort_reason}", file=sys.stderr)
return 0
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Return nonzero when preflight reports an abort

preflight_main always returns 0 even when run_preflight sets aborted=True, so automation cannot detect a failed preflight and may continue into execute/sweep steps despite an explicit free-space abort message. This only appears when python -m tests.uat._cli preflight is used, but in that path the command currently signals success for failure conditions.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

    Follow-up sweep actioned this PR review comment.

    Disposition: no-current-action
    Branch: `chore/pr-review-followups-since-2026-05-05`

    Summary:
    Disposition: no-current-action

Evidence: Current tests/uat/_cli.py already returns 2 when result.aborted is true, and tests/uat/test_cli_dispatch.py has focused regression coverage asserting the preflight subcommand returns 2 while printing the abort/warning output. Files were left unchanged.

Verification: uv run -- python -m pytest tests/uat/test_cli_dispatch.py::test_main_preflight_returns_2_when_preflight_aborts -q passed: 1 passed in 2.06s.

    Future sweeps skip comments that already have this marker reply.

warnings: list[str] = []
aborted = False
abort_reason: str | None = None
disk_budget_summary = estimate_disk_budget_summary(disk_budget_config) if disk_budget_config is not None else None
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Guard advisory disk-budget errors in preflight

The new disk-budget summary is documented as advisory, but this direct call can raise (for example, malformed TSV headers or unparsable numeric fields) and crash preflight before PreflightResult is returned. That turns a non-gating estimate into a hard stop for all sweeps, which contradicts the intended behavior where only the free-space cutoff should abort execution.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

    Follow-up sweep actioned this PR review comment.

    Disposition: no-current-action
    Branch: `chore/pr-review-followups-since-2026-05-05`

    Summary:
    Disposition: no-current-action

Evidence: Current preflight.py already guards estimate_disk_budget_summary(...) with try/except Exception and records an advisory warning. Focused coverage exists in test_preflight.py, asserting malformed disk-budget TSV input does not abort preflight. Worktree stayed unchanged.

Verification: uv run -- python -m pytest tests/uat/test_preflight.py::test_preflight_disk_budget_table_error_is_advisory -q passed, 1 test.

    Future sweeps skip comments that already have this marker reply.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 93fd89ae38

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread tests/uat/_cli.py
print(f"[preflight warn] {warning}", file=sys.stderr)
if result.aborted:
print(f"[preflight] ABORT: {result.abort_reason}", file=sys.stderr)
return 0
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Return nonzero when preflight reports an abort

preflight_main always returns 0 even when run_preflight sets aborted=True, so automation cannot detect a failed preflight and may continue into execute/sweep steps despite an explicit free-space abort message. This only appears when python -m tests.uat._cli preflight is used, but in that path the command currently signals success for failure conditions.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

    Follow-up sweep actioned this PR review comment.

    Disposition: no-current-action
    Branch: `chore/pr-review-followups-since-2026-05-05`

    Summary:
    Disposition: no-current-action

Evidence: Current tests/uat/_cli.py already returns 2 when result.aborted, and tests/uat/test_cli_dispatch.py has focused coverage asserting the preflight subcommand returns 2 and emits the abort message. Worktree remained unchanged.

Verification: uv run -- python -m pytest tests/uat/test_cli_dispatch.py::test_main_preflight_returns_2_when_preflight_aborts -q passed: 1 passed.

    Future sweeps skip comments that already have this marker reply.

warnings: list[str] = []
aborted = False
abort_reason: str | None = None
disk_budget_summary = estimate_disk_budget_summary(disk_budget_config) if disk_budget_config is not None else None
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Guard advisory disk-budget errors in preflight

The new disk-budget summary is documented as advisory, but this direct call can raise (for example, malformed TSV headers or unparsable numeric fields) and crash preflight before PreflightResult is returned. That turns a non-gating estimate into a hard stop for all sweeps, which contradicts the intended behavior where only the free-space cutoff should abort execution.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

    Follow-up sweep actioned this PR review comment.

    Disposition: no-current-action
    Branch: `chore/pr-review-followups-since-2026-05-05`

    Summary:
    Disposition: no-current-action

Evidence: Current preflight.py already guards estimate_disk_budget_summary(...) with try/except Exception and records an advisory warning. Focused regression coverage exists in test_preflight.py, using a malformed TSV header and asserting preflight does not abort. Worktree remains unchanged.

Verification: uv run -- python -m pytest tests/uat/test_preflight.py -m fast -k disk_budget_table_error_is_advisory -q passed: 1 passed in 2.25s.

    Future sweeps skip comments that already have this marker reply.

joeharris76 added a commit that referenced this pull request May 7, 2026
Address remaining items from the post-#255 review:

- preflight: derive AUTOMATED_LOCAL_PLATFORMS from docker_assets so
  the bring-up script and the preflight allowlist share a single
  source of truth. Adds a regression test that pins both sides to
  the same set.
- spec: document the matrix-summary TSV column-order bump that
  PR #247 introduced (`submit_terminal_state` between `result_path`
  and `validator_status`).
- ops doc: warn that `make uat-explorer-smoke` runs a full `npm ci`
  + `npm run build` per invocation and is not locally cached.

The disk_budget_table.tsv `peak_database_gib=0.0` clarification is
deferred until PR #245 lands; nothing to annotate on develop yet.

Co-authored-by: Joe Harris <57046+joeharris76@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
@joeharris76 joeharris76 force-pushed the fix/results-explorer-disk-budget-resume-plan branch from 93fd89a to f0e26b6 Compare May 7, 2026 23:23
@joeharris76 joeharris76 merged commit 5e16503 into develop May 7, 2026
5 checks passed
@joeharris76 joeharris76 deleted the fix/results-explorer-disk-budget-resume-plan branch May 7, 2026 23:33
joeharris76 added a commit that referenced this pull request May 8, 2026
* fix(pr-followup): PR #245 comment 3201562602 — **<sub><sub>![P2 Badge](https://img.shields.io/badge…

Disposition: fixed
Source: #245 (comment)
Path: tests/uat/_cli.py

* fix(pr-followup): PR #245 comment 3201562610 — **<sub><sub>![P1 Badge](https://img.shields.io/badge…

Disposition: fixed
Source: #245 (comment)
Path: tests/uat/phases/preflight.py

* fix(pr-followup): PR #246 comment 3202118174 — **<sub><sub>![P2 Badge](https://img.shields.io/badge…

Disposition: fixed
Source: #246 (comment)
Path: _project/audits/screenshots/results-explorer-hierarchy-usability-data-audit-20260507/screenshot-manifest.json

* fix(pr-followup): PR #255 comment 3202201883 — **<sub><sub>![P2 Badge](https://img.shields.io/badge…

Disposition: fixed
Source: #255 (comment)
Path: tests/uat/phases/explorer_smoke.py

* fix(pr-followup): PR #256 comment 3202212814 — **<sub><sub>![P2 Badge](https://img.shields.io/badge…

Disposition: fixed
Source: #256 (comment)
Path: results-explorer/src/pages/BenchmarkIndex.tsx

* fix(pr-followup): PR #266 comment 3204995419 — **<sub><sub>![P1 Badge](https://img.shields.io/badge…

Disposition: fixed
Source: #266 (comment)
Path: results-explorer/src/lib/facetDisplay.ts

* fix(pr-followup): PR #267 comment 3205001733 — **<sub><sub>![P2 Badge](https://img.shields.io/badge…

Disposition: fixed
Source: #267 (comment)
Path: _project/scripts/skill_sync_lock_audit.py

---------

Co-authored-by: Joe Harris <57046+joeharris76@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant