audit: flag pending/skipped/blocked/hollow tasks after execute by peteromallet · Pull Request #20 · peteromallet/megaplan

peteromallet · 2026-05-05T05:15:39Z

Summary

Closes the audit blind spot that let Sprint 3 silently ship only ~30% of its scope: `_validate_execution_evidence_code` compared files-claimed-vs-files-in-diff and nothing else, so it never noticed that 10 of 14 tasks were still `status=pending` after the executor died on quota.

After this PR, the audit emits a finding for any of:

- Tasks left at `status=pending` after execute (executor never started them)
- Tasks marked `skipped` or `blocked` with empty `executor_notes` (no reason recorded)
- Tasks marked `done` with neither `files_changed` nor `commands_run` (suspiciously hollow)
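
The three conditions above can be sketched roughly as follows. This is an illustrative stand-in, not the code from this PR: the helper name `find_incomplete_tasks`, the `id` key, the list-of-dicts task shape, and the wording of the last two findings are assumptions. Only the field names `status`, `executor_notes`, `files_changed`, and `commands_run` come from the description.

```python
from typing import Any


def find_incomplete_tasks(tasks: list[dict[str, Any]]) -> list[str]:
    """Return one finding per category of incomplete task (hypothetical helper)."""
    pending: list[str] = []
    unexplained: list[str] = []
    hollow: list[str] = []

    for task in tasks:
        task_id = str(task.get("id", "?"))  # "id" key is an assumed field name
        status = task.get("status")
        notes = (task.get("executor_notes") or "").strip()

        if status == "pending":
            # Executor never started this task.
            pending.append(task_id)
        elif status in ("skipped", "blocked") and not notes:
            # No reason recorded for skipping or blocking.
            unexplained.append(f"{task_id} ({status})")
        elif status == "done" and not task.get("files_changed") and not task.get("commands_run"):
            # Claimed done, but no evidence of any work.
            hollow.append(task_id)

    findings: list[str] = []
    if pending:
        findings.append("Tasks left pending: " + ", ".join(pending))
    if unexplained:
        findings.append("Tasks skipped/blocked without a reason: " + ", ".join(unexplained))
    if hollow:
        findings.append("Tasks done with no files_changed or commands_run: " + ", ".join(hollow))
    return findings
```

A run in which every task is `done` with at least one changed file or one command would return an empty list under this sketch, which corresponds to the clean-run negative case.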

Findings flow through the existing auto-driver retry path. If the executor genuinely had more to do, the chain re-dispatches execute. If nothing can advance, the chain stalls visibly on a known reason instead of producing an "audit clean" artifact for an obviously-incomplete run.
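
As a rough illustration of the retry-or-stall behaviour described above, the decision could look something like the sketch below. The auto-driver itself is not shown in this PR, so everything here is hypothetical: the `NextStep` type, the `decide_next_step` name, and in particular the rule that "nothing can advance" means the findings are unchanged from the previous attempt.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class NextStep:
    action: str  # "continue", "retry_execute", or "stall" (assumed action names)
    reason: str = ""


def decide_next_step(findings: list[str], previous_findings: Optional[list[str]]) -> NextStep:
    """Choose what the chain does after an audit (hypothetical sketch)."""
    if not findings:
        # Clean audit: nothing to retry.
        return NextStep("continue")

    reason = "; ".join(findings)
    if previous_findings is not None and sorted(findings) == sorted(previous_findings):
        # Assumption: identical findings after a retry mean no progress was made,
        # so stall with the reason visible rather than looping.
        return NextStep("stall", reason)

    # First failure, or the findings changed: give execute another pass.
    return NextStep("retry_execute", reason)
```

Under this sketch a stalled chain carries its findings as the stall reason, so the cause is visible in the chain state rather than hidden behind a clean audit artifact.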

Test plan

- `pytest tests/test_evaluation.py`: 70 passed (11 `validate_execution_evidence` cases, including 5 new ones)
- No regressions vs `main` (the 4 pre-existing test failures in `test_finalize`/`test_cloud_chain_status` reproduce on `main` without this PR)

🤖 Generated with Claude Code

Closes a real audit blind spot: validate_execution_evidence_code only checked files-claimed-vs-files-in-diff and rubber-stamp executor_notes, silently treating tasks left at status=pending after execute as fine. A chain that lost ~70% of its sprint scope to mid-execute quota exhaustion shipped a "clean" audit artifact this morning.

Now emit a finding for any of:

- Tasks left at `status=pending` after execute (executor never started them; the most common silent-scope-shrink mode).
- Tasks marked `skipped` or `blocked` with empty executor_notes (no reason recorded, indistinguishable from dropped-on-floor).
- Tasks marked `done` but with neither files_changed nor commands_run (suspiciously hollow; likely skipped without flagging).

Findings flow through the existing auto-driver retry path: if the executor genuinely had more to do, the chain re-dispatches execute. If nothing can advance, the chain stalls visibly on a known reason ("Tasks left pending: T7, T8, T10, T12") instead of the prior silent silver lining of an empty audit.

Tests: 5 new cases covering pending, skipped-without-reason, blocked-without-reason, hollow-done, and a clean-run negative case.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

peteromallet merged commit 9303d40 into main May 5, 2026

peteromallet deleted the megaplan-audit-incomplete-tasks branch May 5, 2026 05:15

peteromallet mentioned this pull request May 21, 2026

Cloud recovery can stay blocked on stale hollow finalize.json task evidence #42

Open

