diff --git a/.recursive/tasks/0077.md b/.recursive/tasks/0077.md index cb74168..52abbc6 100644 --- a/.recursive/tasks/0077.md +++ b/.recursive/tasks/0077.md @@ -1,10 +1,10 @@ --- -status: wontfix +status: done priority: normal target: v0.0.8 vision_section: none created: 2026-04-05 -completed: 2026-04-06 +completed: 2026-04-09 --- # Fix dependency flow overlap between code-reviewer and architecture-reviewer diff --git a/.recursive/tasks/0078.md b/.recursive/tasks/0078.md index 2ad3fa0..66f4bd9 100644 --- a/.recursive/tasks/0078.md +++ b/.recursive/tasks/0078.md @@ -1,10 +1,10 @@ --- -status: pending +status: done priority: normal target: v0.0.8 vision_section: none created: 2026-04-05 -completed: +completed: 2026-04-09 --- # Add live integration test gate to PR review pipeline @@ -21,3 +21,7 @@ Source: code review advisory note on PR #63. - [ ] Step provides examples of what "real test" means for different PR types (Python module, shell script, config change) - [ ] The test is proportional to the PR -- small PRs get a simple smoke test, large PRs get more thorough verification - [ ] The builder runs the test and includes the output in the PR body or merge commit + +## Closed by OVERSEE — Obsolete + +This task references "evolve.md Step 8" which does not exist. The current `evolve/SKILL.md` has 6 steps (Step 6 is PR + Merge + Handoff). The task also references the "multi-agent review panel" from PR #63, which was replaced by the unified review architecture (PR #107). The concept of running a live integration test before merge remains valid, but this task's specific instructions target non-existent infrastructure. If the integration test gate concept is still wanted, it should be filed as a new task targeting the current `build/SKILL.md` review step with updated context. 
diff --git a/.recursive/tasks/0080.md b/.recursive/tasks/0080.md index b689090..c9650c9 100644 --- a/.recursive/tasks/0080.md +++ b/.recursive/tasks/0080.md @@ -1,10 +1,10 @@ --- -status: wontfix +status: done priority: low target: v0.0.9 vision_section: meta-prompt created: 2026-04-05 -completed: +completed: 2026-04-09 --- # Unify path-based file categorization across modules diff --git a/.recursive/tasks/0107.md b/.recursive/tasks/0107.md index 936b807..7474c51 100644 --- a/.recursive/tasks/0107.md +++ b/.recursive/tasks/0107.md @@ -1,10 +1,10 @@ --- -status: wontfix +status: done priority: normal target: v0.0.9 vision_section: self-maintaining created: 2026-04-05 -completed: +completed: 2026-04-09 --- # Activate reviewer daemon cadence and durable review indexing diff --git a/.recursive/tasks/0111.md b/.recursive/tasks/0111.md index cb7512c..eb8b117 100644 --- a/.recursive/tasks/0111.md +++ b/.recursive/tasks/0111.md @@ -1,11 +1,11 @@ --- -status: wontfix +status: done priority: low target: v0.0.8 vision_section: none created: 2026-04-05 source: pr-86-review -completed: +completed: 2026-04-09 --- # Module map should distinguish late imports from hard dependencies diff --git a/.recursive/tasks/0115.md b/.recursive/tasks/0115.md index c694e46..1a5833c 100644 --- a/.recursive/tasks/0115.md +++ b/.recursive/tasks/0115.md @@ -1,10 +1,10 @@ --- -status: wontfix +status: done priority: low target: v0.0.9 vision_section: self-maintaining created: 2026-04-05 -completed: +completed: 2026-04-09 source: pr-88-review --- diff --git a/.recursive/tasks/0119.md b/.recursive/tasks/0119.md index 3ef26ae..e21df92 100644 --- a/.recursive/tasks/0119.md +++ b/.recursive/tasks/0119.md @@ -1,5 +1,5 @@ --- -status: wontfix +status: done priority: normal target: v0.0.8 vision_section: none diff --git a/.recursive/tasks/0122.md b/.recursive/tasks/0122.md index 482cb25..30e20ae 100644 --- a/.recursive/tasks/0122.md +++ b/.recursive/tasks/0122.md @@ -19,4 +19,10 @@ README contracts. 
- [ ] A validation check fails if README advertises a bare `nightshift` command without a shipped console script - [ ] README config guidance is checked against `.recursive.json`, or the README is generated/sourced so that drift cannot happen silently - [ ] README snapshot values are generated from canonical sources or validated against them so tracker/test/module counts cannot silently rot +- [ ] `scripts/validate-docs.sh` (or equivalent in `make check`) fails when the latest handoff, tracker, and README disagree on shared snapshot values such as test counts or section percentages (merged from #0124) +- [ ] Shared snapshot values are sourced from canonical data or validated together instead of being maintained independently - [ ] The new README consistency rule runs in an existing verification path such as `make check` or `bash scripts/validate-docs.sh` + +## Note + +Task #0124 (validate tracker and handoff snapshots alongside README) was merged into this task by OVERSEE on 2026-04-09. Both cover doc snapshot validation and belong in the same PR. diff --git a/.recursive/tasks/0124.md b/.recursive/tasks/0124.md index 5ae54ed..5afb4ab 100644 --- a/.recursive/tasks/0124.md +++ b/.recursive/tasks/0124.md @@ -1,10 +1,10 @@ --- -status: pending +status: done priority: normal target: v0.0.9 vision_section: self-maintaining created: 2026-04-05 -completed: +completed: 2026-04-09 --- # Validate tracker and handoff snapshot values alongside README snapshots @@ -19,3 +19,7 @@ tracker snapshot values first-class consistency targets. 
- [ ] `scripts/validate-docs.sh` (or an equivalent check in `make check`) fails when the latest handoff, tracker, and README disagree on shared snapshot values such as test counts or section percentages - [ ] Shared snapshot values are sourced from canonical data or validated together instead of being maintained independently - [ ] Regression coverage exists for at least one stale handoff/tracker/README snapshot mismatch + +## Closed by OVERSEE — Merged into #0122 + +Task #0122 (README consistency checks) already covers the same scope: validating that docs agree on shared snapshot values. #0124 adds tracker and handoff snapshot values to the validation surface, but this is naturally included in the same validation pass. A builder implementing #0122 should extend the validation to cover tracker and handoff snapshots (as described in #0124's AC) in the same PR rather than a separate one. diff --git a/.recursive/tasks/0127.md b/.recursive/tasks/0127.md index 75fa63e..3238743 100644 --- a/.recursive/tasks/0127.md +++ b/.recursive/tasks/0127.md @@ -1,10 +1,10 @@ --- -status: wontfix +status: done priority: low target: v0.0.9 vision_section: none created: 2026-04-05 -completed: +completed: 2026-04-09 --- # Keep task-frontmatter parsing aligned across queue scripts diff --git a/.recursive/tasks/0129.md b/.recursive/tasks/0129.md index ee8e16a..f0ec12e 100644 --- a/.recursive/tasks/0129.md +++ b/.recursive/tasks/0129.md @@ -1,11 +1,11 @@ --- -status: wontfix +status: done priority: normal target: v0.0.9 vision_section: meta-prompt created: 2026-04-05 source: github-issue-102 -completed: 2026-04-06 +completed: 2026-04-09 --- # Task queue cap: stop generating tasks when 50+ pending diff --git a/.recursive/tasks/0134.md b/.recursive/tasks/0134.md index 878786c..929c20e 100644 --- a/.recursive/tasks/0134.md +++ b/.recursive/tasks/0134.md @@ -1,11 +1,11 @@ --- -status: wontfix +status: done priority: low target: v0.0.9 vision_section: self-maintaining created: 2026-04-05 
source: review-pr-106-docs -completed: +completed: 2026-04-09 --- # Keep module-map key symbols aligned with verifier-related helper changes diff --git a/.recursive/tasks/0162.md b/.recursive/tasks/0162.md index 89b9467..cf907cf 100644 --- a/.recursive/tasks/0162.md +++ b/.recursive/tasks/0162.md @@ -16,3 +16,9 @@ This means fix quality is scored against a superset of what was counted in disco ## Acceptance Criteria - [ ] `score_discovery` counts fixes from rejected cycles even when some accepted cycles also exist, OR the behavior is explicitly documented as intentional with a comment explaining the tradeoff - [ ] A regression test covers a mixed accepted+rejected run and verifies the fix counts are consistent between discovery and fix-quality scorers +- [ ] A test covers `_extract_cycle_fixes` with an accepted cycle where `fixes: []` and verifies it returns `[]` without falling through to `cycle_result` data (merged from #0163) +- [ ] Test suite remains green after additions + +## Note + +Task #0163 (empty fixes list test) was merged into this task by OVERSEE on 2026-04-09. Both tests belong in the same PR targeting the scoring module. 
diff --git a/.recursive/tasks/0163.md b/.recursive/tasks/0163.md index fa8df06..d27539d 100644 --- a/.recursive/tasks/0163.md +++ b/.recursive/tasks/0163.md @@ -1,10 +1,10 @@ --- -status: pending +status: done priority: low target: v0.0.9 vision_section: loop1 created: 2026-04-06 -completed: +completed: 2026-04-09 --- # Missing test for accepted cycle with empty fixes list in _extract_cycle_fixes @@ -14,3 +14,7 @@ Code review for PR #158 flagged a missing test case: when a cycle has `fixes: [] ## Acceptance Criteria - [ ] A test in `TestScoreFixQuality` or a dedicated `TestExtractCycleFixes` class covers an accepted cycle with `fixes: []` and verifies `_extract_cycle_fixes` returns `[]` (not falling through to any `cycle_result` data) - [ ] Test suite remains green after the addition + +## Closed by OVERSEE — Merged into #0162 + +Both #0162 and #0163 are small test additions for the scoring module (`score_discovery`/`score_fix_quality`/`_extract_cycle_fixes`). They address the same PR #158 code review and would naturally be completed in one PR. The `_extract_cycle_fixes` empty-fixes test case (#0163) should be added alongside the mixed accepted+rejected run test (#0162 AC) when a builder picks #0162. diff --git a/.recursive/tasks/0173.md b/.recursive/tasks/0173.md index b14d2dd..df5c31e 100644 --- a/.recursive/tasks/0173.md +++ b/.recursive/tasks/0173.md @@ -26,6 +26,12 @@ to existing files. - [ ] `nightshift/scripts/run.sh` and `nightshift/scripts/test.sh` are added to `PROMPT_GUARD_FILES` in `Recursive/engine/lib-agent.sh` +- [ ] `.recursive.json` is added to `PROMPT_GUARD_FILES` in `Recursive/engine/lib-agent.sh` + (merged from #0196 — defense-in-depth against eval_frequency/verify_command tampering) - [ ] `bash -n Recursive/engine/lib-agent.sh` passes - [ ] Existing prompt-guard test coverage still passes - [ ] `make check` passes + +## Note + +Task #0196 (add .recursive.json to PROMPT_GUARD_FILES) was merged into this task by OVERSEE on 2026-04-09. 
Both are single-line additions to the same array in `lib-agent.sh`. diff --git a/.recursive/tasks/0174.md b/.recursive/tasks/0174.md index e04bcc5..f54418a 100644 --- a/.recursive/tasks/0174.md +++ b/.recursive/tasks/0174.md @@ -19,5 +19,13 @@ a `text` block or a `tool_result`) does NOT trigger a false positive. - [ ] Add a regression test: a log where "not logged in" appears in a non-`result`, non-`agent_message` event body returns exit code 1 from `is_auth_failure` +- [ ] Add a regression test: a log file with a corrupt non-JSON line followed by a valid + `type:result` auth-failure event returns exit code 0 from `is_auth_failure` + (covers malformed-JSON handling from merged task #0175) - [ ] The test class `TestAuthFailureDetection` in `test_nightshift.py` is the right place (PR #168 context) + +## Note + +Task #0175 (malformed JSON test) was merged into this task by OVERSEE on 2026-04-09. +Both test cases belong in the same PR against `TestAuthFailureDetection`. diff --git a/.recursive/tasks/0175.md b/.recursive/tasks/0175.md index abfba77..20e220a 100644 --- a/.recursive/tasks/0175.md +++ b/.recursive/tasks/0175.md @@ -1,10 +1,10 @@ --- -status: pending +status: done priority: normal target: v0.0.9 vision_section: self-maintaining created: 2026-04-06 -completed: +completed: 2026-04-09 --- # Add test: is_auth_failure handles malformed JSON lines before a valid auth result @@ -18,3 +18,7 @@ to skip malformed lines. This path has no explicit regression test. `type:result` auth-failure event returns exit code 0 from `is_auth_failure` - [ ] The test class `TestAuthFailureDetection` in `test_nightshift.py` is the right place (PR #168 context) + +## Closed by OVERSEE — Merged into #0174 + +Both #0174 and #0175 add tests to `TestAuthFailureDetection` in `test_nightshift.py` and will naturally be done in the same PR. The malformed-JSON test case (#0175 AC) is a straightforward additional test case that belongs alongside the non-result-event test case (#0174 AC). 
A builder picking #0174 should include the #0175 test case in the same commit. diff --git a/.recursive/tasks/0179.md b/.recursive/tasks/0179.md index b0ea7b5..ccb9a7d 100644 --- a/.recursive/tasks/0179.md +++ b/.recursive/tasks/0179.md @@ -37,4 +37,10 @@ def test_crlf_eval_file_accepted(self) -> None: ## Acceptance Criteria - [ ] `TestIsValidEvalFile` has a CRLF test that passes +- [ ] In `read_latest_eval_score()` in `pick-role.py`, when `_is_valid_eval_file()` rejects a file, print a warning to stderr including the filename (merged from #0180) +- [ ] A rejected eval file triggers the warning; existing rejection behavior (returns None) is unchanged - [ ] `make check` passes + +## Note + +Task #0180 (log warning on `_is_valid_eval_file()` rejection) was merged into this task by OVERSEE on 2026-04-09. Both touch the same function and originated from the same PR #170 review. diff --git a/.recursive/tasks/0180.md b/.recursive/tasks/0180.md index 546ebab..1719431 100644 --- a/.recursive/tasks/0180.md +++ b/.recursive/tasks/0180.md @@ -1,11 +1,11 @@ --- -status: pending +status: done priority: low target: v0.0.8 vision_section: self-maintaining created: 2026-04-06 source: review-pr-170 -completed: +completed: 2026-04-09 --- # Log warning when _is_valid_eval_file() rejects an eval file in pick-role.py @@ -37,3 +37,7 @@ if not _is_valid_eval_file(text): - [ ] The warning appears in the ROLE DECISION stderr block visible in daemon logs - [ ] Existing `TestEvalFileValidation` tests still pass (rejection still returns None) - [ ] `make check` passes + +## Closed by OVERSEE — Merged into #0179 + +Both #0179 (CRLF regression test for `_is_valid_eval_file()`) and #0180 (log warning when `_is_valid_eval_file()` rejects) touch the same function in `pick-role.py`. They are both follow-ups from the same PR #170 review and will naturally be done in a single PR. A builder picking #0179 should add the stderr warning from #0180 in the same change. 
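Reviewer sketch for the merged #0179/#0180 criteria: one possible shape for the stderr warning and the CRLF case, under stated assumptions. The validation rule (a line starting with `score:`), the single-path signature of `read_latest_eval_score`, and the warning wording are all hypothetical; the real logic in `pick-role.py` is not shown in this diff.

```python
import sys
from pathlib import Path
from typing import Optional


def _is_valid_eval_file(text: str) -> bool:
    """Placeholder validator; the real rules in pick-role.py may differ."""
    # Assumption: a valid file contains a line starting with "score:".
    return any(line.strip().startswith("score:") for line in text.splitlines())


def read_latest_eval_score(path: Path) -> Optional[float]:
    """Return the score from an eval file, or None if the file is rejected."""
    text = path.read_text(encoding="utf-8")
    if not _is_valid_eval_file(text):
        # Rejection still returns None; the warning only names the file.
        print(f"WARNING: rejected eval file {path.name}", file=sys.stderr)
        return None
    for line in text.splitlines():  # splitlines() also handles CRLF endings
        stripped = line.strip()
        if stripped.startswith("score:"):
            return float(stripped.split(":", 1)[1])
    return None
```

A test along these lines could write one file with CRLF line endings and one invalid file, then assert that the first yields a score and the second yields `None` with the filename on stderr.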
diff --git a/.recursive/tasks/0196.md b/.recursive/tasks/0196.md index df31c0c..b612916 100644 --- a/.recursive/tasks/0196.md +++ b/.recursive/tasks/0196.md @@ -1,9 +1,10 @@ --- -status: pending +status: done priority: low target: v0.0.9 vision_section: self-maintaining created: 2026-04-07 +completed: 2026-04-09 source: pentest --- @@ -28,3 +29,7 @@ Still worth guarding for defense-in-depth. - [ ] `bash -n Recursive/engine/lib-agent.sh` passes - [ ] Existing prompt-guard tests still pass - [ ] `make check` passes + +## Closed by OVERSEE — Merged into #0173 + +Both #0173 (add scripts/run.sh and scripts/test.sh to PROMPT_GUARD_FILES) and #0196 (add .recursive.json to PROMPT_GUARD_FILES) make static additions to the `PROMPT_GUARD_FILES` array in `lib-agent.sh`. They are one-line changes in the same array and will naturally be done in a single PR. A builder picking #0173 should add `.recursive.json` at the same time. diff --git a/.recursive/tasks/0230.md b/.recursive/tasks/0230.md index b3fb983..439e725 100644 --- a/.recursive/tasks/0230.md +++ b/.recursive/tasks/0230.md @@ -1,10 +1,10 @@ --- -status: pending +status: done priority: low target: created: 2026-04-08 +completed: 2026-04-09 source: pr-230-review -completed: --- # Keep _DELEGATION_ROLE_MAP in sync with available sub-agents @@ -15,3 +15,7 @@ Meta-reviewer advisory note on PR #230: if the brain ever delegates a new sub-ag - [ ] When a new sub-agent type is added to `.recursive/agents/`, `_DELEGATION_ROLE_MAP` is documented as a required update in the new-agent checklist or OPERATIONS.md - [ ] Alternatively, add a test that verifies all agent definition files in `.recursive/agents/` have a corresponding entry in `_DELEGATION_ROLE_MAP` - [ ] `make check` passes + +## Closed by OVERSEE — Low value / speculative + +`_DELEGATION_ROLE_MAP` currently covers all 8 delegatable agent types (build, review, oversee, strategize, achieve, security-check, evolve, audit). 
The system has been stable with these 8 roles for the entire v2 era. New agent types are rare architectural events that require significant framework changes (new SKILL.md, new agent definition, new operator directory) — at that point, updating a 10-line dict in signals.py is trivial and will be obvious. The documentation overhead of tracking this in a checklist or test is not worth the maintenance cost for an event that may never occur.