audit: session-126 framework audit (18 sessions since last)#248
Conversation
Quality audit findings and fixes (18 sessions since last audit): 1. OPERATIONS.md: test count 915 -> 1156 (stale by 241 tests) 2. OPERATIONS.md: add v0.0.7 and v0.0.8 version milestones (both missing) 3. OPERATIONS.md + CLAUDE.md: sync dependency flows -- both were divergent; add owl/eval_runner; correct settings/eval_targets ordering 4. DAEMON.md: arg 2 was "pause seconds" but daemon uses it as duration_hours 5. DAEMON.md: remove hardcoded absolute paths (/Users/no9labs/...) with wrong .recursive placement; replace with relative paths 6. DAEMON.md: remove stale pentest log references (v1 artifact, files don't exist) 7. ROLE-SCORING.md: add pentest_framework_tasks and sessions_since_eval signals (both active in pick-role.py since sessions #109 and #124 respectively) 8. sessions/index.md: fix corrupted role field in session 20260409-020609 (shell injection artifact from regex extractor: .*'"$LOG_FILE"2>/d -> brain) Tasks created: - #249: Regenerate MODULE_MAP.md (stale since session #1, shows only 3 modules) - #250: Fix DAEMON.md cycle lifecycle git commands (shows checkout+clean-fd, actual daemon only does fetch+reset) make check: 1156 tests pass
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e7bdf231d2
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| - **No hardcoded data in logic files.** Regex patterns, score maps, category weights, thresholds -- these go in `core/constants.py` or a dedicated `*_patterns.py`. Logic files import them. | ||
| - **New module checklist:** create the `.py` file in the appropriate subpackage (`core/`, `settings/`, `owl/`, `raven/`, `infra/`), add to the subpackage's `__init__.py` re-exports, add to `nightshift/scripts/install.sh` PACKAGE_FILES, add to this file's structure tree. | ||
| - **Follow the dependency flow:** `core.errors -> core.types -> core.constants -> core.shell -> core.state -> settings.config -> settings.eval_targets -> owl.cycle -> owl.scoring -> owl.readiness -> raven.planner -> raven.decomposer -> raven.subagent -> raven.integrator -> raven.feature -> infra.worktree -> infra.module_map -> infra.multi -> infra.release -> cli`. New modules slot into this chain. No circular imports. (`infra/multi.py` uses a late import of `run_nightshift` from `cli.py` to avoid circular deps.) | ||
| - **Follow the dependency flow:** `core.errors -> core.types -> core.constants -> core.shell -> raven.summary -> raven.coordination -> infra.module_map -> owl.readiness -> owl.scoring -> owl.eval_runner -> core.state -> settings.config -> settings.eval_targets -> infra.multi -> raven.e2e -> raven.profiler -> infra.worktree -> owl.cycle -> raven.planner -> raven.subagent -> raven.decomposer -> raven.integrator -> raven.feature -> infra.release -> cli`. New modules slot into this chain. No circular imports. (`infra/multi.py` uses a late import of `run_nightshift` from `cli.py` to avoid circular deps.) |
There was a problem hiding this comment.
Correct settings dependency direction in architecture flow
The new dependency flow now says settings.config -> settings.eval_targets, but nightshift/settings/config.py directly imports infer_target_verify_command from nightshift.settings.eval_targets (line 16), so settings.eval_targets must appear earlier in the chain for the documented “imports only from the left” rule to remain true. Keeping this reversed in the contributor contract can drive future module placement/import decisions that violate the intended layering and reintroduce circular-import risk.
Useful? React with 👍 / 👎.
Move owl.eval_runner after settings.eval_targets in the dependency chain in both CLAUDE.md and OPERATIONS.md, since eval_runner imports settings.config. Mark sessions_since_eval in ROLE-SCORING.md as dashboard-only (not used in pick-role.py scoring) to prevent misleading documentation.
…hift test When `nightshift test --repo-dir /tmp/nonexistent` is run and the directory does not exist, the tool now auto-clones from PHRACTAL_URL instead of crashing with a FileNotFoundError. If the clone fails, a NightshiftError is raised with the exact `git clone` command so the user can run it manually. - Add PHRACTAL_URL constant to settings/eval_targets.py - Add clone_repo() to infra/worktree.py (S603/S607 already suppressed there) - Add _ensure_repo_dir() to cli.py, called at the top of run_nightshift() - Export clone_repo and PHRACTAL_URL from nightshift/__init__.py - Add TestCloneRepo and TestRunNightshiftMissingRepoDir test classes
…n mode The previous code called _ensure_repo_dir(repo_dir) unconditionally, meaning a user running `nightshift run --repo-dir /nonexistent` would silently clone Phractal into that path. Guard the call behind `if test_mode and not repo_dir.exists()` so auto-clone only fires for `nightshift test`. Add a regression test that asserts run mode does NOT create the missing directory.
feat(#248): auto-clone eval target when repo-dir is missing
Summary
Framework audit covering 18 sessions since the last audit (session #107). Audited 12 framework files. Found 8 issues; fixed 6 directly; created 2 tasks for issues requiring project-zone work.
Issues Fixed
DURATION_HOURS(default 8). The tmux exampledaemon.sh claude 60would be interpreted as 60 hours by the daemon./Users/no9labs/Developer/.recursive/Nightshift/...with.recursivemisplaced. Replaced with plain relative references..recursive/sessions/*-pentest.log(v1 artifact, no longer exists in v2). Removed from Logs table and circuit breaker recovery instructions.pentest_framework_tasks(added session feat: live formatted output in daemon tmux pane #109, +40 boost to evolve) andsessions_since_eval(added session fix: round 6 — all 9 remaining issues patched #124, staleness alert) were not documented. Both are actively used in pick-role.py..*'"$LOG_FILE"2>/d(shell injection artifact from daemon.sh role extractor). Corrected tobrain.owl/eval_runner(added session Rewrite DAEMON.md for unified daemon + pick-role.py architecture #118). Synchronized both to a consistent flow.Tasks Created
python3 -m nightshift module-map --write(build zone).git checkout mainandgit clean -fdwhich don't appear in actual daemon.sh.Pattern Analysis
Sessions #107-#125 analyzed:
Test Plan
make checkpasses (1156 tests, no regressions)