achieve: fix #0184 -- separate stdout/stderr in pick_session_role by fazxes · Pull Request #179 · Recusive/Nightshift

fazxes · 2026-04-07T01:51:55Z

Summary

Fixes pentest watch item #184: pick_session_role() in daemon.sh previously merged stdout and stderr from pick-role.py via 2>&1, then used tail -1 to extract SESSION_ROLE. Any unexpected stdout line (atexit handler, uncaught exception after sys.exit, future library print) would make tail -1 return the wrong line and set SESSION_ROLE to an unrecognized string. The daemon would then silently fall back to build for all subsequent cycles, losing OVERSEE, STRATEGIZE, and ACHIEVE scheduling until a human debugged it.
pick-role.py already separates concerns: role name → stdout, reasoning log → stderr. The fix routes stderr through mktemp so stdout is clean.
2 new contract tests in TestPickSessionRoleStderrSeparation.
Also: autonomy score corrected from 81 → 85 (task-gen and healer rescored 3→5 with evidence).
False-positive documented: #185 is not a real bug. _require_int() in config.py rejects a non-integer eval_frequency before it reaches the shell arithmetic.
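The failure mode described in this summary can be reproduced in isolation. The sketch below is illustrative only: fake_pick_role is a hypothetical stand-in for pick-role.py, normalize_role is a hypothetical stand-in for the daemon's fall-through to build, and the lowercase role names are assumptions. It shows how a late diagnostic line wins tail -1 once the streams are merged.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for pick-role.py: reasoning on stderr, role on stdout,
# plus one late diagnostic line that arrives after the role name.
fake_pick_role() {
  echo "reason: eval is stale" >&2
  echo "oversee"
  echo "warning: late diagnostic" >&2
}

# Hypothetical stand-in for the daemon's dispatch: unknown strings become build.
normalize_role() {
  case "$1" in
    build|oversee|strategize|achieve) echo "$1" ;;
    *) echo "build" ;;
  esac
}

# Old pattern: merge both streams, then trust whatever line comes last.
merged_role=$(fake_pick_role 2>&1 | tail -1)

# Separated pattern: keep stderr out of the captured stream.
clean_role=$(fake_pick_role 2>/dev/null | tail -1)

echo "merged -> $(normalize_role "$merged_role")"   # prints "merged -> build"
echo "clean  -> $(normalize_role "$clean_role")"    # prints "clean  -> oversee"
```

In the merged case the role is lost without any error, which matches the silent fallback the summary describes.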

Changes

scripts/daemon.sh: pick_session_role() uses mktemp instead of 2>&1
tests/test_nightshift.py: 2 new contract tests
docs/autonomy/2026-04-07.md: Autonomy report with score, false-positive analysis
docs/changelog/v0.0.8.md: Entry for this fix
docs/learnings/2026-04-07-stdout-stderr-separation-role-selection.md: Learning
docs/handoffs/LATEST.md: Handoff for next session

Test plan

make check passes: 1134 tests, all checks
python3 -m nightshift run --dry-run --agent codex OK
python3 -m nightshift run --dry-run --agent claude OK
test_pick_session_role_does_not_merge_stderr_into_stdout passes
test_pick_session_role_captures_stderr_via_temp_file passes

Previously daemon.sh:83 merged pick-role.py stdout+stderr with 2>&1 and used tail -1 to extract SESSION_ROLE. Any unexpected stdout line (atexit handler, uncaught exception after sys.exit, future library print) would cause tail -1 to return the wrong line, silently corrupting SESSION_ROLE to an unknown string that falls through to the build default -- locking the daemon into BUILD-only mode indefinitely with no error.

pick-role.py already writes reasoning to stderr and the role name to stdout. The fix honors that separation: mktemp captures stderr; stdout is captured cleanly; cat prints the reasoning log in controlled order.

Also includes autonomy score correction: 81 -> 85 (task-gen and healer rescored 3->5 with evidence from session index and healer log).

2 new contract tests in TestPickSessionRoleStderrSeparation.
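A minimal sketch of the capture pattern this commit message describes, not the actual daemon.sh code. fake_pick_role and pick_session_role_sketch are hypothetical names, and replaying the log to stderr is an assumption about where the reasoning should end up.

```shell
#!/usr/bin/env bash
set -uo pipefail

# Hypothetical stand-in for pick-role.py: reasoning on stderr, role on stdout.
fake_pick_role() {
  echo "reason: 3 cycles since last oversee" >&2
  echo "oversee"
}

pick_session_role_sketch() {
  local stderr_file role
  stderr_file=$(mktemp)                     # temp file that receives stderr
  role=$(fake_pick_role 2>"$stderr_file")   # stdout alone is captured
  cat "$stderr_file" >&2                    # replay the reasoning log afterwards
  rm -f "$stderr_file"
  printf '%s\n' "$role"
}

# Only the role name reaches the variable.
SESSION_ROLE=$(pick_session_role_sketch 2>/dev/null)

# The reasoning log is still available on stderr if a caller wants it.
reasoning=$(pick_session_role_sketch 2>&1 >/dev/null)

echo "role: $SESSION_ROLE"
echo "log:  $reasoning"
```

Because the log is printed only after the role has been captured, its position in the output no longer affects what ends up in SESSION_ROLE.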

…llow-up) Code review (PR #179) flagged two advisory items:
1. mktemp failure under set -uo pipefail would abort the daemon; add || _pick_stderr="/dev/null" fallback so stderr is silently dropped rather than killing the role-selection flow.
2. Contract tests are static-only; add task #192 to track behavioral test for the stdout isolation guarantee.
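The fallback in the first item could look roughly like the sketch below. Only the _pick_stderr name and the /dev/null fallback come from the review note. capture_with_fallback and emit are hypothetical helpers, and shadowing mktemp with a shell function is just a way to simulate the failure here.

```shell
#!/usr/bin/env bash
set -uo pipefail

# Run "$@", return its stdout, and replay its stderr afterwards when possible.
capture_with_fallback() {
  local _pick_stderr out
  # If mktemp fails, drop stderr instead of breaking the capture.
  _pick_stderr=$(mktemp 2>/dev/null) || _pick_stderr="/dev/null"
  out=$("$@" 2>"$_pick_stderr")
  if [ "$_pick_stderr" != "/dev/null" ]; then
    cat "$_pick_stderr" >&2
    rm -f "$_pick_stderr"
  fi
  printf '%s\n' "$out"
}

# Hypothetical command under test: reasoning on stderr, role on stdout.
emit() { echo "why" >&2; echo "achieve"; }

normal=$(capture_with_fallback emit 2>/dev/null)

# Simulate an unusable temp directory by shadowing mktemp with a failing function.
mktemp() { return 1; }
degraded=$(capture_with_fallback emit 2>/dev/null)

echo "normal:   $normal"     # prints "normal:   achieve"
echo "degraded: $degraded"   # prints "degraded: achieve"
```

The assignment is kept separate from the local declaration so that the exit status of mktemp, not of local, decides whether the fallback runs. A check of this shape is also one way the behavioral test tracked in the second item could exercise stdout isolation.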

…st tasks Queue cleanup after session #103 major restructuring:

Closed (8):
- #73: AGENTS.md exists (commit 38e1fe5)
- #88: duplicate of #69 (auto-changelog)
- #141: obsolete (docs/prompt/evolve.md deleted)
- #157: obsolete (docs/prompt/feedback/ deleted)
- #159: consolidated into #190
- #161: consolidated into #190
- #181: obsolete (docs/prompt/unified.md deleted)
- #184: done (fixed by PR #179)

Path updates (30+ tasks):
- docs/ -> .recursive/ or Recursive/ops/
- scripts/daemon.sh -> Recursive/engine/daemon.sh
- scripts/lib-agent.sh -> Recursive/engine/lib-agent.sh
- .nightshift.json -> .recursive.json
- nightshift/*.py -> nightshift/{core,owl,raven,infra}/*.py

Pentest tasks created (4):
- #194: budget limiter triple-failure (CONFIRMED)
- #195: python3 -c path interpolation (CONFIRMED)
- #196: .recursive.json prompt guard (THEORETICAL)
- #197: costs.json negative value validation (THEORETICAL)

…one)
Queue before: 72 pending + 9 wontfix-in-active-dir
Queue after: 65 pending + 0 wontfix (all converted to done for archiving)

Merged into primary tasks (5 closures):
- #175 -> #174: both add tests to TestAuthFailureDetection, same PR
- #163 -> #162: both are scoring module tests from PR #158 review, same PR
- #124 -> #122: both validate doc snapshot consistency, same PR scope
- #196 -> #173: both add entries to PROMPT_GUARD_FILES in lib-agent.sh
- #180 -> #179: both touch _is_valid_eval_file() in pick-role.py, same PR

Closed as obsolete (1):
- #78: references non-existent "evolve.md Step 8" and the multi-agent review panel replaced by unified review in PR #107

Closed as low-value (1):
- #230: _DELEGATION_ROLE_MAP covers all 8 current agent types; new agent types require major framework work making the map update obvious

Converted wontfix -> done for archiving (9):
- #77, #80, #107, #111, #115, #119, #127, #129, #134

All had wontfix status with rationale already documented; changed to done so daemon's archive_done_tasks() housekeeping removes them

fazxes added 2 commits April 6, 2026 21:51

fazxes merged commit df70ee4 into main Apr 7, 2026

fazxes deleted the achieve/fix-0184-pick-role-stderr-separation branch April 7, 2026 01:54

fazxes mentioned this pull request Apr 9, 2026

oversee: triage task queue — close 16 duplicates and superseded tasks #252

Merged
