fix: target step 0 evaluations at cloned repos #94

Merged
fazxes merged 1 commit into main from fix/step0-eval-targeting
Apr 6, 2026

fazxes commented Apr 6, 2026

Summary

  • correct the Step 0 evaluation command in the evolve prompt so that --repo-dir points at the fresh clone
  • add regression coverage for the prompt contract and evaluation helper wiring
  • record evaluation 0009 and related tracker/changelog/handoff/task updates
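
The prompt contract described above can be illustrated with a minimal sketch. It is not the regression test added in this PR: `build_eval_command` and `repo_dir_of` are hypothetical names, and the flag values are copied from the evaluation command in this PR's test plan. The idea is only that a rendered Step 0 command should carry a `--repo-dir` value equal to the clone path.

```python
import shlex
from typing import Optional


def build_eval_command(clone_dir: str, agent: str = "claude") -> str:
    """Hypothetical helper: render an evaluation command aimed at clone_dir."""
    args = [
        "python3", "-m", "nightshift", "test",
        "--agent", agent,
        "--cycles", "2",
        "--cycle-minutes", "5",
        "--repo-dir", clone_dir,
    ]
    # shlex.join quotes each argument so paths with spaces survive the shell.
    return shlex.join(args)


def repo_dir_of(command: str) -> Optional[str]:
    """Return the value passed to --repo-dir, or None if the flag is absent."""
    tokens = shlex.split(command)
    for i, tok in enumerate(tokens):
        if tok == "--repo-dir" and i + 1 < len(tokens):
            return tokens[i + 1]
        if tok.startswith("--repo-dir="):
            return tok.split("=", 1)[1]
    return None


if __name__ == "__main__":
    cmd = build_eval_command("/tmp/nightshift-eval")
    # The contract: the command must target the fresh clone, not the host repo.
    assert repo_dir_of(cmd) == "/tmp/nightshift-eval"
    print(cmd)
```

A check of this shape fails when the flag is dropped from the prompt, which is the regression this PR guards against.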

Test plan

  • make check
  • bash scripts/validate-docs.sh
  • PYTHONPATH=$(pwd) python3 -m nightshift test --agent claude --cycles 2 --cycle-minutes 5 --repo-dir /tmp/nightshift-eval

fazxes merged commit 18a5b76 into main Apr 6, 2026
2 checks passed
fazxes deleted the fix/step0-eval-targeting branch April 6, 2026 00:42
fazxes added a commit that referenced this pull request Apr 6, 2026
Fix now: #182 -- add open_pr_data tag escaping to PENTEST_REPORT and
ALERT_CONTENT sed sanitizers (confirmed real, no existing task coverage)

Watch next:
- #183 -- NIGHTSHIFT_PENTEST_AGENT env var interpolation in python3 -c (daemon.sh:373)
- #184 -- pick_session_role() stderr+stdout merge fragility via 2>&1
- #185 -- non-numeric eval_frequency crashing bash arithmetic in should_evaluate()

Also confirms prompt-alert changes (lib-agent.sh + pick-role.py) are legitimate
security fixes from session #94 -- no revert needed.