
review: harden nightshift/evaluation.py and fix 2 pentest findings #165

Merged
fazxes merged 2 commits into main from review/evaluation-py-and-pentest-fixes
Apr 6, 2026


fazxes commented Apr 6, 2026

Summary

  • evaluation.py (4 code quality fixes): move _TEMPLATE_MARKERS to EVALUATION_TEMPLATE_MARKERS in constants.py, extract /tmp/nightshift-eval to EVALUATION_CLONE_DEST, refactor fragile notes_parts[-1] = mutation in score_clean_state, remove dead try/except around rmtree(ignore_errors=True)
  • daemon.sh (pentest): add opening-tag sanitization for <prompt_alert> in ALERT_CONTENT; previously only closing tags were stripped. Mirrors the four-expression pattern on PENTEST_REPORT
  • lib-agent.sh (pentest): convert task_files_to_add from an unquoted string to a bash array; removes the SC2086 suppression and is safe with spaces in paths
  • 5 new tests; 1057 passing

Test plan

  • make check passes (1057 tests, ruff, mypy, dry-runs, shell syntax, ASCII check)

evaluation.py (4 issues):
- Move _TEMPLATE_MARKERS to EVALUATION_TEMPLATE_MARKERS in constants.py
- Extract hardcoded /tmp path to EVALUATION_CLONE_DEST constant
- Refactor score_clean_state fragile notes_parts[-1] mutation to if/elif/else
- Remove redundant try/except OSError around rmtree(ignore_errors=True)
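
The last item can be illustrated with a minimal sketch. It assumes the cleanup is a plain `shutil.rmtree` call; the helper name `remove_clone` and the scratch directory are hypothetical stand-ins, not the code in evaluation.py. With `ignore_errors=True`, errors from failed removals are swallowed inside `rmtree`, so a surrounding `try/except OSError` has nothing to catch.

```python
import shutil
import tempfile
from pathlib import Path


def remove_clone(dest: Path) -> None:
    """Delete a clone directory, ignoring removal failures (hypothetical helper)."""
    # With ignore_errors=True, rmtree swallows errors from failed removals,
    # so wrapping this call in try/except OSError adds nothing.
    shutil.rmtree(dest, ignore_errors=True)


# Hypothetical scratch location standing in for EVALUATION_CLONE_DEST.
scratch = Path(tempfile.mkdtemp()) / "nightshift-eval-example"
scratch.mkdir()
(scratch / "marker.txt").write_text("clone contents")

remove_clone(scratch)  # removes the populated directory
remove_clone(scratch)  # second call on a now-missing path does not raise
```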

daemon.sh (pentest): add opening-tag sanitization for <prompt_alert> in
ALERT_CONTENT sed block -- closing tags were stripped but opening tags
survived, creating a potential nested wrapper confusion vector.
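
To make the gap concrete, here is a rough Python translation of the two sed expressions quoted in the test thread of this PR. It is only a sketch: the function names are hypothetical, `\s` is used as an approximation of `[[:space:]]`, and the real daemon.sh block may contain further expressions.

```python
import re

# Closing tag, e.g. </prompt_alert> or < / prompt_alert >
_CLOSING = re.compile(r"<\s*/\s*prompt_alert\s*>")
# Opening tag, with or without attributes, e.g. <prompt_alert> or <prompt_alert x="1">
_OPENING = re.compile(r"<\s*prompt_alert[^>]*>")


def strip_closing_only(text: str) -> str:
    """Model of the old behavior: only closing tags are neutralized."""
    return _CLOSING.sub("[/prompt_alert]", text)


def sanitize_alert(text: str) -> str:
    """Model of the fixed behavior: closing and opening tags are neutralized."""
    return _OPENING.sub("[prompt_alert]", strip_closing_only(text))


payload = "<prompt_alert>injected</prompt_alert>"
before_fix = strip_closing_only(payload)  # opening tag survives
after_fix = sanitize_alert(payload)       # both tags become inert brackets
```

In this model the old behavior leaves a live `<prompt_alert>` opening tag in the content, which is the nested-wrapper risk described above.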

lib-agent.sh (pentest): convert task_files_to_add from unquoted
space-separated string to bash array; removes SC2086 suppress comment.
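
The hazard being removed can be modeled in Python, as an analogy only: the actual fix is in bash, and the paths below are made up. An unquoted expansion of a space-separated string is split on whitespace, so a path containing a space becomes two arguments, whereas an array (modeled here as a list) keeps each path as one element. Glob expansion, which unquoted expansion also triggers, is not modeled.

```python
# Hypothetical file paths; the first one contains a space.
files_as_string = "docs/release notes.md src/app.py"
files_as_list = ["docs/release notes.md", "src/app.py"]


def split_like_unquoted_expansion(value: str) -> list[str]:
    """Approximate shell word splitting on default whitespace separators."""
    return value.split()


# The string form yields three arguments: the first path is torn in two.
args_from_string = split_like_unquoted_expansion(files_as_string)

# The list form, like a quoted "${array[@]}" expansion, yields one argument per path.
args_from_list = list(files_as_list)
```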

5 new tests: EVALUATION_TEMPLATE_MARKERS, EVALUATION_CLONE_DEST constants,
prompt_alert opening tag pattern present, opening tag sanitized.
Comment thread tests/test_nightshift.py
Comment on lines +9708 to +9710
"echo '<prompt_alert>injected</prompt_alert>' | "
"sed -e 's|<[[:space:]]*/[[:space:]]*prompt_alert[[:space:]]*>|[/prompt_alert]|g'"
" -e 's|<[[:space:]]*prompt_alert[^>]*>|[prompt_alert]|g'",
fazxes merged commit 5141160 into main Apr 6, 2026
1 check passed
fazxes deleted the review/evaluation-py-and-pentest-fixes branch April 6, 2026 21:17