Skip to content

security: fix prompt-guard revert-failure and .nightshift.json SSRF#140

Merged
fazxes merged 2 commits intomainfrom
fix/prompt-guard-revert-failure
Apr 6, 2026
Merged

security: fix prompt-guard revert-failure and .nightshift.json SSRF#140
fazxes merged 2 commits intomainfrom
fix/prompt-guard-revert-failure

Conversation

@fazxes
Copy link
Copy Markdown
Member

@fazxes fazxes commented Apr 6, 2026

Summary

  • Finding 1 (REAL, FIXED): check_origin_integrity returned the same exit code (1) whether the revert push succeeded or failed. daemon.sh called reset_repo_state unconditionally after the check. A failed revert pulled tampered origin/main content into the working tree, which then became the new snapshot baseline — silencing all future detections.
    • check_origin_integrity now returns 0 (clean), 1 (tampered + reverted OK), or 2 (tampered + revert FAILED)
    • daemon.sh aborts with notify_human + break on code 2 at both guard points (pentest preflight and post-builder)
  • Finding 2 (REAL, FIXED): .nightshift.json was absent from PROMPT_GUARD_FILES. Task [Nightshift] Circuit breaker tripped #148 incorrectly deferred this to "when webhook is added" — notify_human() already reads notification_webhook from this file and POSTs to it today. An agent that merges an attacker URL intercepts all daemon security alerts.

Test plan

  • test_revert_failure_returns_exit_code_2: attacker pushes to bare repo, then chmod -R a-w blocks the revert push; asserts exit code 2 and WARNING in output
  • test_nightshift_json_in_prompt_guard_files: iterates PROMPT_GUARD_FILES in bash and asserts .nightshift.json is present
  • All 7 TestPromptGuardOriginBlindSpot tests pass; 1011 total passing

Two real pentest findings confirmed and fixed:

1. check_origin_integrity now returns exit code 2 (tampered + revert
push FAILED) distinct from code 1 (tampered + reverted OK). daemon.sh
aborts with notify_human instead of calling reset_repo_state when code
2 is returned, closing the window where a failed revert let tampered
origin/main content into the working tree and poisoned all future
snapshots.

2. .nightshift.json added to PROMPT_GUARD_FILES. notify_human() already
reads notification_webhook from this file and POSTs to it on every
circuit-breaker and budget alert. Previous task #148 incorrectly
deferred this to "when webhook is added" -- the webhook is active today.

Two new tests: test_revert_failure_returns_exit_code_2 (uses chmod
-R a-w to block the revert push) and test_nightshift_json_in_prompt
_guard_files. 1011 tests passing.
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ee5ebe9892

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread tests/test_nightshift.py
# Make origin read-only so the revert push cannot write to it.
# git fetch still works (reads origin; writes to local repo_dir refs).
# git push fails (needs to write to origin_dir).
chmod -R a-w "{origin_dir}"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Make revert-failure test independent of Unix file modes

The new revert-failure test assumes chmod -R a-w will make git push fail, but that assumption breaks when tests run as root (or with elevated capabilities), where write restrictions are bypassed and the push succeeds. In that environment check_origin_integrity returns 1 instead of 2, so this test fails despite correct runtime behavior; I reproduced this with pytest -k test_revert_failure_returns_exit_code_2 in this container. Please use a push-failure mechanism that is permission-model agnostic (e.g., a rejecting hook or invalid push target) so the test is deterministic across CI/container setups.

Useful? React with 👍 / 👎.

@fazxes fazxes merged commit 4e2a61b into main Apr 6, 2026
@fazxes fazxes deleted the fix/prompt-guard-revert-failure branch April 6, 2026 13:39
fazxes added a commit that referenced this pull request Apr 6, 2026
…sions

The prompt guard (check_origin_integrity) auto-reverted PR #140's security
fixes because it saw origin/main change during the session that merged the PR.
This created a circular problem: the guard cannot distinguish `gh pr merge
--merge` (legitimate) from a direct push to origin/main (attack).

Fix: before reverting, check if snap_hash is an ancestor of current_hash AND
the path contains at least one merge commit. Both conditions together are strong
evidence of a legitimate PR merge. If true, return 0 without reverting and log
an INFO message to stdout only (not the alert file).

Also restructured the detection loop to defer alarm printing to the attack path.
Previously, "ORIGIN/MAIN MODIFIED" and "ORIGIN/MAIN CHANGED" messages were
printed in the detection loop before any analysis, causing false alarms for
legitimate PR merges.

Three security regressions restored (removed by b2e2fdf):
- .nightshift.json in PROMPT_GUARD_FILES (notify_human webhook SSRF)
- check_origin_integrity 3-way return codes (0=clean, 1=reverted, 2=failed)
- daemon.sh abort-on-revert-failure at both pentest and builder guard points

Tests: 1012 passing (+3). Two previously failing tests now pass.
New test: test_pr_merge_not_auto_reverted verifies the bootstrap fix.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant