Harden review prompt against XML tag injection#90
Conversation
Replace predictable XML tags (<pr-description>, <diff>, <spec>, etc.) with randomly generated boundary delimiters (MIME-style). Each block gets a unique token via secrets.token_hex(4), so an attacker who controls PR content cannot predict or forge a closing delimiter to break out of their data block and inject instructions. Fixes #89
Review by KaiReview: PR #90 — Harden review prompt against XML tag injectionThe core fix is sound. All seven acceptance criteria from the spec are satisfied. Suggestion — The three unconditional boundary tokens ( Suggestion — Both new tests do def test_each_block_has_unique_token(self):
...
import re
tokens = re.findall(...)
Note — The comment "statistically near-certain" is accurate. With 6 blocks × 32-bit tokens per prompt, the probability all 6 tokens happen to collide between two invocations is roughly 1 in 2^32. Acceptable for a test. If this ever becomes a CI flake concern (it won't), the fix would be to mock Clean otherwise. The preamble update correctly describes the boundary format to the model, the ordering test ( |
Review by KaiReview: PR #90 — Harden review prompt against XML tag injectionThe prior review's only actionable suggestion (moving All seven acceptance criteria from the spec are satisfied. Checking each:
One minor thing worth noting (not a blocking issue): Suggestion — tokens = re.findall(r"--- BEGIN \w+ ([0-9a-f]{8}) ---", prompt)
Clean otherwise. The |
…ersal fix Reverts the issue-body fetching from PR #88 and restores local filesystem spec loading. The key change from the original code: load_spec() now contains paths within the repo root using Path.resolve().relative_to() to prevent path traversal attacks. Why revert: piping untrusted external content (GitHub issue bodies) directly into a Claude session is a prompt injection surface. The boundary tokens from PR #90 prevent structural injection (delimiter escape) but not semantic injection (content inside the boundary influencing Claude's behavior). The security principle: don't build pipelines from external content to LLM sessions. A human reads the issue, copies relevant content to a local spec file, and references it in the PR. The human is the firewall. This also avoids establishing a pattern that future agents with more capabilities could inherit. The review agent today is toolless and non-interactive, but the pattern of 'fetch external content, pipe to LLM' is dangerous to normalize. What stays from PRs #88/#90: - Boundary tokens (PR #90) for PR descriptions and diffs (these MUST be fed to the agent since they are what is being reviewed) - _resolve_local_repo() improvement (better than old home_repo_name) - Prior comment awareness, issue triage agent, etc. (unrelated) Fixes #91
…ersal fix (#92) * Remove issue-body fetching, restore local spec loading with path traversal fix Reverts the issue-body fetching from PR #88 and restores local filesystem spec loading. The key change from the original code: load_spec() now contains paths within the repo root using Path.resolve().relative_to() to prevent path traversal attacks. Why revert: piping untrusted external content (GitHub issue bodies) directly into a Claude session is a prompt injection surface. The boundary tokens from PR #90 prevent structural injection (delimiter escape) but not semantic injection (content inside the boundary influencing Claude's behavior). The security principle: don't build pipelines from external content to LLM sessions. A human reads the issue, copies relevant content to a local spec file, and references it in the PR. The human is the firewall. This also avoids establishing a pattern that future agents with more capabilities could inherit. The review agent today is toolless and non-interactive, but the pattern of 'fetch external content, pipe to LLM' is dangerous to normalize. What stays from PRs #88/#90: - Boundary tokens (PR #90) for PR descriptions and diffs (these MUST be fed to the agent since they are what is being reviewed) - _resolve_local_repo() improvement (better than old home_repo_name) - Prior comment awareness, issue triage agent, etc. (unrelated) Fixes #91 * Fix None description crash and branch spec containment - Guard resolve_spec_from_body against None description (GitHub sends null for PRs with no body, causing AttributeError on splitlines). - Add Path.resolve().relative_to() containment check to strategy 2 (branch-based spec path) matching strategy 1. A misconfigured spec_dir pointing outside the repo would otherwise leak files.
Summary
<pr-description>,<diff>,<spec>, etc.) with randomly generated boundary delimiters inbuild_review_prompt()secrets.token_hex(4), generated fresh per invocationFixes #89
Test plan