feat: add constrained red attack runner#24
Conversation
|
Warning Rate limit exceeded
You’ve run out of usage credits. Purchase more in the billing tab. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Organization UI Review profile: ASSERTIVE Plan: Pro Run ID: 📒 Files selected for processing (5)
📝 WalkthroughWalkthroughThis PR implements constrained attack script execution for the nullstate purple-team CLI. It introduces an ChangesConstrained Attack Script Execution with Event Logging
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Tip 💬 Introducing Slack Agent: The best way for teams to turn conversations into code.Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.
Built for teams:
One agent for your entire SDLC. Right inside Slack. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@README.md`:
- Line 50: Confirm whether the "stage" field is actually emitted to events.jsonl
by the red tool runner (the generator that writes events.jsonl alongside
generated attack.py runs) and if so, update the README text to include "stage"
in the list of recorded fields; if it is not emitted, update the PR description
to remove "stage" from the objectives and/or add a note in the README explaining
why it is omitted. While editing the README sentence, consider splitting the
long sentence into 2–3 shorter sentences and ensure the links to Security Model
and Threat Model remain intact.
In `@src/nullstate/attack_runner.py`:
- Around line 53-60: The subprocess.run call in attack_runner.py (the block that
sets completed = subprocess.run(...)) can raise subprocess.TimeoutExpired; catch
subprocess.TimeoutExpired around that call and convert it into an
AttackToolResult instead of letting it propagate: import or reference
subprocess.TimeoutExpired, set a timed_out/timeout flag in the AttackToolResult,
populate stdout and stderr using the exception's output and stderr attributes
(or empty strings if missing), include the timeout_seconds and
command/resolved_run_dir in the result metadata, and return that
AttackToolResult; ensure existing success/failure code paths (which use the
completed variable) remain unchanged for non-timeout cases.
In `@src/nullstate/attack.py`:
- Around line 15-27: Duplicate probe_target logic found in multiple script
templates (probe_target in src/nullstate/attack.py and the same function in the
azure-public-blob and aws-public-s3 templates); extract it into a shared
template helper (e.g., a template fragment named probe_target_helper) and update
each script template to include or import that helper instead of embedding the
function inline. Ensure the shared helper preserves the same signature
(probe_target(target_url: str, stage: str) -> int), same behavior around URL
validation, health_url construction, urllib.request.urlopen usage and error
handling (return codes 0/2), and update the templates to reference the helper
symbol so changes are centralized.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 415be843-8957-4d70-b2c3-415454300e1f
📒 Files selected for processing (9)
README.mddocs/case-study.mddocs/demo-script.mddocs/security-model.mdsrc/nullstate/attack.pysrc/nullstate/attack_runner.pysrc/nullstate/cli.pytests/test_attack_runner.pytests/test_cli.py
| ## Security model | ||
|
|
||
| V1 does not target real cloud environments by default. Sandboxes are explicit, run artifacts are local, and remediation happens in a copied run workspace rather than mutating the original Terraform directory. See [Security Model](docs/security-model.md) and [Threat Model](docs/threat-model.md). | ||
| V1 does not target real cloud environments by default. Sandboxes are explicit, run artifacts are local, and remediation happens in a copied run workspace rather than mutating the original Terraform directory. The red tool runner is constrained to generated `attack.py` scripts inside the run directory and records command, stdout, stderr, return code, target URL, and timestamps in `events.jsonl`. See [Security Model](docs/security-model.md) and [Threat Model](docs/threat-model.md). |
There was a problem hiding this comment.
Verify the omission of "stage" field in events.jsonl documentation.
The PR objectives state that events.jsonl records "command, stdout, stderr, return code, target URL, stage, and timestamps," but this line omits "stage" from the documented fields. Please verify whether "stage" should be included in the documentation.
Optional: Consider breaking the long sentence for readability.
The security model sentence spans 71 words with multiple clauses. While grammatically correct, splitting it into 2-3 shorter sentences could improve scannability.
✍️ Suggested readability improvement
-V1 does not target real cloud environments by default. Sandboxes are explicit, run artifacts are local, and remediation happens in a copied run workspace rather than mutating the original Terraform directory. The red tool runner is constrained to generated `attack.py` scripts inside the run directory and records command, stdout, stderr, return code, target URL, and timestamps in `events.jsonl`. See [Security Model](docs/security-model.md) and [Threat Model](docs/threat-model.md).
+V1 does not target real cloud environments by default. Sandboxes are explicit, run artifacts are local, and remediation happens in a copied run workspace rather than mutating the original Terraform directory.
+
+The red tool runner is constrained to generated `attack.py` scripts inside the run directory. It records command, stdout, stderr, return code, target URL, and timestamps in `events.jsonl`.
+
+See [Security Model](docs/security-model.md) and [Threat Model](docs/threat-model.md).🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@README.md` at line 50, Confirm whether the "stage" field is actually emitted
to events.jsonl by the red tool runner (the generator that writes events.jsonl
alongside generated attack.py runs) and if so, update the README text to include
"stage" in the list of recorded fields; if it is not emitted, update the PR
description to remove "stage" from the objectives and/or add a note in the
README explaining why it is omitted. While editing the README sentence, consider
splitting the long sentence into 2–3 shorter sentences and ensure the links to
Security Model and Threat Model remain intact.
| completed = subprocess.run( | ||
| command, | ||
| cwd=resolved_run_dir, | ||
| text=True, | ||
| capture_output=True, | ||
| check=False, | ||
| timeout=timeout_seconds, | ||
| ) |
There was a problem hiding this comment.
Handle subprocess.TimeoutExpired exception.
The subprocess.run call can raise subprocess.TimeoutExpired when the timeout is exceeded, but this exception is not caught. Callers expect an AttackToolResult regardless of timeout, so the exception should be caught and converted to a result with appropriate metadata.
🛡️ Proposed fix to handle timeout
+import subprocess
+
def run_attack_script(
script_path: Path,
*,
run_dir: Path,
target_url: str,
stage: str,
timeout_seconds: int = 30,
) -> AttackToolResult:
resolved_script = script_path.resolve()
resolved_run_dir = run_dir.resolve()
_validate_attack_script(resolved_script, resolved_run_dir)
command = [
sys.executable,
str(resolved_script),
"--target-url",
target_url,
"--stage",
stage,
]
started_at = datetime.now(UTC).isoformat()
started = time.monotonic()
- completed = subprocess.run(
- command,
- cwd=resolved_run_dir,
- text=True,
- capture_output=True,
- check=False,
- timeout=timeout_seconds,
- )
+ try:
+ completed = subprocess.run(
+ command,
+ cwd=resolved_run_dir,
+ text=True,
+ capture_output=True,
+ check=False,
+ timeout=timeout_seconds,
+ )
+ except subprocess.TimeoutExpired as exc:
+ ended_at = datetime.now(UTC).isoformat()
+ return AttackToolResult(
+ command=command,
+ target_url=target_url,
+ stage=stage,
+ returncode=-1,
+ stdout=exc.stdout or "",
+ stderr=exc.stderr or "",
+ started_at=started_at,
+ ended_at=ended_at,
+ duration_seconds=round(time.monotonic() - started, 3),
+ )
ended_at = datetime.now(UTC).isoformat()🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/nullstate/attack_runner.py` around lines 53 - 60, The subprocess.run call
in attack_runner.py (the block that sets completed = subprocess.run(...)) can
raise subprocess.TimeoutExpired; catch subprocess.TimeoutExpired around that
call and convert it into an AttackToolResult instead of letting it propagate:
import or reference subprocess.TimeoutExpired, set a timed_out/timeout flag in
the AttackToolResult, populate stdout and stderr using the exception's output
and stderr attributes (or empty strings if missing), include the timeout_seconds
and command/resolved_run_dir in the result metadata, and return that
AttackToolResult; ensure existing success/failure code paths (which use the
completed variable) remain unchanged for non-timeout cases.
| def probe_target(target_url: str, stage: str) -> int: | ||
| print(f"stage={stage} target={target_url}") | ||
| if not target_url.startswith(("http://", "https://")): | ||
| print("offline target selected; no network request performed") | ||
| return 0 | ||
| health_url = target_url.rstrip("/") + "/_localstack/health" | ||
| try: | ||
| with urllib.request.urlopen(health_url, timeout=5) as response: | ||
| print(f"health_url={health_url} status={response.status}") | ||
| return 0 if response.status < 500 else 2 | ||
| except urllib.error.URLError as error: | ||
| print(f"health_url={health_url} error={error}") | ||
| return 2 |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial | 💤 Low value
Consider extracting duplicate probe_target logic.
The probe_target function appears identically in both the azure-public-blob and aws-public-s3 script templates. While the duplication is currently manageable since these are embedded script templates written to standalone files, consider extracting shared logic into a template helper if the script library grows or diverges.
Also applies to: 47-59
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/nullstate/attack.py` around lines 15 - 27, Duplicate probe_target logic
found in multiple script templates (probe_target in src/nullstate/attack.py and
the same function in the azure-public-blob and aws-public-s3 templates); extract
it into a shared template helper (e.g., a template fragment named
probe_target_helper) and update each script template to include or import that
helper instead of embedding the function inline. Ensure the shared helper
preserves the same signature (probe_target(target_url: str, stage: str) -> int),
same behavior around URL validation, health_url construction,
urllib.request.urlopen usage and error handling (return codes 0/2), and update
the templates to reference the helper symbol so changes are centralized.
Summary
Tests
Summary by CodeRabbit
Release Notes
Documentation
New Features
Tests