Wreck-It Ralph

Autonomous web application security testing agent powered by Claude.

Wreck-It Ralph orchestrates Claude CLI with browser automation (Playwright MCP) to methodically test web applications for security vulnerabilities. It runs in iterations — each one a full Claude session that picks up where the last left off — with hook-based enforcement of scope, rate limits, and safety controls.

How It Works

flowchart TD
    A["@targets.md + SECURITY_BRIEF.md"] --> B["Wreck-It Ralph Orchestrator"]
    B --> C["Claude CLI + Playwright Browser"]

    C --> D{"Testing Phase"}
    D --> E["Reconnaissance"]
    D --> F["Auth Testing"]
    D --> G["Input Validation"]
    D --> H["Access Control"]
    D --> I["Business Logic"]
    D --> J["API Security"]

    E & F & G & H & I & J --> K["WRECK_STATUS + WRECK_FINDING + WRECK_LEARNED"]

    K --> L{"More phases?"}
    L -- Yes --> M["Next Iteration"]
    M --> C
    L -- No --> N["HTML + Markdown Reports"]

    subgraph Hooks ["Safety Hooks (enforce on every action)"]
        direction LR
        S1["Scope Enforcer"]
        S2["Rate Limiter"]
        S3["Payload Validator"]
        S4["Stop Validator"]
    end

    C -. "every tool call" .-> Hooks
    Hooks -. "block or allow" .-> C

    subgraph Memory ["Persisted Across Iterations"]
        direction LR
        M1["Learned Skills"]
        M2["Findings"]
        M3["Checkpoints"]
        M4["Scope Learning"]
    end

    K --> Memory
    Memory --> B

Features

Core Testing Loop

Phase-based testing — Reconnaissance, Authentication, Input Validation, Access Control, Business Logic, API Security
Iteration continuity — Context injected at each iteration start so Claude knows what was done, what's left, and what failed
Checkpoint recovery — Crash mid-run? Resume from the last completed iteration
Empty iteration detection — Exponential backoff when Claude gets stuck, auto-stops after prolonged stalling
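
The empty-iteration handling can be pictured as a small delay calculator. This is a minimal sketch, not the tool's actual implementation: the function name `nextDelaySeconds`, the doubling factor, the 300-second cap, and the stop threshold of 5 consecutive empty iterations are all assumptions made for illustration. Only the 5-second base delay echoes the documented `--delay` default.

```javascript
// Hypothetical sketch: exponential backoff for consecutive empty iterations.
// baseDelay mirrors the documented --delay default; maxDelay and stopAfter
// are illustrative assumptions, not values taken from the tool.
function nextDelaySeconds(emptyStreak, { baseDelay = 5, maxDelay = 300, stopAfter = 5 } = {}) {
  if (emptyStreak >= stopAfter) {
    return { stop: true, delay: 0 }; // prolonged stalling: stop the run
  }
  // Double the delay for each consecutive empty iteration, up to the cap.
  const delay = Math.min(baseDelay * 2 ** emptyStreak, maxDelay);
  return { stop: false, delay };
}

for (let streak = 0; streak <= 5; streak++) {
  console.log(streak, nextDelaySeconds(streak));
}
```

Resetting the streak to zero after any productive iteration would keep one slow phase from penalizing the rest of the run.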

Multi-Target Support

Define multiple related targets (e.g., frontend + API) in one @targets.md
Each target has its own scope, auth config, and type (WebApplication, Api, SinglePageApp, MobileBackend)
Targets can declare dependencies (DependsOn) for cross-target testing (CORS, token leakage)
Scope patterns are combined across all targets for the enforcer hooks
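
One plausible way to use `DependsOn` and the combined scope is sketched below. The in-memory target shape (`name`, `dependsOn`, `inScope`) and the helper names `orderTargets` and `combineScope` are hypothetical; the README only states that dependencies can be declared and that patterns are combined, not how the tool does either.

```javascript
// Hypothetical target objects; field names loosely mirror the @targets.md keys.
// Returns targets ordered so that every dependency precedes its dependents.
function orderTargets(targets) {
  const byName = new Map(targets.map((t) => [t.name, t]));
  const state = new Map(); // name -> "visiting" | "done"
  const ordered = [];

  function visit(target) {
    const s = state.get(target.name);
    if (s === "done") return;
    if (s === "visiting") throw new Error(`Dependency cycle at ${target.name}`);
    state.set(target.name, "visiting");
    for (const depName of target.dependsOn ?? []) {
      const dep = byName.get(depName);
      if (!dep) throw new Error(`Unknown dependency: ${depName}`);
      visit(dep);
    }
    state.set(target.name, "done");
    ordered.push(target);
  }

  targets.forEach(visit);
  return ordered;
}

// Merge every target's in-scope patterns into one de-duplicated list.
function combineScope(targets) {
  return [...new Set(targets.flatMap((t) => t.inScope ?? []))];
}

const exampleTargets = [
  { name: "API", dependsOn: ["Frontend"], inScope: ["https://api.example.com/**"] },
  { name: "Frontend", inScope: ["https://app.example.com/**"] },
];
console.log(orderTargets(exampleTargets).map((t) => t.name)); // Frontend first, then API
console.log(combineScope(exampleTargets));
```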

Safety Hooks (Enforced, Not Suggested)

Hooks are Node.js scripts that block Claude's actions until requirements are met. They are not prompt instructions — they are enforcement mechanisms.

| Hook | What It Does |
| --- | --- |
| `scope-enforcer.mjs` | Blocks navigation to out-of-scope URLs |
| `rate-limiter.mjs` | Enforces requests-per-minute limit |
| `payload-validator.mjs` | Blocks destructive payloads (DROP TABLE, rm -rf, etc.) |
| `stop-validator.mjs` | Blocks output unless WRECK_STATUS block is present and valid |
| `file-validator.mjs` | Prevents writes to wrong files |
| `session-start.mjs` | Injects iteration context, skills, and blocked ops history |
| `activity-tracker.mjs` | Logs all tool use for audit trail |
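
To make the scope check concrete, here is a minimal sketch of glob-based URL matching. It is not the contents of `scope-enforcer.mjs`: the function names, the rule that `*` stops at `/` while `**` does not, and the choice to let out-of-scope patterns override in-scope ones are all assumptions.

```javascript
// Convert a scope glob to a RegExp: "**" spans path separators, "*" does not.
function globToRegExp(glob) {
  const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, "\\$&"); // escape everything except "*"
  const pattern = escaped
    .replace(/\*\*/g, "\u0000") // placeholder so "**" survives the next step
    .replace(/\*/g, "[^/]*")
    .replace(/\u0000/g, ".*");
  return new RegExp(`^${pattern}$`);
}

// Assumed policy: a URL must match an in-scope glob and no out-of-scope glob.
function isInScope(url, { inScope, outOfScope = [] }) {
  let href;
  try {
    href = new URL(url).href; // normalizes, e.g. adds the trailing "/" to a bare origin
  } catch {
    return false; // unparseable URLs are blocked
  }
  const matches = (globs) => globs.some((g) => globToRegExp(g).test(href));
  return matches(inScope) && !matches(outOfScope);
}

const exampleScope = {
  inScope: ["https://app.example.com/**"],
  outOfScope: ["https://app.example.com/admin/**"],
};
console.log(isInScope("https://app.example.com/dashboard", exampleScope));
console.log(isInScope("https://app.example.com/admin/users", exampleScope));
```

Anchoring the pattern at both ends matters: without `^` and `$`, a host such as `app.example.com.evil.org` could slip past a naive substring check.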

Learned Skills System

Claude accumulates knowledge across iterations:

Claude-reported skills — Claude emits WRECK_LEARNED blocks when it discovers target-specific patterns (WAF behavior, auth quirks, API conventions)
Auto-generated failure skills — Repeated blocked operations automatically become skills so Claude stops retrying the same mistakes
Confidence decay — Unused skills fade over time; frequently referenced skills get boosted
Deduplication — Existing skills are shown to Claude with content previews to prevent redundant reports
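
Confidence decay could work roughly as follows. This is a sketch under stated assumptions: the skill record shape, the multiplicative decay of 0.9 per unused iteration, the additive boost of 0.1, and the 0.2 cutoff below which a skill is no longer injected are all hypothetical values, not the tool's.

```javascript
// Hypothetical skill record: { name, confidence, lastUsedIteration }.
// Call once per skill at the end of each iteration.
function updateSkillConfidence(skill, currentIteration, wasReferenced, opts = {}) {
  const { decayPerIteration = 0.9, boost = 0.1, minConfidence = 0.2 } = opts;
  let confidence;
  let lastUsedIteration = skill.lastUsedIteration;
  if (wasReferenced) {
    confidence = Math.min(1, skill.confidence + boost); // boost, capped at 1
    lastUsedIteration = currentIteration;
  } else {
    confidence = skill.confidence * decayPerIteration; // fade when unused
  }
  // "active" marks whether the skill would still be shown to Claude.
  return { ...skill, confidence, lastUsedIteration, active: confidence >= minConfidence };
}

let exampleSkill = { name: "waf-blocks-inline-scripts", confidence: 0.5, lastUsedIteration: 1 };
exampleSkill = updateSkillConfidence(exampleSkill, 2, false);
console.log(exampleSkill);
```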

Finding Management

Deduplication — Hash-based (URL + param + category + payload) and normalized title matching
Verification — Optional re-test of high-severity findings for confirmation
Evidence capture — HTTP request/response pairs and screenshots stored per finding
OWASP/CWE/WSTG mapping — Findings tagged with industry-standard identifiers
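
The two deduplication checks can be sketched as below. The FNV-1a hash is a dependency-free stand-in for whatever hash the tool actually uses, and the normalization rules (trimming fields, lowercasing the category, stripping punctuation from titles) are assumptions. Field names follow the `WRECK_FINDING` example in the Status Protocol section.

```javascript
// FNV-1a (32-bit) over UTF-16 code units; an illustrative stand-in hash.
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16).padStart(8, "0");
}

// Key built from URL + parameter + category + payload.
function findingKey(f) {
  const parts = [f.url, f.parameter, f.category?.toLowerCase(), f.payload]
    .map((p) => String(p ?? "").trim());
  return fnv1a(parts.join("\u001f")); // unit separator avoids accidental field merging
}

function normalizeTitle(title) {
  return title.toLowerCase().replace(/[^a-z0-9]+/g, " ").trim();
}

// Keep a finding only if neither its key nor its normalized title was seen before.
function dedupe(findings) {
  const seenKeys = new Set();
  const seenTitles = new Set();
  return findings.filter((f) => {
    const key = findingKey(f);
    const title = normalizeTitle(f.title);
    if (seenKeys.has(key) || seenTitles.has(title)) return false;
    seenKeys.add(key);
    seenTitles.add(title);
    return true;
  });
}
```

Title matching alone will merge findings that share a name but occur on different pages, so a real implementation may want to scope it per target or per URL.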

Scope Learning

Tracks repeatedly blocked hosts and suggests scope additions
Classifies blocked URLs by type (API endpoints, CDN, third-party services)
Saves suggestions to logs/scope-learning/scope-suggestions.md
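
A rough sketch of how blocked hosts might turn into scope suggestions follows. The threshold of three blocks, the classification heuristics, and the suggested `/**` pattern are all assumptions for illustration; the README does not describe the real rules.

```javascript
// Hypothetical heuristics for classifying a blocked URL.
function classify(url) {
  if (url.hostname.startsWith("api.") || url.pathname.startsWith("/api/")) return "API endpoint";
  if (/(^|\.)cdn\./.test(url.hostname) || url.hostname.startsWith("static.")) return "CDN";
  return "third-party service";
}

// Count blocked requests per host and suggest hosts blocked at least `threshold` times.
function suggestScopeAdditions(blockedUrls, threshold = 3) {
  const counts = new Map();
  for (const raw of blockedUrls) {
    let url;
    try {
      url = new URL(raw);
    } catch {
      continue; // ignore entries that are not valid URLs
    }
    // The type is taken from the first blocked URL seen for each host.
    const entry = counts.get(url.host) ?? { host: url.host, origin: url.origin, type: classify(url), count: 0 };
    entry.count += 1;
    counts.set(url.host, entry);
  }
  return [...counts.values()]
    .filter((e) => e.count >= threshold)
    .sort((a, b) => b.count - a.count)
    .map((e) => ({ host: e.host, type: e.type, count: e.count, pattern: `${e.origin}/**` }));
}

console.log(suggestScopeAdditions([
  "https://api.example.com/v1/users",
  "https://api.example.com/v1/orders",
  "https://api.example.com/v1/cart",
  "https://cdn.example.net/app.js",
]));
```

Suggestions like these are best treated as prompts for a human reviewer, since widening scope changes what the tool is authorized to touch.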

Reporting

HTML report — Styled, self-contained report with finding details, severity breakdown, and evidence
Markdown report — Same content in plain text for version control or further processing
Both reports are generated automatically at session end, including when the run is interrupted with Ctrl+C

Quality of Life

Interactive setup — Run with no arguments for a guided configuration wizard
System tray icon — Shows progress, current phase, finding count (Windows)
Audio notifications — Sounds for startup, iteration complete, finding discovered, errors
Toast notifications — Windows notifications for completion and errors
Headless mode — Run Playwright without a visible browser window
Temp email accounts — Auto-create test accounts via temporary email services for authenticated testing
Reconnaissance artifacts — Network captures, page snapshots, and screenshots preserved for review

Quick Start

# Build
dotnet build

# Run with no arguments for interactive setup
dotnet run --project src/WreckItRalph

# Or specify options directly
dotnet run --project src/WreckItRalph -- --targets @targets.md --brief SECURITY_BRIEF.md

# Validate configuration without running
dotnet run --project src/WreckItRalph -- --dry-run

# Publish self-contained binary (recent .NET SDKs do not imply self-contained from -r alone)
dotnet publish -c Release -r win-x64 --self-contained

CLI Options

wreck [options]

Options:
  -t, --targets <file>       Targets file (default: @targets.md)
  -b, --brief <file>         Security brief (default: SECURITY_BRIEF.md)
  -m, --max-iterations <n>   Max iterations (default: 50)
  -d, --delay <seconds>      Delay between iterations (default: 5)
  --timeout <minutes>        Timeout per iteration (default: 30)
  --rate-limit <rpm>         Requests per minute (default: 30)
  --no-verify                Skip finding verification
  --report-dir <dir>         Report output directory (default: reports)
  -c, --config <file>        Config file (default: wreck.json)
  -s, --safe-mode            Use cmd.exe without streaming output
  --model <name>             Claude model to use
  --api-key <key>            API key for the model provider
  -v, --verbose              Show detailed output
  --no-hooks                 Disable safety hooks
  --headless                 Run browser in headless mode
  --dry-run                  Validate config only
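
The `--rate-limit` value is enforced by a hook. One common way to implement a requests-per-minute cap is a sliding window, sketched here. The function `createRateLimiter` and its return shape are hypothetical, and the actual `rate-limiter.mjs` may use a different algorithm or persist its state differently.

```javascript
// Sliding-window limiter: allow at most `requestsPerMinute` calls in any 60 s span.
// `now` is injectable so the behavior can be exercised without waiting.
function createRateLimiter(requestsPerMinute = 30, now = () => Date.now()) {
  const windowMs = 60_000;
  let timestamps = [];
  return function tryAcquire() {
    const t = now();
    timestamps = timestamps.filter((ts) => t - ts < windowMs); // drop expired entries
    if (timestamps.length >= requestsPerMinute) {
      // Time until the oldest request leaves the window.
      return { allowed: false, retryAfterMs: windowMs - (t - timestamps[0]) };
    }
    timestamps.push(t);
    return { allowed: true, retryAfterMs: 0 };
  };
}

const tryAcquire = createRateLimiter(30);
console.log(tryAcquire());
```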

Configuration

@targets.md

Defines testing scope, authentication, and phases.

Single target:

# Security Testing Scope

## Target
- Name: My Application
- Base URL: https://app.example.com
- Type: WebApplication

## Authentication
- Type: FormLogin
- Login URL: /login

## In-Scope
- https://app.example.com/**

## Out-of-Scope
- https://app.example.com/admin/**

## Testing Phases
- [ ] Reconnaissance
- [ ] Authentication Testing
- [ ] Input Validation (XSS, SQLi)
- [ ] Access Control (IDOR)
- [ ] Business Logic
- [ ] API Security

Multi-target:

# Security Testing Scope

## Target
- Name: Frontend
- Base URL: https://app.example.com
- Type: SinglePageApp
- Primary: true

## Target
- Name: API
- Base URL: https://api.example.com
- Type: Api
- DependsOn: Frontend

## In-Scope
- https://app.example.com/**
- https://api.example.com/**

## Testing Phases
- [ ] Reconnaissance
- [ ] Authentication Testing
- [ ] Input Validation (XSS, SQLi)
- [ ] Access Control (IDOR)
- [ ] Cross-Origin Testing

SECURITY_BRIEF.md

Testing instructions and methodology for Claude. Describes the target application, known features, areas of concern, and any special testing requirements.

wreck.json (optional)

JSON configuration file for hook settings and other options:

{
  "hooksConfig": {
    "scopeEnforcement": true,
    "rateLimiting": true,
    "blockDestructive": true,
    "activityTracking": true,
    "contextInjection": true
  }
}
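
Because `wreck.json` is optional, something has to supply defaults when it is missing or partial. The sketch below merges a parsed file over defaults and rejects unexpected values. Treating every hook as enabled by default is an assumption (the README only shows an example with all flags set to true), and `resolveHooksConfig` is a hypothetical name.

```javascript
// Assumed defaults: every hook enabled unless wreck.json says otherwise.
const DEFAULT_HOOKS_CONFIG = {
  scopeEnforcement: true,
  rateLimiting: true,
  blockDestructive: true,
  activityTracking: true,
  contextInjection: true,
};

// rawJson is the text of wreck.json, or undefined when the file does not exist.
function resolveHooksConfig(rawJson) {
  const parsed = rawJson ? JSON.parse(rawJson) : {};
  const merged = { ...DEFAULT_HOOKS_CONFIG, ...(parsed.hooksConfig ?? {}) };
  for (const [key, value] of Object.entries(merged)) {
    if (!(key in DEFAULT_HOOKS_CONFIG)) throw new Error(`Unknown hooksConfig key: ${key}`);
    if (typeof value !== "boolean") throw new Error(`hooksConfig.${key} must be a boolean`);
  }
  return merged;
}

console.log(resolveHooksConfig('{"hooksConfig":{"activityTracking":false}}'));
```

Failing loudly on unknown keys helps catch typos such as `scopeEnforcment`, which would otherwise leave a safety hook silently in its default state.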

Status Protocol

Claude reports status at the end of each iteration:

---WRECK_STATUS---
{"phase":"RECONNAISSANCE","status":"IN_PROGRESS","newFindings":0,"highestSeverity":"NONE","endpointsTested":5,"endpointsDiscovered":10,"exitSignal":false,"recommendation":"Continue scanning"}
---END_WRECK_STATUS---

Findings are reported inline:

---WRECK_FINDING---
{"title":"Reflected XSS in Search","severity":"HIGH","category":"XSS","url":"https://target.com/search","parameter":"q","payload":"<script>alert(1)</script>","description":"User input reflected without encoding","evidence":"Response contains unescaped payload","reproduction":"Navigate to /search, enter payload","recommendation":"HTML-encode output","cwe":"CWE-79","owasp":"A03:2021","wstg":"WSTG-INPV-01","confidence":0.9}
---END_WRECK_FINDING---

Learned skills are reported when Claude discovers reusable target-specific knowledge:

---WRECK_LEARNED---
{"skillName":"waf-blocks-inline-scripts","skillDescription":"WAF blocks script tags but allows event handlers","skillContent":"Use onerror/onload event handlers instead of <script> tags for XSS testing"}
---END_WRECK_LEARNED---
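
All three block types share the same delimiter shape, so one extractor can serve them. The following is a minimal sketch of pulling the JSON out of a session transcript. The helper names and the set of fields treated as required in `isValidStatus` are assumptions, and the real parser lives in the .NET services rather than in JavaScript.

```javascript
// Extract every ---NAME--- ... ---END_NAME--- block and parse its JSON body.
function extractBlocks(output, name) {
  const pattern = new RegExp(`---${name}---\\s*([\\s\\S]*?)\\s*---END_${name}---`, "g");
  const blocks = [];
  for (const match of output.matchAll(pattern)) {
    try {
      blocks.push(JSON.parse(match[1]));
    } catch {
      // Malformed JSON is skipped in this sketch; a real parser might log it.
    }
  }
  return blocks;
}

// Assumed minimum for a usable status block.
function isValidStatus(status) {
  return (
    typeof status?.phase === "string" &&
    typeof status?.status === "string" &&
    typeof status?.exitSignal === "boolean"
  );
}

const sampleOutput = [
  "Narration from the session...",
  "---WRECK_STATUS---",
  '{"phase":"RECONNAISSANCE","status":"IN_PROGRESS","exitSignal":false}',
  "---END_WRECK_STATUS---",
].join("\n");

const [parsedStatus] = extractBlocks(sampleOutput, "WRECK_STATUS");
console.log(isValidStatus(parsedStatus), parsedStatus.phase); // true RECONNAISSANCE
```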

Runtime Files

When running, the tool creates:

wreck-hooks/ — Generated Node.js hook scripts
.claude/settings.local.json — Hook configuration for Claude CLI
logs/ — Iteration logs, context-input.json, blocked operations, learned skills
reports/ — Generated HTML and Markdown security reports
evidence/ — HTTP evidence and screenshots for findings
recon/ — Reconnaissance artifacts (network captures, snapshots)
attack-surface.md — Created by Claude during reconnaissance

Requirements

.NET 10.0 SDK
Claude CLI (claude.ai/code)
Node.js (for hook scripts and Playwright MCP server)

Important Notices

This tool is for authorized security testing only. You must have explicit written permission to test any target application. Unauthorized security testing is illegal in most jurisdictions.

Uses --dangerously-skip-permissions. Wreck-It Ralph runs Claude CLI with this flag to enable autonomous operation. This gives Claude unrestricted tool access within the session. The safety hooks provide guardrails, but they are not a security boundary — they are best-effort enforcement.

Scope enforcement is not airtight. Hooks validate URL patterns and payload regex, but edge cases exist. This tool assists authorized testing; it does not guarantee confinement.

Each iteration consumes Claude API credits. Every iteration is a full Claude session with browser automation, so a 15-iteration run means 15 such sessions. Monitor your usage.

Check Anthropic's acceptable use policy before using this tool for automated security testing via Claude CLI.

Project Structure

src/WreckItRalph/
├── Program.cs                    # CLI entry point + interactive setup
├── Config/                       # WreckOptions, HooksConfig
├── Models/                       # Target, Finding, WreckStatusBlock
├── Orchestration/                # Main testing loop
├── Services/                     # Status parsing, findings, logging, evidence
├── Hooks/
│   ├── SafetyHookManager.cs      # Hook script generation + context injection
│   ├── Scripts/                  # Embedded Node.js hook scripts
│   └── Skills/                   # Learned skills manager (CRUD, decay, usage)
├── Reporting/                    # HTML + Markdown report generation
├── Tray/                         # System tray icon + notifications
├── Setup/                        # Interactive setup wizard + templates
└── Output/                       # Console output formatting

tests/WreckItRalph.Tests/         # xUnit tests

License

MIT
