Autonomous AI-Powered Red Team Simulation Agent
An autonomous red team simulation agent that works with Claude Code, OpenCode, and Codex. It transforms any workspace into a full penetration testing environment for CTF/lab targets — featuring 8 AI agents, containerized Kali tools, a streaming case collection pipeline, and 78 security reference files.
Key Features:
- Multi-CLI support — works with Claude Code, OpenCode, and Codex out of the box
- Autonomous workflow — 5-phase methodology (Recon → Collect → Test → Exploit+OSINT → Report) runs with minimal user interaction
- Orchestrator GUI — local web UI for projects, live runs, artifacts, timelines, and terminal run metadata
- Intelligence collection: `intel.md` accumulates tech stack, people, domains, and credentials from recon through exploitation; the OSINT agent enriches it with CVE, breach, DNS history, and social data
- 8 specialized agents: operator, recon-specialist, source-analyzer, vulnerability-analyst, exploit-developer, fuzzer, osint-analyst, report-writer
- Containerized tools — all pentest tools run in Docker (Kali toolbox, mitmproxy, Katana, optional Metasploit RPC for OpenCode), zero local installation
- Case collection pipeline — SQLite-backed queue with 4 producers, automatic type classification, zero-token dispatcher
- 78 reference files — OWASP Top 10:2025, API Security 2023, offensive tactics, AD/Kerberos attacks
- Resume support — interrupt and continue any engagement without losing progress
- Docker (with Docker Compose)
- At least one AI CLI tool if you are not using the Docker all-in-one runtime:
  - Claude Code
  - OpenCode (`npm install -g opencode-ai`)
  - Codex
- Local tools: `curl`, `jq`, `sqlite3` (not required for the Docker all-in-one runtime)
- Native Windows/PowerShell is not supported
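A quick preflight check can confirm that the local tools listed above are on the PATH before installing. This is a minimal sketch; the function name `missing_tools` is hypothetical and not part of the project.

```shell
# Hypothetical helper: print every tool from the argument list that is
# not available on PATH. Tool names come from the prerequisites list.
missing_tools() {
  local t
  for t in "$@"; do
    command -v "$t" >/dev/null 2>&1 || echo "$t"
  done
}

# Example call (prints nothing when everything is installed):
#   missing_tools docker curl jq sqlite3
```

An empty result means every listed tool was found; each printed line names one tool that still needs installing.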
./install.sh -h
Install
bash <(curl -fsSL https://raw.githubusercontent.com/NeoTheCapt/RedteamAgent/dev/install.sh) docker
# or:
./install.sh docker ~/redteam-docker
./install.sh --force docker ~/redteam-docker
Start
cd ~/redteam-docker
./run.sh
Run
/engage http://your-ctf-target:8080
/autoengage http://your-ctf-target:8080
Notes
- This is the cleanest runtime path: the image bundles OpenCode, Redteam Agent, and the pentest toolchain.
- `run.sh` starts from the image-baked clean template, persists engagement files in `workspace/`, and persists the full OpenCode state directory in `opencode-home/`.
- Use `./run.sh --ephemeral-opencode` if you do not want to persist OpenCode state outside the container.
- Use `./run.sh --rebuild` to force a clean image rebuild after install.
Install
bash <(curl -fsSL https://raw.githubusercontent.com/NeoTheCapt/RedteamAgent/dev/install.sh) opencode
# or:
./install.sh opencode
./install.sh opencode ~/my-project
./install.sh --dry-run opencode
Start
cd ~/redteam-agent
opencode
Run
/engage http://your-ctf-target:8080
/autoengage http://your-ctf-target:8080
Notes
- Configure your LLM provider in `.opencode/opencode.json`.
- OpenCode can optionally use the local Metasploit MCP path during `Exploit` when a finding clearly maps to a known module family, service, product/version, or CVE.
Install
bash <(curl -fsSL https://raw.githubusercontent.com/NeoTheCapt/RedteamAgent/dev/install.sh) claude
# or:
./install.sh claude
./install.sh claude ~/my-project
Start
cd ~/redteam-agent
claude
Run
/engage http://your-ctf-target:8080
/autoengage http://your-ctf-target:8080
Install
bash <(curl -fsSL https://raw.githubusercontent.com/NeoTheCapt/RedteamAgent/dev/install.sh) codex
# or:
./install.sh codex
./install.sh codex ~/my-project
Start
cd ~/redteam-agent
codex
Run
engage http://your-ctf-target:8080
autoengage http://your-ctf-target:8080
Notes
- Codex does not support slash commands the same way OpenCode and Claude Code do; use natural-language command invocation when needed.
Use the local web UI when you want to manage multiple workspaces or inspect live runs outside the CLI.
Start
./orchestrator/run.sh
# or rebuild the all-in-one image first:
./orchestrator/run.sh --rebuild
Stop
./orchestrator/stop.sh
Notes
- Default URL: `http://127.0.0.1:18000`
- `./orchestrator/run.sh` bootstraps the backend virtualenv, installs frontend dependencies if needed, and builds the frontend before starting.
- The UI exposes projects, live run status, task/phase timelines, artifacts, and terminal run metadata from the runs API.
- The backend also auto-recovers incomplete runs after supervisor loss or a backend restart, which makes the UI suitable for long-running unattended sessions.
Every runtime writes engagement artifacts to:
engagements/<timestamp-target>/
Common outputs:
- `findings.md`: vulnerability findings and supporting evidence
- `report.md`: final engagement report
- `log.md`: execution log and operator timeline
- `intel.md`: summary intelligence safe for routine review
- `intel-secrets.json`: full captured secrets and tokens
- `auth.json`: active auth material and session state
- `cases.db`: SQLite queue, classification, and work state
- `surfaces.jsonl`: high-risk surface coverage tracking
Sensitive outputs:
- Do not casually share `intel-secrets.json`, `auth.json`, or any engagement directory that still contains live credentials, tokens, or session state.
- If you need to share results, prefer `report.md`, selected excerpts from `findings.md`, and a reviewed/redacted subset of supporting files.
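The sharing guidance above can be sketched as a small helper that copies only `report.md` into a separate directory and reports which sensitive files it left behind. The function name `make_share_bundle` is hypothetical, and the sketch assumes the file names listed under "Common outputs"; it is not a project script.

```shell
# Hypothetical helper: build a shareable copy of an engagement directory.
# Only report.md is copied; intel-secrets.json and auth.json are never
# copied, only reported so the operator knows they exist.
make_share_bundle() {
  local src="$1" dest="$2" f
  mkdir -p "$dest"
  if [ -f "$src/report.md" ]; then
    cp "$src/report.md" "$dest/report.md"
  fi
  for f in intel-secrets.json auth.json; do
    if [ -f "$src/$f" ]; then
      echo "skipped sensitive file: $f"
    fi
  done
}

# Example call:
#   make_share_bundle "engagements/<timestamp-target>" /tmp/share
```

Excerpts from `findings.md` are deliberately left out of the automatic copy, since they need manual review and redaction first.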
| | `/engage` | `/autoengage` |
|---|---|---|
| Auth setup | Asks you to choose (proxy/cookie/skip) | Auto-skip, auto-register if endpoint found, auto-use discovered creds |
| Phase approval | Auto-confirm by default, first phase needs approval | Never asks. Every phase auto-proceeds. |
| Decisions | Parallel by default, can choose sequential | Always parallel. No options. |
| Errors | May stop on unexpected issues | Logs error, continues next task |
| When to use | First time on a target, want oversight | Repeat runs, overnight scans, maximum coverage |
The agent runs through 5 phases:
Phase 1: RECON ─── recon-specialist + source-analyzer (parallel)
│
Phase 2: COLLECT ─ Import endpoints → SQLite queue, start Katana crawler
│
Phase 3: TEST ──── Consume queue → vulnerability-analyst + source-analyzer
│ exploit-developer runs in parallel for HIGH/MEDIUM findings
│ (continuous loop with progress display)
Phase 4: EXPLOIT ── osint-analyst + exploit-developer (parallel)
│ osint-analyst: CVE/breach/DNS/social intel from intel.md
│ exploit-developer: chain analysis, impact assessment
│ OSINT high-value intel → 2nd round exploitation
Phase 5: REPORT ── report-writer with coverage statistics + intelligence summary
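To picture how a strictly ordered phase list supports resuming, here is a minimal sketch that returns the phase following the last completed one. The phase names come from the diagram above; the `next_phase` function and the `DONE` marker are hypothetical and do not describe the project's actual resume logic.

```shell
# Phase order taken from the diagram; everything else is illustrative.
PHASES="RECON COLLECT TEST EXPLOIT REPORT"

next_phase() {
  local last="$1" found=0 p
  # No completed phase recorded: start from the beginning.
  if [ -z "$last" ]; then
    echo "RECON"
    return 0
  fi
  for p in $PHASES; do
    if [ "$found" -eq 1 ]; then
      echo "$p"
      return 0
    fi
    [ "$p" = "$last" ] && found=1
  done
  # REPORT was the last phase, so nothing is left to run.
  [ "$found" -eq 1 ] && echo "DONE" && return 0
  # Unknown phase name.
  return 1
}
```

For example, `next_phase TEST` prints `EXPLOIT`, and an unrecognized name makes the function return a non-zero status.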
| Command | Description |
|---|---|
| `/engage <url>` | Start a new engagement (semi-autonomous) |
| `/autoengage <url>` | Fully autonomous: zero interaction, max coverage |
| `/resume` | Continue an interrupted engagement |
| `/status` | Show progress dashboard with queue stats |
| `/proxy start/stop` | Manage mitmproxy interception proxy |
| `/auth cookie/header` | Configure authentication credentials |
| `/queue` | Show case queue statistics |
| `/report` | Generate final report |
| `/stop` | Stop all background containers |
| `/confirm auto/manual` | Toggle auto/manual approval mode |
| `/config [key] [value]` | View or set runtime configuration |
| `/subdomain <domain>` | Enumerate subdomains for a domain |
| `/vuln-analyze` | Analyze scan results for vulnerabilities |
| `/osint` | Run OSINT intelligence gathering on current engagement |
| `/recon` `/scan` `/enumerate` `/exploit` `/pivot` | Manual phase overrides |
1 — Proxy login (recommended): /proxy start → login in browser
2 — Manual cookie: /auth cookie "session=abc123"
3 — Manual header: /auth header "Authorization: Bearer ..."
4 — Skip: test unauthenticated surface, configure auth later
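The cookie and header options differ only in how the credential is attached to each request. As a rough illustration, the sketch below maps each option to the matching curl flag (`-b` for a cookie, `-H` for a header). The function `auth_curl_args` is hypothetical and is not how the agent stores or applies auth material.

```shell
# Hypothetical helper: print the curl option that would carry the chosen
# auth material. -b and -H are standard curl flags.
auth_curl_args() {
  local mode="$1" value="${2:-}"
  case "$mode" in
    cookie) printf '%s %s\n' "-b" "$value" ;;
    header) printf '%s %s\n' "-H" "$value" ;;
    skip)   : ;;  # unauthenticated: no extra option
    *)      echo "unknown auth mode: $mode" >&2; return 1 ;;
  esac
}

# Equivalent manual requests:
#   curl -b "session=abc123" http://your-ctf-target:8080/
#   curl -H "Authorization: Bearer ..." http://your-ctf-target:8080/
```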
┌─────────────────────────┐
│ OPERATOR │
│ (primary — drives all) │
└──┬──┬──┬──┬──┬──┬──┬────┘
│ │ │ │ │ │ │
┌────────────────────┘ │ │ │ │ │ └──────────────────┐
▼ ▼ │ ▼ │ │ ▼
recon- source- │ vuln- │ │ report-
specialist analyzer │ analyst│ │ writer
(network) (code) │ (test) │ │ (report)
│ │ ▼ ▼ ▼
│ │ fuzzer exploit- osint-
│ │ (fuzz) developer analyst
│ │ (exploit) (OSINT)
│ │ ▲ │
│ intel.md ◄─┘ │ │
└──► intel.md └────────┘
operator feeds
OSINT intel → exploit
Producers Queue (SQLite) Consumers
┌──────────┐
│ mitmproxy │─┐ ┌──────────┐ ┌────────┐ ┌─ vuln-analyst (api/form)
│ Katana │─┼──→│ cases.db │─→│dispatch│──┼─ source-analyzer (js/css)
│ recon │─┤ └──────────┘ │ (.sh) │ ├─ fuzzer (deep params)
│ spec │─┘ dedup+state └────────┘ └─ exploit-dev (confirmed)
└──────────┘ 15 types 0 tokens ▲
▲ │
└──────────── new endpoints ──────────────┘
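The pipeline above boils down to three steps: deduplicate incoming endpoints, classify each one, and route it to a consumer. The sketch below shows that flow without any LLM calls. The type and consumer names are taken from the diagram, but the matching rules are illustrative guesses, only a few of the 15 types appear, and all three function names are hypothetical.

```shell
# Classify a URL into a case type (small illustrative subset only).
classify_case() {
  local url="$1"
  local path="${url%%\?*}"  # drop the query string before matching
  case "$path" in
    *.js)    echo "js" ;;
    *.css)   echo "css" ;;
    */api/*) echo "api" ;;
    *)       echo "unknown" ;;
  esac
}

# Map a case type to the consumer shown in the diagram.
route_case() {
  case "$1" in
    api|form) echo "vulnerability-analyst" ;;
    js|css)   echo "source-analyzer" ;;
    *)        echo "unrouted" ;;
  esac
}

# Keep the first occurrence of each input line, preserving order.
dedup_cases() {
  awk '!seen[$0]++'
}
```

Because each step is plain pattern matching, dispatching this way spends no model tokens, which is consistent with the "0 tokens" label on the dispatcher.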
RedteamOpencode/ ← dev workspace (git root)
├── install.sh ← installs agent/ to ~/redteam-agent
├── README.md ← project docs
│
└── agent/ ← ALL runtime files (what gets installed)
├── CLAUDE.md ← operator prompt (Claude Code)
├── AGENTS.md ← operator prompt (Codex)
├── .opencode/ ← OpenCode config + single source of truth
│ ├── opencode.json ← agent metadata, skills, commands, plugins
│ ├── prompts/agents/ ← 8 agent prompts (.txt) — SINGLE SOURCE
│ ├── commands/ ← 19 slash commands (.md) — SINGLE SOURCE
│ └── plugins/ ← engagement hooks (TypeScript)
├── .claude/ ← Claude Code config (agents + commands generated)
│ └── settings.json ← hooks (scope check + auto-logging)
├── .codex/ ← Codex config (agents generated)
├── scripts/
│ ├── install-time generators ← install.sh builds .claude/agents + .codex/agents + .claude/commands
│ ├── dispatcher.sh ← case queue management
│ └── ... ← ingest, hooks, shared libraries
├── skills/ ← 32 attack methodology skills
├── references/ ← 78 reference files (OWASP, tools, tactics, AD)
├── docker/ ← Dockerfiles + docker-compose.yml
└── engagements/ ← per-engagement output (created at runtime)
| Feature | Claude Code | OpenCode | Codex |
|---|---|---|---|
| Operator prompt | `CLAUDE.md` | `.opencode/prompts/agents/operator.txt` | `AGENTS.md` |
| Subagents (8) | Generated `.claude/agents/*.md` | `.opencode/prompts/agents/*.txt` (source) | Generated `.codex/agents/*.toml` |
| Slash commands (19) | Generated `.claude/commands/*.md` | `.opencode/commands/*.md` (source) | Not supported; use natural language instead |
| Skills (31) | `skills/*/SKILL.md` (read on demand) | Loaded via `instructions` array | `skills/*/SKILL.md` (read on demand) |
| Build | `install.sh claude` generates agents + commands at install time | N/A (source files) | `install.sh codex` generates agents at install time |
| Auto-logging | `.claude/settings.json` hooks | `.opencode/plugins/engagement-hooks.ts` | N/A |
| Scope enforcement | Hook blocks out-of-scope | Hook warns out-of-scope | N/A |
| Agent attribution | `agent_type` in hook JSON | `chat.message` event tracking | N/A |
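The scope enforcement row comes down to comparing a request's host with the engagement target. The sketch below assumes a single in-scope host and plain `scheme://host:port/path` URLs (IPv6 literals are not handled). `url_host` and `in_scope` are hypothetical names, and the real hooks may apply richer rules.

```shell
# Extract the host part of a URL using parameter expansion only.
url_host() {
  local rest="${1#*://}"  # strip scheme
  rest="${rest%%/*}"      # strip path
  rest="${rest##*@}"      # strip userinfo, if any
  echo "${rest%%:*}"      # strip port
}

# Succeed only when the URL's host equals the single in-scope host.
in_scope() {
  [ "$(url_host "$1")" = "$2" ]
}

# Example: block (Claude Code style) or warn (OpenCode style) on a miss.
#   in_scope "$url" "your-ctf-target" || echo "out of scope: $url" >&2
```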
Development-only wrappers
- `agent/.claude/agents/operator.md` and `agent/.codex/agents/operator.toml` exist only for working inside the source repo.
- Installed Claude/Codex workspaces keep `CLAUDE.md` or `AGENTS.md` as the operator entrypoint and install only generated subagents.
mkdir agent/skills/my-skill
# Write agent/skills/my-skill/SKILL.md with frontmatter + methodology
# Add "skills/my-skill/SKILL.md" to instructions array in agent/.opencode/opencode.json
Add files to agent/references/<category>/ and update agent/references/INDEX.md.
Edit model in agent/.opencode/opencode.json. Supports Anthropic, OpenAI, Google, Ollama.
This repo has two layers:
- Root (`RedteamOpencode/`): dev workspace with install script and README. Run your CLI here for development tasks.
- Agent (`agent/`): all runtime files that get installed to `~/redteam-agent`. Run your CLI inside `agent/` (or `~/redteam-agent/`) for engagements.
Agent prompts and commands are maintained only in OpenCode format (.opencode/). Claude Code and Codex versions are generated at install time by install.sh:
# install.sh handles building for the target product:
./install.sh claude ~/my-project # generates .claude/agents/*.md + commands at install time
./install.sh codex ~/my-project # generates .codex/agents/*.toml at install time
./install.sh opencode ~/my-project # copies .opencode/ directly (no build needed)
To modify an agent: edit agent/.opencode/prompts/agents/<name>.txt, then re-run install.sh for your product.
To add a new agent: create the .txt file, add agent entry to opencode.json, re-run install.sh.
Operator prompts use a mixed model:
- `agent/.opencode/prompts/agents/operator.txt` stays as the OpenCode source prompt
- `agent/operator-core.md` is the shared Claude/Codex methodology body
- `agent/scripts/render-operator-prompts.sh` renders `CLAUDE.md`, `AGENTS.md`, and the thin local operator wrappers
- `bash tests/agent-contracts/check-operator-prompts.sh` verifies the generated files are still in sync
| Problem | Solution |
|---|---|
| Docker images fail to build | docker system prune -af && cd agent/docker && docker compose build --no-cache |
| Docker build fails while fetching Kali packages | Re-run the build. The Dockerfiles pin Kali to the official rotator and retry automatically, but transient mirror/network failures can still require another attempt. |
| Katana doesn't start | Check: docker logs redteam-katana |
| Agent refuses to test target | Adjust auth in agent/CLAUDE.md or agent/.opencode/instructions/INSTRUCTIONS.md |
| Queue shows 0 cases | Run /status — check Collect phase was executed |
| ProviderModelNotFoundError | Set model in agent/.opencode/opencode.json |
For authorized security testing only. Only use against targets you have explicit permission to test.

