tchkiller 🍀

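An intelligent penetration-testing agent built on claude-agent-sdk. It supports multi-round decision making, Agent Teams collaboration, a vulnerability knowledge base, automatic failover, and vulnerability deduplication.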

Installation and Configuration

# Virtual environment
apt install -y pipx
pipx install --index-url https://mirrors.aliyun.com/pypi/simple/ claude-agent-sdk --include-deps

source /root/.local/share/pipx/venvs/claude-agent-sdk/bin/activate

# Remove the httpx binaries whose name conflicts with another tool
rm -rf /root/.local/share/pipx/venvs/claude-agent-sdk/bin/httpx
rm -rf /usr/local/bin/httpx

python3 -m pip install pyyaml requests aiohttp -i https://mirrors.aliyun.com/pypi/simple/
pipx ensurepath
python3 -c "import claude_agent_sdk; print('OK', claude_agent_sdk.__version__)"

# Install dependencies and delivery tools
wget -O f8x https://f8x.wgpsec.org/f8x && chmod +x f8x && bash f8x -h
bash f8x -k -ad -cloud -sliver -arsenal -docker -ruby -evil-winrm -nn
source /etc/bash.bashrc && source ~/.zshrc
bash f8x -playwright-mcp -claude-code

# Build the package locally
./deploy.sh --zip

# Manually upload tchkiller-deploy.zip to /tmp on the VPS
# After uploading, unpack it on the VPS
mkdir -p /root/tchkiller && cd /root/tchkiller && unzip -o /tmp/tchkiller-deploy.zip && rm /tmp/tchkiller-deploy.zip && cd /root/tchkiller && git init

# Verify the dependency installation (everything except CrackMapExec should be installed)
# Log in again over SSH, or start a new bash session, to refresh the environment variables
cd /root/tchkiller && source /root/.local/share/pipx/venvs/claude-agent-sdk/bin/activate && git init
python3 main.py --check
python3 main.py --check --deep
# If Chrome reports an error, install it manually: npx playwright install chromium
# Remember to configure or upload providers.json

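API Provider Configuration

Multiple providers are supported with automatic failover. When a provider's API quota is exhausted (429) or the provider is unavailable, tchkiller switches to the next one automatically.

Example providers.json: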

{
  "providers": [
    {
      "name": "minimax-1",
      "base_url": "https://api.minimaxi.com/anthropic",
      "api_key": "sk-xxx",
      "models": {
        "sonnet": "claude-sonnet-4-20250514",
        "judge": "MiniMax-M2.7-highspeed"
      }
    },
    {
      "name": "minimax-2",
      "base_url": "https://api.minimaxi.com/anthropic",
      "api_key": "sk-yyy",
      "models": {
        "sonnet": "claude-sonnet-4-20250514",
        "judge": "MiniMax-M2.7-highspeed"
      }
    }
  ]
}

Failover triggers: 429, rate_limit, insufficient_balance, authentication_error, overloaded, billing, quota.
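
The failover rule above can be sketched as a keyword check plus an ordered walk through the providers list. This is a minimal illustration, not the code in providers.py: `should_failover` and `ProviderChain` are hypothetical names, and substring matching is an assumption about how the triggers are detected.

```python
import json

# Trigger keywords copied from the failover list above.
FAILOVER_TRIGGERS = (
    "429", "rate_limit", "insufficient_balance", "authentication_error",
    "overloaded", "billing", "quota",
)


def should_failover(error_text: str) -> bool:
    """Return True when the error text contains any failover trigger."""
    lowered = error_text.lower()
    return any(trigger in lowered for trigger in FAILOVER_TRIGGERS)


class ProviderChain:
    """Hypothetical helper that walks the providers list in order."""

    def __init__(self, config: dict):
        self.providers = config["providers"]
        self.index = 0

    @property
    def current(self) -> dict:
        return self.providers[self.index]

    def advance(self) -> bool:
        """Move to the next provider; return False when none is left."""
        if self.index + 1 >= len(self.providers):
            return False
        self.index += 1
        return True


config = json.loads('{"providers": [{"name": "minimax-1"}, {"name": "minimax-2"}]}')
chain = ProviderChain(config)
if should_failover("HTTP 429: rate_limit exceeded"):
    chain.advance()
print(chain.current["name"])  # minimax-2
```

Keeping the provider order identical to the order in providers.json makes the failover chain predictable: the first entry is always tried first.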

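API Key Rotation Proxy (spreads usage evenly across keys)

When one API has several keys and each key has its own quota limit, the built-in key_proxy.py can rotate through them round-robin. The proxy is fully transparent to tchkiller and does not lose context: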

# 1. Start the proxy (run it in the background)
cd /root/tchkiller && source /root/.local/share/pipx/venvs/claude-agent-sdk/bin/activate

# Zhipu international
python3 key_proxy.py --keys-file zai_keys.txt --upstream https://api.z.ai/api/anthropic --port 8090

# Zhipu China
python3 key_proxy.py --keys-file bigmodel_keys.txt --upstream https://open.bigmodel.cn/api/anthropic --port 8091

# MiniMax China
# python3 key_proxy.py --keys-file minimax_keys.txt --upstream https://api.minimaxi.com/anthropic --port 8092 --openai

# Overwrite the active config with the production preset
cp providers.json.prod.all providers.json

# opus only #     cp providers.json.test.claude providers.json
# gpt only #      cp providers.json.test.gpt providers.json
# qwen only #     cp providers.json.test.qwen providers.json
#                cp providers.json.prod.qwen providers.json

python3 main.py --check-providers

# 2. In providers.json, point this provider's base_url at the proxy
{
  "name": "glm-proxy",
  "base_url": "http://localhost:8090",
  "api_key": "proxy-local-token",
  "models": { "sonnet": "glm-5.1", "judge": "glm-5.1" }
}

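# 3. Run tchkiller as usual; no other changes are needed

keys.txt holds one key per line; lines starting with # are comments. A key whose quota is exhausted cools down for 60 s and is then restored automatically. See docs/key-proxy.md for the full documentation.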
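
The rotation and cooldown behaviour described above can be sketched as follows. This is an illustration under assumptions, not the contents of key_proxy.py: `parse_keys` and `KeyRotator` are hypothetical names, and only the file format and the 60 s cooldown come from the text.

```python
import time


def parse_keys(text: str) -> list:
    """One key per line; blank lines and lines starting with '#' are skipped."""
    keys = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            keys.append(line)
    return keys


class KeyRotator:
    """Hypothetical round-robin selector with a per-key cooldown."""

    def __init__(self, keys, cooldown_s=60.0, clock=time.monotonic):
        self.keys = list(keys)
        self.cooldown_s = cooldown_s
        self.clock = clock          # injectable so the cooldown can be tested
        self.cooling_until = {}     # key -> time at which it becomes usable again
        self.position = 0

    def mark_exhausted(self, key):
        """Put a key on cooldown, for example after the upstream returns 429."""
        self.cooling_until[key] = self.clock() + self.cooldown_s

    def next_key(self):
        """Return the next usable key, or None when every key is cooling down."""
        now = self.clock()
        for _ in range(len(self.keys)):
            key = self.keys[self.position]
            self.position = (self.position + 1) % len(self.keys)
            if self.cooling_until.get(key, 0.0) <= now:
                return key
        return None


rotator = KeyRotator(parse_keys("# demo keys\nkey-a\nkey-b\n"))
print(rotator.next_key())  # key-a
print(rotator.next_key())  # key-b
```

Passing the clock in as a parameter keeps the sketch testable without sleeping for a full minute.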

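Usage

Decision Agent Architecture (Orchestrator mode) (used in practice)

Multi-round penetration testing with a decision and evaluation loop: a decision agent judges whether the task is complete and gives feedback that guides the next round.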

# Penetration test: standard usage (recommended)
python3 main.py http://testfire.net -m src_hunt --orchestrator --team -v

# Custom parameters
python3 main.py http://testfire.net -m src_hunt --orchestrator --team --max-rounds 3 --round-timeout 30 --total-timeout 105 --judge-model MiniMax-M2.7-highspeed -v
# Specify the API provider
python3 main.py http://testfire.net -m src_hunt --orchestrator --team --provider minimax-2 -v
# Exclude low-value vulnerabilities (real-world pentest)
python3 main.py http://target.com -m src_hunt --orchestrator --team --profile real-pentest -v
# Temporarily exclude XSS and redirects
python3 main.py http://target.com --orchestrator --team --exclude xss,open_redirect -v

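# Very complex targets that require driving a browser
python3 main.py https://k8slanparty.com/challenge/1 -m cve_cloud --orchestrator -v --prompt "This is a wiz CTF challenge. It can only be solved in the pseudo-terminal provided on its web page; all other paths are unrelated to the challenge. I suggest interacting through the Playwright browser MCP, then looking at the cloud-related skills" --browser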

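Competition Runner (competition orchestrator) (for the simulated competition platform)

A multi-target automatic orchestration system for the TCH intelligent penetration challenge. It runs on top of tchkiller and automatically schedules attacks on multiple targets, submits flags in real time, manages zone unlocking, and supports resuming after an interruption.

Quick start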

# Default mock mode (several built-in test targets)
python3 competition.py --mock

# Custom target list
echo "testfire.net" > targets.txt
echo "10.10.1.1:8080" >> targets.txt
python3 competition.py --mock --targets targets.txt

# Full challenge definitions (see challenges.example.json)
python3 competition.py --mock --challenges challenges.example.json

# Quick single-target test
python3 competition.py --mock --single testfire.net

# Dry run (does not execute tchkiller; shows the scheduling logic only)
python3 competition.py --mock --dry-run

# Show the competition status
python3 competition.py --status

# Resume after an interruption
python3 competition.py --mock --resume

Pass-through parameters

# Specify the model and provider
python3 competition.py --mock --model opus --provider openrouter

# Enable the browser
python3 competition.py --mock --browser

# Disable Agent Teams / Orchestrator
python3 competition.py --mock --no-team
python3 competition.py --mock --no-orchestrator

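Parameter reference

Parameter	Description
`--mock`	Use MockPlatform for local testing
`--targets`	Target list file (one URL/IP per line)
`--challenges`	Challenge JSON file (full challenge definitions)
`--single`	Single-target mode (quick end-to-end test)
`--resume`	Resume from the last interruption (reads comp_state.json)
`--status`	Show the current competition status
`--dry-run`	Dry run; does not actually execute tchkiller
`--state-file`	State file path (default comp_state.json)
`--model`	Main model passed through to tchkiller
`--provider`	API provider passed through to tchkiller
`--browser`	Pass-through: enable the browser MCP
`--no-team`	Disable Agent Teams mode
`--no-orchestrator`	Disable the decision agent's multi-round evaluation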

Challenge JSON format

{
  "challenges": [
    {"id": "zone1-001", "zone": 1, "name": "Web Login Bypass", "target": "10.10.1.1:8080", "points": 100, "flags_total": 1},
    {"id": "zone3-001", "zone": 3, "name": "Multi-layer Network", "target": "10.10.3.1", "points": 300, "flags_total": 3}
  ]
}
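
A challenge file in this format can be checked before a run with a few lines of Python. The sketch below is only an illustration: `load_challenges` is a hypothetical helper, and treating all six fields as required is an assumption drawn from the example above rather than from competition.py.

```python
import json

# Field names and types taken from the example above; "required" is an assumption.
REQUIRED_FIELDS = {
    "id": str, "zone": int, "name": str,
    "target": str, "points": int, "flags_total": int,
}

SAMPLE = """
{
  "challenges": [
    {"id": "zone1-001", "zone": 1, "name": "Web Login Bypass", "target": "10.10.1.1:8080", "points": 100, "flags_total": 1},
    {"id": "zone3-001", "zone": 3, "name": "Multi-layer Network", "target": "10.10.3.1", "points": 300, "flags_total": 3}
  ]
}
"""


def load_challenges(raw: str) -> list:
    """Parse the challenge JSON and fail early on missing or mistyped fields."""
    challenges = json.loads(raw)["challenges"]
    for entry in challenges:
        for field, expected_type in REQUIRED_FIELDS.items():
            if not isinstance(entry.get(field), expected_type):
                raise ValueError(
                    f"challenge {entry.get('id', '?')}: bad or missing field {field!r}"
                )
    return challenges


challenges = load_challenges(SAMPLE)
print(len(challenges), sum(c["points"] for c in challenges))  # 2 400
```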

Output

comp_state.json: persisted competition state (progress of every challenge, flags, cost)
comp_report.json: final competition report
output/<timestamp>-<target>/: tchkiller results for each target (same as normal mode)

Architecture

competition.py (deterministic Python orchestrator)
  ├── comp/platform.py   - platform interface (MockPlatform / TCHPlatform)
  ├── comp/strategy.py   - target priority + time allocation
  ├── comp/runner.py     - tchkiller subprocess execution + real-time flag monitoring
  └── comp/state.py      - persisted state (comp_state.json)

See docs/competition-runner-design.md for the detailed design.

Competition-Day Runbook

Pre-competition checks

cd /root/tchkiller && source /root/.local/share/pipx/venvs/claude-agent-sdk/bin/activate && git init

# 1. Confirm the API keys in providers.json are valid
# 3 workers, each using its own provider config (any worker whose file is missing falls back to providers.json) (consecutive failures trigger a downgrade)
cp providers.json.prod.1 providers.json # relayed opus (test) + relayed opus (tx) + qwen (coding) + glm international (mzj) + qwen (api) + glm China (neko)
cp providers.json.prod.2 providers2.json # qwen (coding) + glm international (mzj) + qwen (api) + glm China (neko)
cp providers.json.prod.3 providers3.json # qwen (coding) + glm international (mzj) + qwen (api) + glm China (neko)

python3 main.py --check
python3 main.py --check --deep
python3 main.py --check-providers --providers-config providers.json
python3 main.py --check-providers --providers-config providers2.json
python3 main.py --check-providers --providers-config providers3.json

# 2. Confirm the vulndb index is up to date
# python3 vulndb_index.py
# sqlite3 vulndb.sqlite "SELECT COUNT(*) FROM vulns"   # should be ≥ 534

# 3. Confirm the skills are synced
ls .claude/skills/ | wc -l   # should be ≥ 78

# 4. Confirm competition.py loads correctly
python3 competition.py --help

Starting the competition

# First list the running claude processes
ps aux | grep -E 'claude' | grep -v grep
# Kill all claude processes in one go
ps aux | grep -E '[c]laude' | awk '{print $2}' | xargs kill -9
# Kill main.py first, then claude
kill -9 $(ps aux | grep -E '[m]ain\.py|[c]laude|[m]cp_server|[n]map|[n]aabu|[n]uclei' | awk '{print $2}')

# Official competition: 3 concurrent workers + watchdog keep-alive
./watchdog.sh --server <SERVER_HOST> --token <AGENT_TOKEN> --browser --workers 3
# nohup ./watchdog.sh --server <SERVER_HOST> --token <AGENT_TOKEN> --workers 3 &

# Or start manually (no watchdog; suitable for debugging)
python3 competition.py --server <SERVER_HOST> --token <AGENT_TOKEN> --browser --workers 3
# python3 competition.py --server <SERVER_HOST> --token <AGENT_TOKEN> --workers 3

Automatic start under network isolation (pre-competition setup)

# 1. Edit COMP_SERVER / COMP_TOKEN / COMP_WORKERS in autostart.sh
vim autostart.sh

# 2. Add scheduled jobs (crontab)
# Edit the crontab directly
crontab -e

55 8 13-17 4 * /root/tchkiller/autostart.sh
*/10 9-18 13-17 4 * /root/tchkiller/autostart.sh
59 18 13-17 4 * /root/tchkiller/stop_watchdog.sh

# After saving, verify:
crontab -l

# 3. Run one manual test
# ./autostart.sh

# Watch the live output
tmux attach -t tch-comp
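
The three crontab entries above fire autostart.sh at 08:55, fire it again every 10 minutes from 09:00 to 18:50, and fire stop_watchdog.sh at 18:59, in each case only on April 13 to 17. The sketch below checks a timestamp against such an entry. It is a hypothetical helper that handles only the field forms used here (`*`, `N`, `A-B`, `*/S`), not full cron syntax, and the year 2026 is an assumption made for the demo.

```python
from datetime import datetime


def _field_matches(spec: str, value: int) -> bool:
    """Match one cron field: '*', 'N', 'A-B', or '*/S' (a small subset of cron)."""
    if spec == "*":
        return True
    if spec.startswith("*/"):
        return value % int(spec[2:]) == 0
    if "-" in spec:
        low, high = spec.split("-")
        return int(low) <= value <= int(high)
    return value == int(spec)


def cron_matches(expr: str, moment: datetime) -> bool:
    """Return True when the five-field schedule would fire at this minute."""
    minute, hour, day, month, weekday = expr.split()
    cron_weekday = (moment.weekday() + 1) % 7  # cron: 0 = Sunday; Python: 0 = Monday
    return (
        _field_matches(minute, moment.minute)
        and _field_matches(hour, moment.hour)
        and _field_matches(day, moment.day)
        and _field_matches(month, moment.month)
        and _field_matches(weekday, cron_weekday)
    )


print(cron_matches("55 8 13-17 4 *", datetime(2026, 4, 13, 8, 55)))        # True
print(cron_matches("*/10 9-18 13-17 4 *", datetime(2026, 4, 15, 9, 5)))    # False
```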

Live monitoring

# Show competition progress (score / challenges / rounds)
python3 competition.py --status

# Follow the agent output in real time
tail -f ~/tchkiller/output/*/log.txt

# Inspect the current state file
cat comp_state.json | python3 -m json.tool

Troubleshooting

# ⚠️ Slots stuck (start_challenge returns 400 "最多同时运行3个实例", meaning at most 3 instances may run at once)
python3 competition.py --server <SERVER_HOST> --token <AGENT_TOKEN> --stop-all
# Then resume
python3 competition.py --server <SERVER_HOST> --token <AGENT_TOKEN> --resume --workers 3

# ⚠️ Agent stuck on one challenge (timed out but was not terminated automatically)
# Find the PID
ps aux | grep tchkiller
# Manually terminate the stuck tchkiller process (use the specific PID)
kill <PID>
# competition.py automatically moves on to the next challenge

# ⚠️ Severe API rate limiting (frequent 429 responses)
# Reduce the number of workers
python3 competition.py --server <SERVER_HOST> --token <AGENT_TOKEN> --resume --workers 1

# ⚠️ A complete restart is needed (clears all progress)
python3 competition.py --server <SERVER_HOST> --token <AGENT_TOKEN> --stop-all
python3 competition.py --server <SERVER_HOST> --token <AGENT_TOKEN> --reset --workers 3

After the competition

# View the final results
python3 competition.py --status

# Analyze time and token consumption per challenge
python3 benchmark/analyze.py output/<run_dir>

# Compare multiple runs
python3 benchmark/compare.py output/2026*

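Key files at a glance

File	Purpose
`comp_state.json`	Persisted competition state (progress, flags, cost)
`providers.json`	API key configuration (failover chain)
`key_proxy.py`	API key rotation proxy (spreads usage evenly across keys)
`keys.txt`	Key list for the rotation proxy (not committed to git)
`watchdog.sh`	Process keep-alive script
`output/<timestamp>/`	Agent execution logs and evidence for each challenge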

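All command-line options

Parameter	Default	Description
`target`	(required)	Target address
`-m, --mode`	`auto`	Zone mode: `src_hunt` / `cve_cloud` / `network` / `domain` / `auto`
`--model`	`sonnet`	Main model (sonnet / opus / haiku)
`--max-turns`	`150`	Maximum conversation turns (in orchestrator mode the per-round default is 80)
`--thinking-tokens`	`10000`	Thinking token budget
`--team`	off	Enable Agent Teams (three sub-agents: recon + analyze + exploit)
`--prompt`	-	Additional prompt text (appended after the prompt)
`--scope`	-	Additional authorized scope (domains/IPs)
`-v`	off	Verbose output
Orchestrator parameters
`--orchestrator`	off	Enable the decision agent's multi-round evaluation
`--max-rounds`	`3`	Maximum number of penetration rounds
`--round-timeout`	`30`	Per-round timeout (minutes)
`--total-timeout`	derived automatically	Total timeout (minutes); defaults to `max_rounds × (round_timeout + 5)`
`--judge-model`	`MiniMax-M2.7-highspeed`	Decision agent model
Provider parameters
`--provider`	first one, chosen automatically	API provider name (see `providers.json`)
`--providers-config`	`providers.json`	Path to the providers config file
Vulnerability exclusion parameters
`--profile`	-	Load a vulnerability exclusion profile (e.g. `real-pentest`, see `profiles/*.yaml`)
`--exclude`	-	Vulnerability types to exclude, comma-separated (e.g. `xss,open_redirect,tls_ssl`)
Browser parameters
`--browser`	off	Enable the Playwright browser MCP (requires `@playwright/mcp`)
Environment check
`--check`	-	Check that all runtime dependencies are installed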
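
The default for `--total-timeout` follows directly from the formula in the table. With the defaults of 3 rounds and 30 minutes per round it gives 105 minutes, which matches the `--total-timeout 105` used in the custom-parameters example. The function name below is hypothetical.

```python
def default_total_timeout(max_rounds: int = 3, round_timeout: int = 30) -> int:
    """Default total timeout in minutes: max_rounds × (round_timeout + 5)."""
    return max_rounds * (round_timeout + 5)


print(default_total_timeout())       # 105
print(default_total_timeout(5, 20))  # 125
```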

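Zone modes

Each zone has its own prompt guidance (prompt/modes/*.md) that steers the agent toward the matching attack strategy. The zone strategy is injected into the user prompt.

Mode	Zone name	Description	Typical targets
`src_hunt`	识器·明理	SRC vulnerability hunting: SQLi, XSS, SSRF, LFI, IDOR, and other web vulnerabilities	Web applications
`cve_cloud`	洞见·虚实	Known CVEs + cloud security configuration + AI infrastructure vulnerabilities	Cloud services, K8s
`network`	执刃·循迹	Multi-layer network penetration + persistence + lateral movement	OA systems, internal networks
`domain`	铸剑·止戈	AD domain attacks: Kerberos, NTLM, GPO, DCSync	Windows domain environments
`auto`	-	Automatic mode; adapts the strategy to the target's characteristics	Any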

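Vulnerability exclusion preferences

Use --profile or --exclude to skip vulnerability types that do not need testing. Exclusion only affects prompt guidance (the agent is steered to skip them); if the agent still finds a vulnerability of an excluded type, it is recorded as usual.

# Use a preset profile
python3 main.py http://target.com --profile real-pentest --orchestrator --team

# Temporary exclusion
python3 main.py http://target.com --exclude xss,open_redirect,tls_ssl

# Profile plus extra exclusions (combined)
python3 main.py http://target.com --profile real-pentest --exclude weak_cred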

Preset profiles

Profile	Description	Excluded items
`real-pentest`	Real-world pentest; excludes low-value items	xss, csrf, open_redirect, tls_ssl, missing_header, cookie_security, session_mgmt, cors
`full-scan`	Full scan (for range testing)	none

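Excludable vulnerability types

Type key	Shortcut alias	Description
`sqli`		SQL injection
`xss_reflected`	`xss` (expands to all three)	Reflected XSS
`xss_stored`	`xss`	Stored XSS
`xss_dom`	`xss`	DOM XSS
`idor`		Broken access control (IDOR)
`csrf`		CSRF
`ssrf`		SSRF
`path_traversal`		Path traversal / file inclusion
`info_disclosure`	`info`	Information disclosure
`business_logic`		Business logic flaws
`auth_bypass`		Authentication bypass
`cmd_injection`		Command injection / RCE
`open_redirect`	`redirect`	URL redirection
`missing_header`	`header`	Missing security headers
`weak_cred`		Weak credentials
`tls_ssl`	`tls`	TLS/SSL risks
`cookie_security`	`cookie`	Cookie security
`session_mgmt`	`session`	Session management security (session fixation / cookie manipulation)
`cors`		CORS misconfiguration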
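
The alias column can be read as a small expansion table applied to the `--exclude` value and to a profile's exclusion list. The sketch below shows one way to do that; `expand_excludes` is a hypothetical helper, and passing unknown keys through unchanged is an assumption, not documented behaviour.

```python
# Alias table taken from the "Shortcut alias" column above.
ALIASES = {
    "xss": ["xss_reflected", "xss_stored", "xss_dom"],
    "info": ["info_disclosure"],
    "redirect": ["open_redirect"],
    "header": ["missing_header"],
    "tls": ["tls_ssl"],
    "cookie": ["cookie_security"],
    "session": ["session_mgmt"],
}


def expand_excludes(raw: str, profile_excludes=()) -> list:
    """Merge profile excludes with a comma-separated --exclude value."""
    result = []
    for item in list(profile_excludes) + raw.split(","):
        item = item.strip().lower()
        if not item:
            continue
        # Aliases expand to their type keys; anything else is kept as written.
        for key in ALIASES.get(item, [item]):
            if key not in result:
                result.append(key)
    return result


print(expand_excludes("xss,open_redirect,tls"))
# ['xss_reflected', 'xss_stored', 'xss_dom', 'open_redirect', 'tls_ssl']
```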

Custom profile: create a .yaml file in the profiles/ directory:

name: My preferences
description: Custom exclusion rules
exclude_vulns:
  - xss
  - open_redirect
  - tls_ssl

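Output

Each run writes the following under output/<timestamp>-<target>/:

vulns.json: structured vulnerability report
evidence/*.md: evidence files
prompt_audit.md: prompt audit log
judge_round*.json: decision records (orchestrator mode)
report.json: 📋 structured run report (duration, cost, tokens, vulnerability count, tool statistics)
timeline.jsonl: 🔧 tool-call timeline (one event per line, written incrementally)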

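report.json fields

Field	Description
`duration_s`	Total duration (seconds)
`duration_api_s`	Pure AI inference time (seconds), excluding tool execution
`total_cost_usd`	Total cost (USD)
`usage`	Token usage `{input_tokens, output_tokens}`
`tool_calls_count`	Total number of tool calls
`tool_errors_count`	Number of failed tool calls
`vulns_count`	Number of vulnerabilities found
`rounds`	Array of per-round details (orchestrator mode)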

timeline.jsonl event types

type	Description
`tool_call`	Tool call: name, input_preview, tool_use_id
`tool_result`	Tool result: is_error, output_preview
`assistant_msg`	AI message: token_in, token_out
`rate_limit`	Rate-limit event: resets_at, utilization
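
Because timeline.jsonl holds one JSON event per line, it can be summarized with a short script. The sketch below assumes every line is a JSON object with a `type` key and the field names listed in the table; `summarize_timeline` is a hypothetical helper, not part of benchmark/analyze.py.

```python
import json
from collections import Counter


def summarize_timeline(lines) -> dict:
    """Count tool calls per tool, failed tool results, and rate-limit events."""
    calls = Counter()
    errors = 0
    rate_limits = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        kind = event.get("type")
        if kind == "tool_call":
            calls[event.get("name", "unknown")] += 1
        elif kind == "tool_result" and event.get("is_error"):
            errors += 1
        elif kind == "rate_limit":
            rate_limits += 1
    return {"tool_calls": dict(calls), "tool_errors": errors, "rate_limits": rate_limits}


sample = [
    '{"type": "tool_call", "name": "bash", "tool_use_id": "t1"}',
    '{"type": "tool_result", "is_error": true, "output_preview": "timeout"}',
    '{"type": "tool_call", "name": "bash", "tool_use_id": "t2"}',
    '{"type": "rate_limit", "utilization": 0.9}',
]
print(summarize_timeline(sample))
# {'tool_calls': {'bash': 2}, 'tool_errors': 1, 'rate_limits': 1}
```

For a real run, pass an open file object instead of the sample list, since iterating a file yields one line at a time.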

Skills library

All skills come from https://github.com/wgpsec/AboutSecurity/blob/master/skills/

Before use, clone the repository to /pentest/AboutSecurity or $HOME/AboutSecurity, then run sync-skills.sh to symlink the skills into the local tchkiller directory.

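Vulnerability knowledge base (vulndb)

tchkiller ships with a built-in vulnerability knowledge base that records complete exploitation steps and payloads for frequently seen vulnerabilities. Once the pentest agent identifies the product or technology stack during reconnaissance, it is required to query the knowledge base for the exact exploitation method instead of relying on the model's own knowledge.

Why is vulndb needed?

Skills suit methodology: attack methods that can be grouped, such as SQL injection techniques and IDOR bypass strategies
vulndb suits vulnerability details: payloads, exploitation steps, and verification methods for specific CVEs, especially Chinese domestic software (Tongda OA, Zhiyuan OA, Yonyou NC) and obscure CVEs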

Architecture

vulndb/                     ← Markdown vulnerability entries (human-editable)
├── middleware/{product}/    ← Middleware: shiro, fastjson, log4j, tomcat
├── web/{product}/           ← Web applications: tongda-oa, zhiyuan-oa, thinkphp
└── network/{product}/       ← Network protocols

vulndb.sqlite               ← Auto-generated FTS5 search index (unicode61 tokenizer, supports Chinese)
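
For readers unfamiliar with FTS5, the sketch below builds a tiny in-memory index with the unicode61 tokenizer and queries it. It is not the schema produced by vulndb_index.py: only the table name `vulns` appears in this README, while the column names and sample rows are made up for illustration. It also assumes the local SQLite build has FTS5 enabled.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical columns; the real index layout may differ.
conn.execute(
    "CREATE VIRTUAL TABLE vulns USING fts5(vuln_id, product, title, tokenize = 'unicode61')"
)
conn.executemany(
    "INSERT INTO vulns (vuln_id, product, title) VALUES (?, ?, ?)",
    [
        ("demo-001", "log4j", "JNDI lookup remote code execution"),
        ("demo-002", "shiro", "rememberMe deserialization"),
    ],
)


def search_index(query: str) -> list:
    """Return matching entry ids, best match first."""
    rows = conn.execute(
        "SELECT vuln_id FROM vulns WHERE vulns MATCH ? ORDER BY rank", (query,)
    )
    return [row[0] for row in rows]


print(search_index("log4j"))  # ['demo-001']
```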

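Workflow integration

agent recon → identify product → search_vulndb("product name") → read_vuln("CVE-xxx") → exploit with the knowledge-base payload

This flow is written into the system prompt and prompt/agents/*.md as a mandatory step.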

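MCP tools

Tool	Description
`search_vulndb(query, product, severity)`	Search vulnerabilities (supports Chinese, CVE IDs, and product names)
`read_vuln(id)`	Read the full exploitation details of a vulnerability (payload, steps, verification method)
`rebuild_vulndb()`	Rebuild the search index (call it after adding new vulnerabilities)

Adding a vulnerability

Create a .md file under vulndb/{category}/{product}/ (YAML frontmatter + Markdown body)
Run python3 vulndb_index.py to rebuild the index (./sync-skills.sh also rebuilds it automatically)

See VULNDB.md for the detailed template and maintenance guide.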

benchmark

Simulated competition platform benchmark (verifies API interaction)

python3 competition.py --server <HOST>:9000 --token test-token --reset --reset-range --workers 3

python3 competition.py --status

wiz ctf benchmark (verifies Playwright browser interaction and skill recommendation)

python3 main.py https://k8slanparty.com/challenge/1 -m cve_cloud --orchestrator -v --prompt "Analyze the site and solve the CTF challenge https://k8slanparty.com/challenge/1" --browser --team

python3 main.py https://cloudsecuritychampionship.com/challenge/10 -m cve_cloud --orchestrator -v --prompt "Analyze the site and solve the CTF challenge https://cloudsecuritychampionship.com/challenge/10" --browser --team

vulfocus Benchmark (pentest range testing)

python3 main.py http://101.42.116.74:57834 -m src_hunt --orchestrator -v --prompt "The flag format in this scenario may differ; it may be flag-{xxxxxx}" --team

testfire (penetration test)

python3 main.py http://testfire.net -m src_hunt --orchestrator -v --team

XBOW Benchmark (CTF range testing)

# Single range
cd benchmark
./bench-single.sh XBEN-005-24 --remote root@<YOUR_HOST>

# Single range + orchestrator
./bench-single.sh XBEN-005-24 --remote root@<YOUR_HOST> --orchestrator

# Batch run
./bench-runner.sh --remote root@<YOUR_HOST>

Hack The Box

openvpn --config xxx.ovpn

python3 main.py 10.129.24.21 -m src_hunt --orchestrator --team -v --prompt "You are working on a hackthebox challenge. The challenge is named Snapped, the target machine address is 10.129.24.21, and the challenge description is: Snapped is a hard-difficulty machine that features two recent CVEs. The foothold showcases CVE-2026-27944 in Nginx-UI, which exposes the /api/backup endpoint without authentication. The endpoint will produce a full backup of the nginx and nginx-UI configuration files, and includes the key to decrypt the backup in the response headers. This leads to finding and decrypting a weak user password from the Nginx-UI database file. Root exploits CVE-2026-3888, a TOCTOU race condition between snap-confine and systemd-tmpfiles. After the system's cleanup daemon deletes a stale mimic directory under /tmp, the attacker recreates it with controlled content and single-steps snap-confine's execution via AF_UNIX socket backpressure to win the race during the mimic bind-mount sequence reliably. This poisons the sandbox's shared libraries, enabling dynamic linker hijacking on the SUID-root snap-confine binary to compromise the system. Obtain the User Flag and the Root Flag; both are 32 hex characters"

Everything below is less important; reading it is optional

Single-agent mode (default) (can be skipped)

# Basic penetration test
python3 main.py http://testfire.net -m src_hunt -v
# Agent Teams multi-agent collaboration (recon + analyze + exploit)
python3 main.py http://testfire.net -m src_hunt --team -v
# Use the Opus model with more turns
python3 main.py http://testfire.net --model opus --max-turns 200 -v

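Vulnerability deduplication

Two independent deduplication layers:

MCP layer (report_vuln): _normalize_key() normalizes each finding to vuln_type:endpoint_path?params; findings with the same key are merged automatically
Prompt layer (injected in R2/R3): the deduplicated vulnerability list plus a "do not retest" instruction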
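
The MCP-layer key can be illustrated with a small normalization function. This is a guess at the idea, not the actual `_normalize_key()` in mcp_server.py: dropping the host, ignoring parameter values, and sorting parameter names are all assumptions, and `normalize_key` and `merge_vulns` are hypothetical names.

```python
from urllib.parse import parse_qsl, urlsplit


def normalize_key(vuln_type: str, url: str) -> str:
    """Hypothetical normalization to 'vuln_type:endpoint_path?params'."""
    parts = urlsplit(url)
    path = parts.path.rstrip("/") or "/"
    # Keep only sorted parameter names so values and ordering do not matter.
    names = sorted({name for name, _ in parse_qsl(parts.query, keep_blank_values=True)})
    key = f"{vuln_type.strip().lower()}:{path}"
    if names:
        key += "?" + "&".join(names)
    return key


def merge_vulns(vulns: list) -> dict:
    """Keep one finding per key; the first report wins in this sketch."""
    merged = {}
    for vuln in vulns:
        merged.setdefault(normalize_key(vuln["vuln_type"], vuln["url"]), vuln)
    return merged


reports = [
    {"vuln_type": "sqli", "url": "http://testfire.net/search.jsp?query=1"},
    {"vuln_type": "SQLi", "url": "http://testfire.net/search.jsp?query=2"},
]
print(list(merge_vulns(reports)))  # ['sqli:/search.jsp?query']
```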

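report_vuln field tolerance

The agent may pass non-standard parameters; a 5-layer fallback chain makes sure no finding is lost:

Layer	Handling
Layer 0	Parse a `vuln='{JSON}'` blob (extract title/url/severity)
Layer 1	`vuln_id` → cross-reference the evidence files to recover title/url
Layer 2	Empty title → take the first line of description
Layer 3	Empty URL → extract it by regex from all parameter values + detail
Layer 4	Still empty → reject with isError + return usage guidance

Supported parameter aliases: title/name/vuln_name/vuln_type, url/target/endpoint/vuln_url, detail/description/evidence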
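
Part of this chain can be sketched in a few lines: alias resolution, then layers 2 to 4. Layers 0 and 1 are left out because they depend on the blob format and the evidence store. `normalize_report` is a hypothetical function, and the URL regex is an assumption rather than the pattern used by the tool.

```python
import re

# Alias groups copied from the list above; the first non-empty value wins.
ALIASES = {
    "title": ("title", "name", "vuln_name", "vuln_type"),
    "url": ("url", "target", "endpoint", "vuln_url"),
    "detail": ("detail", "description", "evidence"),
}
URL_PATTERN = re.compile(r"https?://[^\s\"'<>]+")


def normalize_report(args: dict) -> dict:
    """Resolve aliases, then apply the title and URL fallbacks."""
    report = {}
    for field, names in ALIASES.items():
        report[field] = next((str(args[n]).strip() for n in names if args.get(n)), "")
    # Layer 2: empty title, so use the first line of the detail text.
    if not report["title"] and report["detail"]:
        report["title"] = report["detail"].splitlines()[0].strip()
    # Layer 3: empty URL, so take the first URL found in any argument value.
    if not report["url"]:
        for value in args.values():
            match = URL_PATTERN.search(str(value))
            if match:
                report["url"] = match.group(0)
                break
    # Layer 4: still incomplete, so reject the call.
    if not report["title"] or not report["url"]:
        raise ValueError("report_vuln needs at least a title and a url")
    return report


print(normalize_report({
    "description": "Reflected XSS in q\nSeen at http://testfire.net/search.jsp?q=1 today",
}))
```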

Orchestrator architecture

┌──────────────────────────────────────────────────────┐
│           Orchestrator (Python main.py)              │
│                                                      │
│  ProviderManager ← providers.json (auto failover)    │
│                                                      │
│  for round in 1..max_rounds:                         │
│    ┌─ Pentest Agent (Claude Sonnet/Opus) ─────────┐  │
│    │  bash → nmap, sqlmap, ffuf, curl             │  │
│    │  MCP Tools → evidence, vulns, scope, skills  │  │
│    │  MCP Tools → search_vulndb, read_vuln        │  │
│    │  Sub-agents (--team):                        │  │
│    │    recon + analyze + exploit                 │  │
│    │    prompts loaded from prompt/agents/*.md    │  │
│    └──────────────────────────────────────────────┘  │
│    ↓ vulns.json + evidence + flag status             │
│    ┌─ Decision Agent (MiniMax-M2.7) ──────────────┐  │
│    │  completion → {complete, feedback, missing}  │  │
│    └──────────────────────────────────────────────┘  │
│    if complete → break                               │
│    else → inject feedback + deduped vulns → next     │
│                                                      │
│  MCP layer:                                          │
│    report_vuln → 5 fallbacks, dedup (_normalize_key) │
│    evidence_save → automatic dedup                   │
│    skills → 68 methods (read_skill / list_skills)    │
│    vulndb → knowledge base (search_vulndb/read_vuln) │
└──────────────────────────────────────────────────────┘

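Analysis tools

Single-run analysis

# Basic analysis (report summary + tool statistics + token analysis + rate limiting)
python3 benchmark/analyze.py output/<run_dir>

# Include the full timeline
python3 benchmark/analyze.py output/<run_dir> --timeline

# JSON output (for consumption by scripts)
python3 benchmark/analyze.py output/<run_dir> --json

The output includes:

📋 Run report summary (duration, cost, AI vs. tool time split)
📊 Per-round detail table (orchestrator mode)
🔧 Top tool calls + error rate
📊 Token consumption distribution (per round)
⏳ Rate-limit events

Multi-run comparison

# Compare multiple runs
python3 benchmark/compare.py output/run1 output/run2 output/run3

# Wildcards
python3 benchmark/compare.py output/2026*

# Sort by efficiency
python3 benchmark/compare.py output/run* --sort efficiency

# Sort options: time | cost | vulns | efficiency

The output includes:

Comparison table (duration, cost, vulnerabilities, tool statistics)
🏆 Automatic best-of markers (most vulnerabilities, lowest cost, fastest)
📊 Efficiency metrics (vulnerabilities/minute, vulnerabilities/$, $/vulnerability, error rate, AI share)
🔄 Orchestrator round comparison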
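
The efficiency metrics named above can be derived from the report.json fields documented earlier. The sketch below shows one plausible set of definitions; the exact formulas in benchmark/compare.py are not given in this README, so the ratios, the key names, and the `efficiency_metrics` helper are assumptions.

```python
def efficiency_metrics(report: dict) -> dict:
    """Derive per-run efficiency ratios from report.json style fields."""

    def ratio(numerator, denominator):
        # Avoid division by zero for runs with no findings, cost, or calls.
        return numerator / denominator if denominator else None

    minutes = report["duration_s"] / 60
    return {
        "vulns_per_minute": ratio(report["vulns_count"], minutes),
        "vulns_per_usd": ratio(report["vulns_count"], report["total_cost_usd"]),
        "usd_per_vuln": ratio(report["total_cost_usd"], report["vulns_count"]),
        "error_rate": ratio(report["tool_errors_count"], report["tool_calls_count"]),
        "ai_share": ratio(report["duration_api_s"], report["duration_s"]),
    }


demo_report = {
    "duration_s": 1200, "duration_api_s": 600, "total_cost_usd": 2.0,
    "tool_calls_count": 50, "tool_errors_count": 5, "vulns_count": 4,
}
print(efficiency_metrics(demo_report))
```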

# Ignore the following
# The Claude CLI needs to bypass the permission check when run as root
# export IS_SANDBOX=1
# claude --dangerously-skip-permissions
