feat(skills): add /pentest bundled skill by emal-avala · Pull Request #137 · avala-ai/agent-code

emal-avala · 2026-04-18T23:13:49Z

Summary

Adds a /pentest bundled skill that runs a five-phase white-box penetration test in-session: recon → slice → vuln analysis → exploit-or-discard → report. Modeled on the phase structure used by mature autonomous pentest pipelines, but runs entirely in the agent loop rather than shelling out.
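As a rough illustration of the phase ordering described above, here is a minimal sketch in Rust. The PR does not show how the skill body encodes its phases (it may well be plain prompt text), so the `Phase` enum, `ORDER` constant, and `next` method are hypothetical names used only to model the sequence.

```rust
/// The five phases named in the summary, in execution order.
/// Hypothetical type: the real skill may express these only as prompt text.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Phase {
    Recon,
    Slice,
    VulnAnalysis,
    ExploitOrDiscard,
    Report,
}

impl Phase {
    /// All phases in the order the skill is described as running them.
    pub const ORDER: [Phase; 5] = [
        Phase::Recon,
        Phase::Slice,
        Phase::VulnAnalysis,
        Phase::ExploitOrDiscard,
        Phase::Report,
    ];

    /// Returns the phase that follows `self`, or `None` once the report is done.
    pub fn next(self) -> Option<Phase> {
        let idx = Self::ORDER.iter().position(|p| *p == self)?;
        Self::ORDER.get(idx + 1).copied()
    }
}
```

Starting from `Phase::Recon` and calling `next` repeatedly visits each phase exactly once and stops after `Phase::Report`.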

The exploit-or-discard gate is the load-bearing piece: every finding must ship with a reproducible PoC (curl command, payload string, or UI repro steps). Findings that cannot be demonstrated from code inspection alone get demoted to INFO or dropped — no theoretical findings pollute the report. Dynamic verification is explicitly constrained to local development instances.
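The gate logic can be sketched as a small filter. This is an illustration under stated assumptions, not the skill's implementation: `Finding`, `Poc`, `Severity`, and `exploit_or_discard` are hypothetical names, and the `keep_as_note` flag is an invented stand-in for whatever criterion decides between demoting to INFO and dropping, which the PR does not specify.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Severity {
    Critical,
    High,
    Medium,
    Low,
    Info,
}

/// The three PoC forms the summary lists.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Poc {
    Curl(String),
    Payload(String),
    UiSteps(Vec<String>),
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Finding {
    pub title: String,
    pub severity: Severity,
    pub poc: Option<Poc>,
    /// Assumption: marks an unproven finding as still worth an INFO note.
    pub keep_as_note: bool,
}

/// Keeps findings that carry a PoC, demotes unproven ones to INFO when
/// flagged as notes, and drops the rest.
pub fn exploit_or_discard(finding: Finding) -> Option<Finding> {
    let has_poc = finding.poc.is_some();
    match (has_poc, finding.keep_as_note) {
        (true, _) => Some(finding),
        (false, true) => Some(Finding {
            severity: Severity::Info,
            ..finding
        }),
        (false, false) => None,
    }
}
```

Modeling the PoC as an `Option` makes the "no proof, no finding" rule a property of the data rather than a convention the report writer has to remember.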

Output is a severity-grouped markdown report (CRITICAL / HIGH / MEDIUM / LOW / INFO) with file:line, CWE mapping, vulnerable snippet, fix, impact, and PoC per finding, plus a ship-readiness verdict.
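To make the report shape concrete, the following sketch groups findings by severity and renders a markdown outline. It is a simplified assumption of the layout: `ReportEntry` and `render_report` are hypothetical names, the heading format is invented, and the entry carries only title, file:line, and CWE rather than the full set of fields listed above.

```rust
use std::collections::BTreeMap;
use std::fmt::Write;

/// Declared most severe first so the derived ordering sorts CRITICAL on top.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Severity {
    Critical,
    High,
    Medium,
    Low,
    Info,
}

impl Severity {
    fn label(self) -> &'static str {
        match self {
            Severity::Critical => "CRITICAL",
            Severity::High => "HIGH",
            Severity::Medium => "MEDIUM",
            Severity::Low => "LOW",
            Severity::Info => "INFO",
        }
    }
}

/// Trimmed-down finding record (hypothetical; the real report has more fields).
pub struct ReportEntry {
    pub severity: Severity,
    pub title: String,
    pub location: String, // file:line
    pub cwe: u32,
}

/// Renders one markdown section per severity that has at least one finding.
pub fn render_report(entries: &[ReportEntry]) -> String {
    let mut groups: BTreeMap<Severity, Vec<&ReportEntry>> = BTreeMap::new();
    for entry in entries {
        groups.entry(entry.severity).or_default().push(entry);
    }

    let mut out = String::new();
    for (severity, items) in &groups {
        // Writing to a String cannot fail, so unwrap is safe here.
        writeln!(out, "## {}", severity.label()).unwrap();
        for e in items {
            writeln!(out, "- {} ({}, CWE-{})", e.title, e.location, e.cwe).unwrap();
        }
    }
    out
}
```

Using a `BTreeMap` keyed on the severity enum keeps the sections in a fixed order regardless of the order in which findings were discovered.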

Complements the existing /security-review skill: security-review is a diff-scoped CWE pass over pending changes; pentest is a full-repo or subsystem-scoped workflow with a stricter proof gate.

Changes

crates/lib/src/skills/mod.rs — register the bundled skill body
README.md — add /pentest row to the bundled skills table
CHANGELOG.md — note under Unreleased

Test plan

cargo test -p agent-code-lib --test skills_integration — all 6 tests pass
cargo build -p agent-code-lib — clean
Reviewer: invoke /pentest <some-subdir> in a local agent session and confirm phase 1 recon runs as expected

Adds a white-box penetration test workflow as a bundled skill, runnable via /pentest [target-dir]. The skill runs five phases in order: recon, slice, vulnerability analysis, exploit-or-discard, and reporting.

The exploit-or-discard gate requires a concrete proof-of-concept for every finding; anything that cannot be demonstrated from code inspection alone is demoted to INFO or dropped, keeping the output free of theoretical findings.

The skill writes a severity-grouped markdown report (CRITICAL / HIGH / MEDIUM / LOW / INFO) with file:line, CWE mapping, vulnerable snippet, fix, impact, and PoC per finding. Dynamic verification is constrained to local development instances: the skill explicitly rejects running against production.

- crates/lib/src/skills/mod.rs: register the bundled skill body
- README.md: add /pentest row to the bundled skills table
- CHANGELOG.md: note under Unreleased

All 6 skills_integration tests pass.
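The local-only constraint on dynamic verification could be backed by a host check along these lines. This is a conservative sketch, not the skill's actual mechanism (which the PR does not show and may be enforced purely through the prompt): `is_local_target` is a hypothetical helper, and it treats only `localhost` and loopback addresses as local, which is narrower than "local development instances" might mean in practice.

```rust
use std::net::IpAddr;

/// Returns true only for hosts that clearly point at the local machine.
/// Anything that does not parse as a loopback address is rejected.
pub fn is_local_target(host: &str) -> bool {
    if host.eq_ignore_ascii_case("localhost") {
        return true;
    }
    // Strip the brackets used around IPv6 literals, e.g. "[::1]".
    let trimmed = host.trim_start_matches('[').trim_end_matches(']');
    match trimmed.parse::<IpAddr>() {
        Ok(ip) => ip.is_loopback(),
        Err(_) => false,
    }
}
```

Defaulting to rejection means an unrecognized hostname is never probed, which matches the intent of refusing to run against production.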


emal-avala merged commit 14f1e99 into main Apr 18, 2026
13 of 14 checks passed

emal-avala deleted the feat/skill-pentest branch April 18, 2026 23:26

emal-avala mentioned this pull request Apr 23, 2026

Release v0.17.0 #172

Merged

4 tasks

emal-avala merged 1 commit into main from feat/skill-pentest