feat(skills): add /pentest bundled skill#137

Merged
emal-avala merged 1 commit into main from feat/skill-pentest
Apr 18, 2026

@emal-avala

Summary

Adds a /pentest bundled skill that runs a five-phase white-box penetration test in-session: recon → slice → vuln analysis → exploit-or-discard → report. Modeled on the phase structure used by mature autonomous pentest pipelines, but runs entirely in the agent loop rather than shelling out.
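
As an illustration of the fixed phase ordering described above, here is a minimal sketch. The `Phase` enum and its `next` method are hypothetical names used only for this example. The bundled skill is described as running inside the agent loop, so it may not model phases in Rust at all.

```rust
/// The five phases named in the summary, in execution order.
/// Hypothetical type for illustration; not taken from the PR's code.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Phase {
    Recon,
    Slice,
    VulnAnalysis,
    ExploitOrDiscard,
    Report,
}

impl Phase {
    /// Returns the phase that follows `self`, or `None` after `Report`.
    pub fn next(self) -> Option<Phase> {
        match self {
            Phase::Recon => Some(Phase::Slice),
            Phase::Slice => Some(Phase::VulnAnalysis),
            Phase::VulnAnalysis => Some(Phase::ExploitOrDiscard),
            Phase::ExploitOrDiscard => Some(Phase::Report),
            Phase::Report => None,
        }
    }

    /// Collects the full pipeline by starting at `Recon` and following `next`.
    pub fn pipeline() -> Vec<Phase> {
        let mut phases = vec![Phase::Recon];
        while let Some(next) = phases[phases.len() - 1].next() {
            phases.push(next);
        }
        phases
    }
}
```

Encoding the order in `next` rather than in a loose list means a phase cannot be skipped or reordered without changing the match arms.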

The exploit-or-discard gate is the load-bearing piece: every finding must ship with a reproducible PoC (curl command, payload string, or UI repro steps). Findings that cannot be demonstrated from code inspection alone get demoted to INFO or dropped — no theoretical findings pollute the report. Dynamic verification is explicitly constrained to local development instances.
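
The gate can be pictured as a filter over candidate findings. The sketch below is an assumption-laden illustration, not the skill's implementation: `Finding`, `Poc`, and `exploit_or_discard` are hypothetical names, and the `keep_as_info` flag stands in for whatever judgment decides between demoting and dropping, which the description does not specify.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Severity {
    Critical,
    High,
    Medium,
    Low,
    Info,
}

/// The three PoC forms listed in the description.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Poc {
    Curl(String),
    Payload(String),
    UiSteps(Vec<String>),
}

/// Hypothetical finding record for illustration only.
#[derive(Debug, Clone)]
pub struct Finding {
    pub title: String,
    pub severity: Severity,
    pub poc: Option<Poc>,
    /// Assumption: some signal decides whether an unproven finding is
    /// still worth recording as INFO instead of being dropped.
    pub keep_as_info: bool,
}

/// Keeps findings that carry a PoC at their original severity.
/// Unproven findings are demoted to INFO or dropped.
pub fn exploit_or_discard(findings: Vec<Finding>) -> Vec<Finding> {
    findings
        .into_iter()
        .filter_map(|mut f| {
            if f.poc.is_some() {
                Some(f)
            } else if f.keep_as_info {
                f.severity = Severity::Info;
                Some(f)
            } else {
                None
            }
        })
        .collect()
}
```

Under these assumptions, a HIGH finding without a PoC can never reach the report as HIGH: it either arrives as INFO or does not arrive at all.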

Output is a severity-grouped markdown report (CRITICAL / HIGH / MEDIUM / LOW / INFO) with file:line, CWE mapping, vulnerable snippet, fix, impact, and PoC per finding, plus a ship-readiness verdict.
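
To make the report shape concrete, the following sketch renders findings grouped by severity with the per-finding fields listed above. All names (`Finding`, `render_report`) and the exact markdown layout are hypothetical. The verdict rule used here (not ready if any CRITICAL or HIGH finding exists) is an assumption, since the description does not say how the verdict is derived.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Severity {
    Critical,
    High,
    Medium,
    Low,
    Info,
}

impl Severity {
    /// Report order, most severe first.
    pub const ORDER: [Severity; 5] = [
        Severity::Critical,
        Severity::High,
        Severity::Medium,
        Severity::Low,
        Severity::Info,
    ];

    pub fn label(self) -> &'static str {
        match self {
            Severity::Critical => "CRITICAL",
            Severity::High => "HIGH",
            Severity::Medium => "MEDIUM",
            Severity::Low => "LOW",
            Severity::Info => "INFO",
        }
    }
}

/// Hypothetical record carrying the fields named in the description.
pub struct Finding {
    pub severity: Severity,
    pub title: String,
    pub file: String,
    pub line: u32,
    pub cwe: u32,
    pub snippet: String,
    pub fix: String,
    pub impact: String,
    pub poc: String,
}

/// Renders a markdown report grouped by severity, then a verdict line.
pub fn render_report(findings: &[Finding]) -> String {
    let mut out = String::from("# Pentest report\n");
    for sev in Severity::ORDER {
        let group: Vec<&Finding> = findings.iter().filter(|f| f.severity == sev).collect();
        if group.is_empty() {
            continue; // Skip empty severity sections.
        }
        out.push_str(&format!("\n## {}\n", sev.label()));
        for f in group {
            out.push_str(&format!("\n### {}\n", f.title));
            out.push_str(&format!("- Location: {}:{}\n", f.file, f.line));
            out.push_str(&format!("- CWE: CWE-{}\n", f.cwe));
            out.push_str(&format!("- Snippet: `{}`\n", f.snippet));
            out.push_str(&format!("- Fix: {}\n", f.fix));
            out.push_str(&format!("- Impact: {}\n", f.impact));
            out.push_str(&format!("- PoC: `{}`\n", f.poc));
        }
    }
    // Assumed verdict rule: any CRITICAL or HIGH finding blocks shipping.
    let blocked = findings
        .iter()
        .any(|f| matches!(f.severity, Severity::Critical | Severity::High));
    let verdict = if blocked { "NOT READY" } else { "READY" };
    out.push_str(&format!("\n## Ship-readiness verdict\n\n{}\n", verdict));
    out
}
```

Iterating over `Severity::ORDER` rather than sorting the findings keeps the section order fixed regardless of the order in which findings were discovered.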

Complements the existing /security-review skill: security-review is a diff-scoped CWE pass over pending changes; pentest is a full-repo or subsystem-scoped workflow with a stricter proof gate.

Changes

  • crates/lib/src/skills/mod.rs — register the bundled skill body
  • README.md — add /pentest row to the bundled skills table
  • CHANGELOG.md — note under Unreleased

Test plan

  • cargo test -p agent-code-lib --test skills_integration — all 6 tests pass
  • cargo build -p agent-code-lib — clean
  • Reviewer: invoke /pentest <some-subdir> in a local agent session and confirm phase 1 recon runs as expected

Adds a white-box penetration test workflow as a bundled skill, runnable
via /pentest [target-dir]. The skill runs five phases in order: recon,
slice, vulnerability analysis, exploit-or-discard, and reporting. The
exploit-or-discard gate requires a concrete proof-of-concept for every
finding; anything that cannot be demonstrated from code inspection alone
is demoted to INFO or dropped, keeping the output free of theoretical
findings.

The skill writes a severity-grouped markdown report (CRITICAL / HIGH /
MEDIUM / LOW / INFO) with file:line, CWE mapping, vulnerable snippet,
fix, impact, and PoC per finding. Dynamic verification is constrained
to local development instances — the skill explicitly rejects running
against production.

- crates/lib/src/skills/mod.rs: register the bundled skill body
- README.md: add /pentest row to the bundled skills table
- CHANGELOG.md: note under Unreleased

All 6 skills_integration tests pass.
@emal-avala emal-avala merged commit 14f1e99 into main Apr 18, 2026
13 of 14 checks passed
@emal-avala emal-avala deleted the feat/skill-pentest branch April 18, 2026 23:26
@emal-avala emal-avala mentioned this pull request Apr 23, 2026
4 tasks