GitHub - PentesterFlow/agent: Agentic offensive-security in your terminal

Human-in-the-loop Agentic AI CLI for penetration testers and bug hunters.

PentesterFlow helps security engineers move through recon, enumeration, validation, evidence collection, and reporting while keeping the analyst in control.

Install · Quickstart · Lifecycle · Memory · Burp · Security

$ pentesterflow
╭────────────────────────────────────────────────╮
│  PentesterFlow                                 │
│  local agent · tools ready · analyst approved  │
╰────────────────────────────────────────────────╯

› /target https://app.example.com
  target set to https://app.example.com

› test the orders API for broken access control
⏺ Skill webvuln
  ⎿ loaded skill: webvuln
⏺ http GET https://app.example.com/api/v1/orders/1043
  ⎿ 200 OK
⏺ BashTool(curl -s -H "Authorization: Bearer $USER_B" ...)
  ⎿ cross-account response confirmed
⏺ Confirmed Finding (high) IDOR on /api/v1/orders/{id}
  ⎿ written to ./findings/idor-orders.md

Overview

PentesterFlow is an open-source terminal assistant designed specifically for authorized offensive-security work. It connects to local or hosted LLMs, plans against a scoped target, uses real pentesting tools, asks for approval before sensitive actions, remembers useful lessons across sessions, and writes evidence-backed findings.

It is built around three ideas:

Analyst control: the human approves sensitive actions and decides scope.
Transparent execution: curl-first, reproducible commands, visible tool calls, saved evidence, and audit-friendly logs.
Operational learning: local project and personal knowledge bases improve future sessions without retraining the model or adding user-facing complexity.

Warning

Use PentesterFlow only on systems where you have explicit authorization. The agent can run shell commands, make HTTP requests, edit files, and process captured traffic after approval.

Why PentesterFlow

Current agentic AI systems often struggle with security-specific workflows, hallucinated findings, weak context retention, poor tool integration, and limited auditability. PentesterFlow addresses those gaps with:

Challenge	PentesterFlow approach
Generic AI workflows	Built-in pentest skills for recon, web vulns, SSRF, SSTI, JWT, GraphQL, race, takeover, Supabase, and deserialization.
Hallucinated findings	`confirm_finding` should be used only after reproduction with request/response evidence.
Long engagements	Saved sessions, compaction, context snapshots, resume recap, and continuous local learning.
Real-world tooling	Shell/Bash, HTTP, Burp bridge, browser capture, MCP, file tools, grep/glob, and custom plugins.
Human oversight	Permission prompts, allow-once/session decisions, and explicit YOLO mode for labs.
Reproducibility	Copy-pasteable commands, Markdown findings, JSON-lines logs, and stable session files.
Large attack surfaces	Coverage tracking, `/next`, skills, captured traffic queries, and learned coverage gaps.

Core Capabilities

Area	What it provides
Agent loop	Plan, act, observe, verify, report, and learn across scoped tasks.
Model backends	Ollama, LM Studio, Kimi, Groq, Gemini, and OpenAI-compatible APIs.
Tools	Shell/Bash, HTTP, file tools, search, browser capture, Burp ingest, MCP, and finding confirmation.
Skills	Markdown playbooks with methodology, payloads, constraints, and allowed tools.
Memory	Session memory, context snapshots, resume recap, and continuous local intelligence.
Reporting	Confirmed findings saved to `./findings/<slug>.md` with evidence, impact, PoC, and remediation.
UX	Full-width terminal UI, slash commands, compact transcripts, permission modals, and interactive provider/model setup.

Install

The installers download the latest standalone binary for your OS and verify the published SHA-256 checksum when available.

# macOS / Linux
curl -fsSL https://raw.githubusercontent.com/PentesterFlow/agent/main/install.sh | sh

# Windows PowerShell
irm https://raw.githubusercontent.com/PentesterFlow/agent/main/install.ps1 | iex

Pin a release or choose an install directory:

PENTESTERFLOW_VERSION=v0.1.6 PENTESTERFLOW_INSTALL_DIR="$HOME/.local/bin" \
  sh -c "$(curl -fsSL https://raw.githubusercontent.com/PentesterFlow/agent/main/install.sh)"

Download binaries directly from GitHub Releases:

OS	Assets
macOS	`pentesterflow-darwin-arm64`, `pentesterflow-darwin-x64`
Linux	`pentesterflow-linux-arm64`, `pentesterflow-linux-x64`
Windows	`pentesterflow-windows-x64.exe`

The x64 standalone binaries are built with Bun's baseline runtime for older x86_64 CPUs. They do not require AVX2.

Quickstart

# Local model example
ollama pull qwen2.5-coder:32b
pentesterflow

Inside the CLI:

/provider
/target https://app.example.com
map the authenticated API surface and test for IDOR

Resume a previous assessment:

pentesterflow --resume <session-id>

On resume, PentesterFlow automatically shows a recap of the previous session's persistent memory so you can continue without manually reconstructing context.

Providers

Interactive setup:

/provider
/model list
/model <id>

CLI examples:

# Ollama
pentesterflow --backend ollama --model qwen2.5-coder:32b

# LM Studio
pentesterflow --backend lmstudio --model zai-org/glm-4.7-flash

# OpenAI-compatible endpoint
pentesterflow --backend openai-compat \
  --base-url https://api.example.com/v1 \
  --api-key sk-...

# Kimi
MOONSHOT_API_KEY=sk-... pentesterflow --backend kimi --model kimi-k2.6

# Groq
GROQ_API_KEY=gsk_... pentesterflow --backend groq --model openai/gpt-oss-20b

# OpenRouter
OPENROUTER_API_KEY=sk-or-... pentesterflow --backend openrouter --model openrouter/auto

# DeepSeek
DEEPSEEK_API_KEY=sk-... pentesterflow --backend deepseek --model deepseek-v4-flash

# Gemini
GEMINI_API_KEY=AIza... pentesterflow --backend gemini --model models/gemini-3.5-flash

Notes:

Groq sessions use a compact prompt and lower compaction threshold to avoid on-demand TPM errors during long assessments.
LM Studio responses are protected with stop tokens and template-marker trimming to avoid repeated <|user|> / <|observation|> leakage.
Gemini picker highlights recommended and cheap-cost models.

Pentest Lifecycle

PentesterFlow is designed to assist across the full engagement:

Scope: set target URL, constraints, credentials, and authorization notes.
Recon: discover hosts, endpoints, technologies, files, APIs, and exposed metadata.
Enumeration: map parameters, roles, auth states, captured browser/Burp traffic, and attack surfaces.
Validation: reproduce candidate issues with deterministic requests and compare evidence.
Coverage: track tested endpoint/parameter/vulnerability-class tuples and ask /next for untested work.
Reporting: persist confirmed findings with PoC, evidence, impact, and remediation.
Learning: save reusable lessons silently so future sessions improve.

Continuous Learning

PentesterFlow includes a local Continuous Learning System. It improves future sessions without retraining model weights and without requiring users to manage memory manually.

What it stores:

User preferences and working style.
Important decisions and project context.
Successful workflows and proven commands.
Mistakes, failed assumptions, and lessons learned.
Coverage gaps, missed checks, and follow-up scenarios.
Finding patterns and evidence requirements.
Tool/config patterns that worked well.

Where it stores memory:

Path	Purpose
`./.pentesterflow/intelligence/scenarios.jsonl`	Project-specific intelligence for the current engagement/workspace.
`~/.pentesterflow/intelligence/scenarios.jsonl`	Personal reusable intelligence across future projects.

How it behaves:

Learning runs in the background after completed turns and compactions.
Retrieval is silent and injected as hidden context only when relevant.
Duplicate project/personal memories are deduped before reaching the model.
Secrets are redacted before storage.
Learning failures are logged, not shown as user-facing task errors.

This keeps the user experience simple while making the agent more effective over time.

Session Memory And Resume

PentesterFlow saves sessions under ~/.pentesterflow/sessions/*.json.

ls -lt ~/.pentesterflow/sessions/*.json | head
pentesterflow --resume <session-id>

Session continuity includes:

Saved conversation history.
Persistent compacted memory.
Target state.
Resume recap on startup.
Context snapshots under ~/.pentesterflow/context/.
Five-minute automatic snapshots during active sessions.

Useful commands:

Command	Purpose
`/compact`	Summarize the current session into persistent memory.
`/memory`	Show saved facts + the session checkpoint.
`/memory add <text>`	Save a durable fact (same as `#<text>`).
`/memory list`	List saved facts.
`/memory forget <text>`	Drop saved facts and checkpoint items matching the text.
`/snapshot`	Write a redacted context snapshot immediately.
`/next [objective]`	Ask for coverage-driven next steps.

Saved memory (`#` quick-add)

Type # followed by anything you want the agent to remember for the rest of this session and beyond — for example #orders API is IDOR-prone on /api/orders/{id}. Use #!<text> to save it to your personal scope instead of the project.

Saved facts are durable, human-readable Markdown — one file per fact with frontmatter — under ./.pentesterflow/memory/ (project) and ~/.pentesterflow/memory/ (personal), with a generated MEMORY.md index.
The fact catalog is pinned into the system prompt on every turn, so it survives compaction; the facts most relevant to the current turn are recalled in full automatically (you'll see a recalled memory: … line).
Secrets are redacted before a fact is written to disk.
Manage them with #<text> / /memory add, /memory list, and /memory forget <text>.

Burp Integration

Use the companion PentesterFlow Burp Integration tool to send selected Burp traffic into the CLI and import confirmed findings back into Burp.

Start the local PentesterFlow listener:

pentesterflow --burp
pentesterflow --burp 9999

From source:

npm run dev -- --burp 9999

The Burp/PentesterFlow bridge supports:

Sending selected Burp requests into PentesterFlow.
Queuing requests as scan tasks.
Importing confirmed findings back into Burp issues.
Preserving full raw requests for evidence and replay.
Reading captured requests and issues through browser_capture_* tools.

The default listener is http://127.0.0.1:9999.

Browser Capture And MCP

pentesterflow --burp starts a local ingest server for captured requests, endpoints, and browser snapshots. The companion pentesterflow-browser-mcp binary exposes the same capture data as an MCP server for compatible clients.

{
  "mcpServers": {
    "pentesterflow-browser": {
      "command": "pentesterflow-browser-mcp",
      "args": []
    }
  }
}

Slash Commands

Command	Description
`/help`	Show keybindings and command reference.
`/provider`	Pick backend, API key, and model interactively.
`/model <id>` / `/model list`	Switch or list backend models.
`/plan [objective]`	Plan-only turn without tool execution.
`/next [objective]`	Coverage-driven next test suggestions.
`/target <url>`	Set or clear the engagement base URL.
`/compact`	Summarize into persistent session memory.
`/memory`	Show current persistent session memory.
`/snapshot`	Write a redacted context snapshot now.
`/burp [port]`	Start the local Burp/PentesterFlow bridge and print its URL + token.
`/skills [enable\|disable\|new <name>]`	Manage or scaffold skills.
`/maxsteps <n>`	Set the per-turn tool-call cap.
`/thinking on\|off`	Toggle visible reasoning guidance.
`/update [version]`	Install the latest or pinned release.
`/yolo [on\|off]`	Toggle auto-approval mode for labs.
`/reset`	Clear conversation and saved session state.
`/clear`	Clear only the on-screen transcript.
`/<skill-name>`	Load a skill into the next turn.
`/exit`	Quit.

Command-Line Flags

Flag	Description
`--backend ollama\|lmstudio\|kimi\|groq\|openrouter\|deepseek\|gemini\|openai-compat`	Select the LLM backend.
`--model <id>`	Set the model id.
`--base-url <url>` / `--api-key <key>`	Configure remote or OpenAI-compatible backends.
`--skills <dirs>`	Load extra skill directories.
`--resume <session-id>`	Resume a saved session and show recap.
`--browser`	Enable Browser MCP tools for the current session.
`--burp [port]`	Start the local Burp/PentesterFlow bridge.
`--browser-ingest [port]`	Deprecated alias for `--burp`.
`--no-stream`	Disable streaming for providers with SSE/tool-call issues.
`--yolo`	YOLO mode: auto-approve non-sensitive tool calls (alias: `--dangerously-skip-permissions`).
`--list-tools` / `--list-skills`	Print registered tools or discovered skills.
`--log <path>`	Override the JSON-lines log path.
`--debug-session`	Write a full JSON-lines debug session log.
`--debug-session-path <path>`	Write debug session log to a custom path.
`--version` / `--help`	Print version or help.

Tools

Tool	Purpose
`shell` / `BashTool`	Run shell commands with approval and safety checks.
`http`	Send HTTP/HTTPS requests against full URLs or active `/target`.
`file_read` / `file_write` / `file_edit`	Read, create, and patch files.
`GlobTool` / `GrepTool`	Discover files and search content.
`web_fetch` / `web_search`	Fetch pages or run web searches.
`ask_user`	Ask for a decision when scope or direction is ambiguous.
`confirm_finding`	Save verified findings to `./findings/<slug>.md`.
`coverage`	Track tested endpoint/parameter/vulnerability-class tuples.
`load_skill`	Load methodology playbooks into context.
`browser_capture_*`	Query captured browser/Burp traffic, endpoints, requests, issues, and snapshots.

Skills

Skills are Markdown playbooks that package methodology, payloads, and tool constraints. Built-in skills include:

Skill	Focus
`recon`	Subdomains, fingerprinting, content discovery, and attack-surface mapping.
`webvuln`	IDOR, broken access control, injection, auth, and session logic.
`ssrf`	Filter bypasses, metadata access, internal reachability, and blind SSRF.
`ssti`	Template-engine fingerprinting and escalation paths.
`jwt`	Algorithm confusion, `kid` abuse, weak secrets, and token validation flaws.
`graphql`	Introspection, authorization gaps, batching, and depth abuse.
`race`	TOCTOU issues, limit bypasses, and race-condition verification.
`takeover`	Dangling DNS and unclaimed cloud resources.
`supabase`	Row-Level Security and anonymous access mistakes.
`deserialize`	Unsafe deserialization sinks and gadget-chain testing.

Discovery order:

Built-in skills/
Project-local ./.pentesterflow/skills/
Personal ~/.pentesterflow/skills/
Directories passed with --skills

Later entries win on name collisions.

Reporting

The confirm_finding tool writes confirmed issues to:

./findings/<slug>.md

Reports include:

Title and severity.
Affected URL, method, parameter, and payload when available.
Response excerpt proving the issue.
Impact and remediation.
Copy-pasteable curl reproduction command.
Raw request material for Burp issue import when available.

Security Model

Authorized use only: built for permitted security work.
Human-in-the-loop by default: permission-gated tools require allow once, allow session, or deny.
Sensitive path protection: high-risk local paths remain gated.
Shell safeguards: catastrophic command patterns are blocked before execution.
Credential redaction: compaction, snapshots, and learning paths redact common secret formats.
Transparent evidence: findings should be backed by reproducible requests and observed responses.
Auditability: sessions, logs, findings, coverage, and release artifacts are written to deterministic local paths.

Configuration And Data

Path	Contents
`~/.pentesterflow/config.json`	Backend, model, endpoint, and disabled-skill settings.
`~/.pentesterflow/sessions/*.json`	Saved sessions for `--resume`.
`~/.pentesterflow/context/*.md`	Redacted context snapshots.
`./.pentesterflow/intelligence/scenarios.jsonl`	Project intelligence learned from this workspace.
`~/.pentesterflow/intelligence/scenarios.jsonl`	Personal reusable intelligence across projects.
`~/.pentesterflow/builtin-skills/<name>/SKILL.md`	Installer-managed shipped skills.
`~/.pentesterflow/skills/<name>/SKILL.md`	Personal skills.
`./.pentesterflow/skills/<name>/SKILL.md`	Project-local skills.
`./findings/<slug>.md`	Confirmed findings for the current engagement.
`./findings/coverage-<session-id>.json`	Coverage state for endpoint/parameter/vulnerability-class testing.
`~/.pentesterflow/logs/pentesterflow.log`	Structured JSON-lines logs.
`~/.pentesterflow/debug/session-*.jsonl`	Opt-in full session debug logs.

Enable complete debug logs when reproducing usage issues:

pentesterflow --debug-session
PENTESTERFLOW_DEBUG_SESSION=1 pentesterflow
PENTESTERFLOW_DEBUG_SESSION=1 PENTESTERFLOW_DEBUG_SESSION_PATH=/tmp/pf-debug.jsonl pentesterflow

Treat debug logs as sensitive because they can contain target data, command output, and copied request material.

Develop

npm install
npm run dev -- --version
npm run dev -- --burp 9999
npm run typecheck
npm run lint
npm run test
npm run build
node dist/cli.js

npm run ci runs typecheck, lint, tests, and build.

Contributing

Issues and pull requests are welcome. Keep changes focused, include tests for behavioral updates, and run npm run ci before opening a pull request. New skills should include a SKILL.md and pass the skill conformance tests.

License

Apache-2.0. Use responsibly and only with authorization.

Report an issue · Request a feature · Releases

Name		Name	Last commit message	Last commit date
Latest commit History 44 Commits
.github/workflows		.github/workflows
assets		assets
scripts		scripts
skills		skills
src		src
.gitignore		.gitignore
AUDIT.md		AUDIT.md
CHANGELOG.md		CHANGELOG.md
LICENSE		LICENSE
PROJECT.md		PROJECT.md
README.md		README.md
biome.json		biome.json
install.ps1		install.ps1
install.sh		install.sh
package-lock.json		package-lock.json
package.json		package.json
tsconfig.json		tsconfig.json
tsup.config.ts		tsup.config.ts
vitest.config.ts		vitest.config.ts

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Human-in-the-loop Agentic AI CLI for penetration testers and bug hunters.

Overview

Why PentesterFlow

Core Capabilities

Install

Quickstart

Providers

Pentest Lifecycle

Continuous Learning

Session Memory And Resume

Saved memory (`#` quick-add)

Burp Integration

Browser Capture And MCP

Slash Commands

Command-Line Flags

Tools

Skills

Reporting

Security Model

Configuration And Data

Develop

Contributing

License

About

Uh oh!

Releases 16

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Human-in-the-loop Agentic AI CLI for penetration testers and bug hunters.

Overview

Why PentesterFlow

Core Capabilities

Install

Quickstart

Providers

Pentest Lifecycle

Continuous Learning

Session Memory And Resume

Saved memory (# quick-add)

Burp Integration

Browser Capture And MCP

Slash Commands

Command-Line Flags

Tools

Skills

Reporting

Security Model

Configuration And Data

Develop

Contributing

License

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases 16

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Saved memory (`#` quick-add)

Packages