Nightfang

Security research automation for the AI era
Scan LLM endpoints. Audit npm packages. Review source code. Prove every finding is real.

Quick Start · Commands · How It Works · What It Scans · Comparison · CI/CD · About

Nightfang is an open-source pentesting toolkit that combines four autonomous AI agents with a template-driven attack engine. Point it at an API, an npm package, or a Git repo — it discovers vulnerabilities, attacks them, re-exploits each finding to eliminate false positives, and generates SARIF reports that plug straight into GitHub's Security tab.

One command. Zero config. Every finding verified with proof.

Agent	Status	Findings
Discover	Completed	Endpoints mapped
Attack	Completed	9 probes executed
Verify	Completed	9 findings confirmed
Report	Completed	4 critical, 3 high, 2 medium

Last scan: 2026-03-27 23:41 UTC — View full report

Quick Start

# Scan an LLM endpoint
npx nightfang scan --target https://your-app.com/api/chat

# Audit an npm package for vulnerabilities
npx nightfang audit lodash

# Deep security review of a codebase
npx nightfang review ./my-ai-app

That's it. Nightfang discovers your attack surface, launches targeted attacks, verifies findings, and generates a report — all in under 5 minutes.

Commands

Nightfang ships five commands — from quick API probes to deep source-level audits:

Command	What It Does	Example
`scan`	Probe LLM endpoints, MCP servers, and AI APIs for vulnerabilities	`npx nightfang scan --target https://api.example.com/chat`
`audit`	Install and security-audit any npm package with static analysis + AI review	`npx nightfang audit express@4.18.2`
`review`	Deep source code security review of a local repo or GitHub URL	`npx nightfang review https://github.com/user/repo`
`history`	Browse past scans with status, depth, findings count, and duration	`npx nightfang history --limit 20`
`findings`	Query, filter, and inspect verified findings across all scans	`npx nightfang findings list --severity critical`

How It Works

Nightfang runs four specialized AI agents in sequence. Each agent builds on the previous one's output:

  +-----------+     +-----------+     +-----------+     +-----------+
  | DISCOVER  | --> |  ATTACK   | --> |  VERIFY   | --> |  REPORT   |
  |  (Recon)  |     | (Offense) |     | (Confirm) |     | (Output)  |
  +-----------+     +-----------+     +-----------+     +-----------+
   Maps endpoints    Runs 47+ test    Re-exploits       Generates SARIF,
   Model detection   cases across     each finding       Markdown, and JSON
   System prompt     7 categories     to kill false      with severity +
   extraction        of attacks       positives          remediation

Agent	Role	What It Does
Discover	Recon	Maps endpoints, detects models, extracts system prompts, enumerates MCP tool schemas
Attack	Offense	Prompt injection, jailbreaks, tool poisoning, data exfiltration, encoding bypasses — 12 attack templates, 7 categories
Verify	Validation	Re-exploits each finding independently. If it can't reproduce it, it's killed as a false positive
Report	Output	SARIF for GitHub Security tab, Markdown for humans, JSON for pipelines — with severity scores and remediation

The verification step is the differentiator. No more triaging 200 "possible prompt injections" that turn out to be nothing.

What Nightfang Scans

Target	Command	How
LLM Endpoints — ChatGPT, Claude, Llama APIs, custom chatbots	`scan --target <url>`	HTTP probing + multi-turn agent attacks
MCP Servers — Tool schemas, input validation, authorization	`scan --target <url> --mode mcp`	Connects to server, enumerates tools, tests each
Web Apps & APIs — AI-powered copilots, agents, RAG pipelines	`scan --target <url> --mode deep --repo ./src`	API probing + source code analysis
npm Packages — Dependency supply chain, malicious code	`audit <package>`	Installs in sandbox, runs semgrep + AI code review
Git Repositories — Source-level security review	`review <path-or-url>`	Deep analysis with Claude Code, Codex, or Gemini CLI

OWASP LLM Top 10 Coverage

#	Category	Status
LLM01	Prompt Injection	✅ Direct + indirect + encoding bypass
LLM02	Insecure Output Handling	✅ XSS, code exec via output
LLM03	Training Data Poisoning	🚧 Detection only
LLM04	Model Denial of Service	✅ Resource exhaustion probes
LLM05	Supply Chain Vulnerabilities	✅ MCP tool poisoning, npm audit, dependency confusion
LLM06	Sensitive Information Disclosure	✅ PII/secret extraction
LLM07	Insecure Plugin Design	✅ Tool schema abuse, SSRF via tools
LLM08	Excessive Agency	✅ Privilege escalation, unauthorized actions
LLM09	Overreliance	🚧 Hallucination-based trust attacks
LLM10	Model Theft	✅ Model extraction, prompt theft

Example Output

See the demo GIF above for real scan output, or run it yourself:

npx nightfang scan --target https://your-app.com/api/chat --depth quick

For a verbose view with the animated attack replay:

npx nightfang scan --target https://your-app.com/api/chat --verbose

Scan Depth & Cost

Depth	Test Cases	Time	Estimated Cost
`quick`	~15	~1 min	$0.05–$0.15
`default`	~50	~3 min	$0.15–$0.50
`deep`	~150	~10 min	$0.50–$1.00

Cost depends on the LLM provider you configure. Nightfang supports OpenAI, Anthropic, and local models via Ollama.

# Quick scan for CI
npx nightfang scan --target https://api.example.com/chat --depth quick

# Deep audit before launch
npx nightfang scan --target https://api.example.com/chat --depth deep

# Source + API scan with Claude Code
npx nightfang scan --target https://api.example.com/chat --runtime claude --mode deep --repo ./src

# MCP server audit
npx nightfang scan --target https://mcp-server.example.com --mode mcp --runtime claude

# Audit an npm package
npx nightfang audit react --depth deep --runtime claude

# Review a GitHub repo
npx nightfang review https://github.com/user/repo --runtime codex --depth deep

Runtime Modes

Bring your own agent CLI — Nightfang orchestrates it:

Runtime	Flag	Best For
`api`	`--runtime api`	CI, quick scans — fast, cheap, no dependencies (default)
`claude`	`--runtime claude`	Attack generation, deep analysis — spawns Claude Code CLI
`codex`	`--runtime codex`	Verification, source analysis — spawns Codex CLI
`gemini`	`--runtime gemini`	Large context source analysis — spawns Gemini CLI
`opencode`	`--runtime opencode`	Multi-provider flexibility — spawns OpenCode CLI
`auto`	`--runtime auto`	Best overall — auto-detects installed runtimes, picks best per stage

Combined with scan modes:

Mode	Flag	Description
`probe`	`--mode probe`	Send payloads to API, check responses (default)
`deep`	`--mode deep`	API probing + source code audit (requires `--repo`)
`mcp`	`--mode mcp`	Connect to MCP server, enumerate tools, test each for security issues

deep and mcp modes require a process runtime (claude, codex, gemini, opencode, or auto).

How It Compares

Feature	Nightfang	promptfoo	garak	semgrep	nuclei
Autonomous multi-agent pipeline	✅ 4 specialized agents	❌ Single runner	❌ Single runner	❌ Rule-based	❌ Template runner
Verification (no false positives)	✅ Re-exploits to confirm	❌	❌	❌	❌
LLM endpoint scanning	✅ Prompt injection, jailbreaks, exfil	✅ Red-teaming	✅ Probes	❌	❌
MCP server security	✅ Tool poisoning, schema abuse	❌	❌	❌	❌
npm package audit	✅ Semgrep + AI review	❌	❌	✅ Rules only	❌
Source code review	✅ AI-powered deep analysis	❌	❌	✅ Rules only	❌
OWASP LLM Top 10	✅ 8/10 covered	Partial	Partial	N/A	N/A
SARIF + GitHub Security tab	✅	✅	❌	✅	✅
One command, zero config	✅ `npx nightfang scan`	Needs YAML config	Needs Python setup	Needs rules config	Needs templates
Open source	✅ MIT	✅ (acquired by OpenAI)	✅	✅	✅
Cost per scan	$0.05–$1.00	Varies	Free (local)	Free (OSS) / Paid (Pro)	Free

Nightfang isn't replacing semgrep or nuclei — it covers the AI-specific attack surface they can't see. Use them together.

GitHub Action

Add Nightfang to your CI/CD pipeline:

name: AI Security Scan
on: [push, pull_request]

permissions:
  contents: read
  security-events: write

jobs:
  nightfang:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Run Nightfang
        uses: peaktwilight/nightfang/action@v1
        with:
          target: ${{ secrets.STAGING_API_URL }}
          depth: default  # quick | default | deep
          fail-on-severity: high  # critical | high | medium | low | info | none

      - name: Upload SARIF
        uses: github/codeql-action/upload-sarif@v3
        with:
          sarif_file: nightfang-report/report.sarif

Findings show up directly in the Security tab of your repository.

Findings Management

Every finding is persisted in a local SQLite database. Query across scans:

# List critical findings
npx nightfang findings list --severity critical

# Filter by category
npx nightfang findings list --category prompt-injection --status confirmed

# Inspect a specific finding with full evidence
npx nightfang findings show NF-001

# Browse scan history
npx nightfang history --limit 10

Finding lifecycle: discovered → verified → confirmed → scored → reported (or false-positive if verification fails).

Roadmap

Built By

Created by a security researcher with 7 published CVEs across node-forge, mysql2, uptime-kuma, liquidjs, picomatch, and jspdf.

Nightfang exists because traditional security tools can't see AI attack surfaces. You can't nmap a language model. You can't write a static rule for a jailbreak that hasn't been invented yet. You need agents that think like attackers — and then prove what they find.

Contributing

Contributions welcome! See CONTRIBUTING.md for guidelines.

git clone https://github.com/peaktwilight/nightfang.git
cd nightfang
pnpm install
pnpm test

License

MIT — use it, fork it, ship it.

Name		Name	Last commit message	Last commit date
Latest commit History 67 Commits
.github		.github
action		action
assets		assets
demo		demo
packages		packages
scripts		scripts
templates		templates
test-targets		test-targets
www		www
.gitignore		.gitignore
.npmignore		.npmignore
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
README.md		README.md
ROADMAP.md		ROADMAP.md
SECURITY.md		SECURITY.md
package.json		package.json
pnpm-lock.yaml		pnpm-lock.yaml
pnpm-publish-summary.json		pnpm-publish-summary.json
pnpm-workspace.yaml		pnpm-workspace.yaml
tsconfig.base.json		tsconfig.base.json

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Nightfang

Quick Start

Commands

How It Works

What Nightfang Scans

OWASP LLM Top 10 Coverage

Example Output

Scan Depth & Cost

Runtime Modes

How It Compares

GitHub Action

Findings Management

Roadmap

Built By

Contributing

License

About

Uh oh!

Releases 2

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Nightfang

Quick Start

Commands

How It Works

What Nightfang Scans

OWASP LLM Top 10 Coverage

Example Output

Scan Depth & Cost

Runtime Modes

How It Compares

GitHub Action

Findings Management

Roadmap

Built By

Contributing

License

About

Topics

Resources

License

Contributing

Security policy

Uh oh!

Stars

Watchers

Forks

Releases 2

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages