Kingfisher is a blazingly fast secret-scanning and validation tool built in Rust. It combines Intel's hardware-accelerated Hyperscan regex engine with language-aware parsing via Tree-Sitter, and ships with 700+ built-in rules to detect, validate, and triage secrets before they ever reach production.
- Performance: Multi‑threaded, Hyperscan‑powered scanning for massive codebases
- Language-Aware Accuracy: AST parsing in 20+ languages via Tree-Sitter reduces contextless regex matches. See docs/PARSING.md
- Built-In Validation: 700+ built-in detection rules, many with live-credential validators that call the relevant service APIs (AWS, Azure, GCP, Stripe, etc.) to confirm a secret is active. You can extend or override the library by adding YAML-defined rules on the command line—see docs/RULES.md for details
- Git History Scanning: Scan local repos, remote GitHub/GitLab orgs/users, or arbitrary GitHub/GitLab repos
On macOS, you can install Kingfisher with Homebrew:
brew install kingfisher
Pre-built binaries are also available in the Releases section of this page.
Or you may compile for your platform via `make`:
# NOTE: Requires Docker
make linux
# macOS
make darwin
# Windows x64 --- requires building from a Windows host with Visual Studio installed
./buildwin.bat -force
# Build all targets
make linux-all # builds both x64 and arm64
make darwin-all # builds both x64 and arm64
make all # builds for every OS and architecture supported
Kingfisher ships with 700+ rules with HTTP and service‑specific validation checks (AWS, Azure, GCP, etc.) to confirm if a detected string is a live credential.
However, you may want to add your own custom rules, or modify a detection to better suit your needs or environment.
First, review docs/RULES.md to learn how to create custom Kingfisher rules, or to find a prompt you can give to an LLM (e.g., ChatGPT, Gemini, or Claude) to help generate one.
Once you've done that, pass your custom rules (defined in a YAML file) to Kingfisher at runtime; no recompiling is required.
Note: `kingfisher scan` detects whether the input is a Git repository or a plain directory; no extra flags are required.
kingfisher scan /path/to/code
## NOTE: This path can refer to:
# 1. a local git repo
# 2. a directory with many git repos
# 3. or just a folder with files and subdirectories
## To explicitly prevent scanning git commit history add:
# `--git-history=none`
kingfisher scan /projects/mono-repo-dir
kingfisher scan ~/src/myrepo --no-validate
kingfisher scan ./service --only-valid
kingfisher scan . --format json | tee kingfisher.json
kingfisher scan . --format sarif --output findings.sarif
cat /path/to/file.py | kingfisher scan -
(Prefix matching: `--rule kingfisher.aws` loads `kingfisher.aws.*`.)
# Only apply AWS-related rules (kingfisher.aws.1 + kingfisher.aws.2)
kingfisher scan /path/to/repo --rule kingfisher.aws
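
The prefix-matching behavior described above can be pictured with a small shell function. This is only an illustration of the idea, not Kingfisher's implementation: the function name `rule_matches` is hypothetical, and matching only at a dot boundary (so that `kingfisher.aws` would not select `kingfisher.awsx.1`) is an assumption.

```shell
# Illustrative sketch only; not Kingfisher's actual selection logic.
# Assumption: a selector matches a rule ID that is identical to it,
# or that extends it at a "." boundary.
rule_matches() {
  selector=$1
  rule_id=$2
  case "$rule_id" in
    "$selector"|"$selector".*) return 0 ;;  # exact ID or dotted child
    *) return 1 ;;
  esac
}

# Example: which of these IDs would `--rule kingfisher.aws` select?
for id in kingfisher.aws.1 kingfisher.aws.2 kingfisher.azure.1; do
  if rule_matches kingfisher.aws "$id"; then
    echo "selected: $id"
  fi
done
```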
kingfisher scan --github-organization my-org
kingfisher scan --git-url https://github.com/org/repo.git
# Optionally provide a GitHub Token
KF_GITHUB_TOKEN="ghp_…" kingfisher scan --git-url https://github.com/org/private_repo.git
kingfisher scan --gitlab-group my-group
kingfisher scan --gitlab-user johndoe
kingfisher scan --git-url https://gitlab.com/group/project.git
kingfisher gitlab repos list --group my-group
Variable | Purpose |
---|---|
KF_GITHUB_TOKEN | GitHub Personal Access Token |
KF_GITLAB_TOKEN | GitLab Personal Access Token |
Set them temporarily per command:
KF_GITLAB_TOKEN="glpat-…" kingfisher scan --gitlab-group my-group
Or export for the session:
export KF_GITLAB_TOKEN="glpat-…"
If no token is provided, Kingfisher still works for public repositories.
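
In a CI job it can help to know up front whether a scan will run with or without a token. The following pre-flight check is a minimal sketch: `kf_auth_mode` is a hypothetical helper, not part of Kingfisher, and it only inspects the two environment variables listed in the table above.

```shell
# Hypothetical pre-flight helper (not a Kingfisher command).
# Prints "authenticated" if the token variable for the given platform
# is set and non-empty, otherwise "anonymous".
kf_auth_mode() {
  case "$1" in
    github) token=${KF_GITHUB_TOKEN:-} ;;
    gitlab) token=${KF_GITLAB_TOKEN:-} ;;
    *) echo "unknown"; return 1 ;;
  esac
  if [ -n "$token" ]; then
    echo "authenticated"
  else
    echo "anonymous"   # per the note above, public repositories still work
  fi
}

# Usage (commented out so the sketch has no side effects):
# [ "$(kf_auth_mode gitlab)" = "authenticated" ] || echo "warning: private GitLab projects will not be reachable" >&2
# kingfisher scan --gitlab-group my-group
```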
Code | Meaning |
---|---|
0 | No findings |
200 | Findings discovered |
205 | Validated findings discovered |
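
These exit codes make it straightforward to gate a pipeline. The sketch below maps a code to a label and applies one possible policy. The function names are hypothetical, and two choices are assumptions rather than documented behavior: treating any code other than 0, 200, or 205 as a tool error, and failing the build only on validated findings or errors.

```shell
# Hypothetical helpers for CI scripts; not part of Kingfisher.
# Codes 0, 200 and 205 come from the table above. Treating anything
# else as an error is an assumption of this sketch.
kf_exit_meaning() {
  case "$1" in
    0)   echo "clean" ;;
    200) echo "findings" ;;
    205) echo "validated-findings" ;;
    *)   echo "error" ;;
  esac
}

# Example policy (an assumption; adjust to your needs):
# fail the build on validated findings or on a tool error.
kf_should_fail() {
  case "$(kf_exit_meaning "$1")" in
    validated-findings|error) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage (commented out so the sketch has no side effects):
# code=0
# kingfisher scan . || code=$?
# echo "kingfisher: $(kf_exit_meaning "$code")"
# if kf_should_fail "$code"; then exit 1; fi
```

A stricter pipeline could also fail on `findings`; the policy function is the only part that would need to change.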
kingfisher rules list
kingfisher scan \
--load-builtins=false \
--rules-path path/to/my_rules.yaml \
./src/
kingfisher scan \
--rules-path ./custom-rules/ \
--rules-path my_rules.yml \
~/path/to/project-dir/
# Check custom rules - this ensures all regular expressions compile, and can match the rule's `examples` in the YML file
kingfisher rules check --rules-path ./my_rules.yml
# List GitHub repos
kingfisher github repos list --user my-user
kingfisher github repos list --organization my-org
- `--no-dedup`: Report every occurrence of a finding (disable the default de-duplicate behavior)
- `--confidence <LEVEL>`: (low|medium|high)
- `--min-entropy <VAL>`: Override default threshold
- `--no-binary`: Skip binary files
- `--no-extract-archives`: Do not scan inside archives
- `--extraction-depth <N>`: Specifies how deep nested archives should be extracted and scanned (default: 2)
- `--redact`: Replaces discovered secrets with a one-way hash for secure output
The document below details the four-field formula (rule SHA-1, origin label, start and end offsets) hashed with XXH3-64 to create Kingfisher's 64-bit finding fingerprint. It also explains how this ID powers safe deduplication, and how `--no-dedup` can be used to show every raw match.
See docs/FINGERPRINT.md
kingfisher scan --help
By integrating Kingfisher into your development lifecycle, you can:
- Prevent Costly Breaches: Early detection of embedded credentials avoids expensive incident response, legal fees, and reputation damage
- Automate Compliance: Enforce secret-scanning policies across GitOps, CI/CD, and pull requests to help satisfy SOC 2, PCI-DSS, GDPR, and other standards
- Reduce Noise, Focus on Real Threats: Validation logic filters out false positives and highlights only active, valid secrets (`--only-valid`)
- Accelerate Dev Workflows: Run in parallel across dozens of languages, integrate with GitHub Actions or any pipeline, and shift security left to minimize delays
Embedding credentials in code repositories is a pervasive risk that has led directly to data breaches:
- Uber (2016)
  - Incident: Attackers stole GitHub credentials, retrieved an AWS key from a developer's private repo, and accessed data on 57 million riders and 600,000 drivers.
  - Sources: BBC News, Ars Technica
- AWS
  - Incident: An AWS engineer accidentally published log files and CloudFormation templates containing AWS key pairs (including "rootkey.csv") to a public GitHub repo.
  - Sources: The Register, UpGuard
- Infosys
  - Incident: Infosys published an internal PyPI package embedding a FullAdminAccess AWS key for a Johns Hopkins data bucket; the key remained active for over a year.
  - Sources: The Stack, Tom Forbes Blog
- Microsoft
  - Incident: Microsoft's AI research GitHub repo included an overly permissive Azure SAS token, exposing 38 TB of private data (workstation backups, 30,000+ Teams messages).
  - Sources: Wiz Blog, TechCrunch
- GitHub
  - Incident: GitHub discovered its RSA SSH host private key was briefly exposed in a public repository and rotated it out of caution.
  - Sources: GitHub Blog
Left unchecked, leaked secrets can lead to unauthorized access, pivoting within your environment, regulatory fines, and brand‑damaging incident response costs.
See docs/COMPARISON.md