██████╗ ██╗ ██╗ ██████╗ ████████╗ ██████╗ ███╗ ██╗
██╔══██╗██║ ██║██╔═══██╗╚══██╔══╝██╔═══██╗████╗ ██║
██████╔╝███████║██║ ██║ ██║ ██║ ██║██╔██╗ ██║
██╔═══╝ ██╔══██║██║ ██║ ██║ ██║ ██║██║╚██╗██║
██║ ██║ ██║╚██████╔╝ ██║ ╚██████╔╝██║ ╚████║
╚═╝ ╚═╝ ╚═╝ ╚═════╝ ╚═╝ ╚═════╝ ╚═╝ ╚═══╝
RECODED BY RERO · Built for security researchers, bug bounty hunters & red teamers.
- Overview
- Features
- Scan Modules
- Installation
- Usage
- Arguments
- Advanced Usage
- Output Structure
- Architecture
- Legal Disclaimer
Photon v2 is a fully rewritten, modular passive web reconnaissance and crawling framework. Built on top of the original Photon project, it introduces a plugin-based scan module system, an enhanced regex engine, multi-threaded crawl architecture, and structured output pipelines.
Designed for passive OSINT collection, bug bounty reconnaissance, security auditing, and web infrastructure analysis — Photon v2 gives you a complete surface-level picture of a target with minimal noise.
All core components and the plugin system have been rewritten from scratch by RERO.
| Category | Capability |
|---|---|
| Crawling | Multi-threaded, depth-controlled, domain-aware crawler |
| Link Analysis | Internal / external / JS / fuzzable URL classification |
| Intelligence | Email, phone, social profile, credit card pattern extraction |
| Wayback | Historical URL seeding via archive.org CDX API |
| Mirroring | Clone target website to local filesystem |
| Entropy Scanning | High-entropy key and token detection |
| Export | Structured JSON and CSV output |
| Proxy Support | Single proxy or multi-proxy rotation from file |
| User-Agent | Randomized UA rotation from bundled wordlist |
| Custom Regex | User-defined extraction patterns across all pages |
Photon v2 ships with 10 independent scan modules, each toggled with a dedicated flag. Modules run post-crawl against collected data or fire inline during crawling.
Analyzes page source, HTTP response headers, and cookies to identify the underlying technology stack. Matches against a signature database of 30+ frameworks, libraries, servers, and services.
Detects: WordPress, Drupal, Joomla, Django, Laravel, React, Vue.js, Angular, jQuery, Bootstrap, Nginx, Apache, PHP, ASP.NET, Node.js, Ruby on Rails, Cloudflare, Google Analytics, Shopify, Wix, WooCommerce, Elasticsearch, GraphQL, and more.
[+] 4 technologies identified:
▸ Nginx
▸ PHP/8.1.2
▸ WordPress
▸ Cloudflare
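The matching step can be illustrated with a minimal sketch. The signature table below is hypothetical and far smaller than the real database; the actual signatures and matching rules in `plugins/tech_scan.py` are not shown in this README, so treat every pattern here as an assumption.

```python
import re

# Hypothetical signatures: each technology maps to patterns checked against
# response headers, the page body, or cookie names. Illustrative only.
SIGNATURES = {
    "WordPress": {"body": [r"/wp-content/", r"/wp-includes/"]},
    "Nginx": {"headers": {"Server": r"nginx"}},
    "PHP": {"headers": {"X-Powered-By": r"PHP"}, "cookies": [r"^PHPSESSID$"]},
    "Cloudflare": {"headers": {"Server": r"cloudflare"}, "cookies": [r"^__cf"]},
}


def fingerprint(headers, body, cookies):
    """Return the sorted names of technologies whose signatures match."""
    # Header names are case-insensitive, so normalise them once.
    lowered = {k.lower(): v for k, v in headers.items()}
    found = set()
    for tech, sig in SIGNATURES.items():
        for name, pattern in sig.get("headers", {}).items():
            if re.search(pattern, lowered.get(name.lower(), ""), re.I):
                found.add(tech)
        if any(re.search(p, body, re.I) for p in sig.get("body", [])):
            found.add(tech)
        if any(re.search(p, c) for p in sig.get("cookies", []) for c in cookies):
            found.add(tech)
    return sorted(found)
```

A match in any one source (header, body, or cookie) is enough to report the technology in this sketch; a stricter scanner might require agreement between sources to reduce false positives.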
Inspects response headers and classifies them into three categories:
- Present security headers: Strict-Transport-Security, Content-Security-Policy, X-Frame-Options, X-Content-Type-Options, Referrer-Policy, Permissions-Policy
- Missing security headers: required headers absent from the response
- Information-leaking headers: Server, X-Powered-By, X-Generator exposing version strings
[+] Security headers present:
✓ Strict-Transport-Security: max-age=31536000; includeSubDomains
✓ X-Frame-Options: SAMEORIGIN
[-] Missing security headers:
✗ Content-Security-Policy
✗ Permissions-Policy
[!] Info-leaking headers:
! X-Powered-By: PHP/8.1.2
! Server: Apache/2.4.54 (Ubuntu)
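The three-way classification can be sketched as follows. The header lists come from the description above; the function name and return shape are assumptions made for illustration, not the interface of `plugins/header_scan.py`.

```python
SECURITY_HEADERS = [
    "Strict-Transport-Security",
    "Content-Security-Policy",
    "X-Frame-Options",
    "X-Content-Type-Options",
    "Referrer-Policy",
    "Permissions-Policy",
]
LEAKY_HEADERS = ["Server", "X-Powered-By", "X-Generator"]


def classify_headers(headers):
    """Split response headers into (present, missing, leaking)."""
    # HTTP header names are case-insensitive, so compare in lower case.
    lowered = {k.lower(): v for k, v in headers.items()}
    present = {h: lowered[h.lower()] for h in SECURITY_HEADERS if h.lower() in lowered}
    missing = [h for h in SECURITY_HEADERS if h.lower() not in lowered]
    leaking = {h: lowered[h.lower()] for h in LEAKY_HEADERS if h.lower() in lowered}
    return present, missing, leaking
```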
Scans every page's source code against 17 compiled regex patterns to detect exposed credentials, API keys, tokens, and private keys.
| Pattern Type | Detection Logic |
|---|---|
| `AWS_ACCESS_KEY` | `AKIA[0-9A-Z]{16}` |
| `AWS_SECRET` | Contextual aws...secret key pairs |
| `GITHUB_TOKEN` | `ghp_`, `gho_`, `ghs_`, `ghu_`, `ghr_` prefixed tokens |
| `GOOGLE_API` | `AIza[0-9A-Za-z-_]{35}` |
| `SLACK_TOKEN` | `xoxb-`, `xoxp-`, `xoxa-` prefixed tokens |
| `SLACK_WEBHOOK` | hooks.slack.com/services/T.../B.../... |
| `STRIPE_KEY` | `pk_live_`, `sk_live_`, `pk_test_`, `sk_test_` |
| `JWT_TOKEN` | eyJ[header].[payload].[signature] three-part format |
| `PRIVATE_KEY` | `-----BEGIN (RSA\|EC)? PRIVATE KEY-----` |
| `FIREBASE` | Firebase Cloud Messaging server key |
| `MAILCHIMP` | `[a-f0-9]{32}-us[0-9]{1,2}` format |
| `SENDGRID` | SG.[22-char].[43-char] format |
| `TWILIO` | SK + 32 hex characters |
| `HEROKU` | UUID-format application identifier |
| `GENERIC_SECRET` | `password=`, `secret=`, `passwd=` assignments |
| `GENERIC_API_KEY` | `api_key=`, `apikey=`, `api-key=` assignments |
[!] SECRET [GITHUB_TOKEN] at https://example.com/assets/config.js
[!] SECRET [AWS_ACCESS_KEY] at https://example.com/js/app.bundle.js
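A minimal sketch of pattern-based secret scanning, using only the three expressions that the table above spells out in full. The function name and result shape are assumptions (the shape mirrors the `secrets` entries in the JSON export); the remaining patterns and the actual code in `plugins/secret_scan.py` are not reproduced here.

```python
import re

# Patterns taken from the table above. The hyphen in the GOOGLE_API character
# class is escaped so it cannot be read as a range.
PATTERNS = {
    "AWS_ACCESS_KEY": re.compile(r"AKIA[0-9A-Z]{16}"),
    "GOOGLE_API": re.compile(r"AIza[0-9A-Za-z_\-]{35}"),
    "MAILCHIMP": re.compile(r"[a-f0-9]{32}-us[0-9]{1,2}"),
}


def scan_source(url, source):
    """Return one finding per pattern match in a page's source."""
    findings = []
    for name, pattern in PATTERNS.items():
        for match in pattern.finditer(source):
            findings.append({"type": name, "value": match.group(0), "url": url})
    return findings
```

Compiling the patterns once at import time keeps the per-page cost to a single pass per pattern, which matters when every crawled page is scanned.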
Crawls JavaScript files and internal URLs to surface REST API endpoints. The `rendpoint` regex extracts path patterns directly from JS source code, revealing endpoints that are not linked anywhere in the HTML.
[+] 12 API endpoints found:
▸ /api/v1/users
▸ /api/v1/auth/login
▸ /api/v2/products/{id}
▸ /internal/admin/dashboard
▸ /api/v1/payments/charge
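The idea of pulling paths out of JavaScript source can be sketched as below. The pattern is an assumption written for this example: it matches quoted strings that begin with a single `/`, and it is not the regex Photon actually ships.

```python
import re

# Assumed pattern: a quoted string starting with "/" (but not "//", which
# would be a protocol-relative URL) followed by typical path characters.
ENDPOINT_RE = re.compile(r"""["'](/(?!/)[A-Za-z0-9_\-./{}]+)["']""")


def extract_endpoints(js_source):
    """Return unique endpoint-like paths in first-seen order."""
    seen = []
    for path in ENDPOINT_RE.findall(js_source):
        if path not in seen:
            seen.append(path)
    return seen
```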
Analyzes all collected URLs to enumerate GET parameters, identify parameter names, and map parameterized endpoints. Produces clean output ready for fuzzing pipelines.
[+] 8 parameterized URLs found:
▸ /search [q, page, sort, filter]
▸ /user [id, token, redirect]
▸ /product [id, category, ref]
▸ /api/items [limit, offset, order]
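Grouping parameter names by path needs nothing beyond the standard library. The sketch below is one plausible way to do it; the function name is hypothetical and the grouping key (path only, host ignored) is an assumption.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlsplit


def enumerate_params(urls):
    """Map each path to the GET parameter names seen on it, in first-seen order."""
    grouped = defaultdict(list)
    for url in urls:
        parts = urlsplit(url)
        # keep_blank_values keeps parameters such as "?debug=" that have no value.
        for name, _ in parse_qsl(parts.query, keep_blank_values=True):
            if name not in grouped[parts.path]:
                grouped[parts.path].append(name)
    return dict(grouped)
```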
Detects login forms, token-gated endpoints, and authentication walls through URL pattern analysis and endpoint classification. Useful for mapping the full authentication attack surface.
[+] 5 auth surfaces found:
▸ https://example.com/login
▸ https://example.com/admin
▸ https://example.com/account/signin
▸ https://example.com/api/v1/auth/token
▸ https://example.com/oauth/authorize
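URL pattern analysis of this kind can be approximated with a keyword match on the path. The keyword list below is an assumption chosen to cover the sample output; the real rules in `plugins/auth_scan.py` are not documented here.

```python
import re
from urllib.parse import urlsplit

# Assumed keyword list. A plain substring match is noisy (for example,
# "/authors" would also match "auth"), so treat hits as candidates only.
AUTH_HINTS = re.compile(r"(login|signin|sign-in|admin|auth|oauth|token|account)", re.I)


def find_auth_surfaces(urls):
    """Return the URLs whose path contains an authentication-related keyword."""
    return [u for u in urls if AUTH_HINTS.search(urlsplit(u).path)]
```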
Tests a sample of internal URLs against non-standard HTTP methods — OPTIONS, PUT, DELETE, TRACE, PATCH, CONNECT. Risky methods are color-coded in output and saved separately.
▸ /api/v1/users → GET, POST, PUT, DELETE ← DANGEROUS
▸ /api/v1/products → GET, POST
▸ /admin/ → GET
▸ /api/v1/config → GET, PUT, DELETE ← DANGEROUS
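The flagging step can be sketched under two assumptions: that the allowed methods are read from an `Allow` response header, and that PUT, DELETE, TRACE, PATCH, and CONNECT count as risky. The README does not say how Photon probes or which methods it marks, so both choices are illustrative.

```python
# Assumed set of risky methods; the README does not list which ones are flagged.
RISKY = {"PUT", "DELETE", "TRACE", "PATCH", "CONNECT"}


def parse_allow(allow_header):
    """Split an Allow header value such as 'GET, POST, PUT' into method names."""
    return [m.strip().upper() for m in allow_header.split(",") if m.strip()]


def flag_methods(path, allow_header):
    """Format one output line and mark it when any risky method is allowed."""
    methods = parse_allow(allow_header)
    line = f"{path} → {', '.join(methods)}"
    if any(m in RISKY for m in methods):
        line += " ← DANGEROUS"
    return line
```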
Analyzes status codes, server identification strings, and response characteristics across internal URLs. Classifies accessible, redirecting, forbidden, and error-state endpoints.
[200] /dashboard Apache/2.4.54
[200] /api/v1/users nginx/1.22.0
[301] /admin → /admin/
[403] /config
[500] /api/legacy
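The four classes named above map naturally onto status-code ranges. The sketch below assumes that 401 and 403 both count as forbidden and that every other 4xx or 5xx code is an error state; the exact boundaries used by `plugins/response_scan.py` are not stated.

```python
from collections import defaultdict


def classify_status(code):
    """Map an HTTP status code to one of the classes described above."""
    if 200 <= code < 300:
        return "accessible"
    if 300 <= code < 400:
        return "redirecting"
    if code in (401, 403):  # assumption: both are treated as forbidden
        return "forbidden"
    if code >= 400:
        return "error"
    return "other"


def group_responses(results):
    """Group (path, status_code) pairs by class."""
    groups = defaultdict(list)
    for path, code in results:
        groups[classify_status(code)].append(path)
    return dict(groups)
```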
Queries two passive DNS sources to enumerate subdomains without sending a single packet to the target:
- crt.sh — Certificate Transparency log queries
- HackerTarget — Passive DNS database lookup
[*] Querying crt.sh & HackerTarget for example.com
[+] 14 subdomains discovered:
▸ api.example.com
▸ dev.example.com
▸ staging.example.com
▸ admin.example.com
▸ mail.example.com
▸ vpn.example.com
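For the Certificate Transparency source, a rough sketch of the query URL and response handling follows. It assumes the crt.sh JSON output format, in which each entry carries a `name_value` field holding one or more newline-separated hostnames. The function names are hypothetical, and the network request itself is left out so the parsing can be exercised offline.

```python
import json


def crtsh_url(domain):
    """Build a crt.sh query for all certificates under a domain (%25 is an encoded %)."""
    return f"https://crt.sh/?q=%25.{domain}&output=json"


def parse_crtsh(payload, domain):
    """Extract unique subdomains of `domain` from a crt.sh JSON response body."""
    names = set()
    for entry in json.loads(payload):
        for name in entry.get("name_value", "").splitlines():
            # Normalise case and drop a leading wildcard label such as "*.".
            name = name.strip().lower().lstrip("*.")
            if name.endswith("." + domain):
                names.add(name)
    return sorted(names)
```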
Queries the Wayback Machine CDX API for historically archived URLs of the target domain and injects them into the crawl queue. Surfaces removed pages, old API versions, and deprecated endpoints still available in the archive.
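A sketch of how such a CDX lookup might be built and parsed. The query parameters (`url`, `output`, `fl`, `collapse`) belong to the public CDX server API, but which ones Photon's `plugins/wayback.py` actually sends is an assumption, as are the function names. The HTTP request is omitted so the example runs offline.

```python
import json
from urllib.parse import urlencode

CDX_ENDPOINT = "http://web.archive.org/cdx/search/cdx"


def cdx_query_url(domain):
    """Build a CDX query that returns each archived URL under a domain once."""
    params = {
        "url": f"{domain}/*",   # everything under the domain
        "output": "json",
        "fl": "original",       # only the original URL column
        "collapse": "urlkey",   # de-duplicate repeated captures
    }
    return f"{CDX_ENDPOINT}?{urlencode(params)}"


def parse_cdx(payload):
    """Return archived URLs from a JSON CDX response, skipping the header row."""
    rows = json.loads(payload) if payload.strip() else []
    return [row[0] for row in rows[1:]]
```

The returned list can then be appended to the crawl queue like any other set of seed URLs.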
Downloads every visited page during the crawl, preserving the original directory structure on the local filesystem. Enables offline analysis, diffing, and long-term archiving.
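Preserving the directory structure comes down to mapping each URL to a file path. The sketch below shows one way to do that; the `mirror` root directory, the `index.html` fallback for directory-style URLs, and the function name are all assumptions rather than the behaviour of `core/mirror.py`.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit


def local_path(url, root="mirror"):
    """Map a URL to a relative file path that mirrors the site's layout."""
    parts = urlsplit(url)
    path = parts.path
    # Directory-style URLs need a file name to be stored under.
    if not path or path.endswith("/"):
        path += "index.html"
    return str(PurePosixPath(root, parts.netloc, path.lstrip("/")))
```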
# Clone the repository
git clone https://github.com/WHY-RERO/Photon_v2
cd Photon_v2
# Install dependencies
pip install -r requirements.txt

Dependencies:
requests>=2.28.0
tld>=0.12.6
urllib3>=1.26.0
Requires Python 3.8 or higher.
# Basic crawl with default settings
python photon.py -u https://example.com

# Crawl to depth 3 with 16 threads
python photon.py -u https://example.com -l 3 -t 16

# Enable the scan modules and Wayback seeding
python photon.py -u https://example.com \
--tech \
--secrets \
--header-scan \
--api-scan \
--param-scan \
--auth-scan \
--methods \
--response-scan \
--dns \
--wayback

# Authenticated crawl using a session cookie
python photon.py -u https://example.com \
-c "session=abc123; auth_token=xyz789" \
--secrets --api-scan --param-scan

# Single proxy
python photon.py -u https://example.com -p 127.0.0.1:8080
# Proxy list from file
python photon.py -u https://example.com -p proxies.txt

# Export as JSON
python photon.py -u https://example.com --export json -o ./results
# Export as CSV
python photon.py -u https://example.com --export csv -o ./results

# Collect URLs only and export them as JSON
python photon.py -u https://example.com --only-urls --export json

# Pipe fuzzable URLs directly to ffuf
python photon.py -u https://example.com --stdout fuzzable | \
ffuf -w - -u FUZZ -c
# Write internal URLs to file
python photon.py -u https://example.com --stdout internal > urls.txt

| Flag | Description |
|---|---|
| `-u`, `--url` | Target URL (required) |
| `-s`, `--seeds` | Additional seed URLs to include in the crawl queue |
| `--wayback` | Seed crawl queue with archived URLs from archive.org |

| Flag | Default | Description |
|---|---|---|
| `-l`, `--level` | 2 | Crawl depth |
| `-t`, `--threads` | 4 | Thread count |
| `-d`, `--delay` | 0 | Delay between requests (seconds) |
| `--timeout` | 6 | Request timeout (seconds) |
| `--exclude` | - | Exclude URLs matching a regex pattern |
| `--only-urls` | - | Collect URLs only, skip all analysis |

| Flag | Description |
|---|---|
| `-o`, `--output` | Output directory (defaults to target hostname) |
| `-e`, `--export` | Export format: json or csv |
| `--stdout` | Print a specific dataset to stdout |
| `-v`, `--verbose` | Enable verbose output |

| Flag | Description |
|---|---|
| `-c`, `--cookie` | Cookie string for authenticated crawling |
| `--headers` | Prompt for custom request headers interactively |
| `--user-agent` | Custom user-agent string(s), comma-separated for rotation |
| `-p`, `--proxy` | Proxy (IP:PORT) or path to proxy list file |
| `-r`, `--regex` | Custom regex pattern, extracts matches from every page |

| Flag | Description |
|---|---|
| `--tech` | Technology stack fingerprinting |
| `--header-scan` | HTTP security header analysis |
| `--secrets` | Credential and secret detection (17 patterns) |
| `--api-scan` | API endpoint discovery from JS and URLs |
| `--param-scan` | GET parameter enumeration |
| `--auth-scan` | Authentication surface mapping |
| `--methods` | HTTP method discovery (PUT, DELETE, TRACE, etc.) |
| `--response-scan` | HTTP response status and server analysis |
| `--dns` | Subdomain enumeration via crt.sh + HackerTarget |
| `--keys` | High-entropy key and token extraction |
| `--clone` | Mirror the target website locally |
| `--update` | Update Photon from GitHub |

# Phase 1 — Surface mapping: subdomains + historical URLs
python photon.py -u https://target.com \
--dns --wayback \
-l 2 -t 8 \
--export json -o recon/phase1/
# Phase 2 — Technology + security posture analysis
python photon.py -u https://target.com \
--tech --header-scan --secrets \
-l 3 -t 12 \
--export json -o recon/phase2/
# Phase 3 — API surface and attack vector discovery
python photon.py -u https://target.com \
--api-scan --param-scan --auth-scan --methods \
-l 3 -t 16 \
--export json -o recon/phase3/

# Capture session cookies from your browser after logging in
python photon.py -u https://target.com/dashboard \
-c "PHPSESSID=abc123; remember_token=def456" \
--secrets --api-scan --param-scan \
-l 3 --export json -o recon/authed/

# Full reconnaissance run with all modules enabled
python photon.py -u https://target.com \
-l 5 -t 30 -d 0.1 \
--wayback --dns --clone \
--tech --secrets --header-scan \
--api-scan --param-scan --auth-scan \
--methods --response-scan --keys \
--export json -o ./full_recon -v

# Skip logout endpoints, static assets, and CDN URLs
python photon.py -u https://example.com \
--exclude "(logout|\.css|\.png|\.woff|cdn\.)" \
-l 3 --api-scan --secrets

# Crawl from multiple starting points simultaneously
python photon.py -u https://example.com \
-s https://example.com/blog https://example.com/docs \
-l 3 -t 10 --api-scan

# Feed fuzzable URLs directly into ffuf
python photon.py -u https://example.com --stdout fuzzable | \
ffuf -w - -u FUZZ -mc 200,403
# Feed endpoints into nuclei for template-based scanning
python photon.py -u https://example.com --stdout internal | \
nuclei -l - -t technologies/
# Extract subdomains and pass to httpx for probing
python photon.py -u https://example.com --dns --stdout internal | \
httpx -silent -title -tech-detect

On completion, results are saved to a directory named after the target hostname. Each dataset is written to its own file for easy downstream processing.

example.com/
│
├── internal.txt # All internal URLs discovered
├── external.txt # All external URLs discovered
├── scripts.txt # JavaScript file URLs
├── endpoints.txt # API endpoints extracted from JS
├── fuzzable.txt # Parameterized URLs (ready for fuzzing)
├── intel.txt # Emails, social profiles, phone numbers
├── files.txt # Downloadable file URLs
├── robots.txt # Entries parsed from robots.txt
├── keys.txt # High-entropy key strings
├── failed.txt # Failed / unreachable URLs
│
├── technologies.txt # --tech results
├── security_headers.txt # --header-scan results
├── secrets.txt # --secrets results
├── api_endpoints.txt # --api-scan results
├── parameters.txt # --param-scan results
├── auth_surfaces.txt # --auth-scan results
├── http_methods.txt # --methods results
├── responses.txt # --response-scan results
├── subdomains.txt # --dns results
│
├── results.json # --export json (all datasets combined)
└── results.csv # --export csv
{
"internal": [
"https://example.com/",
"https://example.com/about",
"https://example.com/api/v1/users"
],
"fuzzable": [
"https://example.com/search?q=test&page=1&sort=asc"
],
"endpoints": [
"/api/v1/users",
"/api/v2/payments/charge",
"/internal/admin/settings"
],
"secrets": [
{
"type": "AWS_ACCESS_KEY",
"value": "AKIAIOSFODNN7EXAMPLE",
"url": "https://example.com/js/config.bundle.js"
}
],
"technologies": ["WordPress", "PHP/8.1", "Nginx", "Cloudflare"],
"subdomains": ["api.example.com", "staging.example.com", "mail.example.com"]
}

Photon_v2/
│
├── photon.py # Entry point · argument parser · crawl loop · results output
│
├── core/
│ ├── config.py # VERSION · INTELS list · BAD_TYPES · sensitive patterns
│ ├── colors.py # Terminal color codes and output symbols
│ ├── flash.py # Multi-threaded task executor (thread pool wrapper)
│ ├── requester.py # HTTP request engine (proxy rotation · UA rotation · retry)
│ ├── regex.py # Compiled regex patterns (href · script · intel · endpoint · entropy)
│ ├── utils.py # Helpers: Luhn check · entropy · proxy test · writer · timer
│ ├── mirror.py # Local website clone engine
│ ├── prompt.py # Interactive custom header input
│ ├── updater.py # GitHub update checker
│ ├── zap.py # robots.txt parsing and initial URL seeding
│ └── user-agents.txt # User-agent rotation wordlist
│
└── plugins/
├── tech_scan.py # Technology fingerprinting · 30+ signatures · header + body + cookie
├── header_scan.py # Security header audit · present / missing / leaking classification
├── secret_scan.py # Credential detection · 17 compiled patterns
├── api_scan.py # API endpoint classification from JS and URL patterns
├── param_scan.py # GET parameter enumeration and grouping
├── auth_scan.py # Authentication surface detection and mapping
├── method_scan.py # HTTP method probing (PUT · DELETE · TRACE · PATCH)
├── response_scan.py # HTTP status code and server header analysis
├── find_subdomains.py # Subdomain enumeration via crt.sh + HackerTarget
├── dnsdumpster.py # DNS Dumpster passive DNS integration
├── wayback.py # Wayback Machine CDX API historical URL seeding
└── exporter.py # JSON / CSV structured export engine
This tool is intended for authorized security testing, research, and educational purposes only.
Running Photon against systems you do not own or have explicit written permission to test is illegal in most jurisdictions and may result in civil or criminal prosecution. The author assumes no liability for any misuse, damage, or legal consequences arising from the use of this software. Always obtain proper written authorization before scanning any target.
RECODED BY RERO · Photon v2.0.0