GitHub - WHY-RERO/Photon_V2: Photon v2 – Advanced OSINT & reconnaissance tool for cybersecurity researchers and penetration testers.

  ██████╗ ██╗  ██╗ ██████╗ ████████╗ ██████╗ ███╗   ██╗
  ██╔══██╗██║  ██║██╔═══██╗╚══██╔══╝██╔═══██╗████╗  ██║
  ██████╔╝███████║██║   ██║   ██║   ██║   ██║██╔██╗ ██║
  ██╔═══╝ ██╔══██║██║   ██║   ██║   ██║   ██║██║╚██╗██║
  ██║     ██║  ██║╚██████╔╝   ██║   ╚██████╔╝██║ ╚████║
  ╚═╝     ╚═╝  ╚═╝ ╚═════╝    ╚═╝    ╚═════╝ ╚═╝  ╚═══╝

Passive Web Reconnaissance & Crawling Framework

RECODED BY RERO · Built for security researchers, bug bounty hunters & red teamers.

Overview

Photon v2 is a fully rewritten, modular passive web reconnaissance and crawling framework. Built on top of the original Photon project, it introduces a plugin-based scan module system, an enhanced regex engine, multi-threaded crawl architecture, and structured output pipelines.

Designed for passive OSINT collection, bug bounty reconnaissance, security auditing, and web infrastructure analysis, Photon v2 maps a target's exposed surface (URLs, endpoints, parameters, technologies, and subdomains) with minimal noise.

All core components and the plugin system have been rewritten from scratch by RERO.

Features

Category	Capability
Crawling	Multi-threaded, depth-controlled, domain-aware crawler
Link Analysis	Internal / external / JS / fuzzable URL classification
Intelligence	Email, phone, social profile, credit card pattern extraction
Wayback	Historical URL seeding via archive.org CDX API
Mirroring	Clone target website to local filesystem
Entropy Scanning	High-entropy key and token detection
Export	Structured JSON and CSV output
Proxy Support	Single proxy or multi-proxy rotation from file
User-Agent	Randomized UA rotation from bundled wordlist
Custom Regex	User-defined extraction patterns across all pages

Scan Modules

Photon v2 ships with 10 independent scan modules, each toggled with a dedicated flag. Modules run post-crawl against collected data or fire inline during crawling.

`--tech` — Technology Fingerprinting

Analyzes page source, HTTP response headers, and cookies to identify the underlying technology stack. Matches against a signature database of 30+ frameworks, libraries, servers, and services.

Detects: WordPress, Drupal, Joomla, Django, Laravel, React, Vue.js, Angular, jQuery, Bootstrap, Nginx, Apache, PHP, ASP.NET, Node.js, Ruby on Rails, Cloudflare, Google Analytics, Shopify, Wix, WooCommerce, Elasticsearch, GraphQL, and more.

[+] 4 technologies identified:
  ▸ Nginx
  ▸ PHP/8.1.2
  ▸ WordPress
  ▸ Cloudflare
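
The following is a minimal sketch of how signature-based fingerprinting of this kind can work. The signature table `SIGNATURES` and the function `fingerprint` are hypothetical names chosen for illustration; the actual database in `plugins/tech_scan.py` is larger and may be organized differently.

```python
import re

# Hypothetical signatures (assumption, not the tool's real database):
# each technology maps to patterns checked against headers, body, or cookie names.
SIGNATURES = {
    "WordPress": {"body": [r"/wp-content/", r"/wp-includes/"]},
    "Nginx": {"headers": [r"(?im)^server:\s*nginx"]},
    "PHP": {"headers": [r"(?im)^x-powered-by:\s*php"], "cookies": [r"^PHPSESSID$"]},
    "Cloudflare": {"headers": [r"(?im)^cf-ray:"]},
}

def fingerprint(headers, body, cookies):
    """Return the sorted list of technologies whose signatures match."""
    # Flatten headers into "Name: value" lines so one regex can test them.
    header_blob = "\n".join(f"{k}: {v}" for k, v in headers.items())
    found = set()
    for tech, sig in SIGNATURES.items():
        if any(re.search(p, header_blob) for p in sig.get("headers", [])):
            found.add(tech)
        if any(re.search(p, body) for p in sig.get("body", [])):
            found.add(tech)
        if any(re.search(p, name) for p in sig.get("cookies", []) for name in cookies):
            found.add(tech)
    return sorted(found)

demo = fingerprint(
    {"Server": "nginx/1.22.0", "X-Powered-By": "PHP/8.1.2", "CF-Ray": "abc"},
    '<link href="/wp-content/themes/x/style.css">',
    ["PHPSESSID"],
)
print(demo)
```

Keeping signatures as data rather than code means a new technology can be added without touching the matching loop.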

`--header-scan` — HTTP Security Header Analysis

Inspects response headers and classifies them into three categories:

Present security headers — Strict-Transport-Security, Content-Security-Policy, X-Frame-Options, X-Content-Type-Options, Referrer-Policy, Permissions-Policy
Missing security headers — Required headers absent from the response
Information-leaking headers — Server, X-Powered-By, X-Generator exposing version strings

[+] Security headers present:
  ✓ Strict-Transport-Security: max-age=31536000; includeSubDomains
  ✓ X-Frame-Options: SAMEORIGIN
[-] Missing security headers:
  ✗ Content-Security-Policy
  ✗ Permissions-Policy
[!] Info-leaking headers:
  ! X-Powered-By: PHP/8.1.2
  ! Server: Apache/2.4.54 (Ubuntu)
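
The three-way classification above can be sketched as follows. The header names come from the lists in this section; the function name `classify_headers` and its return shape are assumptions made for illustration, not the plugin's actual interface.

```python
# Header names taken from the section above. Lookup is case-insensitive
# because HTTP header names are not case-sensitive.
SECURITY_HEADERS = [
    "Strict-Transport-Security", "Content-Security-Policy", "X-Frame-Options",
    "X-Content-Type-Options", "Referrer-Policy", "Permissions-Policy",
]
LEAKY_HEADERS = ["Server", "X-Powered-By", "X-Generator"]

def classify_headers(headers):
    """Split response headers into (present, missing, leaking)."""
    lower = {name.lower(): value for name, value in headers.items()}
    present = {h: lower[h.lower()] for h in SECURITY_HEADERS if h.lower() in lower}
    missing = [h for h in SECURITY_HEADERS if h.lower() not in lower]
    leaking = {h: lower[h.lower()] for h in LEAKY_HEADERS if h.lower() in lower}
    return present, missing, leaking

present, missing, leaking = classify_headers({
    "strict-transport-security": "max-age=31536000; includeSubDomains",
    "X-Frame-Options": "SAMEORIGIN",
    "X-Powered-By": "PHP/8.1.2",
})
print(present, missing, leaking)
```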

`--secrets` — Credential & Secret Detection

Scans every page's source code against 17 compiled regex patterns to detect exposed credentials, API keys, tokens, and private keys.

Pattern Type	Detection Logic
`AWS_ACCESS_KEY`	`AKIA[0-9A-Z]{16}`
`AWS_SECRET`	Contextual `aws...secret` key pairs
`GITHUB_TOKEN`	`ghp_`, `gho_`, `ghs_`, `ghu_`, `ghr_` prefixed tokens
`GOOGLE_API`	`AIza[0-9A-Za-z-_]{35}`
`SLACK_TOKEN`	`xoxb-`, `xoxp-`, `xoxa-` prefixed tokens
`SLACK_WEBHOOK`	`hooks.slack.com/services/T.../B.../...`
`STRIPE_KEY`	`pk_live_`, `sk_live_`, `pk_test_`, `sk_test_`
`JWT_TOKEN`	`eyJ[header].[payload].[signature]` three-part format
`PRIVATE_KEY`	`-----BEGIN (RSA|EC)? PRIVATE KEY-----`
`FIREBASE`	Firebase Cloud Messaging server key
`MAILCHIMP`	`[a-f0-9]{32}-us[0-9]{1,2}` format
`SENDGRID`	`SG.[22-char].[43-char]` format
`TWILIO`	`SK` + 32 hex characters
`HEROKU`	UUID-format application identifier
`GENERIC_SECRET`	`password=`, `secret=`, `passwd=` assignments
`GENERIC_API_KEY`	`api_key=`, `apikey=`, `api-key=` assignments

[!] SECRET [GITHUB_TOKEN] at https://example.com/assets/config.js
[!] SECRET [AWS_ACCESS_KEY] at https://example.com/js/app.bundle.js
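
As an illustration of the approach, the sketch below compiles three of the patterns from the table above and runs them over a page's source. The function name `scan_secrets` and the shape of each finding (mirroring the sample JSON output) are assumptions; the real plugin covers more patterns and may report results differently.

```python
import re

# Three patterns copied from the table above, compiled once and reused per page.
PATTERNS = {
    "AWS_ACCESS_KEY": re.compile(r"AKIA[0-9A-Z]{16}"),
    "GOOGLE_API": re.compile(r"AIza[0-9A-Za-z\-_]{35}"),
    "MAILCHIMP": re.compile(r"[a-f0-9]{32}-us[0-9]{1,2}"),
}

def scan_secrets(url, source):
    """Return one finding per regex match in the page source."""
    findings = []
    for name, pattern in PATTERNS.items():
        for match in pattern.finditer(source):
            findings.append({"type": name, "value": match.group(0), "url": url})
    return findings

page = 'const cfg = { key: "AKIAIOSFODNN7EXAMPLE" };'
hits = scan_secrets("https://example.com/js/app.bundle.js", page)
print(hits)
```

Pattern matches of this kind are candidates, not confirmed credentials, so each hit still needs manual review.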

`--api-scan` — API Endpoint Discovery

Crawls JavaScript files and internal URLs to surface REST API endpoints. The rendpoint regex engine extracts path patterns directly from JS source code, revealing endpoints not linked anywhere in the HTML.

[+] 12 API endpoints found:
  ▸ /api/v1/users
  ▸ /api/v1/auth/login
  ▸ /api/v2/products/{id}
  ▸ /internal/admin/dashboard
  ▸ /api/v1/payments/charge
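
A rough sketch of pulling path-like strings out of JavaScript source is shown below. The regex `ENDPOINT_RE` and the function `extract_endpoints` are simplified, hypothetical stand-ins; the pattern actually used by Photon v2 is not reproduced here and likely differs.

```python
import re

# Assumption: an endpoint is a quoted string that starts with "/" and contains
# only URL path characters. Real-world patterns are usually more permissive.
ENDPOINT_RE = re.compile(r"""["'](/[A-Za-z0-9_\-./{}]+)["']""")

def extract_endpoints(js_source):
    """Return unique quoted paths from JS source, in order of first appearance."""
    seen = []
    for path in ENDPOINT_RE.findall(js_source):
        if path not in seen:
            seen.append(path)
    return seen

js = (
    'fetch("/api/v1/users");'
    "axios.post('/api/v1/auth/login', body);"
    'fetch("/api/v1/users");'
)
endpoints = extract_endpoints(js)
print(endpoints)
```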

`--param-scan` — Parameter Enumeration

Analyzes all collected URLs to enumerate GET parameters, identify parameter names, and map parameterized endpoints. Produces clean output ready for fuzzing pipelines.

[+] 8 parameterized URLs found:
  ▸ /search       [q, page, sort, filter]
  ▸ /user         [id, token, redirect]
  ▸ /product      [id, category, ref]
  ▸ /api/items    [limit, offset, order]
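
Grouping parameter names by path can be sketched with the standard library alone. The function name `enumerate_params` is hypothetical, and the plugin's real grouping logic may differ.

```python
from collections import defaultdict
from urllib.parse import urlsplit, parse_qsl

def enumerate_params(urls):
    """Map each URL path to the GET parameter names seen on it."""
    params = defaultdict(list)
    for url in urls:
        parts = urlsplit(url)
        # keep_blank_values=True so parameters like "?debug=" are still counted.
        for name, _value in parse_qsl(parts.query, keep_blank_values=True):
            if name not in params[parts.path]:
                params[parts.path].append(name)
    return dict(params)

grouped = enumerate_params([
    "https://example.com/search?q=test&page=1",
    "https://example.com/search?sort=asc&q=other",
    "https://example.com/about",
])
print(grouped)
```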

`--auth-scan` — Authentication Surface Mapping

Detects login forms, token-gated endpoints, and authentication walls through URL pattern analysis and endpoint classification. Useful for mapping the full authentication attack surface.

[+] 5 auth surfaces found:
  ▸ https://example.com/login
  ▸ https://example.com/admin
  ▸ https://example.com/account/signin
  ▸ https://example.com/api/v1/auth/token
  ▸ https://example.com/oauth/authorize
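
URL pattern analysis of this kind might look like the sketch below. The keyword list `AUTH_KEYWORDS` is an assumption chosen to match the sample output above, and `find_auth_surfaces` is a hypothetical name; the plugin's actual rules are not shown in this README.

```python
import re
from urllib.parse import urlsplit

# Assumed keyword list. A keyword must be a whole path segment, so
# "/auth/token" matches but "/authors" does not.
AUTH_KEYWORDS = ("login", "signin", "admin", "auth", "oauth", "account", "sso")
AUTH_RE = re.compile(r"(?:^|/)(?:%s)(?:/|$)" % "|".join(AUTH_KEYWORDS), re.IGNORECASE)

def find_auth_surfaces(urls):
    """Return the URLs whose path contains an authentication-related segment."""
    return [u for u in urls if AUTH_RE.search(urlsplit(u).path)]

surfaces = find_auth_surfaces([
    "https://example.com/login",
    "https://example.com/account/signin",
    "https://example.com/api/v1/auth/token",
    "https://example.com/authors",
    "https://example.com/about",
])
print(surfaces)
```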

`--methods` — HTTP Method Discovery

Tests a sample of internal URLs against HTTP methods other than GET and POST (OPTIONS, PUT, DELETE, TRACE, PATCH, CONNECT). Risky methods are color-coded in output and saved separately.

  ▸ /api/v1/users      → GET, POST, PUT, DELETE   ← DANGEROUS
  ▸ /api/v1/products   → GET, POST
  ▸ /admin/            → GET
  ▸ /api/v1/config     → GET, PUT, DELETE          ← DANGEROUS
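
One common way to learn which methods an endpoint accepts is to read the `Allow` header of an `OPTIONS` response. The sketch below shows only that parsing and flagging step; whether `method_scan.py` relies on `Allow` or sends each method individually is not stated here, and the `RISKY` set and function names are assumptions.

```python
# Assumption: methods treated as risky. The sample output above flags
# PUT and DELETE; the others are included for illustration.
RISKY = {"PUT", "DELETE", "TRACE", "PATCH", "CONNECT"}

def parse_allow(allow_header):
    """Split an Allow header value into a list of upper-case method names."""
    return [m.strip().upper() for m in allow_header.split(",") if m.strip()]

def risky_methods(allow_header):
    """Return the advertised methods that appear in the RISKY set."""
    return [m for m in parse_allow(allow_header) if m in RISKY]

print(risky_methods("GET, POST, PUT, DELETE"))
print(risky_methods("GET, POST"))
```

Servers do not always advertise methods accurately, so an `Allow` header is a hint rather than proof that a method is usable.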

`--response-scan` — HTTP Response Analysis

Analyzes status codes, server identification strings, and response characteristics across internal URLs. Classifies accessible, redirecting, forbidden, and error-state endpoints.

  [200] /dashboard              Apache/2.4.54
  [200] /api/v1/users           nginx/1.22.0
  [301] /admin            →     /admin/
  [403] /config
  [500] /api/legacy
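
The four endpoint classes named above can be expressed as a small status-code mapping. This is a sketch under assumed boundaries (for example, treating 401 like 403); `classify_status` is a hypothetical name, not the plugin's function.

```python
def classify_status(code):
    """Map an HTTP status code to one of the endpoint classes described above."""
    if 200 <= code < 300:
        return "accessible"
    if 300 <= code < 400:
        return "redirecting"
    if code in (401, 403):
        return "forbidden"
    if code >= 400:
        return "error"
    return "other"

for status in (200, 301, 403, 500):
    print(status, classify_status(status))
```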

`--dns` — DNS & Subdomain Enumeration

Queries two passive DNS sources to enumerate subdomains without sending a single packet to the target:

crt.sh — Certificate Transparency log queries
HackerTarget — Passive DNS database lookup

[*] Querying crt.sh & HackerTarget for example.com
[+] 14 subdomains discovered:
  ▸ api.example.com
  ▸ dev.example.com
  ▸ staging.example.com
  ▸ admin.example.com
  ▸ mail.example.com
  ▸ vpn.example.com
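
For the crt.sh source, a lookup can be sketched as a URL builder plus a parser for the JSON it returns. crt.sh's JSON output carries certificate names in a `name_value` field, with several names separated by newlines. The function names below are hypothetical, the network fetch is deliberately left out, and `find_subdomains.py` may handle the response differently.

```python
import json
from urllib.parse import quote

def crtsh_url(domain):
    """Build a crt.sh query URL for all names under the given domain."""
    # "%." is the wildcard prefix; quote() encodes "%" as "%25".
    return "https://crt.sh/?q=" + quote("%." + domain) + "&output=json"

def parse_crtsh(payload, domain):
    """Extract unique subdomains of `domain` from a crt.sh JSON response body."""
    names = set()
    for entry in json.loads(payload):
        for name in entry.get("name_value", "").splitlines():
            name = name.strip().lower()
            if name.startswith("*."):  # drop wildcard certificate prefixes
                name = name[2:]
            if name.endswith("." + domain):
                names.add(name)
    return sorted(names)

sample = json.dumps([
    {"name_value": "api.example.com\n*.dev.example.com"},
    {"name_value": "API.example.com"},
    {"name_value": "example.com"},
])
print(crtsh_url("example.com"))
print(parse_crtsh(sample, "example.com"))
```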

`--wayback` — Wayback Machine URL Seeding

Queries the Wayback Machine CDX API for historically archived URLs of the target domain and injects them into the crawl queue. Surfaces removed pages, old API versions, and deprecated endpoints still available in the archive.
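
A CDX query of this kind can be sketched as follows. With `output=json`, the CDX server returns a list of rows whose first row holds the field names. The function names are hypothetical, the exact parameters used by `wayback.py` are not documented here, and the HTTP request itself is omitted.

```python
import json
from urllib.parse import urlencode

def cdx_url(domain):
    """Build a CDX API query for every archived URL under the domain."""
    query = urlencode({
        "url": f"{domain}/*",   # trailing wildcard matches all paths on the domain
        "output": "json",
        "fl": "original",       # return only the original URL column
        "collapse": "urlkey",   # collapse repeated captures of the same URL
    })
    return "https://web.archive.org/cdx/search/cdx?" + query

def parse_cdx(payload):
    """Return archived URLs from a CDX JSON body, skipping the header row."""
    rows = json.loads(payload) if payload.strip() else []
    return [row[0] for row in rows[1:]]

sample = '[["original"], ["https://example.com/old-api/v0/users"], ["https://example.com/beta/"]]'
print(cdx_url("example.com"))
print(parse_cdx(sample))
```

The returned URLs would then be added to the crawl queue alongside any seeds passed with `-s`.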

`--clone` — Website Mirroring

Downloads every visited page during the crawl, preserving the original directory structure on the local filesystem. Enables offline analysis, diffing, and long-term archiving.

Installation

# Clone the repository
git clone https://github.com/WHY-RERO/Photon_v2
cd Photon_v2

# Install dependencies
pip install -r requirements.txt

Dependencies:

requests>=2.28.0
tld>=0.12.6
urllib3>=1.26.0

Requires Python 3.8 or higher.

Usage

Basic Scan

python photon.py -u https://example.com

Adjusted Depth & Threads

python photon.py -u https://example.com -l 3 -t 16

All Modules Enabled

python photon.py -u https://example.com \
  --tech \
  --secrets \
  --header-scan \
  --api-scan \
  --param-scan \
  --auth-scan \
  --methods \
  --response-scan \
  --dns \
  --wayback

Authenticated Scan with Cookies

python photon.py -u https://example.com \
  -c "session=abc123; auth_token=xyz789" \
  --secrets --api-scan --param-scan

Proxy-Routed Scan

# Single proxy
python photon.py -u https://example.com -p 127.0.0.1:8080

# Proxy list from file
python photon.py -u https://example.com -p proxies.txt

Export Results

# Export as JSON
python photon.py -u https://example.com --export json -o ./results

# Export as CSV
python photon.py -u https://example.com --export csv -o ./results

URL-Only Collection

python photon.py -u https://example.com --only-urls --export json

Pipe a Dataset to Stdout

# Pipe fuzzable URLs directly to ffuf
python photon.py -u https://example.com --stdout fuzzable | \
  ffuf -w - -u FUZZ -c

# Write internal URLs to file
python photon.py -u https://example.com --stdout internal > urls.txt

Arguments

Target

Flag	Description
`-u`, `--url`	Target URL (required)
`-s`, `--seeds`	Additional seed URLs to include in the crawl queue
`--wayback`	Seed crawl queue with archived URLs from archive.org

Crawl Options

Flag	Default	Description
`-l`, `--level`	`2`	Crawl depth
`-t`, `--threads`	`4`	Thread count
`-d`, `--delay`	`0`	Delay between requests (seconds)
`--timeout`	`6`	Request timeout (seconds)
`--exclude`	—	Exclude URLs matching a regex pattern
`--only-urls`	—	Collect URLs only, skip all analysis

Output

Flag	Description
`-o`, `--output`	Output directory (defaults to target hostname)
`-e`, `--export`	Export format: `json` or `csv`
`--stdout`	Print a specific dataset to stdout
`-v`, `--verbose`	Enable verbose output

Auth & Network

Flag	Description
`-c`, `--cookie`	Cookie string for authenticated crawling
`--headers`	Prompt for custom request headers interactively
`--user-agent`	Custom user-agent string(s), comma-separated for rotation
`-p`, `--proxy`	Proxy (`IP:PORT`) or path to proxy list file
`-r`, `--regex`	Custom regex pattern — extract matches from every page

Scan Modules

Flag	Description
`--tech`	Technology stack fingerprinting
`--header-scan`	HTTP security header analysis
`--secrets`	Credential and secret detection (17 patterns)
`--api-scan`	API endpoint discovery from JS and URLs
`--param-scan`	GET parameter enumeration
`--auth-scan`	Authentication surface mapping
`--methods`	HTTP method discovery (PUT, DELETE, TRACE, etc.)
`--response-scan`	HTTP response status and server analysis
`--dns`	Subdomain enumeration via crt.sh + HackerTarget
`--keys`	High-entropy key and token extraction
`--clone`	Mirror the target website locally
`--update`	Update Photon from GitHub

Advanced Usage

Bug Bounty Recon Workflow

# Phase 1 — Surface mapping: subdomains + historical URLs
python photon.py -u https://target.com \
  --dns --wayback \
  -l 2 -t 8 \
  --export json -o recon/phase1/

# Phase 2 — Technology + security posture analysis
python photon.py -u https://target.com \
  --tech --header-scan --secrets \
  -l 3 -t 12 \
  --export json -o recon/phase2/

# Phase 3 — API surface and attack vector discovery
python photon.py -u https://target.com \
  --api-scan --param-scan --auth-scan --methods \
  -l 3 -t 16 \
  --export json -o recon/phase3/

Post-Login Authenticated Crawl

# Capture session cookies from your browser after logging in
python photon.py -u https://target.com/dashboard \
  -c "PHPSESSID=abc123; remember_token=def456" \
  --secrets --api-scan --param-scan \
  -l 3 --export json -o recon/authed/

Maximum Coverage — Aggressive Mode

python photon.py -u https://target.com \
  -l 5 -t 30 -d 0.1 \
  --wayback --dns --clone \
  --tech --secrets --header-scan \
  --api-scan --param-scan --auth-scan \
  --methods --response-scan --keys \
  --export json -o ./full_recon -v

Exclude Noise — Regex Filtering

# Skip logout endpoints, static assets, and CDN URLs
python photon.py -u https://example.com \
  --exclude "(logout|\.css|\.png|\.woff|cdn\.)" \
  -l 3 --api-scan --secrets

Multi-Seed Crawl

# Crawl from multiple starting points simultaneously
python photon.py -u https://example.com \
  -s https://example.com/blog https://example.com/docs \
  -l 3 -t 10 --api-scan

Pipeline Integration

# Feed fuzzable URLs directly into ffuf
python photon.py -u https://example.com --stdout fuzzable | \
  ffuf -w - -u FUZZ -mc 200,403

# Feed endpoints into nuclei for template-based scanning
python photon.py -u https://example.com --stdout internal | \
  nuclei -l - -t technologies/

# Enumerate subdomains during the run, then pass the internal URL list to httpx for probing
python photon.py -u https://example.com --dns --stdout internal | \
  httpx -silent -title -tech-detect

Output Structure

On completion, results are saved to a directory named after the target hostname. Each dataset is written to its own file for easy downstream processing.

example.com/
│
├── internal.txt          # All internal URLs discovered
├── external.txt          # All external URLs discovered
├── scripts.txt           # JavaScript file URLs
├── endpoints.txt         # API endpoints extracted from JS
├── fuzzable.txt          # Parameterized URLs (ready for fuzzing)
├── intel.txt             # Emails, social profiles, phone numbers
├── files.txt             # Downloadable file URLs
├── robots.txt            # Entries parsed from robots.txt
├── keys.txt              # High-entropy key strings
├── failed.txt            # Failed / unreachable URLs
│
├── technologies.txt      # --tech results
├── security_headers.txt  # --header-scan results
├── secrets.txt           # --secrets results
├── api_endpoints.txt     # --api-scan results
├── parameters.txt        # --param-scan results
├── auth_surfaces.txt     # --auth-scan results
├── http_methods.txt      # --methods results
├── responses.txt         # --response-scan results
├── subdomains.txt        # --dns results
│
├── results.json          # --export json  (all datasets combined)
└── results.csv           # --export csv

Sample JSON Output

{
  "internal": [
    "https://example.com/",
    "https://example.com/about",
    "https://example.com/api/v1/users"
  ],
  "fuzzable": [
    "https://example.com/search?q=test&page=1&sort=asc"
  ],
  "endpoints": [
    "/api/v1/users",
    "/api/v2/payments/charge",
    "/internal/admin/settings"
  ],
  "secrets": [
    {
      "type": "AWS_ACCESS_KEY",
      "value": "AKIAIOSFODNN7EXAMPLE",
      "url": "https://example.com/js/config.bundle.js"
    }
  ],
  "technologies": ["WordPress", "PHP/8.1", "Nginx", "Cloudflare"],
  "subdomains": ["api.example.com", "staging.example.com", "mail.example.com"]
}
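
Because the export is plain JSON, downstream scripts can consume it directly. The sketch below assumes the key names shown in the sample above; `summarize` is a hypothetical helper, not part of Photon v2.

```python
import json
from collections import Counter

def summarize(results):
    """Count entries per dataset and secrets per type in a results dict."""
    counts = {key: len(value) for key, value in results.items() if isinstance(value, list)}
    secret_types = Counter(item["type"] for item in results.get("secrets", []))
    return counts, dict(secret_types)

# Inline sample using the same structure as the output above.
sample = json.loads("""
{
  "internal": ["https://example.com/", "https://example.com/about"],
  "secrets": [
    {"type": "AWS_ACCESS_KEY",
     "value": "AKIAIOSFODNN7EXAMPLE",
     "url": "https://example.com/js/config.bundle.js"}
  ]
}
""")
counts, secret_types = summarize(sample)
print(counts, secret_types)
```

In practice the dict would come from `json.load` on the exported `results.json`, whose location depends on the `-o` value used for the scan.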

Architecture

Photon_v2/
│
├── photon.py                  # Entry point · argument parser · crawl loop · results output
│
├── core/
│   ├── config.py              # VERSION · INTELS list · BAD_TYPES · sensitive patterns
│   ├── colors.py              # Terminal color codes and output symbols
│   ├── flash.py               # Multi-threaded task executor (thread pool wrapper)
│   ├── requester.py           # HTTP request engine (proxy rotation · UA rotation · retry)
│   ├── regex.py               # Compiled regex patterns (href · script · intel · endpoint · entropy)
│   ├── utils.py               # Helpers: Luhn check · entropy · proxy test · writer · timer
│   ├── mirror.py              # Local website clone engine
│   ├── prompt.py              # Interactive custom header input
│   ├── updater.py             # GitHub update checker
│   ├── zap.py                 # robots.txt parsing and initial URL seeding
│   └── user-agents.txt        # User-agent rotation wordlist
│
└── plugins/
    ├── tech_scan.py           # Technology fingerprinting · 30+ signatures · header + body + cookie
    ├── header_scan.py         # Security header audit · present / missing / leaking classification
    ├── secret_scan.py         # Credential detection · 17 compiled patterns
    ├── api_scan.py            # API endpoint classification from JS and URL patterns
    ├── param_scan.py          # GET parameter enumeration and grouping
    ├── auth_scan.py           # Authentication surface detection and mapping
    ├── method_scan.py         # HTTP method probing (PUT · DELETE · TRACE · PATCH)
    ├── response_scan.py       # HTTP status code and server header analysis
    ├── find_subdomains.py     # Subdomain enumeration via crt.sh + HackerTarget
    ├── dnsdumpster.py         # DNS Dumpster passive DNS integration
    ├── wayback.py             # Wayback Machine CDX API historical URL seeding
    └── exporter.py            # JSON / CSV structured export engine

Legal Disclaimer

This tool is intended for authorized security testing, research, and educational purposes only.

Running Photon against systems you do not own or have explicit written permission to test is illegal in most jurisdictions and may result in civil or criminal prosecution. The author assumes no liability for any misuse, damage, or legal consequences arising from the use of this software. Always obtain proper written authorization before scanning any target.

RECODED BY RERO · Photon v2.0.0
