
  ██████╗ ██╗  ██╗ ██████╗ ████████╗ ██████╗ ███╗   ██╗
  ██╔══██╗██║  ██║██╔═══██╗╚══██╔══╝██╔═══██╗████╗  ██║
  ██████╔╝███████║██║   ██║   ██║   ██║   ██║██╔██╗ ██║
  ██╔═══╝ ██╔══██║██║   ██║   ██║   ██║   ██║██║╚██╗██║
  ██║     ██║  ██║╚██████╔╝   ██║   ╚██████╔╝██║ ╚████║
  ╚═╝     ╚═╝  ╚═╝ ╚═════╝    ╚═╝    ╚═════╝ ╚═╝  ╚═══╝

Passive Web Reconnaissance & Crawling Framework


RECODED BY RERO  ·  Built for security researchers, bug bounty hunters & red teamers.


Overview

Photon v2 is a fully rewritten, modular passive web reconnaissance and crawling framework. Built on top of the original Photon project, it introduces a plugin-based scan module system, an enhanced regex engine, multi-threaded crawl architecture, and structured output pipelines.

Designed for passive OSINT collection, bug bounty reconnaissance, security auditing, and web infrastructure analysis — Photon v2 gives you a complete surface-level picture of a target with minimal noise.

All core components and the plugin system have been rewritten from scratch by RERO.


Features

Category          Capability
Crawling          Multi-threaded, depth-controlled, domain-aware crawler
Link Analysis     Internal / external / JS / fuzzable URL classification
Intelligence      Email, phone, social profile, credit card pattern extraction
Wayback           Historical URL seeding via archive.org CDX API
Mirroring         Clone target website to local filesystem
Entropy Scanning  High-entropy key and token detection
Export            Structured JSON and CSV output
Proxy Support     Single proxy or multi-proxy rotation from file
User-Agent        Randomized UA rotation from bundled wordlist
Custom Regex      User-defined extraction patterns across all pages
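
As a rough illustration of the entropy scanning capability, the sketch below scores a string by its Shannon entropy. The function names, the 16-character minimum, and the 4.0-bit threshold are assumptions made for this example and are not taken from Photon's source.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Return the Shannon entropy of `text` in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_like_key(token: str, min_length: int = 16, threshold: float = 4.0) -> bool:
    """Flag long tokens whose characters are close to uniformly distributed.

    Both `min_length` and `threshold` are illustrative defaults.
    """
    return len(token) >= min_length and shannon_entropy(token) >= threshold
```

Repetitive strings score near zero, while random-looking tokens score higher, so a threshold like this trades missed keys against false positives.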

Scan Modules

Photon v2 ships with 10 independent scan modules, each toggled with a dedicated flag. Modules run post-crawl against collected data or fire inline during crawling.


--tech — Technology Fingerprinting

Analyzes page source, HTTP response headers, and cookies to identify the underlying technology stack. Matches against a signature database of 30+ frameworks, libraries, servers, and services.

Detects: WordPress, Drupal, Joomla, Django, Laravel, React, Vue.js, Angular, jQuery, Bootstrap, Nginx, Apache, PHP, ASP.NET, Node.js, Ruby on Rails, Cloudflare, Google Analytics, Shopify, Wix, WooCommerce, Elasticsearch, GraphQL, and more.

[+] 4 technologies identified:
  ▸ Nginx
  ▸ PHP/8.1.2
  ▸ WordPress
  ▸ Cloudflare
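
Signature matching of this kind can be sketched in a few lines. The signature table and the `fingerprint` function below are hypothetical and far smaller than the 30+ signatures the module ships with; they only show the general idea of checking headers and body against named patterns.

```python
import re

# Hypothetical signatures: technology name -> (where to look, pattern).
SIGNATURES = {
    "WordPress": ("body", re.compile(r"/wp-(content|includes)/", re.I)),
    "Nginx": ("header:server", re.compile(r"nginx", re.I)),
    "Cloudflare": ("header:server", re.compile(r"cloudflare", re.I)),
    "PHP": ("header:x-powered-by", re.compile(r"php", re.I)),
}


def fingerprint(headers: dict, body: str) -> list:
    """Return the sorted names of all signatures that match the response."""
    lowered = {key.lower(): value for key, value in headers.items()}
    found = []
    for name, (source, pattern) in SIGNATURES.items():
        if source == "body":
            haystack = body
        else:
            # "header:<name>" selects a single response header.
            haystack = lowered.get(source.split(":", 1)[1], "")
        if pattern.search(haystack):
            found.append(name)
    return sorted(found)
```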

--header-scan — HTTP Security Header Analysis

Inspects response headers and classifies them into three categories:

  • Present security headers: Strict-Transport-Security, Content-Security-Policy, X-Frame-Options, X-Content-Type-Options, Referrer-Policy, Permissions-Policy
  • Missing security headers: required headers absent from the response
  • Information-leaking headers: Server, X-Powered-By, X-Generator exposing version strings

[+] Security headers present:
  ✓ Strict-Transport-Security: max-age=31536000; includeSubDomains
  ✓ X-Frame-Options: SAMEORIGIN
[-] Missing security headers:
  ✗ Content-Security-Policy
  ✗ Permissions-Policy
[!] Info-leaking headers:
  ! X-Powered-By: PHP/8.1.2
  ! Server: Apache/2.4.54 (Ubuntu)
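
The three-way classification can be sketched as follows. The header names come from the list above; the `classify_headers` function and its return shape are assumptions for illustration only.

```python
SECURITY_HEADERS = [
    "Strict-Transport-Security",
    "Content-Security-Policy",
    "X-Frame-Options",
    "X-Content-Type-Options",
    "Referrer-Policy",
    "Permissions-Policy",
]
LEAKY_HEADERS = ["Server", "X-Powered-By", "X-Generator"]


def classify_headers(headers: dict):
    """Split response headers into (present, missing, leaking)."""
    # HTTP header names are case-insensitive, so compare in lowercase.
    lowered = {key.lower(): value for key, value in headers.items()}
    present = {h: lowered[h.lower()] for h in SECURITY_HEADERS if h.lower() in lowered}
    missing = [h for h in SECURITY_HEADERS if h.lower() not in lowered]
    leaking = {h: lowered[h.lower()] for h in LEAKY_HEADERS if h.lower() in lowered}
    return present, missing, leaking
```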

--secrets — Credential & Secret Detection

Scans every page's source code against 17 compiled regex patterns to detect exposed credentials, API keys, tokens, and private keys.

Pattern Type     Detection Logic
AWS_ACCESS_KEY   AKIA[0-9A-Z]{16}
AWS_SECRET       Contextual aws...secret key pairs
GITHUB_TOKEN     ghp_, gho_, ghs_, ghu_, ghr_ prefixed tokens
GOOGLE_API       AIza[0-9A-Za-z-_]{35}
SLACK_TOKEN      xoxb-, xoxp-, xoxa- prefixed tokens
SLACK_WEBHOOK    hooks.slack.com/services/T.../B.../...
STRIPE_KEY       pk_live_, sk_live_, pk_test_, sk_test_
JWT_TOKEN        eyJ[header].[payload].[signature] three-part format
PRIVATE_KEY      -----BEGIN (RSA|EC)? PRIVATE KEY-----
FIREBASE         Firebase Cloud Messaging server key
MAILCHIMP        [a-f0-9]{32}-us[0-9]{1,2} format
SENDGRID         SG.[22-char].[43-char] format
TWILIO           SK + 32 hex characters
HEROKU           UUID-format application identifier
GENERIC_SECRET   password=, secret=, passwd= assignments
GENERIC_API_KEY  api_key=, apikey=, api-key= assignments

[!] SECRET [GITHUB_TOKEN] at https://example.com/assets/config.js
[!] SECRET [AWS_ACCESS_KEY] at https://example.com/js/app.bundle.js
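
A minimal sketch of pattern-based secret scanning, assuming each page's source is available as a string. The two key patterns are copied from the table above; the exact GitHub token length (36 characters after the prefix) and the `scan_source` function are assumptions, not Photon's actual implementation.

```python
import re

PATTERNS = {
    "AWS_ACCESS_KEY": re.compile(r"AKIA[0-9A-Z]{16}"),
    "GOOGLE_API": re.compile(r"AIza[0-9A-Za-z_-]{35}"),
    # Assumption: one of the listed prefixes plus 36 alphanumeric characters.
    "GITHUB_TOKEN": re.compile(r"gh[pousr]_[A-Za-z0-9]{36}"),
}


def scan_source(url: str, source: str) -> list:
    """Return one finding per pattern match in a page's source."""
    findings = []
    for kind, pattern in PATTERNS.items():
        for match in pattern.finditer(source):
            findings.append({"type": kind, "value": match.group(0), "url": url})
    return findings
```

Compiling the patterns once at import time keeps the per-page cost to a single pass per pattern.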

--api-scan — API Endpoint Discovery

Crawls JavaScript files and internal URLs to surface REST API endpoints. The rendpoint regex engine extracts path patterns directly from JS source code, revealing endpoints not linked anywhere in the HTML.

[+] 12 API endpoints found:
  ▸ /api/v1/users
  ▸ /api/v1/auth/login
  ▸ /api/v2/products/{id}
  ▸ /internal/admin/dashboard
  ▸ /api/v1/payments/charge
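
To illustrate how endpoints can be pulled out of JavaScript source, here is a small sketch. The `ENDPOINT` regex is a simplified stand-in written for this example (it only matches quoted paths under /api/ or /internal/) and is not the rendpoint pattern itself.

```python
import re

# Hypothetical pattern: a quoted path that starts with /api/ or /internal/.
ENDPOINT = re.compile(r"""["'](/(?:api|internal)/[A-Za-z0-9_\-/{}.]+)["']""")


def extract_endpoints(js_source: str) -> list:
    """Return the unique endpoint paths found in a JS file, sorted."""
    return sorted(set(ENDPOINT.findall(js_source)))
```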

--param-scan — Parameter Enumeration

Analyzes all collected URLs to enumerate GET parameters, identify parameter names, and map parameterized endpoints. Produces clean output ready for fuzzing pipelines.

[+] 8 parameterized URLs found:
  ▸ /search       [q, page, sort, filter]
  ▸ /user         [id, token, redirect]
  ▸ /product      [id, category, ref]
  ▸ /api/items    [limit, offset, order]

--auth-scan — Authentication Surface Mapping

Detects login forms, token-gated endpoints, and authentication walls through URL pattern analysis and endpoint classification. Useful for mapping the full authentication attack surface.

[+] 5 auth surfaces found:
  ▸ https://example.com/login
  ▸ https://example.com/admin
  ▸ https://example.com/account/signin
  ▸ https://example.com/api/v1/auth/token
  ▸ https://example.com/oauth/authorize
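
URL pattern analysis for authentication surfaces might look like the sketch below. The keyword set and the `is_auth_surface` function are assumptions chosen to match the sample output, not the module's real rules.

```python
from urllib.parse import urlsplit

# Hypothetical keyword set; matched against whole path segments.
AUTH_KEYWORDS = {"login", "signin", "sign-in", "admin", "auth", "oauth", "sso", "account"}


def is_auth_surface(url: str) -> bool:
    """Return True if any path segment is an authentication-related keyword."""
    segments = urlsplit(url).path.lower().split("/")
    return any(segment in AUTH_KEYWORDS for segment in segments)
```

Matching whole segments rather than substrings avoids flagging paths such as /authors.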

--methods — HTTP Method Discovery

Tests a sample of internal URLs against non-standard HTTP methods — OPTIONS, PUT, DELETE, TRACE, PATCH, CONNECT. Risky methods are color-coded in output and saved separately.

  ▸ /api/v1/users      → GET, POST, PUT, DELETE   ← DANGEROUS
  ▸ /api/v1/products   → GET, POST
  ▸ /admin/            → GET
  ▸ /api/v1/config     → GET, PUT, DELETE          ← DANGEROUS
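
One way to flag risky methods is to parse the Allow header of an OPTIONS response. The sketch below assumes that approach; the `RISKY_METHODS` set and both function names are illustrative, and the module may determine allowed methods differently.

```python
# Assumption: which methods count as risky is a choice made for this example.
RISKY_METHODS = {"PUT", "DELETE", "TRACE", "CONNECT"}


def parse_allow(allow_header: str) -> list:
    """Split an Allow header value such as "GET, POST, PUT" into method names."""
    return [m.strip().upper() for m in allow_header.split(",") if m.strip()]


def risky_methods(allow_header: str) -> list:
    """Return the risky methods advertised in an Allow header, sorted."""
    return sorted(set(parse_allow(allow_header)) & RISKY_METHODS)
```

With the requests library, the header value could be obtained with `requests.options(url).headers.get("Allow", "")`.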

--response-scan — HTTP Response Analysis

Analyzes status codes, server identification strings, and response characteristics across internal URLs. Classifies accessible, redirecting, forbidden, and error-state endpoints.

  [200] /dashboard              Apache/2.4.54
  [200] /api/v1/users           nginx/1.22.0
  [301] /admin            →     /admin/
  [403] /config
  [500] /api/legacy
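
The four endpoint classes named above map naturally onto status code ranges. The mapping below is an assumption for illustration; in particular, treating 401 and 403 as "forbidden" is a choice made for this sketch.

```python
def classify_status(code: int) -> str:
    """Bucket an HTTP status code into a coarse endpoint state."""
    if 200 <= code < 300:
        return "accessible"
    if 300 <= code < 400:
        return "redirecting"
    if code in (401, 403):
        return "forbidden"
    if code >= 400:
        return "error"
    return "other"  # 1xx informational responses
```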

--dns — DNS & Subdomain Enumeration

Queries two passive DNS sources to enumerate subdomains without sending a single packet to the target:

  • crt.sh: Certificate Transparency log queries
  • HackerTarget: passive DNS database lookup

[*] Querying crt.sh & HackerTarget for example.com
[+] 14 subdomains discovered:
  ▸ api.example.com
  ▸ dev.example.com
  ▸ staging.example.com
  ▸ admin.example.com
  ▸ mail.example.com
  ▸ vpn.example.com
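
crt.sh can return its results as JSON, where each entry carries a `name_value` field holding one or more newline-separated hostnames. The sketch below shows how such entries could be reduced to a subdomain list; the `parse_crtsh` function is hypothetical, and fetching the JSON is left out.

```python
def parse_crtsh(entries, domain: str) -> list:
    """Extract unique subdomains of `domain` from decoded crt.sh JSON entries."""
    suffix = "." + domain.lower()
    names = set()
    for entry in entries:
        for name in entry.get("name_value", "").splitlines():
            name = name.strip().lower()
            if name.startswith("*."):
                name = name[2:]  # drop the wildcard label
            if name.endswith(suffix):
                names.add(name)
    return sorted(names)
```

Requiring the leading dot in the suffix excludes both the apex domain and lookalikes such as evil-example.com.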

--wayback — Wayback Machine URL Seeding

Queries the Wayback Machine CDX API for historically archived URLs of the target domain and injects them into the crawl queue. Surfaces removed pages, old API versions, and deprecated endpoints still available in the archive.
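
A minimal sketch of CDX seeding, assuming the JSON output format, in which the first row is a header of field names. The query parameters are standard CDX options, but which ones Photon actually sends is an assumption, as are the function names.

```python
from urllib.parse import urlencode

CDX_ENDPOINT = "http://web.archive.org/cdx/search/cdx"


def build_cdx_url(domain: str) -> str:
    """Build a CDX query that lists archived URLs for a domain and its subdomains."""
    params = {
        "url": domain,
        "matchType": "domain",
        "output": "json",
        "fl": "original",      # return only the original URL field
        "collapse": "urlkey",  # one row per distinct URL
    }
    return CDX_ENDPOINT + "?" + urlencode(params)


def parse_cdx_rows(rows) -> list:
    """Drop the header row and return the archived URLs."""
    return [row[0] for row in rows[1:] if row]
```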


--clone — Website Mirroring

Downloads every visited page during the crawl, preserving the original directory structure on the local filesystem. Enables offline analysis, diffing, and long-term archiving.
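
Preserving the directory structure comes down to mapping each URL to a file path. The sketch below shows one possible mapping; the `local_path` function, the `mirror` root, and the `index.html` fallback are assumptions rather than the behavior of core/mirror.py.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit


def local_path(url: str, root: str = "mirror") -> str:
    """Map a URL to a relative file path under `root`."""
    parts = urlsplit(url)
    path = parts.path or "/"
    if path.endswith("/"):
        path += "index.html"  # directory-style URLs need a file name
    # Drop empty and dot segments so the result cannot escape `root`.
    segments = [s for s in path.split("/") if s not in ("", ".", "..")]
    return str(PurePosixPath(root, parts.netloc, *segments))
```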


Installation

# Clone the repository
git clone https://github.com/WHY-RERO/Photon_v2
cd Photon_v2

# Install dependencies
pip install -r requirements.txt

Dependencies:

requests>=2.28.0
tld>=0.12.6
urllib3>=1.26.0

Requires Python 3.8 or higher.


Usage

Basic Scan

python photon.py -u https://example.com

Adjusted Depth & Threads

python photon.py -u https://example.com -l 3 -t 16

All Modules Enabled

python photon.py -u https://example.com \
  --tech \
  --secrets \
  --header-scan \
  --api-scan \
  --param-scan \
  --auth-scan \
  --methods \
  --response-scan \
  --dns \
  --wayback

Authenticated Scan with Cookies

python photon.py -u https://example.com \
  -c "session=abc123; auth_token=xyz789" \
  --secrets --api-scan --param-scan

Proxy-Routed Scan

# Single proxy
python photon.py -u https://example.com -p 127.0.0.1:8080

# Proxy list from file
python photon.py -u https://example.com -p proxies.txt

Export Results

# Export as JSON
python photon.py -u https://example.com --export json -o ./results

# Export as CSV
python photon.py -u https://example.com --export csv -o ./results

URL-Only Collection

python photon.py -u https://example.com --only-urls --export json

Pipe a Dataset to Stdout

# Pipe fuzzable URLs directly to ffuf
python photon.py -u https://example.com --stdout fuzzable | \
  ffuf -w - -u FUZZ -c

# Write internal URLs to file
python photon.py -u https://example.com --stdout internal > urls.txt

Arguments

Target

Flag         Description
-u, --url    Target URL (required)
-s, --seeds  Additional seed URLs to include in the crawl queue
--wayback    Seed crawl queue with archived URLs from archive.org

Crawl Options

Flag           Default  Description
-l, --level    2        Crawl depth
-t, --threads  4        Thread count
-d, --delay    0        Delay between requests (seconds)
--timeout      6        Request timeout (seconds)
--exclude               Exclude URLs matching a regex pattern
--only-urls             Collect URLs only, skip all analysis

Output

Flag           Description
-o, --output   Output directory (defaults to target hostname)
-e, --export   Export format: json or csv
--stdout       Print a specific dataset to stdout
-v, --verbose  Enable verbose output

Auth & Network

Flag          Description
-c, --cookie  Cookie string for authenticated crawling
--headers     Prompt for custom request headers interactively
--user-agent  Custom user-agent string(s), comma-separated for rotation
-p, --proxy   Proxy (IP:PORT) or path to proxy list file
-r, --regex   Custom regex pattern; matches are extracted from every page

Scan Modules

Flag             Description
--tech           Technology stack fingerprinting
--header-scan    HTTP security header analysis
--secrets        Credential and secret detection (17 patterns)
--api-scan       API endpoint discovery from JS and URLs
--param-scan     GET parameter enumeration
--auth-scan      Authentication surface mapping
--methods        HTTP method discovery (PUT, DELETE, TRACE, etc.)
--response-scan  HTTP response status and server analysis
--dns            Subdomain enumeration via crt.sh + HackerTarget
--keys           High-entropy key and token extraction
--clone          Mirror the target website locally
--update         Update Photon from GitHub

Advanced Usage

Bug Bounty Recon Workflow

# Phase 1 — Surface mapping: subdomains + historical URLs
python photon.py -u https://target.com \
  --dns --wayback \
  -l 2 -t 8 \
  --export json -o recon/phase1/

# Phase 2 — Technology + security posture analysis
python photon.py -u https://target.com \
  --tech --header-scan --secrets \
  -l 3 -t 12 \
  --export json -o recon/phase2/

# Phase 3 — API surface and attack vector discovery
python photon.py -u https://target.com \
  --api-scan --param-scan --auth-scan --methods \
  -l 3 -t 16 \
  --export json -o recon/phase3/

Post-Login Authenticated Crawl

# Capture session cookies from your browser after logging in
python photon.py -u https://target.com/dashboard \
  -c "PHPSESSID=abc123; remember_token=def456" \
  --secrets --api-scan --param-scan \
  -l 3 --export json -o recon/authed/

Maximum Coverage — Aggressive Mode

python photon.py -u https://target.com \
  -l 5 -t 30 -d 0.1 \
  --wayback --dns --clone \
  --tech --secrets --header-scan \
  --api-scan --param-scan --auth-scan \
  --methods --response-scan --keys \
  --export json -o ./full_recon -v

Exclude Noise — Regex Filtering

# Skip logout endpoints, static assets, and CDN URLs
python photon.py -u https://example.com \
  --exclude "(logout|\.css|\.png|\.woff|cdn\.)" \
  -l 3 --api-scan --secrets
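
To preview which URLs a pattern like the one above would drop, it can be tried in plain Python first. This sketch assumes the pattern is applied with a search anywhere in the URL; whether Photon uses search or match semantics is not confirmed here, and `keep_url` is a hypothetical helper.

```python
import re

EXCLUDE = re.compile(r"(logout|\.css|\.png|\.woff|cdn\.)")


def keep_url(url: str) -> bool:
    """Return True if the exclusion pattern matches nowhere in the URL."""
    return EXCLUDE.search(url) is None
```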

Multi-Seed Crawl

# Crawl from multiple starting points simultaneously
python photon.py -u https://example.com \
  -s https://example.com/blog https://example.com/docs \
  -l 3 -t 10 --api-scan

Pipeline Integration

# Feed fuzzable URLs directly into ffuf
python photon.py -u https://example.com --stdout fuzzable | \
  ffuf -w - -u FUZZ -mc 200,403

# Feed endpoints into nuclei for template-based scanning
python photon.py -u https://example.com --stdout internal | \
  nuclei -l - -t technologies/

# Run with --dns enabled and pass the internal URL dataset to httpx for probing
python photon.py -u https://example.com --dns --stdout internal | \
  httpx -silent -title -tech-detect

Output Structure

On completion, results are saved to a directory named after the target hostname. Each dataset is written to its own file for easy downstream processing.

example.com/
│
├── internal.txt          # All internal URLs discovered
├── external.txt          # All external URLs discovered
├── scripts.txt           # JavaScript file URLs
├── endpoints.txt         # API endpoints extracted from JS
├── fuzzable.txt          # Parameterized URLs (ready for fuzzing)
├── intel.txt             # Emails, social profiles, phone numbers
├── files.txt             # Downloadable file URLs
├── robots.txt            # Entries parsed from robots.txt
├── keys.txt              # High-entropy key strings
├── failed.txt            # Failed / unreachable URLs
│
├── technologies.txt      # --tech results
├── security_headers.txt  # --header-scan results
├── secrets.txt           # --secrets results
├── api_endpoints.txt     # --api-scan results
├── parameters.txt        # --param-scan results
├── auth_surfaces.txt     # --auth-scan results
├── http_methods.txt      # --methods results
├── responses.txt         # --response-scan results
├── subdomains.txt        # --dns results
│
├── results.json          # --export json  (all datasets combined)
└── results.csv           # --export csv

Sample JSON Output

{
  "internal": [
    "https://example.com/",
    "https://example.com/about",
    "https://example.com/api/v1/users"
  ],
  "fuzzable": [
    "https://example.com/search?q=test&page=1&sort=asc"
  ],
  "endpoints": [
    "/api/v1/users",
    "/api/v2/payments/charge",
    "/internal/admin/settings"
  ],
  "secrets": [
    {
      "type": "AWS_ACCESS_KEY",
      "value": "AKIAIOSFODNN7EXAMPLE",
      "url": "https://example.com/js/config.bundle.js"
    }
  ],
  "technologies": ["WordPress", "PHP/8.1", "Nginx", "Cloudflare"],
  "subdomains": ["api.example.com", "staging.example.com", "mail.example.com"]
}
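
Because the export is plain JSON, it is easy to post-process. The sketch below assumes the layout shown in the sample above; the helper names are hypothetical, and the file path passed to `load_results` depends on the -o value used for the scan.

```python
import json


def load_results(path: str) -> dict:
    """Load an exported results.json file."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def summarize(results: dict) -> dict:
    """Count the entries in every list-valued dataset."""
    return {name: len(items) for name, items in results.items() if isinstance(items, list)}


def secrets_by_type(results: dict) -> dict:
    """Group the URLs of secret findings by their pattern type."""
    grouped = {}
    for finding in results.get("secrets", []):
        grouped.setdefault(finding["type"], []).append(finding["url"])
    return grouped
```

For example, `summarize(load_results("results/results.json"))` would give a per-dataset count for a scan exported with `-o ./results`.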

Architecture

Photon_v2/
│
├── photon.py                  # Entry point · argument parser · crawl loop · results output
│
├── core/
│   ├── config.py              # VERSION · INTELS list · BAD_TYPES · sensitive patterns
│   ├── colors.py              # Terminal color codes and output symbols
│   ├── flash.py               # Multi-threaded task executor (thread pool wrapper)
│   ├── requester.py           # HTTP request engine (proxy rotation · UA rotation · retry)
│   ├── regex.py               # Compiled regex patterns (href · script · intel · endpoint · entropy)
│   ├── utils.py               # Helpers: Luhn check · entropy · proxy test · writer · timer
│   ├── mirror.py              # Local website clone engine
│   ├── prompt.py              # Interactive custom header input
│   ├── updater.py             # GitHub update checker
│   ├── zap.py                 # robots.txt parsing and initial URL seeding
│   └── user-agents.txt        # User-agent rotation wordlist
│
└── plugins/
    ├── tech_scan.py           # Technology fingerprinting · 30+ signatures · header + body + cookie
    ├── header_scan.py         # Security header audit · present / missing / leaking classification
    ├── secret_scan.py         # Credential detection · 17 compiled patterns
    ├── api_scan.py            # API endpoint classification from JS and URL patterns
    ├── param_scan.py          # GET parameter enumeration and grouping
    ├── auth_scan.py           # Authentication surface detection and mapping
    ├── method_scan.py         # HTTP method probing (PUT · DELETE · TRACE · PATCH)
    ├── response_scan.py       # HTTP status code and server header analysis
    ├── find_subdomains.py     # Subdomain enumeration via crt.sh + HackerTarget
    ├── dnsdumpster.py         # DNS Dumpster passive DNS integration
    ├── wayback.py             # Wayback Machine CDX API historical URL seeding
    └── exporter.py            # JSON / CSV structured export engine

Legal Disclaimer

This tool is intended for authorized security testing, research, and educational purposes only.

Running Photon against systems you do not own or have explicit written permission to test is illegal in most jurisdictions and may result in civil or criminal prosecution. The author assumes no liability for any misuse, damage, or legal consequences arising from the use of this software. Always obtain proper written authorization before scanning any target.


RECODED BY RERO  ·  Photon v2.0.0
