Canopy is a Python OSINT framework for collecting and correlating publicly available information about people

guyvolvo/canopy

Canopy-scanner

Canopy is a Python-based OSINT framework designed to help individuals discover, organize, and analyze publicly available information about their own digital footprint.
Note: Canopy was developed and tested exclusively on my own publicly available data as a learning and portfolio project. It was made as part of my IT and cybersecurity journey and may contain bugs.

📌 Project Goals

  • Understand how publicly available information is indexed and exposed online
  • Practice structured OSINT methodology using search engines
  • Correlate results from multiple sources into meaningful categories
  • Demonstrate ethical boundaries and legal awareness in OSINT work

💫 Features

  • Multi-platform username enumeration across social media, coding sites, gaming platforms, and more.
  • Avoids false positives using fingerprint-based validation.
  • Supports local caching of platform fingerprints to speed up scans.
  • Multi-threaded, high-performance scanning with optional rate-limiting and delays.
  • Generates reports in JSON, CSV, HTML, or TXT formats.
  • CLI interface for easy integration into scripts or automation workflows.
  • Categorized results for better organization (e.g., social, professional, gaming).
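As a rough illustration of the categorized results and report formats listed above, here is a minimal sketch of grouping scan results by category and exporting them as JSON or CSV. The record fields (`platform`, `category`, `url`, `found`) and the function names are assumptions made for this example and may not match Canopy's actual report schema.

```python
import csv
import io
import json
from collections import defaultdict

# Hypothetical result records; field names are assumptions for this sketch.
RESULTS = [
    {"platform": "GitHub", "category": "coding",
     "url": "https://github.com/johndoe", "found": True},
    {"platform": "Reddit", "category": "social",
     "url": "https://www.reddit.com/user/johndoe", "found": True},
    {"platform": "Steam", "category": "gaming",
     "url": "https://steamcommunity.com/id/johndoe", "found": False},
]


def group_by_category(results, only_found=False):
    """Group result records by their category, optionally keeping only hits."""
    grouped = defaultdict(list)
    for record in results:
        if only_found and not record["found"]:
            continue
        grouped[record["category"]].append(record)
    return dict(grouped)


def to_json(results, only_found=False):
    """Serialize the grouped results as an indented JSON string."""
    return json.dumps(group_by_category(results, only_found), indent=2)


def to_csv(results):
    """Serialize the flat result list as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["platform", "category", "url", "found"]
    )
    writer.writeheader()
    writer.writerows(results)
    return buffer.getvalue()


if __name__ == "__main__":
    print(to_json(RESULTS, only_found=True))
    print(to_csv(RESULTS))
```

Passing `only_found=True` mirrors the idea behind the `--only-found` flag: categories with no confirmed accounts simply do not appear in the grouped output.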

📦 Installation

Install from PyPI:

pip install canopy-scanner

Or clone the repository:

git clone https://github.com/guyvolvo/Canopy.git
cd Canopy

usage: canopy [-h] [-u USERNAME] [-U USERNAMES] [-t THREADS] [--timeout TIMEOUT] [--delay DELAY]
                 [--rate-limit RATE_LIMIT] [-c CATEGORIES] [-p PLATFORMS] [--exclude EXCLUDE] [--only-found]
                 [--list-categories] [-o OUTPUT] [-f {json,csv,html,txt}] [-v] [-q] [--print-found]

Canopy - Username Enumeration Tool

options:
  -h, --help            show this help message and exit

Target Options:
  -u, --username USERNAME
                        Username to search for
  -U, --usernames USERNAMES
                        File containing list of usernames (one per line)

Performance Options:
  -t, --threads THREADS
                        Number of concurrent threads (default: 10)
  --timeout TIMEOUT     Request timeout in seconds (default: 10)
  --delay DELAY         Delay between requests in seconds (default: 0)
  --rate-limit RATE_LIMIT
                        Max requests per second (default: unlimited)

Filtering Options:
  -c, --categories CATEGORIES
                        Comma-separated categories to check (e.g., social,gaming)
  -p, --platforms PLATFORMS
                        Comma-separated specific platforms to check
  --exclude EXCLUDE     Comma-separated platforms to exclude
  --only-found          Only show found accounts
  --list-categories     Show all available platform categories and exit

Output Options:
  -o, --output OUTPUT   Output file path
  -f, --format {json,csv,html,txt}
                        Output format: json, csv, html, txt (default: json)
  -v, --verbose         Verbose output
  -q, --quiet           Minimal output (only results)
  --print-found         Print found accounts in real-time

    Examples:
      canopy -u johndoe
      canopy -u johndoe -t 50 --timeout 15
      canopy -u johndoe -o report.json --format json
      canopy -u johndoe --categories social,gaming
      canopy --list-categories

📃 Legal & Ethical Disclaimer

This framework is intended strictly for self-OSINT, educational use, or explicit consent-based research.
This tool should be used to analyze:
  • Your own digital footprint
  • Accounts, domains, and identifiers you own
  • Targets for which you have explicit written permission

Using this tool against private individuals without consent may violate privacy laws and platform Terms of Service.
I accept no responsibility for misuse of this software.

Canopy collects metadata only, such as:

  • Page titles
  • URLs
  • Search snippets
  • Source domain

It does not:

  • Bypass CAPTCHAs
  • Scrape authenticated content
  • Harvest private data
  • Enumerate personal contact lists

Theoretical project structure (generated by ChatGPT and Claude as a reference, so I can follow along and add or remove things as I see fit):

GPT workflow:

canopy/
├── README.md
├── DISCLAIMER.md
├── methodology/
│   └── osint_methodology.md
├── canopy/
│   ├── query_generator.py
│   ├── collector.py
│   ├── parser.py
│   └── correlator.py
├── output/
│   └── sample_report.md
└── lessons_learned.md

Claude workflow:

Canopy/
├── main.py               # Entry point, CLI interface
├── platforms.json        # Platform database
├── query_generator.py    # Generate queries from usernames
├── username_checker.py   # Check if username exists on platforms
├── data_collector.py     # Collect and aggregate data
├── report_generator.py   # Format and export results
├── config.py             # Configuration settings
├── utils.py              # Helper functions
└── requirements.txt      # Dependencies

platforms.json inspired by the Sherlock OSINT project :)

Methodology

Canopy uses a structured OSINT approach:

  • Generate a list of platforms to query (social, professional, gaming).
  • Create URL patterns for a target username.
  • Validate account existence using HTTP responses, redirects, error messages, and HTML fingerprints.
  • Aggregate results into structured reports.
  • Optionally store fingerprints locally to avoid redundant requests.

Combining these signals improves accuracy and reduces false positives.
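The URL-pattern and validation steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not Canopy's actual implementation: the `Platform` fields, the `classify` function, and the specific rules (treating 404/410 as missing, treating a redirect away from the profile URL as missing, matching a "not found" marker in the HTML) are all hypothetical. The HTTP request itself is left out, so the function only interprets a response that has already been fetched.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Platform:
    """Hypothetical platform entry, loosely modeled on a platforms.json record."""
    name: str
    category: str
    url_pattern: str                          # must contain "{username}"
    not_found_marker: Optional[str] = None    # text shown on the "no such user" page


def build_url(platform: Platform, username: str) -> str:
    """Fill the platform's URL pattern with the target username."""
    return platform.url_pattern.format(username=username)


def _normalize(url: str) -> str:
    """Lowercase and drop a trailing slash so equivalent URLs compare equal."""
    return url.rstrip("/").lower()


def classify(platform: Platform, username: str, status_code: int,
             final_url: str, body: str) -> str:
    """Return 'found', 'not_found', or 'unknown' for an already-fetched response."""
    if status_code in (404, 410):
        return "not_found"
    if status_code != 200:
        # Rate limits (429), blocks (403), and server errors prove nothing.
        return "unknown"
    if _normalize(final_url) != _normalize(build_url(platform, username)):
        # Assumption: a redirect away from the profile URL means no profile.
        return "not_found"
    if platform.not_found_marker and platform.not_found_marker in body:
        # Some sites answer 200 with an error page; the fingerprint catches it.
        return "not_found"
    return "found"


if __name__ == "__main__":
    example = Platform("Example", "social",
                       "https://example.com/{username}", "User not found")
    print(classify(example, "johndoe", 200,
                   "https://example.com/johndoe", "<title>johndoe</title>"))
```

Returning `unknown` instead of guessing is a deliberate choice in this sketch: a blocked or rate-limited request is reported as inconclusive, which avoids both false positives and false negatives.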

Best Practices

  • Only scan accounts you own or have explicit permission to analyze.
  • Use --threads and --rate-limit responsibly to avoid being blocked by platforms.
  • Review your JSON/CSV/HTML reports for patterns before taking any action.
  • Update platforms.json regularly to include new platforms.
  • Periodically refresh fingerprints for platforms that change their 404 pages.
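To show why combining --threads with --rate-limit matters, here is a minimal sketch of a thread-safe rate limiter that spaces requests evenly regardless of how many worker threads share it. The `RateLimiter` class is a hypothetical helper written for this example; it is not taken from Canopy's source, and the real tool may throttle differently.

```python
import threading
import time


class RateLimiter:
    """Allow at most `rate` calls per second across all threads (hypothetical helper)."""

    def __init__(self, rate, clock=time.monotonic, sleep=time.sleep):
        if rate <= 0:
            raise ValueError("rate must be positive")
        self._interval = 1.0 / rate      # minimum spacing between requests
        self._clock = clock              # injectable for testing
        self._sleep = sleep
        self._lock = threading.Lock()
        self._next_slot = clock()        # earliest time the next call may run

    def wait(self):
        """Block until this caller's time slot arrives; return the delay applied."""
        with self._lock:
            now = self._clock()
            slot = max(self._next_slot, now)
            # Reserve the following slot for whichever thread calls next.
            self._next_slot = slot + self._interval
        delay = slot - now
        if delay > 0:
            # Sleep outside the lock so other threads can reserve their slots.
            self._sleep(delay)
        return delay


if __name__ == "__main__":
    limiter = RateLimiter(rate=2)        # at most two requests per second
    for i in range(4):
        limiter.wait()
        print(f"request {i} at {time.monotonic():.2f}")
```

Each worker thread would call `limiter.wait()` immediately before sending its request, so raising the thread count increases concurrency without raising the request rate a platform sees.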
