Discovery gathers DNS entries, linked domains and subdomains, and all publicly exposed files (to extract metadata), checks for dumps on Pastebin/pwndb, and builds email lists for the brute-force phase!

Discovery 2.0

Discovery is a fully automated OSINT tool that gathers information from many different sources and cross-references the results in order to fingerprint a domain name as thoroughly as possible.


This tool (for the moment ;) ) relies on 4 FREE APIs: Hunter, Shodan, RocketReach and WhatCMS.

You can choose whether or not you want to use them. If you don't provide the API keys, the Hunter, Shodan, RocketReach and WhatCMS APIs simply won't be used.

The tool's settings can be adjusted in the configuration file:

Whois/DNS request

How to use :

python3 -d domain.tld --dns

This module actually does several things:

  • Queries Whois databases to gather the name of the registrant, who is responsible for the domain, and some DNS server names
  • Queries Google DNS to retrieve records
  • Attempts a DNS zone transfer against the previously found DNS servers
  • Gathers IPv4 ranges belonging to domain.tld
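The Whois step above can be sketched as follows. This is a minimal illustration using only the standard library, not the tool's actual code; real Whois output varies by registry, so the field names here are assumptions:

```python
import re

def parse_whois(raw: str) -> dict:
    """Pull the registrant name and name servers out of a raw Whois response."""
    registrant = re.search(r"(?im)^Registrant(?: Name)?:\s*(.+)$", raw)
    servers = re.findall(r"(?im)^Name Server:\s*(\S+)$", raw)
    return {
        "registrant": registrant.group(1).strip() if registrant else None,
        "name_servers": [s.lower() for s in servers],
    }

sample = """Registrant Name: Example Corp
Name Server: NS1.EXAMPLE.TLD
Name Server: NS2.EXAMPLE.TLD
"""
info = parse_whois(sample)
print(info["registrant"])    # Example Corp
print(info["name_servers"])  # ['ns1.example.tld', 'ns2.example.tld']
```

The lowercased name-server list is what a later zone-transfer attempt would iterate over.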

From these IP ranges, Discovery will list all existing IPs:
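Listing every address in a recovered range can be done with Python's standard `ipaddress` module. A sketch, assuming the range was obtained in CIDR notation:

```python
import ipaddress

def expand_range(cidr: str) -> list[str]:
    """List every usable host address in an IPv4 range."""
    network = ipaddress.ip_network(cidr, strict=False)
    return [str(host) for host in network.hosts()]

# Example with a documentation range (RFC 5737)
ips = expand_range("203.0.113.0/29")
print(ips)  # ['203.0.113.1', ..., '203.0.113.6']
```

`hosts()` excludes the network and broadcast addresses, which is usually what you want before feeding the list to a port scanner.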

DNS enumeration

The second module is an implementation of the sublist3r python3 module written by aboul3la. You can call it using two different options:

python3 -d domain.tld --sublist


python3 -d domain.tld --subrute

The difference is that with --subrute, sublist3r performs a DNS brute force, which takes much more time but also finds more subdomains.

Discovery will also perform IP-to-host and reverse DNS lookups in order to gather new virtual hosts.

In the configuration file you can choose whether or not www.domain.tld and domain.tld should be merged. Be aware that almost all the time, www.domain.tld is the same thing as domain.tld. But sometimes it is not, which means we might lose some virtual hosts by merging www.domain.tld and domain.tld.

This configuration can be done in the configuration file :
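The merge option described above can be sketched like this. The function and option names are hypothetical, chosen for illustration, not taken from the tool:

```python
def merge_vhosts(hosts, merge_www=True):
    """Deduplicate virtual hosts, optionally folding www.domain into domain."""
    result = set()
    for host in hosts:
        if merge_www and host.startswith("www."):
            host = host[len("www."):]  # treat www.domain.tld as domain.tld
        result.add(host.lower())
    return sorted(result)

vhosts = ["www.domain.tld", "domain.tld", "Mail.domain.tld"]
print(merge_vhosts(vhosts))                   # ['domain.tld', 'mail.domain.tld']
print(merge_vhosts(vhosts, merge_www=False))  # keeps www.domain.tld separate
```

With `merge_www=False` the www host survives as its own entry, which is the safer choice when the two may serve different content.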

Scanner module

This module is composed of two functions and can be called this way:

python3 -d domain.tld --scan ["full" or "fast"]

The first function is an implementation of the python-nmap library. It will scan the discovered IPs either the "full" way (checking all 65535 ports) or the "fast" way (nmap's -F option).

Depending on the services discovered, it will perform a few actions:

  • Check the SSL certificate
  • Check for the CMS used (if there is one)
  • Check for common important files (.git, /status, trace.axd, robots.txt); if those files are found, they will be downloaded
  • Look for a WAF using Wafw00f
  • Look for a reverse proxy using the tool made by Nicolas Gregoire and Julien Cayssol

You can add as many files as you want in the configuration file:
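The sensitive-file check boils down to probing one URL per configured path on each discovered host. A sketch of the URL-building step; the path list mirrors the examples above and the helper names are illustrative:

```python
# Paths to probe on each host (mirrors the examples above; the real list
# comes from the configuration file).
SENSITIVE_FILES = [".git/HEAD", "status", "trace.axd", "robots.txt"]

def candidate_urls(host: str, port: int = 443) -> list[str]:
    """Build the list of URLs to request for one host/port."""
    scheme = "https" if port == 443 else "http"
    base = f"{scheme}://{host}" if port in (80, 443) else f"{scheme}://{host}:{port}"
    return [f"{base}/{path}" for path in SENSITIVE_FILES]

urls = candidate_urls("domain.tld")
print(urls[0])  # https://domain.tld/.git/HEAD
```

Each URL would then be fetched, and any path answering with HTTP 200 downloaded for later inspection.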

The tool will output an XML file that you will be able to import into any tool that accepts the Nmap report format:
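Such an Nmap-style XML report can be consumed with Python's standard library. A sketch over a minimal hand-written fragment (sample data, not real scan output):

```python
import xml.etree.ElementTree as ET

SAMPLE = """<nmaprun>
  <host>
    <address addr="203.0.113.10" addrtype="ipv4"/>
    <ports>
      <port protocol="tcp" portid="443">
        <state state="open"/>
        <service name="https"/>
      </port>
    </ports>
  </host>
</nmaprun>"""

def open_ports(xml_text: str) -> list[tuple[str, int, str]]:
    """Return (address, port, service) for every open port in the report."""
    root = ET.fromstring(xml_text)
    results = []
    for host in root.iter("host"):
        addr = host.find("address").get("addr")
        for port in host.iter("port"):
            if port.find("state").get("state") == "open":
                service = port.find("service")
                name = service.get("name") if service is not None else "?"
                results.append((addr, int(port.get("portid")), name))
    return results

print(open_ports(SAMPLE))  # [('203.0.113.10', 443, 'https')]
```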

The second function will use the Shodan API to gather information about the domain name: found servers, services, CVEs related to those services, and a quick description.

Metadata Scraper

This module is basically my Python version of FOCA:

python3 -d domain.tld --gather

Using Google dorks, it will gather publicly exposed documents and parse their metadata in order to find sensitive information (credentials, for example). This module is inspired by the pyfoca script written by altjx:
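The dork-building step amounts to generating one search query per configured file extension. A sketch with an illustrative extension list (the real list comes from the configuration file):

```python
# Extensions to look for (illustrative; configurable in the real tool).
EXTENSIONS = ["pdf", "docx", "xlsx", "pptx"]

def build_dorks(domain: str, extensions=EXTENSIONS) -> list[str]:
    """One Google dork per extension, restricted to the target domain."""
    return [f"site:{domain} filetype:{ext}" for ext in extensions]

dorks = build_dorks("domain.tld")
print(dorks[0])  # site:domain.tld filetype:pdf
```

Each query's results would then be downloaded so their metadata can be parsed.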

To parse the metadata I used exiftool.

You can add as many extensions as you want in the configuration file:

Note that in order to be parsed, the gathered documents must be downloaded, which can sometimes take a lot of disk space. If you don't want to keep the downloaded files, you can set an option in the configuration so that they are deleted once parsed:

This module will also check for sensitive files on Pastebin and GitHub. For each document found, it will check for the words listed in the configuration file:
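The word check itself can be sketched as a simple case-insensitive scan. The word list below is illustrative; in the tool it comes from the configuration file:

```python
# Words to flag in gathered documents (illustrative; configurable in the tool).
SENSITIVE_WORDS = ["password", "secret", "confidential", "api_key"]

def find_sensitive(text: str, words=SENSITIVE_WORDS) -> list[str]:
    """Return the configured words that appear in the document."""
    lowered = text.lower()
    return [w for w in words if w in lowered]

paste = "db_user=admin\ndb_password=hunter2\n"
print(find_sensitive(paste))  # ['password']
```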


The last module will use different APIs to gather the names of employees working for the given domain, in particular the RocketReach API, reusing some parts of N3tsky's PeopleScrap tool:

It will then create several lists of emails:

python3 -d domain.tld --harvest 

You can set the pattern to use for email creation in the configuration file:

If you specify an unsupported pattern, or don't specify any, Discovery will create all possible lists using every supported pattern.
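Email-list creation from harvested names can be sketched like this. The pattern keys and fallback behavior below are assumptions for illustration, not the tool's actual configuration format:

```python
# A few common corporate email patterns (keys are illustrative).
PATTERNS = {
    "first.last": "{first}.{last}@{domain}",
    "flast": "{f}{last}@{domain}",
    "firstl": "{first}{l}@{domain}",
}

def make_emails(first: str, last: str, domain: str, pattern=None) -> list[str]:
    """Build candidate addresses; fall back to all patterns if none is valid."""
    first, last = first.lower(), last.lower()
    ctx = {"first": first, "last": last, "f": first[0], "l": last[0], "domain": domain}
    chosen = [pattern] if pattern in PATTERNS else list(PATTERNS)
    return [PATTERNS[p].format(**ctx) for p in chosen]

print(make_emails("John", "Doe", "domain.tld", "first.last"))  # ['john.doe@domain.tld']
print(make_emails("John", "Doe", "domain.tld"))  # all patterns when none/unknown given
```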

Finally, it will query the pwndb web API, available on the Deep Web, to retrieve emails and passwords linked to the given domain.

Full command

So basically, if you want to run all modules, you can use this command:

python3 -d domain.tld --dns (--sublist or --subrute) --scan (full or fast) --gather --harvest

All results will be written to files in this tree:

To-do list:

  • Code refactoring

    • Use classes
    • Put each function/module in a separate file
    • Review the existing code (improve performance)
  • Document gathering:

    • Search for sensitive files on GitHub (to finish)
  • Scanning function

  • DNS enumeration

    • Add FDNS database lookups
    • Googid: search for the Google Analytics ID to find other connected websites

  • Final:

    • Thread file downloads in order to speed up the download process
    • Get rid of the APIs (especially the WhatCMS API)
    • Add the possibility to use some modules with a list of domains (at least --sublist/--subrute and --harvest)
    • If the IP scope is already known, add the possibility to remove every domain/IP not in scope