A small Python project that automates gathering website profiling data from BuiltWith and Wappalyzer: tech stack information, technographic data, website reports, website tech lookups, website architecture lookups, and so on.
- 👁️ Data Privacy Activities
- Vendor Discovery for Websites
- Risk Management
- Data Privacy Read-Ahead Material for Privacy Assessments
- 🖥️ Cyber Security Activities
- Reconnaissance
- OSINT
- 🗺️ Other Discovery Activities
- Business Intelligence
- Marketing Activities
- Competition Analysis
[ domain , tech_profiler_tool_used , category , technology_name , description (if one exists) ]
All data is exported into the CSV file designated in the config file.
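As a rough illustration of that row layout, here is a minimal sketch that writes two rows with Python's standard `csv` module. The column names come from the format above; the `write_rows` helper, the header row, and the sample values are assumptions made for this example and are not taken from the project's code.

```python
import csv
import io

# Column order taken from the row format described above.
COLUMNS = [
    "domain",
    "tech_profiler_tool_used",
    "category",
    "technology_name",
    "description",
]


def write_rows(rows, out_file):
    """Write result rows (dicts keyed by COLUMNS) as CSV.

    Hypothetical helper: a row without a description gets an empty cell.
    """
    writer = csv.DictWriter(out_file, fieldnames=COLUMNS, restval="")
    writer.writeheader()  # assumption: the real export may not include a header
    for row in rows:
        writer.writerow(row)


# Illustrative values only, not real lookup results.
sample = [
    {
        "domain": "example.com",
        "tech_profiler_tool_used": "Wappalyzer",
        "category": "Web servers",
        "technology_name": "Nginx",
        "description": "Web server",
    },
    {
        "domain": "example.com",
        "tech_profiler_tool_used": "BuiltWith",
        "category": "Analytics",
        "technology_name": "Google Analytics",
    },
]

buffer = io.StringIO()
write_rows(sample, buffer)
print(buffer.getvalue())
```

Writing to an in-memory buffer keeps the sketch self-contained; in practice the target would be the CSV file named in the config file.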
- Contributions are welcome! 😁 Just fork my repo and make a pull request.
- Use Git or download this repo
- Git
- Open cmd or your terminal of choice
- cd to the folder you want to clone the repo into
git clone https://github.com/cybersader/WebsiteTechMiner-py.git
- Download
- Simply download this repo, as is.
- Python dependencies:
- Make sure you've cloned or downloaded the project
- cd into the project
- If you don't have Python, install it first; you need it to use pip: https://www.python.org/downloads/
pip install -r requirements.txt
- Create a temporary email address with https://temp-mail.org/en/
- No need to use your real email for short-term discovery projects.
- I'm not going to design any fraudulent solutions that automatically generate temporary accounts and emails.
- If you are trying to process a very large number of URLs, then please purchase plans from these tech lookup services.
- Create a Wappalyzer Account - https://www.wappalyzer.com/
- Go to https://www.wappalyzer.com/apikey/
- Create an API key and copy it into the WebTechMinerNG_setup.json file using a notepad or editor
- Make sure to put it in the quotes after wappalyzer-API-key
- Create a BuiltWith Account - https://builtwith.com/
- Go to https://api.builtwith.com/
- Create an API key and copy it into the WebTechMiner_setup.json file using a notepad or editor
- Make sure to put it in the quotes after builtwith-API-key
- BuiltWith API Credits are relatively cheap for what you get
- Go to https://builtwith.com/api-credits
- You can buy 2,000 API credits (2,000 tech lookups) for 💵 $99
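Before running any lookups, it can help to confirm that both keys actually made it into the setup file. The sketch below assumes the file is a flat JSON object with `wappalyzer-API-key` and `builtwith-API-key` at the top level; that layout, the `load_api_keys` helper, and the demo file name are assumptions for illustration, so adjust them to match the setup file in your copy of the project.

```python
import json
import tempfile
from pathlib import Path

# Key names as they appear in the setup steps above.
REQUIRED_KEYS = ("wappalyzer-API-key", "builtwith-API-key")


def load_api_keys(config_path):
    """Return both API keys from a JSON config, or raise if one is blank.

    Hypothetical helper: assumes a flat JSON object at the top level.
    """
    config = json.loads(Path(config_path).read_text(encoding="utf-8"))
    missing = [k for k in REQUIRED_KEYS if not str(config.get(k) or "").strip()]
    if missing:
        raise ValueError(
            f"Missing API key(s) in {config_path}: {', '.join(missing)}"
        )
    return {k: config[k] for k in REQUIRED_KEYS}


# Demo with a throwaway file and placeholder keys (not real credentials).
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "demo_setup.json"
    demo_path.write_text(
        json.dumps(
            {
                "wappalyzer-API-key": "YOUR_WAPPALYZER_KEY",
                "builtwith-API-key": "YOUR_BUILTWITH_KEY",
            }
        ),
        encoding="utf-8",
    )
    keys = load_api_keys(demo_path)
    print(sorted(keys))
```

Failing early on a blank key avoids spending time on a bulk run that can only query one of the two services.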
- -s, "single" (analyze a single domain)
- -b, "bulk" (analyze a list of domains using a CSV file)
- Put the domains into rows, columns, or a combination of the two in Excel; the layout doesn't matter.
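For readers curious how two such modes might be wired up, here is a minimal `argparse` sketch. Only the `-s` and `-b` flags come from this README; the `build_parser` function, the `dest` names, and the choice to make the two flags mutually exclusive are assumptions, and the project's own argument handling may differ.

```python
import argparse


def build_parser():
    """Build a parser with the two modes listed above (hypothetical sketch)."""
    parser = argparse.ArgumentParser(prog="WebsiteTechMiner.py")
    # Assumption: exactly one of the two modes is given per run.
    mode = parser.add_mutually_exclusive_group(required=True)
    mode.add_argument(
        "-s", dest="single", metavar="DOMAIN",
        help="analyze a single domain",
    )
    mode.add_argument(
        "-b", dest="bulk", metavar="CSV_FILE",
        help="analyze a list of domains from a CSV file",
    )
    return parser


# Parse an explicit argument list so the sketch runs without a real CLI call.
args = build_parser().parse_args(["-s", "example.com"])
print(args.single, args.bulk)
```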
python WebsiteTechMiner.py -s example.com
- Be careful running this:
- If you don't have a paid plan, you will quickly go over your limits.
- This is not recommended unless you have a high API credit limit with Wappalyzer and BuiltWith.
python WebsiteTechMiner.py -b example_website_list.csv
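Because the domains in the bulk CSV can sit in rows, columns, or a mix of both, one simple way to read such a file is to treat every non-empty cell as a domain. The sketch below shows that idea; the `read_domains` helper, the lowercasing, and the de-duplication are assumptions for illustration and not necessarily what the tool does internally.

```python
import csv
import io


def read_domains(csv_file):
    """Collect every non-empty cell from a CSV, in reading order.

    Hypothetical helper: cells are trimmed and lowercased, and repeated
    domains are kept only once so they do not cost extra API credits.
    """
    seen = set()
    domains = []
    for row in csv.reader(csv_file):
        for cell in row:
            domain = cell.strip().lower()
            if domain and domain not in seen:
                seen.add(domain)
                domains.append(domain)
    return domains


# Mixed layout: two domains in one row, one alone, one duplicate after a blank.
sample_csv = io.StringIO("example.com,example.org\nexample.net\n,Example.com\n")
print(read_domains(sample_csv))
# → ['example.com', 'example.org', 'example.net']
```

Counting the resulting list before starting a run gives a quick estimate of how many API credits the bulk lookup will use per service.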
- Stop WTM if you run out of API credits for all tools
- More detailed, higher-fidelity error messages
- Multiple API tokens in config file or some csv file
- More fields from APIs to csv
- Ability to use flags for fields
- Unlimited domains on command line
- http and https flags
- Default command with domains after
- Add throttling for when Wappalyzer and BuiltWith start dropping requests
- Recursive Subdomain discovery option
- Connected website discovery
- Risk Management
- Assumed PI discovery
- OneTrust Vendorpedia API
- Other Vendor Risk Management DBs & APIs
- Security Risk Score Attribution
- Other additional information to pull in from external sources
- Policies
- Available Data Processing Agreement links?
- Assumed PI discovery