forked from urbanadventurer/WhatWeb
Next generation web scanner. Identify what websites are running. [unstable development build]
gremwell/WhatWeb
WhatWeb - Next generation web scanner.
By urbanadventurer aka Andrew Horton from Security-Assessment.com
Version : 0.4.6, Unreleased Development Version
Homepage: http://www.morningstarsecurity.com/research/whatweb
License: GPLv2

Identify content management systems (CMS), blogging platforms, stats/analytics packages, JavaScript libraries, servers and more.

When you visit a website in your browser, the transaction includes many unseen hints about how the webserver is set up and what software is delivering the webpage. Some of these hints are obvious, e.g. "Powered by XYZ", and others are more subtle. WhatWeb recognises these cues and reports what it finds.

WhatWeb has over 500 plugins and needs community support to develop more. Plugins can identify systems whose obvious identifying hints have been removed by also looking for subtle clues. For example, a WordPress site might remove the tag <meta name="generator" content="WordPress 2.6.5">, but the WordPress plugin also looks for "wp-content", which is harder to disguise. Plugins are flexible and can return any datatype: version numbers, email addresses, account IDs and more.

There are both passive and aggressive plugins. Passive plugins use information on the page, in cookies and in the URL to identify the system. A passive request is as lightweight as a simple GET / HTTP/1.1 request. Aggressive plugins guess URLs and request more files.

Plugins are easy to write; you don't need to know Ruby to make them.

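The idea of falling back to subtle clues when the obvious hint has been removed can be sketched in plain Ruby. This is only an illustration and not WhatWeb's own matching code: the MATCHES list, the matching_names helper and the sample page body are all hypothetical.

```ruby
# Hypothetical match list in the spirit of a WordPress plugin.
# Neither the list nor the helper below is taken from WhatWeb itself.
MATCHES = [
  { :name => "meta generator tag", :regexp => /<meta name="generator" content="WordPress/ },
  { :name => "wp-content",         :text   => "wp-content" }
]

# Return the names of all matches that fire on the given page body.
# A :text match is a case sensitive substring test, a :regexp match
# is an ordinary regular expression test.
def matching_names(body, matches)
  hits = matches.select do |m|
    (m[:text] && body.include?(m[:text])) || (m[:regexp] && body =~ m[:regexp])
  end
  hits.map { |m| m[:name] }
end

# A page whose generator tag was removed but which still links into wp-content.
body = '<link rel="stylesheet" href="/wp-content/themes/example/style.css">'
p matching_names(body, MATCHES)
```

Only the "wp-content" match fires here, which is the situation described above: the obvious identifier is gone but the subtle one remains.
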
EXAMPLE USAGE

Using WhatWeb on a handful of websites. Standard WhatWeb output is in colour.

$ ./whatweb slashdot.org reddit.com digg.com http://www.engadget.com/ www.whitehouse.gov
http://reddit.com [302] Akamai-Global-Host, Country[US], HTTPServer[AkamaiGHost], IP[24.29.138.57], RedirectLocation[http://www.reddit.com/]
http://www.whitehouse.gov [200] Cookies[d], Country[US], Drupal, Google-Analytics[GA][10791350], HTTPServer[White House], IP[206.57.116.170], OpenSearch[http://www.whitehouse.gov/opensearch/apachesolr_search], Subdomains[www], Title[The White House]
http://digg.com [302] Apache, Cookies[d,imp_id,traffic_control], Country[US], HTTPServer[Apache], IP[64.191.203.30], PHP[5.2.9-digg8], RedirectLocation[/news], UncommonHeaders[x-digg-time,keep-alive], X-Powered-By[PHP/5.2.9-digg8]
http://slashdot.org [200] Apache[1.3.42], Country[US], Google-Analytics[GA][32013], HTTPServer[Apache/1.3.42 (Unix) mod_perl/1.31], IP[216.34.181.45], OpenGraphProtocol[100000696822412], PasswordField[upasswd], Subdomains[rss,books], Title[Slashdot - News for nerds, stuff that matters], UncommonHeaders[x-varnish,x-bender,x-xrds-location,slash_log_data], X-Powered-By[Slash 2.005001]
http://www.engadget.com/ [200] Apache[2.2], probably BlogSmithMedia, Cookies[GEO-64_251_29_84], Country[US], HTTPServer[Apache/2.2], IP[64.12.173.101], Mobile-Website[Apple iPhone], PasswordField[login-pw,newkey], Subdomains[www,mobile,hd,alt,es,chinese,cn,japanese,kr,de,pl], Title[Engadget], UncommonHeaders[keep-alive]
http://www.reddit.com/ [200] Cookies[reddit_first], Country[US], Google-Analytics[GA][12131688], HTTPServer['; DROP TABLE servertypes; --], IP[206.57.116.160], JQuery, PasswordField[passwd,passwd2], Subdomains[www,thumbs,pixel,blog], Title[reddit: the voice of the internet -- news before it happens]
http://digg.com/news [200] Apache, Cookies[d,imp_id,traffic_control], Country[US], HTML5, HTTPServer[Apache], IP[64.191.203.30], Mobile-Website[Apple iPhone], PHP[5.2.9-digg8], PasswordField[password], Subdomains[dads.new,about,developers,jobs], Title[Digg - All Topics - The Latest News Headlines, Videos and Images], UncommonHeaders[x-digg-time,keep-alive], X-Powered-By[PHP/5.2.9-digg8]

HELP

WhatWeb - Next generation web scanner.
Version 0.4.6 by Andrew Horton aka urbanadventurer from Security-Assessment.com
Homepage: http://www.morningstarsecurity.com/research/whatweb

Usage: whatweb [options] <URLs>

<URLs>                      Enter URLs or filenames. Use /dev/stdin to pipe HTML directly
--input-file=FILE, -i       Identify URLs found in FILE, eg. -i /dev/stdin
--aggression, -a
   1 passive - on-page
   2 polite - unimplemented
   3 impolite - guess URLs when plugin matches (smart, guess a few urls)
   4 aggressive - guess URLs for every plugin (guess a lot of urls like nikto)
--recursion, -r             Follow links recursively. Only follows links under the path (default: off)
--depth, -d                 Maximum recursion depth (default: 10)
--max-links, -m             Maximum number of links to follow on one page (default: 250)
--spider-skip-extensions    Redefine extensions to skip. (default: zip,gz,tar,jpg,exe,png,pdf)
--list-plugins, -l          List the plugins
--run-plugins, -p           Run comma delimited list of plugins. Default is all
--info-plugins, -I          Display information for all plugins. Optionally search with keywords in a comma delimited list.
--example-urls, -e          Add example urls for each plugin to the target list
--colour=[WHEN],
--color=[WHEN]              control whether colour is used. WHEN may be `never', `always', or `auto'
--log-full=FILE             Log verbose output
--log-brief=FILE            Log brief, one-line output
--log-xml=FILE              Log XML format
--log-json=FILE             Log JSON format
--log-json-verbose=FILE     Log JSON Verbose format
--log-errors=FILE           Log errors
--user-agent, -U            Identify as user-agent instead of WhatWeb/0.4.6.
--max-threads, -t           Number of simultaneous threads. Default is 25.
--no-redirect               Do not follow HTTP 3xx or meta-refresh redirects
--proxy <hostname[:port]>   Set proxy hostname and port (default: 8080)
--proxy-user <username:password> Set proxy user and password
--open-timeout              Time in seconds
--read-timeout              Time in seconds
--wait=SECONDS              Wait SECONDS between connections (combine with threads = 1).
--custom-plugin             Define a custom plugin called Custom, Examples:
                            ":text=>'powered by abc'"
                            ":regexp=>/powered[ ]?by ab[0-9]/"
                            ":ghdb=>'intitle:abc \"powered by abc\"'"
                            ":md5=>'8666257030b94d3bdb46e05945f60b42'"
                            "{:text=>'powered by abc'},{:regexp=>/abc [ ]?1/i}"
--url-prefix                Add a prefix to target URLs
--url-suffix                Add a suffix to target URLs
--url-pattern               Insert the targets into a URL. Requires --input-file, eg. www.example.com/%insert%/robots.txt
--help, -h                  This help
--verbose, -v               Increase verbosity, use twice for debugging.
--version                   Display version information.

VERBOSE OUTPUT

./whatweb -v www.morningstarsecurity.com
www.morningstarsecurity.com/ [200]
http://www.morningstarsecurity.com [200] Apache, Country[NZ], Google-API[ajax/libs/jquery/1.3.2/jquery.min.js ], Google-Analytics[GA][791888], HTTPServer[Apache], IP[210.48.71.202], JQuery[1.4.2], MetaGenerator[WordPress 3.0.1], Subdomains[www], Title[MorningStar Security], UncommonHeaders[x-pingback], WordPress[3.0.1], x-pingback[http://www.morningstarsecurity.com/xmlrpc.php,]
    Apache => HTTP Server Header
    Country => (string: NZ)
    Google-API => google javascript API (version: ajax/libs/jquery/1.3.2/jquery.min.js )
    Google-Analytics => pageTracker = ...UA-123-1231 (string: GA,accounts: 791888)
    HTTPServer => server string (string: Apache)
    IP => IP (string: 210.48.71.202)
    JQuery => script (version: 1.4.2)
    MetaGenerator => meta generator tag (string: WordPress 3.0.1)
    Subdomains => (version: www)
    Title => page title (string: MorningStar Security)
    UncommonHeaders => headers (string: x-pingback)
    WordPress => wp-content (certainty: 75), meta generator tag (version: 3.0.1), Relative /wp-content/ link
    x-pingback => (string: http://www.morningstarsecurity.com/xmlrpc.php)

LOG OUTPUT

There are currently 6 types of log output. They are:

--log-brief=FILE            Log brief, one-line output. Default output.
--log-full=FILE             Log verbose output (might be removed in future)
--log-xml=FILE              Log XML format
--log-json=FILE             Log JSON format
--log-json-verbose=FILE     Log JSON Verbose format
--log-errors=FILE           Log errors. This is usually printed to the screen in red.

You can output to multiple logs simultaneously by specifying multiple command line logging options.

Brief Logging

Example usage:

./whatweb --log-brief b.log digg.com
http://digg.com [302] Apache, Cookies[d,imp_id,traffic_control], Country[US], HTTPServer[Apache], IP[64.191.203.30], PHP[5.2.9-digg8], RedirectLocation[/news], UncommonHeaders[x-digg-time,keep-alive], X-Powered-By[PHP/5.2.9-digg8]
http://digg.com/news [200] Apache, Cookies[d,imp_id,traffic_control], Country[US], HTML5, HTTPServer[Apache], IP[64.191.203.30], Mobile-Website[Apple iPhone], PHP[5.2.9-digg8], PasswordField[password], Subdomains[dads.new,about,developers,jobs], Title[Digg - All Topics - The Latest News Headlines, Videos and Images], UncommonHeaders[x-digg-time,keep-alive], X-Powered-By[PHP/5.2.9-digg8]

The brief log has one connection per line and is searchable with grep.

XML Logging

The XML logging is currently naive and may change. Please contact me if you have suggestions.

Example usage:

./whatweb --log-xml x.log digg.com

Contents of x.log:

<?xml version="1.0"?><?xml-stylesheet type="text/xml" href="whatweb.xsl"?>
<log>
<target>
    <uri>http://digg.com</uri>
    <http-status>302</http-status>
    <plugin> <name>Apache</name> </plugin>
    <plugin> <name>Cookies</name> <string>d,imp_id,traffic_control</string> </plugin>
    <plugin> <name>Country</name> <string>US</string> </plugin>
    <plugin> <name>HTTPServer</name> <string>Apache</string> </plugin>
    <plugin> <name>IP</name> <string>64.191.203.30</string> </plugin>
    <plugin> <name>PHP</name> <version>5.2.9-digg8</version> </plugin>
    <plugin> <name>RedirectLocation</name> <string>/news</string> </plugin>
    <plugin> <name>UncommonHeaders</name> <string>x-digg-time,keep-alive</string> </plugin>
    <plugin> <name>X-Powered-By</name> <string>PHP/5.2.9-digg8</string> </plugin>
</target>
<target>
    <uri>http://digg.com/news</uri>
    <http-status>200</http-status>
    <plugin> <name>Apache</name> </plugin>
    <plugin> <name>Cookies</name> <string>d,imp_id,traffic_control</string> </plugin>
    <plugin> <name>Country</name> <string>US</string> </plugin>
    <plugin> <name>HTML5</name> </plugin>
    <plugin> <name>HTTPServer</name> <string>Apache</string> </plugin>
    <plugin> <name>IP</name> <string>64.191.203.30</string> </plugin>
    <plugin> <name>Mobile-Website</name> <version>Apple iPhone</version> </plugin>
    <plugin> <name>PHP</name> <version>5.2.9-digg8</version> </plugin>
    <plugin> <name>PasswordField</name> <string>password</string> </plugin>
    <plugin> <name>Subdomains</name> <version>about,dads.new,developers,jobs</version> </plugin>
    <plugin> <name>Title</name> <string>Digg - All Topics - The Latest News Headlines, Videos and Images</string> </plugin>
    <plugin> <name>UncommonHeaders</name> <string>x-digg-time,keep-alive</string> </plugin>
    <plugin> <name>X-Powered-By</name> <string>PHP/5.2.9-digg8</string> </plugin>
</target>
</log>

PLUGINS

Plugins are easy to make.
Matches are made with:
* Text strings (case sensitive)
* Regular expressions
* Google Hack Database queries (limited set of keywords)
* MD5 hashes
* URL recognition
* HTML tag patterns
* Custom ruby code for passive and aggressive operations
* and more...

To list the plugins supported:

$ ./whatweb -l
Plugins Loaded
------------------------------
1024-CMS,0.1
360-Web-Manager,0.2
4images,0.1
AChecker,0.1
ACollab,0.1
... (there are a lot)

To view more detail about a plugin or search plugins for a keyword:

./whatweb -I joomla
Plugin Information
Searching for joomla
------------------------------
Joomla version 0.6 by Andrew Horton
[14] examples, [3] matches, [x] aggressive, [x] passive, [x] version.
Description: Opensource CMS written in PHP. Homepage: http://joomla.org. Plugin can aggressively identify version by comparing md5 hashes of 4 files. Valid up to version 1.5.15.
--------------------------------------------------------------------------------

WRITING PLUGINS

To get started, modify plugin-template.rb.txt in the my-plugins folder. Also read the (out of date) tutorial on writing WhatWeb plugins at www.morningstarsecurity.com/downloads/How-to-develop-WhatWeb-plugins-1.1.txt.

A typical plugin looks like this:

    Plugin.define "Plone" do
    author "Andrew Horton"
    version "0.2"
    description "CMS http://plone.org"
    examples %w| www.norden.org www.trolltech.com www.plone.net www.smeal.psu.edu|

    matches [
    {:name=>"meta generator tag", :regexp=>/<meta name="generator" content="[^>]*http:\/\/plone.org" \/>/},
    {:name=>"plone css", :regexp=>/(@import url|text\/css)[^>]*portal_css\/.*plone.*css(\)|")/}, #"
    {:name=>"plone javascript", :regexp=>/src="[^"]*ploneScripts[0-9]+.js"/}, #"
    {:text=>'<div class="visualIcon contenttype-plone-site">'},
    {:name=>"div tag, visual-portal-wrapper", :certainty=>75, :text=>'<div id="visual-portal-wrapper">'},
    ]

    def passive
        m=[]
        #X-Caching-Rule-Id: plone-content-types
        #X-Cache-Rule: plone-content-types
        m << {:name=>"X-Caching-Rule-Id: plone-content-types" } if @meta["x-caching-rule-id"] =~ /plone-content-types/i
        m << {:name=>"X-Cache-Rule: plone-content-types" } if @meta["x-cache-rule"] =~ /plone-content-types/i
        m
    end

    end

There are 3 levels to a plugin: simple matches, passive tests and aggressive tests. You don't need to know Ruby to write plugins with simple matches. Passive and aggressive tests are written in Ruby.

The matches [] array contains a set of ways to match a website to a system. The methods are:
* :text        Matches text within the webpage
* :regexp      A regular expression. Append i for case insensitive matches, eg. /foobar/i
* :ghdb        Like a google query
* :md5         MD5 hash of the HTML page
* :tagpattern  Pattern of HTML tags
* :name        This is the name of the match, and is optional.
* :url         The URL has to match this. Used for passive and aggressive testing
* :certainty   Optional, defaults to 100. Values are maybe (25), probably (75) and certain (100).
* :version     As a string or number this is a version returned when other methods match
* :version     As a regular expression, this extracts the version information from the HTML. Also requires :regexp_offset=>1
* :model       As a string or number this is a device model returned when other methods match
* :model       As a regular expression, this extracts the device model information from the HTML. Also requires :regexp_offset=>1
* :module      As a string or number this is a web application module returned when other methods match
* :module      As a regular expression, this extracts the web application module information from the HTML. Also requires :regexp_offset=>1
* :firmware    As a string or number this is a device firmware version returned when other methods match
* :firmware    As a regular expression, this extracts the device firmware version information from the HTML. Also requires :regexp_offset=>1
* :filepath    As a string or number this is a file path returned when other methods match
* :filepath    As a regular expression, this extracts the file path information from the HTML. Also requires :regexp_offset=>1
* :account     As a string or number this is a user credential returned when other methods match
* :account     As a regular expression, this extracts the user credential information from the HTML. Also requires :regexp_offset=>1

If you port a GHDB match, use :ghdb. I usually rewrite the GHDB matches with regular expressions, especially if they require inurl:

Example:
    # http://johnny.ihackstuff.com/ghdb?function=detail&id=1840
    { :ghdb=>'"Powered by Vsns Lemon" intitle:"Vsns Lemon"' }

Note that GHDB queries are case insensitive, as a Google query is. Supported GHDB codes are:
* intitle:
* inurl:
* filetype:

Each plugin can access the @body, @meta, @status, @base_uri, @md5 and @tagpattern variables.

Passive tests add matches to the m array. Each match is a hash containing the name of the match, the certainty and more. The entire hash is returned with Full output; Brief output returns just the match, :version and :string.

To discover the regular expressions to match against, wget about 20-30 examples into the tests/ folder.
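
The regular expression form of :version described above can be illustrated with a small standalone Ruby sketch. It is not WhatWeb's own code: extract_version is a hypothetical helper, and it assumes the offset picks a capture group by its Ruby MatchData index (1 is the first group). How WhatWeb itself interprets :regexp_offset should be checked against the plugin template.

```ruby
# Hypothetical helper, not part of WhatWeb: apply a version regexp to a page
# body and return the capture group selected by offset, or nil if the regexp
# does not match. Assumption: offset is a MatchData index (1 = first group).
def extract_version(body, regexp, offset)
  match = regexp.match(body)
  return nil if match.nil?
  match[offset]
end

body = '<meta name="generator" content="WordPress 3.0.1">'
version_regexp = /<meta name="generator" content="WordPress ([0-9.]+)"/

puts extract_version(body, version_regexp, 1)
```

Testing a candidate regexp this way against the pages saved in tests/ is a quick way to see how well it copes with variations between versions.
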
Be aware that some software can have dramatic variations between versions.

AGGRESSIVE PLUGINS

There are currently aggressive plugins for Joomla, phpBB, FluxBB, OSCommerce and Tomcat.

With the passive plugin we know that smartor.is-root.com/forum/ is running phpBB version 2

$ ./whatweb smartor.is-root.com/forum/
http://smartor.is-root.com/forum/ [200] Apache[2.2.15], Cookies[phpbb2mysql_data,phpbb2mysql_sid], Country[DE], HTTPServer[Apache/2.2.15], IP[88.198.177.36], PHP[5.2.13], PasswordField[password], PoweredBy[phpBB], Title[Smartors Mods Forums - Reloaded], X-Powered-By[PHP/5.2.13], phpBB[2]

With the aggressive plugin we know that smartor.is-root.com/forum/ is running phpBB version 2.0.20

$ ./whatweb -a 3 smartor.is-root.com/forum/
http://smartor.is-root.com/forum/ [200] Apache[2.2.15], Cookies[phpbb2mysql_data,phpbb2mysql_sid], Country[DE], HTTPServer[Apache/2.2.15], IP[88.198.177.36], PHP[5.2.13], PasswordField[password], PoweredBy[phpBB], Title[Smartors Mods Forums - Reloaded], X-Powered-By[PHP/5.2.13], phpBB[2,2.0.20]

Do not use aggressive plugins with recursive site crawling. WhatWeb has no understanding of a website as a whole; it currently treats each URL separately. It also has no caching, so if you use aggressive plugins with recursion you will fetch the same files multiple times. The same is true for aggressive modes on redirecting URLs.

RECURSIVE SPIDER

The recursion option is used to scan some or all of a website with WhatWeb. Recursive spidering follows each link on a webpage if it is within the same website, then repeats the process on the followed pages.

The configurable settings for recursive spidering are:

--recursion, -r     Follow links recursively. Only follows links under the path (default: off)
--depth, -d         Maximum recursion depth (default: 10)
--max-links, -m     Maximum number of links to follow on one page (default: 250)

The spider skips this default list of file extensions, which can be redefined with --spider-skip-extensions:
/\.zip$/,/\.gz$/,/\.tar$/,/\.jpg$/,/\.exe$/,/\.png$/,/\.pdf$/

Limitations of the spidering

The spider follows links in <a> tags, the HTML tags designed specifically for links. It does not obtain URLs from other sources. Some good choices for future improvement are image tags, eg. <img src="/images/boats.jpg">, form tags, eg. <form action="/vote.php">, url paths in CSS files, etc.

The spider is provided by Anemone, a third party ruby gem. It doesn't follow redirects. For example the URL treshna.com will fail and www.treshna.com will produce results.

KNOWN ISSUES OR BUGS

* The web spider, Anemone, doesn't follow 302 redirects
* Don't use aggressive modes with website spidering as it will repeat the tests and URL requests.
* :text matches are case sensitive
* :regexp matches are case sensitive, you can make them case insensitive by appending i, eg. /foobar/i
* The MD5 hashing algorithm is used not for its security but for its speed as a hashing algorithm, especially compared to SHA1
* Timeout settings do not apply to the Anemone spider
* Meta refresh redirects do not apply to the Anemone spider

RELATED PROJECTS

WhatWeb is unique, however there are some other projects with the same goal of identifying a website.

www.whatweb.net
Brendan Coles has set up a web frontend to WhatWeb. No registration required.
http://www.whatweb.net/

Blind Elephant
The BlindElephant Web Application Fingerprinter attempts to discover the version of a (known) web application by comparing static files at known locations against precomputed hashes for versions of those files in all available releases. The technique is fast, low-bandwidth, non-invasive, generic, and highly automatable.
http://blindelephant.sourceforge.net/

WAFP - Web Application Finger Printing
Wafp identifies systems by requesting a large quantity of URLs and comparing md5 sums of the results against a database. This method is reliable for known systems in the database and it is simple to add new ones. Unlike WhatWeb, this method is intrusive and will create a lot of webserver log entries.
http://www.mytty.org/wafp

Wappalyzer
This is the most similar project to WhatWeb. Firefox plugin identifies sites using 1 regexp each. Only looks for obvious identifiers like meta generator tags. Sends all recognized urls to a DB. Has nice icons
https://addons.mozilla.org/en-US/firefox/addon/10229

w3af
http://w3af.sourceforge.net
Very slight overlap of features in the grep and discovery scripts section.

HTTPRecon
No feature overlap, fingerprints the HTTP Server
http://w3dt.net/tools/httprecon/
http://www.net-square.com/httprint/httprint_paper.html
http://www.darknet.org.uk/2007/09/httprint-v301-web-server-fingerprinting-tool-download/

Nmap version scan
Nmap shows some info about HTTP servers when using version scan, eg. nmap -sV -p80 treshna.com

Shodan Computer Search Engine
ShodanHQ returns the HTTP header, country and model for some devices. Full results require registration.

THC's Amap
This tool is an application fingerprint scanner which can identify an HTTP protocol server. It doesn't identify types of HTTP servers.

What's that web server running 1.0 (whatweb.exe)
This shares the same name and goal but is shit. It ONLY uses the HTTP Server string. For example 'Apache/2.0.55 (Ubuntu) PHP/5.1.2'
http://www.spambutcher.com/whatweb.html

http-stats.com
Lots of info about HTTP server names

Builtwith.com
Stats of popularity of web stuff.

FUNNY & UNUSUAL

Slashdot.org
X-Fry: You mean Bender is the evil Bender? I'm shocked! Shocked! Well not that shocked.

popurls.com
X-popurls-a: in the future every url will be popular for 1.5 seconds

reddit.com
HTTPServer:'; DROP TABLE servertypes; --

NOTES

Version 0.3     Released at Kiwicon III (kiwicon.org), 2009.
Version 0.4     Released March 14th, 2010
Version 0.4.1   Released April 28th, 2010
Version 0.4.2   Released April 30th, 2010
Version 0.4.3   Released May 24th, 2010
Version 0.4.4   Released June 29th, 2010
Version 0.4.5   Released August 17th, 2010
Version 0.4.6   Unreleased Development Version

CREDITS

Written by urbanadventurer aka Andrew Horton from Security-Assessment.com
Homepage: http://www.morningstarsecurity.com/research/whatweb
License: GPLv2

Anemone library (used for spidering) is written by Chris Kite
Homepage: http://anemone.rubyforge.org/
License: MIT

COMMUNITY PLUGINS

Thank you to the following people who have contributed plugins to WhatWeb.

Brendan Coles (Major contributor!)
Emilio Casbas
Louis Nyffenegger
Patrik Wallström
Caleb Anderson
Tonmoy Saikia
Aung Khant
Erik Inge Bolsø
nk@dsigned.gr

Michal Ambroz for writing the Makefile and Man pages, not plugins.