Skip to content
master
Go to file
Code

Latest commit

 

Git stats

Files

Permalink
Failed to load latest commit information.
Type
Name
Latest commit message
Commit time
 
 
 
 
 
 
 
 
 
 

README.md

wappalyzer-python -- UNMAINTAINED

pypi badge

Python wrapper for Wappalizer (utility that uncovers the technologies used on websites)

Warning: this package is not maintained anymore.

Scrapinghub and Javier Casas, the original author, have no plans to support wappalyzer-python in the foreseeable future (this includes fixing bugs, supporting upgraded dependencies like PyV8 etc.)

If you are interested in continuing the work, please get in touch via opensource@scrapinghub.com so that we can discuss transferring ownership of this repository.

How to use it

>>> from wappalyzer import Wappalyzer
>>> w = Wappalyzer()

>>> w.analyze('http://wikipedia.org')
{u'Apache': {u'confidence': 100, u'version': u'', u'categories': [u'web-servers']},
u'Varnish': {u'confidence': 100, u'version': u'', u'categories': [u'cache-tools']}}

>>> w.analyze('http://tripadvisor.com')
{u'Apache': {u'confidence': 100, u'version': u'', u'categories': [u'web-servers']},
u'Google Analytics': {u'confidence': 100, u'version': u'', u'categories': [u'analytics']},
u'comScore': {u'confidence': 100, u'version': u'', u'categories': [u'analytics']}}

>>> w.analyze('http://facebook.com')
{u'reCAPTCHA': {u'confidence': 100, u'version': u'', u'categories': [u'captchas']}}

You can specify the User-Agent to use:

>>> w.analyze('http://www.google.com', user_agent='your_user_agent')

Or analyze from already downloaded pages (in this case you'll need to have the url and response headers too):

>>> w.analyze_from_data(url=the_url, html=the_html, headers=the_response_headers)

Apps and Categories are available as dict objects:

>>> w.apps
{u'Google Wallet': {u'website': u'wallet.google.com', u'cats': [41], u'script': [u'checkout\\.google\\.com',
u'wallet\\.google\\.com']}, u'Lockerz Share': ...}

>>> w.categories
{u'42': u'tag-managers', u'48': u'network-storage', u'43': u'paywalls', u'49': u'feed-readers', u'24':
u'rich-text-editors', u'25': u'javascript-graphics', u'26': u'mobile-frameworks', ...}

Data can be also updated with the latest version available from the Wappalyzer Github repo:

>>> from wappalyzer import updater
>>> updater.update_all()

By default app icons will be updated to the data/icons folder, in case you need them somewhere else you can specify the destination folder:

>>> from wappalyzer import updater
>>> updater.update_all(icons_folder='your_icons_folder')

Or update them individually:

>>> updater.update_icons(icons_folder='your_icons_folder')

Requirements

Note for macos users: If you have problems installing PyV8 you can use PyV8-OS-X:

pip install -e git://github.com/brokenseal/PyV8-OS-X#egg=pyv8

Install

Using setup:

python setup.py install

Using pypi:

pip install wappalyzer-python

About

UNMAINTAINED Python wrapper for Wappalyzer (utility that uncovers the technologies used on websites)

Resources

Releases

No releases published

Packages

No packages published

Languages

You can’t perform that action at this time.