Jabba's headless webkit browser for scraping AJAX-powered webpages.
Python
Latest commit 40b39f6 Oct 23, 2014 @jabbalaci modernized for Python 3
.gitignore initial commit Dec 27, 2012
README.md modernized for Python 3 Oct 23, 2014
jabba_webkit.py modernized for Python 3 Oct 23, 2014

README.md

Jabba-Webkit

Jabba's headless webkit browser for scraping AJAX-powered webpages.

Usage:

jabba_webkit.py <url> [<time>]

url: the page whose source you want to get

time: optional; the application quits after this many seconds

If the webpage is AJAX-powered and keeps updating itself after it loads, you can tell the browser to wait the given number of seconds. It then fetches the generated HTML source.
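The command line above can also be driven from another Python program. The following is only a sketch: `build_command` and `fetch_source` are hypothetical helper names, the script is assumed to sit in the current directory, and it is assumed (the README does not say so explicitly) that the script writes the HTML source to standard output.

```python
import subprocess
import sys


def build_command(url, wait_seconds=None, script="jabba_webkit.py"):
    """Build the argument list for `jabba_webkit.py <url> [<time>]`.

    `script` is an assumed path; adjust it to where the file really lives.
    """
    cmd = [sys.executable, script, url]
    if wait_seconds is not None:
        # The optional second argument is the waiting time in seconds.
        cmd.append(str(wait_seconds))
    return cmd


def fetch_source(url, wait_seconds=None, script="jabba_webkit.py"):
    """Run the script and return whatever it prints.

    Assumption: the generated HTML is written to standard output.
    """
    result = subprocess.run(
        build_command(url, wait_seconds, script),
        capture_output=True,  # requires Python 3.7 or newer
        text=True,
        check=True,           # raise CalledProcessError on a non-zero exit
    )
    return result.stdout
```

For example, `fetch_source("http://example.com", 5)` would run the script with a five-second wait, provided the script and its dependencies are installed.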

You can also use it as a library:

>>> import jabba_webkit as jw
>>> html1 = jw.get_page(url1, time1)
>>> html2 = jw.get_page(url2)    # yes, you can call it several times
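Once the generated source is available, it can be processed like any other HTML. The sketch below collects the links on a page using only the standard library. `LinkCollector`, `extract_links` and `links_on_page` are hypothetical names, not part of jabba_webkit, and it is assumed that `get_page` returns the page source as text (bytes are decoded as UTF-8 just in case).

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag that is fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html):
    """Return the href values found in `html`, in document order."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


def links_on_page(url, wait_seconds=None, get_page=None):
    """Fetch `url` and return its links.

    `get_page` defaults to jabba_webkit.get_page; passing another callable
    makes the function usable without a browser (for example in tests).
    """
    if get_page is None:
        import jabba_webkit as jw  # imported lazily; must be importable
        get_page = jw.get_page
    html = get_page(url) if wait_seconds is None else get_page(url, wait_seconds)
    if isinstance(html, bytes):
        # Assumption: UTF-8 is a reasonable fallback encoding.
        html = html.decode("utf-8", errors="replace")
    return extract_links(html)
```

The lazy import keeps the parsing helpers usable on machines where the browser module and its dependencies are not installed.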