js-crawler

A short and simple web crawler written in Python that uses WebKit and executes JavaScript.

How to use

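from crawler import Crawler  # assumption: Crawler is defined in this repository's crawler.py

crawler = Crawler(
    gui=True,  # show the browser window to see the crawler in action
    # Follow every link containing "download" in the URL
    is_link_interesting=lambda url, text: 'download' in url,
)
crawler.crawl('http://firefox.com')
crawler.close()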
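
The `is_link_interesting` argument in the example above is a callable that receives a link's URL and its text. Below is a minimal sketch of a named predicate that could stand in for the lambda, assuming (as the lambda suggests) that it is called with two strings and that a truthy return value means "follow this link". The names `is_download_link` and `KEYWORDS` are hypothetical and not part of js-crawler.

```python
# Hypothetical keyword list, chosen only for illustration.
KEYWORDS = ('download', 'install')


def is_download_link(url, text):
    """Return True if the URL or the link text mentions any keyword."""
    # Treat a missing link text as empty and compare case-insensitively.
    haystack = (url + ' ' + (text or '')).lower()
    return any(keyword in haystack for keyword in KEYWORDS)


# Quick check of the predicate on its own, without starting a crawler.
print(is_download_link('http://example.com/Download/', 'Get it'))
print(is_download_link('http://example.com/about', 'About us'))
```

Under the same assumption, it could be passed as `Crawler(is_link_interesting=is_download_link)` in place of the lambda. A named function is easier to test in isolation and can inspect the link text as well as the URL.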