
A lightweight Python library that uses WebKit to enable easy scraping of dynamic, JavaScript-heavy web pages

Overview

Author: Niklas Baumstark

dryscrape is a lightweight web scraping library for Python. It uses a headless WebKit instance to evaluate JavaScript on the pages it visits. This enables painless scraping of plain web pages as well as JavaScript-heavy “Web 2.0” applications like Facebook.

It stands on the shoulders of capybara-webkit's webkit-server. Big thanks go to thoughtbot, inc. for building this excellent piece of software!

Installation, Usage, API Docs

Documentation can be found at dryscrape's ReadTheDocs page.

Contact, Bugs, Contributions

If you have any problems with this software, don't hesitate to open an
issue on GitHub, open a pull request, or write a mail to niklas baumstark at Gmail.
