hollingsworthd released this Dec 8, 2014 · 115 commits to master since this release

automatic, zero-config web scraping

v1.1.0 Release Notes:

  • Fix URL canonicalization
  • Handle onclick events when fetching pages
  • Recover from Firefox crashes
  • More robust handling of HTTP GET requests
  • Remove unused dependencies and Firefox addons
  • Support for sending custom HTTP headers with each request
  • Ability to specify browser config options
  • Support for very large result sets
  • Options to specify result patterns
  • Default config options for proxies and instance IPs
  • Ability to disable request throttling
  • Recursive/chained queries
  • Option to turn off result extraction (return only the HTML)
  • Better support for fetching pages from public web caches