Felipe Lima edited this page Mar 5, 2015 · 23 revisions

version 2.4.0

version 2.3.0

version 2.2.1

  • Added the ability to crawl a prefetched Mechanize page (thanks to @dsjbirch)

version 2.1.2

  • Added support for hash-based property selectors (e.g. css: 'header' instead of 'css=.header')
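
To make the two selector forms concrete, here is a minimal sketch of how a string selector and a hash selector can be reduced to the same type/query pair. The helper name normalize_selector is hypothetical and is not part of Wombat; it only illustrates that both notations carry the same information.

```ruby
# Hypothetical helper (not part of Wombat): turns either selector form
# into a [type, query] pair.
def normalize_selector(selector)
  case selector
  when Hash
    # Hash form, e.g. { css: ".header" }: the first key/value pair is used.
    type, query = selector.first
    [type.to_sym, query]
  when String
    # String form, e.g. "css=.header": split on the first "=" only, so
    # queries that themselves contain "=" are left intact.
    type, query = selector.split("=", 2)
    [type.to_sym, query]
  else
    raise ArgumentError, "unsupported selector: #{selector.inspect}"
  end
end

from_string = normalize_selector("css=.header")
from_hash   = normalize_selector(css: ".header")
puts from_string == from_hash
```

Splitting on the first "=" only matters for XPath queries such as "xpath=//a[@id='x']", where the query itself contains an equals sign.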

version 2.1.1

  • Updated gem dependencies

version 2.1.0

version 2.0.1

  • Added proxy settings configuration (thanks to @phortx)
  • Fixed minor bug in HTML property locator

version 2.0.0

This version contains breaking changes and is not backwards compatible. Most notably, for_each is now specified through the :iterator option, and nested blocks no longer take parameters.

  • Added syntactic sugar methods Wombat.scrape and Crawler#scrape, which are aliases for their respective crawl methods;
  • Gem internals were heavily refactored to remove code duplication;
  • DSL syntax simplified for nested properties. Now the nested block takes no arguments;
  • DSL syntax changed for iterated properties. Iterators can now be named just like other properties and won't be automatically named as iterator#{i} anymore. Specified through the :iterator option;
  • Crawler#list_page is now called Crawler#path;
  • Added new :follow property type that crawls links in pages.
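
The 2.0.0 DSL changes listed above can be sketched roughly as follows. The property names (links, repos, github and so on), the URL and the selectors are illustrative assumptions, not taken from this changelog. The definitions are held in a proc so that the snippet parses and runs without the gem or network access; passing the proc to Wombat.scrape is shown only as a comment.

```ruby
# Property definitions in the 2.0.0 style. All names and selectors below
# are illustrative assumptions.
definition = proc do
  base_url "https://www.github.com"
  path "/explore" # Crawler#list_page is now Crawler#path

  # Nested property: the block takes no arguments.
  links do
    features "css=.nav-item-opensource"
  end

  # Iterated property: named like any other property and marked with
  # the :iterator option (previously for_each).
  repos "css=ol.ranked-repositories>li", :iterator do
    repo "css=h3"
    description "css=p.description"
  end

  # :follow property: crawls the matched links and evaluates the nested
  # properties on each linked page.
  github 'xpath=//ul[@class="footer_nav"][1]//a', :follow do
    heading "css=h1"
  end
end

# With the wombat gem installed and network access available:
#   require 'wombat'
#   results = Wombat.scrape(&definition)
```

Because the proc body is only evaluated once it is handed to Wombat, none of the DSL calls run until the commented-out lines are enabled.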

version 1.0.0

  • Breaking change: Metadata#format renamed to Metadata#document_format due to method name clash with Kernel#format
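
The changelog does not say how the clash showed up in Wombat. As a general illustration only, the sketch below shows one way a method named format can collide with Kernel#format in a DSL that relies on method_missing. PropertyBag is a hypothetical class, not Wombat's Metadata.

```ruby
# Hypothetical stand-in for a DSL object that records unknown method
# calls as properties. Not Wombat's actual Metadata class.
class PropertyBag
  attr_reader :props

  def initialize
    @props = {}
  end

  # Any call to an undefined method is stored as a property.
  def method_missing(name, *args)
    @props[name] = args.first
  end

  def respond_to_missing?(_name, _include_private = false)
    true
  end
end

bag = PropertyBag.new
bag.instance_eval do
  title "Example"          # undefined, so method_missing records it
  format "html"            # resolves to Kernel#format; never recorded
  document_format "html"   # undefined, so method_missing records it
end

# Only :title and :document_format were captured.
p bag.props.keys
```

Since Kernel#format is available on every object, a call to format with an implicit receiver never reaches method_missing, which is why a distinct name such as document_format avoids the problem.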

version 0.5.0

version 0.4.0

  • Added the utility method Wombat.crawl, which removes the need for a Ruby class instance to use Wombat. You can now call Wombat.crawl directly; the class-based format still works as before.

version 0.3.1

  • Added the ability to pass a block to Crawler#crawl that overrides the default crawler properties for a one-off run (thanks to @danielnc)