Forager is a RubyGem that provides a library for scraping content from websites. The library emphasizes the following:
- To be very easy to maintain and change
- To provide an excellent tracking system that makes it easy to make changes when problems occur
- To scale easily to thousands of websites

Planned improvements:
- Move rules out of the main project
- Improve the tracker for managing broken page links
- Refactor Extractor so that it extracts values in a more robust way
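
The items about keeping rules outside the main project and extracting values more robustly can be illustrated with a minimal sketch. This is not forager's actual API: `sample_rules`, `extract_fields`, the rule layout (host mapped to field-name/regex pairs), and the `www.example.com` host are all assumptions made for illustration. The idea is that per-site rules are plain data, which could be loaded from a separate file, and that a rule which fails to match yields `nil` instead of raising.

```ruby
require "uri"

# Hypothetical rule table: host => { field name => regex with one capture group }.
# Being plain data, it could be stored outside the main project (e.g. in a
# separate file or gem) and loaded at runtime.
def sample_rules
  {
    "www.example.com" => {
      title: %r{<h1[^>]*>(.*?)</h1>}m,
      price: %r{<span class="price">(.*?)</span>}m
    }
  }
end

# Hypothetical extractor: applies every rule registered for the URL's host.
# Returns nil when the host has no rules; a field whose rule does not match
# is set to nil, so one broken rule does not abort the whole page.
def extract_fields(url, html, rules)
  host = URI.parse(url).host
  site_rules = rules[host]
  return nil if site_rules.nil?

  site_rules.each_with_object({}) do |(field, pattern), result|
    match = pattern.match(html)
    result[field] = match ? match[1].strip : nil
  end
end
```

With these assumed rules, `extract_fields("http://www.example.com/book", "<h1> The Ruby Way </h1>", sample_rules)` returns a hash whose `:title` is `"The Ruby Way"` and whose `:price` is `nil`. The `nil` fields are the kind of signal a tracker could record as broken rules.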

To use forager from the command line, try:

    forager ["uri"|<help>|<demo>|<test>|<testlive>]

e.g.

    forager help
    forager demo
    forager test      # this will test local HTML pages
    forager testlive  # this will run the tests against the live (test) URLs
    forager "http://www.amazon.co.uk/Ruby-Way-Programming-Addison-Wesley-Professional/dp/0672328844/ref=sr_1_1?ie=UTF8&s=books&qid=1265974068&sr=1-1"

- FIX (list of requirements)

To install the gem:

    sudo gem install forager

Copyright © June 2009 B. F. B. Emson