refactor the crawler
clean up require paths
* use require_relative instead of needing to grab the directory of the current file
* use the full paths of files for autoloading, so that the files from the installed gem aren't loaded accidentally during development
* spec_helper can simply be `require`d, since the spec directory is added to the load path by rspec
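A minimal sketch of the first two points, not the gem's actual code. The `Demo` module, the `DEMO_LIB_DIR` constant and the temporary directory layout are hypothetical and exist only so the snippet runs on its own. It registers an autoload with an absolute path and uses `require_relative` inside the loaded file.

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical lib directory, created on the fly so the sketch is standalone.
DEMO_LIB_DIR = Dir.mktmpdir('autoload_demo')
FileUtils.mkdir_p(File.join(DEMO_LIB_DIR, 'demo'))

File.write(File.join(DEMO_LIB_DIR, 'demo', 'version.rb'), <<~RUBY)
  module Demo
    VERSION = '0.0.0'
  end
RUBY

File.write(File.join(DEMO_LIB_DIR, 'demo', 'base.rb'), <<~RUBY)
  # Resolved relative to this file, not the load path or working directory.
  require_relative 'version'

  module Demo
    class Base
      def self.source
        __FILE__
      end
    end
  end
RUBY

module Demo
end

# An absolute path means the load path is never searched, so a copy of
# demo/base.rb from an installed gem cannot be picked up by accident.
Demo.autoload(:Base, File.join(DEMO_LIB_DIR, 'demo', 'base.rb'))

puts Demo::Base.source # first reference triggers the autoload
puts Demo::VERSION
```

With a bare name such as `'demo/base'`, the autoload would instead search the load path, which is where an installed copy of the gem could take precedence over the working tree.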
add --include flag
Ruby 1.9/2.0 support
The option --local (short -a) should crawl only URLs which start with the given starting point URL. All other links should be skipped.
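The intended filtering could look roughly like the sketch below. `local_url?` and the example URLs are hypothetical names chosen for illustration, and a plain string prefix check is assumed.

```ruby
# Hypothetical helper: true when the link lies under the starting point URL.
def local_url?(url, start_url)
  url.start_with?(start_url)
end

start_url = 'http://example.com/docs/'
links = [
  'http://example.com/docs/intro.html',
  'http://example.com/blog/post.html',
  'http://other.example.org/docs/'
]

# Keep only links under the starting point; everything else is skipped.
to_crawl = links.select { |link| local_url?(link, start_url) }
puts to_crawl
```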
Added Skip and iSkip options
Add --logfile option.
rawler/base: Properly handle circular redirections.
A redirecting URL must be recorded as parsed before the redirect is followed, in order to prevent possible infinite loops.
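A minimal sketch of the idea, with no real HTTP involved. The `resolve` method and the `REDIRECTS` table are hypothetical stand-ins for the crawler's request logic and the server's responses.

```ruby
require 'set'

# Hypothetical redirect table: '/a' and '/b' redirect to each other.
REDIRECTS = { '/a' => '/b', '/b' => '/a', '/old' => '/new' }.freeze

# Returns the final URL, or nil when a circular redirection is detected.
def resolve(url, redirects, parsed = Set.new)
  return nil if parsed.include?(url) # seen before: circular, stop here

  parsed << url # record as parsed BEFORE following the redirect
  target = redirects[url]
  target ? resolve(target, redirects, parsed) : url
end

p resolve('/old', REDIRECTS)
p resolve('/a', REDIRECTS)
```

If the URL were only recorded after the redirect had been followed, the `'/a'` and `'/b'` pair would recurse forever.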
Allows specifying which logfile to write to. Assigning a custom logfile implies turning on logging. (The logfile could perhaps have been assigned via --log=filename, but trollop doesn't support a boolean flag with an optional string argument, so a separate option is used instead.)
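The "custom logfile implies logging" behaviour can be sketched with the standard library's OptionParser, used here in place of trollop. `parse_options` and the default file name `crawl.log` are assumptions for illustration only.

```ruby
require 'optparse'

# Hypothetical option parsing; the default logfile name is an assumption.
def parse_options(argv)
  options = { log: false, logfile: 'crawl.log' }

  parser = OptionParser.new do |opts|
    opts.on('--log', 'Turn on logging') { options[:log] = true }
    opts.on('--logfile FILE', 'Log to FILE (implies --log)') do |file|
      options[:logfile] = file
      options[:log] = true # a custom logfile turns logging on
    end
  end

  parser.parse!(argv.dup) # work on a copy so the caller's array is untouched
  options
end

p parse_options(['--logfile', 'out.txt'])
```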