Anemone web-spider framework
This branch is 1 commit ahead, 157 commits behind chriskite:master.


Anemone

DESCRIPTION

Anemone is a web spider framework that can spider a domain and collect useful information about the pages it visits. It is versatile, allowing you to write your own specialized spider tasks quickly and easily.

FEATURES

  • Multi-threaded design for high performance

  • Tracks 301 HTTP redirects to understand a page's aliases

  • Built-in breadth-first search (BFS) algorithm for determining page depth

  • Allows exclusion of URLs based on regular expressions
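
To illustrate how breadth-first search yields page depth and how regular expressions can exclude URLs, here is a minimal sketch in plain Ruby. It does not use Anemone's API or fetch anything over the network: the LINKS hash, the page_depths method, and the skip: option are hypothetical names made up for this example, and the link graph is hard-coded.

```ruby
# Hypothetical in-memory link graph standing in for fetched pages:
# each key is a URL, each value lists the URLs that page links to.
LINKS = {
  'http://example.com/'            => ['http://example.com/about', 'http://example.com/blog'],
  'http://example.com/about'       => ['http://example.com/', 'http://example.com/team'],
  'http://example.com/blog'        => ['http://example.com/blog/post-1', 'http://example.com/logout'],
  'http://example.com/team'        => [],
  'http://example.com/blog/post-1' => ['http://example.com/team'],
  'http://example.com/logout'      => []
}.freeze

# Breadth-first traversal from +start+. Returns a hash of url => depth,
# where depth is the smallest number of links followed to reach the URL.
# URLs matching any regular expression in +skip+ are never queued.
def page_depths(links, start, skip: [])
  depths = { start => 0 }
  queue = [start]
  until queue.empty?
    url = queue.shift
    links.fetch(url, []).each do |target|
      next if depths.key?(target)                    # already reached at a shallower or equal depth
      next if skip.any? { |pattern| pattern =~ target } # excluded by a pattern
      depths[target] = depths[url] + 1
      queue << target
    end
  end
  depths
end

depths = page_depths(LINKS, 'http://example.com/', skip: [/logout/])
depths.each { |url, depth| puts "#{depth} #{url}" }
```

Because the queue is processed in first-in, first-out order, each URL is first seen along a shortest link path from the start page, so the recorded depth is minimal. A real spider would populate the graph by fetching and parsing pages rather than reading a constant.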

REQUIREMENTS

  • nokogiri

EXAMPLES

See the bin directory for several examples of useful Anemone tasks.
