basil(isk): a front-end for the anemone web crawler.

basilisk

A command-line front-end for the anemone web crawler (github.com/chriskite/anemone). basilisk produces reports useful for QA-ing websites. It also features an extensible page processor class, so you can write your own page processors.

Included page processors:

  • seo: generates a csv with the following columns: url, title, description, keywords, h1s, h2s

  • sitemap: generates an xml sitemap

  • image: generates a list of broken images and images lacking an alt attribute.

  • error: generates a csv of urls returning HTTP response codes other than success and redirect.
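
The error processor's rule (report anything that is neither a success nor a redirect) can be sketched in plain Ruby. This is a minimal sketch, not basilisk's actual code: the helper names `reportable_error?` and `error_report`, the `url, code` column layout, and the sample URLs are assumptions made for illustration.

```ruby
require "csv"

# Hypothetical helper: true when a status code is neither a success (2xx)
# nor a redirect (3xx), i.e. when it belongs in the error report.
def reportable_error?(code)
  !(200..399).cover?(code)
end

# Hypothetical helper: builds CSV text from [url, code] pairs,
# keeping only the pairs that count as errors.
def error_report(responses)
  CSV.generate do |csv|
    csv << %w[url code] # assumed header row
    responses.each do |url, code|
      csv << [url, code] if reportable_error?(code)
    end
  end
end

# Sample data standing in for the results of a crawl.
responses = [
  ["http://example.com/", 200],
  ["http://example.com/old", 301],
  ["http://example.com/missing", 404],
  ["http://example.com/broken", 500]
]

puts error_report(responses)
```

In this sketch only the 404 and 500 rows reach the report; the 200 and 301 rows are dropped.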

See the generated yml config file for even more options.

install

sudo gem install basilisk

usage

To create a new search:

basil create [search_name] [url]
  • Creates a search config file ([search_name].yml), which you may edit to change the default options, specify which page processors to run, add regex and css terms for searching across the site, and add regexes for skipping urls.
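
To illustrate what regexes for skipping urls do, here is a minimal Ruby sketch. It does not show basilisk's config format or internals: the `SKIP_PATTERNS` list, the `skip_url?` helper, and the sample URLs are all hypothetical.

```ruby
# Hypothetical skip patterns, of the kind a search config might list.
SKIP_PATTERNS = [
  %r{/logout},   # session-ending links
  /\.pdf\z/i,    # PDF downloads
  /\?print=1/    # printer-friendly duplicates
].freeze

# Hypothetical helper: true when the url matches any skip pattern.
def skip_url?(url, patterns = SKIP_PATTERNS)
  patterns.any? { |pattern| pattern.match?(url) }
end

urls = [
  "http://example.com/about",
  "http://example.com/logout",
  "http://example.com/files/report.PDF",
  "http://example.com/news?print=1"
]

# Keep only the urls that no pattern matches.
to_crawl = urls.reject { |url| skip_url?(url) }
puts to_crawl
```

A url is skipped as soon as any single pattern matches it, so the patterns behave as an "or" list.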

To run the search:

basil run [search_name]
  • Runs the specified search. Note: you must create a search before running it. Files generated by the page processors will reside in a folder called [search_name].

author & license

basilisk is licensed under a modified MIT license. See the LICENSE file.

basilisk was written by Kyle Banker and depends largely on the anemone web crawler by Chris Kite.

Copyright 2009 Alexander Interactive, Inc.
