What is Crawlista

Crawlista is a support library for Clojure applications that crawl the Web.

Usage

Installation

With Leiningen

[clojurewerkz/crawlista "1.0.0-alpha13"]

or, if you are comfortable with using snapshots,

[clojurewerkz/crawlista "1.0.0-SNAPSHOT"]

New snapshots are published to clojars.org every day (if there are any changes).
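
For context, here is a minimal sketch of where the dependency vector goes in a `project.clj`. The project name `my-crawler`, its version, and the Clojure version are placeholders chosen for illustration. Only the Crawlista coordinate comes from this README.

```clojure
;; project.clj for a hypothetical application that depends on Crawlista.
;; `my-crawler` and its version are made-up placeholders.
(defproject my-crawler "0.1.0-SNAPSHOT"
  :dependencies [[org.clojure/clojure "1.3.0"]
                 ;; Swap in "1.0.0-SNAPSHOT" to track daily snapshots instead.
                 [clojurewerkz/crawlista "1.0.0-alpha13"]])
```

Leiningen 2 searches clojars.org by default, so no extra `:repositories` entry should be needed for either coordinate.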

Crawlista is a Work In Progress

Crawlista is a work in progress. Please see our test suite for code examples.

Continuous Integration

CI is hosted by travis-ci.org

Supported Clojure versions

Crawlista is built from the ground up for Clojure 1.3 and later.

Development

Crawlista uses Leiningen 2. Make sure you have it installed, then run the tests against all supported Clojure versions using

lein2 all test

Then create a branch and make your changes on it. Once you are done with your changes and all tests pass, submit a pull request on GitHub.
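
The `all` in `lein2 all test` is not a built-in Leiningen task, so it is presumably an alias defined in the project's `project.clj`. The following is a sketch of how such an alias is commonly set up with profiles. The project name and the profile names and versions are assumptions for illustration, not Crawlista's actual configuration.

```clojure
;; Illustrative project.clj fragment; `example` is a hypothetical project name.
(defproject example "0.1.0-SNAPSHOT"
  :dependencies [[org.clojure/clojure "1.3.0"]]
  ;; One profile per additional Clojure version to test against (assumed).
  :profiles {:1.4 {:dependencies [[org.clojure/clojure "1.4.0"]]}}
  ;; `lein2 all test` expands to `lein2 with-profile dev:dev,1.4 test`,
  ;; which runs the task once for each colon-separated profile group.
  :aliases {"all" ["with-profile" "dev:dev,1.4"]})
```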

Regenerating robots.txt Parser

If you make changes to the Ragel-based robots.txt parser in Crawlista, you need to regenerate it:

ragel -J -o src/java/clojurewerkz/crawlista/robots/Parser.java src/rl/clojurewerkz/crawlista/robots/Parser.rl
lein2 javac

and then run the robots.txt parser test suite with

lein2 test :robots
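
The `:robots` argument above looks like a Leiningen test selector, and `lein2 javac` needs to know where the generated Java source lives. Below is a minimal sketch of the `project.clj` entries that would support both, assuming the parser tests are tagged with `:robots` metadata. The project name and the exact entries are illustrative and are not copied from Crawlista's build file.

```clojure
;; Illustrative project.clj fragment; `example` is a hypothetical project name.
(defproject example "0.1.0-SNAPSHOT"
  :dependencies [[org.clojure/clojure "1.3.0"]]
  ;; Directory compiled by `lein2 javac`; matches the Parser.java output path.
  :java-source-paths ["src/java"]
  ;; `lein2 test :robots` runs only tests whose metadata includes :robots,
  ;; e.g. (deftest ^{:robots true} test-parsing ...). Assumed tagging scheme.
  :test-selectors {:default (constantly true)
                   :robots  :robots})
```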

License

Copyright (C) 2011-2012 Michael S. Klishin

Distributed under the Eclipse Public License, the same as Clojure.
