cwensel / bixo forked from emi/bixo

A creepy crawler

README
===============================
Introduction
===============================

Bixo is an open source Java crawler that runs as a series of Cascading
pipes. It is designed to be used as a tool for creating customized
crawlers, thus each Cascading pipe implements a discrete operation. By
building a customized Cascading pipe assembly, you can quickly create
specialized crawlers that are optimized for a particular use case.
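
The idea of assembling a crawler from discrete operations can be sketched in
plain Java. This is only an analogy under stated assumptions: it does not use
the Bixo or Cascading APIs, and the names Stage, assemble, TRIM, HTTP_ONLY and
DEDUPE are hypothetical, chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

public class CrawlAssemblySketch {
    // Hypothetical stage type (not a Bixo or Cascading class):
    // one discrete operation over a batch of URLs.
    interface Stage extends Function<List<String>, List<String>> {}

    // Chain stages in order, so the output of one feeds the next.
    static Stage assemble(Stage... stages) {
        return urls -> {
            List<String> current = urls;
            for (Stage stage : stages) {
                current = stage.apply(current);
            }
            return current;
        };
    }

    // Strip surrounding whitespace from each URL.
    static final Stage TRIM = urls ->
        urls.stream().map(String::trim).collect(Collectors.toList());

    // Keep only http and https URLs.
    static final Stage HTTP_ONLY = urls ->
        urls.stream()
            .filter(u -> u.startsWith("http://") || u.startsWith("https://"))
            .collect(Collectors.toList());

    // Drop duplicates while preserving first-seen order.
    static final Stage DEDUPE = urls ->
        new ArrayList<>(new LinkedHashSet<>(urls));

    public static void main(String[] args) {
        Stage frontEnd = assemble(TRIM, HTTP_ONLY, DEDUPE);
        List<String> out = frontEnd.apply(Arrays.asList(
            " http://example.com/ ",
            "ftp://example.com/file",
            "http://example.com/"));
        System.out.println(out); // prints [http://example.com/]
    }
}
```

Swapping, removing or reordering the stages passed to assemble changes the
crawler's behavior without touching the stages themselves, which is the
property the pipe-assembly design is after.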

Bixo borrows heavily from the Apache Nutch project, as well as many other
open source projects at Apache and elsewhere.

Bixo is released under the MIT license.

===============================
Building
===============================

You need Apache Ant 1.7 or higher.
To list the available build targets, type the following in the project root:
ant -p

To clean, run the unit and integration tests, and build a jar, type:
ant clean test it jar

To build a distribution, type:
ant dist

To build an Eclipse project, type:
ant eclipse
Then choose "import existing project" in Eclipse.