Utilities for Cascading
Clone or download
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Permalink
Failed to load latest commit information.
doc
lib
src
.gitignore
README
build.properties
build.xml
pom.xml

README

===============================
Introduction
===============================

cascading.utils is an open source set of utilities for Cascading workflows that we
use here at Scale Unlimited for the Bixo Java web mining toolkit and other projects.
There are classes that wrap Cascading Tuples with "datum" objects, a few utility classes
such as TupleLogger and SplitterAssembly, some classes to help monitor running
workflows, etc.

cascading.utils is licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

===============================
Building
===============================

You need Apache Ant 1.7 or higher. 

To get a list of valid targets:

% cd <project directory>
% ant -p

To clean and build a jar (which also runs all tests):

% ant clean jar

Note that "ant clean jar" will currently fail, due to a bug in the maven ant task
plugin used for managing dependencies.  After it does, execute the following commands
in order to replace the corrupt pom in your local Maven repository:

% rm ~/.m2/repository/net/java/jvnet-parent/1/jvnet-parent-1.pom
% mvn eclipse:eclipse
% ant clean jar

After this, subsequent builds should all succeed.

To create Eclipse project files:

% ant eclipse

Then, from Eclipse follow the standard procedure to import an existing Java project
into your Workspace.

NOTE: Invalid 3rd Pary Jars

Sometimes during a build you'll run into this kind of error:

    [javac] error: error reading /Users/kenkrugler/.m2/repository/asm/asm/3.1/asm-3.1.jar; error in opening zip file
    [javac] error: error reading /Users/kenkrugler/.m2/repository/org/codehaus/jettison/jettison/1.1/jettison-1.1.jar; error in opening zip file
    [javac] error: error reading /Users/kenkrugler/.m2/repository/com/sun/xml/bind/jaxb-impl/2.2.3-1/jaxb-impl-2.2.3-1.jar; error in opening zip file

The issue is that some Maven repositories return redirect results when a download 
is requested (e.g. glassfish is one of these). When that happens, the maven-ant 
task we're using fails, and creates a jar that's actually the HTML redirect page.

The solution is to (a) delete all of the directories in your local maven repo that
contain invalid jars, and then (b) run `mvn eclipse` to trigger the download 
of the jars using standard Maven, which correctly handles redirects.

E.g. for the above, I did:

rm -rf ~/.m2/repository/asm/asm/3.1/
rm -rf ~/.m2/repository/org/codehaus/jettison/jettison/1.1/
rm -rf ~/.m2/repository/com/sun/xml/bind/jaxb-impl/2.2.3-1/
mvn eclipse