Description: Cascading is a feature-rich API for defining and executing complex, fault-tolerant data processing workflows on a Hadoop cluster.
Homepage: http://www.cascading.org/
Clone URL: git://github.com/cwensel/cascading.git
name age message
file .gitignore Tue Mar 03 12:25:19 -0800 2009 update .gitignore [cwensel]
file CHANGES.txt Fri Aug 28 15:31:40 -0700 2009 version 1.0.16 [cwensel]
file LICENSE.txt Tue Apr 07 12:01:29 -0700 2009 added LICENSE file [cwensel]
file README.txt Fri Apr 17 10:58:54 -0700 2009 updated README [cwensel]
file build.xml Mon Jul 13 18:07:43 -0700 2009 Updated ant build to not hard-code hadoop/lib s... [cwensel]
file gpl.txt Thu Jan 24 09:31:25 -0800 2008 import cascading [cwensel]
directory lib/ Tue Nov 18 18:15:53 -0800 2008 updated licensing, fixed javadoc errors [cwensel]
directory src/ Fri Aug 28 11:18:17 -0700 2009 reuse jobconf [cwensel]
file version.properties Fri Aug 28 15:31:40 -0700 2009 version 1.0.16 [cwensel]
README.txt
Thanks for using Cascading.

General Information:

  Project and contact information: http://www.cascading.org/

  This distribution includes four Cascading jar files:

  cascading-x.y.z.jar      - all relevant Cascading class files and libraries, with a 'lib' folder
  cascading-core-x.y.z.jar - all Cascading Core class files
  cascading-xml-x.y.z.jar  - all Cascading XML operations class files
  cascading-test-x.y.z.jar - all Cascading tests and test utilities

Building:

  To build Cascading:

  > cd <path to cascading>
  > ant -Dhadoop.home=<path to hadoop> compile

  To make all jars:

  > ant -Dhadoop.home=<path to hadoop> jar

  To run all tests:

  > ant -Dhadoop.home=<path to hadoop> test

  where <path to cascading> is the directory created after cloning or uncompressing the Cascading
  distribution, and <path to hadoop> is where you installed Hadoop.

  Note that ant will not interpret the ~ path; use ${user.home} instead. For example:
    -Dhadoop.home=${user.home}/hadoop

  Alternatively, you can put hadoop.home inside the file build.properties in the cascading project directory.
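
  As a minimal sketch of the build.properties approach, the helper below writes a one-line
  build.properties file. The function name write_build_properties and its two arguments are
  made up for this example; the only thing taken from this README is that the build reads
  hadoop.home from build.properties in the project directory. Passing "$HOME/hadoop" lets the
  shell expand the home directory, so an absolute path ends up in the file.

```shell
#!/bin/sh
# Hypothetical helper (not part of Cascading): write hadoop.home into
# build.properties inside the given project directory.
# usage: write_build_properties <path to cascading> <path to hadoop>
write_build_properties() {
  project_dir=$1
  hadoop_dir=$2
  # The caller's shell has already expanded any $HOME in hadoop_dir,
  # so the value written here is a plain path with no ~ in it.
  printf 'hadoop.home=%s\n' "$hadoop_dir" > "$project_dir/build.properties"
}
```

  For example, after sourcing the helper from the cascading project directory:

  > write_build_properties . "$HOME/hadoop"
  > ant compile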

Using:

  To use Cascading with Hadoop, we suggest placing the cascading-core and cascading-xml jar files, and all
  third-party libs, into the 'lib' folder of your job jar and executing via 'hadoop jar your.jar <your args>'.

  For example, your job jar would look like this (via: jar -tf your.jar):

    /<all your class and resource files>
    /lib/cascading-core-x.y.z.jar
    /lib/cascading-xml-x.y.z.jar
    /lib/<cascading third-party jar files>

  Hadoop will unpack the jar locally and remotely (in the cluster) and add any libraries in 'lib' to the classpath.
  This is a feature specific to Hadoop.
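
  The layout above can also be staged by hand. The following is only a sketch: the function
  name stage_job_jar and its three directory arguments are invented for this example, and it
  assumes the Cascading jars sit at the top of the Cascading directory with third-party jars
  under lib/ and lib/xml/ (if you built from source, the Cascading jars may be under build/
  instead).

```shell
#!/bin/sh
# Hypothetical helper (not part of Cascading): stage a job jar layout.
# usage: stage_job_jar <your classes dir> <path to cascading> <stage dir>
stage_job_jar() {
  classes_dir=$1
  cascading_home=$2
  stage_dir=$3

  mkdir -p "$stage_dir/lib" || return 1

  # Your own class and resource files go at the root of the job jar.
  cp -R "$classes_dir"/. "$stage_dir"/ || return 1

  # Cascading core and xml jars, plus third-party jars, go under lib/.
  for jar_file in "$cascading_home"/cascading-core-*.jar \
                  "$cascading_home"/cascading-xml-*.jar \
                  "$cascading_home"/lib/*.jar \
                  "$cascading_home"/lib/xml/*.jar; do
    # A pattern that matched nothing is left as literal text; skip it.
    [ -f "$jar_file" ] && cp "$jar_file" "$stage_dir/lib/"
  done
  return 0
}
```

  The staged directory can then be packed and run:

  > jar cf your.jar -C <stage dir> .
  > hadoop jar your.jar <your args>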

  The cascading-x.y.z.jar file is typically used with scripting languages and is completely self-contained.

  This ant snippet builds such a job jar and works quite well (you may need to override cascading.home):

  <property name="cascading.home" location="${basedir}/../cascading"/>
  <property file="${cascading.home}/version.properties"/>
  <property name="cascading.release.version" value="x.y.z"/>
  <property name="cascading.filename.core" value="cascading-core-${cascading.release.version}.jar"/>
  <property name="cascading.filename.xml" value="cascading-xml-${cascading.release.version}.jar"/>
  <property name="cascading.libs" value="${cascading.home}/lib"/>
  <property name="cascading.libs.core" value="${cascading.libs}"/>
  <property name="cascading.libs.xml" value="${cascading.libs}/xml"/>

  <condition property="cascading.path" value="${cascading.home}/"
             else="${cascading.home}/build">
    <available file="${cascading.home}/${cascading.filename.core}"/>
  </condition>

  <property name="cascading.lib.core" value="${cascading.path}/${cascading.filename.core}"/>
  <property name="cascading.lib.xml" value="${cascading.path}/${cascading.filename.xml}"/>

  <target name="jar" depends="build" description="creates a Hadoop-ready jar with all dependencies">

    <!-- copy Cascading classes and libraries -->
    <copy todir="${build.classes}/lib" file="${cascading.lib.core}"/>
    <copy todir="${build.classes}/lib" file="${cascading.lib.xml}"/>
    <copy todir="${build.classes}/lib">
      <fileset dir="${cascading.libs.core}" includes="*.jar"/>
      <fileset dir="${cascading.libs.xml}" includes="*.jar"/>
    </copy>

    <jar jarfile="${build.dir}/${ant.project.name}.jar">
      <fileset dir="${build.classes}"/>
      <fileset dir="${basedir}" includes="lib/"/>
      <manifest>
        <attribute name="Main-Class" value="${ant.project.name}/Main"/>
      </manifest>
    </jar>

  </target>
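
  Once the jar target has run, it can be worth confirming that the Cascading jars really ended
  up under lib/. The check below is a rough sketch, not part of Cascading: the function name
  check_job_jar_listing is made up, and it only inspects a jar listing passed on stdin (for
  instance the output of jar -tf), so it does not open the jar itself.

```shell
#!/bin/sh
# Hypothetical helper: read a jar listing on stdin and succeed only if both
# lib/cascading-core-*.jar and lib/cascading-xml-*.jar appear in it.
check_job_jar_listing() {
  listing=$(cat)
  for name in cascading-core cascading-xml; do
    # Accept entries with or without a leading slash.
    if ! printf '%s\n' "$listing" | grep -Eq "^/?lib/${name}-[^/]*\.jar\$"; then
      echo "missing lib/${name}-x.y.z.jar" >&2
      return 1
    fi
  done
  echo "job jar layout looks fine"
}
```

  Typical use, with your.jar standing in for the jar produced above:

  > jar -tf your.jar | check_job_jar_listing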