Cascading is a feature-rich API for defining and executing complex, fault-tolerant data processing workflows on a Hadoop cluster. Please see for access to all WIP branches.


Thanks for using Cascading.

General Information:

Project and contact information:

This distribution includes six Cascading jar files:

  • cascading-x.y.z.jar - all relevant Cascading class files and libraries, with a 'lib' folder
  • cascading-core-x.y.z.jar - all Cascading Core class files
  • cascading-local-x.y.z.jar - all Cascading Local mode class files
  • cascading-hadoop-x.y.z.jar - all Cascading Hadoop mode class files
  • cascading-xml-x.y.z.jar - all Cascading XML operations class files
  • cascading-test-x.y.z.jar - all Cascading tests and test utilities

These jars are all available via

Hadoop mode is where the Cascading application runs on a Hadoop cluster.

Local mode is where the Cascading application runs locally in memory, without any Hadoop dependencies.


To build Cascading, run the following in the shell:

> git clone
> cd cascading
> gradle build

Cascading currently requires Gradle 1.0 to build.

To use an IDE like IntelliJ, run the following to get all jar dependencies:

> gradle ideLibs

Using with Apache Hadoop:

To use Cascading with Hadoop, we suggest placing the cascading-core, cascading-hadoop, and (optionally) cascading-xml jar files, along with all third-party libs (which can be retrieved by calling gradle ideLibs), into the lib folder of your job jar, then executing your job via $HADOOP_HOME/bin/hadoop jar your.jar <your args>.

Note that you do not need to put the lib/hadoop jars in your jar, as they are already present in your cluster.

For example, your job jar would look like this (via: jar -tf your.jar)

/<all your class and resource files>
/lib/<cascading and third-party jar files>

Hadoop will unpack the jar locally and remotely (in the cluster) and add any libraries in lib to the classpath. This is a feature specific to Hadoop.

The cascading-x.y.z.jar file is typically used with scripting languages and is completely self-contained, but it cannot be added to a jar lib folder, as Hadoop will not recursively unjar jars.
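
The job jar layout described above can be staged with a small shell function before packaging. This is only a sketch under assumptions: the function name stage_job_dir, its three directory arguments, and the file name patterns used to skip the Hadoop jars (hadoop-*.jar) and the self-contained cascading-x.y.z.jar (cascading-[0-9]*.jar) are illustrative and not part of the Cascading distribution. Adjust the patterns to match how the jars are actually named in your build.

```shell
#!/bin/sh
# Sketch: stage a directory tree that mirrors the job jar layout.
# Usage: stage_job_dir CLASSES_DIR LIBS_DIR STAGE_DIR
# All three paths and the name patterns below are hypothetical.
stage_job_dir() {
  classes_dir=$1
  libs_dir=$2
  stage_dir=$3

  mkdir -p "$stage_dir/lib" || return 1

  # Class and resource files go at the root of the job jar.
  cp -R "$classes_dir"/. "$stage_dir"/ || return 1

  # Dependency jars go under lib/.
  for jar_file in "$libs_dir"/*.jar; do
    [ -e "$jar_file" ] || continue   # the glob matched nothing
    case "$(basename "$jar_file")" in
      hadoop-*.jar)
        # Assumed naming for the Hadoop jars; the cluster already has them.
        continue ;;
      cascading-[0-9]*.jar)
        # Assumed naming for the self-contained cascading-x.y.z.jar, which
        # Hadoop would not unpack recursively if placed in lib/.
        continue ;;
      *)
        cp "$jar_file" "$stage_dir/lib/" || return 1 ;;
    esac
  done
}
```

With the function loaded, staging and packaging might look like this (build/classes, build/libs and build/job are placeholder paths):

> stage_job_dir build/classes build/libs build/job
> jar -cf your.jar -C build/job .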
