Cascading is a feature-rich API for defining and executing complex, fault-tolerant data processing workflows on various cluster computing platforms. Please see https://github.com/cwensel/cascading for access to all WIP branches.
Stand-alone ANSI SQL for Cascading on Apache Hadoop
HBase adapters for Cascading
Integration for Cascading and Apache Hive
The Scalding tutorial as a standalone SBT project
A Fluent Java API for Cascading
A test harness for testing binary compatibility
Deploying Apache Hadoop in a virtualized cluster, as easy as 1-2-3.
Tutorials for Cascading, Lingual, Pattern and other projects
Source examples to support the "Cascading for the Impatient" blog post series
Cascading schemes and taps for JDBC
A simple command-line interface for building high-load cluster jobs.
Memcached/Membase/Elasticsearch integration for Cascading
A simple kind of social recommender
Machine Learning for Cascading
Cascading.Multitool is a sed and grep command line tool for Apache Hadoop.
Sample applications using Cascading
Standalone project for running the Cascalog tutorial
A simple Hello World Cascading project to ease the start of a new Cascading application
Annotations and Classes for managing and executing dependent processes
Serializer and comparator for using Thrift objects in Cascading or Cascalog
All the Cascading taps you need and love.
This project is deprecated; please use https://github.com/Cascading/cascading-jdbc
Cascalog for the Impatient
[DEPRECATED, please use https://github.com/magro/kryo-serializers] Extra tidbits for Kryo.
Cascading plus City of Palo Alto open data
Kryo Integration for Cascading.