Modern Cassandra tap for Cascading. Actually works with Cascading 2.0, Cascalog 1.10 and supports CQL collections.

Cascading Tap for Cassandra


This is a Cassandra tap for Cascading that can be used as both a source and a sink. It works with the latest version of Cassandra and with Cascading 2.0, and it is tested and maintained. It works well for us, but use it at your own risk.

If you're new to Cassandra, check out our Cassandra Guides. They were initially written for Cassaforte, a Clojure Cassandra driver, but are generic enough and go into detail on Cassandra-related topics such as Consistency/Availability, Data Modelling, Command Line Tools, Timestamps, Counters and many more.


To use it as a source or a sink, create a scheme from a settings map and pass it to a tap:

import com.clojurewerkz.cascading.cassandra.CassandraTap;
import com.clojurewerkz.cascading.cassandra.CassandraScheme;

import java.util.List;
import java.util.ArrayList;
import java.util.Map;
import java.util.HashMap;

// Settings are passed to the scheme as a map of setting names to values.
Map<String, String> settings = new HashMap<String, String>();
settings.put("", "localhost");
settings.put("db.port", "9160");
// And so on...

CassandraScheme scheme = new CassandraScheme(settings);
CassandraTap tap = new CassandraTap(scheme);

That's pretty much it. To do the same thing in Clojure (with Cascalog), you can use the following code:

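(import '[com.clojurewerkz.cascading.cassandra CassandraTap CassandraScheme])

(defn create-tap
  []
  (let [keyspace      "keyspace"
        column-family "column-family"
        ;; The scheme is configured with a map of settings.
        scheme        (CassandraScheme.
                       {"sink.keyColumnName" "name"
                        "" ""
                        "db.port" "9160"
                        "db.keyspace" "cascading_cassandra"
                        "db.inputPartitioner" "org.apache.cassandra.dht.Murmur3Partitioner"
                        "db.outputPartitioner" "org.apache.cassandra.dht.Murmur3Partitioner"})
        tap           (CassandraTap. scheme)]
    ;; Return the tap so it can be used as a source or a sink.
    tap))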

Possible mappings:


  • - host of the database to connect to
  • db.port - port of the database to connect to
  • db.keyspace - keyspace to use for sink or source
  • db.columnFamily - column family to use for sink or source
  • db.inputPartitioner - partitioner for the DB used as a source
  • db.outputPartitioner - partitioner for the DB used as a sink


  • source.columns - columns for the source, to be fetched
  • source.useWideRows - whether or not to use wide-row deserialization
  • source.types - data types to use for deserialization.
    • Example for wide columns: {"key" "UTF8Type" "value" "Int32Type"}
    • Example for static columns: {"column1" "UTF8Type" "column2" "Int32Type"}


  • sink.keyColumnName - key column name for sink
  • sink.outputMappings - output mappings for sink, used to map internal Cascading tuple segment names to database columns.
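
As an illustration of how the mappings above fit together, here is a minimal Java sketch that builds one settings map for a static-column source and one for a sink. The setting names and the type and partitioner strings are taken from this README. Everything else is an assumption made for the example: the class name `SettingsSketch`, the `Map<String, Object>` value type, passing `source.columns` as a list, the direction of `sink.outputMappings` (tuple field name to column name), and the placeholder values. The host entry is left out because its setting name is not shown in the list above.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper class, not part of the library.
public class SettingsSketch {

    // Connection settings shared by source and sink (placeholder values).
    public static Map<String, Object> baseSettings() {
        Map<String, Object> settings = new HashMap<String, Object>();
        settings.put("db.port", "9160");
        settings.put("db.keyspace", "cascading_cassandra");
        settings.put("db.columnFamily", "column-family");
        return settings;
    }

    // Source with static columns: list the columns and give each one a type.
    public static Map<String, Object> staticSourceSettings() {
        Map<String, Object> settings = baseSettings();
        settings.put("db.inputPartitioner",
                     "org.apache.cassandra.dht.Murmur3Partitioner");

        Map<String, String> types = new LinkedHashMap<String, String>();
        types.put("column1", "UTF8Type");
        types.put("column2", "Int32Type");

        // Assumption: columns are given as a list of column names.
        settings.put("source.columns", Arrays.asList("column1", "column2"));
        settings.put("source.types", types);
        return settings;
    }

    // Sink: name the key column and map tuple fields to database columns.
    public static Map<String, Object> sinkSettings() {
        Map<String, Object> settings = baseSettings();
        settings.put("db.outputPartitioner",
                     "org.apache.cassandra.dht.Murmur3Partitioner");
        settings.put("sink.keyColumnName", "name");

        // Assumption: keys are Cascading tuple field names,
        // values are Cassandra column names.
        Map<String, String> outputMappings = new LinkedHashMap<String, String>();
        outputMappings.put("name", "name");
        outputMappings.put("score", "score");
        settings.put("sink.outputMappings", outputMappings);
        return settings;
    }

    public static void main(String[] args) {
        System.out.println(staticSourceSettings());
        System.out.println(sinkSettings());
    }
}
```

Either map would then be handed to the `CassandraScheme` constructor as in the Java example earlier in this README, assuming the constructor accepts non-string values for the nested settings.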


The jar is hosted on Clojars:


[cascading-cassandra "1.0.0-rc3"]



This project supports the ClojureWerkz goals

ClojureWerkz is a growing collection of open-source, batteries-included Clojure libraries that emphasise modern targets, great documentation, and thorough testing. They've got a ton of great stuff, check 'em out!


Copyright (C) 2011-2013 Alex Petrov

Dual-licensed under the Eclipse Public License (the same as Clojure) or the Apache License 2.0.

Other Contributors


Thanks to JetBrains for providing an IntelliJ IDEA license used to develop the Java part of this project.


Alex Petrov: alexp at coffeenco dot de
