Clojure-based query language for Hadoop inspired by Datalog.


Cascalog allows you to query Hadoop in Clojure with an expressive language inspired by Datalog. Follow the getting started steps, check out the tutorial, and you'll be running Cascalog queries on your local computer within 5 minutes.

Cascalog also features a wrapper around Cascading, cascalog.workflow, for defining dataflows. Custom operations defined in Cascalog can be used in both Cascalog queries and Cascalog dataflows.

Getting started

  1. Make sure you have Java 1.6 installed
  2. export JAVA_OPTS=-Xmx768m
  3. Install Leiningen
  4. git clone git://
  5. cd cascalog && lein deps && lein compile-java && lein compile
  6. Optionally, run "lein test" to make sure the tests pass
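
The steps above can be collected into one script. The following is a minimal sketch, not part of Cascalog itself: the function names and the CASCALOG_REPO_URL variable are assumptions made for illustration, and the clone URL must be supplied by you. It checks that java, git and lein are on the PATH before doing any work, and leaves out the optional "lein test" step.

```shell
#!/bin/sh
# Sketch of the getting-started steps as a single script.
# CASCALOG_REPO_URL is a hypothetical variable: set it to the repository's clone URL.

# Succeed if the named tool is on PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Print the name of each requested tool that is missing, one per line.
missing_tools() {
    for tool in "$@"; do
        have_tool "$tool" || printf '%s\n' "$tool"
    done
}

# Clone and build Cascalog (steps 2, 4 and 5 of the list above).
setup_cascalog() {
    missing=$(missing_tools java git lein)
    if [ -n "$missing" ]; then
        echo "missing required tools:" $missing >&2
        return 1
    fi
    if [ -z "${CASCALOG_REPO_URL:-}" ]; then
        echo "set CASCALOG_REPO_URL to the repository's clone URL" >&2
        return 1
    fi
    export JAVA_OPTS=-Xmx768m
    git clone "$CASCALOG_REPO_URL" cascalog || return 1
    # Build in a subshell so the caller's working directory is unchanged.
    (cd cascalog && lein deps && lein compile-java && lein compile)
}

# Do nothing unless asked, so the file can also be sourced.
if [ "${1:-}" = "--run" ]; then
    setup_cascalog
fi
```

Saved as setup.sh (a hypothetical file name), it would be run as "CASCALOG_REPO_URL=<clone URL> sh setup.sh --run".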


Tutorial

  1. Introducing Cascalog
  2. New Cascalog features: outer joins, combiners, sorting, and more
  3. News Feed in 38 lines of code using Cascalog

Running Cascalog queries on a Hadoop cluster

  1. Cascalog includes Hadoop as a dependency so that you can experiment with it easily on your local machine. When you build a jar to run on a cluster, do not bundle the Hadoop jars in it alongside Cascalog.
  2. Cascalog requires Cascading 1.1.
  3. Any custom operations must be compiled into the jar you give to Hadoop for running jobs.
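
One way to check the first point before submitting a job is to list the jar's contents and look for Hadoop classes. This is a rough sketch under stated assumptions: the function names are hypothetical, the JDK's "jar" tool is assumed to be on the PATH, and matching on the org/apache/hadoop/ prefix is a heuristic that will also flag other projects published under that package.

```shell
#!/bin/sh
# Succeed if the jar listing read from stdin contains entries under org/apache/hadoop/.
listing_has_hadoop() {
    grep -q '^org/apache/hadoop/'
}

# check_job_jar JAR: fail with a warning when JAR appears to bundle Hadoop classes.
# Uses "jar tf", which prints one entry path per line.
check_job_jar() {
    if jar tf "$1" | listing_has_hadoop; then
        echo "$1 bundles Hadoop classes; exclude them before submitting" >&2
        return 1
    fi
    echo "$1 contains no org/apache/hadoop/ entries"
}
```

For example, "check_job_jar myjob.jar" (with a hypothetical jar name) returns a non-zero status when the jar should be rebuilt without Hadoop.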

Questions or Concerns?

Google group: cascalog-user

IRC: Come chat in the #cascading room on freenode


Cascalog is based on a very early branch of the cascading-clojure project. Special thanks to Bradford Cross and Mark McGranaghan for their work on that project. Much of that code appears within Cascalog, either in its original form or in a modified form.