llasram/parkour

Parkour

Hadoop MapReduce in idiomatic Clojure. Parkour takes your Clojure code’s functional gymnastics and sends it free-running across the urban environment of your Hadoop cluster.

Installation

Parkour is available on Clojars. Add the following entry to the :dependencies vector of your Leiningen project.clj:

[com.damballa/parkour "0.4.0"]
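For context, a minimal project.clj containing that entry might look like the sketch below. The project name, version string, and Clojure version are placeholders chosen for illustration, not values taken from Parkour's documentation; any Hadoop dependencies your cluster requires are omitted here.

```clojure
;; Hypothetical project.clj; only the Parkour coordinate comes from this README.
(defproject example/word-count "0.1.0-SNAPSHOT"   ; placeholder name and version
  :dependencies [[org.clojure/clojure "1.5.1"]    ; assumed Clojure version
                 [com.damballa/parkour "0.4.0"]])
```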

Usage

Parkour is a Clojure library for writing Hadoop MapReduce jobs. It tries to avoid being a “framework” – if you know Hadoop, and you know Clojure, then you’re most of the way to knowing Parkour.

The Parkour introduction contains an overview of the key concepts, but here is the classic “word count” example, in Parkour:

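;; The namespace name is illustrative; the aliases are the ones used below.
(ns example.word-count
  (:require [clojure.string :as str]
            [clojure.core.reducers :as r]
            [parkour.mapreduce :as mr]
            [parkour.graph :as pg])
  (:import [org.apache.hadoop.io Text LongWritable]))

;; Given the job configuration, returns the map task function: split each
;; input value on whitespace and emit a [word 1] pair per word.
(defn mapper
  [conf]
  (fn [context input]
    (->> (mr/vals input)
         (r/mapcat #(str/split % #"\s+"))
         (r/map #(-> [% 1])))))

;; Given the job configuration, returns the reduce task function: sum the
;; counts grouped under each word and emit [word total].
(defn reducer
  [conf]
  (fn [context input]
    (->> (mr/keyvalgroups input)
         (r/map (fn [[word counts]]
                  [word (r/reduce + 0 counts)])))))

;; Job graph: map, partition on [Text LongWritable] pairs, then run the same
;; reducer as both combiner and reducer before writing to the sink.
(defn word-count
  [dseq dsink]
  (-> (pg/input dseq)
      (pg/map #'mapper)
      (pg/partition [Text LongWritable])
      (pg/combine #'reducer)
      (pg/reduce #'reducer)
      (pg/output dsink)))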
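
The per-record logic in the mapper and reducer above is ordinary reducers code, so it can be tried at a REPL without a cluster. The following is a minimal sketch under that assumption: it uses only clojure.core.reducers and clojure.string, and none of the names in it (map-lines, reduce-groups, local-word-count, the namespace) are part of Parkour. The group-by step is a stand-in for Hadoop's shuffle, not a description of how Parkour groups keys.

```clojure
(ns example.word-count-local
  (:require [clojure.core.reducers :as r]
            [clojure.string :as str]))

(defn map-lines
  "Same transformation as the mapper body: split each line on whitespace
  and emit a [word 1] pair per word."
  [lines]
  (->> lines
       (r/mapcat #(str/split % #"\s+"))
       (r/map #(-> [% 1]))))

(defn reduce-groups
  "Same transformation as the reducer body: sum the counts in each
  [word counts] group."
  [groups]
  (->> groups
       (r/map (fn [[word counts]]
                [word (r/reduce + 0 counts)]))))

(defn local-word-count
  "Runs both steps in-process, grouping pairs by word in between."
  [lines]
  (->> (map-lines lines)
       (into [])
       (group-by first)                                   ; stand-in for the shuffle
       (map (fn [[word pairs]] [word (map second pairs)]))
       (reduce-groups)
       (into {})))
```

Calling (local-word-count ["a b a" "b c"]) returns a map from each word to its count. Lines with leading whitespace would produce an empty-string "word", since str/split keeps the empty leading token.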

Documentation

Parkour’s documentation is divided into a number of separate sections:

  • Introduction – A getting-started introduction, with an overview of Parkour’s key concepts.
  • Motivation – An explanation of the goals Parkour exists to achieve, with comparison to other libraries and frameworks.
  • Namespaces – A tour of Parkour’s namespaces, explaining how each set of functionality fits into the whole.
  • Serialization – How Parkour integrates Clojure with Hadoop serialization mechanisms.
  • Testing – Patterns for testing Parkour MapReduce jobs.
  • Deployment – Running Parkour applications on a Hadoop cluster.
  • Reference – Generated API reference, via codox.

License

Copyright © 2013 Marshall Bockrath-Vandegrift & Damballa, Inc.

Distributed under the Apache License, Version 2.0.
