Clojure aphorism: Clojure programmers don’t write their apps in Clojure.
They write the language that they use to write their apps in Clojure.

"The Joy of Clojure"

Crackle

A Clojure wrapper for Apache Crunch

Installation

Crackle is available on Clojars. Please report any issues on the project's issue tracker.

With Leiningen:

[crackle/crackle-core "0.5.4"]

With Maven:

<dependency>
 <groupId>crackle</groupId>
 <artifactId>crackle-core</artifactId>
 <version>0.5.4</version>
</dependency>

Usage

(ns crackle.example
  (:use crackle.core)
  (:require [clojure.string]
            [crackle.from :as from]
            [crackle.to :as to]))

;;====== word count example ===============
(defn-mapcat split-words [] :strings
  (fn [line] (clojure.string/split line #"\s+")))

(defn count-words [input-path output-path]
  (do-pipeline :debug
    (from/text-file input-path)
    (parallel-do! (split-words))
    (count!)
    (to/text-file output-path)))

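;;====== average bytes by ip example ======
;; Splits a line into an address and a byte count, then pairs the
;; address with [bytes 1] so totals and request counts add up together.
(defn-map parse-line [regexp] [:strings :clojure]
  (fn [line]
    (let [[address bytes] (clojure.string/split line regexp)]
      (pair-of address [(read-string bytes) 1]))))

;; Element-wise sum of two [bytes requests] vectors.
(defn-combine sum-bytes-and-counts []
  (fn [value1 value2]
    (mapv + value1 value2)))

;; Integer average of bytes per request for each address.
(defn-mapv compute-average [] [:strings :ints]
  (fn [[bytes requests]]
    (int (/ bytes requests))))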

(defn count-bytes-by-ip [input-path output-path]
  (do-pipeline
    (from/text-file input-path)
    (parallel-do! (parse-line #"\s+"))
    (group-by-key!)
    (combine-values! (sum-bytes-and-counts))
    (parallel-do! (compute-average))
    (to/text-file output-path)))
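
To see what the two pipelines above compute, it can help to run the same logic on an in-memory sequence of lines. The following is a minimal sketch in plain Clojure, with no Crackle or Crunch dependency. The namespace and the names local-count-words, parse-log-line and local-average-bytes-by-ip are hypothetical and are not part of Crackle. It also assumes each log line has the form "address bytes", and it uses Long/parseLong where the example above uses read-string.

```clojure
(ns crackle.local-sketch
  (:require [clojure.string :as string]))

;; Hypothetical helper, not part of Crackle.
;; Splits every line on whitespace and counts each word.
(defn local-count-words [lines]
  (frequencies (mapcat #(string/split % #"\s+") lines)))

;; Hypothetical helper, not part of Crackle.
;; Assumes a line looks like "address bytes"; returns {address [bytes 1]}.
(defn parse-log-line [line]
  (let [[address bytes] (string/split line #"\s+")]
    {address [(Long/parseLong bytes) 1]}))

;; Hypothetical helper, not part of Crackle.
;; Sums [bytes requests] per address, then takes the integer average.
(defn local-average-bytes-by-ip [lines]
  (->> lines
       (map parse-log-line)
       (apply merge-with #(mapv + %1 %2))
       (reduce-kv (fn [m address [bytes requests]]
                    (assoc m address (int (/ bytes requests))))
                  {})))
```

For example, (local-count-words ["hello world" "hello"]) returns {"hello" 2, "world" 1}, and (local-average-bytes-by-ip ["10.0.0.1 100" "10.0.0.2 50" "10.0.0.1 300"]) returns {"10.0.0.1" 200, "10.0.0.2" 50}. The merge-with step plays the role of group-by-key! followed by combine-values! in the pipeline, but only as a local analogy, since nothing here runs on a cluster.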

License

Copyright © 2012-2013 Victor Iacoban victor.iacoban@gmail.com

Distributed under the Eclipse Public License, the same as Clojure.
