velankanisys/sparkling

 
 

Sparkling - A Clojure API for Apache Spark

Sparkling is a Clojure API for Apache Spark.

Check out our site for information about Gorillalabs Sparkling and a getting started guide.

Availability from Clojars

Sparkling is available from Clojars. To use it with Leiningen, add the coordinates listed on its Clojars project page to your dependencies.
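
As a rough illustration, a Leiningen project.clj entry might look like the sketch below. The project name my-spark-app, the Clojure version, the coordinate gorillalabs/sparkling, and the version 1.0.0 (taken from the release notes) are all assumptions; check the Clojars page for the exact coordinates before copying this.

```clojure
;; project.clj (sketch; project name and all versions are assumptions)
(defproject my-spark-app "0.1.0-SNAPSHOT"
  :dependencies [[org.clojure/clojure "1.6.0"]
                 ;; assumed coordinate and version; verify on Clojars
                 [gorillalabs/sparkling "1.0.0"]])
```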

See gorillalabs/sparkling-getting-started for an example project using Sparkling. The same project is used in the getting started guide.

Release Notes

1.0.0 - Added value to the existing libraries (clj-spark and flambo)

  • It's about twice as fast, thanks to getting rid of a reflection call (thanks to David Jacot for his take on this).
  • Got rid of mapping/remapping inside the API functions. Removing it
    • slimmed down the execution plan (mine shrank to a third) and
    • (more importantly) allowed me to keep partitioner information.
  • Added more -values functions (e.g. map-values), again to keep partitioner information.
  • Additional sources for RDDs:
    • JdbcRDD: reading data from your JDBC source.
    • Hadoop-Avro-Reader: reading Avro files from HDFS.

Acknowledgements

Thanks to The Climate Corporation and their open source clj-spark project, and to Yieldbot for yieldbot/flambo, which served as the starting point for this project.

License

Copyright (C) 2014-2015 Dr. Christian Betz, and the Gorillalabs team.

Distributed under the Eclipse Public License either version 1.0 or (at your option) any later version.
