• Sahara aims to provide users with a simple means to provision a data-intensive cluster (Hadoop, Spark) by specifying a few parameters such as software versions, cluster topology, and node hardware details.

    Python 3 120 Apache-2.0 Updated Nov 3, 2015
  • Disk image elements for Savanna

    Shell 28 Apache-2.0 Updated Sep 24, 2015
  • Implementation of a new ROLLUP operator for Apache Pig that results in optimal execution plans

    Java 1 Apache-2.0 Updated May 7, 2015
  • Simple OpenStack Python bindings

    Python Apache-2.0 Updated Apr 9, 2015
  • Python logging handler for Logstash.

    Python 123 MIT Updated Oct 29, 2014
  • Python 1 1 Updated Oct 22, 2014
  • Decision trees library and more

    Scala 4 1 Apache-2.0 Updated Oct 9, 2014
  • pig

    Forked from apache/pig

    Mirror of Apache Pig

    Java 426 Apache-2.0 Updated Oct 3, 2014
  • This is the Pig ROLLUP repo

    Java 1 Apache-2.0 Updated Oct 3, 2014
  • Python 2 Updated Sep 22, 2014
  • A possible implementation of a decision tree for Spark

    Scala 3 2 Apache-2.0 Updated Sep 12, 2014
  • OpenStack Measurement Framework

    Python 3 1 Apache-2.0 Updated Aug 13, 2014
  • Hadoop implementation of KNN graph building algorithms (Brute force, NNDescent, NNCtph, ...)

    Java 3 Updated Jul 28, 2014
  • HFSP

    Forked from melrief/HFSP

    The Hadoop Fair Sojourn Protocol Scheduler

    Java 1 4 Apache-2.0 Updated Jan 14, 2014
  • Java 4 8 Apache-2.0 Updated Oct 15, 2013
  • A set of tools to analyse Hadoop logs

    Python 5 Apache-2.0 Updated Jul 1, 2013
  • This project deals with the implementation of k-means for multi-dimensional clustering.

    Scala 4 Updated Jun 20, 2013
  • Statistical Workload Injector for MapReduce - Project at UC Berkeley AMP Lab

    Java 85 Updated Jun 27, 2012