Productivity-centric Python big data framework for high performance at Hadoop-scale, with first-class integration with Impala. Co-founded by the creator of pandas
Let’s Big Data. Hue is an open source Web interface for analyzing data with Apache Hadoop.
Real-time Query for Hadoop
Python client and Numba-based UDFs for Impala
Provides a Spark backend for executing Dataflow pipelines.
A library for financial and time series calculations on Apache Spark
Scalable genomics variant store and analytics
Thrift SASL module that implements TSaslClientTransport
Simple real-time large-scale machine learning infrastructure.
IPython notebooks and learning materials for Ibis
The fast and fun way to write YARN applications.
Llama - Low Latency Application MAster
Sqoop has moved to Apache!