Apache Spark - A unified analytics engine for large-scale data processing
Scala 28.6k 23.2k
Web-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala and more.
Java 5.1k 2.4k
Scala 718 194
Setup information and lab assignments for the BerkeleyX Spark Intro MOOC
Python 358 323
Vagrant projects for various use-cases with Spark, Zeppelin, IPython / Jupyter, SparkR
Shell 35 23
Apache Arrow is a cross-language development platform for in-memory data. It specifies a standardized language-independent columnar memory format for flat and hierarchical data, organized for effic…
C++ 6.9k 1.7k