Apache Spark
Apache Spark is an open-source, distributed, general-purpose cluster-computing framework. It provides an interface for programming entire clusters with implicit data parallelism and fault tolerance.
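The cluster-programming model described above is usually exercised through Spark's `spark-submit` launcher. A minimal sketch, assuming a Spark distribution unpacked in the current directory (the example jar path follows the standard distribution layout; the wildcard avoids pinning a version):

```shell
# Run the bundled SparkPi example on 4 local cores. Spark splits the
# 100 sampling tasks across the cores automatically -- this is the
# "implicit data parallelism" the framework provides.
./bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master "local[4]" \
  examples/jars/spark-examples_*.jar 100
```

Swapping `--master "local[4]"` for a `spark://host:7077` or YARN master runs the same program on a cluster; tasks lost to executor failure are re-executed elsewhere, which is the fault-tolerance half of the model.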
316 public repositories match this topic.
ETH Zurich - Web Scale Data Processing and Mining Project - Runs (Shell, updated Sep 10, 2014)
End-to-End, Real-time, Advanced Analytics Big Data Reference Pipeline using Spark, Spark SQL, Spark ML, GraphX, Spark Streaming, Kafka, NiFi, Cassandra, ElasticSearch, Redis, Tachyon, HDFS, Zeppelin, iPython/Jupyter Notebook, Tableau, Twitter Algebird. See https://github.com/fluxcapacitor/pipeline/wiki for setup instructions. (Shell, updated Feb 20, 2016)
Predictive analysis using Big Data platforms and machine learning libraries (Shell, updated Aug 1, 2016)
Spark Standalone Cluster With Zookeeper (Shell, updated Jan 18, 2017)
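A standalone Spark cluster gets master fail-over by pointing its masters at a ZooKeeper ensemble. A minimal `conf/spark-env.sh` sketch, assuming a ZooKeeper quorum reachable at `zk1:2181` and `zk2:2181` (hostnames are placeholders); the recovery properties are Spark's documented standalone-mode settings:

```shell
# conf/spark-env.sh on every master node: enable ZooKeeper-based recovery.
# If the active master dies, a standby is elected and running applications
# reattach to it without losing state.
export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER \
  -Dspark.deploy.zookeeper.url=zk1:2181,zk2:2181 \
  -Dspark.deploy.zookeeper.dir=/spark"
```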
In this repository you will find many instructions and exercises about big data, implemented with Hadoop and Spark (Shell, updated Feb 4, 2017)
Run Spark in Docker containers (Shell, updated Mar 4, 2017)
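Running Spark in a container can be as small as a single `docker run`. A sketch assuming the official `apache/spark` image from Docker Hub, whose Spark home is `/opt/spark`:

```shell
# Start an interactive local-mode Spark shell inside a throwaway container.
docker run -it --rm apache/spark \
  /opt/spark/bin/spark-shell --master "local[2]"
```

A multi-node setup would instead start one container as a standalone master and several as workers on a shared Docker network, which is what repositories like this one script.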
Local Kubernetes-based ML setup (Shell, updated Mar 5, 2017)
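For a Kubernetes-based setup, `spark-submit` can target the cluster's API server directly. A hedged sketch, assuming an API server at `https://localhost:6443` and a Spark container image tagged `my-spark:latest` (both placeholders), using Spark's documented Kubernetes properties:

```shell
# Submit the SparkPi example to Kubernetes: the driver and its two
# executors each run as pods. The local:// scheme means the jar is
# already inside the container image.
./bin/spark-submit \
  --master k8s://https://localhost:6443 \
  --deploy-mode cluster \
  --name spark-pi \
  --class org.apache.spark.examples.SparkPi \
  --conf spark.executor.instances=2 \
  --conf spark.kubernetes.container.image=my-spark:latest \
  local:///opt/spark/examples/jars/spark-examples.jar
```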
Based on the hadoop-on-docker image just built. (Shell, updated Mar 14, 2017)
Created by Matei Zaharia
Released May 26, 2014
Followers: 416
Repository: apache/spark
Website: spark.apache.org
Wikipedia