Updated Jan 26, 2023 - Shell
Apache Spark
Apache Spark is an open-source, general-purpose framework for distributed cluster computing. It provides an interface for programming entire clusters with implicit data parallelism and fault tolerance.
Here are 316 public repositories matching this topic...
A debian:jessie-based Spark + HadoopDFS Docker container.
Updated May 4, 2017 - Shell
Updated Aug 27, 2019 - Shell
Custom Spark build for local installation that uses newer versions of Hadoop and the AWS SDK
Updated Apr 1, 2020 - Shell
Serverless PySpark
Updated Mar 10, 2020 - Shell
Dockerized PredictionIO as a standalone image
Updated Aug 28, 2017 - Shell
Docker App for services including Kafka, Spark, and Cassandra
Updated Mar 31, 2019 - Shell
Exploring details of motor vehicle collisions in New York City, using data provided by the New York City Police Department (NYPD).
Updated Mar 9, 2019 - Shell
Updated Jun 22, 2022 - Shell
A project for the Big Data course at Roma Tre University.
Updated May 12, 2020 - Shell
A Spark script for processing large-scale file system snapshot data.
Updated Apr 12, 2021 - Shell
Created by Matei Zaharia
Released May 26, 2014
- Followers: 416
- Repository: apache/spark
- Website: spark.apache.org
- Wikipedia