Apache Spark docker image
Updated Apr 21, 2023 - Shell
Apache Spark is an open source distributed general-purpose cluster-computing framework. It provides an interface for programming entire clusters with implicit data parallelism and fault tolerance.
Apache Spark docker image
A curated list of awesome Apache Spark packages and resources.
[PROJECT IS NO LONGER MAINTAINED] Wirbelsturm is a Vagrant and Puppet based tool to perform 1-click local and remote deployments, with a focus on big data tech like Kafka.
Ansible roles to install a Spark Standalone cluster (HDFS/Spark/Jupyter Notebook) or an Ambari-based Spark cluster
Easy CPU Profiling for Apache Spark applications
A .NET for Apache Spark docker image (3rdman/dotnet-spark)
Driver/Executor images for spark-operator
Production run of Apache Spark on Kubernetes
Sparkler Crawl Environment - a packaged, dockerized version of http://github.com/USCDataScience/sparkler.git
An image for running Jupyter notebooks and Apache Spark in the cloud on OpenShift
Demo of running Apache Spark jobs using Tekton and s2i workflows
This is the material for the 2019 Silicon Valley Code Camp Session "Realish Time Predictive Analytics with Spark Structured Streaming"
Create an n-node cluster and run a Spark job on Docker
Host files and procedure for running Fink on Kubernetes
Apache Spark cluster with Docker Swarm, Prometheus, and cAdvisor
Sample Oozie workflow to test a Spark job. The workflow uses the Shell action to call a shell script, which in turn invokes the Spark Pi example job.
Apache Spark to run on Kubernetes
Created by Matei Zaharia
Released May 26, 2014