spark-operator


{ConfigMap|CRD}-based approach for managing Spark clusters in Kubernetes and OpenShift.

This operator uses the abstract-operator library.

Watch the full asciicast

How does it work

UML diagram

Quick Start

Run the spark-operator deployment:

kubectl apply -f manifest/operator.yaml
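
Before creating a cluster, you can check that the operator came up. A minimal sanity check, assuming the deployment created by operator.yaml is named spark-operator (consistent with the pod listing below):

# block until the operator deployment finishes rolling out (deployment name assumed)
kubectl rollout status deployment/spark-operator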

Create new cluster from the prepared example:

kubectl apply -f examples/cluster.yaml

After issuing the commands above, you should be able to see a new Spark cluster running in the current namespace.

kubectl get pods
NAME                               READY     STATUS    RESTARTS   AGE
my-spark-cluster-m-5kjtj           1/1       Running   0          10s
my-spark-cluster-w-m8knz           1/1       Running   0          10s
my-spark-cluster-w-vg9k2           1/1       Running   0          10s
spark-operator-510388731-852b2     1/1       Running   0          27s

Once you no longer need the cluster, you can delete it by deleting its config map resource:

kubectl delete cm my-spark-cluster

Very Quick Start

# create operator
kubectl apply -f http://bit.ly/sparkop

# create cluster
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: ConfigMap
metadata:
  name: my-cluster
  labels:
    radanalytics.io/kind: sparkcluster
data:
  config: |-
    worker:
      instances: "2"
EOF
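
As in the Quick Start above, this cluster can be removed by deleting its ConfigMap:

kubectl delete cm my-cluster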

OpenShift

For deployment on OpenShift, use the same commands as above, but with oc instead of kubectl.
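
For example, assuming you are already logged in (oc login), the Quick Start steps become:

# deploy the operator and an example cluster on OpenShift
oc apply -f manifest/operator.yaml
oc apply -f examples/cluster.yaml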

Custom Resource Definitions (CRD)

This operator can also work with CRDs. Assuming the admin user is logged in, you can install the operator with:

kubectl apply -f manifest/operator-crd.yaml

and then create Spark clusters by creating custom resources (CRs):

kubectl apply -f examples/cluster-cr.yaml
kubectl get sparkclusters
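
The custom resource mirrors the ConfigMap-based configuration shown above. A minimal sketch of such a CR, assuming a radanalytics.io/v1 apiVersion and SparkCluster kind (names inferred from the project's labels, not confirmed here):

cat <<EOF | kubectl apply -f -
apiVersion: radanalytics.io/v1
kind: SparkCluster
metadata:
  name: my-cluster
spec:
  worker:
    instances: "2"
EOF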

Images

| Image name | Description | quay.io | docker.io |
| --- | --- | --- | --- |
| :latest-released | represents the latest released version | quay.io repo | docker.io repo |
| :latest | represents the master branch | | |
| :x.y.z | one particular released version | | |

For each variant, an image with the -alpine suffix, based on Alpine Linux, is also available.
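
For instance, pulling a released image could look like this (the quay.io/radanalyticsio/spark-operator coordinates are an assumption based on the project's organization, not stated above):

# pull the released image and its Alpine-based variant (image path assumed)
docker pull quay.io/radanalyticsio/spark-operator:latest-released
docker pull quay.io/radanalyticsio/spark-operator:latest-released-alpine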