
dday2019

  • Build the Spark Docker image
wget http://ftp.wayne.edu/apache/spark/spark-2.4.4/spark-2.4.4-bin-hadoop2.7.tgz
tar zxvf spark-2.4.4-bin-hadoop2.7.tgz
cd spark-2.4.4-bin-hadoop2.7
cp ../patch/run.sh sbin
cp ../patch/Dockerfile kubernetes/dockerfiles/spark
./bin/docker-image-tool.sh -r docker.io/adrian555 -t v2.4.4 build
./bin/docker-image-tool.sh -r docker.io/adrian555 -t v2.4.4 push
cd ..

Note: you can also push the images to quay.io, as long as you tag them as quay.io/<org>/<image>:<tag> and have logged in to quay.io.

  • Create an Ansible-based spark-operator
operator-sdk new spark-operator --api-version=ibm.com/v1alpha1 --kind=Spark --type=ansible

Now we will add the control logic to deploy the Spark cluster, by replacing the spark-operator directory with the pre-modified one, spark-operator.full.

rm -rf spark-operator
mv spark-operator.full spark-operator

By default, a Role and RoleBinding are created, so the operator is namespace scoped. We want to make this a cluster-scoped operator, and for simplicity we will just bind the cluster-admin role to the operator's service account.
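For illustration, a cluster-scoped binding that grants cluster-admin to the operator's service account looks roughly like the sketch below (the resource name and namespace here are assumptions; the actual deploy/role_binding.yaml in the repo may differ):

# Sketch only; name and namespace are assumptions, not the repo's exact file.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: spark-operator
subjects:
- kind: ServiceAccount
  name: spark-operator
  namespace: dday2019
roleRef:
  kind: ClusterRole
  name: cluster-admin
  apiGroup: rbac.authorization.k8s.io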

Let's look at the files we have changed:

deploy/crds, deploy/operator.yaml, deploy/role_binding.yaml, roles/spark/defaults, roles/spark/tasks, roles/spark/templates.
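Among these, deploy/crds holds the Spark custom resource that drives the Ansible role. A minimal sketch of such a CR is shown below (metadata.name is an assumption; worker_size is the field we update later in this walkthrough):

# Sketch of the Spark CR; metadata.name is an assumption.
apiVersion: ibm.com/v1alpha1
kind: Spark
metadata:
  name: example-spark
spec:
  worker_size: 1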

  • Build the operator image
cd spark-operator
operator-sdk build adrian555/spark-operator:v0.0.1
docker push adrian555/spark-operator:v0.0.1
cd ..
  • Update the spark-operator to use this image
cd spark-operator
sed -i '' 's/{{ REPLACE_IMAGE }}/adrian555\/spark-operator:v0.0.1/g' deploy/operator.yaml
sed -i '' 's/imagePullPolicy:.*$/imagePullPolicy: Always/g' deploy/operator.yaml
cd ..
  • Now we are ready to deploy the operator. Log in to the cluster through the browser at

https://console-openshift-console.apps.ddoc.os.fyre.ibm.com using <admin>/<password>, then retrieve the login command. It looks something like this:

oc login --token=0q40Ehp_x8Kjp_BLtJmFr9E0z_hYOmL7hriXONjrFzM --server=https://api.ddoc.os.fyre.ibm.com:6443

# create a new project
oc new-project dday2019

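# allow pods using the default service account to run as any UID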
oc adm policy add-scc-to-user anyuid -z default

Note: on macOS, brew install openshift-cli will install the oc CLI.

  • Install the spark-operator:
cd spark-operator
ls deploy/crds
oc apply -f deploy/crds/ibm_v1alpha1_spark_crd.yaml
oc apply -f deploy/service_account.yaml
oc apply -f deploy/role.yaml
oc apply -f deploy/role_binding.yaml
oc apply -f deploy/operator.yaml

We can look at what this operator is doing so far:

kubectl logs deployment/spark-operator operator -n dday2019 -f
  • Now we will deploy a Spark cluster with this operator.
oc apply -f deploy/crds/ibm_v1alpha1_spark_cr.yaml

Again, we can look at what the operator is doing:

kubectl logs deployment/spark-operator operator -n dday2019 -f
  • Run a Jupyter notebook from ddoc-inf.fyre.ibm.com (since the Fyre cluster only gives out one external IP)
yum install -y python36
python3 -m venv p3
source p3/bin/activate
pip install jupyter
jupyter notebook --ip 0.0.0.0 --no-browser --port 9090 --allow-root &

On your laptop, open the following URL in a browser (the token may change):

http://ddoc-inf.fyre.ibm.com:9090/?token=171969c1fe2565206f743a988dc48414066f5705dddcadd7

(Optional) On your laptop, run the following to create an SSH tunnel:

ssh -N -L 9090:localhost:9090 root@ddoc-inf.fyre.ibm.com

Then open localhost:9090 in the browser to get to the Jupyter notebook.

  • Run the following spark-shell command from the Jupyter terminal (in the web browser)
bin/spark-shell --master spark://master0.ddoc.os.fyre.ibm.com:31313
  • One of the day-2 operations is scaling the application. To change the number of workers, we update the custom resource YAML file and apply it; the operator then scales the Spark cluster up or down automatically.
cd spark-operator
sed -i '' 's/worker_size:.*$/worker_size: 2/g' deploy/crds/ibm_v1alpha1_spark_cr.yaml
oc apply -f deploy/crds/ibm_v1alpha1_spark_cr.yaml
cd ..
  • We can also change the environment variables in the spark-operator deployment.
cd spark-operator
oc apply -f deploy/operator.n.yaml
cd ..
  • Clean up before moving on
cd spark-operator
oc delete -f deploy/crds/ibm_v1alpha1_spark_cr.yaml
oc delete -f deploy/operator.n.yaml
oc delete -f deploy/role_binding.yaml
oc delete -f deploy/role.yaml
oc delete -f deploy/service_account.yaml
oc delete -f deploy/crds/ibm_v1alpha1_spark_crd.yaml

Now we move on to integrating this operator with the Operator Lifecycle Manager (OLM).

  • Generate the ClusterServiceVersion for this operator:
cd spark-operator
operator-sdk olm-catalog gen-csv --csv-version 0.0.1 --update-crds
cd ..
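The generated CSV lands under deploy/olm-catalog. At a high level it follows the skeleton below (values here are illustrative, not the exact generated content):

# Illustrative ClusterServiceVersion skeleton, not the exact generated file.
apiVersion: operators.coreos.com/v1alpha1
kind: ClusterServiceVersion
metadata:
  name: spark-operator.v0.0.1
spec:
  displayName: Spark Operator
  version: 0.0.1
  installModes:
  - type: AllNamespaces
    supported: true
  install:
    strategy: deployment
    spec:
      deployments:
      - name: spark-operator
        spec: {} # the Deployment spec from deploy/operator.yaml is embedded here
  customresourcedefinitions:
    owned:
    - name: sparks.ibm.com
      kind: Spark
      version: v1alpha1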
  • Update some metadata so the bundle passes operator-courier. First compare the two files in VS Code.
cd spark-operator/deploy
rm -rf olm-catalog
mv olm.new olm-catalog
pip install operator-courier
cd olm-catalog
operator-courier verify spark-operator
cd ../../..

Note: once an operator bundle is verified, you can technically push it to quay.io with the operator-courier push command.

  • Now we will build the catalog source image.
# copy the csv
cd olm
mkdir operators
cp -r ../spark-operator/deploy/olm-catalog/spark-operator operators

# build the registry image
docker build . -t adrian555/spark-operator-registry:v0.0.1
docker push adrian555/spark-operator-registry:v0.0.1

# create the catalogsource, source namespace is set to openshift-operator-lifecycle-manager
oc apply -f catalogsource.yaml

# the operator now shows up in the package manifest list
oc get packagemanifest -n openshift-operator-lifecycle-manager

cd -
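
For reference, catalogsource.yaml is essentially a grpc CatalogSource pointing at the registry image we just pushed; a sketch (the metadata name is an assumption) looks like:

# Sketch of catalogsource.yaml; metadata.name is an assumption.
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: spark-operator-catalog
  namespace: openshift-operator-lifecycle-manager
spec:
  sourceType: grpc
  image: adrian555/spark-operator-registry:v0.0.1
  displayName: Spark Operator Catalog

The csmarketplace.yaml used later is presumably the same manifest with the namespace set to openshift-marketplace.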

We can now find this operator in the console by clicking through Catalog | Operator Management | Operator Catalogs.

  • Install the spark-operator to the openshift-operators namespace (a Subscription sketch follows this list)
  • Deploy the Spark cluster to the openshift-operators namespace
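
Installing through OLM is driven by a Subscription. A minimal sketch for this operator (the channel, package, and source names are assumptions based on the catalog above) would be:

# Sketch of a Subscription; channel, package, and source names are assumptions.
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: spark-operator
  namespace: openshift-operators
spec:
  channel: alpha
  name: spark-operator
  source: spark-operator-catalog
  sourceNamespace: openshift-operator-lifecycle-manager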

To register the operator in a different namespace, for example openshift-marketplace, update the CatalogSource manifest to set that namespace as the operator's source namespace.

  • Create the catalogsource with a different source namespace openshift-marketplace.
cd olm
oc apply -f csmarketplace.yaml
oc get packagemanifest -n openshift-marketplace
cd -
  • Create a demo-operators OperatorGroup through the web console (a manifest sketch follows the commands below)
  • We also need to add the ClusterRole rules for the OperatorGroup
cd olm
oc apply -f demo-og-admin.yaml
oc apply -f demo-og-edit.yaml
oc apply -f demo-og-view.yaml
cd -
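
For reference, the demo-operators OperatorGroup created in the console is conceptually a manifest like the sketch below (the target namespace here is an assumption):

# Sketch of an OperatorGroup; the target namespace is an assumption.
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: demo-operators
  namespace: default
spec:
  targetNamespaces:
  - default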
  • Deploy the operator and create the Spark cluster in the default namespace.

  • Run the same Jupyter notebook example.

  • If the cluster is a plain Kubernetes cluster without OLM installed, OLM can be installed in the following way

curl -sL https://github.com/operator-framework/operator-lifecycle-manager/releases/download/0.12.0/install.sh | bash -s 0.12.0

There is also a web console for OLM; you can run it locally with the script at https://github.com/operator-framework/operator-lifecycle-manager/blob/master/scripts/run_console_local.sh.

Or you can run it as a service.

cd olm
oc apply -f olm-console.yaml
oc get svc olm-console -n olm
cd -
