The BigDL bigdl-k8s image has been built to easily run applications on a Kubernetes cluster. This page introduces the pre-installed packages and the usage of the image.
- A runnable Docker environment has been set up.
- A running Kubernetes cluster is prepared. Also make sure kubectl has permission to create, list and delete pods.
- Pull the BigDL bigdl-k8s image from Docker Hub:
sudo docker pull intelanalytics/bigdl-k8s:latest
- Speed up pulling the image by adding a registry mirror
To speed up pulling the image from Docker Hub in China, add a registry mirror. On Linux (CentOS, Ubuntu, etc.), if the Docker version is higher than 1.12, configure the Docker daemon: edit /etc/docker/daemon.json and add the registry-mirrors key and value:
{
"registry-mirrors": ["https://<my-docker-mirror-host>"]
}
For example, to use the USTC mirror in China:
{
"registry-mirrors": ["https://docker.mirrors.ustc.edu.cn"]
}
Flush changes and restart docker:
sudo systemctl daemon-reload
sudo systemctl restart docker
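To confirm the mirror is active, you can check the daemon info (an optional sanity check, not part of the original setup):
sudo docker info | grep -A 2 'Registry Mirrors'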
If your Docker version is between 1.8 and 1.11, find the Docker configuration file (its location depends on the operating system), edit it to add DOCKER_OPTS="--registry-mirror=https://<my-docker-mirror-host>", and then restart Docker:
sudo service docker restart
To speed up pulling this image on macOS or Windows, open the Docker settings, configure the registry-mirrors section with your mirror host, and restart Docker.
Then pull the image; it will be faster:
sudo docker pull intelanalytics/bigdl-k8s:latest
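Once the pull completes, the image should show up locally:
sudo docker images | grep bigdl-k8s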
2. K8s configuration
Get the k8s master address to use as the Spark master:
kubectl cluster-info
After running this command, it shows "Kubernetes master is running at https://127.0.0.1:12345", which means:
master="k8s://https://127.0.0.1:12345"
The namespace is default, or the one specified by spark.kubernetes.namespace.
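If you prefer a dedicated namespace instead of default, create it and pass it to spark-submit later (the namespace name here is only an example; remember to create the service account and role binding of the next step in that namespace as well):
kubectl create namespace bigdl
# then add the following option to your spark-submit command:
# --conf spark.kubernetes.namespace=bigdl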
RBAC:
kubectl create serviceaccount spark
kubectl create clusterrolebinding spark-role --clusterrole=edit --serviceaccount=default:spark --namespace=default
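As an optional check, verify that the service account can now manage pods:
kubectl auth can-i create pods --as=system:serviceaccount:default:spark
kubectl auth can-i delete pods --as=system:serviceaccount:default:spark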
View the k8s configuration file:
.kube/config
or
kubectl config view --flatten --minify > kubeconfig
The k8s data can be stored in NFS or Ceph; take NFS as an example.
On the NFS server, run:
yum install nfs-utils
systemctl enable rpcbind
systemctl enable nfs
systemctl start rpcbind
firewall-cmd --zone=public --permanent --add-service={rpc-bind,mountd,nfs}
firewall-cmd --reload
mkdir /disk1/nfsdata
chmod 755 /disk1/nfsdata
nano /etc/exports    # add the line: /disk1/nfsdata *(rw,sync,no_root_squash,no_all_squash)
systemctl restart nfs
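Optionally, verify on the NFS server that the export is visible:
exportfs -v
showmount -e localhost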
On the NFS client, run:
yum install -y nfs-utils && systemctl start rpcbind && showmount -e <nfs-master-ip-address>
On the k8s side, deploy the NFS client provisioner:
git clone https://github.com/kubernetes-incubator/external-storage.git
cd /XXX/external-storage/nfs-client
nano deploy/deployment.yaml    # point the provisioner at your NFS server address and export path
nano deploy/rbac.yaml          # adjust the namespace if you are not using "default"
kubectl create -f deploy/rbac.yaml
kubectl create -f deploy/deployment.yaml
kubectl create -f deploy/class.yaml
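Before running the test below, it is worth checking that the provisioner pod is up and that its storage class exists (the names assume the unmodified upstream manifests, where the deployment is called nfs-client-provisioner):
kubectl get pods | grep nfs-client-provisioner
kubectl get storageclass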
Test:
kubectl create -f deploy/test-claim.yaml
kubectl create -f deploy/test-pod.yaml
kubectl get pvc
kubectl delete -f deploy/test-pod.yaml
kubectl delete -f deploy/test-claim.yaml
If the test succeeds, create your own volume claim (a minimal sketch of the file is shown after the command) and run:
kubectl create -f deploy/nfs-volume-claim.yaml
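deploy/nfs-volume-claim.yaml is not shipped with the provisioner, so you need to write it yourself. Below is a minimal sketch of such a claim: the claim name matches the myvolumeclaim value used later for RUNTIME_PERSISTENT_VOLUME_CLAIM, while the storage class name and requested size are assumptions (the class name must match the one defined in deploy/class.yaml, which is managed-nfs-storage in the unmodified upstream repository).
cat <<EOF > deploy/nfs-volume-claim.yaml
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: myvolumeclaim                      # matches RUNTIME_PERSISTENT_VOLUME_CLAIM used below
spec:
  storageClassName: managed-nfs-storage    # must match the class name in deploy/class.yaml
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 100Gi                       # example size; adjust as needed
EOF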
3. Launch a k8s client container
Please note the two different containers: the client container is where you submit BigDL jobs, since it contains all the required environments and libraries except the Hadoop/k8s configs; the executor container does not need to be created manually, as it is scheduled by k8s at runtime.
sudo docker run -itd --net=host \
-v /etc/kubernetes:/etc/kubernetes \
-v /root/.kube:/root/.kube \
intelanalytics/bigdl-k8s:latest bash
Note: to launch the client container, -v /etc/kubernetes:/etc/kubernetes and -v /root/.kube:/root/.kube are required to specify the paths of the kube config and installation.
To specify more arguments, use:
sudo docker run -itd --net=host \
-v /etc/kubernetes:/etc/kubernetes \
-v /root/.kube:/root/.kube \
-e http_proxy=http://your-proxy-host:your-proxy-port \
-e https_proxy=https://your-proxy-host:your-proxy-port \
-e RUNTIME_SPARK_MASTER=k8s://https://<k8s-apiserver-host>:<k8s-apiserver-port> \
-e RUNTIME_K8S_SERVICE_ACCOUNT=account \
-e RUNTIME_K8S_SPARK_IMAGE=intelanalytics/bigdl-k8s:latest \
-e RUNTIME_PERSISTENT_VOLUME_CLAIM=myvolumeclaim \
-e RUNTIME_DRIVER_HOST=x.x.x.x \
-e RUNTIME_DRIVER_PORT=54321 \
-e RUNTIME_EXECUTOR_INSTANCES=1 \
-e RUNTIME_EXECUTOR_CORES=4 \
-e RUNTIME_EXECUTOR_MEMORY=20g \
-e RUNTIME_TOTAL_EXECUTOR_CORES=4 \
-e RUNTIME_DRIVER_CORES=4 \
-e RUNTIME_DRIVER_MEMORY=10g \
intelanalytics/bigdl-k8s:latest bash
- http_proxy is to specify http proxy.
- https_proxy is to specify https proxy.
- RUNTIME_SPARK_MASTER is to specify the Spark master, which should be k8s://https://<k8s-apiserver-host>:<k8s-apiserver-port> or spark://<spark-master-host>:<spark-master-port>.
- RUNTIME_K8S_SERVICE_ACCOUNT is the service account for the driver pod. Please refer to k8s RBAC.
- RUNTIME_K8S_SPARK_IMAGE is the k8s image.
- RUNTIME_PERSISTENT_VOLUME_CLAIM is to specify the persistent volume claim for volume mounts. A volume mount is expected to be used to store or receive data; get it ready with Kubernetes Volumes.
- RUNTIME_DRIVER_HOST is to specify the driver host (only required when submitting jobs in Kubernetes client mode).
- RUNTIME_DRIVER_PORT is to specify the driver port number (only required when submitting jobs in Kubernetes client mode).
- The other environment variables are for Spark configuration settings. The default values used in this image are listed above; replace them as needed.
Once the container is created, enter it by:
sudo docker exec -it <containerID> bash
Then you will see a prompt like:
root@[hostname]:/opt/spark/work-dir#
/opt/spark/work-dir is the Spark work directory.
Note: the /opt directory contains:
- download-bigdl.sh is used for downloading Analytics-Zoo distributions.
- jdk is the jdk home.
- spark is the spark home.
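Once inside, you can take a quick look around (these are optional checks; the environment variables are the ones referenced by the spark-submit commands later):
ls /opt
echo ${SPARK_HOME} ${ANALYTICS_ZOO_HOME}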
Here is a sample for submitting the Python anomalydetection example in cluster mode.
${SPARK_HOME}/bin/spark-submit \
--master ${RUNTIME_SPARK_MASTER} \
--deploy-mode cluster \
--conf spark.kubernetes.authenticate.driver.serviceAccountName=${RUNTIME_K8S_SERVICE_ACCOUNT} \
--name bigdl-app-k8s \
--conf spark.kubernetes.container.image=${RUNTIME_K8S_SPARK_IMAGE} \
--conf spark.executor.instances=${RUNTIME_EXECUTOR_INSTANCES} \
--conf spark.kubernetes.driver.volumes.persistentVolumeClaim.${RUNTIME_PERSISTENT_VOLUME_CLAIM}.options.claimName=${RUNTIME_PERSISTENT_VOLUME_CLAIM} \
--conf spark.kubernetes.driver.volumes.persistentVolumeClaim.${RUNTIME_PERSISTENT_VOLUME_CLAIM}.mount.path=/bigdl/data \
--conf spark.kubernetes.executor.volumes.persistentVolumeClaim.${RUNTIME_PERSISTENT_VOLUME_CLAIM}.options.claimName=${RUNTIME_PERSISTENT_VOLUME_CLAIM} \
--conf spark.kubernetes.executor.volumes.persistentVolumeClaim.${RUNTIME_PERSISTENT_VOLUME_CLAIM}.mount.path=/bigdl/data \
--conf spark.kubernetes.driver.label.<your-label>=true \
--conf spark.kubernetes.executor.label.<your-label>=true \
--executor-cores ${RUNTIME_EXECUTOR_CORES} \
--executor-memory ${RUNTIME_EXECUTOR_MEMORY} \
--total-executor-cores ${RUNTIME_TOTAL_EXECUTOR_CORES} \
--driver-cores ${RUNTIME_DRIVER_CORES} \
--driver-memory ${RUNTIME_DRIVER_MEMORY} \
--properties-file ${ANALYTICS_ZOO_HOME}/conf/spark-analytics-zoo.conf \
--py-files ${ANALYTICS_ZOO_HOME}/lib/analytics-zoo-bigdl_${BIGDL_VERSION}-spark_${SPARK_VERSION}-${ANALYTICS_ZOO_VERSION}-python-api.zip,/opt/analytics-zoo-examples/python/anomalydetection/anomaly_detection.py \
--conf spark.driver.extraJavaOptions=-Dderby.stream.error.file=/tmp \
--conf spark.sql.catalogImplementation='in-memory' \
--conf spark.driver.extraClassPath=${ANALYTICS_ZOO_HOME}/lib/analytics-zoo-bigdl_${BIGDL_VERSION}-spark_${SPARK_VERSION}-${ANALYTICS_ZOO_VERSION}-jar-with-dependencies.jar \
--conf spark.executor.extraClassPath=${ANALYTICS_ZOO_HOME}/lib/analytics-zoo-bigdl_${BIGDL_VERSION}-spark_${SPARK_VERSION}-${ANALYTICS_ZOO_VERSION}-jar-with-dependencies.jar \
file:///opt/analytics-zoo-examples/python/anomalydetection/anomaly_detection.py \
--input_dir /bigdl/data/nyc_taxi.csv
Options:
- --master: the Spark master, which must be a URL with the format k8s://https://<k8s-apiserver-host>:<k8s-apiserver-port>.
- --deploy-mode: submit the application in cluster mode.
- --name: the Spark application name.
- --conf: required to specify the k8s service account, the container image to use for the Spark application, the driver volume name and mount path, pod labels, Spark driver and executor configuration, etc. Check the argument settings in your environment and refer to the Spark configuration page and the Spark on k8s configuration page for more details.
- --properties-file: the customized conf properties.
- --py-files: the extra Python packages needed.
- file://: the local file path of the Python example file in the client container.
- --input_dir: the input data path of the anomaly detection example. The data path is on the mounted filesystem of the host. Refer to Kubernetes Volumes for more details.
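The dataset itself has to be placed on the volume before submitting. A rough sketch, assuming the NFS-backed claim from the previous section: the provisioner typically creates one subdirectory per claim under the export, so locate it on the NFS server and copy the file there (the subdirectory name below is a placeholder).
ls /disk1/nfsdata/
cp nyc_taxi.csv /disk1/nfsdata/<directory-backing-myvolumeclaim>/
The pods will then see the file at /bigdl/data/nyc_taxi.csv, matching the --input_dir argument above.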
See more python examples running on k8s.
Here is a sample for submitting the Scala anomalydetection example in cluster mode.
${SPARK_HOME}/bin/spark-submit \
--master ${RUNTIME_SPARK_MASTER} \
--deploy-mode cluster \
--conf spark.kubernetes.authenticate.driver.serviceAccountName=${RUNTIME_K8S_SERVICE_ACCOUNT} \
--name analytics-zoo \
--conf spark.kubernetes.container.image=${RUNTIME_K8S_SPARK_IMAGE} \
--conf spark.executor.instances=${RUNTIME_EXECUTOR_INSTANCES} \
--conf spark.kubernetes.driver.volumes.persistentVolumeClaim.${RUNTIME_PERSISTENT_VOLUME_CLAIM}.options.claimName=${RUNTIME_PERSISTENT_VOLUME_CLAIM} \
--conf spark.kubernetes.driver.volumes.persistentVolumeClaim.${RUNTIME_PERSISTENT_VOLUME_CLAIM}.mount.path=/zoo \
--conf spark.kubernetes.executor.volumes.persistentVolumeClaim.${RUNTIME_PERSISTENT_VOLUME_CLAIM}.options.claimName=${RUNTIME_PERSISTENT_VOLUME_CLAIM} \
--conf spark.kubernetes.executor.volumes.persistentVolumeClaim.${RUNTIME_PERSISTENT_VOLUME_CLAIM}.mount.path=/zoo \
--conf spark.kubernetes.driver.label.<your-label>=true \
--conf spark.kubernetes.executor.label.<your-label>=true \
--executor-cores ${RUNTIME_EXECUTOR_CORES} \
--executor-memory ${RUNTIME_EXECUTOR_MEMORY} \
--total-executor-cores ${RUNTIME_TOTAL_EXECUTOR_CORES} \
--driver-cores ${RUNTIME_DRIVER_CORES} \
--driver-memory ${RUNTIME_DRIVER_MEMORY} \
--properties-file ${ANALYTICS_ZOO_HOME}/conf/spark-analytics-zoo.conf \
--py-files ${ANALYTICS_ZOO_HOME}/lib/analytics-zoo-bigdl_${BIGDL_VERSION}-spark_${SPARK_VERSION}-${ANALYTICS_ZOO_VERSION}-python-api.zip \
--conf spark.driver.extraJavaOptions=-Dderby.stream.error.file=/tmp \
--conf spark.sql.catalogImplementation='in-memory' \
--conf spark.driver.extraClassPath=${ANALYTICS_ZOO_HOME}/lib/analytics-zoo-bigdl_${BIGDL_VERSION}-spark_${SPARK_VERSION}-${ANALYTICS_ZOO_VERSION}-jar-with-dependencies.jar \
--conf spark.executor.extraClassPath=${ANALYTICS_ZOO_HOME}/lib/analytics-zoo-bigdl_${BIGDL_VERSION}-spark_${SPARK_VERSION}-${ANALYTICS_ZOO_VERSION}-jar-with-dependencies.jar \
--class com.intel.analytics.zoo.examples.anomalydetection.AnomalyDetection \
${ANALYTICS_ZOO_HOME}/lib/analytics-zoo-bigdl_${BIGDL_VERSION}-spark_${SPARK_VERSION}-${ANALYTICS_ZOO_VERSION}-jar-with-dependencies.jar \
--inputDir /zoo/data
Options:
- --master: the Spark master, which must be a URL with the format k8s://https://<k8s-apiserver-host>:<k8s-apiserver-port>.
- --deploy-mode: submit the application in cluster mode.
- --name: the Spark application name.
- --conf: required to specify the k8s service account, the container image to use for the Spark application, the driver volume name and mount path, pod labels, Spark driver and executor configuration, etc. Check the argument settings in your environment and refer to the Spark configuration page and the Spark on k8s configuration page for more details.
- --properties-file: the customized conf properties.
- --py-files: the extra Python packages needed.
- --class: the Scala example class name.
- --inputDir: the input data path of the anomaly detection example. The data path is on the mounted filesystem of the host. Refer to Kubernetes Volumes for more details.
See more scala examples running on k8s.
When the application is running, you can stream the logs of the driver pod:
$ kubectl logs <spark-driver-pod>
To check the pod status or get some basic information about the pod, use:
$ kubectl describe pod <spark-driver-pod>
You can also check other pods in a similar way.
After the application finishes, delete the driver pod:
$ kubectl delete pod <spark-driver-pod>
Or clean up the entire Spark application by pod label:
$ kubectl delete pod -l <pod label>
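For example, if the pods were labelled through the spark.kubernetes.driver.label.<your-label>=true and spark.kubernetes.executor.label.<your-label>=true settings used in the submit commands above:
$ kubectl delete pod -l <your-label>=true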