From 48ecb1eac55d625361c0e1f52cde15079055ede5 Mon Sep 17 00:00:00 2001
From: Yinan Li
Date: Fri, 24 Aug 2018 10:52:46 -0700
Subject: [PATCH 1/2] [SPARK-24090][K8S] Update running-on-kubernetes.md

---
 docs/running-on-kubernetes.md | 39 ++++++++++++++++++++++++++++-------
 1 file changed, 31 insertions(+), 8 deletions(-)

diff --git a/docs/running-on-kubernetes.md b/docs/running-on-kubernetes.md
index 8f84ca044e163..c0aab71b24c5c 100644
--- a/docs/running-on-kubernetes.md
+++ b/docs/running-on-kubernetes.md
@@ -185,6 +185,35 @@ To use a secret through an environment variable use the following options to the
 --conf spark.kubernetes.executor.secretKeyRef.ENV_NAME=name:key
 ```
 
+## Using Kubernetes Volumes
+Starting Spark 2.4.0, users can mount the following types of Kubernetes [volumes](https://kubernetes.io/docs/concepts/storage/volumes/) into the driver and executor pods:
+* [hostPath](https://kubernetes.io/docs/concepts/storage/volumes/#hostpath): mounts a file or directory from the host node’s filesystem into a pod.
+* [emptyDir](https://kubernetes.io/docs/concepts/storage/volumes/#emptydir): an initially empty volume created when a pod is assigned to a node.
+* [persistentVolumeClaim](https://kubernetes.io/docs/concepts/storage/volumes/#persistentvolumeclaim): used to mount a `PersistentVolume` into a pod.
+
+To mount a volume of any of the types above into the driver pod, use the following configuration property:
+
+```
+--conf spark.kubernetes.driver.volumes.[VolumeType].[VolumeName].mount.path=
+--conf spark.kubernetes.driver.volumes.[VolumeType].[VolumeName].mount.readOnly=
+```
+
+Specifically, `VolumeType` can be one of the following values: `hostPath`, `emptyDir`, and `persistentVolumeClaim`. `VolumeName` is the name you want to use for the volume under the `volumes` field in the pod specification.
+
+Each supported type of volumes may have some specific configuration options, which can be specified using configuration properties of the following form:
+
+```
+spark.kubernetes.driver.volumes.[VolumeType].[VolumeName].options.[OptionName]=
+```
+
+For example, the claim name of a `persistentVolumeClaim` with volume name `checkpointpvc` can be specified using the following property:
+
+```
+spark.kubernetes.driver.volumes.persistentVolumeClaim.checkpointpvc.options.claimName=check-point-pvc-claim
+```
+
+The configuration properties for mounting volumes into the executor pods use prefix `spark.kubernetes.executor.` instead of `spark.kubernetes.driver.`. For a complete list of available options for each supported type of volumes, please refer to the [Spark Properties](#spark-properties) section below.
+
 ## Introspection and Debugging
 
 These are the different ways in which you can investigate a running/completed Spark application, monitor progress, and
@@ -299,21 +328,15 @@ RBAC authorization and how to configure Kubernetes service accounts for pods, pl
 
 ## Future Work
 
-There are several Spark on Kubernetes features that are currently being incubated in a fork -
-[apache-spark-on-k8s/spark](https://github.com/apache-spark-on-k8s/spark), which are expected to eventually make it into
-future versions of the spark-kubernetes integration.
+There are several Spark on Kubernetes features that are currently being worked on or planned to be worked on. Those features are expected to eventually make it into future versions of the spark-kubernetes integration.
 
 Some of these include:
 
-* R
-* Dynamic Executor Scaling
+* Dynamic Resource Allocation and External Shuffle Service
 * Local File Dependency Management
 * Spark Application Management
 * Job Queues and Resource Management
 
-You can refer to the [documentation](https://apache-spark-on-k8s.github.io/userdocs/) if you want to try these features
-and provide feedback to the development team.
-
 # Configuration
 
 See the [configuration page](configuration.html) for information on Spark configurations. The following configurations are

From 7e8144ba8111cfeac12051b28c06b7ef87aa4720 Mon Sep 17 00:00:00 2001
From: Yinan Li
Date: Sun, 26 Aug 2018 22:07:55 -0700
Subject: [PATCH 2/2] Addressed review comments

---
 docs/running-on-kubernetes.md | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/docs/running-on-kubernetes.md b/docs/running-on-kubernetes.md
index c0aab71b24c5c..c83dad6df1e7b 100644
--- a/docs/running-on-kubernetes.md
+++ b/docs/running-on-kubernetes.md
@@ -186,7 +186,8 @@ To use a secret through an environment variable use the following options to the
 ```
 
 ## Using Kubernetes Volumes
-Starting Spark 2.4.0, users can mount the following types of Kubernetes [volumes](https://kubernetes.io/docs/concepts/storage/volumes/) into the driver and executor pods:
+
+Starting with Spark 2.4.0, users can mount the following types of Kubernetes [volumes](https://kubernetes.io/docs/concepts/storage/volumes/) into the driver and executor pods:
 * [hostPath](https://kubernetes.io/docs/concepts/storage/volumes/#hostpath): mounts a file or directory from the host node’s filesystem into a pod.
 * [emptyDir](https://kubernetes.io/docs/concepts/storage/volumes/#emptydir): an initially empty volume created when a pod is assigned to a node.
 * [persistentVolumeClaim](https://kubernetes.io/docs/concepts/storage/volumes/#persistentvolumeclaim): used to mount a `PersistentVolume` into a pod.
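
Note on the property naming scheme documented in this patch: the `mount.*` and `options.*` keys share one prefix built from the role (`driver` or `executor`), the volume type, and the volume name. The Python sketch below only assembles `--conf` flags from that pattern. The helper `volume_conf_flags` and the mount path `/checkpoints` are hypothetical and not part of Spark, and rendering `readOnly` as the strings `true`/`false` is an assumption; the volume name and claim name are taken from the example in the patch.

```python
SUPPORTED_VOLUME_TYPES = ("hostPath", "emptyDir", "persistentVolumeClaim")


def volume_conf_flags(role, volume_type, volume_name, mount_path,
                      read_only=False, options=None):
    """Build spark-submit --conf flags for one Kubernetes volume.

    Hypothetical helper: it only formats property names following the
    pattern spark.kubernetes.[role].volumes.[VolumeType].[VolumeName].*
    and does not talk to Spark or Kubernetes.
    """
    if role not in ("driver", "executor"):
        raise ValueError("role must be 'driver' or 'executor'")
    if volume_type not in SUPPORTED_VOLUME_TYPES:
        raise ValueError(f"unsupported volume type: {volume_type}")

    prefix = f"spark.kubernetes.{role}.volumes.{volume_type}.{volume_name}"
    props = {
        f"{prefix}.mount.path": mount_path,
        # Assumption: the boolean is passed as the string "true" or "false".
        f"{prefix}.mount.readOnly": str(read_only).lower(),
    }
    # Type-specific options, e.g. claimName for a persistentVolumeClaim.
    for name, value in (options or {}).items():
        props[f"{prefix}.options.{name}"] = value

    flags = []
    for key, value in props.items():
        flags += ["--conf", f"{key}={value}"]
    return flags


flags = volume_conf_flags(
    "driver", "persistentVolumeClaim", "checkpointpvc",
    mount_path="/checkpoints",  # hypothetical mount path
    options={"claimName": "check-point-pvc-claim"},
)
print(" ".join(flags))
```

Calling the helper with `role="executor"` yields the same keys under the `spark.kubernetes.executor.` prefix, matching the driver/executor symmetry the patch describes.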