
[SPARK-23529][K8s] Support mounting volumes #21260

Closed · wants to merge 14 commits

Conversation

andrusha
Contributor

andrusha commented May 7, 2018

This PR continues #21095 and intersects with #21238. I've added volume mounts as a separate step and added PersistentVolumeClaim support.

There is a fundamental problem with how we pass the options through Spark conf to fabric8. For each volume type and all possible volume options we would have to implement custom code to map config values to fabric8 calls. This results in a large body of code to maintain and means that Spark will always be somewhat out of sync with k8s.
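
To illustrate the kind of mapping code this implies, here is a minimal sketch (not code from this PR; it uses real fabric8 builder classes, but the option handling and method name are made up for illustration):

import io.fabric8.kubernetes.api.model._

// Each supported volume type needs its own hand-written mapping from Spark conf
// values to fabric8 calls; every new Kubernetes option grows this match.
def buildVolume(volumeType: String, name: String, options: Map[String, String]): Volume =
  volumeType match {
    case "hostPath" =>
      new VolumeBuilder()
        .withName(name)
        .withHostPath(new HostPathVolumeSourceBuilder().withPath(options("path")).build())
        .build()
    case "persistentVolumeClaim" =>
      new VolumeBuilder()
        .withName(name)
        .withPersistentVolumeClaim(
          new PersistentVolumeClaimVolumeSourceBuilder()
            .withClaimName(options("claimName"))
            .build())
        .build()
    case other =>
      throw new IllegalArgumentException(s"Unsupported volume type: $other")
  }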

I think there needs to be a discussion on how to proceed correctly (e.g., use PodPreset instead).


Due to the complications of provisioning and managing actual resources, this PR addresses only the mounting of already-provisioned volumes.


  • emptyDir support
  • Testing
  • Documentation
  • KubernetesVolumeUtils tests

@foxish
Contributor

foxish commented May 8, 2018

jenkins, ok to test

@SparkQA

SparkQA commented May 8, 2018

Test build #90363 has finished for PR 21260 at commit f1ada4c.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented May 8, 2018

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-spark-integration/2951/

@SparkQA

SparkQA commented May 8, 2018

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-spark-integration/2951/

@liyinan926
Contributor

There is a fundamental problem with how we pass the options through Spark conf to fabric8. For each volume type and all possible volume options we would have to implement custom code to map config values to fabric8 calls. This results in a large body of code to maintain and means that Spark will always be somewhat out of sync with k8s.

This is indeed concerning, given that we don't yet support a lot of pod customization options, e.g., affinity and anti-affinity, security context, etc. Ideally, pod specs should be specified declaratively as in Deployment and StatefulSet, but Spark is configuration-property based. The Spark Operator attempted to address this using initializers, but initializers are alpha and risky. Admission webhooks are an option, but again, they pose risks.

@liyinan926
Contributor

@andrusha Can you remove coverage of #21238 and update the title? I will take a look once it's ready. You also need to rebase the PR.

@andrusha
Contributor Author

@liyinan926 Sounds fair. It's also pending tests; I'll add those to the TODO list for this PR and ping you once it's done.

@er0sin

er0sin commented May 15, 2018

How does one configure a PV/PVC with this change?
spark.kubernetes.executor.volumes=pvName:containerPath ?

@liyinan926
Contributor

@er0sin PVCs can be mounted similarly to the example below:

spark.kubernetes.driver.volumes.persistentVolumeClaim.pv1.mount.path=/mnt
spark.kubernetes.driver.volumes.persistentVolumeClaim.pv1.mount.readOnly=false
spark.kubernetes.driver.volumes.persistentVolumeClaim.pv1.options.claimName=clm-1
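
For example (an illustrative invocation; the API server address and the remaining arguments are placeholders), these properties can be passed on the spark-submit command line:

bin/spark-submit \
  --master k8s://https://<api-server-host>:<port> \
  --deploy-mode cluster \
  --conf spark.kubernetes.driver.volumes.persistentVolumeClaim.pv1.mount.path=/mnt \
  --conf spark.kubernetes.driver.volumes.persistentVolumeClaim.pv1.mount.readOnly=false \
  --conf spark.kubernetes.driver.volumes.persistentVolumeClaim.pv1.options.claimName=clm-1 \
  ...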

andrusha changed the title from "[SPARK-23529][K8s] Support mounting hostPath volumes" to "[SPARK-23529][K8s] Support mounting volumes" on May 22, 2018
@SparkQA

SparkQA commented May 22, 2018

Test build #90956 has finished for PR 21260 at commit 0134239.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented May 22, 2018

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-spark-integration/3344/

@SparkQA

SparkQA commented May 22, 2018

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-spark-integration/3344/

@andrusha
Contributor Author

@liyinan926 hostPath, emptyDir and persistentVolumeClaim are now supported. That's enough to cover most use cases, since PVCs can point to cloud-specific volumes, e.g., AWS EFS. I've also added tests and documentation.

I would appreciate feedback on the documentation, as we have to make clear which features are supported.
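
For reference, the other two types follow the same config pattern as the PVC example above. An illustrative sketch (the volume names and values are made up, and the option keys are assumed to mirror the corresponding Kubernetes volume source fields):

spark.kubernetes.driver.volumes.hostPath.logs.mount.path=/var/log/spark
spark.kubernetes.driver.volumes.hostPath.logs.options.path=/var/log/spark
spark.kubernetes.driver.volumes.emptyDir.scratch.mount.path=/tmp/scratch
spark.kubernetes.driver.volumes.emptyDir.scratch.options.medium=Memory
spark.kubernetes.driver.volumes.emptyDir.scratch.options.sizeLimit=1Gi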

The following refactorings could be done:

  1. Introduce KubernetesConf.empty so it is simpler to add new specs, which results in fewer merge conflicts.
  2. Encode KubernetesVolumeSpec.volumeType as an ADT (sealed trait KubernetesVolumeType) so the compiler can do exhaustiveness checks (see the sketch below).
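
A minimal sketch of idea 2 (hypothetical names; not part of this PR):

sealed trait KubernetesVolumeType
case object HostPath extends KubernetesVolumeType
case object EmptyDir extends KubernetesVolumeType
case object PersistentVolumeClaim extends KubernetesVolumeType

// With a sealed trait, the compiler can warn about non-exhaustive matches
// when a new volume type is added.
def toK8sName(volumeType: KubernetesVolumeType): String = volumeType match {
  case HostPath => "hostPath"
  case EmptyDir => "emptyDir"
  case PersistentVolumeClaim => "persistentVolumeClaim"
}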

@SparkQA

SparkQA commented May 22, 2018

Test build #90960 has finished for PR 21260 at commit dde6cf9.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@andrusha
Contributor Author

@AmplabJenkins failed due to GitHub flakiness. Please retest.

@SparkQA

SparkQA commented May 22, 2018

Test build #90962 has finished for PR 21260 at commit 95d9312.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@liyinan926
Contributor

@andrusha Is this ready for review?

@andrusha
Contributor Author

@liyinan926 Yep, ready for review. If you think the refactorings I mentioned make sense, I can add those too.

* @param prefix the given property name prefix
* @return a Map with volume name as key and volume spec as value
*/
def parseVolumesWithPrefix(
Contributor Author

Tests are missing

@SparkQA

SparkQA commented May 24, 2018

Test build #91112 has finished for PR 21260 at commit de148e3.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

andrusha force-pushed the k8s-vol branch 2 times, most recently from f87bf3f to f218d8a on May 24, 2018 16:40
@SparkQA

SparkQA commented May 24, 2018

Test build #91114 has finished for PR 21260 at commit f218d8a.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented May 24, 2018

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-spark-integration/3435/

@SparkQA

SparkQA commented May 24, 2018

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-spark-integration/3435/

@SparkQA

SparkQA commented Jun 11, 2018

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-spark-integration/3749/

import org.apache.spark.deploy.k8s._

private[spark] class MountVolumesFeatureStep(
kubernetesConf: KubernetesConf[_ <: KubernetesRoleSpecificConf])
Contributor

The class parameter line should be indented 4 spaces.

val sizeLimitKey = s"$volumeType.$volumeName.$KUBERNETES_VOLUMES_OPTIONS_SIZE_LIMIT_KEY"
for {
  medium <- options.getTry(mediumKey)
  sizeLimit <- options.getTry(sizeLimitKey)
Contributor

Both medium and sizeLimit are optional for EmptyDirVolumeSource, so users are not required to set them. What happens with options.getTry if users do not set the key?

Contributor Author

Fixed.
See https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.10/#emptydirvolumesource-v1-core. I don't quite like that you have to pass "" and null to k8s, but it works.
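
For context, a rough sketch of how the emptyDir source can be built with fabric8 under that contract (illustrative only; the helper and parameter names are made up):

import io.fabric8.kubernetes.api.model.{EmptyDirVolumeSource, EmptyDirVolumeSourceBuilder, Quantity}

// An empty-string medium means "use the node's default storage medium";
// a null sizeLimit means "no limit".
def buildEmptyDir(medium: Option[String], sizeLimit: Option[String]): EmptyDirVolumeSource =
  new EmptyDirVolumeSourceBuilder()
    .withMedium(medium.getOrElse(""))
    .withSizeLimit(sizeLimit.map(new Quantity(_)).orNull)
    .build()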


for {
  path <- properties.getTry(pathKey)
  readOnly <- properties.getTry(readOnlyKey)
Contributor

readOnly is optional and defaults to false, so users are not required to set it explicitly. Are these semantics captured by properties.getTry?

Contributor Author

Made it optional with a default value of false. We basically ended up with two ways to parse Spark options now: a custom one for volumes, since there are so many combinations, and the one the rest of Spark uses. It should be possible to refactor this somehow, maybe by improving the Spark options parser itself.
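
A minimal sketch of the resulting semantics (hypothetical helper; assumes the properties are exposed as a plain Map):

// path is required; readOnly is optional and defaults to false when absent.
def parseMount(properties: Map[String, String],
    pathKey: String, readOnlyKey: String): Option[(String, Boolean)] =
  properties.get(pathKey).map { path =>
    (path, properties.get(readOnlyKey).exists(_.toBoolean))
  }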

@SparkQA

SparkQA commented Jun 11, 2018

Test build #91663 has finished for PR 21260 at commit d960e34.

  • This patch fails Spark unit tests.
  • This patch does not merge cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jun 13, 2018

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-spark-integration/3836/

@SparkQA

SparkQA commented Jun 13, 2018

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-spark-integration/3836/

@SparkQA

SparkQA commented Jun 13, 2018

Test build #91775 has finished for PR 21260 at commit f714b8e.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jun 15, 2018

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-spark-integration/3928/

@SparkQA

SparkQA commented Jun 15, 2018

Test build #91904 has finished for PR 21260 at commit 7433244.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jun 15, 2018

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-spark-integration/3928/

@andrusha
Contributor Author

retest please

@felixcheung
Member

Jenkins, retest this please

@SparkQA

SparkQA commented Jun 26, 2018

Test build #92346 has finished for PR 21260 at commit 7433244.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jul 3, 2018

Test build #92560 has finished for PR 21260 at commit 45eb477.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Contributor

@liyinan926 left a comment

LGTM.

@liyinan926
Contributor

@felixcheung @mccheah Can you take a look and merge this?

@skonto
Contributor

skonto commented Jul 6, 2018

@felixcheung gentle ping, this is pretty useful.

@felixcheung
Member

felixcheung commented Jul 6, 2018 via email

@baluchicken

Nice work. One thing I would consider is including a StorageClass name option for the persistentVolumeClaim volume type, defaulting to an empty string. Without it, the PVC will always use the default StorageClass, which may not exist in all scenarios, and the pod will then remain in a pending state indefinitely.

@felixcheung
Member

felixcheung commented Jul 8, 2018

@skonto Is it better to generalize the approach to match the one in https://issues.apache.org/jira/browse/SPARK-24434?

Not sure if @mccheah @foxish @erikerlandson have any last thoughts.

@liyinan926
Contributor

@felixcheung This feature was discussed and this PR was started before https://issues.apache.org/jira/browse/SPARK-24434 was even brought up. Being able to mount commonly used types of volumes seems super useful for some users, so it might make sense to accept it while https://issues.apache.org/jira/browse/SPARK-24434 is still going through design review.

@felixcheung
Member

OK, that makes sense. Could we get some clear guidance in SPARK-24434 on how to decide what should be a conf and what should go in an external template file?

Member

@felixcheung left a comment

LGTM, would be great to add a complete example in running-on-kubernetes.md as a follow-up.
(Will wait 24 hours, then merge.)

@liyinan926
Contributor

OK, that makes sense. Could we get some clear guidance in SPARK-24434 on how to decide what should be a conf and what should go in an external template file?

We talked about no longer accepting any new config options for customizing the driver/executor pods. Moving forward, all new customization needs will be fulfilled by the solution to SPARK-24434.

asfgit closed this in 5ff1b9b on Jul 11, 2018
@felixcheung
Member

merged to master

@skonto
Contributor

skonto commented Jul 11, 2018

@felixcheung Volumes will be supported by the pod template, so you will be able to do this without the conf options defined here. If both Spark conf and template properties exist, we have defined a precedence order in the design doc.
