
Does Direct CSI support PVC Access Mode "ReadWriteMany"? #13

Closed
mykidong opened this issue Jul 29, 2020 · 2 comments

@mykidong

I am wondering whether direct-csi supports the ReadWriteMany access mode for PVCs.

I got some errors while running the Spark example Pi job on Kubernetes.
First, I created the PVCs like this:

cat <<EOF > spark-pvc.yml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: spark-driver-pvc
  namespace: spark
  labels: {}
  annotations: {}
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 2Gi
  storageClassName: direct.csi.min.io
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: spark-exec-pvc
  namespace: spark
  labels: {}
  annotations: {}
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 50Gi
  storageClassName: direct.csi.min.io
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: spark-driver-localdir-pvc
  namespace: spark
  labels: {}
  annotations: {}
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 2Gi
  storageClassName: direct.csi.min.io
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: spark-exec-localdir-pvc
  namespace: spark
  labels: {}
  annotations: {}
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 50Gi
  storageClassName: direct.csi.min.io
EOF

kubectl apply -f spark-pvc.yml;
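
Depending on the storage class's volume binding mode, the claims show up as Pending or Bound; this can be checked with, for example:

kubectl get pvc -n spark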

and ran the Spark Pi job:

spark-submit \
--master k8s://https://10.233.0.1:443 \
--deploy-mode cluster \
--name spark-pi \
--class org.apache.spark.examples.SparkPi \
--driver-memory 1g \
--executor-memory 4g \
--executor-cores 2 \
--num-executors 2 \
--conf spark.kubernetes.driver.volumes.persistentVolumeClaim.checkpointpvc.mount.path=/checkpoint \
--conf spark.kubernetes.driver.volumes.persistentVolumeClaim.checkpointpvc.mount.subPath=checkpoint \
--conf spark.kubernetes.driver.volumes.persistentVolumeClaim.checkpointpvc.mount.readOnly=false \
--conf spark.kubernetes.driver.volumes.persistentVolumeClaim.checkpointpvc.options.claimName=spark-driver-pvc \
--conf spark.kubernetes.executor.volumes.persistentVolumeClaim.checkpointpvc.mount.path=/checkpoint \
--conf spark.kubernetes.executor.volumes.persistentVolumeClaim.checkpointpvc.mount.subPath=checkpoint \
--conf spark.kubernetes.executor.volumes.persistentVolumeClaim.checkpointpvc.mount.readOnly=false \
--conf spark.kubernetes.executor.volumes.persistentVolumeClaim.checkpointpvc.options.claimName=spark-exec-pvc \
--conf spark.kubernetes.driver.volumes.persistentVolumeClaim.spark-local-dir-localdirpvc.mount.path=/localdir \
--conf spark.kubernetes.driver.volumes.persistentVolumeClaim.spark-local-dir-localdirpvc.mount.readOnly=false \
--conf spark.kubernetes.driver.volumes.persistentVolumeClaim.spark-local-dir-localdirpvc.options.claimName=spark-driver-localdir-pvc \
--conf spark.kubernetes.executor.volumes.persistentVolumeClaim.spark-local-dir-localdirpvc.mount.path=/localdir \
--conf spark.kubernetes.executor.volumes.persistentVolumeClaim.spark-local-dir-localdirpvc.mount.readOnly=false \
--conf spark.kubernetes.executor.volumes.persistentVolumeClaim.spark-local-dir-localdirpvc.options.claimName=spark-exec-localdir-pvc \
--conf spark.kubernetes.container.image=mykidong/spark:v3.0.0 \
--conf spark.kubernetes.driver.container.image=mykidong/spark:v3.0.0 \
--conf spark.kubernetes.executor.container.image=mykidong/spark:v3.0.0 \
--conf spark.kubernetes.namespace=spark \
--conf spark.kubernetes.container.image.pullPolicy=IfNotPresent \
--conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
local:///opt/spark/examples/jars/spark-examples_2.12-3.0.0.jar 100;

My Spark Pi job works fine if the number of executors is set to 1 with --num-executors 1.
But with 2 or more executors, I get the following errors in the pod description:

[pcp@master-0 ~]$ kubectl get po -n spark
NAME                               READY   STATUS              RESTARTS   AGE
spark-pi-4d9d707399602358-driver   1/1     Running             0          21s
spark-pi-edbda97399605e6a-exec-1   1/1     Running             0          6s
spark-pi-edbda97399605e6a-exec-2   0/1     ContainerCreating   0          6s
[pcp@master-0 ~]$ kubectl describe po spark-pi-edbda97399605e6a-exec-2 -n spark
Name:           spark-pi-edbda97399605e6a-exec-2
Namespace:      spark
Priority:       0
Node:           minion-0/10.240.0.5
Start Time:     Wed, 29 Jul 2020 07:01:39 +0000
Labels:         spark-app-selector=spark-18df56262b474f8989411d72b84ba81a
                spark-exec-id=2
                spark-role=executor
Annotations:    <none>
Status:         Pending
IP:
IPs:            <none>
Controlled By:  Pod/spark-pi-4d9d707399602358-driver
Containers:
  spark-kubernetes-executor:
    Container ID:
    Image:         mykidong/spark:v3.0.0
    Image ID:
    Port:          7079/TCP
    Host Port:     0/TCP
    Args:
      executor
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Limits:
      memory:  4505Mi
    Requests:
      cpu:     2
      memory:  4505Mi
    Environment:
      SPARK_USER:             pcp
      SPARK_DRIVER_URL:       spark://CoarseGrainedScheduler@spark-pi-4d9d707399602358-driver-svc.spark.svc:7078
      SPARK_EXECUTOR_CORES:   2
      SPARK_EXECUTOR_MEMORY:  4g
      SPARK_APPLICATION_ID:   spark-18df56262b474f8989411d72b84ba81a
      SPARK_CONF_DIR:         /opt/spark/conf
      SPARK_EXECUTOR_ID:      2
      SPARK_EXECUTOR_POD_IP:   (v1:status.podIP)
      SPARK_JAVA_OPT_0:       -Dspark.driver.blockManager.port=7079
      SPARK_JAVA_OPT_1:       -Dspark.driver.port=7078
      SPARK_LOCAL_DIRS:       /localdir
    Mounts:
      /checkpoint from checkpointpvc (rw,path="checkpoint")
      /localdir from spark-local-dir-localdirpvc (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from spark-token-5m2hb (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  spark-local-dir-localdirpvc:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  spark-exec-localdir-pvc
    ReadOnly:   false
  checkpointpvc:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  spark-exec-pvc
    ReadOnly:   false
  spark-token-5m2hb:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  spark-token-5m2hb
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason       Age               From               Message
  ----     ------       ----              ----               -------
  Normal   Scheduled    <unknown>         default-scheduler  Successfully assigned spark/spark-pi-edbda97399605e6a-exec-2 to minion-0
  Warning  FailedMount  2s (x6 over 18s)  kubelet, minion-0  MountVolume.SetUp failed for volume "pvc-7568f3e6-931e-4249-a9f2-1803c45a1a1b" : kubernetes.io/csi: mounter.SetupAt failed: rpc error: code = Internal desc = mount failed: exit status 255
Mounting command: mount
Mounting arguments: -t ext4 -o bind /data/minio/data4/46a6c696-d169-11ea-8dc4-0a5f797c5014 /var/lib/kubelet/pods/4dabe5be-17d8-4d52-862e-572e2d7ac9c6/volumes/kubernetes.io~csi/pvc-7568f3e6-931e-4249-a9f2-1803c45a1a1b/mount
Output: mount: mounting /data/minio/data4/46a6c696-d169-11ea-8dc4-0a5f797c5014 on /var/lib/kubelet/pods/4dabe5be-17d8-4d52-862e-572e2d7ac9c6/volumes/kubernetes.io~csi/pvc-7568f3e6-931e-4249-a9f2-1803c45a1a1b/mount failed: No such file or directory
  Warning  FailedMount  2s (x6 over 18s)  kubelet, minion-0  MountVolume.SetUp failed for volume "pvc-0d62e28b-8473-4b8a-aa34-bc13715dd53b" : kubernetes.io/csi: mounter.SetupAt failed: rpc error: code = Internal desc = mount failed: exit status 255
Mounting command: mount
Mounting arguments: -t ext4 -o bind /data/minio/data1/46429e3c-d169-11ea-8dc4-0a5f797c5014 /var/lib/kubelet/pods/4dabe5be-17d8-4d52-862e-572e2d7ac9c6/volumes/kubernetes.io~csi/pvc-0d62e28b-8473-4b8a-aa34-bc13715dd53b/mount
Output: mount: mounting /data/minio/data1/46429e3c-d169-11ea-8dc4-0a5f797c5014 on /var/lib/kubelet/pods/4dabe5be-17d8-4d52-862e-572e2d7ac9c6/volumes/kubernetes.io~csi/pvc-0d62e28b-8473-4b8a-aa34-bc13715dd53b/mount failed: No such file or directory

I thought multiple pods, such as the Spark executors, could access volumes with the ReadWriteMany access mode.
Any idea what is going wrong?
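
For what it's worth, inspecting the PV named in the FailedMount event should show whether it is pinned to a single node (assuming the driver sets node affinity on its PVs), for example:

# PV name taken from the FailedMount event above
kubectl get pv pvc-7568f3e6-931e-4249-a9f2-1803c45a1a1b -o jsonpath='{.spec.nodeAffinity}'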

@harshavardhana
Member

@mykidong No, that is not the intention of this project. ReadWriteMany is suitable for NFS volumes or networked block devices, not for the hostPath and JBOD devices this project is going to support.

We may revisit this in the future, but for now we are not planning to support it.
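
As a sketch, a claim that fits this driver's node-local model would use ReadWriteOnce instead; everything else can stay as in your manifests (the claim name below is only an example):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  # example name only; use one RWO claim per consuming pod
  name: spark-exec-rwo-pvc
  namespace: spark
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 50Gi
  storageClassName: direct.csi.min.io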

@mykidong
Author

mykidong commented Aug 1, 2020

@harshavardhana thanks for your answer.
