I got some errors while running the Spark example Pi job on Kubernetes. First, I created PVCs for the executors (spark-exec-pvc and spark-exec-localdir-pvc) and then submitted the Spark Pi job; a rough sketch of the submit command is shown below.
My Spark Pi job works fine if the number of executors is set to 1 with --num-executors 1, but with two or more executors I get the following errors.
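The sketch below is only a reconstruction from the executor pod spec further down, not a copy of the exact command; the master URL, service account, and jar path are placeholders, while the image, resources, and volume settings mirror what kubectl describe reports for the executor:

# Sketch only: <api-server>, the service account, and the jar path are assumed placeholders.
bin/spark-submit \
  --master k8s://https://<api-server>:6443 \
  --deploy-mode cluster \
  --name spark-pi \
  --class org.apache.spark.examples.SparkPi \
  --num-executors 2 \
  --conf spark.kubernetes.namespace=spark \
  --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
  --conf spark.kubernetes.container.image=mykidong/spark:v3.0.0 \
  --conf spark.executor.cores=2 \
  --conf spark.executor.memory=4g \
  --conf spark.kubernetes.executor.volumes.persistentVolumeClaim.checkpointpvc.options.claimName=spark-exec-pvc \
  --conf spark.kubernetes.executor.volumes.persistentVolumeClaim.checkpointpvc.mount.path=/checkpoint \
  --conf spark.kubernetes.executor.volumes.persistentVolumeClaim.checkpointpvc.mount.subPath=checkpoint \
  --conf spark.kubernetes.executor.volumes.persistentVolumeClaim.spark-local-dir-localdirpvc.options.claimName=spark-exec-localdir-pvc \
  --conf spark.kubernetes.executor.volumes.persistentVolumeClaim.spark-local-dir-localdirpvc.mount.path=/localdir \
  local:///opt/spark/examples/jars/spark-examples_2.12-3.0.0.jar 10000

Here are the pod list and the describe output for the executor pod that gets stuck in ContainerCreating: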
[pcp@master-0 ~]$ kubectl get po -n spark
NAME                               READY   STATUS              RESTARTS   AGE
spark-pi-4d9d707399602358-driver   1/1     Running             0          21s
spark-pi-edbda97399605e6a-exec-1   1/1     Running             0          6s
spark-pi-edbda97399605e6a-exec-2   0/1     ContainerCreating   0          6s
[pcp@master-0 ~]$ kubectl describe po spark-pi-edbda97399605e6a-exec-2 -n spark
Name:           spark-pi-edbda97399605e6a-exec-2
Namespace:      spark
Priority:       0
Node:           minion-0/10.240.0.5
Start Time:     Wed, 29 Jul 2020 07:01:39 +0000
Labels:         spark-app-selector=spark-18df56262b474f8989411d72b84ba81a
                spark-exec-id=2
                spark-role=executor
Annotations:    <none>
Status:         Pending
IP:
IPs:            <none>
Controlled By:  Pod/spark-pi-4d9d707399602358-driver
Containers:
  spark-kubernetes-executor:
    Container ID:
    Image:          mykidong/spark:v3.0.0
    Image ID:
    Port:           7079/TCP
    Host Port:      0/TCP
    Args:
      executor
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Limits:
      memory:  4505Mi
    Requests:
      cpu:     2
      memory:  4505Mi
    Environment:
      SPARK_USER:             pcp
      SPARK_DRIVER_URL:       spark://CoarseGrainedScheduler@spark-pi-4d9d707399602358-driver-svc.spark.svc:7078
      SPARK_EXECUTOR_CORES:   2
      SPARK_EXECUTOR_MEMORY:  4g
      SPARK_APPLICATION_ID:   spark-18df56262b474f8989411d72b84ba81a
      SPARK_CONF_DIR:         /opt/spark/conf
      SPARK_EXECUTOR_ID:      2
      SPARK_EXECUTOR_POD_IP:   (v1:status.podIP)
      SPARK_JAVA_OPT_0:       -Dspark.driver.blockManager.port=7079
      SPARK_JAVA_OPT_1:       -Dspark.driver.port=7078
      SPARK_LOCAL_DIRS:       /localdir
    Mounts:
      /checkpoint from checkpointpvc (rw,path="checkpoint")
      /localdir from spark-local-dir-localdirpvc (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from spark-token-5m2hb (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  spark-local-dir-localdirpvc:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  spark-exec-localdir-pvc
    ReadOnly:   false
  checkpointpvc:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  spark-exec-pvc
    ReadOnly:   false
  spark-token-5m2hb:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  spark-token-5m2hb
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason       Age               From               Message
  ----     ------       ----              ----               -------
  Normal   Scheduled    <unknown>         default-scheduler  Successfully assigned spark/spark-pi-edbda97399605e6a-exec-2 to minion-0
  Warning  FailedMount  2s (x6 over 18s)  kubelet, minion-0  MountVolume.SetUp failed for volume "pvc-7568f3e6-931e-4249-a9f2-1803c45a1a1b" : kubernetes.io/csi: mounter.SetupAt failed: rpc error: code = Internal desc = mount failed: exit status 255
Mounting command: mount
Mounting arguments: -t ext4 -o bind /data/minio/data4/46a6c696-d169-11ea-8dc4-0a5f797c5014 /var/lib/kubelet/pods/4dabe5be-17d8-4d52-862e-572e2d7ac9c6/volumes/kubernetes.io~csi/pvc-7568f3e6-931e-4249-a9f2-1803c45a1a1b/mount
Output: mount: mounting /data/minio/data4/46a6c696-d169-11ea-8dc4-0a5f797c5014 on /var/lib/kubelet/pods/4dabe5be-17d8-4d52-862e-572e2d7ac9c6/volumes/kubernetes.io~csi/pvc-7568f3e6-931e-4249-a9f2-1803c45a1a1b/mount failed: No such file or directory
  Warning  FailedMount  2s (x6 over 18s)  kubelet, minion-0  MountVolume.SetUp failed for volume "pvc-0d62e28b-8473-4b8a-aa34-bc13715dd53b" : kubernetes.io/csi: mounter.SetupAt failed: rpc error: code = Internal desc = mount failed: exit status 255
Mounting command: mount
Mounting arguments: -t ext4 -o bind /data/minio/data1/46429e3c-d169-11ea-8dc4-0a5f797c5014 /var/lib/kubelet/pods/4dabe5be-17d8-4d52-862e-572e2d7ac9c6/volumes/kubernetes.io~csi/pvc-0d62e28b-8473-4b8a-aa34-bc13715dd53b/mount
Output: mount: mounting /data/minio/data1/46429e3c-d169-11ea-8dc4-0a5f797c5014 on /var/lib/kubelet/pods/4dabe5be-17d8-4d52-862e-572e2d7ac9c6/volumes/kubernetes.io~csi/pvc-0d62e28b-8473-4b8a-aa34-bc13715dd53b/mount failed: No such file or directory
I think multiple pods, such as Spark executors, should be able to access the same volume when it has the ReadWriteMany access mode.
Any ideas about that?
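For reference, the access modes the claims were actually bound with, and the backing PVs, can be checked with plain kubectl (namespace and claim names as above); the ACCESS MODES column shows whether a claim is RWO or RWX:

kubectl get pvc -n spark
kubectl get pv
kubectl describe pvc spark-exec-pvc -n spark
kubectl describe pvc spark-exec-localdir-pvc -n spark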
@mykidong No, that is not the intention of this project. ReadWriteMany is suitable for NFS volumes or networked block devices, not for hostPath volumes and JBODs, which are what this project is going to support.
We may revisit this in the future, but for now we are not planning to support it.
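To illustrate the distinction: a claim that genuinely needs to be mounted read-write by several pods at once is normally served by a shared-filesystem provisioner rather than node-local drives. A minimal sketch (the storage class name below is a hypothetical NFS-backed class, not something direct-csi provides):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-rwx-pvc
  namespace: spark
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: nfs-client   # hypothetical NFS-backed StorageClass, not direct-csi
  resources:
    requests:
      storage: 10Gi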
I was wondering whether direct-csi supports the ReadWriteMany access mode for PVCs.