How to configure a local volume for Jupyterhub_spawner.py #933

Closed · ToddMorrill opened this issue Jun 5, 2018 · 6 comments

@ToddMorrill
I'm in the last stretch of configuring a minikube cluster on bare metal. I've got the kubeflow app running and can spawn a jupyter notebook running tf 1.7 (gpu version). I'm trying to configure a local volume to be mounted into the spawned jupyter notebook pod. My goal is to be able to interact with the local storage folder on the desktop (e.g. move datasets around, persist notebooks, etc.) and also be able to access this data from within the pods across pod restarts. Ideally, I could just use most of the 2TB SSD drive on the computer for all users to share, though I'm open to each of ~3 users having their own storage volume. Any help is greatly appreciated.

Here's what I've done so far:

Create a persistent volume so we can save our work

nano persistent_volume.yaml
# yaml file contents
apiVersion: v1
kind: PersistentVolume
metadata:
  name: example-local-pv
spec:
  capacity:
    storage: 5Gi
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  local:
    path: /home/ai2-ironman/Documents/code/kubeflow-mount
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - my-node
# apply this file
kubectl apply -f persistent_volume.yaml
# Inspect this persistent volume
$ kubectl get pv
NAME               CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM     STORAGECLASS    REASON    AGE
example-local-pv   5Gi        RWO            Retain           Available             local-storage             1d

$ kubectl describe pv example-local-pv
Name:              example-local-pv
Labels:            <none>
Annotations:       kubectl.kubernetes.io/last-applied-configuration={"apiVersion":"v1","kind":"PersistentVolume","metadata":{"annotations":{},"name":"example-local-pv","namespace":""},"spec":{"accessModes":["ReadWriteOn...
Finalizers:        [kubernetes.io/pv-protection]
StorageClass:      local-storage
Status:            Available
Claim:             
Reclaim Policy:    Retain
Access Modes:      RWO
Capacity:          5Gi
Node Affinity:     
  Required Terms:  
    Term 0:        kubernetes.io/hostname in [my-node]
Message:           
Source:
    Type:  LocalVolume (a persistent volume backed by local storage on a node)
    Path:  /home/ai2-ironman/Documents/code/kubeflow-mount
Events:    <none>

Create a storage class for this local storage

nano local_storage_class.yaml
# yaml file contents
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: local-storage
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
# Apply this new local storage class
kubectl apply -f local_storage_class.yaml

# check if the storageclass was registered (NB: I detail how I changed the default storage class below)
$ kubectl get storageclass
NAME                      PROVISIONER                    AGE
local-storage (default)   kubernetes.io/no-provisioner   1d
standard                  k8s.io/minikube-hostpath       4d

Change the default storage class

# the following command did not work for me so I edited the storageclass.yaml file directly
kubectl patch storageclass standard -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"false"}}}'
# make changes directly to the storageclass.yaml file
sudo nano /etc/kubernetes/addons/storageclass.yaml
# change storageclass.beta.kubernetes.io/is-default-class: "true" --> "false"
# change newly created storage class to default
kubectl patch storageclass local-storage -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
# verify changes
$ kubectl describe storageclass local-storage
Name:            local-storage
IsDefaultClass:  Yes
Annotations:     kubectl.kubernetes.io/last-applied-configuration={"apiVersion":"storage.k8s.io/v1","kind":"StorageClass","metadata":{"annotations":{},"name":"local-storage","namespace":""},"provisioner":"kubernetes.io/no-provisioner","volumeBindingMode":"WaitForFirstConsumer"}
,storageclass.kubernetes.io/is-default-class=true
Provisioner:           kubernetes.io/no-provisioner
Parameters:            <none>
AllowVolumeExpansion:  <unset>
MountOptions:          <none>
ReclaimPolicy:         Delete
VolumeBindingMode:     WaitForFirstConsumer
Events:                <none>

Edit the jupyterhub_spawner.py file to create a PVC for this newly created PV

  • currently, I have edited the script to ignore the persistent volume claim section (following #336, "Jupyter pod stuck at ContainerCreating when spawning")
  • here is the key question:
    What changes do I have to make to jupyterhub_spawner.py (or otherwise) to mount my local persistent volume to the jupyter pod when I start the server?
cd /home/ai2-ironman/my-kubeflow/vendor/kubeflow/core
cat jupyterhub_spawner.py
# pasting in the relevant portion of the file - everything else remains the same as OOTB installation
###################################################
### Persistent volume options
###################################################
# Using persistent storage requires a default storage class.
# TODO(jlewi): Verify this works on minikube.
# TODO(jlewi): Should we set c.KubeSpawner.singleuser_fs_gid = 1000
# see https://github.com/kubeflow/kubeflow/pull/22#issuecomment-350500944
#pvc_mount = os.environ.get('NOTEBOOK_PVC_MOUNT')
pvc_mount = False
if pvc_mount: # pvc_mount and pvc_mount != 'null':
    c.KubeSpawner.user_storage_pvc_ensure = True
    # How much disk space do we want?
    c.KubeSpawner.user_storage_capacity = '10Gi'
    c.KubeSpawner.pvc_name_template = 'claim-{username}{servername}'
    c.KubeSpawner.volumes = [
      {
        'name': 'volume-{username}{servername}',
        'persistentVolumeClaim': {
          'claimName': 'claim-{username}{servername}'
        }
      }
    ]
    c.KubeSpawner.volume_mounts = [
      {
        'mountPath': pvc_mount,
        'name': 'volume-{username}{servername}'
      }
    ]

Start my jupyter notebook

# the pod is using the following volume and it's unclear how to change it
Volumes:
  no-api-access-please:
    Type:        EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:      
$ kubectl describe pods jupyter-ai2-2dironman -n=kubeflow
Name:         jupyter-ai2-2dironman
Namespace:    kubeflow
Node:         minikube/192.168.1.123
Start Time:   Tue, 05 Jun 2018 18:33:02 -0400
Labels:       app=jupyterhub
              component=singleuser-server
              heritage=jupyterhub
              hub.jupyter.org/username=ai2_2Dironman
Annotations:  <none>
Status:       Running
IP:           ...
Containers:
  notebook:
    Container ID:  docker://4d3d51228f2be5e0ddb2bb7a630bc4ebd4fa8e7240496ad20f5b789e578e1810
    Image:         gcr.io/kubeflow-images-public/tensorflow-1.7.0-notebook-gpu:v20180419-0ad94c4e
    Image ID:      docker-pullable://gcr.io/kubeflow-images-public/tensorflow-1.7.0-notebook-gpu@sha256:33b13e3de4a53854d8c52f172d58ef554e96a46e7e8d65cb27eaa33c3ca4a002
    Port:          8888/TCP
    Host Port:     0/TCP
    Args:
      start-singleuser.sh
      --ip="0.0.0.0"
      --port=8888
      --allow-root
    State:          Running
      Started:      Tue, 05 Jun 2018 18:33:04 -0400
    Ready:          True
    Restart Count:  0
    Limits:
      nvidia.com/gpu:  1
    Requests:
      cpu:             4
      memory:          8Gi
      nvidia.com/gpu:  1
    Environment:
      JUPYTERHUB_API_TOKEN:           ...
      JPY_API_TOKEN:                  ...
      JUPYTERHUB_CLIENT_ID:           user-ai2-ironman
      JUPYTERHUB_HOST:                
      JUPYTERHUB_OAUTH_CALLBACK_URL:  /user/ai2-ironman/oauth_callback
      JUPYTERHUB_USER:                ai2-ironman
      JUPYTERHUB_API_URL:             http://tf-hub-0:8081/hub/api
      JUPYTERHUB_BASE_URL:            /
      JUPYTERHUB_SERVICE_PREFIX:      /user/ai2-ironman/
      MEM_GUARANTEE:                  8Gi
      CPU_GUARANTEE:                  4
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from no-api-access-please (ro)
Conditions:
  Type           Status
  Initialized    True 
  Ready          True 
  PodScheduled   True 
Volumes:
  no-api-access-please:
    Type:        EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:      
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:          <none>

Thanks again for any help here.

@jlewi (Contributor) commented Jun 6, 2018

JupyterHub will use the default storage class to create a PV for every user's home directory. You probably don't want to give each user the same home directory.

In addition to the volume mounted at /home/jovyan, you could potentially mount another PV that maps to a shared location on the host system.

To mount extra volumes, you probably want to modify volumes and volume mounts here:

c.KubeSpawner.volumes = volumes

You'll want to set them to the appropriate values for the volumes and volumeMounts fields of a PodTemplateSpec. You can use those to mount whatever PVs you created.
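
For example, here's a minimal sketch of what that could look like for an extra shared volume backed by a PVC (the claim name, volume name, and mount path are illustrative assumptions, not from this setup):

# Assumes a PVC named 'shared-data' already exists in the namespace
# where the notebook pods are spawned.
c.KubeSpawner.volumes = [
    {
        'name': 'shared-data',
        'persistentVolumeClaim': {
            'claimName': 'shared-data'
        }
    }
]
c.KubeSpawner.volume_mounts = [
    {
        'name': 'shared-data',
        'mountPath': '/home/jovyan/shared'
    }
]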

/cc pdmack

@ToddMorrill (Author)

I really appreciate the response.

My setup may be a little unique in that I'm running minikube on a bare-metal machine (i.e. vm-driver=none), so a hostPath inside minikube maps directly to a path on the host itself. That means I should be able to take advantage of the hostPath volume type, which doesn't require a persistent volume definition or a persistent volume claim, nor does it seem to need a storage class. I got another pod running with this setup. It may not be ideal in a multi-user environment, but it's sufficient for now.

Here's the toy example I got running using a hostPath volume:

$ cat example_pod_w_volume.yaml 
apiVersion: v1
kind: Pod
metadata:
  name: tf-with-volume
  # Note that the Pod does not need to be in the same namespace as the loader.
  labels:
    app: tf-volume-example
spec:
  containers:
  - name: tf-test
    image: gcr.io/tensorflow/tensorflow@sha256:8dd50435b6ba906430669cf914e30284e2622c1377f9359637b38a49ab053497
    ports:
    - containerPort: 8888
    volumeMounts:
    - mountPath: "/notebooks"
      name: task-pv-storage
  volumes:
    - name: task-pv-storage
      hostPath:
        # directory location on host
        path: /home/ai2-ironman/Documents/code/kubeflow-mount
        # this field is optional
        type: Directory

Following this toy example, I tried a similar approach for kubeflow and made some changes to jupyterhub_spawner.py & jupyterhub.libsonnet, but this didn't do the trick. NB: I'm on tag v0.1.2 per the user_guide.md. The newly spawned jupyter pods continue to show up with only the following volume mounted:

Volumes:
  no-api-access-please:
    Type:        EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:      

On a side note, I restored the default storage class to the standard OOTB one and deleted the local-storage class I created above.

Is there an issue with the way that I'm filling out/editing the volume specs, or are there other changes required to use the hostPath volume?

@jlewi (Contributor) commented Jun 7, 2018

What is the pod spec of the notebook pod that gets spawned? In particular, do volumes and volumeMounts have the expected paths?

@ToddMorrill (Author)

I've pasted the pod yaml below; as you can see, it doesn't reflect the updates to the jupyterhub_spawner.py or jupyterhub.libsonnet files.

$ kubectl get pod jupyter-ai2-2dironman -n=kubeflow -o yaml
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: 2018-06-06T18:35:50Z
  labels:
    app: jupyterhub
    component: singleuser-server
    heritage: jupyterhub
    hub.jupyter.org/username: ai2_2Dironman
  name: jupyter-ai2-2dironman
  namespace: kubeflow
  resourceVersion: "283211"
  selfLink: /api/v1/namespaces/kubeflow/pods/jupyter-ai2-2dironman
  uid: 6fd477a6-69b8-11e8-a33b-309c239a542a
spec:
  containers:
  - args:
    - start-singleuser.sh
    - --ip="0.0.0.0"
    - --port=8888
    - --allow-root
    env:
    - name: JUPYTERHUB_API_TOKEN
      value: ...
    - name: JPY_API_TOKEN
      value: ...
    - name: JUPYTERHUB_CLIENT_ID
      value: user-ai2-ironman
    - name: JUPYTERHUB_HOST
    - name: JUPYTERHUB_OAUTH_CALLBACK_URL
      value: /user/ai2-ironman/oauth_callback
    - name: JUPYTERHUB_USER
      value: ai2-ironman
    - name: JUPYTERHUB_API_URL
      value: http://tf-hub-0:8081/hub/api
    - name: JUPYTERHUB_BASE_URL
      value: /
    - name: JUPYTERHUB_SERVICE_PREFIX
      value: /user/ai2-ironman/
    - name: MEM_GUARANTEE
      value: 8Gi
    - name: CPU_GUARANTEE
      value: "4"
    image: gcr.io/kubeflow-images-public/tensorflow-1.7.0-notebook-gpu:v20180419-0ad94c4e
    imagePullPolicy: IfNotPresent
    lifecycle: {}
    name: notebook
    ports:
    - containerPort: 8888
      name: notebook-port
      protocol: TCP
    resources:
      limits:
        nvidia.com/gpu: "1"
      requests:
        cpu: "4"
        memory: 8Gi
        nvidia.com/gpu: "1"
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: no-api-access-please
      readOnly: true
  dnsPolicy: ClusterFirst
  nodeName: minikube
  restartPolicy: Always
  schedulerName: default-scheduler
  securityContext:
    fsGroup: 0
    runAsUser: 0
  serviceAccount: default
  serviceAccountName: default
  terminationGracePeriodSeconds: 30
  tolerations:
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300
  volumes:
  - emptyDir: {}
    name: no-api-access-please
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: 2018-06-06T18:35:50Z
    status: "True"
    type: Initialized
  - lastProbeTime: null
    lastTransitionTime: 2018-06-06T18:35:52Z
    status: "True"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: 2018-06-06T18:35:50Z
    status: "True"
    type: PodScheduled
  containerStatuses:
  - containerID: docker://2511cf16e0263147001f7fd4afdd25e421362d613232d6998ee37119466f9f79
    image: gcr.io/kubeflow-images-public/tensorflow-1.7.0-notebook-gpu:v20180419-0ad94c4e
    imageID: docker-pullable://gcr.io/kubeflow-images-public/tensorflow-1.7.0-notebook-gpu@sha256:33b13e3de4a53854d8c52f172d58ef554e96a46e7e8d65cb27eaa33c3ca4a002
    lastState: {}
    name: notebook
    ready: true
    restartCount: 0
    state:
      running:
        startedAt: 2018-06-06T18:35:52Z
  hostIP: 192.168.1.123
  phase: Running
  podIP: ...
  qosClass: Burstable
  startTime: 2018-06-06T18:35:50Z

@jlewi (Contributor) commented Jun 8, 2018

It looks like the changes to jupyterhub_spawner.py
https://github.com/ToddMorrill/kubeflow/blob/d18f9ed480e1dbe1df72d3541b61f1dffdf3c50c/kubeflow/core/jupyterhub_spawner.py#L103-L124

are only applied if the pvc_mount parameter is set; that gets mapped from a ksonnet parameter.
https://github.com/kubeflow/kubeflow/blob/master/kubeflow/core/prototypes/all.jsonnet#L18

But for the host path there's no reason to make it conditional in your case, so you could just remove the if statement so the volumes are always set.
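
For example, here's a minimal sketch of the unconditional version (the host path is the one from this issue; the volume name and mountPath are illustrative assumptions):

# Always attach the host directory to spawned notebook pods (no if statement).
c.KubeSpawner.volumes = [
    {
        'name': 'local-data',
        'hostPath': {
            'path': '/home/ai2-ironman/Documents/code/kubeflow-mount',
            'type': 'Directory'
        }
    }
]
c.KubeSpawner.volume_mounts = [
    {
        'name': 'local-data',
        'mountPath': '/home/jovyan/work'
    }
]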

@ToddMorrill (Author)

All's well that ends well.

I think kubeflow v0.1.2 wasn't respecting the changes I was making to jupyterhub_spawner.py. For example, c.KubeSpawner.volumes = volumes didn't even exist in that script; moreover, it looks like the script that actually gets called is kubeform_spawner.py, which is only available on the current master branch of kubeflow. Once I switched over to master and applied the changes I described above (the "gotcha" being that they needed to be made to kubeform_spawner.py & jupyterhub.libsonnet), I was able to get the hostPath mounted into the jupyter pod.
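
As a quick sanity check from inside a freshly spawned notebook (the path below is just an example; use whatever mountPath you configured):

import os

mount = '/home/jovyan/work'   # adjust to your configured mountPath
print(os.path.isdir(mount))   # True once the hostPath volume is attached
print(os.listdir(mount))      # should list the files from the host directory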

Really excited to get to work with this. Thanks for the help.
