Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add persistent volume configuration to jupyter #78

Open
wants to merge 4 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
35 changes: 35 additions & 0 deletions dask/templates/dask-jupyter-deployment.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -28,6 +28,15 @@ spec:
spec:
imagePullSecrets:
{{- toYaml .Values.jupyter.image.pullSecrets | nindent 8 }}
{{- if .Values.jupyter.persistent.enable }}
initContainers:
- name: {{ .Release.Name }}-jupyter-volume-permissions
image: busybox:1.31.1
command: ["sh", "-c", "chmod -Rv 777 /data/ && chown 0:100 -R /data/"]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Setting permissions to 777 is generally a bad idea. You may find 755 is more appropriate.

However could you provide a little explaination why this is necessary at all?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The storage class that I am using is longhorn. The volumes they give you are owned by root by default. That init container sets the permissions so that the jovyan user can read and write to the PV.

I'm not aware of any storage classes that let you explicitly set the default ownership. I will add an option so its easy to disable this init container if desired.

volumeMounts:
- name: jupyter-persistent
mountPath: /data
{{- end }}
containers:
- name: {{ template "dask.fullname" . }}-jupyter
image: "{{ .Values.jupyter.image.repository }}:{{ .Values.jupyter.image.tag }}"
Expand All @@ -43,6 +52,10 @@ spec:
volumeMounts:
- name: config-volume
mountPath: /usr/local/etc/jupyter
{{- if .Values.jupyter.persistent.enable }}
- name: jupyter-persistent
mountPath: {{ .Values.jupyter.persistent.path }}
{{- end }}
env:
- name: DASK_SCHEDULER_ADDRESS
value: {{ template "dask.fullname" . }}-scheduler:{{ .Values.scheduler.servicePort }}
Expand All @@ -53,6 +66,11 @@ spec:
- name: config-volume
configMap:
name: {{ template "dask.fullname" . }}-jupyter-config
{{- if .Values.jupyter.persistent.enable }}
- name: jupyter-persistent
persistentVolumeClaim:
claimName: {{ .Release.Name }}-jupyter-persistent
{{- end }}
{{- with .Values.jupyter.nodeSelector }}
nodeSelector:
{{- toYaml . | nindent 8 }}
Expand All @@ -72,4 +90,21 @@ spec:
{{- if .Values.jupyter.serviceAccountName }}
serviceAccountName: {{ .Values.jupyter.serviceAccountName | quote }}
{{- end }}
{{- if .Values.jupyter.persistent.enable }}
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: {{ .Release.Name }}-jupyter-persistent
spec:
{{- if (not (eq .Values.jupyter.persistent.storageClass "default")) }}
storageClassName: {{ .Values.jupyter.persistent.storageClass }}
{{- end }}
accessModes:
- ReadWriteOnce
resources:
requests:
storage: {{ .Values.jupyter.persistent.size }}
{{- end }}

{{- end }}
5 changes: 5 additions & 0 deletions dask/values.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -97,6 +97,11 @@ jupyter:
servicePort: 80
# This hash corresponds to the password 'dask'
password: 'sha1:aae8550c0a44:9507d45e087d5ee481a5ce9f4f16f37a0867318c'
persistent:
enable: false
storageClass: "default"
size: 1Gi
path: /home/jovyan/persistent
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would recommend making this just /home/jovyan

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was trying to explicitly not overwrite the examples directory with the pv. I see your point though, it would be much clearer to the user if the entire home directory is whats saved.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The examples are created at runtime. So this should've be a problem.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I make the mount path /home/jovyan the examples do not show up.

Checking the Dockerfile from the dask-docker repo shows they are added at build time.

https://github.com/dask/dask-docker/blob/master/notebook/Dockerfile#L47

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh interesting. I must be thinking of the Pangeo helm chart.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering how you want to handle this. I can submit a PR to the docker-helm repo to copy the examples to another directory (say /usr/share/doc/dask-examples). That way they would be independent of the PV, and always available.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @jacobtomlinson how would you wan tot handle this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can submit a PR to the docker-helm repo to copy the examples to another directory (say /usr/share/doc/dask-examples).

The trouble with this is discoverability. Really they need to show up in the sidebar.

Perhaps adding them to that location and then adding a line to the prepare.sh script to create a symbolic link would work?

env:
# - name: EXTRA_CONDA_PACKAGES
# value: "numba xarray -c conda-forge"
Expand Down