Added a minikube setup script. #1387

abhi-g · 2018-08-19T22:08:44Z

Adding a shell script to make it easier to set up a local Minikube-based Kubeflow deployment.
Some features:

- Downloads and installs kubectl and minikube if necessary
- Configures the minikube VM dynamically based on host resources and user input
- Allows a local host directory to be mounted into the Jupyter notebooks, giving access to local datasets and persisting notebooks across minikube restarts and complete reconfigurations

/hold

jlewi · 2018-08-20T22:38:07Z

scripts/setup-minikube.sh

+
+# if user requested a local fs path to be mounted, make it accessible via
+# Jupyter Notebooks
+function mount_local_fs() {


To better fit this into the kfctl.sh pattern, pv.yaml should be dumped to the k8s_specs directory during the generate phase of the kfctl command. It should then be created during the kfctl apply phase.

jlewi · 2018-08-20T22:39:02Z

scripts/setup-minikube.sh

+  kubectl create -n kubeflow -f ./pv-claim.yaml
+
+  # update tf-hub stateful set env to use the claim
+  kubectl -n kubeflow set env statefulset tf-hub -e KF_PVC_LIST=local-notebooks


Can we do this in kfctl.sh during the generate phase?
Can we check whether the platform is minikube and, if so, use ksonnet parameters to set these values on tf-hub?

jlewi · 2018-08-20T23:48:05Z

Right now this script is wrapping kfctl.sh; can we instead have kfctl.sh call out to the functions in this script?

One-time setup functions can be called during the init phase, e.g.

- call install_kubectl_minikube
- infer_minikube_settings: this should be called during init or generate
  - The values should be persisted to a file, e.g. env.sh
- cleanup_and_deploy_minikube: this should be called during kfctl.sh apply
  - env.sh will be sourced, so any environment variables defined in that file will be loaded and available.
- mount_local_fs should be split into generate and apply
  - When kfctl.sh generate k8s is called, you should persist the YAML manifests for the PV to the k8s_specs directory
  - Then during kfctl.sh apply k8s you should use kubectl to create those manifests
    https://github.com/kubeflow/kubeflow/blob/master/scripts/kfctl.sh#L232
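
The generate/apply split suggested above could be sketched roughly as follows. This is only an illustration of the pattern in Python, not how kfctl.sh implements it: the function names `generate_specs` and `apply_commands` are hypothetical, the manifest text is supplied by the caller, and the default `kubeflow` namespace is an assumption taken from the `kubectl create -n kubeflow` line in the diff. The apply step only builds the `kubectl create` command lines; it does not run them.

```python
from pathlib import Path
from typing import Dict, List


def generate_specs(spec_dir: str, manifests: Dict[str, str]) -> List[Path]:
    """Generate phase: persist each manifest to spec_dir without touching the cluster."""
    out_dir = Path(spec_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    # Sort by file name so the output order is deterministic.
    for file_name, yaml_text in sorted(manifests.items()):
        target = out_dir / file_name
        target.write_text(yaml_text)
        written.append(target)
    return written


def apply_commands(spec_dir: str, namespace: str = "kubeflow") -> List[List[str]]:
    """Apply phase: build one `kubectl create` command per persisted manifest."""
    return [
        ["kubectl", "create", "-n", namespace, "-f", str(path)]
        for path in sorted(Path(spec_dir).glob("*.yaml"))
    ]
```

Keeping the two phases separate means the files in k8s_specs can be inspected or edited between generate and apply, which is the point of the kfctl.sh pattern described in this comment.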

jlewi · 2018-08-20T23:49:09Z

scripts/setup-minikube.sh

+  ../kfctl.sh apply all
+  popd
+
+  if is_kubeflow_ready


Per the other comments, if we call this from kfctl.sh, then most of the functionality up to this line should already be handled by kfctl.sh.

abhi-g · 2018-08-23T00:00:33Z

@jlewi I think it's ready for review now.

Updated based on your comments.

The kfctl init phase now reads the required parameters from the user, installs ks, kubectl, and minikube if needed, and starts minikube.
The kfctl generate phase creates the local volume claim specs in the ksonnet dir and sets up the params for jupyterhub.
The kfctl apply phase deploys everything, waits for the services to become ready, mounts the local fs, and sets up tunnels.

When a new notebook is spawned, it should automatically mount the local fs directory if one was specified; the directory shows up as local-notebooks alongside the work directory.

I haven't been able to test the parameterization of the jupyterhub ksonnet part. Since the script pulls all the jsonnet files etc. from GitHub, I will retest after this is merged and send any fixes if needed.

jlewi · 2018-08-23T03:24:04Z

You should be able to run kfctl.sh directly from your local git repo, and it should use whatever files you have in your local copy
https://github.com/kubeflow/kubeflow/blob/master/scripts/kfctl.sh#L28

jlewi · 2018-08-23T03:29:34Z

kubeflow/core/prototypes/jupyterhub.jsonnet

@@ -13,6 +13,8 @@
 // @optionalParam repoName string kubeflow-images-public The repository name for JupyterNotebook.
 // @optionalParam disks string null Comma separated list of Google persistent disks to attach to jupyter environments.
 // @optionalParam gcpSecretName string user-gcp-sa The name of the secret containing service account credentials for GCP
+// @optionalParam notebookUid string 1000 UserId of the host user for minikube local fs mount


Are we changing the default behavior? It doesn't look like KubeFormSpawner was setting the UID/GID before but now it is.

Yes, it is doing this now only in minikube case. This is to make sure that the locally mounted filesystem is read/writable via Jupyter Notebook

updating to nulls.

jlewi · 2018-08-23T03:31:12Z

kubeflow/core/kubeform_spawner.py

+if env_uid and env_uid != 'null':
+    c.KubeSpawner.singleuser_uid = int(env_uid)
+env_gid = os.environ.get('NOTEBOOK_GID')
+if env_gid and env_gid != 'null':


Won't this break by default? It looks like you are defaulting env_gid to 100. So you will now add a post pod hook that tries to mount local notebooks into the container and that command will presumably fail.

You are correct. Updating above default values of params to null

jlewi · 2018-08-23T03:34:29Z

kubeflow/core/kubeform_spawner.py

+env_gid = os.environ.get('NOTEBOOK_GID')
+if env_gid and env_gid != 'null':
+    c.KubeSpawner.singleuser_fs_gid = int(env_gid)
+    def modify_pod_hook(spawner, pod):


Should we guard the lifecycle with its own hook? Maybe define an environment variable for the directory containing the notebooks.

I worry it might be brittle to infer this based on gid.

Added a separate env var for this.

jlewi · 2018-08-23T03:37:02Z

kubeflow/core/kubeform_spawner.py

@@ -177,6 +177,26 @@ def _expand_user_properties(self, template):
 c.KubeSpawner.singleuser_working_dir = '/home/jovyan'
 volumes = []
 volume_mounts = []
+
+# Allow environment vars to override uid and gid.


@pdmack Would you mind taking a look at this code? I remember it taking us a while to get uid and gid to work correctly when using a PV backing the home directory.

By the way, I have been able to test that I can access the local directory when specified via the Jupyter Notebook and create/modify notebooks.
Also, if not specified, the notebook behaves as earlier.
I haven't tested after creating the corresponding jsonnet params.

Sorry, missed this. Merged but I'll take a closer look this weekend anyway.

jlewi · 2018-08-23T03:39:43Z

kubeflow/core/kubeform_spawner.py

+
+# Allow environment vars to override uid and gid.
+# This allows local host path mounts to be read/writable
+env_uid = os.environ.get('NOTEBOOK_UID')


@abhi-g How does this compare to the current behavior? i.e. if you run on GKE with a PV for /home/jovyan how does this code changing the UID and GID change what user and group we run the notebook as?

We had a lot of trouble getting the permissions correct when using a PV for the home directory.
The thing to do would be to run it on GKE with the default settings and verify that the user had RW access to the home directory /home/jovyan

I'll definitely test that. However, the last time I tried, I still got stuck on IAM issues on GKE.

jlewi · 2018-08-23T03:40:30Z

scripts/kfctl.sh

@@ -15,9 +15,14 @@ WHAT=$2

 ENV_FILE="env.sh"

+KUBEFLOW_VERSION=${KUBEFLOW_VERSION:-"master"}


Why do you need KUBEFLOW_VERSION?

To download the appropriate repository archive.

This needs to be set here, because createEnv is called after downloading corresponding shell scripts that are needed by kfctl.sh.

jlewi · 2018-08-23T03:41:27Z

scripts/kfctl.sh

@@ -15,9 +15,14 @@ WHAT=$2

 ENV_FILE="env.sh"

+KUBEFLOW_VERSION=${KUBEFLOW_VERSION:-"master"}
+KUBEFLOW_DEPLOY=${KUBEFLOW_DEPLOY:-true}


We shouldn't need KUBEFLOW_DEPLOY anymore

jlewi · 2018-08-23T03:42:02Z

scripts/kfctl.sh

+KUBEFLOW_VERSION=${KUBEFLOW_VERSION:-"master"}
+KUBEFLOW_DEPLOY=${KUBEFLOW_DEPLOY:-true}
+K8S_NAMESPACE=${K8S_NAMESPACE:-"kubeflow"}
+KUBEFLOW_CLOUD=${KUBEFLOW_CLOUD:-"minikube"}


Can we get rid of KUBEFLOW_CLOUD? This should be provided by the parameter --platform passed on kfctl init

jlewi · 2018-08-23T03:43:01Z

scripts/kfctl.sh

@@ -15,9 +15,14 @@ WHAT=$2

 ENV_FILE="env.sh"

+KUBEFLOW_VERSION=${KUBEFLOW_VERSION:-"master"}
+KUBEFLOW_DEPLOY=${KUBEFLOW_DEPLOY:-true}
+K8S_NAMESPACE=${K8S_NAMESPACE:-"kubeflow"}


Why do we need K8S_NAMESPACE to be provided as an environment variable?
This should probably be a parameter of kfctl init if it's really needed

This is already defined in CREATE_ENV

removed K8S_NAMESPACE here as well

jlewi · 2018-08-23T03:45:02Z

scripts/kfctl.sh

+    set +x
+    infer_minikube_settings
+    set -x
+    download_kubeflow_source


Why are you calling download_kubeflow_source here?
The needed source will already be there; source is downloaded by
https://github.com/kubeflow/kubeflow/blob/master/scripts/download.sh

The source is located based on the location of kfctl.sh

Once you get rid of it, it will use your local code and you can test it.

For the local experience, I don't like the idea of a user downloading the kubeflow source first. In fact, in its current form, they only need to download kfctl.sh, and then:
kfctl.sh init appName --platform minikube
cd appName
../kfctl.sh generate all
../kfctl.sh apply all

jlewi · 2018-08-23T03:46:02Z

scripts/kfctl.sh

+    echo HOST_PLATFPORM=${HOST_PLATFPORM} >> ${ENV_FILE}
+    echo MOUNT_LOCAL=${MOUNT_LOCAL} >> ${ENV_FILE}
+    echo MINIKUBE_CMD=\"${MINIKUBE_CMD}\" >> ${ENV_FILE}
+    echo KUBEFLOW_REPO="$(pwd)/kubeflow-${KUBEFLOW_VERSION}" >> ${ENV_FILE}


KUBEFLOW_REPO is already defined on line 24

Based on the experience described above, download_kubeflow_source will get a copy of the repo into the appName dir. This overrides KUBEFLOW_REPO to point to that location.

jlewi · 2018-08-23T03:47:05Z

scripts/kfctl.sh

@@ -92,6 +107,17 @@ createEnv() {
  fi
 }

+# For minikube single script download experience
+function download_kfctl_scripts() {


You can delete this.
We have a separate script, download.sh, that does the download:
https://github.com/kubeflow/kubeflow/blob/master/scripts/download.sh

I can leverage this script to download everything. But from an experience point of view, the user would have to:

- download the download script,
- run it to download the source and scripts,
- find out where kfctl.sh is,
- either run from within that directory (creating the appName dir there) or move the scripts out elsewhere and then make sure the source is pointing to the right location.

That seems a bit more demanding than downloading and running a VM image. The goal with the initial local experience is to keep it really simple first.

jlewi · 2018-08-23T03:47:58Z

scripts/kfctl.sh

+    download_kfctl_scripts
+  fi
+
+  source "${DIR}/util.sh"


Move this back to the top; they will already exist because kfctl.sh doesn't download the scripts.

jlewi

Reviewable status: 0 of 6 files reviewed, 7 unresolved discussions (waiting on @jlewi, @abhi-g, and @richardsliu)

kubeflow/core/kubeform_spawner.py, line 187 at r3 (raw file):

Previously, abhi-g (Abhishek Gupta) wrote…

You are correct. Updating above default values of params to null

In the jsonnet, can we avoid setting the environment variables when we don't want to override the values? Then env_gid and env_uid would be None whenever the environment variables aren't set.

kubeflow/core/prototypes/jupyterhub.jsonnet, line 16 at r3 (raw file):

Previously, jlewi (Jeremy Lewi) wrote…

Did you forget to push, not seeing the change?

Isn't 0 a valid uid? i.e. doesn't 0 mean root?
Should we use -1?

Then in the jsonnet, only set the environment variable if it's >= 0.
Then in the Python code, os.getenv should return None, and we can use that to determine that it's not set.
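
A minimal sketch of the check being discussed here, assuming the spawner config reads `NOTEBOOK_UID` and `NOTEBOOK_GID` as in the diff. The helper name `read_id_override` is hypothetical and is not part of kubeform_spawner.py. It treats an unset variable, a non-integer value such as `null`, and any negative value such as `-1` as "no override", while keeping `0` (root) as a valid ID.

```python
import os
from typing import Mapping, Optional


def read_id_override(name: str,
                     environ: Mapping[str, str] = os.environ) -> Optional[int]:
    """Return the UID/GID override stored in `name`, or None when there is none."""
    raw = environ.get(name)
    if raw is None or raw.strip() == "":
        # Variable not set at all: keep the spawner's default behavior.
        return None
    try:
        value = int(raw)
    except ValueError:
        # Non-numeric placeholders such as "null" mean "no override".
        return None
    # Negative values (e.g. -1) are the "not set" sentinel; 0 is root and valid.
    return value if value >= 0 else None
```

In the config file this could be used as `uid = read_id_override('NOTEBOOK_UID')` followed by an `if uid is not None:` guard, so that `0` is not mistaken for "unset" the way a plain truthiness test would treat it.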

jlewi

Reviewable status: 0 of 6 files reviewed, 8 unresolved discussions (waiting on @jlewi, @abhi-g, and @richardsliu)

scripts/kfctl.sh, line 18 at r7 (raw file):

ENV_FILE="env.sh"

KUBEFLOW_DOCKER_REGISTRY=${KUBEFLOW_DOCKER_REGISTRY:-""}

I think this should be in createEnv.
Right now we are only doing it for platform ack, but we should just always do it.
We only want to inherit environment variables from the environment when init is called;
after that we want to get them from env.sh so that we always use the same values.
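
The "inherit once at init, then always read env.sh" idea can be illustrated with a small Python sketch. This is not the createEnv function from kfctl.sh; the names `create_env` and `load_env` are hypothetical, and the sketch assumes single-line values and falls back to the default only when a variable is unset.

```python
import os
import shlex
from pathlib import Path
from typing import Dict, Mapping


def create_env(env_file: str, defaults: Mapping[str, str],
               environ: Mapping[str, str] = os.environ) -> None:
    """Init phase: snapshot each variable (or its default) into env_file."""
    lines = [
        # shlex.quote keeps values with spaces (or empty values) shell-safe.
        f"{name}={shlex.quote(environ.get(name, default))}"
        for name, default in defaults.items()
    ]
    Path(env_file).write_text("\n".join(lines) + "\n")


def load_env(env_file: str) -> Dict[str, str]:
    """Later phases: read values back from env_file, ignoring the live environment."""
    values: Dict[str, str] = {}
    for line in Path(env_file).read_text().splitlines():
        if not line.strip() or line.lstrip().startswith("#"):
            continue
        name, _, raw = line.partition("=")
        parts = shlex.split(raw)
        values[name.strip()] = parts[0] if parts else ""
    return values
```

Because `load_env` never consults the process environment, a variable that changes between init and apply cannot silently change the deployment, which is the property this comment asks for.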

jlewi · 2018-08-24T17:58:38Z

Just a couple nits and then this should be ready to go.

abhi-g · 2018-08-24T21:57:31Z

Addressed your last couple of comments @jlewi

jlewi · 2018-08-24T23:20:35Z

/lgtm
/approve

k8s-ci-robot · 2018-08-24T23:20:39Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jlewi

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [jlewi]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

abhi-g · 2018-08-25T01:42:16Z

/hold cancel

abhi-g · 2018-08-25T02:08:15Z

adding back lgtm label. Only fixed a 1 line conflict with upstream.

/lgtm

k8s-ci-robot · 2018-08-25T02:08:16Z

@abhi-g: you cannot LGTM your own PR.

abhi-g · 2018-08-25T02:09:10Z

Good to know @k8s-ci-robot

ankushagarwal · 2018-08-25T02:16:31Z

/lgtm

abhi-g · 2018-08-25T16:19:32Z

fixes #1153

* Added a minikube setup script.
* Make local-notebooks accessible via Jupyter Notebooks
* Restructure minikube deployment script to work with kfctl patterns
* Cleanup
* fix comparison in jupyterhub.libsonnet for the new params
* Fix jsonnet string param parsing issues.
* Addressing some of jlewi's comments
* Refactored to split out kfctl.sh and setup-minikube.sh
* Remove unused function.
* Updated to use -1 on the env vars for UID / GID
* Move Docker registry env var into createEnv

…kubeflow#1387)

Added a minikube setup script.

3fe6df9

k8s-ci-robot added the do-not-merge/hold label Aug 19, 2018

googlebot added the cla: yes label Aug 19, 2018

k8s-ci-robot requested review from jlewi and richardsliu August 19, 2018 22:08

k8s-ci-robot added the size/L label Aug 19, 2018

jlewi reviewed Aug 20, 2018

abhi-g added 3 commits August 21, 2018 21:08

Make local-notebooks accessible via Jupyter Notebooks

43b7bbf

Restructure minikube deployment script to work with kfctl patterns

1445ca3

Cleanup

1e35361

abhi-g added 2 commits August 22, 2018 17:01

fix comparison in jupyterhub.libsonnet for the new params

64f71dd

Merge remote-tracking branch 'upstream/master' into jupyter_spawner_l…

d5c95ed

…ocal

jlewi reviewed Aug 23, 2018

jlewi suggested changes Aug 24, 2018

Merge remote-tracking branch 'upstream/master' into jupyter_spawner_l…

c0eafae

…ocal

jlewi suggested changes Aug 24, 2018

jlewi mentioned this pull request Aug 24, 2018

minikube e2e test should use kfctl #1416

Closed

abhi-g added 2 commits August 24, 2018 13:27

Updated to use -1 on the env vars for UID / GID

a2baf09

Move Docker registry env var into createEnv

95124e3

k8s-ci-robot assigned jlewi Aug 24, 2018

k8s-ci-robot added the lgtm label Aug 24, 2018

k8s-ci-robot added the approved label Aug 24, 2018

k8s-ci-robot removed the do-not-merge/hold label Aug 25, 2018

Merge remote-tracking branch 'upstream/master' into jupyter_spawner_l…

092a9b3

…ocal

k8s-ci-robot removed the lgtm label Aug 25, 2018

k8s-ci-robot assigned ankushagarwal Aug 25, 2018

k8s-ci-robot added the lgtm label Aug 25, 2018

k8s-ci-robot merged commit 106e267 into kubeflow:master Aug 25, 2018

jlewi mentioned this pull request Aug 25, 2018

kfctl.sh unable to find component ambassador #1429

Closed

abhi-g deleted the jupyter_spawner_local branch September 2, 2018 14:10
