
Refactor the Kubeflow Pipeline ksonnet to better support upgrade #2478

Merged
merged 3 commits into kubeflow:master on Feb 21, 2019

Conversation

@IronPan (Member) commented Feb 14, 2019

Move the pipeline parameters to pipeline.libsonnet so that only user-specified parameters are persisted in
components/params.libsonnet. This allows upgrading the pipeline by deleting the ksonnet package only.
For more details on the pipeline upgrade, see this script:
https://github.com/kubeflow/kubeflow/pull/2479/files#diff-9e6f5278480335e834537d84293b656b



@IronPan (Member Author) commented Feb 14, 2019

/assign @jlewi @kunmingg

@IronPan (Member Author) commented Feb 14, 2019

cc @neuromage

@jlewi (Contributor) commented Feb 14, 2019

The K8s cluster is created during the apply phase.

So if you add an environment during the generate phase, how are you going to deal with the fact that a K8s cluster may not exist? Are you going to point it at a dummy cluster and then change that during the apply phase?

ks param set jupyter jupyterHubAuthenticator iap
ks param set pipeline mysqlPd "${DEPLOYMENT_NAME}-storage-pipeline-db"
ks param set pipeline nfsPd "${DEPLOYMENT_NAME}-storage-pipeline-nfs"
ks param set jupyter jupyterHubAuthenticator iap --env=default
Contributor:

Why are you setting these parameters only in the default environment?

Member Author:

The immediate reason is that I want to be able to persist the configuration when I upgrade JupyterHub; otherwise it might be lost.

Does it make sense for all customizations (non-default parameters) to be set in a specific environment?

Contributor:

OK, so you're depending on environment-specific overrides to keep track of user-defined parameter overrides so you can preserve them on update.

Are you planning on adding an E2E test to verify that behavior to ensure refactorings don't break it?

Contributor:

I think another option that might be better would be to define the default parameters inside your .libsonnet file, e.g. in your libsonnet:

params = defaultParams + overrides

Don't define optional parameters inside your prototype, so parameters will only be defined when the user explicitly decides to override a parameter.

The advantage of this approach is that it doesn't depend on using specific environments, so I think it's less brittle.
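
For illustration, a rough sketch of that pattern in a component libsonnet (names and values here are hypothetical, not the actual pipeline parameters):

// pipeline.libsonnet (sketch only)
{
  // Defaults live in the library rather than in the prototype,
  // so they are never written into components/params.libsonnet.
  defaultParams:: {
    apiImage: "gcr.io/example/api-server:latest",  // hypothetical default
    mysqlPd: "",                                   // no pre-existing disk by default
  },

  // Late-bind the effective parameters: user overrides win over defaults.
  parts(overrides):: {
    local params = $.defaultParams + overrides,

    apiServerDeployment: {
      apiVersion: "apps/v1beta1",
      kind: "Deployment",
      metadata: { name: "ml-pipeline-api-server" },
      spec: {
        template: {
          spec: {
            containers: [{ name: "api-server", image: params.apiImage }],
          },
        },
      },
    },
  },
}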

@IronPan (Member Author) Feb 18, 2019:

I am not sure I entirely follow the idea here, but here are some questions:

define the default parameters inside your .libsonnet file

If so, would the user not be able to run ks param list to get the parameter default values?

params = defaultParams + overrides

Where should I store the overrides, and how do I ensure the override values persist when I delete the component?

Given that we rely on the default environment and we are on the way to deprecating ksonnet, is it OK to use the default environment as the current approach does?

Contributor:

If so, would the user not be able to run ks param list to get the parameter default values?

Yes.

Where should I store the overrides, and how do I ensure the override values persist when I delete the component?

You would define the default values in the .libsonnet file for pipelines, e.g. like we do for tf-serving:
https://github.com/kubeflow/kubeflow/blob/master/kubeflow/tf-serving/tf-serving.libsonnet#L5

User-defined overrides would still be defined in the params.libsonnet files, but they wouldn't necessarily be tied to a specific environment.

Given that we rely on the default environment and we are on the way to deprecating ksonnet, is it OK to use the default environment as the current approach does?

Up to you, but I think the current approach is more brittle. Specifically, you are relying on logic in kfctl to correctly set the parameters for just the default environment so that defaults are still preserved in the components/params.libsonnet file.

At a minimum, you'll also need to reimplement this logic in the kfctl Go binary.

The approach I suggest doesn't have that brittleness, and it doesn't rely on users only overriding parameters in a specific environment.
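
To make that concrete, components/params.libsonnet would then carry only what the user explicitly set. For example, after ks param set pipeline mysqlPd ... it might look roughly like this (value hypothetical):

// components/params.libsonnet (sketch; only user-specified overrides appear here)
{
  global: {
    // user-defined global parameters, shared across components
  },
  components: {
    pipeline: {
      // the only value the user explicitly set; everything else
      // falls back to defaultParams defined in pipeline.libsonnet
      mysqlPd: "my-deployment-storage-pipeline-db",
    },
  },
}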

Member Author:

Thanks for the sample, it helped me understand the idea. Done.

@IronPan (Member Author) commented Feb 14, 2019

I thought the cluster is created at the "generate platform" step, and this change is at the next "generate k8s" step. Please correct me if I'm missing anything (on-prem, maybe).

@jlewi (Contributor) commented Feb 14, 2019

No infrastructure is created during generate; only config files are created.

ksonnet tries to talk to the master when creating a new environment. So the K8s cluster needs to exist, or you need to somehow fake it (or maybe there's some flag to disable that). The K8s cluster won't exist until after we run apply platform.

@IronPan (Member Author) commented Feb 15, 2019

Ah, sorry, I had a typo in my previous reply.

The cluster is created at the "apply platform" step, and this change is at the next "generate k8s" step.

So when the step that includes this change is run, the cluster should already exist.
I also tested it end to end with the subsequent change
#2479, and things are working as expected.

@jlewi (Contributor) commented Feb 15, 2019

So the intended order of operations is

  1. generate platform
  2. generate k8s
  3. apply platform
  4. apply k8s

In other words it should be possible to generate configs without depending on any infrastructure existing.

We don't have a good pattern right now for reoptimizing based on your actual cluster.

So if you want to make that work, I think you need to:

  1. During generate, call ks env add with --api-spec and point it at the URL for the appropriate K8s version spec.
  2. During apply, reset the server associated with your environment.

@IronPan (Member Author) commented Feb 18, 2019

My understanding of how a ksonnet app works is that initializing a ksonnet app with ks init (https://github.com/kubeflow/kubeflow/blob/master/scripts/util.sh#L62)
requires an active kubeconfig context to be set correctly; otherwise I see

++ ks init ks_app --skip-default-registries
ERROR No current context found. Make sure a kubeconfig file is present

See also this issue: ksonnet/ksonnet#251

The order you describe works because ks init was using the context from a previous cluster (verified from the log). This is probably not what we're supposed to do.

Should we stick with the order in the official documentation?
https://www.kubeflow.org/docs/started/getting-started-gke/#deploy-kubeflow-on-gke-using-the-command-line

@jlewi (Contributor) commented Feb 20, 2019

The desired order of operations is generate and then apply.
The instructions in the GKE guide are a workaround; we never got around to implementing the fix to run ks init without an existing cluster. We have other places where we do it.

I think it would be a mistake to assume the cluster exists; that will break things in the future.

@IronPan (Member Author) commented Feb 21, 2019

I've used the alternative approach you suggested, and there is no need to create the ks env upfront.
Thanks.

@jlewi (Contributor) commented Feb 21, 2019

nit: Could you please update the PR description and title?

@jlewi (Contributor) commented Feb 21, 2019

/lgtm
/approve
/hold

Please update the PR title and description before canceling the hold and merging.

@k8s-ci-robot (Contributor) commented

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jlewi

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@IronPan changed the title from "create default ks environment during 'generate k8s' step" to "Refactor the Kubeflow Pipeline ksonnet to better support upgrade" on Feb 21, 2019
@IronPan (Member Author) commented Feb 21, 2019

/hold cancel

@k8s-ci-robot merged commit 2857c3c into kubeflow:master on Feb 21, 2019
saffaalvi pushed a commit to StatCan/kubeflow that referenced this pull request Feb 11, 2021
…eflow#2478)

* create default environment during  step

* address comment
saffaalvi pushed a commit to StatCan/kubeflow that referenced this pull request Feb 12, 2021
…eflow#2478)

* create default environment during  step

* address comment