Skip to content
This repository has been archived by the owner on Aug 17, 2023. It is now read-only.

tie kfapp directory with kfdef name and namespace #324

Merged
merged 3 commits into from
May 14, 2020
Merged

tie kfapp directory with kfdef name and namespace #324

merged 3 commits into from
May 14, 2020

Conversation

adrian555
Copy link
Member

Towards #242

This is the simpler fix for the above issue. It does following:

  1. for a new kfdef instance, the kfapp directory is created as /tmp/<namespace>/<name>. The kfdef configuration file is copied to this directory. The cache and kustomize directories are also sub directories to this kfapp dir.

  2. the manifest cache and kustomization files are downloaded and created by the same kfApply function like what kfctl command line does.

  3. when a change in kfdef instance is caught, the Reconcile function will first remove this kfapp directory, then the same kfApply function runs to create/update according to the new kfdef configuration.

  4. when a kubeflow resource is deleted, the same kfApply is run when the reconcilation kicks in. But this time, since the kfapp directory remains and the contents do not get changed, the deleted kubeflow resource will get recreated.

  5. when the kfdef instance is deleted, the kfapp directory is destroyed as well.

@kubeflow-bot
Copy link

This change is Reviewable

@adrian555
Copy link
Member Author

@nakfour
Copy link
Member

nakfour commented May 1, 2020

Thanks @adrian555, as I read this, it is deleting the kfapp even if only a change is detected, correct?

@adrian555
Copy link
Member Author

@nakfour correct with any change to kfdef.

Copy link
Contributor

@animeshsingh animeshsingh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @adrian555
Can we capture this in a quick sequence diagram - essentially with the following verbs for simplicity, and maybe add somewhere in main README. Can be a follow-on PR if needed

  1. create
    (kfdef -> kubeflow instance)
  2. update
    (kfdef -> kubeflow instance)
    (kubeflow instance->kfdef ) [something changed on cluster, reconcile needs to kick in]
  3. delete
    (kfdef -> kubeflow instance)
    (kubeflow instance->kfdef ) [something deleted on cluster, reconcile needs to kick in]

Other than that /lgtm

@animeshsingh
Copy link
Contributor

cc @gabrielwen @jlewi

@jlewi
Copy link
Contributor

jlewi commented May 6, 2020

Thanks @animeshsingh for the heads up. I don't have much comment on the controller.
I think on GCP we are moving away from using a single monolithic operator to manage all of Kubeflow. So I probably won't be having much to say on any code specific to the operator.

On GCP we are moving in the direction of using GitOps and managing it as a loose collection of applications. Happy to discuss more at a community meeting if you are interested.

if you want to see where we are headed.
https://github.com/kubeflow/gcp-blueprints

If you look at the makefile

You can see the pattern is

  1. Hydrate manifests
  2. Apply manifests

@animeshsingh
Copy link
Contributor

Thanks @jlewi - that's news! So does this imply on GCP you are also moving away from kfdef?

@jlewi
Copy link
Contributor

jlewi commented May 7, 2020

So does this imply on GCP you are also moving away from kfdef?

It means that KFDef is becoming redundant with kustomization.yaml. i.e. KFDef is just providing a list of kustomization packages to install. This can be done by listing kustomize packages in kustomization.yaml.

So I think whether we move away from it on GCP depends on whether we find a suitable purpose for having that extra layer of indirection.

@vpavlin
Copy link
Member

vpavlin commented May 12, 2020

This looks good to me - I tested it and was able to deploy into 2 separate namespaces.

The operator reacted when I deleted one of the tracked configMaps and re-created it
The operator updated the deployment when I edited the kfdef
The deployment was correctly deleted based on ownerRef when I deleted the KFDef

Any case I missed and it should be tested? If not LGTM

@adrian555
Copy link
Member Author

Thanks @vpavlin for testing this out. I think the tests you have done covered the cases this PR fixes. I am writing some document according to what Animesh suggested above. Other than that, the code should be complete.

@adrian555
Copy link
Member Author

/retest

1 similar comment
@animeshsingh
Copy link
Contributor

/retest

@adrian555
Copy link
Member Author

the go-iap presubmit seems to fail at the apply phase

INFO     root:util.py:72 Error: failed to apply:  (kubeflow.error): Code 500 with message: coordinator Apply failed for gcp:  (kubeflow.error): Code 500 with message: gcp apply could not
 update deployment manager: could not update storage-kubeflow.yaml; Insert deployment error: googleapi: Error 403: Quota exceeded for quota group 'deploymentMutators' and limit 'Write queries per day' of service 'deploymentmanager.googleapis.com' for consumer 'project_number:29647740582'., rateLimitExceeded

@vpavlin
Copy link
Member

vpavlin commented May 13, 2020

So the problem is quota limit being hit by the CI? Can we do something about it? It is very unfortunate if this is the thing that blocks this PR from being merged.

@animeshsingh
Copy link
Contributor

/retest

@adrian555
Copy link
Member Author

@animeshsingh @vpavlin same error failing to create gcp client from the log. Could @richardsliu @jlewi please help take a look?

@animeshsingh
Copy link
Contributor

/retest

@animeshsingh
Copy link
Contributor

animeshsingh commented May 14, 2020

Thanks @jlewi for getting the quota increased to fix the CI issue
cc @adrian555 @vpavlin

/lgtm
/approve

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: animeshsingh

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot merged commit aaba770 into kubeflow:master May 14, 2020
@vpavlin
Copy link
Member

vpavlin commented May 14, 2020

Awesome, thank you very much

vpavlin pushed a commit to vpavlin/kfctl that referenced this pull request Jul 20, 2020
* tie kfapp directory with kfdef name and namespace

* support multiple kfdefs
crobby pushed a commit to crobby/kfctl that referenced this pull request Feb 25, 2021
* tie kfapp directory with kfdef name and namespace

* support multiple kfdefs
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants