Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Install application CRD and add pipeline application CR to pipeline standalone #2585

Merged
merged 30 commits into from
Jan 16, 2020

Conversation

IronPan
Copy link
Member

@IronPan IronPan commented Nov 9, 2019

  • Add application CRD and application manager
  • Add pipeline application CR
  • Set addOwnerRef=true for pipeline application CR so things will get GCed when application is deleted
  • Set a common label application-crd-id: kubeflow-pipelines for identifying the KFP components. This will be used as identifier by application manager to set ownerref. Note we also set it for application manager themselves as they will get GCed too.

Note we don't set the label for CRD or namespace since they are cluster-scoped resources and application CRD doesn't support such resources. Setting labels on those resources will leads to unexpected GC behavior.


This change is Reviewable

@IronPan
Copy link
Member Author

IronPan commented Nov 9, 2019

/assign @Bobgy

@IronPan
Copy link
Member Author

IronPan commented Nov 9, 2019

fix #2469

Copy link
Contributor

@Bobgy Bobgy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Looks great.
/lgtm

Left a few comments for clarification.

manifests/kustomize/base/kustomization.yaml Outdated Show resolved Hide resolved
manifests/kustomize/base/argo/minio-artifact-secret.yaml Outdated Show resolved Hide resolved
@Bobgy
Copy link
Contributor

Bobgy commented Nov 9, 2019

Also, note that there's an error in e2e tests when deploying these.

error: error validating "STDIN": error validating data: ValidationError(Deployment.spec): unknown field "serviceName" in io.k8s.api.apps.v1beta2.DeploymentSpec; if you choose to ignore these errors, turn validation off with --validate=false

@IronPan
Copy link
Member Author

IronPan commented Nov 11, 2019

/test kubeflow-pipeline-e2e-test

export PIPELINE_VERSION=0.1.31
kubectl apply -f https://storage.googleapis.com/ml-pipeline/pipeline-lite/$PIPELINE_VERSION/namespaced-install.yaml
export PIPELINE_VERSION=0.1.34
for i in {1..2}; do kubectl apply -f https://storage.googleapis.com/ml-pipeline/pipeline-lite/$PIPELINE_VERSION/namespaced-install.yaml && break || sleep 1; done
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you help me understand why we need these retries in README?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Besides, shall we sync this file with documentation on kubeflow/website?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

adding crd definition is a eventual consistent behavior. by adding crd and cr in the same command, somehow master always throw me

error: unable to recognize "STDIN": no matches for kind "Application" in version "app.k8s.io/v1beta1"

which is really pain. blind retry works so i added it here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And regarding the doc, i can update once this is merged.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might take a few seconds for the endpoint to be created. You can watch the Established condition of your CustomResourceDefinition to be true or watch the discovery information of the API server for your resource to show up.
https://kubernetes.io/docs/tasks/access-kubernetes-api/custom-resources/custom-resource-definitions/#create-a-customresourcedefinition

I did some investigation, as you said, applying a CRD is eventual consistency (it takes some time before the k8s api endpoint is ready). So in reality it could take any amount of time before CRDs are ready. Using a retry here may work consistently now, but I think that's related to the time waited during retry just happen to be long enough for the CRD to be established. Any of these conditions may change in the future, e.g. if applying happens faster, or if CRDs take longer to establish, then this retry will break.

Also, everyone who read this the first time would wonder why we need a retry here.

I prefer we make the waiting explicit like the following. It is also pretty self explanatory about what we are doing.

kubectl apply -f crds
kubectl wait --for condition=established --timeout=60s crd/applications.app.k8s.io
kubectl apply -k envs/dev

https://stackoverflow.com/a/57126479

test/deploy-pipeline-lite.sh Outdated Show resolved Hide resolved
@rmgogogo rmgogogo self-assigned this Nov 14, 2019
matchLabels:
application-crd-id: kubeflow-pipelines
descriptor:
version: 0.1.34
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quick check, will our on-call script auto update this field? not familiar with that part.
It's OK if we put that change to another PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should update release playbook to also change this field

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks. will do once this is merged.

Copy link
Contributor

@Bobgy Bobgy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will further investigate the retry issue. Its behavior is strange.

Even if I delete everything from a cluster, second kubectl apply doesn't fail any more.

manifests/kustomize/base/crds/application-crd.yaml Outdated Show resolved Hide resolved
matchLabels:
application-crd-id: kubeflow-pipelines
descriptor:
version: 0.1.34
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should update release playbook to also change this field

@rmgogogo
Copy link
Contributor

rmgogogo commented Jan 9, 2020

kindly ping this PR.
I also prefer change to apply twice, one for CRD and then for others to avoid retry logic.

Copy link
Contributor

@Bobgy Bobgy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! The new changes look good. Some final minor comments.

/approve

@@ -1,4 +1,4 @@
apiVersion: v1
kind: ServiceAccount
metadata:
name: ml-pipeline-viewer-crd-service-account
name: ml-pipeline-viewer-crd-service-account
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: can you revert this change?

test/manifests/crd/kustomization.yaml Outdated Show resolved Hide resolved
@Bobgy
Copy link
Contributor

Bobgy commented Jan 16, 2020

/lgtm

@k8s-ci-robot k8s-ci-robot removed the lgtm label Jan 16, 2020
@Bobgy
Copy link
Contributor

Bobgy commented Jan 16, 2020

/lgtm

@IronPan
Copy link
Member Author

IronPan commented Jan 16, 2020

/approve

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Bobgy, IronPan

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot merged commit a0a39a5 into kubeflow:master Jan 16, 2020
Jeffwan pushed a commit to Jeffwan/pipelines that referenced this pull request Dec 9, 2020
…tandalone (kubeflow#2585)

* install application CRD and add pipeline application CR

* add labels and let application manager to set ownerref

* fix

* address comments

* update test

* update test

* update test

* update readme

* fix test

* update

* update

* update

* Update application-crd.yaml

* fix

* fix

* Update .release.cloudbuild.yaml

* update tests

* Update kustomization.yaml

* Update deploy-pipeline-lite.sh

* Update ml-pipeline-viewer-crd-sa.yaml

* update tests

* update tests

* update tests
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants