Argo workflow e2e test #777
Conversation
/retest
@@ -162,7 +162,7 @@
       {
         name: "e2e",
         dag: {
-          tasks: [
+          tasks: std.prune([
why do you need to prune here?
For GKE, the argo workflow step will be {}, and std.prune removes that entry.
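For reference, here is a minimal illustration of std.prune's behavior (illustrative values only, not the actual workflow code):

    // std.prune drops null, empty-object, and empty-array entries,
    // so a step that evaluates to {} simply disappears from the list.
    {
      pruned: std.prune([{ name: "a" }, {}, null, { name: "b" }]),
      // evaluates to { pruned: [{ name: "a" }, { name: "b" }] }
    }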
I thought we just needed to prune here: https://github.com/kubeflow/kubeflow/blob/master/testing/workflows/components/workflows.jsonnet#L13?
I would prefer explicitly pruning the task list rather than relying on pruning at a higher level. Pruning at a higher level is not a good practice, because some parts of the manifest require an explicit empty object, e.g. https://github.com/kubeflow/kubeflow/blob/master/kubeflow/core/cert-manager.libsonnet#L7.
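To illustrate the concern, here is a hedged sketch loosely modeled on the linked cert-manager component (the field and resource names are assumptions, not copied from the file): a self-signed Issuer requires a literal empty object, which pruning at the manifest level would strip.

    {
      apiVersion: "certmanager.k8s.io/v1alpha1",
      kind: "Issuer",
      metadata: { name: "self-signed-issuer" },  // hypothetical name
      spec: {
        // This empty object is required; running std.prune over the
        // whole manifest would delete it and break the resource.
        selfSigned: {},
      },
    }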
{ | ||
name: "pytorchjob-deploy", | ||
template: "pytorchjob-deploy", | ||
dependencies: ["deploy-kubeflow"], | ||
}, | ||
// Don't run argo test for gke since | ||
// it runs in the same cluster as the | ||
// test cluster. For minikube, we have |
Isn't it deployed in its own namespace, so it's okay for it to be in the same GKE cluster?
Having multiple Argo controllers in the same cluster causes race conditions for Argo workflows; we had downtime in the test infrastructure because of this last week.
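Putting the two pieces together, the exclusion can be expressed as a conditional entry inside the pruned task list. A sketch, assuming a platform parameter and a hypothetical test-argo-deploy task name (neither is taken verbatim from this diff):

    local platform = "gke";  // assumed parameter: "gke" or "minikube"
    {
      tasks: std.prune([
        {
          name: "pytorchjob-deploy",
          template: "pytorchjob-deploy",
          dependencies: ["deploy-kubeflow"],
        },
        // On gke this yields {}, which std.prune removes, so the
        // shared test cluster never runs a second Argo controller.
        if platform == "minikube" then {
          name: "test-argo-deploy",  // hypothetical task name
          template: "test-argo-deploy",
          dependencies: ["deploy-kubeflow"],
        } else {},
      ]),
    }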
I see
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: lluunn

The full list of commands accepted by this bot can be found here. The pull request process is described here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment.
* feat: Add bayesian
* fix: Fix requirement
* fix: Update
* feat: Add config in cm
* fix: Use 120s as timeout
* fix: Update
* fix: Fix timeout
* fix: Address comments
* fix: Fix command
* fix: Fix path
* fix: Change the period
* fix: Add version in requirements
* pkg: Fix period
* fix: Add build
* fix: Fix inital delay

(all commits Signed-off-by: Ce Gao <gaoce@caicloud.io>)
Fixes #740