Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add containerset retry strategy. Fixes #7290 #7377

Merged
merged 22 commits into from
Feb 13, 2022

Conversation

NikeNano
Copy link
Contributor

@NikeNano NikeNano commented Dec 10, 2021

Fixes: #7290

Don't bother creating a PR until you've done this:

  • Run make pre-commit -B to fix codegen, lint, and commit message problems.

Create your PR as a draft.

  • Your PR needs to pass the required checks before it can be approved. If the check is not required (e.g. E2E tests) it
    does not need to pass.
  • Once required tests have passed, you can make it "Ready for review".
  • Say how how you tested your changes. If you changed the UI, attach screenshots.

Tips:

  • If changes were requested, and you've made them, then dismiss the review to get it looked at again.
  • Add you organization to USERS.md if you like.
  • You can ask for help!

@NikeNano NikeNano marked this pull request as draft December 10, 2021 07:30
test/e2e/functional_test.go Outdated Show resolved Hide resolved
@NikeNano NikeNano force-pushed the nikenano/ContainerSetRetry branch 2 times, most recently from 67efb6f to 70b8fe3 Compare December 15, 2021 21:40
@NikeNano NikeNano marked this pull request as ready for review December 15, 2021 22:05
Copy link
Contributor

@alexec alexec left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

initial thoughts:

  • I don't think you need limit, you already have steps in backoff.
  • Do you think we want one strategy for all containers in the set? Or different ones for different containers?

@NikeNano
Copy link
Contributor Author

NikeNano commented Dec 16, 2021

  • I don't think you need limit, you already have steps in backoff.

Yes, it is not needed, will stick to the steps.

  • Do you think we want one strategy for all containers in the set? Or different ones for different containers?

To keep it simpler I think we should have a general one to start of with also based upon the feedback here it seems to solve the most urgent use case.

@alexec
Copy link
Contributor

alexec commented Dec 16, 2021

To keep it simpler I think we should have a general one to start of with also based upon the feedback here it seems to solve the most urgent use case.

Sure. So we can add that in a v2.

@NikeNano
Copy link
Contributor Author

initial thoughts:

  • I don't think you need limit, you already have steps in backoff.

I mixed up it up with wait.Backoff the current implementation of retryStrategy dont have steps in the implementation. The current implementation (not using step) is also similar to the workflow retry strategy that have limit separated:

type RetryStrategy struct {
. Thus I think we should keep the current implementation not using step but limit separate from the backoff. WDYT @alexec

@NikeNano NikeNano requested a review from alexec January 3, 2022 08:53
@sarabala1979 sarabala1979 self-assigned this Jan 10, 2022
@NikeNano
Copy link
Contributor Author

@sarabala1979 could you take a look?

Copy link
Contributor

@alexec alexec left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there are ways to simplify the code.

cmd/argoexec/commands/emissary.go Outdated Show resolved Hide resolved
@alexec alexec changed the title feat: add containerset retry strategy. Fixes: #7290 feat: add containerset retry strategy. Fixes #7290 Jan 27, 2022
Copy link
Contributor

@alexec alexec left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

more changes requested I'm afraid

test/e2e/functional_test.go Show resolved Hide resolved
util/intstr/parametrizable_test.go Outdated Show resolved Hide resolved
workflow/controller/operator.go Outdated Show resolved Hide resolved
cmd/argoexec/commands/emissary.go Outdated Show resolved Hide resolved
@NikeNano
Copy link
Contributor Author

I have realised that the parsing from v1alpha1.Backoff to wait.Backoff is not correct. The reason for that is that the fields actually don't map 1 to 1(miss understood the docs before) . I tried to use the wait.Backoff type as well in the

type ContainerSetTemplate struct {}

but the generation then fails:

2022/01/27 20:32:26 k8s.io/api/core/v1/generated.proto:12:1: warning: Import k8s.io/apimachinery/pkg/runtime/schema/generated.proto is unused.
2022/01/27 20:32:28 k8s.io/api/policy/v1beta1/generated.proto:11:1: warning: Import k8s.io/apimachinery/pkg/runtime/generated.proto is unused.
k8s.io/api/policy/v1beta1/generated.proto:12:1: warning: Import k8s.io/apimachinery/pkg/runtime/schema/generated.proto is unused.
2022/01/27 20:32:28 github.com/argoproj/argo-workflows/v3/pkg/apis/workflow/v1alpha1/generated.proto:1912:18: "duration" is already defined in "github.com.argoproj.argo_workflows.v3.pkg.apis.workflow.v1alpha1.Backoff".
github.com/argoproj/argo-workflows/v3/pkg/apis/workflow/v1alpha1/generated.proto:1918:19: "factor" is already defined in "github.com.argoproj.argo_workflows.v3.pkg.apis.workflow.v1alpha1.Backoff".
github.com/argoproj/argo-workflows/v3/pkg/apis/workflow/v1alpha1/generated.proto:1910:9: "Backoff" is already defined in "github.com.argoproj.argo_workflows.v3.pkg.apis.workflow.v1alpha1".
github.com/argoproj/argo-workflows/v3/pkg/apis/workflow/v1alpha1/generated.proto:8:1: warning: Import github.com/gogo/protobuf/gogoproto/gogo.proto is unused.
github.com/argoproj/argo-workflows/v3/pkg/apis/workflow/v1alpha1/generated.proto:12:1: warning: Import k8s.io/apimachinery/pkg/runtime/generated.proto is unused.
github.com/argoproj/argo-workflows/v3/pkg/apis/workflow/v1alpha1/generated.proto:13:1: warning: Import k8s.io/apimachinery/pkg/runtime/schema/generated.proto is unused.
2022/01/27 20:32:28 protoc -I . -I /go/src -I /go/src --gogo_out=/go/src /go/src/github.com/argoproj/argo-workflows/v3/pkg/apis/workflow/v1alpha1/generated.proto
2022/01/27 20:32:28 Unable to generate protoc on github.com.argoproj.argo_workflows.v3.pkg.apis.workflow.v1alpha1: exit status 1

I guess we could recreate a new type to solve the issue but that feels a bit hacky. WDYT @alexec ?

@NikeNano NikeNano requested a review from alexec January 30, 2022 14:53
@alexec
Copy link
Contributor

alexec commented Jan 31, 2022

You can't have two types named Backoff, so don't use wait.Backoff as a field.

Signed-off-by: NikeNano <niklas.sven.hansson@gmail.com>
Signed-off-by: NikeNano <niklas.sven.hansson@gmail.com>
Signed-off-by: NikeNano <niklas.sven.hansson@gmail.com>
Signed-off-by: NikeNano <niklas.sven.hansson@gmail.com>
Signed-off-by: NikeNano <niklas.sven.hansson@gmail.com>
Signed-off-by: NikeNano <niklas.sven.hansson@gmail.com>
Signed-off-by: NikeNano <niklas.sven.hansson@gmail.com>
Signed-off-by: NikeNano <niklas.sven.hansson@gmail.com>
Signed-off-by: NikeNano <niklas.sven.hansson@gmail.com>
Signed-off-by: NikeNano <niklas.sven.hansson@gmail.com>
Signed-off-by: NikeNano <niklas.sven.hansson@gmail.com>
Signed-off-by: NikeNano <niklas.sven.hansson@gmail.com>
Signed-off-by: NikeNano <niklas.sven.hansson@gmail.com>
Signed-off-by: NikeNano <niklas.sven.hansson@gmail.com>
Signed-off-by: NikeNano <niklas.sven.hansson@gmail.com>
Signed-off-by: NikeNano <niklas.sven.hansson@gmail.com>
Signed-off-by: NikeNano <niklas.sven.hansson@gmail.com>
Signed-off-by: NikeNano <niklas.sven.hansson@gmail.com>
Signed-off-by: NikeNano <niklas.sven.hansson@gmail.com>
Signed-off-by: NikeNano <niklas.sven.hansson@gmail.com>
@NikeNano NikeNano requested a review from alexec February 10, 2022 14:28
VolumeMounts []corev1.VolumeMount `json:"volumeMounts,omitempty" protobuf:"bytes,3,rep,name=volumeMounts"`
Containers []ContainerNode `json:"containers" protobuf:"bytes,4,rep,name=containers"`
VolumeMounts []corev1.VolumeMount `json:"volumeMounts,omitempty" protobuf:"bytes,3,rep,name=volumeMounts"`
RetryStrategy *ContainerSetRetryStrategy `json:"retryStrategy,omitempty" protobuf:"bytes,5,opt,name=retryStrategy"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we could do with a comment for users here as this gets surfaced in docs. How does the retry work for example?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will update with docs.

Signed-off-by: NikeNano <niklas.sven.hansson@gmail.com>
Signed-off-by: NikeNano <niklas.sven.hansson@gmail.com>
@alexec alexec enabled auto-merge (squash) February 13, 2022 23:31
@alexec alexec merged commit 808c561 into argoproj:master Feb 13, 2022
@NikeNano
Copy link
Contributor Author

Thanks for all the feedback @alexec appreciate it, have a hard time to get time off from work so things take slightly longer than I wish for :)

@sarabala1979 sarabala1979 mentioned this pull request Mar 1, 2022
@agilgur5 agilgur5 added area/templates/container-set area/retry-manual Manual workflow "Retry" Action (API/CLI/UI). See retryStrategy for template-level retries labels Oct 7, 2023
@agilgur5 agilgur5 added area/retryStrategy Template-level retryStrategy and removed area/retry-manual Manual workflow "Retry" Action (API/CLI/UI). See retryStrategy for template-level retries labels Apr 25, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support retry stratagy for containers in containerSet
4 participants