Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Support specifying the pattern for transient and retryable errors #4889

Merged
merged 6 commits into from Jan 18, 2021
Merged

feat: Support specifying the pattern for transient and retryable errors #4889

merged 6 commits into from Jan 18, 2021

Conversation

terrytangyuan
Copy link
Member

Signed-off-by: terrytangyuan terrytangyuan@gmail.com

Checklist:

…tryable

Signed-off-by: terrytangyuan <terrytangyuan@gmail.com>
Signed-off-by: terrytangyuan <terrytangyuan@gmail.com>
@terrytangyuan terrytangyuan changed the title feat: Support specifying the pattern for transient errors that are retryable feat: Support specifying the pattern for transient and retryable errors Jan 15, 2021
config/config.go Outdated
@@ -36,6 +36,10 @@ type Config struct {
// MainContainer holds container customization for the main container
MainContainer *apiv1.Container `json:"mainContainer,omitempty"`

// TransientErrorPattern specifies the pattern to match for errors that can be seen as transient
// and retryable.
TransientErrorPattern string `json:"transientErrorPattern,omitempty"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be a rarely used feature, so an environment variable would be the way to configure it - it is much simpler too.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In our case, we'd like to whitelist non-critical errors that can be retried for issues during cluster maintenance/migration, etc., especially for non-production clusters where certain services may not be available.

Good suggestion. I'll add an env var to configure this instead.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just updated. Please take another look.

Signed-off-by: terrytangyuan <terrytangyuan@gmail.com>
return isExceededQuotaErr(err) || apierr.IsTooManyRequests(err) || isResourceQuotaConflictErr(err) || isTransientNetworkErr(err) || apierr.IsServerTimeout(err) || apierr.IsServiceUnavailable(err)
// TRANSIENT_ERROR_PATTERN allows to specify the pattern to match for errors that can be seen as transient
// and retryable.
pattern, _ := os.LookupEnv("TRANSIENT_ERROR_PATTERN")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can move this to matchTransientErr

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Updated.

Signed-off-by: terrytangyuan <terrytangyuan@gmail.com>
@alexec alexec merged commit 25abd1a into argoproj:master Jan 18, 2021
@alexec alexec added this to the v3.0 milestone Jan 18, 2021
@terrytangyuan terrytangyuan deleted the transient-error-pattern branch January 18, 2021 20:11
This was referenced Jan 19, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants