-
Notifications
You must be signed in to change notification settings - Fork 631
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add more retries to more components #423
Conversation
kayenta-atlas/src/main/java/com/netflix/kayenta/atlas/metrics/AtlasMetricsService.java
Outdated
Show resolved
Hide resolved
@@ -29,7 +29,7 @@ | |||
import com.netflix.kayenta.security.AccountCredentialsRepository; | |||
import com.netflix.kayenta.storage.ObjectType; | |||
import com.netflix.kayenta.storage.StorageService; | |||
import com.netflix.spinnaker.kork.core.RetrySupport; | |||
import com.netflix.kayenta.util.Retry; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It occurred to me after I finished reviewing the PR that I had seen this logic before. I search kayenta on github and ended up back at this file. Wouldn't it make more sense to just add the new logic to the existing RetrySupport
class in kork?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a temporary step along the path to make things more context aware, so I'd prefer to keep it inside kayenta for now.
|
||
import java.util.function.Supplier; | ||
|
||
public class Retry { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please have a look at spring-retry.
It might be a good solution in case there is a need for more sophisticated retry policies: like configuring which types of exceptions should be retried, adding retry metrics or providing different backoff policies.
* feat(retry): add a retry util * bug(retry): retry Atlas actions * bug(retry): improve retry in ConfigBin * bug(retry): retry some s3 calls that seem prone to failure
While I would like a more general approach, and this is not quite as clean as I'd like, it's a step toward getting things closer to sane.
I don't yet have metrics wired up, but will add this eventually; I may also split out the "call this method until it stops throwing exceptions" type of call now into a more sane context (http, atlas) aware retry system. For now, this will work better than what was there.