feat(clouddriver): Basic support for retryable kato tasks #3162

Merged: 2 commits, Sep 17, 2019

robzienert (Member) commented Sep 17, 2019

Depends on spinnaker/kork#383

cfieber (Contributor) commented Sep 17, 2019

Thanks I Love It

@@ -48,10 +50,12 @@ class MonitorKatoTask implements RetryableTask {

  private final Clock clock
  private final Registry registry
  @Autowired private KatoService kato
Contributor

Could you add this to the constructor instead?

Member Author

yeeeeeahh... I tried once and wound up reverting it. I'll try again.

this(registry, Clock.systemUTC())
this.kato = kato
Contributor

This seems wrong without KatoService as a constructor arg; there is no kato parameter to assign from.
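
For readers following the two constructor threads above, here is a minimal sketch of the shape being asked for, written in plain Java rather than the PR's Groovy. `MonitorTaskSketch` and the empty `Registry` and `KatoService` interfaces are hypothetical stand-ins, not the real orca types; only the idea (take the service as a constructor argument and delegate to one full constructor) comes from the review.

```java
import java.time.Clock;
import java.util.Objects;

public class MonitorTaskSketch {
    // Hypothetical placeholders for the real Registry and KatoService types.
    public interface Registry {}
    public interface KatoService {}

    private final KatoService kato;
    private final Registry registry;
    private final Clock clock;

    // Convenience constructor: defaults to the system UTC clock.
    public MonitorTaskSketch(KatoService kato, Registry registry) {
        this(kato, registry, Clock.systemUTC());
    }

    // Full constructor: every dependency is passed in, so all fields can be
    // final and a test can supply a fixed Clock.
    public MonitorTaskSketch(KatoService kato, Registry registry, Clock clock) {
        this.kato = Objects.requireNonNull(kato, "kato");
        this.registry = Objects.requireNonNull(registry, "registry");
        this.clock = Objects.requireNonNull(clock, "clock");
    }

    public KatoService kato() { return kato; }
    public Registry registry() { return registry; }
    public Clock clock() { return clock; }
}
```

If a class like this is wired by Spring, having two constructors generally means marking one of them with @Autowired so the container knows which to call; that annotation is left out here to keep the sketch free of dependencies.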

import com.netflix.spinnaker.orca.pipeline.model.Stage
import groovy.transform.CompileStatic
import groovy.transform.TypeCheckingMode
import io.github.resilience4j.retry.RetryConfig
Contributor

I don't see this import used anywhere in here. Is it left over from an earlier revision?

Contributor

cfieber left a comment

Minor comments, but this feels like the future we were promised, jetpacks and all.

default:
  maxRetryAttempts: 3
  waitDuration: 10s
  enableExponentialBackoff: true
Contributor

We talked in person, but these defaults will blow orca's 60s task ack timeout, which is going to cause issues (double executions, etc.).
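
One rough way to sanity-check a retry policy against the 60s ack timeout is to add up the worst-case time it can hold a task. The sketch below is an illustration, not orca or kork code: `RetryBudget`, the backoff multiplier of 2, and the 20s per-call timeout are all assumptions, and `maxRetryAttempts` is treated as the total number of attempts.

```java
import java.time.Duration;

public final class RetryBudget {
    private RetryBudget() {}

    /** Time spent waiting between attempts; excludes time spent in the calls themselves. */
    public static Duration totalWait(int maxAttempts, Duration initialWait, double multiplier) {
        Duration total = Duration.ZERO;
        double waitMillis = initialWait.toMillis();
        // N attempts are separated by at most N - 1 waits.
        for (int i = 1; i < maxAttempts; i++) {
            total = total.plusMillis(Math.round(waitMillis));
            waitMillis *= multiplier; // exponential backoff grows each wait
        }
        return total;
    }

    /** Worst case: every attempt runs until its timeout, plus all waits in between. */
    public static Duration worstCase(int maxAttempts, Duration initialWait,
                                     double multiplier, Duration perCallTimeout) {
        return totalWait(maxAttempts, initialWait, multiplier)
                .plus(perCallTimeout.multipliedBy(maxAttempts));
    }

    public static void main(String[] args) {
        Duration ackTimeout = Duration.ofSeconds(60);
        // Values from the config above; multiplier and per-call timeout are assumed.
        Duration worst = worstCase(3, Duration.ofSeconds(10), 2.0, Duration.ofSeconds(20));
        System.out.println("worst case seconds: " + worst.getSeconds());
        System.out.println("exceeds ack timeout: " + (worst.compareTo(ackTimeout) > 0));
    }
}
```

Under those assumptions the waits alone add up to 30s and the worst case to 90s, which is past the 60s window; a smaller multiplier or shorter per-call timeout changes the numbers but the check stays the same.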


if (status == ExecutionStatus.TERMINAL && katoTask.status.retryable) {
  kato.resumeTask(katoTask.id)
  status = ExecutionStatus.RUNNING
Contributor

Probably worth at least logging something here to say that we are going to retry.
You are basically hiding an error from the user: the error may go away on retry, or it may turn into a different error, and either way that makes issues harder to debug. (This is the thing we talked about: some standardization of error messages in orca... someday.)

Member Author

You're right. I've updated the PR to show usage of the SystemNotification stuff I was talking about.
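
The point about not hiding the original error can be sketched as follows. This is a simplified Java illustration, not the code from the PR: `RetryableTaskMonitor`, `TaskClient`, and the `Status` enum are hypothetical stand-ins for `MonitorKatoTask`, `KatoService`, and `ExecutionStatus`, and `java.util.logging` stands in for whatever logging or SystemNotification mechanism the PR actually uses.

```java
import java.util.logging.Logger;

public class RetryableTaskMonitor {
    public enum Status { RUNNING, SUCCEEDED, TERMINAL }

    /** Hypothetical client; only the resumeTask name mirrors the diff above. */
    public interface TaskClient {
        void resumeTask(String taskId);
    }

    private static final Logger log =
            Logger.getLogger(RetryableTaskMonitor.class.getName());

    private final TaskClient client;

    public RetryableTaskMonitor(TaskClient client) {
        this.client = client;
    }

    /**
     * Returns the status the monitor should report. A retryable terminal
     * failure is logged first, so the original error stays visible, and the
     * task is then resumed and reported as still running.
     */
    public Status handle(String taskId, Status status, boolean retryable, String failureMessage) {
        if (status == Status.TERMINAL && retryable) {
            log.warning(String.format(
                    "Task %s failed with a retryable error, resuming it: %s",
                    taskId, failureMessage));
            client.resumeTask(taskId);
            return Status.RUNNING;
        }
        // Non-retryable failures and all other statuses pass through unchanged.
        return status;
    }
}
```

Recording the failure message before resuming means that if the retry later fails with a different error, both errors are still available when debugging.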

robzienert force-pushed the kato-task-retry branch 2 times, most recently from 6859cdb to 2473141 (September 17, 2019 19:47)
robzienert merged commit caa471c into spinnaker:master on Sep 17, 2019