feat(orchestration): Allow sibling stages to continue on FAILED_CONTINUE #3252

Merged

Conversation

marchello2000
Contributor

Normally, if a synthetic child stage fails with `FAILED_CONTINUE`, the `CompleteStageHandler` will propagate the `FAILED_CONTINUE` status to the parent. This prevents all subsequent sibling stages from executing.

This change adds an option (similar to the one Tasks already have) to continue execution even when a child stage ends with `status=FAILED_CONTINUE`.

This option can only be set on a stage that is a synthetic child. It is needed for Monitored Deploy, where we want the deploy to proceed even if the monitoring fails.
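As a rough illustration of how this option might be used, here is a hypothetical Kotlin sketch of a parent stage builder opting a synthetic child into this behavior. It is not code from this PR: the stage type is invented, and it assumes the `allowSiblingStagesToContinueOnFailure` flag seen in the diff below is settable on the synthetic child.

```kotlin
import com.netflix.spinnaker.orca.pipeline.StageDefinitionBuilder
import com.netflix.spinnaker.orca.pipeline.graph.StageGraphBuilder
import com.netflix.spinnaker.orca.pipeline.model.Stage

class MonitoredDeployLikeStage : StageDefinitionBuilder {
  override fun beforeStages(parent: Stage, graph: StageGraphBuilder) {
    // Hypothetical synthetic child that evaluates deployment health before the deploy continues.
    graph.add {
      it.type = "evaluateDeploymentHealth"      // made-up stage type, for illustration only
      it.name = "Evaluate deployment health"
      // Assumption: the flag introduced in this PR is settable on the child stage.
      // With it set, a FAILED_CONTINUE here no longer blocks the remaining sibling stages.
      it.allowSiblingStagesToContinueOnFailure = true
    }
  }
}
```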

@@ -142,7 +142,7 @@ class CompleteStageHandler(

     // When a synthetic stage ends with FAILED_CONTINUE, propagate that status up to the stage's
     // parent so that no more of the parent's synthetic children will run.
-    if (stage.status == FAILED_CONTINUE && stage.syntheticStageOwner != null) {
+    if (stage.status == FAILED_CONTINUE && stage.syntheticStageOwner != null && stage.allowSiblingStagesToContinueOnFailure) {
Contributor

Shouldn't this be `!stage.allow...`? (based on the comment about not wanting to propagate up to the parent if siblings are allowed to run)

Contributor Author

Yes, thank you! I changed the name of the variable at the last minute, which inverted the meaning, and forgot to update this. I should find and update the tests for this.

Contributor Author

Oh, look, the existing tests caught it... fixed it up and added a new test.
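For reference, the corrected check reads along these lines (a sketch of the fixed condition, not a verbatim quote of the merged code):

```kotlin
// Only propagate FAILED_CONTINUE from a synthetic child up to its parent when the
// child does NOT allow its sibling stages to keep running.
if (stage.status == FAILED_CONTINUE &&
  stage.syntheticStageOwner != null &&
  !stage.allowSiblingStagesToContinueOnFailure
) {
  // ...propagate FAILED_CONTINUE to the parent stage
}
```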

* @param defaultValue default value to return if key is not present
* @return value, or defaultValue if the key is not present
*/
Object getCurrentOnly(@Nullable Object key, Object defaultValue) {
Contributor

Not sure how I feel about `getCurrentOnly` vs overloading `get()` -- will let someone else weigh in.

Contributor Author

you mean something like

public Object get(@Nullable Object key, Object defaultValue, boolean localOnly)

?

Contributor

+1 on overloading get(...)

Contributor Author

Just curious, why?

I think

stage.context.getCurrentOnly("someValue", false);

is more descriptive than:

stage.context.get("someValue", false, true);

Contributor

@jeyrschabu Oct 25, 2019

get() because the context is a map. Matter of preference, no strong opinions.

Contributor Author

I hear you. This area has caused some confusion in the past (about where something in the context comes from), so I am more inclined to make this very explicit.
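To make the trade-off concrete, here is a small self-contained sketch (plain Kotlin, not orca's actual `StageContext`) contrasting the two shapes discussed above: the explicitly named lookup that this PR keeps, and the hypothetical overloaded `get()` with a boolean flag.

```kotlin
// Toy context: "current" holds keys set on this stage, "inherited" holds keys
// visible from ancestor stages. Not the real StageContext implementation.
class ContextSketch(
  private val current: Map<String, Any?>,
  private val inherited: Map<String, Any?>
) {
  // Shape kept in the PR: the method name says it skips ancestor contexts.
  fun getCurrentOnly(key: String, defaultValue: Any?): Any? =
    if (current.containsKey(key)) current[key] else defaultValue

  // Alternative raised in review (hypothetical): overload get() with a localOnly flag.
  fun get(key: String, defaultValue: Any?, localOnly: Boolean): Any? = when {
    current.containsKey(key) -> current[key]
    !localOnly && inherited.containsKey(key) -> inherited[key]
    else -> defaultValue
  }
}

fun main() {
  val ctx = ContextSketch(
    current = mapOf("dryRun" to true),
    inherited = mapOf("region" to "us-east-1")
  )
  println(ctx.getCurrentOnly("region", "none")) // "none" -- intent is clear at the call site
  println(ctx.get("region", "none", true))      // "none" -- the bare `true` hides the intent
}
```

The explicit name carries the "don't walk ancestor contexts" intent to every call site, which is the author's argument for keeping it.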

marchello2000 added a commit to marchello2000/orca that referenced this pull request Oct 25, 2019
This change depends on spinnaker#3252
Sometimes, `DisableClusterStage` can fail. We don't want failures in `DisableClusterStage`
preventing the subsequent `ShrinkCluster` from trying to run and clean up the canary+baseline clusters.
@marchello2000
Contributor Author

Going to merge this because there are teams blocked on it. If you have feedback, please give it to me and I will incorporate it.

@marchello2000 marchello2000 merged commit ae0e2cf into spinnaker:master Oct 25, 2019
@marchello2000 marchello2000 deleted the mark/allowsSiblingStages branch October 25, 2019 21:55
mergify bot pushed a commit that referenced this pull request Oct 25, 2019
This change depends on #3252
Sometimes, `DisableClusterStage` can fail. We don't want failures in `DisableClusterStage`
preventing the subsequent `ShrinkCluster` from trying to run and clean up the canary+baseline clusters.
marchello2000 added a commit to marchello2000/orca that referenced this pull request Oct 27, 2019
This is a follow up to spinnaker#3259 because I rebased my local changes incorrectly.

Sometimes, `DisableClusterStage` can fail. We don't want failures in `DisableClusterStage`
preventing the subsequent `ShrinkCluster` from trying to run and clean up the canary+baseline clusters

This change builds on top of the functionality introduced in spinnaker#3252
to allow sibling stages to run even if a stage fails (the correct error code will be propagated to the execution if this happens, so it is clear to the user).
mergify bot pushed a commit that referenced this pull request Nov 1, 2019
This is a follow up to #3259 because I rebased my local changes incorrectly.

Sometimes, `DisableClusterStage` can fail. We don't want failures in `DisableClusterStage`
preventing the subsequent `ShrinkCluster` from trying to run and clean up the canary+baseline clusters

This change builds on top of the functionality introduced in #3252
to allow sibling stages to run even if a stage fails (the correct error code will be propagated to the execution if this happens, so it is clear to the user).
sergiopena pushed a commit to sergiopena/orca that referenced this pull request Dec 13, 2019
This is a follow up to spinnaker#3259 because I rebased my local changes incorrectly.

Sometimes, `DisableClusterStage` can fail. We don't want failures in `DisableClusterStage`
preventing the subsequent `ShrinkCluster` from trying to run and clean up the canary+baseline clusters

This change builds on top of the functionality introduced in spinnaker#3252
to allow sibling stages to run even if a stage fails (the correct error code will be propagated to the execution if this happens, so it is clear to the user).