-
Notifications
You must be signed in to change notification settings - Fork 805
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(pipelines): Fixed interpretation of missing startTime during inst… #2820
Conversation
orca-queue/src/main/kotlin/com/netflix/spinnaker/orca/q/handler/StartStageHandler.kt
Show resolved
Hide resolved
other than missing @robfletcher correct me if I am wrong |
Note I kept the RUNNING where it was after the withAuth but not before the
origin save with the time.
Is that OK? I'm not sure how that is used or the implications. Please
confirm.
…On Fri, Apr 5, 2019 at 2:22 PM Mark Vulfson ***@***.***> wrote:
other than missing stage.status = RUNNING lgtm, I don't think setting
startTime earlier will have any consequences and agree with you it would
be nice to set startTime somewhere in a generic way that doesn't seem to
exist.
@robfletcher <https://github.com/robfletcher> correct me if I am wrong
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
<#2820 (comment)>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/AEKkGI-rzVGrpwnBNDiLIyxxG2JXmbWIks5vd5RZgaJpZM4cfUbi>
.
|
@ewiseblatt I think setting the startTime prior to setting RUNNING is OK since all this state is wrapped in a message that's pushed to the queue. |
My concern was more if the exception is thrown and it is in the RUNNING
state but never reached that original code point.
I dont know how the states are used and whether someone knows RUNNING means
it completed the earlier step that is now failing.
…On Fri, Apr 5, 2019 at 2:56 PM Jacob Kiefer ***@***.***> wrote:
@ewiseblatt <https://github.com/ewiseblatt> I think setting the startTime
prior to setting RUNNING is OK since all this state is wrapped in a message
that's pushed to the queue.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#2820 (comment)>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/AEKkGOX5GoHo8-skhzaGbV7emPpzr25wks5vd5xjgaJpZM4cfUbi>
.
|
It feels odd to me to not have the state and the startTime set together but it's probably very low impact as that RUNNING state is going to get either set or replaced with TERMINAL in very short order. |
…rumentation Fixes spinnaker/spinnaker#4244 There are conditions in which CompleteStageHandler can be called without having a startTime set. Often this is because an exception was thrown before the startTime was set, but there are various guards within StartStageHandler as well. Previously when this happened, a StartTime of 0 was used. This is particularly bad because the 40-years worth of nanoseconds that is spit out of spectator is beyond the resolution of "double" for metric stores that use double for time. This fix changes the assumption to be the end time, resulting in assuming 0 time. Not that this could possibly be a different type of issue if the underlying problem was something else, such as StartStageHandler not even being called for some reason. In addition I set the startTime at the beginning of the try block where exceptions may be thrown in order to ensure there is a startTime when going through that control flow. I think this might be messy depending on how time and state are used because it has a startTime but is not in the RUNNING state. There does not seem to be an appropriate state definition for this particular phase and am not sure the implications of adding one. I also dont know if moving the startTime so it includes this extra check has other implications. If so then the startTime could be reset where it is currently set.
orca-queue/src/main/kotlin/com/netflix/spinnaker/orca/q/handler/CompleteStageHandler.kt
Show resolved
Hide resolved
Being in the RUNNING state affected the handler guard checking for NOT_STARTED (line 71) causing it to be ignored. I dont really understand the control flow here, and could not follow along in a debugger because kotlin/mockito jumps all over the place making it impossible for me to grok that way. I suspect there is a better fix and could involve adding another status between NOT_STARTED and RUNNING but that is beyond my level of intimacy with the model here. fixed tests to accomodate the second storeStage call.
I needed to make a change to satisfy the tests (and fix the cardinality checks in some tests). Please take another look. |
stage.withAuth { | ||
stage.plan() | ||
} | ||
|
||
stage.status = RUNNING |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's actually better to have this together with stage.startTime
as @robfletcher noted. Looking at CompleteStageHandler
(https://github.com/spinnaker/orca/blob/master/orca-queue/src/main/kotlin/com/netflix/spinnaker/orca/q/handler/CompleteStageHandler.kt#L56) it seems that it will be harmless. And it will be more accurate
@ewiseblatt should we merge this PR? |
…rumentation
Fixes spinnaker/spinnaker#4244
There are conditions in which CompleteStageHandler can be called without
having a startTime set. Often this is because an exception was thrown
before the startTime was set, but there are various guards within
StartStageHandler as well.
Previously when this happened, a StartTime of 0 was used. This is
particularly bad because the 40-years worth of nanoseconds that is
spit out of spectator is beyond the resolution of "double" for metric
stores that use double for time. This fix changes the assumption
to be the end time, resulting in assuming 0 time. Not that this could
possibly be a different type of issue if the underlying problem was
something else, such as StartStageHandler not even being called for some
reason.
In addition I set the startTime at the beginning of the try block where
exceptions may be thrown in order to ensure there is a startTime when
going through that control flow. I think this might be messy depending
on how time and state are used because it has a startTime but is not in
the RUNNING state. There does not seem to be an appropriate state definition
for this particular phase and am not sure the implications of adding one.
I also dont know if moving the startTime so it includes this extra check
has other implications. If so then the startTime could be reset where it
is currently set.
@robfletcher