fix(scheduler): Pipeline terminating fix on rebalance #5703
Conversation
g.Expect(err).To(BeNil())

//to allow events to propagate
time.Sleep(500 * time.Millisecond)
Are these the ugly sleeps you were telling me about? Not sure they are avoidable given the nature of the local (async) event hub. But (note to self) we should probably do some concurrent operation tests on a single object to make sure the state remains consistent. I believe we could even do that under the current consistency checking k6 test, by having max 1 model/pipeline.
I had a further look at this specific issue and these sleeps are required only because of the way the test is set up. In practice SetPipelineState is called as a reaction to an event, so the concurrency issue in this test is hopefully just artificial.
I agree though that it should be rigorously tested.
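A concurrent-operation test along those lines could look roughly like the sketch below; newTestPipelineHandler, GetPipelineVersion, the import path and the field names are assumptions for illustration, not the actual scheduler test API.

package pipeline_test

import (
	"sync"
	"testing"

	. "github.com/onsi/gomega"

	"github.com/seldonio/seldon-core/scheduler/v2/pkg/store/pipeline" // assumed import path
)

// Sketch only: hammer a single pipeline with concurrent state updates and
// check that the stored state is still consistent afterwards.
func TestConcurrentStateUpdatesSinglePipeline(t *testing.T) {
	g := NewGomegaWithT(t)
	handler := newTestPipelineHandler(t) // assumed helper building an in-memory handler

	const workers = 10
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// all goroutines target the same pipeline name/version/UID
			_ = handler.SetPipelineState("p1", 1, "uid-1", pipeline.PipelineCreating, "test", "test-source")
		}()
	}
	wg.Wait()

	// after the race the handler should report a single, consistent state
	pv, err := handler.GetPipelineVersion("p1", 1, "uid-1") // assumed accessor
	g.Expect(err).To(BeNil())
	g.Expect(pv.State.Status).To(Equal(pipeline.PipelineCreating))
}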
if err := c.pipelineHandler.SetPipelineState(pv.Name, pv.Version, pv.UID, pipeline.PipelineCreating, "Rebalance", sourceChainerServer); err != nil {
	logger.WithError(err).Errorf("Failed to set pipeline state to creating for %s", pv.String())
// we do not need to set pipeline state to creating if it is already in terminating state, and we need to delete it
if pv.State.Status == pipeline.PipelineTerminating {
Noted that we can't be in the "Terminated" state here because we only fetch running pipelines initially (with "Terminating" being considered one of the running states).
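For context, a minimal sketch of that filtering assumption; isRunningStatus, the PipelineStatus type and any status names beyond PipelineCreating/PipelineTerminating are assumed for illustration rather than taken from the scheduler code:

// Hypothetical helper illustrating "Terminating counts as running": only
// pipelines whose status passes this filter are fetched initially, so a
// Terminated pipeline can never reach the rebalance path above.
func isRunningStatus(s pipeline.PipelineStatus) bool {
	switch s {
	case pipeline.PipelineCreating, pipeline.PipelineReady, pipeline.PipelineTerminating:
		return true
	default:
		return false
	}
}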
lgtm
This PR fixes a bug where a pipeline in a terminating state gets recreated when a dataflow subscription causes a rebalance. In this case we now keep marking the pipeline as terminating and issue a delete operation to dataflow-engine. Also added test coverage for this part of the code base.
Fixes INFRA-1023 (internal)
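For readers skimming the diff, a simplified sketch of the behaviour described above (not the verbatim patch); sendPipelineDeleteToDataflowEngine is an assumed helper name for whatever actually dispatches the delete:

// Simplified sketch of the rebalance handling this PR introduces: a
// terminating pipeline is kept terminating and deleted downstream instead
// of being recreated.
if pv.State.Status == pipeline.PipelineTerminating {
	// keep marking the pipeline as terminating ...
	if err := c.pipelineHandler.SetPipelineState(
		pv.Name, pv.Version, pv.UID, pipeline.PipelineTerminating, "Rebalance", sourceChainerServer); err != nil {
		logger.WithError(err).Errorf("Failed to set pipeline state to terminating for %s", pv.String())
	}
	// ... and issue the delete operation to dataflow-engine (assumed helper name)
	c.sendPipelineDeleteToDataflowEngine(pv)
	return
}

// otherwise proceed with the existing recreate path
if err := c.pipelineHandler.SetPipelineState(
	pv.Name, pv.Version, pv.UID, pipeline.PipelineCreating, "Rebalance", sourceChainerServer); err != nil {
	logger.WithError(err).Errorf("Failed to set pipeline state to creating for %s", pv.String())
}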