[CORE-1441] [2.4.x] leave pipelines in STANDBY until they're FINISHED #8500

Merged: 1 commit into 2.4.x on Jan 10, 2023

@msteffen msteffen commented Jan 8, 2023

This is the 2.4.x version of #8499

This PR changes the logic in monitorPipeline so that if a monitorPipeline
goro's pipeline has a commit in FINISHING, the pipeline goes into RUNNING
and stays there until the commit is FINISHED.
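A minimal sketch of the new transition rule, not the actual `monitor.go` change. The names `commitState`, `pipelineState`, and `nextState` are hypothetical, and returning to STANDBY once the commit is FINISHED is an assumption about what happens after the pipeline leaves RUNNING:

```go
package main

import "fmt"

// Hypothetical types for illustration; the real PFS/PPS types differ.
type commitState int

const (
	commitReady commitState = iota
	commitFinishing
	commitFinished
)

type pipelineState string

const (
	standby pipelineState = "STANDBY"
	running pipelineState = "RUNNING"
)

// nextState models the rule described above: a commit in READY or
// FINISHING puts the pipeline in RUNNING, and it stays there until the
// commit is FINISHED. Going back to STANDBY afterwards is an assumption.
func nextState(cur pipelineState, c commitState) pipelineState {
	switch c {
	case commitReady, commitFinishing:
		return running
	case commitFinished:
		return standby
	}
	return cur
}

func main() {
	// After a PPS master restart, monitorPipeline still starts in STANDBY.
	state := standby
	for _, c := range []commitState{commitFinishing, commitFinishing, commitFinished} {
		state = nextState(state, c)
		fmt.Println(state)
	}
}
```

With a commit already in FINISHING at startup, this sketch moves the pipeline to RUNNING on the first event instead of leaving it in STANDBY.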

When monitorPipeline starts, the first thing it does is put its pipeline
in STANDBY[1]. Then, it listens on ciChan for commits in READY and takes
the pipeline out of STANDBY if it gets any[2]. Meanwhile, another goro
calls SubscribeCommit and puts any events into ciChan[3].

The issue is that the first goro discards PFS events where the commit is
FINISHING, so if the PPS master restarts while a commit is in FINISHING,
it'll come back up, put the pipeline in STANDBY, and then won't take it
out. This leaves the commit stuck in FINISHING until compaction finishes
much, much later.
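The pre-fix behavior can be modeled roughly as below. This is a simplified sketch, not the code in `monitor.go`: `monitorOld`, `commitState`, and `pipelineState` are hypothetical names, the event slice stands in for SubscribeCommit, and only the STANDBY/READY/FINISHING handling described above is modeled:

```go
package main

import "fmt"

// Hypothetical types for illustration; the real PFS/PPS types differ.
type commitState int

const (
	commitReady commitState = iota
	commitFinishing
)

type pipelineState string

const (
	standby pipelineState = "STANDBY"
	running pipelineState = "RUNNING"
)

// monitorOld models the old loop: put the pipeline in STANDBY first,
// then leave STANDBY only when a READY commit arrives on ciChan.
func monitorOld(events []commitState) pipelineState {
	ciChan := make(chan commitState)

	// Stand-in for the goroutine that calls SubscribeCommit and
	// forwards every event into ciChan.
	go func() {
		defer close(ciChan)
		for _, e := range events {
			ciChan <- e
		}
	}()

	state := standby // first action on startup
	for e := range ciChan {
		if e == commitReady {
			state = running
		}
		// Any other event, including FINISHING, is discarded.
	}
	return state
}

func main() {
	// PPS master restarts while the only open commit is FINISHING.
	fmt.Println(monitorOld([]commitState{commitFinishing})) // prints "STANDBY"
	// Normal case: a READY commit arrives.
	fmt.Println(monitorOld([]commitState{commitReady})) // prints "RUNNING"
}
```

In the first call nothing ever takes the pipeline out of STANDBY, which mirrors the stuck-in-FINISHING case.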

[1]: https://github.com/pachyderm/pachyderm/blob/8fca5b912358c15f67d8b19a32f84d2ba507571c/src/server/pps/server/monitor.go#L137
[2]: https://github.com/pachyderm/pachyderm/blob/8fca5b912358c15f67d8b19a32f84d2ba507571c/src/server/pps/server/monitor.go#L163
[3]: https://github.com/pachyderm/pachyderm/blob/8fca5b912358c15f67d8b19a32f84d2ba507571c/src/server/pps/server/monitor.go#L110

@msteffen msteffen changed the title leave pipelines in STANDBY until they're FINISHED [2.4.x] leave pipelines in STANDBY until they're FINISHED Jan 8, 2023
@msteffen msteffen changed the title [2.4.x] leave pipelines in STANDBY until they're FINISHED [CORE-1441] [2.4.x] leave pipelines in STANDBY until they're FINISHED Jan 9, 2023
@brycemcanally brycemcanally left a comment

lgtm

@msteffen msteffen merged commit fb0dcb8 into 2.4.x Jan 10, 2023