Fix pod controller wait logic #368

Merged
merged 1 commit into from
Oct 30, 2023


pablochacin
Collaborator

Description

After the changes introduced in #355, the logic for waiting for all pod visitors to finish and for reporting their errors was broken.

Using a WaitGroup makes the pod controller wait for all pod visitors to end before checking for errors. Consequently, if one pod visitor fails (for instance, because the agent command fails), the rest of the visitors are not canceled immediately.

Checklist:

  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have added tests that prove my fix is effective or that my feature works.
  • I have run linter locally (make lint) and all checks pass.
  • I have run tests locally (make test) and all tests pass.
  • I have run relevant integration test locally (make integration-xxx for affected packages)
  • I have run relevant e2e test locally (make e2e-xxx for disruptors, or cluster related changes)
  • Any dependent changes have been merged and published in downstream modules

Signed-off-by: Pablo Chacin <pablochacin@gmail.com>
Comment on lines +51 to 64
for {
	select {
	case e := <-doneCh:
		if e != nil {
			return e
		}
		pending--
		if pending == 0 {
			return nil
		}
	case <-ctx.Done():
		return ctx.Err()
	}
}
Collaborator


I think a regular, C-style for loop might be slightly more readable. WDYT?

Suggested change
- for {
- 	select {
- 	case e := <-doneCh:
- 		if e != nil {
- 			return e
- 		}
- 		pending--
- 		if pending == 0 {
- 			return nil
- 		}
- 	case <-ctx.Done():
- 		return ctx.Err()
- 	}
- }
+ for i := 0; i < len(c.targets); i++ {
+ 	select {
+ 	case e := <-doneCh:
+ 		if e != nil {
+ 			return e
+ 		}
+ 	case <-ctx.Done():
+ 		return ctx.Err()
+ 	}
+ }
+ return nil

Collaborator Author


I'm not convinced. Even though this form is simpler, I think it somehow obscures when the pending count is decremented, because the select also has the ctx.Done() case.

@pablochacin pablochacin merged commit 9ddd019 into main Oct 30, 2023
8 checks passed
@pablochacin pablochacin deleted the fix-pod-controller-wait-logic branch October 30, 2023 11:27

2 participants