fix(executor): Deal with the pod watch API call timing out #4734

Merged on Dec 14, 2020 (3 commits)
Changes from 1 commit
18 changes: 14 additions & 4 deletions workflow/executor/common/wait/wait.go
@@ -26,22 +26,32 @@ func UntilTerminated(kubernetesInterface kubernetes.Interface, namespace, podNam
 }

 func untilTerminatedAux(podInterface v1.PodInterface, containerID string, listOptions metav1.ListOptions) (bool, error) {
+	for {
+		complete, done, err := doWatch(podInterface, containerID, listOptions)
Contributor
In English, "complete" means the same thing as "done". Is there a better name to use?

Contributor Author
Very fair point. I couldn't think of a better synonym, so I went for something more explicit and swapped the booleans around to match.
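One way to avoid the "complete" versus "done" clash discussed above is to replace the pair of booleans with a single named outcome. The sketch below is purely illustrative: the type `watchOutcome`, its constants, and the `shouldRestart` method are hypothetical names that do not appear in this PR, and the follow-up commit may well have chosen something different.

```go
package main

import "fmt"

// watchOutcome is a hypothetical replacement for the (complete, done)
// boolean pair; it is not part of the PR.
type watchOutcome int

const (
	watchTimedOut   watchOutcome = iota // the watch channel closed; start a new watch
	watchTerminated                     // the container was seen in a terminated state
	watchFailed                         // the watch hit an error; stop and report it
)

// shouldRestart reports whether the caller should start a new watch.
func (o watchOutcome) shouldRestart() bool {
	return o == watchTimedOut
}

// String gives each outcome a readable name for log messages.
func (o watchOutcome) String() string {
	switch o {
	case watchTimedOut:
		return "timed out"
	case watchTerminated:
		return "terminated"
	case watchFailed:
		return "failed"
	default:
		return "unknown"
	}
}

func main() {
	for _, o := range []watchOutcome{watchTimedOut, watchTerminated, watchFailed} {
		fmt.Printf("%s: restart=%t\n", o, o.shouldRestart())
	}
}
```

With a named outcome the call site reads as a question ("should I restart?") rather than relying on the order of two similar booleans. Any extra information the caller needs on the error path would still have to be returned alongside it.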

+		if complete {
+			return done, err
+		}
+		log.Infof("Pod watch timed out, restarting watch on %s", containerID)
+	}
+}
+
+func doWatch(podInterface v1.PodInterface, containerID string, listOptions metav1.ListOptions) (bool, bool, error) {
 	w, err := podInterface.Watch(listOptions)
 	if err != nil {
-		return true, fmt.Errorf("could not watch pod: %w", err)
+		return true, true, fmt.Errorf("could not watch pod: %w", err)
 	}
 	defer w.Stop()
 	for event := range w.ResultChan() {
 		pod, ok := event.Object.(*corev1.Pod)
 		if !ok {
-			return false, apierrors.FromObject(event.Object)
+			return true, false, apierrors.FromObject(event.Object)
 		}
 		for _, s := range pod.Status.ContainerStatuses {
 			if common.GetContainerID(&s) == containerID && s.State.Terminated != nil {
-				return true, nil
+				return true, true, nil
 			}
 		}
 		listOptions.ResourceVersion = pod.ResourceVersion
 	}
-	return true, nil
+	return false, false, nil
 }
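
The restart-on-timeout pattern in this diff can be sketched without a cluster by swapping the Kubernetes watch for a plain channel. Everything below is a stand-in and an assumption: `event`, `watchFunc`, `watchOnce`, `untilTerminated`, and `scriptedWatch` are hypothetical names, and a closed channel plays the role of the watch API call timing out. One detail worth noting: in the diff as shown, `listOptions` is passed to `doWatch` by value, so the `ResourceVersion` assignment inside `doWatch` is not visible to the next loop iteration in `untilTerminatedAux`. The sketch returns the last seen version explicitly instead. Whether the later commits in this PR address that is not visible here.

```go
package main

import (
	"errors"
	"fmt"
)

// event is a stand-in for a pod watch event.
type event struct {
	resourceVersion string
	terminated      bool // true once the container has terminated
}

// watchFunc is a hypothetical stand-in for podInterface.Watch. The returned
// channel is closed when the simulated server-side timeout fires.
type watchFunc func(resourceVersion string) (<-chan event, error)

// watchOnce drains a single watch. restart is true only when the channel
// closed without a terminal event; lastVersion is where a new watch resumes.
func watchOnce(watch watchFunc, resourceVersion string) (restart bool, lastVersion string, err error) {
	events, err := watch(resourceVersion)
	if err != nil {
		return false, resourceVersion, fmt.Errorf("could not watch pod: %w", err)
	}
	for e := range events {
		if e.terminated {
			return false, e.resourceVersion, nil
		}
		resourceVersion = e.resourceVersion
	}
	return true, resourceVersion, nil
}

// untilTerminated restarts the watch until it ends for a reason other than
// a timeout. It returns the number of watches that were started.
func untilTerminated(watch watchFunc) (int, error) {
	version := ""
	for attempts := 1; ; attempts++ {
		restart, last, err := watchOnce(watch, version)
		if !restart {
			return attempts, err
		}
		version = last // resume the next watch from the last event seen
	}
}

// scriptedWatch returns a watchFunc that serves one batch of events per
// call and records the resource version each call asked for.
func scriptedWatch(requested *[]string, batches ...[]event) watchFunc {
	next := 0
	return func(resourceVersion string) (<-chan event, error) {
		*requested = append(*requested, resourceVersion)
		if next >= len(batches) {
			return nil, errors.New("no more scripted watches")
		}
		ch := make(chan event, len(batches[next]))
		for _, e := range batches[next] {
			ch <- e
		}
		close(ch) // closing the channel simulates the watch ending
		next++
		return ch, nil
	}
}

func main() {
	var requested []string
	watch := scriptedWatch(&requested,
		[]event{{resourceVersion: "1"}, {resourceVersion: "2"}}, // ends without termination
		[]event{{resourceVersion: "3", terminated: true}},
	)
	attempts, err := untilTerminated(watch)
	fmt.Println(attempts, err, requested)
}
```

Returning the resume point from `watchOnce` keeps the retry loop as the single owner of that state, which is one way to make the hand-off between successive watches explicit.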