-
Notifications
You must be signed in to change notification settings - Fork 38.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Query pod status outside loop over containers #77291
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -225,10 +225,10 @@ func (m *manager) reconcileState() (success []reconciledContainer, failure []rec | |
for _, pod := range m.activePods() { | ||
allContainers := pod.Spec.InitContainers | ||
allContainers = append(allContainers, pod.Spec.Containers...) | ||
status, ok := m.podStatusProvider.GetPodStatus(pod.UID) | ||
for _, container := range allContainers { | ||
status, ok := m.podStatusProvider.GetPodStatus(pod.UID) | ||
if !ok { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. If we do make this change (and I'm not yet positive its safe), I believe we should also move this There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The variable reference is to the ok outside the container loop. So the check should be kept here. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I agree with @mattjmcnaughton on this one, in the interest of local readability, if There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Please confirm that the test needs to be modified when only pod name is returned:
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Meaning, expectFailedContainerName parameter can be dropped from cpu_manager_test.go There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Let's keep the reconciledContainer struct as-is; there are other scenarios that need to test for individual container reconcile failures. For this change, if the pod status can't be queried we loop over each container in the pod and add it to the list of failures. I don't believe the tests would need to change beyond expecting all of the container to be in the failed list in this scenario. Let me know if that isn't clear. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I updated the PR. Let me know what you think, @ConnorDoyle In order not to have two if blocks with same condition, I kept the if block inside the container loop. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is no worse than what we had before, so if you don't feel like making further changes it's OK. Just for clarity, I had something like this in mind: for _, pod := range m.activePods() {
allContainers := pod.Spec.InitContainers
allContainers = append(allContainers, pod.Spec.Containers...)
if status, ok := m.podStatusProvider.GetPodStatus(pod.UID); !ok {
klog.Warningf("[cpumanager] reconcileState: skipping pod; status not found (pod: %s)", pod.Name)
for _, container := range allContainers {
failure = append(failure, reconciledContainer{pod.Name, container.Name, ""})
}
continue // to next pod
}
for _, container := range allContainers {
// ...
} The benefit I see would be to include all containers from the pod we failed to get status in the failure list, instead of just the first container. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The current code is consistent with previous behavior. So please accept the PR if it is Okay with you. Thanks There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Usually for such a small efficiency gain the readability hit would not be worth it. It's idiomatic to check error conditions immediately in go. But, since it's only one line removed it is OK. |
||
klog.Warningf("[cpumanager] reconcileState: skipping pod; status not found (pod: %s, container: %s)", pod.Name, container.Name) | ||
klog.Warningf("[cpumanager] reconcileState: skipping pod; status not found (pod: %s)", pod.Name) | ||
failure = append(failure, reconciledContainer{pod.Name, container.Name, ""}) | ||
break | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we be confident that the
podStatus
will remain consistent over this entire loop? If we can't, I think we open ourselves up to a possible race condition if we check the podStatus at the start of the loop, and then it somehow changes during the middle of executing the loop. Although to be fair, there's also that risk even if we check the pod status during each iteration of the loop, its just less because we are checking the pod status more frequently.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ConnorDoyle
Can you comment on this refactoring ?
Thanks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this makes sense to do. the pod status really only needs to be queried once before we loop through the containers.