
Query pod status outside loop over containers #77291

Merged: 2 commits into kubernetes:master on May 2, 2019

Conversation

@tedyu (Contributor) commented on May 1, 2019

What type of PR is this?
/kind cleanup

What this PR does / why we need it:
In manager#reconcileState, we call GetPodStatus once per container. This is not needed, so this PR moves the call outside the loop over containers.

Does this PR introduce a user-facing change?
NONE
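
For context, a minimal sketch of the shape of the change; the reconcile body and the !ok handling are elided here, and the exact hunks appear in the review comments below:

	// Before: GetPodStatus is looked up once per container.
	for _, pod := range m.activePods() {
		allContainers := pod.Spec.InitContainers
		allContainers = append(allContainers, pod.Spec.Containers...)
		for _, container := range allContainers {
			status, ok := m.podStatusProvider.GetPodStatus(pod.UID)
			// ... handle !ok, then reconcile container against status ...
		}
	}

	// After: GetPodStatus is looked up once per pod, before the container loop.
	for _, pod := range m.activePods() {
		allContainers := pod.Spec.InitContainers
		allContainers = append(allContainers, pod.Spec.Containers...)
		status, ok := m.podStatusProvider.GetPodStatus(pod.UID)
		for _, container := range allContainers {
			// ... handle !ok, then reconcile container against status ...
		}
	}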

@k8s-ci-robot added the release-note-none, kind/cleanup, size/XS, cncf-cla: yes, needs-sig, needs-priority, area/kubelet, and sig/node labels, and removed the needs-sig label, on May 1, 2019
@tedyu (Contributor, Author) commented on May 1, 2019

/test pull-kubernetes-e2e-gce-100-performance

@mattjmcnaughton (Contributor) left a comment

Thanks for the PR :)

One small functionality comment from me, and I would also like us to verify this change is safe to make before proceeding.

@@ -225,8 +225,8 @@ func (m *manager) reconcileState() (success []reconciledContainer, failure []rec
for _, pod := range m.activePods() {
allContainers := pod.Spec.InitContainers
allContainers = append(allContainers, pod.Spec.Containers...)
status, ok := m.podStatusProvider.GetPodStatus(pod.UID)
Contributor:

Can we be confident that the podStatus will remain consistent over this entire loop? If we can't, I think we open ourselves up to a possible race condition: we check the podStatus at the start of the loop, and then it changes somewhere in the middle of executing the loop. To be fair, that risk also exists when we check the pod status during each iteration of the loop; it's just smaller, because we are checking the pod status more frequently.

Contributor (Author):

@ConnorDoyle, can you comment on this refactoring?

Thanks

Contributor:

Yes, this makes sense to do. The pod status really only needs to be queried once before we loop through the containers.

for _, container := range allContainers {
status, ok := m.podStatusProvider.GetPodStatus(pod.UID)
if !ok {
Contributor:

If we do make this change (and I'm not yet positive it's safe), I believe we should also move this if !ok check outside of the loop.

Contributor (Author):

The ok referenced here is the variable now declared outside the container loop. The intention of the check at this point is to include the container name in the failure entry, so the check should be kept here.

Contributor:

I agree with @mattjmcnaughton on this one. In the interest of local readability, if ok is false you could immediately append all of the containers to the failure list and continue the outer pod iteration loop. As it is, only the first container is logged; it makes more sense to just log the pod name.

Contributor (Author):

Please confirm that the test needs to be modified when only the pod name is returned:

			if !foundFailedContainer {
				t.Errorf("Expected reconciliation failure for container: %s", testCase.expectFailedContainerName)

Contributor (Author):

Meaning, the expectFailedContainerName parameter could be dropped from cpu_manager_test.go.

Contributor:

Let's keep the reconciledContainer struct as-is; there are other scenarios that need to test for individual container reconcile failures.

For this change, if the pod status can't be queried we loop over each container in the pod and add it to the list of failures. I don't believe the tests would need to change beyond expecting all of the containers to be in the failed list in this scenario.

Let me know if that isn't clear.
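
For illustration, a hypothetical sketch of the kind of assertion being described; the plural field expectFailedContainerNames and the containerName field on the failure entries are assumptions made up for this sketch, not the actual cpu_manager_test.go structure:

	// Hypothetical: expect every container of the pod whose status lookup
	// failed to show up in the returned failure list.
	for _, want := range testCase.expectFailedContainerNames { // assumed field name
		found := false
		for _, f := range failure {
			if f.containerName == want { // assumed field name on reconciledContainer
				found = true
				break
			}
		}
		if !found {
			t.Errorf("Expected reconciliation failure for container: %s", want)
		}
	}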

Contributor (Author):

I updated the PR. Let me know what you think, @ConnorDoyle.

In order not to have two if blocks with the same condition, I kept the if block inside the container loop.

Contributor:

This is no worse than what we had before, so if you don't feel like making further changes it's OK. Just for clarity, I had something like this in mind:

	for _, pod := range m.activePods() {
		allContainers := pod.Spec.InitContainers
		allContainers = append(allContainers, pod.Spec.Containers...)
		if status, ok := m.podStatusProvider.GetPodStatus(pod.UID); !ok {
			klog.Warningf("[cpumanager] reconcileState: skipping pod; status not found (pod: %s)", pod.Name)
			for _, container := range allContainers {
				failure = append(failure, reconciledContainer{pod.Name, container.Name, ""})
			}
			continue  // to next pod
		}

		for _, container := range allContainers {
			// ...
		}

The benefit I see would be to include all containers from the pod whose status we failed to get in the failure list, instead of just the first container.

Contributor (Author):

The current code is consistent with the previous behavior, so please accept the PR if it is okay with you.

Thanks

Contributor:

Usually, for such a small efficiency gain, the readability hit would not be worth it; it's idiomatic to check error conditions immediately in Go. But since it's only one line removed, it is OK.

@ConnorDoyle (Contributor) commented:

/assign

@ConnorDoyle (Contributor) left a comment

/lgtm
/approve

@k8s-ci-robot added the lgtm label ("Looks good to me", indicates that a PR is ready to be merged) on May 2, 2019
@k8s-ci-robot commented:

[APPROVALNOTIFIER] This PR is APPROVED

This pull request has been approved by: ConnorDoyle, tedyu

The full list of commands accepted by this bot can be found here.

The pull request process is described here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot added the approved label (Indicates a PR has been approved by an approver from all required OWNERS files) on May 2, 2019
@k8s-ci-robot merged commit 98c4c1e into kubernetes:master on May 2, 2019
Labels: approved, area/kubelet, cncf-cla: yes, kind/cleanup, lgtm, needs-priority, release-note-none, sig/node, size/XS

4 participants