
Ignore pod allocation errors during sync if no node exists #3945

Merged
merged 2 commits on Nov 13, 2023

Conversation

@bpickard22 (Contributor) commented Oct 3, 2023

During startup, when existing pods run through the sync function, ovnkube attempts to pre-allocate their IPs. This fails if the node does not exist, which can happen in kapi. Ignore those errors rather than causing ovnkube startup to fail.

}
return expectedLogicalPortName, nil
Contributor

Question on this logic: do we need to return an error, or is nil the right value here?
Assuming we are just going to ignore the allocation, then what you have is correct.

Otherwise, we may need to change that line to:

return expectedLogicalPortName, fmt.Errorf("pod: %s/%s is in a non-existing node: %s: %w", 
pod.Namespace, pod.Name, pod.Spec.NodeName, err)

@@ -64,7 +64,11 @@ func (bnc *BaseNetworkController) allocatePodIPsOnSwitch(pod *kapi.Pod,
// unknown condition how we are getting a non-terminating pod without a node here
klog.Errorf("Pod IP allocation found for a non-existent node in API with unknown "+
"condition. Pod: %s/%s, node: %s", pod.Namespace, pod.Name, pod.Spec.NodeName)
} else if node == nil {
Contributor

So if bnc.doesNetworkRequireIPAM() is true, we want to continue with the allocation attempt and not simply return?
Or should the new check be added at line 63 above?

}

Contributor

nit: undo space added (line 71)

@@ -54,7 +54,7 @@ func (bnc *BaseNetworkController) allocatePodIPsOnSwitch(pod *kapi.Pod,
// it is possible to try to add a pod here that has no node. For example if a pod was deleted with
// a finalizer, and then the node was removed. In this case the pod will still exist in a running state.
// Terminating pods should still have network connectivity for pre-stop hooks or termination grace period
if _, err := bnc.watchFactory.GetNode(pod.Spec.NodeName); kerrors.IsNotFound(err) &&
if node, err := bnc.watchFactory.GetNode(pod.Spec.NodeName); kerrors.IsNotFound(err) &&
Member

Talked to @trozet; here is a recap.
Returning an error from allocatePodIPsOnSwitch during the initial sync will cause a restart. We know there is a k8s bug that can give us a Running pod without an existing node. Currently, for all non-terminated pods, waitForNodeLogicalSwitchSubnetsInCache will wait for 30 seconds. We agreed that if waitForNodeLogicalSwitchSubnetsInCache returns an error for a running pod without a node, we should ignore it, on the grounds that the node actually doesn't exist and won't come up, similar to https://github.com/ovn-org/ovn-kubernetes/blob/master/go-controller/pkg/ovn/base_network_controller_pods.go#L59-L62, and return the error in other cases.
So you could set some bool like podWithoutNode under https://github.com/ovn-org/ovn-kubernetes/blob/master/go-controller/pkg/ovn/base_network_controller_pods.go#L57-L58 and then conditionally return the error from waitForNodeLogicalSwitchSubnetsInCache based on that value.
Hope this makes sense :) if not, please reach out to me.
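A rough sketch of the podWithoutNode approach described above. The function shape and variable names here are illustrative stand-ins, not the actual ovn-kubernetes code: nodeExists mimics the GetNode lookup and waitErr mimics the result of waitForNodeLogicalSwitchSubnetsInCache.

```go
package main

import (
	"errors"
	"fmt"
)

var errSwitchWait = errors.New("timed out waiting for node logical switch")

// allocatePodIPsOnSwitch, heavily simplified: record up front whether the
// pod points at a node the API no longer has, then use that flag to decide
// whether a wait error is fatal.
func allocatePodIPsOnSwitch(nodeExists bool, waitErr error) error {
	podWithoutNode := !nodeExists

	if waitErr != nil {
		if podWithoutNode {
			// The node is gone and will not come back; ignore the
			// wait error instead of failing the whole sync.
			fmt.Println("ignoring wait error for pod without node")
			return nil
		}
		return waitErr
	}
	return nil
}

func main() {
	fmt.Println(allocatePodIPsOnSwitch(false, errSwitchWait))
	fmt.Println(allocatePodIPsOnSwitch(true, errSwitchWait))
}
```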

@trozet (Contributor) commented Oct 12, 2023

Maybe a more elegant way to fix it is to make the error a specific instance of a new type, let's call it WaitForNodeSwitchError. Then in syncPods, when this code happens:

			expectedLogicalPortName, annotations, err = oc.allocateSyncPodsIPs(pod)
			if err != nil {
				return err
			}

We check here if the error is WaitForNodeSwitchError and then just continue? I guess continue may not be the best option by itself, since we will continue to try to add every pod and wait 30 seconds for a switch that we already waited 30 seconds for and that never showed up. So we might need to store a notFoundSwitches map[string]bool var or something and compare each pod against it, so you can short-circuit pods that we already know aren't going to find the switch.

@@ -64,10 +64,14 @@ func (bnc *BaseNetworkController) allocatePodIPsOnSwitch(pod *kapi.Pod,
// unknown condition how we are getting a non-terminating pod without a node here
klog.Errorf("Pod IP allocation found for a non-existent node in API with unknown "+
"condition. Pod: %s/%s, node: %s", pod.Namespace, pod.Name, pod.Spec.NodeName)
} else {
Contributor

You don't want this else statement. If a network does not require IPAM, we wouldn't care that the node isn't found and an IP can't be allocated. What we are interested in is a few lines above, in this block:

} else if bnc.doesNetworkRequireIPAM() {
			// unknown condition how we are getting a non-terminating pod without a node here
			klog.Errorf("Pod IP allocation found for a non-existent node in API with unknown "+
				"condition. Pod: %s/%s, node: %s", pod.Namespace, pod.Name, pod.Spec.NodeName)
		} 

Here we still want to communicate that we couldn't find the node for this pod, but we can change it to a warning, since we will not return out of the function and will still proceed to eventually call:

if err := bnc.waitForNodeLogicalSwitchSubnetsInCache(switchName); err != nil {

to see if the node (and switch) shows up within 30 seconds.

}
if err := bnc.waitForNodeLogicalSwitchSubnetsInCache(switchName); err != nil {
return expectedLogicalPortName, fmt.Errorf("failed to wait for switch %s to be added to cache. IP allocation may fail!",
return expectedLogicalPortName, fmt.Errorf("failed to wait for switch %s to be added to cache. IP allocation may fail",
Contributor

Here you need to make a new error type and return that new error type, so the caller can check whether the error is of this type and decide to ignore it or not (which will be used in the sync case).

@@ -44,8 +46,18 @@ func (oc *DefaultNetworkController) syncPods(pods []interface{}) error {
}
} else if oc.isPodScheduledinLocalZone(pod) {
expectedLogicalPortName, annotations, err = oc.allocateSyncPodsIPs(pod)
for pod.Spec.NodeName = range switchesNotFound {
Contributor

This for statement doesn't work: overriding pod.Spec.NodeName with the values in the map and then checking whether that value exists in the map will always be true. Also, you want to update the map when the error is returned, and check the map before you call allocateSyncPodsIPs. And the error needs to be the new type indicating the switch never showed up, not a kapi not-found error. So you need something like this:

if switchesNotFound[pod.Spec.NodeName] {
    klog.Warningf("Cannot allocate IPs for %s/%s, node was not found after 30 seconds", pod.Namespace, pod.Name)
    continue
}

expectedLogicalPortName, annotations, err = oc.allocateSyncPodsIPs(pod)

if errors.Is(err, WaitForNodeSwitchError) {
    klog.Warningf("Cannot allocate IPs for %s/%s, node was not found after 30 seconds", pod.Namespace, pod.Name)
    switchesNotFound[pod.Spec.NodeName] = true
    continue
} else if err != nil {
    return err
}


@tssurya (Member) commented Oct 24, 2023

poke @bpickard22 : PTAL review comments, this will help close https://issues.redhat.com/browse/OCPBUGS-15788

@trozet (Contributor) left a comment

Would be good to have a test for this; maybe it's as simple as creating a pod in the informer without the node and verifying sync does not error.

Also, you need to squash your first and second commits together.

}
} else {
Contributor

You removed some logic here that shouldn't have been removed, like skipping non-local-zone pods and checking if annotations is nil. This is causing the unit tests to crash. Also, you need to check if there is an error other than nodeNotFoundError:

[trozet@fedora go-controller]$ git diff
diff --git a/go-controller/pkg/ovn/pods.go b/go-controller/pkg/ovn/pods.go
index 7c23fba85..7d53f2ecf 100644
--- a/go-controller/pkg/ovn/pods.go
+++ b/go-controller/pkg/ovn/pods.go
@@ -54,11 +54,20 @@ func (oc *DefaultNetworkController) syncPods(pods []interface{}) error {
                        }
                        expectedLogicalPortName, annotations, err = oc.allocateSyncPodsIPs(pod)
 
-                       if errors.Is(err, nodeNotFoundError) {
-                               klog.Warningf("Cannot allocate IPs for %s/%s, node was not found after 30 seconds", pod.Namespace, pod.Name)
-                               switchesNotFound[pod.Spec.NodeName] = true
-                               continue
+                       if err != nil {
+                               if errors.Is(err, nodeNotFoundError) {
+                                       klog.Warningf("Cannot allocate IPs for %s/%s, node was not found after 30 seconds", pod.Namespace, pod.Name)
+                                       switchesNotFound[pod.Spec.NodeName] = true
+                                       continue
+                               }
+                               return err
                        }
+               } else {
+                       continue
+               }
+
+               if annotations == nil {
+                       continue
                }

go-controller/pkg/ovn/base_network_controller_pods.go (outdated comment, resolved)
@coveralls commented Nov 8, 2023

Coverage Status

coverage: 50.436% (-0.004%) from 50.44%
when pulling 94eeac6 on bpickard22:ignore-pod-on-no-node
into effbc31 on ovn-org:master.

@trozet (Contributor) left a comment

can you add a test case as well?

go-controller/pkg/ovn/base_network_controller_pods.go (outdated comment, resolved)
We should not fail silently if a pod is running on a node that does not
exist

Signed-off-by: Ben Pickard <bpickard@redhat.com>
Signed-off-by: Ben Pickard <bpickard@redhat.com>
@trozet trozet changed the title Fail louder Ignore pod allocation errors during sync if no node exists Nov 13, 2023
@trozet trozet merged commit d3dfa25 into ovn-org:master Nov 13, 2023
29 checks passed