Description
When configured for IPAMv2, where CNS watches Pods to calculate the IP demand for dynamic Podsubnet scenarios, CNS overcounts the IP demand because it includes Pods in a terminal state.
https://github.com/Azure/azure-container-networking/blob/master/cns/service/main.go#L1576-L1587
```go
if cnsconfig.WatchPods {
	pw := podctrl.New(z)
	if cnsconfig.EnableIPAMv2 {
		hostNetworkListOpt := &client.ListOptions{FieldSelector: fields.SelectorFromSet(fields.Set{"spec.hostNetwork": "false"})} // filter only podsubnet pods
		// don't relist pods more than every 500ms
		limit := rate.NewLimiter(rate.Every(500*time.Millisecond), 1) //nolint:gomnd // clearly 500ms
		pw.With(pw.NewNotifierFunc(hostNetworkListOpt, limit, ipampoolv2.PodIPDemandListener(ipDemandCh)))
	}
	if err := pw.SetupWithManager(ctx, manager); err != nil {
		return errors.Wrapf(err, "failed to setup pod watcher with manager")
	}
}
```
Node filtering is set up earlier by configuring the controller-runtime cache to do server-side filtering scoped to the Node that this CNS instance runs on. Here, a client-side filter is added that only counts Pods with hostNetwork: false, because CNI is not involved for hostNetwork Pods and CNS does not need IPs to assign to them.
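For context, a minimal sketch of what that server-side Node scoping could look like with controller-runtime v0.15+ cache options; the `nodeName` variable, package name, and manager wiring below are illustrative assumptions, not the actual CNS code:

```go
package sketch

import (
	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/fields"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/cache"
	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/manager"
)

// newNodeScopedManager builds a manager whose Pod cache is filtered
// server-side to the given Node, so the Pod watcher only ever sees
// Pods scheduled to the Node this CNS instance runs on.
func newNodeScopedManager(nodeName string) (manager.Manager, error) {
	return ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{
		Cache: cache.Options{
			ByObject: map[client.Object]cache.ByObject{
				&corev1.Pod{}: {
					// server-side filter: only Pods scheduled to this Node
					Field: fields.OneTermEqualSelector("spec.nodeName", nodeName),
				},
			},
		},
	})
}
```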
The issue is that if a Pod for a Job completes but is not garbage-collected for some time, that Pod continues to be counted towards the IP demand on the Node, even though CNI has already torn down the Pod sandbox and the IP is unassigned. Pods in terminal Phases have no runtime presence (including no network sandbox), so the Pod watcher should filter them out and exclude them from the demand count, allowing demand to decrease and IPs to be released back to the subnet.
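A possible shape of the fix, as a hedged sketch: treat Succeeded/Failed Pods the same way hostNetwork Pods are treated and drop them from the demand count. The helper names below (`isTerminal`, `countIPDemand`) are illustrative; the real change would live in the pod watcher's notifier / PodIPDemandListener path.

```go
package sketch

import corev1 "k8s.io/api/core/v1"

// isTerminal reports whether a Pod has reached a terminal Phase. Terminal
// Pods have no sandbox (CNI has already torn it down), so they hold no IP.
func isTerminal(pod *corev1.Pod) bool {
	return pod.Status.Phase == corev1.PodSucceeded || pod.Status.Phase == corev1.PodFailed
}

// countIPDemand is an illustrative demand count that skips both hostNetwork
// Pods and Pods in terminal Phases.
func countIPDemand(pods []corev1.Pod) int {
	demand := 0
	for i := range pods {
		if pods[i].Spec.HostNetwork || isTerminal(&pods[i]) {
			continue
		}
		demand++
	}
	return demand
}
```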
Open question: does filtering out completed Pods have any impact on SwiftV2 multitenancy?