azure/ipam: panic in Node.ResyncInterfacesAndIPs #26222

Closed
lmb opened this issue Jun 14, 2023 · 2 comments · Fixed by #26658
Labels
area/azure Impacts Azure based IPAM. area/ipam Impacts IP address management functionality. kind/bug/CI This is a bug in the testing code. release-blocker/1.14 This issue will prevent the release of the next version of Cilium. sig/agent Cilium agent related.

Comments

@lmb
Contributor

lmb commented Jun 14, 2023

2023-06-14T09:54:31.2010993Z === RUN   Test/IPAMSuite/TestIpamManyNodes
2023-06-14T09:54:31.2011786Z level=info msg="Synchronized Azure IPAM information" numInstances=100 numSubnets=3 numVirtualNetworks=1 subsys=azure
2023-06-14T09:54:31.2012409Z level=info msg="Discovered new CiliumNode custom resource" name=node0 subsys=ipam
2023-06-14T09:54:31.2012913Z level=info msg="Discovered new CiliumNode custom resource" name=node1 subsys=ipam
2023-06-14T09:54:31.2014346Z level=info msg="Discovered new CiliumNode custom resource" name=node2 subsys=ipam
2023-06-14T09:54:31.2014993Z level=info msg="Discovered new CiliumNode custom resource" name=node3 subsys=ipam
2023-06-14T09:54:31.2015695Z level=info msg="Discovered new CiliumNode custom resource" name=node4 subsys=ipam
2023-06-14T09:54:31.2016891Z level=warning msg="Unable to compute pending pods, will not surge-allocate" error="pod store uninitialized" instanceID=vm0 name=node0 subsys=ipam
2023-06-14T09:54:31.2018765Z level=info msg="Resolving IP deficit of node" available=0 availableForAllocation=256 emptyInterfaceSlots=0 instanceID=vm0 maxIPsToAllocate=10 name=node0 neededIPs=10 remainingInterfaces=1 selectedInterface=/subscriptions/xxx/resourceGroups/g1/providers/Microsoft.Compute/virtualMachineScaleSets/vmss11/virtualMachines/vm0/networkInterfaces/vmss11 selectedPoolID=s-1 subsys=ipam used=0
2023-06-14T09:54:31.2020145Z level=info msg="Discovered new CiliumNode custom resource" name=node5 subsys=ipam
2023-06-14T09:54:31.2020923Z level=info msg="Discovered new CiliumNode custom resource" name=node6 subsys=ipam
2023-06-14T09:54:31.2021607Z level=info msg="Discovered new CiliumNode custom resource" name=node7 subsys=ipam
2023-06-14T09:54:31.2022124Z level=info msg="Discovered new CiliumNode custom resource" name=node8 subsys=ipam
2023-06-14T09:54:31.2023003Z level=info msg="Discovered new CiliumNode custom resource" name=node9 subsys=ipam
2023-06-14T09:54:31.2023558Z level=info msg="Discovered new CiliumNode custom resource" name=node10 subsys=ipam
2023-06-14T09:54:31.2024057Z level=info msg="Discovered new CiliumNode custom resource" name=node11 subsys=ipam
2023-06-14T09:54:31.2024695Z level=info msg="Discovered new CiliumNode custom resource" name=node12 subsys=ipam
2023-06-14T09:54:31.2025222Z level=info msg="Discovered new CiliumNode custom resource" name=node13 subsys=ipam
2023-06-14T09:54:31.2025683Z level=info msg="Discovered new CiliumNode custom resource" name=node14 subsys=ipam
2023-06-14T09:54:31.2026185Z level=info msg="Discovered new CiliumNode custom resource" name=node15 subsys=ipam
2023-06-14T09:54:31.2026749Z level=info msg="Synchronized Azure IPAM information" numInstances=100 numSubnets=3 numVirtualNetworks=1 subsys=azure
2023-06-14T09:54:31.2027294Z level=info msg="Discovered new CiliumNode custom resource" name=node16 subsys=ipam
2023-06-14T09:54:31.2027947Z level=info msg="Discovered new CiliumNode custom resource" name=node17 subsys=ipam
2023-06-14T09:54:31.2028834Z level=warning msg="Unable to compute pending pods, will not surge-allocate" error="pod store uninitialized" instanceID=vm2 name=node2 subsys=ipam
2023-06-14T09:54:31.2029699Z level=info msg="Discovered new CiliumNode custom resource" name=node18 subsys=ipam
2023-06-14T09:54:31.2031297Z level=info msg="Resolving IP deficit of node" available=0 availableForAllocation=256 emptyInterfaceSlots=0 instanceID=vm2 maxIPsToAllocate=10 name=node2 neededIPs=10 remainingInterfaces=1 selectedInterface=/subscriptions/xxx/resourceGroups/g1/providers/Microsoft.Compute/virtualMachineScaleSets/vmss11/virtualMachines/vm2/networkInterfaces/vmss11 selectedPoolID=s-1 subsys=ipam used=0
2023-06-14T09:54:31.2036407Z level=info msg="Discovered new CiliumNode custom resource" name=node19 subsys=ipam
2023-06-14T09:54:31.2036949Z level=info msg="Discovered new CiliumNode custom resource" name=node20 subsys=ipam
2023-06-14T09:54:31.2037612Z level=info msg="Discovered new CiliumNode custom resource" name=node21 subsys=ipam
2023-06-14T09:54:31.2038079Z level=info msg="Discovered new CiliumNode custom resource" name=node22 subsys=ipam
2023-06-14T09:54:31.2038565Z level=info msg="Discovered new CiliumNode custom resource" name=node23 subsys=ipam
2023-06-14T09:54:31.2039081Z level=info msg="Discovered new CiliumNode custom resource" name=node24 subsys=ipam
2023-06-14T09:54:31.2039612Z level=info msg="Discovered new CiliumNode custom resource" name=node25 subsys=ipam
2023-06-14T09:54:31.2040124Z level=info msg="Discovered new CiliumNode custom resource" name=node26 subsys=ipam
2023-06-14T09:54:31.2040605Z level=info msg="Discovered new CiliumNode custom resource" name=node27 subsys=ipam
2023-06-14T09:54:31.2041482Z level=warning msg="Unable to compute pending pods, will not surge-allocate" error="pod store uninitialized" instanceID=vm3 name=node3 subsys=ipam
2023-06-14T09:54:31.2043233Z level=info msg="Resolving IP deficit of node" available=0 availableForAllocation=256 emptyInterfaceSlots=0 instanceID=vm3 maxIPsToAllocate=10 name=node3 neededIPs=10 remainingInterfaces=1 selectedInterface=/subscriptions/xxx/resourceGroups/g1/providers/Microsoft.Compute/virtualMachineScaleSets/vmss11/virtualMachines/vm3/networkInterfaces/vmss11 selectedPoolID=s-1 subsys=ipam used=0
2023-06-14T09:54:31.2044654Z level=warning msg="Unable to compute pending pods, will not surge-allocate" error="pod store uninitialized" instanceID=vm11 name=node11 subsys=ipam
2023-06-14T09:54:31.2046369Z level=info msg="Resolving IP deficit of node" available=0 availableForAllocation=256 emptyInterfaceSlots=0 instanceID=vm11 maxIPsToAllocate=10 name=node11 neededIPs=10 remainingInterfaces=1 selectedInterface=/subscriptions/xxx/resourceGroups/g1/providers/Microsoft.Compute/virtualMachineScaleSets/vmss11/virtualMachines/vm11/networkInterfaces/vmss11 selectedPoolID=s-3 subsys=ipam used=0
2023-06-14T09:54:31.2047921Z level=warning msg="Unable to compute pending pods, will not surge-allocate" error="pod store uninitialized" instanceID=vm12 name=node12 subsys=ipam
2023-06-14T09:54:31.2049581Z level=info msg="Resolving IP deficit of node" available=0 availableForAllocation=256 emptyInterfaceSlots=0 instanceID=vm12 maxIPsToAllocate=10 name=node12 neededIPs=10 remainingInterfaces=1 selectedInterface=/subscriptions/xxx/resourceGroups/g1/providers/Microsoft.Compute/virtualMachineScaleSets/vmss11/virtualMachines/vm12/networkInterfaces/vmss11 selectedPoolID=s-1 subsys=ipam used=0
2023-06-14T09:54:31.2050940Z level=warning msg="Unable to compute pending pods, will not surge-allocate" error="pod store uninitialized" instanceID=vm13 name=node13 subsys=ipam
2023-06-14T09:54:31.2052519Z level=info msg="Resolving IP deficit of node" available=0 availableForAllocation=256 emptyInterfaceSlots=0 instanceID=vm13 maxIPsToAllocate=10 name=node13 neededIPs=10 remainingInterfaces=1 selectedInterface=/subscriptions/xxx/resourceGroups/g1/providers/Microsoft.Compute/virtualMachineScaleSets/vmss11/virtualMachines/vm13/networkInterfaces/vmss11 selectedPoolID=s-2 subsys=ipam used=0
2023-06-14T09:54:31.2053815Z level=warning msg="Unable to compute pending pods, will not surge-allocate" error="pod store uninitialized" instanceID=vm14 name=node14 subsys=ipam
2023-06-14T09:54:31.2055576Z level=info msg="Resolving IP deficit of node" available=0 availableForAllocation=256 emptyInterfaceSlots=0 instanceID=vm14 maxIPsToAllocate=10 name=node14 neededIPs=10 remainingInterfaces=1 selectedInterface=/subscriptions/xxx/resourceGroups/g1/providers/Microsoft.Compute/virtualMachineScaleSets/vmss11/virtualMachines/vm14/networkInterfaces/vmss11 selectedPoolID=s-1 subsys=ipam used=0
2023-06-14T09:54:31.2056792Z level=warning msg="Unable to compute pending pods, will not surge-allocate" error="pod store uninitialized" instanceID=vm15 name=node15 subsys=ipam
2023-06-14T09:54:31.2058381Z level=info msg="Resolving IP deficit of node" available=0 availableForAllocation=256 emptyInterfaceSlots=0 instanceID=vm15 maxIPsToAllocate=10 name=node15 neededIPs=10 remainingInterfaces=1 selectedInterface=/subscriptions/xxx/resourceGroups/g1/providers/Microsoft.Compute/virtualMachineScaleSets/vmss11/virtualMachines/vm15/networkInterfaces/vmss11 selectedPoolID=s-1 subsys=ipam used=0
2023-06-14T09:54:31.2059674Z level=warning msg="Unable to compute pending pods, will not surge-allocate" error="pod store uninitialized" instanceID=vm16 name=node16 subsys=ipam
2023-06-14T09:54:31.2061277Z level=info msg="Resolving IP deficit of node" available=0 availableForAllocation=256 emptyInterfaceSlots=0 instanceID=vm16 maxIPsToAllocate=10 name=node16 neededIPs=10 remainingInterfaces=1 selectedInterface=/subscriptions/xxx/resourceGroups/g1/providers/Microsoft.Compute/virtualMachineScaleSets/vmss11/virtualMachines/vm16/networkInterfaces/vmss11 selectedPoolID=s-3 subsys=ipam used=0
2023-06-14T09:54:31.2062595Z level=warning msg="Unable to compute pending pods, will not surge-allocate" error="pod store uninitialized" instanceID=vm17 name=node17 subsys=ipam
2023-06-14T09:54:31.2064228Z level=info msg="Resolving IP deficit of node" available=0 availableForAllocation=256 emptyInterfaceSlots=0 instanceID=vm17 maxIPsToAllocate=10 name=node17 neededIPs=10 remainingInterfaces=1 selectedInterface=/subscriptions/xxx/resourceGroups/g1/providers/Microsoft.Compute/virtualMachineScaleSets/vmss11/virtualMachines/vm17/networkInterfaces/vmss11 selectedPoolID=s-1 subsys=ipam used=0
2023-06-14T09:54:31.2065760Z level=warning msg="Unable to compute pending pods, will not surge-allocate" error="pod store uninitialized" instanceID=vm18 name=node18 subsys=ipam
2023-06-14T09:54:31.2067258Z level=info msg="Resolving IP deficit of node" available=0 availableForAllocation=256 emptyInterfaceSlots=0 instanceID=vm18 maxIPsToAllocate=10 name=node18 neededIPs=10 remainingInterfaces=1 selectedInterface=/subscriptions/xxx/resourceGroups/g1/providers/Microsoft.Compute/virtualMachineScaleSets/vmss11/virtualMachines/vm18/networkInterfaces/vmss11 selectedPoolID=s-1 subsys=ipam used=0
2023-06-14T09:54:31.2068783Z level=warning msg="Unable to compute pending pods, will not surge-allocate" error="pod store uninitialized" instanceID=vm20 name=node20 subsys=ipam
2023-06-14T09:54:31.2070433Z level=info msg="Resolving IP deficit of node" available=0 availableForAllocation=256 emptyInterfaceSlots=0 instanceID=vm20 maxIPsToAllocate=10 name=node20 neededIPs=10 remainingInterfaces=1 selectedInterface=/subscriptions/xxx/resourceGroups/g1/providers/Microsoft.Compute/virtualMachineScaleSets/vmss11/virtualMachines/vm20/networkInterfaces/vmss11 selectedPoolID=s-3 subsys=ipam used=0
2023-06-14T09:54:31.2071764Z level=warning msg="Unable to compute pending pods, will not surge-allocate" error="pod store uninitialized" instanceID=vm21 name=node21 subsys=ipam
2023-06-14T09:54:31.2073298Z level=info msg="Resolving IP deficit of node" available=0 availableForAllocation=256 emptyInterfaceSlots=0 instanceID=vm21 maxIPsToAllocate=10 name=node21 neededIPs=10 remainingInterfaces=1 selectedInterface=/subscriptions/xxx/resourceGroups/g1/providers/Microsoft.Compute/virtualMachineScaleSets/vmss11/virtualMachines/vm21/networkInterfaces/vmss11 selectedPoolID=s-1 subsys=ipam used=0
2023-06-14T09:54:31.2074602Z level=warning msg="Unable to compute pending pods, will not surge-allocate" error="pod store uninitialized" instanceID=vm22 name=node22 subsys=ipam
2023-06-14T09:54:31.2076411Z level=info msg="Resolving IP deficit of node" available=0 availableForAllocation=256 emptyInterfaceSlots=0 instanceID=vm22 maxIPsToAllocate=10 name=node22 neededIPs=10 remainingInterfaces=1 selectedInterface=/subscriptions/xxx/resourceGroups/g1/providers/Microsoft.Compute/virtualMachineScaleSets/vmss11/virtualMachines/vm22/networkInterfaces/vmss11 selectedPoolID=s-1 subsys=ipam used=0
2023-06-14T09:54:31.2077773Z level=warning msg="Unable to compute pending pods, will not surge-allocate" error="pod store uninitialized" instanceID=vm4 name=node4 subsys=ipam
2023-06-14T09:54:31.2079454Z level=info msg="Resolving IP deficit of node" available=0 availableForAllocation=256 emptyInterfaceSlots=0 instanceID=vm4 maxIPsToAllocate=10 name=node4 neededIPs=10 remainingInterfaces=1 selectedInterface=/subscriptions/xxx/resourceGroups/g1/providers/Microsoft.Compute/virtualMachineScaleSets/vmss11/virtualMachines/vm4/networkInterfaces/vmss11 selectedPoolID=s-1 subsys=ipam used=0
2023-06-14T09:54:31.2080864Z level=warning msg="Unable to compute pending pods, will not surge-allocate" error="pod store uninitialized" instanceID=vm5 name=node5 subsys=ipam
2023-06-14T09:54:31.2082489Z level=info msg="Resolving IP deficit of node" available=0 availableForAllocation=256 emptyInterfaceSlots=0 instanceID=vm5 maxIPsToAllocate=10 name=node5 neededIPs=10 remainingInterfaces=1 selectedInterface=/subscriptions/xxx/resourceGroups/g1/providers/Microsoft.Compute/virtualMachineScaleSets/vmss11/virtualMachines/vm5/networkInterfaces/vmss11 selectedPoolID=s-1 subsys=ipam used=0
2023-06-14T09:54:31.2084688Z level=warning msg="Unable to compute pending pods, will not surge-allocate" error="pod store uninitialized" instanceID=vm24 name=node24 subsys=ipam
2023-06-14T09:54:31.2086286Z level=info msg="Resolving IP deficit of node" available=0 availableForAllocation=256 emptyInterfaceSlots=0 instanceID=vm24 maxIPsToAllocate=10 name=node24 neededIPs=10 remainingInterfaces=1 selectedInterface=/subscriptions/xxx/resourceGroups/g1/providers/Microsoft.Compute/virtualMachineScaleSets/vmss11/virtualMachines/vm24/networkInterfaces/vmss11 selectedPoolID=s-1 subsys=ipam used=0
2023-06-14T09:54:31.2087628Z level=warning msg="Unable to compute pending pods, will not surge-allocate" error="pod store uninitialized" instanceID=vm25 name=node25 subsys=ipam
2023-06-14T09:54:31.2089843Z level=info msg="Resolving IP deficit of node" available=0 availableForAllocation=256 emptyInterfaceSlots=0 instanceID=vm25 maxIPsToAllocate=10 name=node25 neededIPs=10 remainingInterfaces=1 selectedInterface=/subscriptions/xxx/resourceGroups/g1/providers/Microsoft.Compute/virtualMachineScaleSets/vmss11/virtualMachines/vm25/networkInterfaces/vmss11 selectedPoolID=s-1 subsys=ipam used=0
2023-06-14T09:54:31.2091166Z level=warning msg="Unable to compute pending pods, will not surge-allocate" error="pod store uninitialized" instanceID=vm26 name=node26 subsys=ipam
2023-06-14T09:54:31.2092764Z level=info msg="Resolving IP deficit of node" available=0 availableForAllocation=256 emptyInterfaceSlots=0 instanceID=vm26 maxIPsToAllocate=10 name=node26 neededIPs=10 remainingInterfaces=1 selectedInterface=/subscriptions/xxx/resourceGroups/g1/providers/Microsoft.Compute/virtualMachineScaleSets/vmss11/virtualMachines/vm26/networkInterfaces/vmss11 selectedPoolID=s-1 subsys=ipam used=0
2023-06-14T09:54:31.2093763Z level=info msg="Discovered new CiliumNode custom resource" name=node28 subsys=ipam
2023-06-14T09:54:31.2094249Z level=info msg="Discovered new CiliumNode custom resource" name=node29 subsys=ipam
2023-06-14T09:54:31.2094725Z level=info msg="Discovered new CiliumNode custom resource" name=node30 subsys=ipam
2023-06-14T09:54:31.2095498Z level=warning msg="Unable to compute pending pods, will not surge-allocate" error="pod store uninitialized" instanceID=vm8 name=node8 subsys=ipam
2023-06-14T09:54:31.2097064Z level=info msg="Resolving IP deficit of node" available=0 availableForAllocation=256 emptyInterfaceSlots=0 instanceID=vm8 maxIPsToAllocate=10 name=node8 neededIPs=10 remainingInterfaces=1 selectedInterface=/subscriptions/xxx/resourceGroups/g1/providers/Microsoft.Compute/virtualMachineScaleSets/vmss11/virtualMachines/vm8/networkInterfaces/vmss11 selectedPoolID=s-2 subsys=ipam used=0
2023-06-14T09:54:31.2098507Z level=warning msg="Unable to compute pending pods, will not surge-allocate" error="pod store uninitialized" instanceID=vm7 name=node7 subsys=ipam
2023-06-14T09:54:31.2099374Z level=warning msg="Unable to compute pending pods, will not surge-allocate" error="pod store uninitialized" instanceID=vm1 name=node1 subsys=ipam
2023-06-14T09:54:31.2100955Z level=info msg="Resolving IP deficit of node" available=0 availableForAllocation=256 emptyInterfaceSlots=0 instanceID=vm1 maxIPsToAllocate=10 name=node1 neededIPs=10 remainingInterfaces=1 selectedInterface=/subscriptions/xxx/resourceGroups/g1/providers/Microsoft.Compute/virtualMachineScaleSets/vmss11/virtualMachines/vm1/networkInterfaces/vmss11 selectedPoolID=s-1 subsys=ipam used=0
2023-06-14T09:54:31.2102255Z level=warning msg="Unable to compute pending pods, will not surge-allocate" error="pod store uninitialized" instanceID=vm6 name=node6 subsys=ipam
2023-06-14T09:54:31.2103785Z level=info msg="Resolving IP deficit of node" available=0 availableForAllocation=256 emptyInterfaceSlots=0 instanceID=vm6 maxIPsToAllocate=10 name=node6 neededIPs=10 remainingInterfaces=1 selectedInterface=/subscriptions/xxx/resourceGroups/g1/providers/Microsoft.Compute/virtualMachineScaleSets/vmss11/virtualMachines/vm6/networkInterfaces/vmss11 selectedPoolID=s-1 subsys=ipam used=0
2023-06-14T09:54:31.2105383Z level=info msg="Resolving IP deficit of node" available=0 availableForAllocation=256 emptyInterfaceSlots=0 instanceID=vm7 maxIPsToAllocate=10 name=node7 neededIPs=10 remainingInterfaces=1 selectedInterface=/subscriptions/xxx/resourceGroups/g1/providers/Microsoft.Compute/virtualMachineScaleSets/vmss11/virtualMachines/vm7/networkInterfaces/vmss11 selectedPoolID=s-1 subsys=ipam used=0
2023-06-14T09:54:31.2106538Z level=warning msg="Unable to compute pending pods, will not surge-allocate" error="pod store uninitialized" instanceID=vm10 name=node10 subsys=ipam
2023-06-14T09:54:31.2108127Z level=info msg="Resolving IP deficit of node" available=0 availableForAllocation=256 emptyInterfaceSlots=0 instanceID=vm10 maxIPsToAllocate=10 name=node10 neededIPs=10 remainingInterfaces=1 selectedInterface=/subscriptions/xxx/resourceGroups/g1/providers/Microsoft.Compute/virtualMachineScaleSets/vmss11/virtualMachines/vm10/networkInterfaces/vmss11 selectedPoolID=s-1 subsys=ipam used=0
2023-06-14T09:54:31.2109364Z level=warning msg="Unable to compute pending pods, will not surge-allocate" error="pod store uninitialized" instanceID=vm19 name=node19 subsys=ipam
2023-06-14T09:54:31.2111061Z level=info msg="Resolving IP deficit of node" available=0 availableForAllocation=256 emptyInterfaceSlots=0 instanceID=vm19 maxIPsToAllocate=10 name=node19 neededIPs=10 remainingInterfaces=1 selectedInterface=/subscriptions/xxx/resourceGroups/g1/providers/Microsoft.Compute/virtualMachineScaleSets/vmss11/virtualMachines/vm19/networkInterfaces/vmss11 selectedPoolID=s-3 subsys=ipam used=0
2023-06-14T09:54:31.2112385Z level=warning msg="Unable to compute pending pods, will not surge-allocate" error="pod store uninitialized" instanceID=vm23 name=node23 subsys=ipam
2023-06-14T09:54:31.2114137Z level=info msg="Resolving IP deficit of node" available=0 availableForAllocation=256 emptyInterfaceSlots=0 instanceID=vm23 maxIPsToAllocate=10 name=node23 neededIPs=10 remainingInterfaces=1 selectedInterface=/subscriptions/xxx/resourceGroups/g1/providers/Microsoft.Compute/virtualMachineScaleSets/vmss11/virtualMachines/vm23/networkInterfaces/vmss11 selectedPoolID=s-1 subsys=ipam used=0
2023-06-14T09:54:31.2115149Z level=info msg="Discovered new CiliumNode custom resource" name=node31 subsys=ipam
2023-06-14T09:54:31.2115680Z level=info msg="Discovered new CiliumNode custom resource" name=node32 subsys=ipam
2023-06-14T09:54:31.2116023Z level=info msg="Discovered new CiliumNode custom resource" name=node33 subsys=ipam
2023-06-14T09:54:31.2116777Z level=info msg="Discovered new CiliumNode custom resource" name=node34 subsys=ipam
2023-06-14T09:54:31.2117270Z level=info msg="Discovered new CiliumNode custom resource" name=node35 subsys=ipam
2023-06-14T09:54:31.2117774Z level=info msg="Discovered new CiliumNode custom resource" name=node36 subsys=ipam
2023-06-14T09:54:31.2118284Z level=info msg="Discovered new CiliumNode custom resource" name=node37 subsys=ipam
2023-06-14T09:54:31.2118742Z level=info msg="Discovered new CiliumNode custom resource" name=node38 subsys=ipam
2023-06-14T09:54:31.2119253Z level=info msg="Discovered new CiliumNode custom resource" name=node39 subsys=ipam
2023-06-14T09:54:31.2119756Z level=info msg="Discovered new CiliumNode custom resource" name=node40 subsys=ipam
2023-06-14T09:54:31.2120235Z level=info msg="Discovered new CiliumNode custom resource" name=node41 subsys=ipam
2023-06-14T09:54:31.2120722Z level=info msg="Discovered new CiliumNode custom resource" name=node42 subsys=ipam
2023-06-14T09:54:31.2121197Z level=info msg="Discovered new CiliumNode custom resource" name=node43 subsys=ipam
2023-06-14T09:54:31.2121700Z level=info msg="Discovered new CiliumNode custom resource" name=node44 subsys=ipam
2023-06-14T09:54:31.2122200Z level=info msg="Discovered new CiliumNode custom resource" name=node45 subsys=ipam
2023-06-14T09:54:31.2122672Z level=info msg="Discovered new CiliumNode custom resource" name=node46 subsys=ipam
2023-06-14T09:54:31.2123165Z level=info msg="Discovered new CiliumNode custom resource" name=node47 subsys=ipam
2023-06-14T09:54:31.2123999Z level=warning msg="Unable to compute pending pods, will not surge-allocate" error="pod store uninitialized" instanceID=vm41 name=node41 subsys=ipam
2023-06-14T09:54:31.2125656Z level=info msg="Resolving IP deficit of node" available=0 availableForAllocation=256 emptyInterfaceSlots=0 instanceID=vm41 maxIPsToAllocate=10 name=node41 neededIPs=10 remainingInterfaces=1 selectedInterface=/subscriptions/xxx/resourceGroups/g1/providers/Microsoft.Compute/virtualMachineScaleSets/vmss11/virtualMachines/vm41/networkInterfaces/vmss11 selectedPoolID=s-1 subsys=ipam used=0
2023-06-14T09:54:31.2127010Z level=warning msg="Unable to compute pending pods, will not surge-allocate" error="pod store uninitialized" instanceID=vm37 name=node37 subsys=ipam
2023-06-14T09:54:31.2128652Z level=info msg="Resolving IP deficit of node" available=0 availableForAllocation=256 emptyInterfaceSlots=0 instanceID=vm37 maxIPsToAllocate=10 name=node37 neededIPs=10 remainingInterfaces=1 selectedInterface=/subscriptions/xxx/resourceGroups/g1/providers/Microsoft.Compute/virtualMachineScaleSets/vmss11/virtualMachines/vm37/networkInterfaces/vmss11 selectedPoolID=s-1 subsys=ipam used=0
2023-06-14T09:54:31.2129908Z level=info msg="Discovered new CiliumNode custom resource" name=node48 subsys=ipam
2023-06-14T09:54:31.2141931Z level=info msg="Synchronized Azure IPAM information" numInstances=100 numSubnets=3 numVirtualNetworks=1 subsys=azure
2023-06-14T09:54:31.2142734Z level=info msg="Discovered new CiliumNode custom resource" name=node49 subsys=ipam
2023-06-14T09:54:31.2143597Z level=warning msg="Unable to compute pending pods, will not surge-allocate" error="pod store uninitialized" instanceID=vm42 name=node42 subsys=ipam
2023-06-14T09:54:31.2145376Z level=info msg="Resolving IP deficit of node" available=0 availableForAllocation=256 emptyInterfaceSlots=0 instanceID=vm42 maxIPsToAllocate=10 name=node42 neededIPs=10 remainingInterfaces=1 selectedInterface=/subscriptions/xxx/resourceGroups/g1/providers/Microsoft.Compute/virtualMachineScaleSets/vmss11/virtualMachines/vm42/networkInterfaces/vmss11 selectedPoolID=s-2 subsys=ipam used=0
2023-06-14T09:54:31.2146761Z level=warning msg="Unable to compute pending pods, will not surge-allocate" error="pod store uninitialized" instanceID=vm28 name=node28 subsys=ipam
2023-06-14T09:54:31.2148809Z level=info msg="Resolving IP deficit of node" available=0 availableForAllocation=256 emptyInterfaceSlots=0 instanceID=vm28 maxIPsToAllocate=10 name=node28 neededIPs=10 remainingInterfaces=1 selectedInterface=/subscriptions/xxx/resourceGroups/g1/providers/Microsoft.Compute/virtualMachineScaleSets/vmss11/virtualMachines/vm28/networkInterfaces/vmss11 selectedPoolID=s-1 subsys=ipam used=0
2023-06-14T09:54:31.2150350Z level=warning msg="Unable to compute pending pods, will not surge-allocate" error="pod store uninitialized" instanceID=vm29 name=node29 subsys=ipam
2023-06-14T09:54:31.2168589Z level=info msg="Resolving IP deficit of node" available=0 availableForAllocation=256 emptyInterfaceSlots=0 instanceID=vm29 maxIPsToAllocate=10 name=node29 neededIPs=10 remainingInterfaces=1 selectedInterface=/subscriptions/xxx/resourceGroups/g1/providers/Microsoft.Compute/virtualMachineScaleSets/vmss11/virtualMachines/vm29/networkInterfaces/vmss11 selectedPoolID=s-1 subsys=ipam used=0
2023-06-14T09:54:31.2170177Z level=warning msg="Unable to compute pending pods, will not surge-allocate" error="pod store uninitialized" instanceID=vm30 name=node30 subsys=ipam
2023-06-14T09:54:31.2171770Z level=info msg="Resolving IP deficit of node" available=0 availableForAllocation=256 emptyInterfaceSlots=0 instanceID=vm30 maxIPsToAllocate=10 name=node30 neededIPs=10 remainingInterfaces=1 selectedInterface=/subscriptions/xxx/resourceGroups/g1/providers/Microsoft.Compute/virtualMachineScaleSets/vmss11/virtualMachines/vm30/networkInterfaces/vmss11 selectedPoolID=s-1 subsys=ipam used=0
2023-06-14T09:54:31.2173130Z level=warning msg="Unable to compute pending pods, will not surge-allocate" error="pod store uninitialized" instanceID=vm9 name=node9 subsys=ipam
2023-06-14T09:54:31.2174821Z level=info msg="Resolving IP deficit of node" available=0 availableForAllocation=256 emptyInterfaceSlots=0 instanceID=vm9 maxIPsToAllocate=10 name=node9 neededIPs=10 remainingInterfaces=1 selectedInterface=/subscriptions/xxx/resourceGroups/g1/providers/Microsoft.Compute/virtualMachineScaleSets/vmss11/virtualMachines/vm9/networkInterfaces/vmss11 selectedPoolID=s-1 subsys=ipam used=0
2023-06-14T09:54:31.2176146Z level=warning msg="Unable to compute pending pods, will not surge-allocate" error="pod store uninitialized" instanceID=vm27 name=node27 subsys=ipam
2023-06-14T09:54:31.2177789Z level=info msg="Resolving IP deficit of node" available=0 availableForAllocation=256 emptyInterfaceSlots=0 instanceID=vm27 maxIPsToAllocate=10 name=node27 neededIPs=10 remainingInterfaces=1 selectedInterface=/subscriptions/xxx/resourceGroups/g1/providers/Microsoft.Compute/virtualMachineScaleSets/vmss11/virtualMachines/vm27/networkInterfaces/vmss11 selectedPoolID=s-1 subsys=ipam used=0
2023-06-14T09:54:31.2178783Z panic: runtime error: invalid memory address or nil pointer dereference
2023-06-14T09:54:31.2179528Z [signal SIGSEGV: segmentation violation code=0x1 addr=0x248 pc=0x1fd772f]
2023-06-14T09:54:31.2179791Z 
2023-06-14T09:54:31.2179910Z goroutine 261 [running]:
2023-06-14T09:54:31.2180372Z github.com/cilium/cilium/pkg/azure/ipam.(*Node).ResyncInterfacesAndIPs(0xc000392020, {0x0?, 0x0?}, 0xc0002c9c70)
2023-06-14T09:54:31.2180794Z 	/host/pkg/azure/ipam/node.go:180 +0x34f
2023-06-14T09:54:31.2181175Z github.com/cilium/cilium/pkg/ipam.(*Node).recalculate(0xc0000e67e0)
2023-06-14T09:54:31.2181561Z 	/host/pkg/ipam/node.go:424 +0x97
2023-06-14T09:54:31.2182015Z github.com/cilium/cilium/pkg/ipam.(*NodeManager).resyncNode(0xc0000d8070, {0xc00077a120?, 0xc00077a120?}, 0xc0000e67e0, 0xc00081e180, {0x0?, 0x3000106?, 0x3e7c180?})
2023-06-14T09:54:31.2182404Z 	/host/pkg/ipam/node_manager.go:463 +0x65
2023-06-14T09:54:31.2182741Z github.com/cilium/cilium/pkg/ipam.(*NodeManager).Resync.func1(0x1db9026?, 0xc0009c94a0?)
2023-06-14T09:54:31.2183238Z 	/host/pkg/ipam/node_manager.go:522 +0x46
2023-06-14T09:54:31.2183612Z created by github.com/cilium/cilium/pkg/ipam.(*NodeManager).Resync
2023-06-14T09:54:31.2183991Z 	/host/pkg/ipam/node_manager.go:521 +0x24f
2023-06-14T09:54:31.2184340Z FAIL	github.com/cilium/cilium/pkg/azure/ipam	0.069s

https://github.com/cilium/cilium/actions/runs/5265497133/jobs/9518338900?pr=25684

Originally posted by @lmb in #11785 (comment)

@lmb lmb added kind/bug/CI This is a bug in the testing code. area/azure Impacts Azure based IPAM. area/ipam Impacts IP address management functionality. labels Jun 14, 2023
@joestringer joestringer added sig/agent Cilium agent related. release-blocker/1.14 This issue will prevent the release of the next version of Cilium. labels Jun 15, 2023
@tommyp1ckles tommyp1ckles self-assigned this Jun 21, 2023
@gandro
Member

gandro commented Jun 21, 2023

Does this need to be a release blocker? This seems to be a flake that has existed since 2020; see #11785 (comment)

@lmb
Contributor Author

lmb commented Jun 21, 2023

Context: the bug in that original code was supposedly fixed; it was also on a different line (something to do with the logger). See #11786

tommyp1ckles added a commit to tommyp1ckles/cilium that referenced this issue Jul 18, 2023
A race condition between when all the resync triggers are set up for an
upserted CiliumNode and when the k8s node object is emplaced can cause a
crash.

Specifically, this can arise between when the ipam nodemanager lock is released and when the call to UpdatedResource occurs.

In CI this was likely caused by the "ipam-node-interval-refresh" invoking a
resync which resulted in a panic in the ResyncInterfacesAndIPs function.
While testing this I was able to cause other, similar, panics by delaying the
UpdatedResource call, so this should fix a class of potential crashes.

Fixes: cilium#26222

Signed-off-by: Tom Hadlaw <tom.hadlaw@isovalent.com>
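
For readers following along, here is a minimal sketch of the class of failure the commit message describes: a resync that runs after a node is registered but before its resource has been stored. All types and names below (`node`, `ciliumNode`, `updatedResource`, `resync`, `unsafeResync`) are hypothetical simplifications, not Cilium's actual code, and the nil guard shown is only one possible way to close such a window; it is not necessarily the change made in #26658.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ciliumNode is a hypothetical stand-in for the CiliumNode custom resource.
type ciliumNode struct {
	instanceID string
}

// node is a hypothetical, simplified IPAM node. Its resource field stays nil
// until updatedResource is called, which models the window between node
// registration and the resource being emplaced.
type node struct {
	mu       sync.RWMutex
	resource *ciliumNode
}

// errNoResource is returned when a resync runs before the resource is set.
var errNoResource = errors.New("node resource not yet set")

// updatedResource stores the resource while holding the write lock.
func (n *node) updatedResource(r *ciliumNode) {
	n.mu.Lock()
	defer n.mu.Unlock()
	n.resource = r
}

// resync reads the resource under the read lock and returns an error
// instead of dereferencing a nil pointer.
func (n *node) resync() (string, error) {
	n.mu.RLock()
	defer n.mu.RUnlock()
	if n.resource == nil {
		return "", errNoResource
	}
	return n.resource.instanceID, nil
}

// unsafeResync dereferences the resource without a nil check. The panic is
// recovered here only so the sketch can report that it happened.
func (n *node) unsafeResync() (id string, panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	n.mu.RLock()
	defer n.mu.RUnlock()
	return n.resource.instanceID, false
}

func main() {
	// The node exists, but updatedResource has not been called yet.
	n := &node{}

	_, panicked := n.unsafeResync()
	fmt.Println("unguarded resync panicked:", panicked)

	_, err := n.resync()
	fmt.Println("guarded resync error:", err)

	n.updatedResource(&ciliumNode{instanceID: "vm0"})
	id, err := n.resync()
	fmt.Println("after update:", id, err)
}
```

In a real agent the unguarded variant would not be recovered, so a periodic trigger firing inside that window would take down the process, which matches the SIGSEGV in the trace above.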
aditighag pushed a commit that referenced this issue Jul 19, 2023
A race condition between when all the resync triggers are set up for an
upserted CiliumNode and when the k8s node object is emplaced can cause a
crash.

Specifically, this can arise between when the ipam nodemanager lock is released and when the call to UpdatedResource occurs.

In CI this was likely caused by the "ipam-node-interval-refresh" invoking a
resync which resulted in a panic in the ResyncInterfacesAndIPs function.
While testing this I was able to cause other, similar, panics by delaying the
UpdatedResource call, so this should fix a class of potential crashes.

Fixes: #26222

Signed-off-by: Tom Hadlaw <tom.hadlaw@isovalent.com>
nbusseneau pushed a commit that referenced this issue Jul 24, 2023
[ upstream commit 3521998 ]

A race condition between when all the resync triggers are set up for an
upserted CiliumNode and when the k8s node object is emplaced can cause a
crash.

Specifically, this can arise between when the ipam nodemanager lock is released and when the call to UpdatedResource occurs.

In CI this was likely caused by the "ipam-node-interval-refresh" invoking a
resync which resulted in a panic in the ResyncInterfacesAndIPs function.
While testing this I was able to cause other, similar, panics by delaying the
UpdatedResource call, so this should fix a class of potential crashes.

Fixes: #26222

Signed-off-by: Tom Hadlaw <tom.hadlaw@isovalent.com>
Signed-off-by: Nicolas Busseneau <nicolas@isovalent.com>
aanm pushed a commit that referenced this issue Jul 25, 2023
[ upstream commit 3521998 ]

A race condition between when all the resync triggers are set up for an
upserted CiliumNode and when the k8s node object is emplaced can cause a
crash.

Specifically, this can arise between when the ipam nodemanager lock is released and when the call to UpdatedResource occurs.

In CI this was likely caused by the "ipam-node-interval-refresh" invoking a
resync which resulted in a panic in the ResyncInterfacesAndIPs function.
While testing this I was able to cause other, similar, panics by delaying the
UpdatedResource call, so this should fix a class of potential crashes.

Fixes: #26222

Signed-off-by: Tom Hadlaw <tom.hadlaw@isovalent.com>
Signed-off-by: Nicolas Busseneau <nicolas@isovalent.com>
ldelossa pushed a commit to ldelossa/cilium that referenced this issue Sep 27, 2023
[ upstream commit 3521998 ]

A race condition between when all the resync triggers are set up for an
upserted CiliumNode and when the k8s node object is emplaced can cause a
crash.

Specifically, this can arise between when the ipam nodemanager lock is released and when the call to UpdatedResource occurs.

In CI this was likely caused by the "ipam-node-interval-refresh" invoking a
resync which resulted in a panic in the ResyncInterfacesAndIPs function.
While testing this I was able to cause other, similar, panics by delaying the
UpdatedResource call, so this should fix a class of potential crashes.

Fixes: cilium#26222

Signed-off-by: Tom Hadlaw <tom.hadlaw@isovalent.com>
Signed-off-by: Nicolas Busseneau <nicolas@isovalent.com>
gandro pushed a commit to gandro/cilium that referenced this issue Dec 7, 2023
A race condition between when all the resync triggers are set up for an
upserted CiliumNode and when the k8s node object is emplaced can cause a
crash.

Specifically, this can arise between when the ipam nodemanager lock is released and when the call to UpdatedResource occurs.

In CI this was likely caused by the "ipam-node-interval-refresh" invoking a
resync which resulted in a panic in the ResyncInterfacesAndIPs function.
While testing this I was able to cause other, similar, panics by delaying the
UpdatedResource call, so this should fix a class of potential crashes.

Fixes: cilium#26222

Signed-off-by: Tom Hadlaw <tom.hadlaw@isovalent.com>
4 participants