
Ingress controller has labels in pods #30045

Closed
sowmyav27 opened this issue Nov 12, 2020 · 7 comments


sowmyav27 commented Nov 12, 2020

What kind of request is this (question/bug/enhancement/feature request): bug

Steps to reproduce (least amount of steps as possible):
Upgrade use case:

  • On 2.4.8, deploy a DO cluster with Project Network Isolation enabled.
  • Create a workload and an ingress ing01 pointing to it.
  • Upgrade Rancher to 2.5.2.
  • Notice that user ingresses go to the Initializing state.
  • The ingress still has only one worker node's IP address under loadBalancer when doing View/Edit YAML for the ingress.
  • The ingress controller's pods have the podName label appearing and disappearing intermittently.
  • Disable and then re-enable project network isolation.
  • The field.cattle.io/podName: nginx-ingress-controller-bfnkt label is seen on the ingress controller pods.
  • User ingresses are stuck in the Initializing state.
  • Rancher logs in debug mode:
2020/11/12 20:02:19 [DEBUG] podHandler: addLabelIfHostPortsPresent: deleting podNameFieldLabel map[app:ingress-nginx controller-revision-hash:7649c75c8c field.cattle.io/podName:nginx-ingress-controller-2r8pd pod-template-generation:1] in ingress-nginx
2020/11/12 20:02:19 [DEBUG] netpolMgr: delete: existing=nil, err=networkPolicy.networking.k8s.io "ingress-nginx/hp-nginx-ingress-controller-2r8pd" not found
2020/11/12 20:02:19 [DEBUG] podHandler: Sync: {TypeMeta:{Kind:Pod APIVersion:v1} ObjectMeta:{Name:nginx-ingress-controller-2r8pd GenerateName:nginx-ingress-controller- Namespace:ingress-nginx SelfLink:/api/v1/namespaces/ingress-nginx/pods/nginx-ingress-controller-2r8pd UID:c7437eac-db2f-446b-ad3f-93faec7be429 ResourceVersion:26731 Generation:0 CreationTimestamp:2020-11-12 19:29:32 +0000 UTC DeletionTimestamp:<nil> DeletionGracePeriodSeconds:<nil> Labels:map[app:ingress-nginx controller-revision-hash:7649c75c8c pod-template-generation:1] Annotations:map[prometheus.io/port:10254 prometheus.io/scrape:true] OwnerReferences:[{APIVersion:apps/v1 Kind:DaemonSet Name:nginx-ingress-controller UID:cc372d02-48ab-4082-b5e2-c974431a0ebb Controller:0xc0099ff579 BlockOwnerDeletion:0xc0099ff57a}] Finalizers:[] ClusterName: ManagedFields:[{Manager:kube-controller-manager Operation:Update APIVersion:v1 Time:2020-11-12 19:29:32 +0000 UTC FieldsType:FieldsV1 FieldsV1:&FieldsV1{Raw:*<redacted>
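
For reference, a quick way to check whether the controller pods currently carry the label (a minimal sketch, assuming the default RKE setup where the controller runs as the nginx-ingress-controller DaemonSet in the ingress-nginx namespace with the app=ingress-nginx pod label shown in the log above):

# List the ingress controller pods with all their labels and look for
# field.cattle.io/podName in the LABELS column.
kubectl -n ingress-nginx get pods -l app=ingress-nginx --show-labels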

Expected Result:

  • The field.cattle.io/podName: nginx-ingress-controller-bfnkt label should NOT be seen on the ingress controller pods.

Other details that may be helpful:

Environment information

  • Rancher version (rancher/rancher or rancher/server image tag, or shown bottom left in the UI): 2.4.8 upgraded to 2.5.2
  • Installation option (single install/HA): HA

Cluster information

  • Cluster type (Hosted/Infrastructure Provider/Custom/Imported): RKE on DigitalOcean (DO)
  • Kubernetes version (use kubectl version): 1.18.10-rancher1-2
@sowmyav27 sowmyav27 added the kind/bug-qa Issues that have not yet hit a real release. Bugs introduced by a new feature or enhancement label Nov 12, 2020
@sowmyav27 sowmyav27 added this to the v2.5.3 milestone Nov 12, 2020
@UberKuber

@sowmyav27, the process I've now tested is as follows:

  1. Disable project network isolation on the cluster
  2. Redeploy nginx ingress controller daemonset
  3. Enable project network isolation on the cluster

I have tested this process against multiple downstream clusters behind Rancher v2.5.2 and Rancher v2.4.10, and it resolved the issue as far as I can tell.

@maggieliu maggieliu modified the milestones: v2.5.3, v2.4.11 Nov 12, 2020
@al45tair

Personally, I've just installed my patched ingress-nginx into our cluster again (here is the patch).

This resolves the problem by making the ingress controller ignore Rancher's podName label.


kinarashah commented Nov 18, 2020

For testing:

Fresh install of Rancher v2.4-head and v2.5-head:

  • Create a cluster with Project Network Isolation enabled and >=2 worker nodes.
  • Confirm that the nginx-ingress-controller pods in the System project don't have podName labels.
  • Turn on debug logging, redeploy nginx-ingress-controller, and confirm the pods don't get updated without cause. This can be confirmed by looking for podHandler: addLabelIfHostPortsPresent: deleting podNameFieldLabel in the logs.
  • Create a user ingress and confirm that status.loadBalancer shows the IPs of all the worker nodes (see the check sketched below).
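
As a sketch of that last check, the loadBalancer IPs can be read straight from the ingress status; the namespace and ingress name here (my-app, ing01) are placeholders:

# Print the IPs reported under status.loadBalancer.ingress for the user ingress.
# Expect one IP per worker node once the controller DaemonSet is healthy.
kubectl -n my-app get ingress ing01 -o jsonpath='{.status.loadBalancer.ingress[*].ip}{"\n"}'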

Upgrade Scenario:

  • On Rancher v2.4.8 or earlier, create a cluster with Project Network Isolation enabled and >=2 worker nodes.
  • Create a few network policies (an example policy is sketched after this list). These are in addition to the ones Rancher creates by default for project network isolation.
  • Create a few user ingresses. status.loadBalancer might not have the IPs of all the worker nodes.
  • Upgrade to v2.4-head/v2.5-head/the corresponding RCs.
  • Confirm the nginx-ingress-controller pods in the System project don't have podName labels.
  • Turn on debug logging, redeploy nginx-ingress-controller, and confirm the pods don't get updated without cause. This can be confirmed by looking for podHandler: addLabelIfHostPortsPresent: deleting podNameFieldLabel in the logs.
  • Delete the nginx-ingress-controller pods in the System project one by one. Confirm user ingresses don't go to the Initializing state.
  • Confirm user network policies remain as is and don't get deleted.
  • On user ingresses, confirm that status.loadBalancer shows the IPs of all the worker nodes.
  • Check the kubelet logs to confirm there are no unlimited SyncLoop updates on the nginx ingress pods.
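
For the "create a few network policies" step, any user-created policy in a project namespace will do; a minimal sketch (the name and namespace are placeholders) that only allows traffic from pods in the same namespace:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-same-namespace   # placeholder name
  namespace: my-app            # placeholder project namespace
spec:
  podSelector: {}              # applies to all pods in the namespace
  ingress:
  - from:
    - podSelector: {}          # only allow traffic from pods in the same namespace

After the upgrade, this policy should still be present alongside the policies Rancher itself manages for project network isolation.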

Note: There are a few ways the ingress controller can be redeployed.

  • Redeploy from the UI, or kubectl rollout restart ds nginx-ingress-controller -n ingress-nginx: these can temporarily send user ingresses to the Initializing state, but they should become Active again once all the pods are up and running.
  • Delete the pods of the DaemonSet manually, waiting for each new pod to be running before deleting the next one (a sketch follows below). This doesn't result in user ingresses going to the Initializing state (AFAIK, needs confirmation).
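
A minimal sketch of the one-pod-at-a-time approach (assuming the default DaemonSet name and namespace; the pod name is a placeholder):

# Delete one controller pod...
kubectl -n ingress-nginx delete pod nginx-ingress-controller-xxxxx
# ...and wait for its replacement to be Ready before deleting the next one.
kubectl -n ingress-nginx wait --for=condition=Ready pod -l app=ingress-nginx --timeout=120s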

Upgrade Scenario from v2.4.9/v2.4.10/v2.5.2:

  • On Rancher v2.4.8 or earlier, create a cluster with Project Network Isolation enabled and >=2 worker nodes.
  • Create a few network policies. These are in addition to the ones Rancher creates by default for project network isolation.
  • Create a few user ingresses. status.loadBalancer might not have the IPs of all the worker nodes.
  • Upgrade to v2.4.9/v2.4.10/v2.5.2.
  • The bug is reproduced.
  • Upgrade to v2.4-head/v2.5-head/the corresponding RCs.
  • Confirm the nginx-ingress-controller pods in the System project don't have podName labels.
  • Turn on debug logging, redeploy nginx-ingress-controller, and confirm the pods don't get updated without cause. This can be confirmed by looking for podHandler: addLabelIfHostPortsPresent: deleting podNameFieldLabel in the logs.
  • Delete the nginx-ingress-controller pods in the System project one by one. Confirm user ingresses don't go to the Initializing state.
  • Confirm user network policies remain as is and don't get deleted.
  • On user ingresses, confirm that status.loadBalancer shows the IPs of all the worker nodes.
  • Check the kubelet logs to confirm there are no unlimited SyncLoop updates on the nginx ingress pods (sketches of the debug and kubelet log checks follow this list).
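
Sketches for the two log-based checks above, assuming an HA Rancher install (where debug logging can be toggled with the loglevel utility inside the rancher pods) and an RKE worker node (where the kubelet runs as a Docker container named kubelet):

# Turn on Rancher debug logging (run against the cluster Rancher is installed on).
kubectl -n cattle-system exec -it $(kubectl -n cattle-system get pods -l app=rancher -o name | head -n 1) -- loglevel --set debug

# On a worker node, look for repeated SyncLoop updates on the ingress controller pods.
docker logs kubelet 2>&1 | grep SyncLoop | grep nginx-ingress-controller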


sowmyav27 commented Nov 18, 2020

Fresh Install use case - On 2.4-head - commit id: 6a09f523f

  • Create a cluster with Project Network Isolation enabled, 1 etcd/control plane node and 2 worker nodes.
  • Pods of nginx-ingress-controller in the System project do not have podName labels.
  • In debug mode, redeploy nginx-ingress-controller and confirm the pods don't get updated without cause. podHandler: addLabelIfHostPortsPresent: deleting podNameFieldLabel is NOT seen in the logs.
  • Deploy a user ingress; verified that status.loadBalancer shows the IPs of both worker nodes:
status:
  loadBalancer:
    ingress:
    - ip: <ip-1>
    - ip: <ip-2>

Upgrade from 2.4.8 to 2.4-head

On 2.4.8

  • Deploy a cluster - 2 worker nodes, 1 etcd/control plane node
  • Deploy a workload and an ingress pointing to the workload.
  • User ingress has only one worker node's IP address in status.loadBalancer:
status:
  loadBalancer:
    ingress:
    - ip: <wk01>
  • Create network policies following the docs.
  • Upgrade Rancher to 2.4-head.
  • User ingress is in the Initializing state and does not recover.
  • User ingress has no worker node IP addresses in status.loadBalancer:
status:
  loadBalancer: {}
  • podHandler: addLabelIfHostPortsPresent: deleting podNameFieldLabel --> is seen in the logs
2020/11/19 02:57:49 [DEBUG] podHandler: addLabelIfHostPortsPresent: deleting podNameFieldLabel map[app:ingress-nginx controller-revision-hash:7649c75c8c field.cattle.io/podName:nginx-ingress-controller-p79hj pod-template-generation:1] in ingress-nginx
  • Pods of nginx-ingress-controller in the System project don't have podName labels.
  • Redeploy nginx-ingress-controller; wait for the workload and pods to become Active (a CLI check is sketched after this list).
  • User ingress is seen in the Active state.
  • Pods of nginx-ingress-controller in the System project don't have podName labels.
  • User ingress has both worker nodes' IP addresses in status.loadBalancer:
status:
  loadBalancer:
    ingress:
    - ip: <wk01>
    - ip: <wk02>
  • User-created network policies do not get deleted.
  • Kubelet logs on the worker node do not show repeated SyncLoop updates/messages.
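
A sketch of the redeploy/wait step above (default DaemonSet name and namespace assumed); the rollout can be confirmed from the CLI before re-checking the ingress:

# Blocks until every pod of the DaemonSet has been replaced and is Ready.
kubectl -n ingress-nginx rollout status ds/nginx-ingress-controller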

Upgrade from 2.4.8 to 2.4-head - Scenario#2

On 2.4.8

  • Deploy a cluster - 2 worker nodes, 1 etcd/control plane node
  • Deploy a workload and an ingress pointing to the workload.
  • User ingress has only one worker node's IP address in status.loadBalancer:
status:
  loadBalancer:
    ingress:
    - ip: <wk01>
  • Create network policies following the docs.
  • Upgrade Rancher to 2.4.10.
  • The bug is reproduced.
  • Upgrade Rancher to 2.4-head - commit id: 6a09f523f.
  • User ingress is in the Initializing state and does not recover.
  • User ingress has no worker node IP addresses in status.loadBalancer:
status:
  loadBalancer: {}
  • Pods of nginx-ingress-controller in the System project don't have podName labels.
  • Redeploy nginx-ingress-controller by deleting the ingress controller pods; wait for the workload and pods to become Active.
  • User ingress is seen in the Active state.
  • Pods of nginx-ingress-controller in the System project don't have podName labels.
  • User ingress has both worker nodes' IP addresses in status.loadBalancer:
status:
  loadBalancer:
    ingress:
    - ip: <wk01>
    - ip: <wk02>
  • User-created network policies do not get deleted.
  • Kubelet logs on the worker node do not show repeated SyncLoop updates/messages.
  • Delete the pods of the ingress controller DaemonSet manually and wait for each new pod to be running before deleting the next one. This doesn't result in user ingresses going to the Initializing state.

@leflambeur

Thanks, @kinarashah and @sowmyav27 for the quick turnaround on this :)


sowmyav27 commented Nov 21, 2020

Fresh Install use case - On 2.5.3-rc1

  • Create a cluster with Project Network Isolation enabled, 1 etcd/control plane node and 2 worker nodes.
  • Pods of nginx-ingress-controller in the System project do not have podName labels.
  • In debug mode, redeploy nginx-ingress-controller and confirm the pods don't get updated without cause. podHandler: addLabelIfHostPortsPresent: deleting podNameFieldLabel is NOT seen in the logs.
  • Deploy a user ingress; verified that status.loadBalancer shows the IPs of both worker nodes:
status:
  loadBalancer:
    ingress:
    - ip: <ip-1>
    - ip: <ip-2>

Upgrade from 2.5.2 to 2.5.3-rc1

On 2.5.2

  • Deploy a cluster - 2 worker nodes, 1 etcd/control plane node
  • Enable cluster level monitoring
  • Deploy a workload and an ingress pointing to the workload.
  • User ingress has only one worker node's IP address in status.loadBalancer:
status:
  loadBalancer:
    ingress:
    - ip: <wk01>
  • Ingress controller pods in the System project have podName labels.
  • Create network policies following the docs.
  • Upgrade Rancher to 2.5.3-rc1.
  • User ingress is in the Initializing state and does not recover.
  • User ingress has no worker node IP addresses in status.loadBalancer:
status:
  loadBalancer: {}
  • podHandler: addLabelIfHostPortsPresent: deleting podNameFieldLabel --> is seen in the logs
2020/11/19 02:57:49 [DEBUG] podHandler: addLabelIfHostPortsPresent: deleting podNameFieldLabel map[app:ingress-nginx controller-revision-hash:7649c75c8c field.cattle.io/podName:nginx-ingress-controller-p79hj pod-template-generation:1] in ingress-nginx
  • Pods of nginx-ingress-controller in the System project don't have podName labels.
  • Redeploy nginx-ingress-controller; wait for the workload and pods to become Active.
  • User ingress is seen in the Active state.
  • Pods of nginx-ingress-controller in the System project don't have podName labels.
  • User ingress has both worker nodes' IP addresses in status.loadBalancer:
status:
  loadBalancer:
    ingress:
    - ip: <wk01>
    - ip: <wk02>
  • User-created network policies do not get deleted.
  • Kubelet logs on the worker node do not show repeated SyncLoop updates/messages.
  • Cluster metrics: the Rancher upgrade happened around 20:30.

(Screenshot: cluster metrics around the time of the Rancher upgrade)

Upgrade from 2.5.1 to 2.5.2 to 2.5.3-rc1

On 2.5.1

  • Deploy a cluster - 3 worker nodes, 1 etcd/control plane node
  • Enable cluster level monitoring
  • Deploy a workload and an ingress pointing to the workload.
  • User ingress has only one worker node's IP address in status.loadBalancer:
status:
  loadBalancer:
    ingress:
    - ip: <wk01>
  • Create network policies.
  • Upgrade to 2.5.2.
  • The bug is reproduced.
  • Upgrade to 2.5.3-rc1.
  • podHandler: addLabelIfHostPortsPresent: deleting podNameFieldLabel --> is seen in the logs.
  • User ingress is in the Initializing state and does not recover.
  • Pods of nginx-ingress-controller in the System project don't have podName labels.
  • Redeploy nginx-ingress-controller by deleting the ingress controller pods; wait for the workload and pods to become Active.
  • User ingress is seen in the Active state.
  • Pods of nginx-ingress-controller in the System project don't have podName labels.
  • User ingress has all three worker nodes' IP addresses in status.loadBalancer:
status:
  loadBalancer:
    ingress:
    - ip: <wk01>
    - ip: <wk02>
    - ip: <wk03>
  • User-created network policies do not get deleted.
  • Kubelet logs on the worker node do not show repeated SyncLoop updates/messages.
  • Cluster metrics: 21:40 - upgrade to 2.5.2, and 21:52 - upgrade to 2.5.3.

(Screenshot: cluster metrics around the times of the two upgrades)

@sowmyav27

Closing this as it's been validated on the 2.5.3 and 2.4 branches.
