Rancher logs spammed with "Updating workload [ingress-nginx/nginx-ingress-controller]" and "Updating service [frontend] with public endpoints" on a rollback #35798

Closed
sowmyav27 opened this issue Dec 8, 2021 · 11 comments
Labels: kind/bug-qa, release-note, status/wontfix
Milestone: v2.6.3

sowmyav27 (Contributor) commented Dec 8, 2021

Rancher Server Setup

  • Rancher version: upgraded from 2.5.11 to 2.6-head (e4db3e), then rolled back to 2.5.11
  • Installation option (Docker install/Helm Chart): Docker install

Information about the Cluster

  • Downstream clusters:

[Screenshot of downstream clusters: "Screen Shot 2021-12-07 at 10 43 16 PM"]

Describe the bug

  • Deployed the downstream clusters
  • Deployed a few resources, including workloads, ingresses, and services. Deployed Fleet bundles
  • Followed the Rancher docs to upgrade a single-node Docker install and then roll back (a rough sketch of the procedure is included after the log excerpts below)
  • After the rollback, the Rancher logs repeatedly show these messages, though they do stop after a couple of minutes:
2021/12/08 06:38:01 [INFO] Updating workload [ingress-nginx/nginx-ingress-controller] with public endpoints [[{"nodeName":"c-t6mq2:m-29kfb","addresses":["ip1"],"port":80,"protocol":"TCP","podName":"ingress-nginx:nginx-ingress-controller-5bj94","allNodes":false},{"nodeName":"c-t6mq2:m-29kfb","addresses":["ip1"],"port":443,"protocol":"TCP","podName":"ingress-nginx:nginx-ingress-controller-5bj94","allNodes":false},{"nodeName":"c-t6mq2:m-vbtqs","addresses":["ip2"],"port":80,"protocol":"TCP","podName":"ingress-nginx:nginx-ingress-controller-86cfr","allNodes":false},{"nodeName":"c-t6mq2:m-vbtqs","addresses":["ip2"],"port":443,"protocol":"TCP","podName":"ingress-nginx:nginx-ingress-controller-86cfr","allNodes":false}]]
2021/12/08 06:38:01 [INFO] Updating workload [kube-system/aws-node] with public endpoints [[{"nodeName":"c-bnx9l:machine-7wc6j","addresses":["ip3"],"port":61678,"protocol":"TCP","podName":"kube-system:aws-node-zj6zl","allNodes":false},{"nodeName":"c-bnx9l:machine-hc6q7","addresses":["ip4"],"port":61678,"protocol":"TCP","podName":"kube-system:aws-node-d5mcb","allNodes":false}]]
2021/12/08 06:38:01 [INFO] Updating workload [fleet-mc-helm-example/frontend] with public endpoints [[{"addresses":["ip4"],"port":30596,"protocol":"TCP","serviceName":"fleet-mc-helm-example:frontend","allNodes":true}]]
  • and
2021/12/08 06:38:13 [INFO] Updating service [frontend] with public endpoints [[{"addresses":["<>"],"port":31061,"protocol":"TCP","serviceName":"fleet-helm-example:frontend","allNodes":true}]]
2021/12/08 06:38:13 [INFO] Updating service [frontend] with public endpoints [[{"addresses":["<>"],"port":30596,"protocol":"TCP","serviceName":"fleet-mc-helm-example:frontend","allNodes":true}]]
2021/12/08 06:38:13 [INFO] Updating workload [fleet-mc-helm-example/frontend] with public endpoints [[{"addresses":["<>"],"port":30596,"protocol":"TCP","serviceName":"fleet-mc-helm-example:frontend","allNodes":true}]]
  • The existing ingresses are functional, and the clusters are Active.
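
For reference, here is a rough sketch of the single-node upgrade and rollback procedure that was followed, per the Rancher docs. The container names, backup file name, and image tags below are illustrative; see the Rancher documentation for the exact steps.

# Upgrade: stop the old container, keep its data, back it up, start the new version.
docker stop rancher_server
docker create --volumes-from rancher_server --name rancher-data rancher/rancher:v2.5.11
docker run --volumes-from rancher-data -v "$PWD:/backup" --rm busybox \
  tar pzcvf /backup/rancher-data-backup.tar.gz /var/lib/rancher
docker run -d --volumes-from rancher-data --restart=unless-stopped \
  -p 80:80 -p 443:443 --privileged rancher/rancher:v2.6-head

# Rollback: stop the new container, restore the backup, start the old version again.
docker stop rancher_new
docker run --volumes-from rancher-data -v "$PWD:/backup" --rm busybox sh -c \
  "rm -rf /var/lib/rancher/* && tar pzxvf /backup/rancher-data-backup.tar.gz"
docker run -d --volumes-from rancher-data --restart=unless-stopped \
  -p 80:80 -p 443:443 --privileged rancher/rancher:v2.5.11
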
sowmyav27 self-assigned this Dec 8, 2021
sowmyav27 added the kind/bug-qa and status/release-blocker labels Dec 8, 2021
sowmyav27 added this to the v2.6.3 milestone Dec 8, 2021
sowmyav27 changed the title from "Rancher logs spammed with "Updating workload [ingress-nginx/nginx-ingress-controller]" on a rollback" to "Rancher logs spammed with "Updating workload [ingress-nginx/nginx-ingress-controller]" and "Updating service [frontend] with public endpoints" on a rollback" Dec 8, 2021
cbron (Contributor) commented Dec 8, 2021

Might be tangentially related to: https://github.com/rancher/rancher/issues/35690

anupama2501 (Contributor) commented Dec 9, 2021

Seeing this on an HA setup as well. Upgraded from 2.6.2 to 2.6-head (1069e) and rolled back to 2.6.2.

12/08 23:45:14 [INFO] Updating workload [ingress-nginx/nginx-ingress-controller] with public endpoints [[{"nodeName":"c-lbn6r:m-rs89d","addresses":[""],"port":80,"protocol":"TCP","podName":"ingress-nginx:nginx-ingress-controller-95xjv","allNodes":false},{"nodeName":"c-lbn6r:m-rs89d","addresses":[""],"port":443,"protocol":"TCP","podName":"ingress-nginx:nginx-ingress-controller-95xjv","allNodes":false}]]

sowmyav27 (Contributor, Author) commented

This is also seen on a Docker install upgrade + rollback: 2.6.2 to 2.6-head to 2.6.2.

deniseschannon added the [zube]: To Triage and release-note labels Dec 9, 2021
ibuildthecloud (Contributor) commented

I would expect to see this issue on rollback, but once the cluster agent has been rolled back to the 2.6.2 agent the messages should stop. Is that the case? If 2.6.2 were running the cluster agent from 2.6.3 there would be an issue, but 2.6.2 is supposed to update the cluster agent back to 2.6.2 eventually.
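
One way to confirm that the agent has actually been rolled back (a sketch, assuming the standard cattle-cluster-agent deployment that Rancher creates in downstream clusters) is to check its image tag:

# Print the image the cluster agent is currently running in the downstream cluster
kubectl -n cattle-system get deployment cattle-cluster-agent \
  -o jsonpath='{.spec.template.spec.containers[0].image}'

If this still shows a 2.6-head/2.6.3 tag after the rollback, the server has not yet reconciled the agent back to 2.6.2.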

sowmyav27 (Contributor, Author) commented

@ibuildthecloud Yes, the logs do stop after a few minutes, which would be explained by your comment.

dnoland1 (Contributor) commented Dec 22, 2021

I hit this same issue, but on an upgrade from v2.6.2 to v2.6.3, so maybe we should consider fixing it. It did eventually stop, but it went on for 80 minutes and generated 17 MB of logs with 24,468 entries. Could be related to Rancher managing downstream clusters with Rancher.
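
To gauge how much spam you are getting, something like the following count works on a single-node Docker install (the container name "rancher" is illustrative); on a Helm install you can run the same grep against the Rancher pod logs instead:

# Count the repeated endpoint-update messages in the Rancher server logs
docker logs rancher 2>&1 | grep -c 'Updating workload'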

oingooing commented Jan 7, 2022

Same issue with v2.6.3.

nickvth commented Jan 19, 2022

@sowmyav27 Why is this closed? After upgrading to 2.6.3, the Rancher log is full of "Updating workload" messages.

2022/01/18 20:48:34 [INFO] Updating workload [cattle-monitoring-system/rancher-monitoring-prometheus-node-exporter] with public endpoints [[{"nodeName":....................

rchench commented Jan 31, 2022

Same issue here: after upgrading from 2.6.2 to 2.6.3, the Rancher pod keeps updating the nginx-ingress-controller public endpoints constantly, and it eventually panicked and restarted.

[INFO] Updating workload [ingress-nginx/nginx-ingress-controller] with public endpoints [[

nickvth commented Apr 1, 2022

@deniseschannon please reopen this issue. It is still a problem, and the etcd database size also grows very quickly after upgrading to 2.6.3 because of the constant "Updating workload" churn. We can't upgrade to 2.6.3 because of this. The problem affects all DaemonSets, e.g. node-exporter.
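
To see the growth, you can check the database size each etcd member reports (a sketch, assuming etcdctl v3 is available on an etcd node; the certificate paths below are illustrative and differ between RKE, kubeadm, and other distributions):

# Show the reported database size per etcd endpoint
ETCDCTL_API=3 etcdctl endpoint status --write-out=table \
  --cacert=/path/to/etcd/ca.crt --cert=/path/to/etcd/client.crt --key=/path/to/etcd/client.key

The DB SIZE column should stay roughly stable; if it climbs continuously while these log messages are being emitted, the two are likely related.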

zube bot removed the [zube]: Done label Apr 2, 2022
cbron (Contributor) commented Sep 16, 2022

If you're seeing this issue with workloads and either a service or an ingress, check whether your service or ingress is being repeatedly updated with different values. For example, this can happen when one ingress controller assigns new IPs to an ingress object and another controller does the same, so the status flips back and forth between the two. Rancher picks up each update and updates various objects with that data, which can degrade kube-apiserver performance and produce lots of log messages.
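
A quick way to check for that kind of flapping (a sketch; replace the namespace and ingress name with your own objects) is to watch the ingress and see whether its load-balancer addresses keep alternating between two sets of values:

# Print the resourceVersion and load-balancer status every time the ingress is updated
kubectl -n <namespace> get ingress <ingress-name> -w \
  -o jsonpath='{.metadata.resourceVersion} {.status.loadBalancer.ingress}{"\n"}'

The same approach works for a Service: watch it and see whether its status or endpoints keep changing.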
