
[Bug]: VirtualServer/VirtualServerRoute flapping status #7491

Open
sass1997 opened this issue Mar 11, 2025 · 2 comments
Labels
bug (An issue reporting a potential bug) · waiting for response (Waiting for author's response)

Comments

@sass1997

Version

3.7.0

What Kubernetes platforms are you running on?

Kind

Steps to reproduce

  1. Deploy some VirtualServers and VirtualServerRoutes -> the exact combination doesn't really matter
  2. Do a rollout restart of the NGINX Ingress Controller -> deployed with 3 replicas (rough commands are sketched after this list)
  3. Instead of a rollout restart, the status sometimes also flaps if you just kill 1 of the 3 pods
  4. After a restart we intermittently observe that some VirtualServers and VirtualServerRoutes have their status change to Warning or lose their status information entirely. It doesn't happen every time, but the probability is high, and there is no guarantee which VirtualServer or VirtualServerRoute statuses will change.
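
For reference, this is roughly how I reproduce it; the deployment name and namespace are placeholders based on a typical install, so adjust as needed:

```
# Rollout restart of the controller (deployment/namespace names are assumptions)
kubectl -n nginx-ingress rollout restart deployment nginx-ingress

# Or, instead of a full rollout restart, kill just one of the three pods
kubectl -n nginx-ingress delete pod <one-of-the-controller-pods>

# Watch the VirtualServer / VirtualServerRoute statuses flap afterwards
kubectl get virtualservers,virtualserverroutes -A -w
```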

The only way I've found to fix it:

  • Each VirtualServer and VirtualServerRoute that ends up with this new "wrong" status has to be deleted and re-applied (commands sketched below). Sometimes I don't even have to delete all of them before the rest update correctly again, e.g. after deleting 3 of 8 the remaining 5 also got updated. Here too I can't see any schema or pattern for how to force it.
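
The clean-up I do today looks like this; the file and resource names are placeholders for the affected manifests:

```
# Delete and re-apply the affected resources so the controller re-processes them
kubectl delete -f affected-virtual-server.yaml
kubectl apply -f affected-virtual-server.yaml

# The status is written again once the resource has been re-admitted
kubectl get virtualserver <name> -n <namespace>
```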

It's important to note that the VirtualServers and VirtualServerRoutes keep working as they should, even when the status is Warning or empty.

In my opinion this status handling is really flaky, and deleting and re-applying means a little downtime each time, which doesn't make sense.

Is there maybe an endpoint where I can trigger the Ingress Controller to update the VirtualServer status, the same way it does when I delete and re-apply the manifests?
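
Something lightweight like the following is what I have in mind; this is untested and the annotation key is arbitrary, the idea is just to make the controller see an update event without deleting the resource:

```
# Untested idea: bump an arbitrary annotation so the controller processes an update
kubectl annotate virtualserver <name> -n <namespace> example.com/force-resync="$(date +%s)" --overwrite
```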

sass1997 added the bug and needs triage labels on Mar 11, 2025

Hi @sass1997 thanks for reporting!

Be sure to check out the docs and the Contributing Guidelines while you wait for a human to take a look at this 🙂

Cheers!

@vepatel
Contributor

vepatel commented Mar 11, 2025

Hi @sass1997, the behaviour you're seeing is due to the batch reloading implemented in NIC, see https://docs.nginx.com/nginx-ingress-controller/overview/design/#when-nginx-ingress-controller-reloads-nginx
Resources that aren't included in the first batch after a restart show the wrong status; once they're included in a subsequent batch (e.g. after deleting 3 of 8, the remaining 5 also got updated), the correct status is reinstated.

It's important to note that the VirtualServers and VirtualServerRoutes keep working as they should, even when the status is Warning or empty.

This is because the nginx config is still correct. You can minimise the window by tuning the nginx-reload-timeout (see the sketch below). Hope this helps!
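
If it helps, the timeout is set via the controller's -nginx-reload-timeout command-line argument (value in milliseconds); the patch below is only a sketch, and the deployment name, namespace, container index and value are assumptions:

```
# Sketch: append -nginx-reload-timeout to the controller args
# (deployment/namespace names, container index and the 120000 value are assumptions)
kubectl -n nginx-ingress patch deployment nginx-ingress --type=json \
  -p='[{"op":"add","path":"/spec/template/spec/containers/0/args/-","value":"-nginx-reload-timeout=120000"}]'
```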

vepatel added the waiting for response label and removed the needs triage label on Mar 11, 2025