Rolling-update with NodePort and Service VIP loses requests #55667

Closed
sdaschner opened this Issue Nov 14, 2017 · 4 comments

@sdaschner

sdaschner commented Nov 14, 2017

Is this a BUG REPORT or FEATURE REQUEST?:

Uncomment only one, leave it on its own line:

/kind bug

/kind feature

What happened:
Since the update from Kubernetes 1.6.4 to 1.8 -- in my case provided by Minikube v0.18, now v0.23 -- the zero-downtime rolling-update with NodePort services (and the Service VIP) no longer works without losing requests. After some investigation into how Kubernetes manages rolling updates (see also kubernetes/contrib#1140), my understanding was that at least the Service VIP should provide a rolling update without any lost requests. This was always the case when connecting against the Minikube node port (e.g. 192.168.99.100:30655). When downgrading the installation to Minikube v0.18 and Kubernetes 1.6.4, all requests to the service are handled reliably; the latest version consistently drops some requests when performing a rolling update.

What you expected to happen:
A zero-downtime rolling update that loses no requests when accessing the Service VIP (Minikube with NodePort).

How to reproduce it (as minimally and precisely as possible):
Create a single service of type NodePort and a deployment with liveness and readiness probes, using a container that handles termination gracefully. See https://github.com/sdaschner/hello-cloud/tree/master/deployment/ as an example; a minimal sketch follows below.
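
For reference, a minimal sketch of such a setup. The names, image, ports, and probe paths here are placeholders, not the actual manifests from the linked repository:

```yaml
# Hypothetical minimal reproduction; image name, ports, and probe paths are
# placeholders. The Service exposes the Deployment's pods via a NodePort.
apiVersion: v1
kind: Service
metadata:
  name: hello-cloud
spec:
  type: NodePort
  selector:
    app: hello-cloud
  ports:
    - port: 8080
      targetPort: 8080
---
apiVersion: extensions/v1beta1   # apps/v1 on newer clusters
kind: Deployment
metadata:
  name: hello-cloud
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: hello-cloud
    spec:
      containers:
        - name: hello-cloud
          image: example/hello-cloud:latest   # placeholder image
          ports:
            - containerPort: 8080
          readinessProbe:
            httpGet:
              path: /health      # placeholder probe path
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 2
          livenessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 15
            periodSeconds: 5
```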

Anything else we need to know?:

Environment:

  • Kubernetes version (use kubectl version): v1.8 (worked in v1.6.4)
  • Cloud provider or hardware configuration: Minikube v0.23.0 (worked in v0.18)
  • OS (e.g. from /etc/os-release): ArchLinux
  • Kernel (e.g. uname -a): Linux archlinux 4.13.9-1-ARCH #1 SMP PREEMPT Sun Oct 22 09:07:32 CEST 2017 x86_64 GNU/Linux
  • Install tools: Minikube, pacaur
  • Others:

/sig network

@thockin

Member

thockin commented Jan 6, 2018

@sdaschner

sdaschner commented Jan 6, 2018

Not yet, i.e. the application server is shut down gracefully but stops accepting new requests very quickly. Yes, that would be one possible solution: functionality to report unhealthy via the readiness probe while continuing to serve requests.

However, in the old Minikube version the Service VIP took care of that; in other words, the same example already satisfied the zero-downtime requirement on the Kubernetes side.
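
For context, a common way to approximate that behavior without application changes is a preStop sleep: the container keeps serving while the endpoint removal propagates to kube-proxy, and only then receives SIGTERM. A sketch of the pod template fragment, with illustrative durations and placeholder names (not taken from the example repository):

```yaml
# Illustrative fragment of a Deployment's pod template. The preStop sleep
# delays SIGTERM so the container keeps serving while the pod is being
# removed from the Service endpoints.
spec:
  terminationGracePeriodSeconds: 60
  containers:
    - name: hello-cloud
      image: example/hello-cloud:latest   # placeholder image
      lifecycle:
        preStop:
          exec:
            command: ["sh", "-c", "sleep 10"]
```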

@sdaschner

sdaschner commented Feb 7, 2018

Update as of today:
With Minikube v0.25.0 and Kubernetes 1.9.0, the behavior now seems to work as expected; that is, the service IP doesn't lose requests anymore.

Also -- for the first time, as far as I can tell -- using the default strategy.rollingUpdate.maxUnavailable now causes the currently running pod to be terminated immediately (if replicas is 1), before the new one is ready. This definitely wasn't the case in previous versions, which I had until now assumed was a downtime-prevention feature. The behavior does, however, match the documentation of the rolling update strategy.

So for me this issue is resolved; I'm just wondering whether there is any confirmation or a related issue?
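
For reference, pinning the rolling-update strategy explicitly (instead of relying on the percentage defaults) makes the controller bring the replacement pod up and wait for it to become Ready before the single old pod is terminated. A sketch, assuming the standard Deployment strategy fields:

```yaml
# Illustrative Deployment fragment: with replicas: 1, maxUnavailable: 0 and
# maxSurge: 1 ensure the new pod is Ready before the old one is terminated.
spec:
  replicas: 1
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0
      maxSurge: 1
```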

@fejta-bot

fejta-bot commented May 8, 2018

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale
