Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Nginx Controller using endpoints instead of Services #257

Closed
Nalum opened this issue Feb 10, 2017 · 40 comments
Closed

Nginx Controller using endpoints instead of Services #257

Nalum opened this issue Feb 10, 2017 · 40 comments

Comments

@Nalum
Copy link
Contributor

Nalum commented Feb 10, 2017

Is there any reason as to why the Nginx controller is set up to get the endpoints that a Service uses rather than using the Service?

When I run updates to a deployment and they get rolled out some requests are being sent to the pods that are being terminated which results in a 5XX error because nginx still thinks it's available. Doesn't the use of the Service take care of this issue?

I'm using this image: gcr.io/google_containers/nginx-ingress-controller:0.8.3

I'd be happy to look into changing this so it works with the Services rather than the Endpoints if it makes sense to everyone that it does that.

@yissachar
Copy link

The old docs explained:

The NGINX ingress controller does not uses Services to route traffic to the pods. Instead it uses the Endpoints API in order to bypass kube-proxy to allow NGINX features like session affinity and custom load balancing algorithms. It also removes some overhead, such as conntrack entries for iptables DNAT.

@Nalum
Copy link
Contributor Author

Nalum commented Feb 10, 2017

Ah okay.

Thanks for that, I didn't find it in my searches.

Are there efforts being made to have it more responsive to the readiness and liveness checks do you know?

@aledbf
Copy link
Member

aledbf commented Feb 10, 2017

Are there efforts being made to have it more responsive to the readiness and liveness checks do you know?

@Nalum yes. We need to release a final version before changing the current behavior.
The idea (we need a POC first) is to use the kong load balancer code that allows the update of the nginx upstreams without requiring a reload.

@Nalum
Copy link
Contributor Author

Nalum commented Feb 13, 2017

@aledbf anything I can do to help with this?

@Nalum
Copy link
Contributor Author

Nalum commented Feb 22, 2017

Just dropping these comments here:

aledbf Slack 11:59 AM 2017-02-22
@Nalum about that, I just need time 🙂 . We need to release 0.9 first. I hope after beta 2 #303 we can release 0.9. After that I will start looking the kong load balancer lua code. Basically this is the code we need to understand and “port” Kong/kong#1735

aledbf Slack 12:00 PM 2017-02-22
@Nalum please take this just as an idea and POC. We really need to found a way to avoid reloads and better handling of nginx upstreams

@rikatz
Copy link
Contributor

rikatz commented Jun 7, 2017

OK, I was taking a look at this:

https://github.com/yaoweibin/nginx_upstream_check_module

Maybe we can open a feature request to configure this, but I think this is not the case of this issue.

About using services instead of PODs, we also have to take in mind that serviceIPs are valid only inside a Kubernetes Cluster, while POD IPs might be valid in your network (using Calico or some other solution).

Anyway, ingress gets the upstream IP from Service (connect to service, watch the service and check for POD IP changes in that service) so it might be the case to health check the Upstream POD with the module above, instead of using the Service IP :)

@aledbf Is this the case to open a feature request using the upstream healthcheck module or should we keep this open?

Thanks!

@Nalum
Copy link
Contributor Author

Nalum commented Jun 7, 2017

@rikatz correct me if I'm wrong, the brief look I took at that repo didn't show that it is using the kubernetes health checks, I would think it better to take advantage of those rather than adding additional checks. So have nginx be more responsive to the changes of the service.

@aledbf
Copy link
Member

aledbf commented Jun 7, 2017

@rikatz adding an external module for this just duplicates the probes feature already provided by kubernetes.

@aledbf
Copy link
Member

aledbf commented Jun 7, 2017

@rikatz what this issue is really about how we update the configuration (change in pods) without reloading nginx

@rikatz
Copy link
Contributor

rikatz commented Jun 7, 2017

Yeap, Kube health check is the best approach. I was just thinking about an alternative :)

Will take a look in how to change the config without reload the nginx also.

@rikatz
Copy link
Contributor

rikatz commented Jun 9, 2017

@aledbf and about this module: https://github.com/yzprofile/ngx_http_dyups_module

And ingress adding / removing upstreams based in HTTP request for this? This is the same approach that NGINX Plus uses to add/remove upstreams without reload / kill nginx process

@aledbf
Copy link
Member

aledbf commented Jun 9, 2017

@rikatz I tried that module like a year ago (without success). Maybe we need to check again

@rikatz
Copy link
Contributor

rikatz commented Jun 12, 2017

@aledbf I did some quick checks here and it worked 'fine'.

But the following situations still 'reloads' the nginx:

  • New vHost
  • New Path (changed Ingress)
  • New SSL Cert

I couldn't verify if, for each upstream change, NGINX got reloaded or if this doesn't happens.

So, is this module of upstreams still applicable to the whole problem? I can start writing a PoC of this approach (it includes changing the nginx.tmpl, the core and a lot of other things) but if you think it's still a valuable change, we can do this :D

Thanks

@arikkfir
Copy link

Hi @chrismoos,
Would it be possible to also support this annotation in the Nginx configmap as well? We have quite a few Ingress resources and would love setting it in a centralized place instead of in each one... ?

@aledbf
Copy link
Member

aledbf commented Oct 8, 2017

Closing. It is possible to choose between endpoints or services using a flag.

@aledbf aledbf closed this as completed Oct 8, 2017
@mbugeia
Copy link

mbugeia commented Oct 11, 2017

@aledbf Hi, can you tell me which flag should I use to use service-upstream by default ? I can't find it in the documentation or with -h.

@jordanjennings
Copy link

@mbugeia It's ingress.kubernetes.io/service-upstream and you can find that on this page: https://github.com/kubernetes/ingress-nginx/blob/master/configuration.md#service-upstream

@montanaflynn
Copy link

@jesseshieh
Copy link

@Nalum @aledbf sorry to comment on a closed issue, but I was wondering what the status of using "the kong load balancer code that allows the update of the nginx upstreams without requiring a reload" is. Was there any progress made on that at all?

@aledbf
Copy link
Member

aledbf commented Nov 5, 2017

Was there any progress made on that at all?

No, sorry. One of the reason is that adding lua again implies that will work only in some platforms

@rikatz
Copy link
Contributor

rikatz commented Nov 5, 2017

It might be a stupid question of mine, but are we 'HUP'ing the process? (http://nginx.org/en/docs/control.html)

I've seen also that we could send a 'USR2' signal (https://www.digitalocean.com/community/tutorials/how-to-upgrade-nginx-in-place-without-dropping-client-connections) but this might overload NGINX with stale connections, right?

@aledbf
Copy link
Member

aledbf commented Nov 5, 2017

@rikatz the USR2 procedure create several new problems:

  • doubles the resource usage
  • hard to coordinate
  • if you have clients connections with keepalive you loose request
    (I already tried this approach two years ago)

@mcfedr
Copy link

mcfedr commented Apr 2, 2018

This is likely improved by using #2174

@jwatte
Copy link

jwatte commented Oct 17, 2018

@montanaflynn that link is 404, here's the now-working link: https://github.com/kubernetes/ingress-nginx/blob/master/docs/user-guide/nginx-configuration/annotations.md#service-upstream
(and that's very likely to rot, too -- google for nginx annotation service-upstream to find it whenever you next need this)

@michaelajr
Copy link

michaelajr commented Mar 8, 2019

When I use the annotation I still see all the Pod IPs as endpoints. Shouldn't I see the Service's ClusterIp?

I see this:

Host  Path  Backends
  ----  ----  --------
  *     *     testapp:443 (192.168.103.2:443,192.168.203.66:443,192.168.5.3:443)

@doubleyewdee
Copy link

@michaelajr make sure you're using nginx.ingress.kubernetes.io/service-upstream -- the shorter form no longer works for nginx-specific ingress settings by default.

@robbie-demuth
Copy link

robbie-demuth commented Nov 14, 2019

Does anyone know why the docs indicate that sticky sessions won't work if the service-upstream annotation is used? Wouldn't the session affinity configuration on the Kubernetes Service itself be honored in that case?

EDIT

It looks like this is due to the fact that Kubernetes session affinity is based on client IPs. By default, NGINX ingress obscures client IPs because the Service it creates sets externalTrafficPolicy to Cluster. It looks like setting it to Local instead might resolve the issue, but I'm not quite sure I understand what the implications are - particularly regarding imbalanced traffic spreading

https://kubernetes.io/docs/tasks/access-application-cluster/create-external-load-balancer/#preserving-the-client-source-ip

@zhangyuxuan1992
Copy link

zhangyuxuan1992 commented Jun 5, 2020

@michaelajr make sure you're using nginx.ingress.kubernetes.io/service-upstream -- the shorter form no longer works for nginx-specific ingress settings by default.

I too use
nginx.ingress.kubernetes.io/service-upstream: 'true'
but till see all pod and ports instead of ClusterIp.

/*.* *:8423 (192.168.102.25:8423,192.168.149.225:8423)

@VengefulAncient
Copy link

I too use
nginx.ingress.kubernetes.io/service-upstream: 'true'
but till see all pod and ports instead of ClusterIp.

/*.* *:8423 (192.168.102.25:8423,192.168.149.225:8423)

Same here and I would really like to understand whether the directive is actually working. Right now it seems to me that it isn't.

@cayla
Copy link

cayla commented May 23, 2022

For anyone finding this page years later, you can confirm the change is working by looking at the ingress-nginx logs for traffic to the ingress in question. By default, the logging will show the backend IP that handled the request. It's either an endpoint IP or service IP depending on the configuration.

https://kubernetes.github.io/ingress-nginx/user-guide/nginx-configuration/log-format/

hazzadous pushed a commit to PostHog/charts-clickhouse that referenced this issue Aug 17, 2022
Instead of having NGINX maintain it's own list of upstreams, we want it
instead to use the service ClusterIP such that k8s can handle e.g. cases
where pods are evicted.

Without this we end up with 5xx errors e.g. on deploys. See
kubernetes/ingress-nginx#257 for more details.
hazzadous pushed a commit to PostHog/charts-clickhouse that referenced this issue Aug 22, 2022
Instead of having NGINX maintain it's own list of upstreams, we want it
instead to use the service ClusterIP such that k8s can handle e.g. cases
where pods are evicted.

Without this we end up with 5xx errors e.g. on deploys. See
kubernetes/ingress-nginx#257 for more details.
@narqo
Copy link
Contributor

narqo commented Nov 22, 2022

Just so people coming to this issue considered it:

We tried rolling service-upstream = true setting to the nginx-ingress running in EKS cluster, and observed a huge imbalance in traffic handling per individual deployment's POD. I believe, this must have been expected, due to kube-proxy in iptables mode doing load-balancing randomly, and since we spread the deployments' PODs across the nodes and the AZs.

We've decided to roll back to use the default service-upstream = false, as shown on the screenshot below (no details on the screenshot, due the internal policies)

Screenshot 2022-11-22 at 11 25 30

@longwuyuan
Copy link
Contributor

@narqo thanks for the udpate

@snigdhasambitak
Copy link

@narqo did you get it working with service-upstream = true? Could you please specify the version of nginx ingress you are using and your k8s cluster details? It seems to be not working in 1.22

@narqo
Copy link
Contributor

narqo commented Dec 1, 2022

did you get it working with service-upstream = true?

Yes it worked in our tests (AWS EKS 1.21, ingress-nginx v1.3.1, deployed and configured via Helm). We confirmed that the change (and the rollback of the change) took the effect by observing the changes in nginx controller's access logs, where we log the details of the request's upstream.

@sherifabdlnaby
Copy link

Just so people coming to this issue considered it:

We tried rolling service-upstream = true setting to the nginx-ingress running in EKS cluster, and observed a huge imbalance in traffic handling per individual deployment's POD. I believe, this must have been expected, due to kube-proxy in iptables mode doing load-balancing randomly, and since we spread the deployments' PODs across the nodes and the AZs.

We've decided to roll back to use the default service-upstream = false, as shown on the screenshot below (no details on the screenshot, due the internal policies)

Screenshot 2022-11-22 at 11 25 30

Interesting, would using Toplogy Aware Hints with Toplogy Spread Constraint (for Nginx pods as well as deployment) help ?

@taliastocks
Copy link

Just so people coming to this issue considered it:

We tried rolling service-upstream = true setting to the nginx-ingress running in EKS cluster, and observed a huge imbalance in traffic handling per individual deployment's POD. I believe, this must have been expected, due to kube-proxy in iptables mode doing load-balancing randomly, and since we spread the deployments' PODs across the nodes and the AZs.

We've decided to roll back to use the default service-upstream = false, as shown on the screenshot below (no details on the screenshot, due the internal policies)

Screenshot 2022-11-22 at 11 25 30

Were you able to have any luck with using preStop lifecycle hooks on your service pods to delay long enough for nginx to pick up the endpoint configuration change? Were you able to adjust any nginx settings, like proxy_next_upstream to avoid failed requests during a service upgrade?

I've been playing around with these myself with no luck (the service backend is gRPC, so we would lose service call load balancing if we enabled service-upstream=True, similar to what you show in your chart).

@narqo
Copy link
Contributor

narqo commented Jan 20, 2023

Were you able to have any luck with using preStop lifecycle hooks on your service pods to delay long enough for nginx to pick up the endpoint configuration change?

This is exactly the issue we were looking at, when we experimented with service-upstream=true. The preStop hook for every deployment (initially raised in #7330) isn't a scalable solution, for dynamic production with hundreds of µ-services behind the ingress. We are looking for better alternatives.

@taliastocks
Copy link

Were you able to have any luck with using preStop lifecycle hooks on your service pods to delay long enough for nginx to pick up the endpoint configuration change?

This is exactly the issue we were looking at, when we experimented with service-upstream=true. The preStop hook for every deployment (initially raised in #7330) isn't a scalable solution, for dynamic production with hundreds of µ-services behind the ingress. We are looking for better alternatives.

Honestly, I would even be happy with like a 5-10 second preStop hook, but even 30 seconds preStop delay + 30 seconds connection draining was not consistently long enough to prevent seeing DEADLINE_EXCEEDED.

The next thing I'm planning on trying is setting nginx.ingress.kubernetes.io/proxy-connect-timeout: "1" and relying on proxy_next_upstream behavior. It's not ideal because it would still add a second latency to some requests, but that's not a complete deal-breaker for our use-case.

@taliastocks
Copy link

Honestly, I would even be happy with like a 5-10 second preStop hook, but even 30 seconds preStop delay + 30 seconds connection draining was not consistently long enough to prevent seeing DEADLINE_EXCEEDED.

The next thing I'm planning on trying is setting nginx.ingress.kubernetes.io/proxy-connect-timeout: "1" and relying on proxy_next_upstream behavior. It's not ideal because it would still add a second latency to some requests, but that's not a complete deal-breaker for our use-case.

Welp, turns out the actual problem I was running into was related to new pods coming up slowly rather than old pods terminating. Apologies for misidentifying the issue!

@dcodix
Copy link

dcodix commented Dec 2, 2023

Just so people coming to this issue considered it:

We tried rolling service-upstream = true setting to the nginx-ingress running in EKS cluster, and observed a huge imbalance in traffic handling per individual deployment's POD. I believe, this must have been expected, due to kube-proxy in iptables mode doing load-balancing randomly, and since we spread the deployments' PODs across the nodes and the AZs.

We've decided to roll back to use the default service-upstream = false, as shown on the screenshot below (no details on the screenshot, due the internal policies)

Screenshot 2022-11-22 at 11 25 30

We observed the same behaviour.
This seems to happen because nginx uses "long lived" (keepalive) connections, so it does just 1 connection to the service IP, which DNATs to 1 pod IP, and then all the requests go thru it up to a maximum number. That number seems to be defined by upstream-keepalive-requests which I believe defaults to 10k requests, so, only after 10k requests a new TCP connection will be established. This lowers the number of TCP connections by x/10k so, if you have 1k requests per minute, you will not establish a new connection for 10 minutes (unless it idles). kube-proxy (iptables) does a good job at simulation roundrobin, but only with a big enough volume. Like, if you actually do 20k TCP connections via iptables, it will most likely look balanced, but if instead you do 2 chances are that it will not balance.
We were thinking if it was worth leaving service-upstream: true but lowering upstream-keepalive-requests so more TCP connections are made, but so far we did not because we did not have time to test the impact of this change in performance and resources in general.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests