
Add option to enable proxy protocol #699

Closed
mcfedr opened this issue Mar 20, 2019 · 28 comments

@mcfedr
Contributor

mcfedr commented Mar 20, 2019

The google cloud load balancer can do it, so seems it would just need an annotation to enable.

https://cloud.google.com/load-balancing/docs/tcp/setting-up-tcp#proxy-protocol

@rramkumar1
Contributor

@mcfedr TCP Proxy LB is not really meant for HTTP traffic. As a result, it does not make sense for us to support it here.

@dcherniv
Copy link

dcherniv commented Jan 17, 2020

@rramkumar1 Surely you cannot be serious. Here's a real world example that a lot of people out there use.
TCP LB (L4) -> nginx ingress controller (L7) -> internal service -> pod
Please have a look at https://kubernetes.io/docs/concepts/cluster-administration/cloud-providers/#aws

Responses like these are the reason why AWS is eating your lunch. If a user comes to you with a feature request that is not only valid but highly sought after, the correct response is not CLOSEWONTFIX.

@elsbrock

Using it for HTTP traffic and would like to enable it via annotation too.

@CapoD

CapoD commented Mar 31, 2020

@rramkumar1 When you run Istio on GKE, the ingress-gateway service will configure a TCP LB. Now, if you want to use Istio functions like Rate Limiting on IP basis, you need to know the source IP address. I hope that example might serve as a reasonable use case.
This is already possible on AWS

@Eifoen

Eifoen commented Aug 28, 2020

+1 on this.

We're currently planning to deploy a scenario like the one @dcherniv mentioned with traefik instead of nginx.

Since there is no way to deploy traefik outside of the Kubernetes cluster by using e.g. keepalived (GCP Doc) - and the GCP-integrated LB is the only way to go for a solution like this - it's basically a deal breaker.

Btw, using the GCP-integrated HTTP/S LB is no real-world option as it lacks tons of features other ingress controllers offer.

@Semmix

Semmix commented Feb 1, 2021

Why was this closed?

@jbielick

jbielick commented Jun 3, 2021

If you're looking for preservation of source IP in a GCE Network Load Balancer (L4) -> Ingress Nginx Controller (L7) -> Pod setup, you can achieve that without proxy protocol: kubernetes/ingress-nginx#3431 (comment)

GCE NLB (L4) is not a proxy, so it does not need proxy protocol. Packets are forwarded straight to the VMs without SNAT. Further, your Ingress Nginx service should be externalTrafficPolicy: Local so that packets are not routed to VMs without endpoints and SNATed by kube-proxy.

In summary:

To preserve source IP from TCP LB (NLB) to Nginx Ingress: use externalTrafficPolicy: Local

To preserve source IP from Ingress Nginx to your pods (L7): use enable-real-ip and proxy-real-ip-cidr (though I think X-Forwarded-For gets added automatically without these).

TCP Proxy LB (not an NLB) seems like it has other use cases. NLB is sufficient for the case described above. I hope this helps.
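The two settings above could look something like this (a sketch only - names, namespaces, and the CIDR are placeholders for your own setup):

```yaml
# Service in front of the ingress-nginx controller (names are placeholders)
apiVersion: v1
kind: Service
metadata:
  name: ingress-nginx-controller
  namespace: ingress-nginx
spec:
  type: LoadBalancer
  externalTrafficPolicy: Local   # keep the client source IP; skip nodes without endpoints
  selector:
    app.kubernetes.io/name: ingress-nginx
  ports:
    - name: http
      port: 80
      targetPort: http
    - name: https
      port: 443
      targetPort: https
---
# ingress-nginx controller ConfigMap: trust the real client IP inside the cluster
apiVersion: v1
kind: ConfigMap
metadata:
  name: ingress-nginx-controller
  namespace: ingress-nginx
data:
  enable-real-ip: "true"
  proxy-real-ip-cidr: "10.0.0.0/8"   # placeholder: your VPC / node CIDR
```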

@Eifoen

Eifoen commented Jun 4, 2021

@jbielick Does this really work? Wouldn't this actually mean the following:

  • Client talks to the IP of the NLB, which forwards the traffic to one of the pods hosting the service
  • pods don't know anything about the NLB and answer to the actual source IP of the client (which is in the TCP packets)
  • pod answers with its own IP as source (which might be SNATed by a gateway, but must be allowed specifically anyway as far as I know)
  • the client gets data from some random IP which is not the one of the NLB it connected to initially

result: no connection at all

As you said, an NLB is an L4 device, which means it's part of the transport layer, which in turn means it terminates the TCP connections with its own IP and connects to the actual service with a new connection between the NLB and the service. Of course with the same payload.

I doubt that the externalTrafficPolicy will solve this but will look further into it within the next weeks.

Please correct me if I'm totally mistaken of course.

Greetings,
Eifoen

@jbielick

jbielick commented Jun 4, 2021

@Eifoen

Yes, it works. I think termination of the connection would be Layer 5. The major thing to note is that the NLB does not terminate the TCP connection. And you are correct that the packets are returned directly from the VM but maintain the correct source IP on return (See "direct server return").

Responses from the backend VMs go directly to the clients, not back through the load balancer. The industry term for this is direct server return.


https://cloud.google.com/load-balancing/docs/network

@elsbrock

elsbrock commented Jun 5, 2021

I am using this too. I think you will have to run the Ingress as a DaemonSet in this case though, so that it is present on all of the nodes.

@zufardhiyaulhaq

@jbielick,

To preserve source IP from TCP LB (NLB) to Nginx Ingress: use externalTrafficPolicy: Local

we already do this one, but when we rollout-restart the Istio pods, it causes a short downtime.

@pjestin

pjestin commented Nov 25, 2022

We're using the Kong Kubernetes Ingress Controller and Kong seems to agree on the 2 solutions described by @jbielick above. However, as @zufardhiyaulhaq pointed out, the externalTrafficPolicy: Local solution causes errors during deployment restarts.

It seems that a load balancer with externalTrafficPolicy: Local sends traffic to the nodes that are running a "healthy" Kong pod. However that "health" is not perceived by the load balancer through pod readiness, but instead through healthchecks that can take some time to reflect terminated pods.

Ideally we should be able to configure a load balancer with externalTrafficPolicy: Cluster and proxy protocol through the Service resource.

It looks like there is support for this feature, would it be possible to reopen the issue?

@mmiller1

Seconding support to have this reconsidered. I've been deploying the LoadBalancer L4 Service, waiting for the LB instance to come online, and manually enabling proxy protocol on these instances to get this to work (ew). Thanks.

@Eifoen

Eifoen commented Jan 11, 2023

@mmiller1 You should take a look at the post of @jbielick

GCE NLB (L4) is not a proxy, so it does not need proxy protocol. Packets are forwarded straight to the VMs without SNAT. Further, your Ingress Nginx service should be externalTrafficPolicy: Local so that packets are not routed to VMs without endpoints and SNATed by kube-proxy.

Setting the externalTrafficPolicy: Local attribute does actually solve this by using the direct server return scheme - which solves this in a more elegant and less compute-intensive way imho.

There is a whitepaper on this by Kemp Technologies somewhere, but I can't find it. Anyways: here is some background on DSR

@mmiller1

@Eifoen thanks for the response. Using the Local traffic policy doesn't fit our use-case. We can't tolerate any service unavailability when restarting the backing pods; kube-proxy handles this gracefully, but relying on the external load balancer's health checking to remove a node from service when the pod is shut down will result in 10-20 seconds of dropped traffic.

@Eifoen

Eifoen commented Jan 12, 2023

@mmiller1
The proxy protocol does not solve this. Even if you enable the proxy protocol, the node health checks of the GCP NLB still apply in the same way they do without it.

I was going to suggest that you consider running another kind of ingress proxy within your deployment as a DaemonSet - but this would still not solve your problem, as the health checks of the GCP TCP LB still apply.

You might want to dig deeper into the configuration of service health checks and configure the Backend-CRD (which allows specifying the checkIntervalSec attribute) according to your requirements.

@mmiller1

mmiller1 commented Jan 12, 2023 via email

@Eifoen

Eifoen commented Jan 12, 2023

We can't tolerate any service unavailability when restarting the backing pods

The manually configured LBs will be subject to the same health-check intervals as the GKE-created ones.

According to the explanation of your deployment, the failover between pods is handled by ingress-nginx. The external LBs are just handling the failover between nginx pods (or the kube-proxy cluster IPs in your case). Thus your deployment is still subject to the (default 15 sec, as far as I know) timeout of the GCP LBs in case the Kubernetes node currently holding that IP fails - at least if you did not change the default check interval of your manually deployed LBs. The fact that you might be using kube-proxy as a middleman doesn't reduce your failover time in any way (at worst it might even lengthen it).

Anyways - all of the above isn't in any way tied to the fact that you are using the proxy protocol. It is, however, tied to the fact that you might have reduced the health-check interval in your manually configured LBs.

This, however, can already be configured automatically by using the Backend-CRDs as far as I'm concerned. In combination with using the automatically deployed service LBs with externalTrafficPolicy: Local (pointing to your nginx-ingress service pods, not to your application pods directly) this should suit your use-case.

As I said:

You might want to dig deeper into the configuration of service health checks and configure the Backend-CRD (which allows specifying the checkIntervalSec attribute) according to your requirements.
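For reference, a BackendConfig of the kind described here might look like this (a sketch only - the names and health-check values are placeholders, and note that BackendConfig applies to backends managed by the GKE Ingress controller):

```yaml
apiVersion: cloud.google.com/v1
kind: BackendConfig
metadata:
  name: fast-healthcheck      # placeholder name
spec:
  healthCheck:
    checkIntervalSec: 5       # shorter than the default interval
    timeoutSec: 5
    healthyThreshold: 1
    unhealthyThreshold: 2
    type: HTTP
    requestPath: /healthz
---
# Attach it to a Service via annotation (Service shown is a minimal placeholder)
apiVersion: v1
kind: Service
metadata:
  name: my-service
  annotations:
    cloud.google.com/backend-config: '{"default": "fast-healthcheck"}'
spec:
  selector:
    app: my-app
  ports:
    - port: 80
      targetPort: 8080
```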

@hobti01

hobti01 commented Apr 13, 2023

We would also like the ability to enable proxy protocol with a Service annotation so that identifying the client source IP is possible for any TCP application.

Unfortunately, without proxy protocol support, it is simply not possible to masquerade Pod IPs for outbound connections while preserving client source IP.

Note: Using the Local policy to preserve source IP addresses will not work if ip-masq-agent is running on your cluster and the Pod IP address range is not excluded from IP masquerading. See Specifying nonMasqueradeCIDRs for instructions on excluding IP address ranges from masquerading.
Ref: https://cloud.google.com/kubernetes-engine/docs/how-to/service-parameters#externalTrafficPolicy

IP masquerade is necessary when Pods make connections to systems that only accept Node IP addresses - which is the stated purpose of the ip-masq-agent. The ip-masq-agent is automatically installed in GKE clusters under several conditions (e.g. the Pod IP range is not within 10.0.0.0/8), and the Pod IPs will then be masqueraded - which prevents the client source IP from being available.

If proxy protocol is not necessary in this configuration, then we would very much like to know the alternative configuration that will provide the client source IP.
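For anyone hitting this, the exclusion the docs describe is configured via the ip-masq-agent ConfigMap (a sketch only - the CIDR is a placeholder for your actual Pod IP range):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: ip-masq-agent
  namespace: kube-system
data:
  config: |
    nonMasqueradeCIDRs:
      - 10.200.0.0/14   # placeholder: your Pod IP range
    masqLinkLocal: false
    resyncInterval: 60s
```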

@aojea
Member

aojea commented Apr 15, 2023

Unfortunately, without proxy protocol support, it is simply not possible to masquerade Pod IPs for outbound connections while preserving client source IP.

a Pod that connects to a Service load balancer on the same cluster will see the Pod IP, because the traffic is short-circuited and does not leave the cluster

@hobti01

hobti01 commented Apr 26, 2023

a Pod that connects to a Service loadbalancer on the same cluster will see the PodIP, because the traffic is shortcuted and it not leaving the cluster

Yes, that's true but not the situation we have.

  • Pods make connections to non-Kubernetes system X which accepts Node IPs, requiring masquerade.
  • Internet clients make connections to Pods via Load Balancer and their source IP is not available.

@aojea
Member

aojea commented Apr 29, 2023

Pods make connections to non-Kubernetes system X which accepts Node IPs, requiring masquerade.

I'm missing some details here, but how is this related to the Service LoadBalancer type?

  • Internet clients make connections to Pods via Load Balancer and their source IP is not available

you have to use externalTrafficPolicy: Local, which was explicitly added to preserve the client IP

@hobti01

hobti01 commented Apr 30, 2023

you have to use externalTrafficPolicy: Local, that was explicitly added to preserver client IP

This setting on the service is not relevant in this case. We have this enabled but when IP masquerading is enabled, source IP is not available. See my previous comment (#699 (comment)) for the quote from the Google Cloud documentation.

@aojea
Member

aojea commented Apr 30, 2023

With kube-proxy in recent versions, the traffic is short-circuited so it will not hit the ip-masquerade rules.
Is this something you are actually observing, or is it a theoretical conclusion based on the docs?

@hobti01

hobti01 commented Apr 30, 2023

We experience in a production cluster that when IP masquerade is enabled and a Service with externalTrafficPolicy: Local is accessed from a client across the internet via the Load Balancer, the client source IP is not available. We were very surprised that this significant limitation is documented only within the GKE service parameters documentation, with no mention in the IP masquerade documentation.
With ip-masq-agent and externalTrafficPolicy: Local, the client IP is not available to the Pod:
client -> internet -> LB -> (Node) -> Pod

@aojea
Member

aojea commented May 1, 2023

Hmm 🤔 that should not happen... Is this a GKE cluster? Have you opened an issue with support about it? I can take a look if I get the issue number.

@hobti01

hobti01 commented May 1, 2023

Thanks for the offer @aojea , I'll send you a message directly.

@gitanuj

gitanuj commented Aug 25, 2023

Is there a way to enable proxy protocol through Kube Service annotations? I want to use Proxy Protocol for reading the Private Service Connection IDs for incoming internal traffic. (and proxy protocol is the only way to get this information from incoming traffic)

Both AWS and Azure have annotations to do this.
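For comparison, this is roughly what the AWS in-tree annotation looks like (a sketch - the Service name and selector are placeholders; Azure exposes a similar per-Service setting via its own annotation):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-lb                # placeholder
  annotations:
    # AWS: enable proxy protocol v1 on all ELB backend ports
    service.beta.kubernetes.io/aws-load-balancer-proxy-protocol: "*"
spec:
  type: LoadBalancer
  selector:
    app: my-app              # placeholder
  ports:
    - port: 443
      targetPort: 8443
```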
