
HTTP L7 load balancer / reverse proxy #561

Closed
lavalamp opened this issue Jul 22, 2014 · 51 comments
Labels
kind/feature Categorizes issue or PR as related to a new feature. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. sig/network Categorizes an issue or PR as relevant to SIG Network.

Comments

@lavalamp
Member

Configurable, using labels to route traffic.

From discussion: https://groups.google.com/d/msg/google-containers/frOLMyNl5U4/W5_DQUL933IJ

I suppose, if we had some sort of dynamic router based on label queries, you could make that work for your needs with a bit of configuration. I'm not sure if there's really a need for that once DNS naming is set up, though.

I think you're going to want to incorporate some sort of http/s router or at least have a suggested means of configuring one. It seems to be one of the most obvious use cases for Kubernetes.

Like I said, I'm not sure this is needed if there's a good load balancer and DNS name resolution, but filing this for tracking the discussion.

@smarterclayton
Contributor

Discussed a bit in #260 already. I've got some folks looking at adding arbitrary load balancer units and label-based query backends for http(s), SNI, and websockets - will have them describe some of what they're working on soon.

@bgrant0607 bgrant0607 added the sig/network Categorizes an issue or PR as relevant to SIG Network. label Sep 25, 2014
@bgrant0607
Member

/cc @thockin

@smarterclayton
Contributor

Pull openshift/origin#88 in origin is a prototype of a route - a resource representing an inbound connection from the external network that would be satisfied by a load balancer and direct traffic to a service. It will be (but is not yet) complemented by a Go client implementation that can read endpoints like the kube-proxy does and generate arbitrary proxy server configs for things like apache, haproxy, and nginx. Ideally, those would be routers running in docker containers with external IPs, parameterized by the address of the master.
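
A minimal sketch of that template-driven approach, assuming hypothetical types rather than the actual openshift/origin#88 code: given the routes and the endpoints backing each service, render a proxy config from a Go text/template, and have the router agent regenerate it whenever either changes.

    package main

    import (
        "os"
        "text/template"
    )

    // Hypothetical shapes standing in for what a router agent would watch
    // from the apiserver: routes plus the endpoints backing each service.
    type Backend struct {
        ServiceName string
        Endpoints   []string // "ip:port" pairs
    }

    type Route struct {
        Host    string
        Backend Backend
    }

    // A toy haproxy-flavored template; a real agent would re-render this on
    // every change and then reload the proxy process.
    var configTmpl = template.Must(template.New("cfg").Parse(
        `{{range .}}backend {{.Backend.ServiceName}}
    {{range .Backend.Endpoints}}    server {{.}} {{.}}
    {{end}}{{end}}`))

    func main() {
        routes := []Route{{
            Host: "www.example.com",
            Backend: Backend{
                ServiceName: "frontend",
                Endpoints:   []string{"10.0.0.1:8080", "10.0.0.2:8080"},
            },
        }}
        // Write the rendered config to stdout; a real agent would write the
        // proxy's config file and signal the proxy to reload.
        if err := configTmpl.Execute(os.Stdout, routes); err != nil {
            panic(err)
        }
    }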

@thockin
Member

thockin commented Sep 25, 2014

Yeah, I think something will need to work more or less out of the box here. Services are great, but they are mostly an internal construct - our story about routing external traffic is not as strong as it needs to be. Not enough hours in the day to think about it all :)


@lavalamp
Member Author

FYI, /api/v1betaX/proxy/services/serviceName works already, and it's as load balanced as anything else in our system ;)

@thockin
Member

thockin commented Sep 25, 2014

I somehow doubt we want to route all external traffic through our apiserver :)


@rektide

rektide commented Oct 4, 2014

Just a heads-up: Mailgun has a very slick Go-based, etcd-configured (with additional HTTP control points) HTTP proxy, Vulcand: https://github.com/mailgun/vulcand

@bgrant0607 bgrant0607 added priority/awaiting-more-evidence Lowest priority. Possibly useful, but not yet enough support to actually get it done. and removed kind/support-question labels Dec 3, 2014
@bgrant0607 bgrant0607 changed the title Consider an HTTP proxy component HTTP L7 load balancer / reverse proxy Dec 4, 2014
@bgrant0607 bgrant0607 added priority/backlog Higher priority than priority/awaiting-more-evidence. and removed priority/awaiting-more-evidence Lowest priority. Possibly useful, but not yet enough support to actually get it done. labels Dec 4, 2014
@bgrant0607
Member

For reference, GCE's L7 APIs:

"Route" sounds too network-y. URLMapper sounds more accurate.

Not sure how fancy we'd want to get with URL mapping. Probably at least permit a target path, in order to facilitate multiplexing. Ideally not more general-purpose pattern matching.

Copied from #2585:

OpenShift's Route type:

type Route struct {
    TypeMeta   `json:",inline" yaml:",inline"`
    ObjectMeta `json:"metadata,omitempty" yaml:"metadata,omitempty"`
    // Required: Alias/DNS that points to the service
    // Can be host or host:port
    // host and port are combined to follow the net/url URL struct
    Host string `json:"host" yaml:"host"`
    // Optional: Path that the router watches for, to route traffic to the service
    Path string `json:"path,omitempty" yaml:"path,omitempty"`
    // the name of the service that this route points to
    ServiceName string `json:"serviceName" yaml:"serviceName"`
}

Much like our service proxy watches endpoints, an HTTP reverse proxy, such as HAProxy, could watch routes and reprogram itself (or a management agent could do that to the proxy).

I'd change this to follow v1beta3 metadata/spec/status conventions: host, path, and serviceName would go in spec. Also, rather than just the service name, I'd be inclined to use whatever our canonical object cross-reference format is: ObjectReference or a (partial) URL (#1490 (comment)). I'm more and more leaning towards partial URLs, which would be generated similarly to selfLink upon GET of a particular API version.
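
A rough sketch of what that restructuring could look like, mirroring the struct above; the field names here are illustrative, not a settled API:

    // Hypothetical shape only: the OpenShift Route above, refactored to the
    // v1beta3 metadata/spec/status convention suggested in this comment.
    type Route struct {
        TypeMeta   `json:",inline"`
        ObjectMeta `json:"metadata,omitempty"`

        Spec   RouteSpec   `json:"spec,omitempty"`
        Status RouteStatus `json:"status,omitempty"`
    }

    type RouteSpec struct {
        // Host (or host:port) this route accepts traffic for
        Host string `json:"host"`
        // Optional path prefix the router matches before forwarding
        Path string `json:"path,omitempty"`
        // Cross-reference to the backing service; could equally be a
        // partial URL, per the discussion above
        Service ObjectReference `json:"service"`
    }

    type RouteStatus struct {
        // Filled in by the router, e.g. the address it is serving on
        Address string `json:"address,omitempty"`
    }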

@phemmer

phemmer commented Jan 26, 2015

This is very much of interest to us, as we run numerous applications, all fronted by a single external endpoint (e.g., http://api.example.com).

We've run our own in-house layer 7 load balancer / router for a few years now, and aside from URL rewriting (which is mentioned with the 'target path' thing), the only other feature I can think of which would be of interest is source address filtering.
Our specific use case for this is to allow internal applications to talk to other internal applications through the load balancer using 'private' routes. But we could probably work around this by just adding authentication to these routes.

I'm also wondering if it would be good to define what http headers get added to the request. Since a layer 7 router hides a lot of the client information, this'll need to be passed on. Information such as client IP address, client/server port, whether the client is using SSL, SSL client cert subject & verification status, etc.
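
To make the header question concrete, here is a minimal Go reverse-proxy sketch (the backend address and the exact header set are assumptions, not a proposal) that forwards client information the backend would otherwise lose:

    package main

    import (
        "log"
        "net/http"
        "net/http/httputil"
        "net/url"
    )

    func main() {
        // Assumed backend address; a real router would pick this per route.
        backend, err := url.Parse("http://10.0.0.1:8080")
        if err != nil {
            log.Fatal(err)
        }

        proxy := httputil.NewSingleHostReverseProxy(backend)
        orig := proxy.Director
        proxy.Director = func(req *http.Request) {
            orig(req)
            // ReverseProxy already appends X-Forwarded-For; pass along the
            // rest of the client information the backend can no longer see.
            proto := "http"
            if req.TLS != nil {
                proto = "https"
            }
            req.Header.Set("X-Forwarded-Proto", proto)
            req.Header.Set("X-Forwarded-Host", req.Host)
        }

        log.Fatal(http.ListenAndServe(":8080", proxy))
    }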

@bgrant0607 bgrant0607 added priority/awaiting-more-evidence Lowest priority. Possibly useful, but not yet enough support to actually get it done. status/help-wanted area/example and removed priority/backlog Higher priority than priority/awaiting-more-evidence. labels Feb 28, 2015
@supirman

Hi, I am a master's student who wants to apply to GSoC. I am interested in this problem, and I have worked on a project involving nginx as a load balancer and reverse proxy in the past. Can this be implemented using nginx?

@smarterclayton
Contributor

It could, although we would prefer a solution that is generic across load balancers. The OpenShift route concept and mechanism are mostly implemented, integrate automatically with HAProxy and F5 today, and have a generic template mode for any other router. I think a lot of the work for this has really been done, and it's a matter of defining how we evolve the service API and then moving that code over to Kube.


@glerchundi

Hi guys, I've already created a working example of an HTTP reverse proxy and load balancer inside Kubernetes using nginx + confd (with an etcd backend, which is just a proxy to the master etcd). It is composed of three components:

  • etcd-proxy: persists routing and upstream data
  • nginx-loadbalancer: using nginx and confd, works as a reverse proxy, load balancing traffic based on the registered upstreams
  • loadbalancer-feeder: listens to Kubernetes pod events using kubelistener (single pods or ones created by replication controllers) and updates the load balancer upstreams accordingly

A working controller+service example is also available at: https://github.com/glerchundi/kubernetes-http-loadbalancer

Any comment is really appreciated!

@bgrant0607
Member

@glerchundi Thanks for the pointers! Just a quick comment for now since we're trying to wrap up 1.0. If you define a service (set portalIP: "None" if you don't need a VIP allocated) for your pods, the endpoints controller will generate a list of their addresses and ports in an Endpoints object for you.
https://github.com/GoogleCloudPlatform/kubernetes/blob/master/docs/services.md#headless-services

@glerchundi

@bgrant0607 ok, thanks for pointing that out. After watching some videos about Kubernetes, it seems that the best pattern to handle this is to watch services, not replication controllers / pods, and modify the HTTP load balancer accordingly.

Something like:

Service 1: App Production (label: app, production)
Service 2: App Canary (label: app, canary)
Service 3: Load Balancer (selector: app)

This would balance between two services (App Production & App Canary) which in turn will balance between all pods belonging to the corresponding replication controller.

@bgrant0607
Member

@glerchundi Do you mean "service 1" and "service 2" or "replication controller 1" and "replication controller 2"? Otherwise, yes, that's the recommended approach.

@smarterclayton
Contributor

I don't think joining is terrible. The route is an atomic unit of change; in some cases I'd prefer to change one route for a blue-to-green deployment than two. I think we could make the argument that paths together is strictly better than paths separate.

On Aug 7, 2015, at 8:54 PM, Prashanth B notifications@github.com wrote:

Another idea we discussed was to move from a service-per-route model to a service-per-path model. E.g.:

type: route
Spec:
    host: foo.bar.com
    paths:
    - /prod: svc1, port
    - /test: svc2, port
    tlsMode: Termination
    secret: cert
Status:
    host: foo.bar.com
    ingressIp: 134

That seems to fit better with my mental model of a website with multiple endpoints serviced by different groups of pods, all sharing a common security policy.

The openshift model is to have a route for /prod and another for /test, and since they all join the same router things work out. But in a world where a single route creates a new loadbalancer, the ability to specify multiple services per path makes it easier to get a single ip for the entire site.

Another wrinkle is that a router might be one or multiple IPs and DNSes. If we had elastic ip binding, I might want to add that to a given router (shared or no) and thus there may be multiple effective IPs/DNS entries for clients.

Is there a reason to handle each path with a different route object?



@bprashanth
Contributor

Ok, so it sounds like the main reason is that a single-service-per-route keeps it hermetic. I mostly buy that. I also like small resources because they're easier to update, watch, display, etc., and I can invalidate a single path<->service map (e.g. because the service doesn't have a nodeport, which is currently required for GCE L7).

So the model we're assuming is: a hostname + multiple URL endpoints, each managed by a different Kubernetes service.

To make a useful 1.1 L7 API around this, I think we need the ability to route all requests for that hostname, through a single IP, to different backends based on routes, without creating a global router up front (because GCE only allows one cert per load balancer IP, and mixing the models complicates things).

There are 2 ways to achieve this:

  1. Keep the routes simple (more like the first route, #561 (comment)) and implement a basic claims model that allows joining
  2. Make the route expressive enough to accommodate the basic requirement (more like the second route, #561 (comment)), so we have a usable API even without joining

@thockin and @bgrant0607 wdyt?

@thockin
Member

thockin commented Aug 8, 2015

I'm having a hard time seeing the whole picture from this thread. I want to be the voice of "do the simplest thing we can get away with" here. I argued with Prashanth that the multitude of tiny route objects feels awkward to me. Admittedly, I am not the webbiest guy, but my mental model is really: some set of inputs arrive at a mux which decides, based on path, which backend Service:Port to send traffic to.

e.g. ingress{"foobar.com"} -> map{"/foo": Service{"foo", 80}, "/bar": Service{"bar", 8080}}

Changing that to multiple route objects seems confusing and unnecessary. Can someone explain it?

It might help to get a sketch of the data model and some examples using it.
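
For what it's worth, that "single mux object" mental model could be sketched roughly like this in Go (purely illustrative types, not a proposed API):

    package main

    import "fmt"

    // ServiceRef names a backing Service and port.
    type ServiceRef struct {
        Name string
        Port int
    }

    // Ingress is the single object for one external hostname: a map from
    // URL path prefix to the Service that should receive that traffic.
    type Ingress struct {
        Host  string
        Paths map[string]ServiceRef
    }

    func main() {
        ing := Ingress{
            Host: "foobar.com",
            Paths: map[string]ServiceRef{
                "/foo": {Name: "foo", Port: 80},
                "/bar": {Name: "bar", Port: 8080},
            },
        }
        // A router programs its backend mux from this one object.
        for path, svc := range ing.Paths {
            fmt.Printf("%s%s -> %s:%d\n", ing.Host, path, svc.Name, svc.Port)
        }
    }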


@smarterclayton
Contributor

I may not have expressed it clearly, but having multiple paths and services per route doesn't seem so bad (nor multiple hosts) because you can change them atomically (which when doing a blue-green cutover has some advantages). We did not start with that due to caution, but in practice folks have asked for it.

@thockin
Member

thockin commented Aug 8, 2015

Gotcha. That sort of approximates GCE's API, too. We do need to think about what is possible in AWS and others before we design something unimplementable.

@justinsb (should have looped him in sooner, sorry).


@smarterclayton
Contributor

If we do have multiples, we have to consider partial rejection on shared routers for duplicate hosts or paths.


@bprashanth
Contributor

Please review #12827 when you have time

@justinsb
Member

AWS ELB has very limited Layer 7 support. Although it has some Layer 7 features, these are limited to SSL termination (with a single cert), sticky sessions based on cookies, and writing an access log. There is no path-based routing for example. Typically you set up ELB in front of nginx/haproxy. I think we would likely want to do the same thing for AWS, with a k8s managed nginx/haproxy/vulcand.

In other words, the AWS API for load balancing is so limited that I do not think we should even try to constrain the k8s API to fit within it. Rather, we should have a k8s option that uses a cloudprovider Layer 4 load balancer in front of a k8s managed software load balancer. If you have a better load balancer (GCE, OpenStack, hardware) then ideally we would allow you to use that instead. But AWS will be primarily software implemented.

@phemmer

phemmer commented Aug 18, 2015

AWS ELB has very limited Layer 7 support. ... There is no path-based routing for example

In AWS land, they have a separate service for this, API Gateway. It's basically another layer that sits on top of the ELB.

@justinsb
Member

Oh, good point @phemmer. I hadn't seen it marketed this way, but it does look like we could indeed use API Gateway as "just" a Layer 7 load balancer. I can't help but worry that this isn't really what it's intended for, but I'm very happy for the suggestion - we'll have to evaluate it!

@smarterclayton
Contributor

Yeah, that's how we use ELB today - as the HA layer for a pair of redundant proxies.


@bprashanth
Contributor

Rather, we should have a k8s option that uses a cloudprovider Layer 4 load balancer in front of a k8s managed software load balancer. If you have a better load balancer (GCE, OpenStack, hardware) then ideally we would allow you to use that instead. But AWS will be primarily software implemented.

This is exactly the case for loadbalancer classes (either embedded in the ingress point or via claims). I'm a little wary of offering this out of the box, because there are several multi-tier setups (ELB L7 for SSL termination -> nginx, F5 L4 -> apache SSL proxy -> L7, etc.). You would have 2 loadbalancer controllers, one for AWS and another for haproxy.

Handwaving a bit in this example (sketched in code after the list), with AWS and haproxy loadbalancer controllers running in the cluster:

  1. create svc
  2. create ingresspoint {/foo: svc1, class:haproxy, layer:7}
  3. wait for haproxy loadbalancer controller to allocate an ip
  4. wrap the ip in a service with nodeport (there have been a couple of discussions on how to do this: Simple services with external IPs on bare-metal #10456, DESIGN: External IPs #1161, External IPs support #12561)
  5. create ingresspoint {/foo: nodeportsvc1, class: elb, layer:4}
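
Continuing the handwaving, the two ingress points from steps 2 and 5 might look like this; IngressPoint and its fields are hypothetical, just matching the pseudo-objects in the steps above:

    // Hypothetical resource matching the pseudo-objects in the steps above.
    type IngressPoint struct {
        Class string            // which loadbalancer controller claims this
        Layer int               // 4 or 7
        Paths map[string]string // path -> service name
    }

    // Step 2: the haproxy controller terminates L7 for /foo -> svc1.
    var inner = IngressPoint{
        Class: "haproxy",
        Layer: 7,
        Paths: map[string]string{"/foo": "svc1"},
    }

    // Step 5: the ELB controller provides an L4 entry point in front of the
    // nodeport service that wraps the haproxy IP from steps 3-4.
    var outer = IngressPoint{
        Class: "elb",
        Layer: 4,
        Paths: map[string]string{"/foo": "nodeportsvc1"},
    }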

In other words, the AWS API for load balancing is so limited that I do not think we should even try to constrain the k8s API to fit within it.

This is why I'd like to move away from the current interface/cloud-provider model to a more plugin-centric approach. Each loadbalancer is a different beast and kube should just get out of the way. Most of the points @justinsb mentioned for ELB are true for the GCE LB as well.

@pires
Contributor

pires commented Aug 27, 2015

/cc @mikedanese

https://github.com/kubernetes/contrib/tree/master/service-loadbalancer seems like a nice project to fork and take further to support other load-balancing solutions.

@mikedanese
Member

@bprashanth actually authored that package. I'm just git blamed since I moved it out of the main repo.

@JeanMertz

I haven't seen this mentioned here, but is there any thought on integrating the new HTTPS LBs on GCE?

https://cloud.google.com/compute/docs/load-balancing/http/ssl-certificates?hl=en_US

We currently create separate RCs to host a standard nginx reverse proxy with SSL termination for each "real" service that we host on Kubernetes. This is manageable so far, but leveraging Kubernetes to auto-create a managed LB for us would be much better.

@bprashanth
Contributor

@pires yeah, the real challenge is providing a consistent interface that allows multiple loadbalancers to co-exist in the same cluster. https://github.com/kubernetes/kubernetes/pull/12827/files talks about our efforts in this direction.

@JeanMertz yes - see the TLS bits in the same proposal.

@jayunit100
Member

Related to https://github.com/kubernetes/contrib/tree/master/service-loadbalancer, which proposes resolving this issue as a possible next iteration.

@aronchick aronchick added the kind/feature Categorizes issue or PR as related to a new feature. label Oct 1, 2015
@bgrant0607
Member

Can this be closed in favor of more specific follow-up issues?

@thockin
Member

thockin commented Oct 23, 2015

I'm closing this in favor of the more detailed bugs, now that we have something. Yay!

@thockin thockin closed this as completed Oct 23, 2015
vishh pushed a commit to vishh/kubernetes that referenced this issue Apr 6, 2016
Make machine-id sources flag a comma-separated list.
wking pushed a commit to wking/kubernetes that referenced this issue Jul 21, 2020
Remove node_modules from GitBook
b3atlesfan pushed a commit to b3atlesfan/kubernetes that referenced this issue Feb 5, 2021
Resolving conflicts for pull request kubernetes#561 and adding documentation.
linxiulei pushed a commit to linxiulei/kubernetes that referenced this issue Jan 18, 2024
Added arm64 targets for linux binaries