
Support Failover Load-balancing #1007

Closed
Tester98 opened this issue Dec 31, 2016 · 21 comments · Fixed by #8825
Labels
kind/enhancement a new or improved feature. priority/P2 need to be fixed in the future status/5-frozen-due-to-age
Milestone

Comments

@Tester98

Tester98 commented Dec 31, 2016

What version of Traefik are you using (traefik version)?

v1.1.2

What is your environment & configuration (arguments, toml...)?

linux, simple file backend

[backends]
  [backends.testing]
    [backends.testing.servers.server1]
    url = "http://primary:80"
    weight = 1
    [backends.testing.servers.server2]
    url = "http://backup:80"
    weight = 1

What did you do?

I want to do simple active/passive load balancing. However, I want to use the second server only as a backup (in case of failure or network issues).

What did you expect to see?

I wanted to see a simple case working where failover is supported instead of load balancing.

What did you see instead?

I couldn't figure out how to make this work, despite reading the documentation many times.

The use case is simple: I have 2 servers in a single backend, one primary and one backup. However, I would like to send traffic to the backup only when the primary is not working (a TCP connection could not be made).

I have tested some odd setups, where I add an abnormally high weight to the primary and enable retry. In this case, if the primary is not responding, the retry again chooses the primary instead of the backup. So retry should skip the primary server to actually retry on another one. This is broken.

Can someone help me configure such a case?

@digipigeon

I am having the exact same problem. I want to deploy Traefik to fail over to a different zone (e.g. a different data center).

It actually seems to compound the problem if the weights are set high as it appears that the weights consume retry.attempts.

If I add 8 servers (5 primary, 3 secondary) but all with equal weights, then take the primary ones offline it continues to field requests. But requests will also go to the backup servers.

If I set a higher weight (10) on the primary servers, then take them offline, it seems like the number of retries is "spent" on the weights of the primaries it keeps trying, so it never gets a chance to try the backup servers (well, actually 3/8 requests succeed, as they land on the backups first).

If I set the primary weight as 2 and the secondary as 1, then take the primaries offline, the ratio goes to 3/4 successful. The number of attempts is automatically set to 7, so if a request lands on the first server, all 7 retries are spent on the weights of the 5 failed servers. All of the others work.

Lastly, if I set the primary weight considerably higher (1000) and a high retry count (10000), with the primaries offline it takes 50 seconds to complete the first request.

If I was serving a static file then I could set the primary weight to 50 and secondaries to 1 then set 1000 as the retry count, and maybe drr strategy would help. The only problem is if there is an application level failure I have just DoS'd my own system, so no real solution here.

Would love to hear what I am doing wrong, or how I can configure this better. Thanks.

@mattcollier
Contributor

A HealthCheck feature was added in v1.2 by way of #918 and #1132

https://github.com/containous/traefik/blob/master/docs/basics.md#backends
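For reference, a minimal sketch of what that health check looked like in the v1 file provider, extending the configuration from the original report. Key names varied slightly between v1 releases, so treat this as an approximation and check the docs for your version:

```toml
[backends]
  [backends.testing]
    # Servers failing the check are removed from rotation until they recover.
    # Note: this does not make server2 a passive backup; both servers still
    # receive traffic while healthy, which is why the discussion continued.
    [backends.testing.healthcheck]
    path = "/health"
    interval = "10s"
    [backends.testing.servers.server1]
    url = "http://primary:80"
    weight = 1
    [backends.testing.servers.server2]
    url = "http://backup:80"
    weight = 1
```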

@ldez ldez added the kind/question a question label Apr 21, 2017
@ldez ldez added kind/enhancement a new or improved feature. priority/P2 need to be fixed in the future and removed kind/question a question labels Jun 8, 2017
@mr-manuel

Is there now a way to configure active/passive failover in v2 without setting the weight to a very high value?

@luklss

luklss commented Apr 24, 2020

It would be really nice to have this! Traefik rocks, but I will probably have to choose another solution given this is not supported.

@mr-manuel

Can someone address this issue please?

@sparkacus

I am in need of a similar feature.

An example I gave in #6856 is an AWS target group: if there are no healthy instances, it routes to all registered instances.

Currently if all instances fail a Consul health check, the route is completely removed.

@rdxmb
Contributor

rdxmb commented Jul 27, 2020

I've read about all the mechanisms currently available and am wondering how this could be implemented:

  • CircuitBreaker with a customizable fallback mechanism
  • Supporting a fallback mechanism within LoadBalancing
  • WeightedRoundRobin with a special weight (e.g. a negative integer weight like -1 which would only be used when servers with positive weights like 1 and 3 are not available)

@jdoss

jdoss commented Aug 13, 2020

I've read about all the mechanisms currently available and am wondering how this could be implemented:

  • CircuitBreaker with a customizable fallback mechanism
  • Supporting a fallback mechanism within LoadBalancing
  • WeightedRoundRobin with a special weight (e.g. a negative integer weight like -1 which would only be used when servers with positive weights like 1 and 3 are not available)

I think the Fallback mechanism within LoadBalancing itself or the WeightedRoundRobin with special weight are the best ideas here. I want to be able to load a static page if my downstream servers are having issues. I can't seem to find a way to do this with Traefik in its current form.

@tvld

tvld commented Sep 9, 2020

Would a weight of 0 or -1 not be enough? It would simply mean: never use this server unless the other(s) fail.

@rdxmb
Contributor

rdxmb commented Sep 10, 2020

The weight in RoundRobin is already defined, I would not try to mix that.

As a user, I would prefer to have the weight within LoadBalancer. Maybe it would be even possible to work with positive numbers here - so you could do things like

5 - (main service)
5 - (second main service)
3 - (fallback service - optional)
1 - (maintenance site)

@mr-manuel

Can we push this a little bit more? The first request was almost 4 years ago... There were already good solutions here. Can this be assigned to someone in the Traefik team?

@SantoDE SantoDE self-assigned this Nov 26, 2020
@Ankurkh1

Ankurkh1 commented Feb 25, 2021

Any update on this request please?
Traefik is leaps and bounds better than any reverse proxy solution available. Yet it is missing something as simple and straightforward as an active/passive configuration.
Nginx achieves this with the simple backup keyword on a server:

upstream backend {
    server backend1.example.com;
    server backup1.example.com backup;
}

The definition of backup, per the documentation at https://nginx.org/en/docs/http/ngx_http_upstream_module.html:

backup

  • marks the server as a backup server. It will be passed requests when the primary servers are unavailable.

Can you please consider implementing this in some form? Just because of this one functionality, we would have to use NGINX instead of Traefik :(.

@sbrattla

I'm very much cheering for this feature as well! The use case is multiple data centers: I'd like not to send traffic to other regions unless the servers in the local region are down.

@Per0x

Per0x commented Apr 29, 2021

Any update on this feature? I spent days reading the whole documentation and learning Traefik, only to realize in the end that such a basic but essential function was not possible. For now it seems to be round-robin only :/

@stefaanv

I'm looking for the same feature.
Any idea if/when this will be available?

@jazzmuesli

jazzmuesli commented Oct 29, 2021

I wanted to implement blue/green deployments with resilience on a nomad cluster with consul and traefik, I hope my example helps someone.

I have a nomad cluster with blue/green instances in tomcats/docker and consul for service discovery that is used by traefik. Green instances register themselves in consul with tags
traefik.http.routers.website-green.priority=123
traefik.http.routers.website-green.rule=Host("website") || Host("green.website")

similar for blue instance:

traefik.http.routers.website-blue.priority=120
traefik.http.routers.website-blue.rule=Host("website") || Host("blue.website")

This way traefik routes to green instance by default. In case the green instance dies, traefik will route to the blue instance. This mechanism can also be used for blue/green deployments in general: you deploy the blue instance, check it by accessing blue.website and if it's ok, promote it by increasing blue priority to 126 and later deploy green/increase green priority to 124.

@rdxmb
Contributor

rdxmb commented Oct 30, 2021 via email

@tobiasb

tobiasb commented Nov 5, 2021

How about changing the Retry middleware to retry not on the same backend but on a different one? This would also finally enable us to do zero-downtime deployments while using Docker service discovery.

@ldez ldez unassigned SantoDE Nov 5, 2021
@tobiasb

tobiasb commented Nov 8, 2021

I wanted to implement blue/green deployments with resilience on a nomad cluster with consul and traefik, I hope my example helps someone.

I have a nomad cluster with blue/green instances in tomcats/docker and consul for service discovery that is used by traefik. Green instances register themselves in consul with tags traefik.http.routers.website-green.priority=123 traefik.http.routers.website-green.rule=Host("website") || Host("green.website")

similar for blue instance:

traefik.http.routers.website-blue.priority=120 traefik.http.routers.website-blue.rule=Host("website") || Host("blue.website")

This way traefik routes to green instance by default. In case the green instance dies, traefik will route to the blue instance. This mechanism can also be used for blue/green deployments in general: you deploy the blue instance, check it by accessing blue.website and if it's ok, promote it by increasing blue priority to 126 and later deploy green/increase green priority to 124.

@jazzmuesli this seems to work great, thanks for sharing ❤️ . In addition to the router I had to also "namespace" the service, otherwise Traefik gets confused about differences in configuration of the same service between different backends.

@mannharleen

mannharleen commented Nov 18, 2021

Almost 5 years and no love from the Traefik team. 😕
The workaround from @jazzmuesli does work though

@tobiasb

tobiasb commented Nov 18, 2021

@mannharleen @jazzmuesli It didn't end up working for me after all because backends were added before they were healthy, see #8570

@tomMoulard tomMoulard mentioned this issue Mar 8, 2022
@ddtmachado ddtmachado added this to the next milestone Mar 10, 2022
@kevinpollet kevinpollet changed the title Simple Failover Load-balancing Not Possible ? Support Failover Load-balancing Mar 11, 2022
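For readers landing here later: the fix referenced above (#8825) introduced a dedicated failover service. A sketch of what the original active/passive use case could look like with it in the v2 file provider follows; key names are per the Traefik v2 documentation at the time of writing, so verify them against the docs for your version:

```toml
# A failover service sends all traffic to "main" and switches to
# "backup" only when "main" has no healthy servers left, so the
# health check on "main" is what drives the failover decision.
[http.services.app.failover]
  service = "main"
  fallback = "backup"

[http.services.main.loadBalancer]
  [http.services.main.loadBalancer.healthCheck]
    path = "/health"
    interval = "10s"
  [[http.services.main.loadBalancer.servers]]
    url = "http://primary:80"

[http.services.backup.loadBalancer]
  [[http.services.backup.loadBalancer.servers]]
    url = "http://backup:80"
```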
@traefik traefik locked and limited conversation to collaborators Apr 17, 2022