Support Failover Load-balancing #1007
Comments
I am having the exact same problem. I want to deploy Traefik to fail over to a different zone (e.g. a different data center). Setting the weights high actually seems to compound the problem, because the weights appear to consume retry.attempts.

If I add 8 servers (5 primary, 3 secondary), all with equal weights, and then take the primary ones offline, Traefik continues to field requests, but requests also go to the backup servers. If I set a higher weight (10) on the primary servers and then take them offline, the retry budget seems to be "spent" on the weights of the failed primaries, so the backup servers never get a chance to be tried (well, actually 3/8 requests succeed, when they happen to land on a backup first). If I set the primary weight to 2 and the secondary to 1, then take the primaries offline, the ratio goes to 3/4 successful. attempts is automatically set to 7, so if a request lands on the first server, all 7 retries are spent on the weights of the 5 failed servers; all of the others work. Lastly, if I set the primary weight considerably higher (1000) with a high retry count (10000) and the primaries offline, it took 50 seconds to complete the first request.

If I were serving a static file I could set the primary weight to 50 and the secondaries to 1, then set the retry count to 1000, and maybe the drr strategy would help. The only problem is that on an application-level failure I would have just DoS'd my own system, so no real solution here.

Would love to hear what I am doing wrong, or how I can configure this better. Thanks.
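For reference, a minimal sketch of the weighted primary/backup setup described in this comment, as a Traefik v1 file-backend TOML fragment. The hostnames, ports, and backend name are placeholders; the weights and retry count mirror the comment:

```toml
# Global retry: failed requests are re-attempted on another server
# of the same backend.
[retry]
attempts = 7   # the value observed as the default in the comment above

[backends]
  [backends.app]
    # Primary server, heavily weighted.
    [backends.app.servers.primary1]
      url = "http://primary1.internal:80"
      weight = 10
    # Backup server, lightly weighted. It still receives a share of
    # normal traffic; there is no way to mark it "backup only", which
    # is exactly the problem this issue is about.
    [backends.app.servers.backup1]
      url = "http://backup1.internal:80"
      weight = 1
```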
A HealthCheck feature was added in v1.2 by way of #918 and #1132: https://github.com/containous/traefik/blob/master/docs/basics.md#backends
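For anyone landing here, a minimal sketch of that v1.2 health check in file-backend TOML (the path, interval, and hostname are example values): servers failing the check are removed from the rotation until they pass again.

```toml
[backends]
  [backends.app]
    # Servers that fail GET /health are taken out of rotation.
    [backends.app.healthcheck]
      path = "/health"
      interval = "10s"
    [backends.app.servers.server1]
      url = "http://primary1.internal:80"
      weight = 1
```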
Is it now possible to configure active/passive failover in v2 without setting the weight to a very high value?
It would be really nice to have this! Traefik rocks, but I will probably have to choose another solution given this is not supported.
Can someone address this issue please?
I am in need of a similar feature. An example I gave in #6856 is an AWS target group: if there are no healthy targets, it routes to all registered instances. Currently, if all instances fail a Consul health check, the route is completely removed.
I've read about all the mechanisms currently available and am wondering how this could be implemented:
I think a Fallback mechanism within LoadBalancing itself, or WeightedRoundRobin with a special weight, are the best ideas here. I want to be able to serve a static page when my downstream servers are having issues, and I can't find a way to do this with Traefik in its current form.
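To make the request concrete, here is a hypothetical sketch of what such a fallback could look like in v2 file-provider TOML. The `fallback` key does not exist in Traefik; it only illustrates the behaviour being asked for, and the service names and URLs are made up:

```toml
[http.services.app.loadBalancer]
  # HYPOTHETICAL: "fallback" is not real Traefik syntax. It names a
  # service to be used only when every server below is unhealthy.
  fallback = "maintenance-page"
  [[http.services.app.loadBalancer.servers]]
    url = "http://app1.internal:8080"
  [[http.services.app.loadBalancer.servers]]
    url = "http://app2.internal:8080"

[http.services.maintenance-page.loadBalancer]
  [[http.services.maintenance-page.loadBalancer.servers]]
    # Serves the static "we'll be right back" page.
    url = "http://static.internal:8080"
```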
Would a weight of 0 or -1 not be enough? It would simply mean: never use this server unless the other(s) fail.
The weight in […]. As a user, I would prefer to have the weight within […]
Can we push this a little bit more? The first request was almost 4 years ago... Good solutions have already been proposed here. Can this be assigned to someone on the Traefik team?
Any update on this request please? NGINX supports this with the backup parameter on an upstream server, e.g.:

```
upstream backend {
    server primary.example.com;
    server secondary.example.com backup;
}
```

The definition of backup as per the documentation here https://nginx.org/en/docs/http/ngx_http_upstream_module.html:

> backup: marks the server as a backup server. It will be passed requests when the primary servers are unavailable.
Can you please consider implementing this in some form? Because of this one missing feature, we would have to use NGINX instead of Traefik :(.
I'm very much cheering for this feature as well! The use case is multiple data centers: I'd like not to send traffic to other regions unless the servers in the local region are down.
Any update about this feature? I spent days reading the whole doc and understanding Traefik, only to realize in the end that such a basic but essential function is not possible. For now it seems to be round-robin only :/
I'm looking for the same feature
I wanted to implement blue/green deployments with resilience on a Nomad cluster with Consul and Traefik; I hope my example helps someone.

I have a Nomad cluster with blue/green instances in Tomcat/Docker, and Consul for service discovery, which is used by Traefik. Green instances register themselves in Consul with priority tags, and similarly for the blue instance:

traefik.http.routers.website-blue.priority=120

This way Traefik routes to the green instance by default. In case the green instance dies, Traefik will route to the blue instance.

This mechanism can also be used for blue/green deployments in general: you deploy the blue instance, check it by accessing blue.website and, if it's OK, promote it by increasing the blue priority to 126; later you deploy green and increase the green priority to 124. A sketch of the full tag set follows below.
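A sketch of the tag pairs this comment implies. Only the blue priority of 120 comes from the comment itself; the shared router rule, the hostname, and the green priority of 124 are illustrative assumptions. Both routers match the same rule, Traefik always picks the highest-priority router, and when the green service fails its Consul health check it is deregistered, leaving only the blue router:

```
# Green (default) instance -- priority value is an assumption
traefik.enable=true
traefik.http.routers.website-green.rule=Host(`website.example.com`)
traefik.http.routers.website-green.priority=124

# Blue (backup) instance -- priority 120 as stated in the comment
traefik.enable=true
traefik.http.routers.website-blue.rule=Host(`website.example.com`)
traefik.http.routers.website-blue.priority=120
```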
Sounds great. I will try 👍
How about changing the […]
@jazzmuesli this seems to work great, thanks for sharing ❤️. In addition to the […]
Almost 5 years and no love from the Traefik team. 😕
@mannharleen @jazzmuesli It didn't end up working for me after all, because backends were added before they were healthy, see #8570
What version of Traefik are you using (traefik version)?

v1.1.2
What is your environment & configuration (arguments, toml...)?
Linux, simple file backend
What did you do?
I want to do simple active/passive load balancing; however, I want to use one server only as a backup (in case of failure or network issues).
What did you expect to see?
Wanted to see a simple case working where failover is supported instead of load balancing.
What did you see instead?
Couldn't figure out how it would work, even after reading the documentation N times.
The use case is simple: I have 2 servers in a single backend, one primary and one backup. However, I would like to send traffic to the backup only when the primary is not working (a TCP connection could not be made).
I have tested some odd setups, where I add an abnormally high weight to the primary and set retry to yes. In this case, if the primary is not responding, the retry again chooses the primary instead of the backup. So retry should be skipping the primary server to actually retry; this is broken.
Can someone help me configure such a case?