Drain a backend server #41

Open
andrioid opened this Issue Oct 7, 2015 · 9 comments

andrioid commented Oct 7, 2015

Many load-balancers have this feature. Example for AWS: http://docs.aws.amazon.com/ElasticLoadBalancing/latest/DeveloperGuide/config-conn-drain.html

I would like to be able to set /backends/awesomebackend/servers/server2/drain 1

The effect would be that no new clients should be sent towards this backend. Ideally the proxy should communicate back when no existing clients remain.

Use Case

Deployment server could launch the new environment, set the old environment to "drain=1" and that way we could deploy without disturbing existing connections.

This sounds a bit out of scope for an auto-configured reverse proxy, but if you're going to implement #5, then maybe consider this.
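To make the requested behaviour concrete, here is a minimal Go sketch of the drain semantics described above: a draining server gets no new clients, and the pool can report when its existing connections are gone. Every name in it (`Pool`, `Server`, `SetDrain`, `Drained`) is hypothetical and only illustrates the idea; none of it is Traefik's actual API.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// Server is a hypothetical backend entry; field names are illustrative only.
type Server struct {
	Name     string
	Draining bool // when true, no new clients are assigned
	Active   int  // connections currently in flight
}

// Pool hands out servers round-robin and skips draining ones.
type Pool struct {
	mu      sync.Mutex
	servers []*Server
	next    int
}

// ErrNoServer is returned when every server is draining (or the pool is empty).
var ErrNoServer = errors.New("no server accepts new connections")

// NewPool builds a pool with one non-draining server per name.
func NewPool(names ...string) *Pool {
	p := &Pool{}
	for _, n := range names {
		p.servers = append(p.servers, &Server{Name: n})
	}
	return p
}

// Acquire picks the next non-draining server and counts the new connection.
func (p *Pool) Acquire() (*Server, error) {
	p.mu.Lock()
	defer p.mu.Unlock()
	for i := 0; i < len(p.servers); i++ {
		s := p.servers[(p.next+i)%len(p.servers)]
		if !s.Draining {
			p.next = (p.next + i + 1) % len(p.servers)
			s.Active++
			return s, nil
		}
	}
	return nil, ErrNoServer
}

// Release marks one connection on s as finished.
func (p *Pool) Release(s *Server) {
	p.mu.Lock()
	defer p.mu.Unlock()
	if s.Active > 0 {
		s.Active--
	}
}

// SetDrain toggles draining for the named server.
func (p *Pool) SetDrain(name string, drain bool) {
	p.mu.Lock()
	defer p.mu.Unlock()
	for _, s := range p.servers {
		if s.Name == name {
			s.Draining = drain
		}
	}
}

// Drained reports whether the named server is draining and has no connections left.
func (p *Pool) Drained(name string) bool {
	p.mu.Lock()
	defer p.mu.Unlock()
	for _, s := range p.servers {
		if s.Name == name {
			return s.Draining && s.Active == 0
		}
	}
	return false
}

func main() {
	p := NewPool("server1", "server2")
	p.Acquire()           // goes to server1
	old, _ := p.Acquire() // goes to server2
	p.SetDrain("server2", true)
	fmt.Println(p.Drained("server2")) // false: one connection is still open
	p.Release(old)
	fmt.Println(p.Drained("server2")) // true: safe to remove server2
}
```

A deployment script could poll something like `Drained` before tearing down the old environment.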

Member

emilevauge commented Oct 7, 2015

Hi! This is in the pipeline :) It will be possible to remove a server from a backend using the API. The existing connections will then be shut down gracefully.

psi-4ward commented Oct 24, 2016

anything new here?

domino14 commented Nov 24, 2016

How could this not be implemented? There's no way to have a zero-downtime deploy of backend services?

Member

emilevauge commented Nov 24, 2016

@domino14 how have you NOT submitted a PR on this yet? 😇

commarla commented Feb 14, 2017

I think @samber can do it easily ^^

Member

timoreimann commented Mar 24, 2017

A workaround currently possible (at the price of increased latency) is to enable retrying and make sure there are enough surplus servers available to compensate for the "draining" server.

Specific to the Kubernetes provider, the canonical way to achieve draining is to have a readiness probe implemented by the backing pods. Support from Traefik won't be necessary.

Marathon also supports readiness probes (called readiness checks there), but some degree of support might be needed in Traefik.
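As a rough illustration of the readiness-probe approach, the Go sketch below shows a backend that answers its probe with 200 until it is told to drain and with 503 afterwards, so the orchestrator stops sending it new traffic. The `Readiness` type, the `/ready` path and the `probeStatus` helper are assumptions made for this example, not part of Traefik, Kubernetes or Marathon.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// Readiness is a hypothetical probe handler for a backend instance.
type Readiness struct {
	draining int32 // 0 = ready, 1 = draining; accessed atomically
}

// StartDraining makes every later probe fail, e.g. when SIGTERM is received.
func (r *Readiness) StartDraining() {
	atomic.StoreInt32(&r.draining, 1)
}

// ServeHTTP answers 200 while ready and 503 once draining has started.
func (r *Readiness) ServeHTTP(w http.ResponseWriter, _ *http.Request) {
	if atomic.LoadInt32(&r.draining) == 1 {
		http.Error(w, "draining", http.StatusServiceUnavailable)
		return
	}
	w.WriteHeader(http.StatusOK)
}

// probeStatus issues one in-memory probe request and returns the status code.
func probeStatus(h http.Handler) int {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/ready", nil))
	return rec.Code
}

func main() {
	r := &Readiness{}
	fmt.Println(probeStatus(r)) // 200
	r.StartDraining()
	fmt.Println(probeStatus(r)) // 503
}
```

In a real service the handler would be mounted on the probe path, and the process would keep serving in-flight requests for a grace period after `StartDraining` before exiting.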

atecey commented Jul 27, 2017

Would it be possible to add a --drain=true flag to a Docker Swarm service, which Traefik picks up so that it stops routing new requests to that service? The use case would be:
I have a service running (ServiceA version 1) using the frontend rule (mydomain.com). I then bring ServiceA version 2 online on the same frontend rule. With the --drain flag specified on version 1, Traefik will stop routing new requests to it but honour existing connections. All new requests will go to version 2, allowing zero downtime.

Member

timoreimann commented Jul 27, 2017

To me, a very logical mechanism would be to use weights and define a weight of zero to mean "send no more traffic to this backend". #1780 plays a role here.
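A small Go sketch of the zero-weight idea, under the assumption of a simple deterministic weighted rotation: a server with weight 0 is never selected, while the others share traffic in proportion to their weights. `WeightedServer` and `Pick` are hypothetical names for illustration and do not reflect how Traefik's load balancer is written.

```go
package main

import "fmt"

// WeightedServer is a hypothetical backend with a routing weight.
type WeightedServer struct {
	Name   string
	Weight int // 0 means "send no more traffic to this backend"
}

// Pick returns the server for the n-th request (n >= 0).
// Servers with a weight of zero or less are never chosen.
// It returns false when no server has a positive weight.
func Pick(servers []WeightedServer, n int) (string, bool) {
	total := 0
	for _, s := range servers {
		if s.Weight > 0 {
			total += s.Weight
		}
	}
	if total == 0 {
		return "", false
	}
	slot := n % total
	for _, s := range servers {
		if s.Weight <= 0 {
			continue
		}
		if slot < s.Weight {
			return s.Name, true
		}
		slot -= s.Weight
	}
	return "", false
}

func main() {
	servers := []WeightedServer{
		{Name: "v1", Weight: 0}, // being drained
		{Name: "v2", Weight: 1},
	}
	for n := 0; n < 3; n++ {
		name, _ := Pick(servers, n)
		fmt.Println(name) // always v2
	}
}
```

Note that this only stops new requests; connections already open to the zero-weight server still have to be allowed to finish.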

marco-jantke added a commit to marco-jantke/traefik that referenced this issue Aug 22, 2017

kristinn commented Mar 7, 2018

I'm wondering what the current status of this issue is?

Is there a way to achieve this using the current version of Træfik, or are there any plans on implementing this functionality?

My use case is Docker Swarm.

I just looked through the source code for the Docker provider and I noticed this comment https://github.com/containous/traefik/blob/master/provider/docker/docker.go#L159.
It seems Docker didn't have any support for listening for Swarm events, until version 17.06 (see moby/moby#23827 (comment)).

I was thinking about implementing an events listener in Træfik for Docker, instead of the one today that polls a list of services every 15 seconds (by default), that is, if there is no one working on this issue.

I'm just wondering if building the support right into the Docker provider is the right way to go about this, or if there needs to be a more generic solution (if it's at all possible to do in a proper way)?
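The shape of such a listener could look roughly like the Go sketch below: state changes are applied as soon as an event arrives, and a periodic full poll still acts as a safety net. The `Event` type, the `Watch` function and the task names are made up for this example. It does not use the Docker client library, and the events channel stands in for whatever the Docker API would deliver.

```go
package main

import (
	"fmt"
	"time"
)

// Event is a hypothetical, simplified state change for one task.
type Event struct {
	Task  string
	State string // only "running" tasks are considered routable
}

// Watch applies events immediately and rebuilds the full state on every tick.
// poll returns the state of every known task. Watch returns the set of
// routable tasks once the events channel is closed.
func Watch(events <-chan Event, ticks <-chan time.Time, poll func() map[string]string) map[string]bool {
	routable := map[string]bool{}
	for {
		select {
		case ev, ok := <-events:
			if !ok {
				return routable
			}
			if ev.State == "running" {
				routable[ev.Task] = true
			} else {
				delete(routable, ev.Task) // stop routing right away
			}
		case <-ticks:
			// Periodic resync in case an event was missed.
			routable = map[string]bool{}
			for task, state := range poll() {
				if state == "running" {
					routable[task] = true
				}
			}
		}
	}
}

func main() {
	events := make(chan Event, 3)
	events <- Event{Task: "web.1", State: "running"}
	events <- Event{Task: "web.2", State: "running"}
	events <- Event{Task: "web.1", State: "shutdown"}
	close(events)

	// A nil channel never fires, so the demo disables polling. A real loop
	// could pass time.NewTicker(15 * time.Second).C instead.
	var ticks <-chan time.Time
	routable := Watch(events, ticks, func() map[string]string { return nil })
	fmt.Println(routable) // map[web.2:true]
}
```

Keeping the poll alongside the events means a missed event only delays the update until the next tick.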

kristinn added a commit to inteleon/traefik that referenced this issue Mar 16, 2018

Docker Swarm: Support for real time event listening.
These changes provide support for load balancer draining for Docker
Swarm. Note, the containers and services should also support graceful
shutdowns.

This change makes sure Traefik stops routing, almost instantly, traffic
to containers that are not in the "running" state.
We still poll every 15 seconds.

This will not work with Docker versions earlier than 17.06 (moby/moby#23827 (comment)).
I did not spend time on backwards compatibility. Please tell me if that's a
requirement.

Related issue: containous#41

Results from some tests I did locally on my Swarm cluster, using the official Traefik
Docker image from the date of the testing (15th of March 2018), versus the
patched Traefik binary. The file names describe what is being tested.

https://gist.github.com/kristinn/e3c450b71aa3898f39fea20abe87bade

kristinn added a commit to inteleon/traefik that referenced this issue Mar 16, 2018

Docker Swarm: Support for real time event listening.
These changes provide support for load balancer draining for Docker
Swarm. Note, the containers and services should also support graceful
shutdowns.

This change makes sure Traefik stops routing, almost instantly, traffic
to containers that are not in the "running" state.
We still poll every 15 seconds.

We require a Docker Swarm load balancer that supports connection
draining.

Related issue: containous#41

This will not work with Docker versions earlier than 17.06 (moby/moby#23827 (comment)).
I did not spend time on backwards compatibility. Please tell me if that's a
requirement.

Results from some tests I did locally on my Swarm cluster, using the official Traefik
Docker image from the date of the testing (15th of March 2018), versus the
patched Traefik binary. The file names describe what is being tested.

https://gist.github.com/kristinn/e3c450b71aa3898f39fea20abe87bade

kristinn added a commit to inteleon/traefik that referenced this issue Mar 16, 2018

Docker Swarm: Support for real time event listening.
What is being changed:

These changes provide support for load balancer draining for Docker
Swarm. Note, the containers and services should also support graceful
shutdowns.

This change makes sure Traefik stops routing, almost instantly, traffic
to containers that are not in the "running" state.
We still poll every 15 seconds.

Motivation:

We require a Docker Swarm load balancer that supports connection
draining.

Related issue: containous#41

Additional information:

Breaking changes:

This will not work with Docker versions earlier than 17.06 (moby/moby#23827 (comment)).
I did not spend time on backwards compatibility. Please tell me if that's a
requirement.

Stress testing results:

Results from some tests I did locally on my Swarm cluster, using the official Traefik
Docker image from the date of the testing (15th of March 2018), versus the
patched Traefik binary. The file names describe what is being tested.

https://gist.github.com/kristinn/e3c450b71aa3898f39fea20abe87bade

kristinn added a commit to inteleon/traefik that referenced this issue Mar 16, 2018

kristinn added a commit to inteleon/traefik that referenced this issue Mar 19, 2018

Docker Swarm: Support for real time event listening.
What is being changed:

These changes provide support for load balancer draining for Docker
Swarm. Note, the containers and services should also support graceful
shutdowns.

This change makes sure Traefik stops routing, almost instantly, traffic
to containers that are not in the "running" state.
We still poll every 15 seconds.

Motivation:

We require a Docker Swarm load balancer that supports connection
draining.

Related issue: containous#41

Additional information:

These changes do not break backwards compatibility.

Stress testing results:

Results from some tests I did locally on my Swarm cluster, using the official Traefik
Docker image from the date of the testing (15th of March 2018), versus the
patched Traefik binary. The file names describe what is being tested.

https://gist.github.com/kristinn/e3c450b71aa3898f39fea20abe87bade

kristinn added a commit to inteleon/traefik that referenced this issue Mar 19, 2018

Docker Swarm: Support for real time event listening.
What is being changed:

These changes provide support for load balancer draining for Docker
Swarm. Note, the containers and services should also support graceful
shutdowns.

This change makes sure Traefik stops routing, almost instantly, traffic
to containers that are not in the "running" state.
We still poll every 15 seconds.

Motivation:

We require a Docker Swarm load balancer that supports connection
draining.

Related issues:
containous#41
containous#3035

Additional information:

These changes do not break backwards compatibility.

Stress testing results:

Results from some tests I did locally on my Swarm cluster, using the official Traefik
Docker image from the date of the testing (15th of March 2018), versus the
patched Traefik binary. The file names describe what is being tested.

https://gist.github.com/kristinn/e3c450b71aa3898f39fea20abe87bade

kristinn added a commit to inteleon/traefik that referenced this issue Apr 6, 2018

kristinn added a commit to inteleon/traefik that referenced this issue Apr 16, 2018

kristinn added a commit to inteleon/traefik that referenced this issue Apr 25, 2018

nmengin added a commit to inteleon/traefik that referenced this issue Jul 12, 2018
