Feature Request: Allow configuring envoy connection draining #252

Closed
awsiv opened this issue Aug 18, 2020 · 4 comments
Labels
Roadmap: Accepted We are planning on doing this work.

Comments

@awsiv

awsiv commented Aug 18, 2020

If you want to see App Mesh implement this idea, please upvote with a 👍.

Tell us about your request
Allow configuring envoy connection draining
https://www.envoyproxy.io/docs/envoy/latest/intro/arch_overview/operations/draining

Which integration(s) is this request for?
This could be Fargate, ECS

Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?
Currently we are using ALB target groups with ECS for draining connections. This helps us avoid 5xx errors during scale-ins/deployments.

With the move to App Mesh, it would be great if we could use Envoy's connection draining functionality. The default behaviour for Envoy is to close connections immediately. App Mesh should support configuring Envoy's graceful drain period: after the task enters the DEACTIVATING state, it should send a request to Envoy's drain_listeners?graceful admin endpoint, wait for --drain-time-s, and only then send SIGTERM to the container.

From the Envoy documentation: "To add a graceful drain period prior to listeners being closed, use the query parameter drain_listeners?graceful. By default, Envoy will discourage requests for some period of time (as determined by --drain-time-s). The behaviour of request discouraging is determined by the drain manager."

ref: https://www.envoyproxy.io/docs/envoy/latest/intro/arch_overview/operations/draining
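For illustration, the requested shutdown sequence (graceful drain, wait, then SIGTERM) could be sketched roughly as below. This is not how App Mesh implements it. The function name `drain_then_terminate`, the admin address `http://127.0.0.1:9901`, and the 30 second default are assumptions made for the example; the only parts taken from this request are the `drain_listeners?graceful` endpoint and the idea of waiting for the drain period before sending SIGTERM.

```python
import os
import signal
import time
import urllib.request


def drain_then_terminate(
    envoy_pid,
    admin_url="http://127.0.0.1:9901",  # assumed Envoy admin address
    drain_time_s=30,                     # assumed to match Envoy's --drain-time-s
    post=None,
    sleep=time.sleep,
    kill=os.kill,
):
    """Ask Envoy to drain gracefully, wait, then send SIGTERM (hypothetical helper)."""
    if post is None:
        def post(url):
            # Mutating admin endpoints are called with POST.
            req = urllib.request.Request(url, data=b"", method="POST")
            with urllib.request.urlopen(req, timeout=5) as resp:
                return resp.status

    # Step 1: start the graceful drain on the listeners.
    status = post(f"{admin_url}/drain_listeners?graceful")
    # Step 2: give in-flight requests the drain period to finish.
    sleep(drain_time_s)
    # Step 3: only now stop the proxy process.
    kill(envoy_pid, signal.SIGTERM)
    return status
```

The `post`, `sleep`, and `kill` parameters are injectable only so the ordering can be exercised without a running Envoy; in a real task the orchestrator, not a script like this, would normally own the stop sequence.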

Are you currently working around this issue?
Yes, by using the target group deregistration_delay. This unnecessarily complicates our architecture, because draining is the only target group feature we use.
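For readers unfamiliar with the workaround, a minimal sketch of setting the delay with the AWS SDK for Python might look like this. The helper name `set_deregistration_delay` and the ARN placeholder are hypothetical; the sketch assumes an `elbv2` client such as the one returned by `boto3.client("elbv2")`.

```python
def set_deregistration_delay(elbv2_client, target_group_arn, seconds):
    """Set how long a deregistering target keeps draining (hypothetical helper)."""
    # The target group attribute accepts values from 0 to 3600 seconds.
    if not 0 <= seconds <= 3600:
        raise ValueError("deregistration delay must be between 0 and 3600 seconds")
    return elbv2_client.modify_target_group_attributes(
        TargetGroupArn=target_group_arn,
        Attributes=[
            # Attribute values are passed as strings.
            {"Key": "deregistration_delay.timeout_seconds", "Value": str(seconds)},
        ],
    )


# Example usage (requires boto3 and AWS credentials; the ARN is a placeholder):
#   import boto3
#   set_deregistration_delay(boto3.client("elbv2"), "<target-group-arn>", 30)
```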

Additional context
We are evaluating a move to App Mesh.

@awsiv
Author

awsiv commented Aug 19, 2020

Notes on deregistration delay here: aws/containers-roadmap#1039

@MattiasKurvits

How far along are we with this?

@Y0Username
Contributor

@MattiasKurvits We're currently working on building a solution around this. We'll post here when we have more details to share.

@karanvasnani

In App Mesh Envoy image versions later than 1.21.0.0, we've introduced an Agent as part of the image. It acts as a process manager and also facilitates connection draining: when the task is being stopped, it gracefully drains Envoy connections for up to a configurable timeout. Please refer to the documentation here: https://docs.aws.amazon.com/app-mesh/latest/userguide/appmesh-agent.html

Closing the issue. Feel free to reopen it or open a new issue if this feature doesn't solve your use case.


10 participants