Community Note
Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request.
Please do not leave "+1" or "me too" comments; they generate extra noise for issue followers and do not help prioritize the request.
If you are interested in working on this issue or have submitted a pull request, please leave a comment.
Tell us about your request
What do you want us to build?
Right now, if a container health check fails for an essential container in an ECS task, the task is stopped and restarted. This is often undesirable, because the health-check failure could be transient: a traffic spike, temporary network unavailability, a resource constraint, a misconfiguration, etc.
One would also want the failed task to remain running for debugging, to understand why the health check is failing.
This is also related to ALB health checks, which behave the same way (#1271 and #289), but this ticket is specifically about container health checks.
I think a flag to control this behaviour (for both container and ALB health-check restarts) would help, so that even if a health check fails, ECS does not stop the task.
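For reference, this is roughly how container health checks are configured in an ECS task definition today (the container name, port, and check command below are illustrative, not from the original report):

```json
{
  "containerDefinitions": [
    {
      "name": "web",
      "essential": true,
      "healthCheck": {
        "command": ["CMD-SHELL", "curl -f http://localhost:8080/healthz || exit 1"],
        "interval": 30,
        "timeout": 5,
        "retries": 3,
        "startPeriod": 60
      }
    }
  ]
}
```

Because the container is marked `essential`, ECS stops the task once the check has failed `retries` consecutive times; there is currently no field in this block to opt out of that stop-and-replace behaviour, which is what this request asks for.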
Which service(s) is this request for?
Fargate, ECS
Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?
Explained above
Are you currently working around this issue?
By not using ECS container health checks, and instead running a third-party sidecar that performs the same checks.
Additional context
Health-check failures already generate a CloudWatch event, which can be used for alerting. With that alert, an engineer can investigate the issue and decide whether to restart or provision additional tasks, without losing the running task, which remains available for debugging.
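As a sketch, an EventBridge (CloudWatch Events) rule for such alerting could match ECS task state changes like the pattern below. The exact `stoppedReason` string is an assumption here and should be verified against real events from your cluster:

```json
{
  "source": ["aws.ecs"],
  "detail-type": ["ECS Task State Change"],
  "detail": {
    "lastStatus": ["STOPPED"],
    "stoppedReason": [{ "prefix": "Task failed container health checks" }]
  }
}
```

A rule with this pattern can target an SNS topic or Lambda function to page an engineer, though by the time it fires the task has already been stopped, which is exactly the limitation this issue describes.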