AWS::CodeDeploy::DeploymentGroup-DeploymentType Support BLUE_GREEN also for ECS/Fargate #37
Comments
I don't think AWS Blue/Green is compatible with CloudFormation. When CodeDeploy deploys in Blue/Green mode, it creates a new AutoScaling group and, on completion, deletes the old AutoScaling group. It is a hard delete and unrecoverable. This means that managing an AutoScaling group with CloudFormation alongside CodeDeploy Blue/Green does not work well, because when you try to update your AutoScaling group (e.g. update the LaunchConfiguration), it will say the AutoScaling group does not exist and simply fail. |
This has 62 likes on the Containers Roadmap: aws/containers-roadmap#130, and it has been a big issue for mission-critical deployments on ECS with CloudFormation. This can be done with other, non-AWS toolsets. |
Super excited you are working on this 😃 |
Way to go @luiseduardocolon ! |
Looking forward to this so much. Please, let us know as soon as you release something. |
You can now use CloudFormation to perform Amazon ECS blue/green and canary deployments through AWS CodeDeploy! To learn more visit the announcement and the user guide. Note: This issue and #56 still call out valid coverage gaps for using CloudFormation to create a CodeDeploy deployment group, where that deployment group will then be used for doing ECS blue-green deployments directly with CodeDeploy outside of CloudFormation (as described in https://aws.amazon.com/blogs/devops/use-aws-codedeploy-to-implement-blue-green-deployments-for-aws-fargate-and-amazon-ecs/). The new feature released today is used for doing an ECS blue-green deployment during a CloudFormation stack update, orchestrated by CodeDeploy; it is not for doing an ECS blue-green deployment outside of CloudFormation. It doesn't require a |
@clareliguori Does that mean CloudFormation can now do with ECS exactly what CodeDeploy was already capable of, but without using any CodeDeploy resources? |
It still uses CodeDeploy under the hood and follows the same steps as direct CodeDeploy blue-green deployments. It just doesn't require any explicit CodeDeploy resources to be created in the template; instead it is configured via the Hooks and Transform sections. There are some slight differences between the two 'modes': for example, instead of configuring alarms to watch for rollback in CodeDeploy, you configure them on your CloudFormation stack and CFN monitors and rolls back the template. There is an example template here: |
@clareliguori Can you explain what the new 'Hooks' section is doing? How is it different to 'Transforms'? |
@clareliguori There are no alarms defined in the example you linked to. Are you saying that if I define an alarm in the same stack as the blue/green resources and then perform a stack update, it will work similarly to when I tell CodeDeploy to watch an alarm during a deployment? |
Has anyone successfully tested this using an NLB instead of an ALB? I've adjusted the example provided in https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/blue-green.html#blue-green-template-example to use an NLB. The template I use is available here https://github.com/mattias-fjellstrom/ecs-blue-green/blob/master/templates/ecs-blue-green-nlb.yml |
@mattias-fjellstrom re: alarms: No, alarms defined in the stack template are not automatically monitored. CloudFormation has a 'rollback triggers' feature that lets you configure a list of pre-existing alarms for CFN to watch; if one of them fires, CFN rolls back the stack update. I have asked the team to take a look at the Internal failure you're seeing. |
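To make the 'rollback triggers' idea above concrete, here is a minimal sketch of the arguments one might pass to CloudFormation's UpdateStack call. The function name `build_update_stack_kwargs` and the example values are hypothetical; the `RollbackConfiguration` shape follows the UpdateStack API, and the alarms are assumed to already exist outside the stack.

```python
from typing import Any, Dict, List


def build_update_stack_kwargs(
    stack_name: str,
    template_body: str,
    alarm_arns: List[str],
    monitoring_minutes: int = 5,
) -> Dict[str, Any]:
    """Build keyword arguments for an UpdateStack call that uses rollback triggers.

    Hypothetical helper: it only assembles the request and does not call AWS.
    """
    return {
        "StackName": stack_name,
        "TemplateBody": template_body,
        "RollbackConfiguration": {
            # Each trigger points at a pre-existing CloudWatch alarm.
            "RollbackTriggers": [
                {"Arn": arn, "Type": "AWS::CloudWatch::Alarm"} for arn in alarm_arns
            ],
            # How long CloudFormation keeps watching the alarms after the update.
            "MonitoringTimeInMinutes": monitoring_minutes,
        },
    }
```

With boto3 installed, the result could be passed along as `boto3.client("cloudformation").update_stack(**kwargs)`; stacks that create IAM resources would likely also need a `Capabilities` entry.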
We followed the example provided. The stack was created successfully, but when we tried to update the task definition (CPU/memory) we got an error:
|
@mattias-fjellstrom Re: NLB vs ALB. Yes, NLB is supported, but only with the AllAtOnce traffic routing config. The template you provided is set up properly, and I don't see any apparent issues. I can look into the "Internal Failure" you are facing if you DM me your CodeDeploy deployment ID, Account ID and CloudFormation Stack ID. |
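The NLB restriction mentioned above can be checked before a stack update. The following is a rough sketch, assuming the hook's `Properties` have already been parsed into a Python dict; the function name `nlb_routing_problem` and the `uses_nlb` flag are hypothetical and not part of any AWS tooling.

```python
from typing import Any, Dict, Optional


def nlb_routing_problem(hook_properties: Dict[str, Any], uses_nlb: bool) -> Optional[str]:
    """Return a warning if the hook's routing type conflicts with an NLB, else None."""
    if not uses_nlb:
        return None
    routing = hook_properties.get("TrafficRoutingConfig") or {}
    routing_type = routing.get("Type")
    if routing_type is None:
        # No default is assumed here; an explicit value is safer with an NLB.
        return "TrafficRoutingConfig.Type is not set; set it to AllAtOnce for an NLB"
    if routing_type != "AllAtOnce":
        return f"TrafficRoutingConfig.Type is {routing_type}; an NLB needs AllAtOnce"
    return None
```

For example, `nlb_routing_problem({"TrafficRoutingConfig": {"Type": "TimeBasedCanary"}}, uses_nlb=True)` returns a warning string, while the same call with `AllAtOnce` returns `None`.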
@KiamarzFallahi @clareliguori |
Will close this issue, although we understand there might be specific use cases not covered here. Feel free to open new issues for those use cases specifically so we can track them individually, and we'll keep an eye out for them. |
Can you check whether you added the "Transform" section to your template, in addition to the "Hooks" section? As called out in the public docs, both of these sections are needed for CodeDeploy blue/green deployments for ECS. Add a reference to the AWS::CodeDeployBlueGreen transform to your template:
Here is an example with full template - https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/blue-green.html#blue-green-template-example |
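As a quick sanity check for the point above, here is a minimal sketch that reports whether a template declares both sections. It assumes the template has already been loaded into a Python dict (for instance a JSON template read with `json.loads`); the function name `missing_blue_green_sections` is hypothetical.

```python
from typing import Any, Dict, List

TRANSFORM_NAME = "AWS::CodeDeployBlueGreen"
HOOK_TYPE = "AWS::CodeDeploy::BlueGreen"


def missing_blue_green_sections(template: Dict[str, Any]) -> List[str]:
    """Return human-readable problems; an empty list means both sections are present."""
    problems: List[str] = []

    # "Transform" may be a single string or a list of transform names.
    transform = template.get("Transform") or []
    if isinstance(transform, str):
        transform = [transform]
    if TRANSFORM_NAME not in transform:
        problems.append(f"Transform section does not reference {TRANSFORM_NAME}")

    # "Hooks" maps a logical name to a hook definition with a "Type" key.
    hooks = template.get("Hooks") or {}
    has_hook = any(
        isinstance(hook, dict) and hook.get("Type") == HOOK_TYPE
        for hook in hooks.values()
    )
    if not has_hook:
        problems.append(f"Hooks section has no hook of type {HOOK_TYPE}")

    return problems
```

This only checks that the two sections exist. It does not validate the hook's properties and would not catch the "Internal Failure" or "Failed to transform template" errors reported in this thread.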
It should work for TaskDefinition updates as well. Can you post the issue (if different from above) you are seeing? |
I did use the same example you suggested. |
I am experiencing the exact same thing when trying to use an NLB, i.e. error message |
@henrik-utter-wcar I actually sent the question to the AWS business support, the relevant part of the answer is this:
They say this is a workaround, and that the underlying issue will be resolved at some point (not able to give an estimate). I tried the workaround, and it works, however: the behavior during the deployment is not very good. It seems like you get a downtime of 1 to several minutes after the target group is switched. Let me know if you see any different behavior, I only spent an hour investigating without getting anywhere. |
Thanks @mattias-fjellstrom. I will give that a shot, but I assume I will see the same behavior then. |
@henrik-utter-wcar I've also tried that with NLB and seen the exact same behavior. So it seems that no version of blue/green is compatible with an NLB. I really need to use an NLB because my setup looks like this: API Gateway > VPC Link > NLB > ECS. I tried adding an ALB behind the NLB, i.e. API Gateway > VPC Link > NLB > ALB > ECS, and that works fine with blue/green (since in that case the target from the viewpoint of the NLB does not change; instead the target change is between the ALB and ECS), but it feels like a hack that I shouldn't have to use, and it is twice the load balancer cost. I was really hoping this new blue/green feature would solve the underlying issue, but I realize the problem is with the NLB rather than something else. |
@mattias-fjellstrom It seems like we are in the same boat, that is exactly the setup I'm struggling with as well. I also had high hopes for this feature, but I can also verify now that it is still the same issue with the NLB. |
I am not seeing much value in this "CF Macro" solution other than providing an unmanageable Blue/Green deployment option for a simple CloudFormation-managed ECS service that doesn't use CodePipeline. Limitations:
After simplifying my test CF template, I am still unable to successfully update the stack. I keep running into a useless "Failed to transform template" error. I am testing this with an ALB. |
@henrik-utter-wcar @mattias-fjellstrom - can you provide more details about the downtime scenario when you are using NLB? One thing we are aware of: when the "Green" ECS TaskSet is created with the "Green" TargetGroup, the "Green" TargetGroup has not yet been added to the NLB prod listener, so ECS will rely on the NLB health check to determine ECS TaskSet health. The "Green" ECS TaskSet creation is marked as completed once the "Green" ECS TaskSet shows "Stable", and the stack update then proceeds. It's possible that when the "Green" TargetGroup is flipped to serve traffic, the "Green" ECS TaskSet is not ready and cannot pass the NLB health check. This is a known issue; ECS, ELB and CodeDeploy need to work together on a proper improvement. Because ALB does support weighted traffic shifting, we force a weight of 0 for the "Green" TargetGroup at the beginning of the deployment, so that the "Green" ECS TaskSet will not show as "Stable" until the load balancer health check has passed. That's why ALB does not have the same issue. For NLB, the workaround right now is to use a test listener or a CodeDeploy lifecycle hook to perform baking and testing, making sure the "Green" ECS TaskSet is ready to serve traffic before the traffic is actually flipped, which can minimize any potential downtime. |
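The baking step suggested above could be approximated with a polling loop like the one below. This is only a sketch, not AWS's implementation: it assumes the "Green" target group is attached to a test listener so that health checks actually run, and that `elbv2_client` is a boto3 `elbv2` client (or any object with the same `describe_target_health` method). The function name, timeout, and poll interval are hypothetical.

```python
import time
from typing import Any, Callable


def wait_for_healthy_targets(
    elbv2_client: Any,
    target_group_arn: str,
    timeout_seconds: float = 300.0,
    poll_seconds: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll target health until every target is healthy or the timeout expires."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        response = elbv2_client.describe_target_health(TargetGroupArn=target_group_arn)
        states = [
            description["TargetHealth"]["State"]
            for description in response["TargetHealthDescriptions"]
        ]
        # Require at least one registered target, all reporting "healthy".
        if states and all(state == "healthy" for state in states):
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(poll_seconds)
```

A lifecycle hook could run a check like this before traffic is flipped and report failure if it returns `False`. Whether this fully avoids the downtime described in this thread is not confirmed here.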
@yubangxi I'll paste what I wrote to the AWS Support back in September 2019 when I noticed unusual behavior when using regular CodeDeploy blue/green deploy to ECS. The behavior is similar when using this new CloudFormation blue/green approach together with an NLB.
Everything I expect to work in the scenario described above works as intended if I use an ALB instead of an NLB. I have not tested this new CloudFormation approach enough to give a detailed description, but in one of the tests I ran I saw the same behavior with the NLB where the traffic did not reach the new targets even though the traffic was supposed to have been shifted. When I stopped the deployment by cancelling the stack update I expected a rollback to occur, but at that point it seemed like my old targets had completely disappeared and after a while I received 500 responses, similar to what I saw when using the regular CodeDeploy blue/green approach. To me it seems like there is something with the NLB that is not compatible with blue/green deployments like this. As I mentioned in a comment above I tried blue/green deployment in a setup where I had an ALB behind the NLB, and let the NLB just send traffic to the ALB, while the ALB was involved in the blue/green deployment. In that case it worked fine. I guess it was because from the NLB's perspective the targets never changed. It's almost like the NLB has a memory that takes 1-2 minutes to update. |
@mattias-fjellstrom thanks for the details. This is very good and helpful feedback. We will do some related investigation on our side and post an update here if we find anything. |
I am having the exact same issue with an NLB. Is NLB support pretty much alpha? |
@vladiscovery I verified that changes to only the CPU or memory property, or to both, on the “AWS::ECS::TaskDefinition” resource trigger B/G deployments, and that they execute successfully without any issues, similar to an “image” property update. I used “canary” style traffic shifting (the example linked in the docs uses “AllAtOnce”), but that should show similar behavior. For your reference, I am sharing sample templates where only the ‘Memory’ property is changed and a B/G deployment is triggered: create-stack template: https://cfnda-datalake.s3.amazonaws.com/ecsbg/create-stack-canary-lc.yaml Can you share the templates or snippets where you saw the error when you changed the CPU/memory property, so that we can debug the issue you saw? Please share both the existing and the new templates. |
@anugarg07 , I ran the example from the AWS doc, and changing the "Image" property of the container definition also triggers the error that @vladiscovery had... |
@anugarg07 Thanks for the update and the example templates. I was able to get those stacks up and modify the memory property on the
I'm also unable to update that value manually through the console. Am I missing something here? |
Tried to get the demo deployed and working today: https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/blue-green.html I can get the first set of resources deployed, but upon updating the image to a new version I receive the error "Template parameters modified by transform". I can't tell whether this is "working as desired", but if it is, this is useless. It should also be noted that the YAML demo Hook is missing the |
@anugarg07 How do I achieve Service auto scaling with EXTERNAL deployment controller? The ECS user guide says AutoScaling is not supported with EXTERNAL controller. When I use CODE_DEPLOY as controller, the hooks section throws an error saying TaskSet is not usable with CODE_DEPLOY. How do I solve this? |
Failed with this error when there are multiple parameters.
|
I'm getting the same error as @flyinprogrammer, |
@TomLBarden |
Unfortunately, none of this has helped, @ta-takeuchi. The issue itself is so vague, and I can't find any documentation as to what's causing it. |
We've identified an issue with the Blue/Green transform involving boolean parsing which leads to those |
Hey @JeremySB, I am facing the same error, "Template parameters modified by transform", when updating the template below. I had already created an empty cluster before deploying the stack, and while updating I am only changing the "ImageTag" |
Can we reopen this? I am hitting an internal error problem when just changing the image property in my task definition. This basically makes the CF a one time usage without the ability to update anything. |
Here is a link to the gist. I am using ALB and running into the mysterious internal error problem. |
Blue/Green conversion issues, including Boolean parsing, still seem to occur. I shared the template with the support team. Best regards. |
If it helps anyone, I was getting |
I am facing the "Failed to transform template" error. Has anyone fixed it? |
Issue is not solved yet.
|
If it can help I was facing this
would only proceed when I'd replace all my parameters Refs by hard coded values except for
|