[ECS] [CodeDeploy] [CloudFormation]: CloudFormation support for BLUE/GREEN deployments on ECS #130

Closed
ilyasotkov opened this issue Jan 24, 2019 · 48 comments
Labels
ECS

Comments


@ilyasotkov ilyasotkov commented Jan 24, 2019

The feature was announced in November: https://www.youtube.com/watch?v=01ewawuL-IY

For blue/green deployments, AWS CloudFormation supports deployments on AWS Lambda compute platforms only.

https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-codedeploy-deploymentgroup.html

Not exactly sure whether AWS::ECS::Service.DeploymentConfiguration would be affected as well.

@ilyasotkov ilyasotkov added the Proposed label Jan 24, 2019
@abby-fuller abby-fuller added the ECS label Jan 25, 2019

@ghost ghost commented Feb 13, 2019

It would be good to get more info on this and on what the plans are.


@ngamradt-turner ngamradt-turner commented Mar 1, 2019

We would like to start experimenting with this feature, but without CloudFormation support, it is not practical for us to use.


@ghost ghost commented Mar 1, 2019

> We would like to start experimenting with this feature, but without CloudFormation support, it is not practical for us to use.

We have been promised that CFN will be getting first-class treatment going forward ... we shall see :)


@deleugpn deleugpn commented Mar 1, 2019

@steven-cuthill-otm who promised that?


@ghost ghost commented Mar 1, 2019

> @steven-cuthill-otm who promised that?

Jeff Barr, have a read ...
https://www.reddit.com/r/aws/comments/avx449/a_quick_cloudformation_update/


@deleugpn deleugpn commented Mar 1, 2019

I read that thread and didn't see any promise of first-class CloudFormation support for resources. The wording says they'll open their roadmap so that we can tell them what we would like prioritized, which to me does not translate to first-class treatment.


@sarojgharat sarojgharat commented Mar 11, 2019

Any update on this ?


@mmoulton mmoulton commented Mar 26, 2019

I'm also very interested in this capability. Specifically I'd like to use it with CDK (aws/aws-cdk#2056)


@angusfz angusfz commented Mar 28, 2019

Waiting for CFN support


@nitzanyemal123 nitzanyemal123 commented Apr 15, 2019

+1 For CF support


@mrbueno92 mrbueno92 commented May 8, 2019

+1 For CF support


@sarojgharat sarojgharat commented May 8, 2019


@andrewhiles andrewhiles commented May 14, 2019

Do we know if there's been any movement with this?


@JoeAlamo JoeAlamo commented May 21, 2019

This would be a killer feature to have.

@coultn coultn moved this from Researching to We're Working On It in containers-roadmap May 21, 2019

@paddie paddie commented Jun 17, 2019

Yep, blocked by this as well. I think that this article summarises pretty clearly why this is not a trivial thing for CF to handle:

> It turns out that there’s actually a very good reason why this is not a supported configuration. CodeDeploy is going to be referring to an Auto Scaling Group, and it’s likely that you have defined that ASG in your CloudFormation template, and used !Ref to link the deployment group to it.
>
> The problem comes when you do your first blue-green deployment. CodeDeploy does a blue-green deployment by cloning the ASG to make a new one with the same parameters, but whose instances run the newer version of your application. Once the new version is deployed and healthy, it winds down the old ASG. Then, it updates the deployment group configuration to point to the new ASG, and deletes the old ASG.
>
> This is fine until you go back to your CloudFormation template and need to make a change that touches the ASG. CloudFormation will try to update the ASG that it created, only to find it doesn’t exist any more. Boom, template update failure and rollback.


@ScOut3R ScOut3R commented Jun 17, 2019

For ECS blue/green deployments, CodeDeploy uses two already existing Target Groups to switch traffic back and forth. Although the referenced article is accurate, it describes EC2-based blue/green deployments, not ECS ones.
I have a service where I deployed almost everything via CloudFormation, including the two Target Groups used for blue/green deployments. I provisioned the CodeDeploy side with Terraform and everything works quite neatly. Since CodeDeploy does not destroy, clone, or recreate the Target Groups, both parties are happy.
Apart from capacity concerns, as a consumer I don't see any blockers for ECS support.


@okram999 okram999 commented Jul 16, 2019

@ScOut3R - how does the service know that the deployment type is blue/green if you configure it using CloudFormation? The option to specify the deployment type as blue/green is where the challenge starts, no?


@ScOut3R ScOut3R commented Jul 16, 2019

@okram999 The blue/green CodeDeploy configuration is deployed by Terraform. I pass the ECS Service and the two Target Groups (which are built by CloudFormation) as input variables to Terraform, which builds the CodeDeploy and CodePipeline configuration.


@noahxp noahxp commented Aug 22, 2019

+1 For CF support


@pferretti pferretti commented Sep 2, 2019

+1 For CF support


@ngamradt-turner ngamradt-turner commented Sep 6, 2019

Is there an update on this feature?


@mwarkentin mwarkentin commented Sep 26, 2019

It is probably helpful to add your votes to, and follow, the related issues on the CloudFormation roadmap:


@dslove dslove commented Oct 9, 2019

I need this too.


@ramanasatyavolu ramanasatyavolu commented Nov 4, 2019

Is there a feature in CloudFormation to launch a blue/green deployment through ECS?


@bubeamos bubeamos commented Jan 21, 2020

Hi @coultn, I see that there is a property value CODE_DEPLOY in the "DeploymentController:" property docs here. Does this mean that this is now available, and that this issue needs to be updated?


@mwarkentin mwarkentin commented Feb 7, 2020

Also hoping that this will work with the new CodeDeploy canary deployment strategies!

#229


@richardmaltais richardmaltais commented Feb 7, 2020

@paddie For that specific case, blue/green with CodeDeploy on EC2 instances, we used a custom setup that keeps the original ASG. The CodeDeploy-generated ASGs are switched over as usual and drained when needed, and the updated LaunchConfiguration is also "copied" over through a Lambda, so that CloudFormation doesn't freak out when the time comes to update it. However, it comes with a few limitations:

  • it is not possible to change the capacity of the ASG (the CodeDeploy-generated one)
  • we have to set the capacity to 0 when applying changes to the LaunchConfiguration and the ASG. Then, the Lambdas do the rest

We have not yet experimented with resource import for the CodeDeploy-generated ASG.


@tuttlem tuttlem commented May 12, 2020

+1 For CF support


@KiamarzFallahi KiamarzFallahi commented May 19, 2020

We are excited to announce you can now use AWS CloudFormation to perform Amazon ECS blue/green and canary deployments through AWS CodeDeploy. Blue/green deployments are a safe deployment strategy provided by AWS CodeDeploy for minimizing interruptions caused by changing application versions.

To learn more visit our announcement and the user guide.

containers-roadmap automation moved this from Coming Soon to Just Shipped May 19, 2020

@sethstone sethstone commented May 27, 2020

I've been tracking this ticket for a few months and I was excited to see it close. However, the solution provided did not solve the problem I expected it to. It seems that the new capability allows one to use CloudFormation as the deployment controller: you monitor the deployment in CloudFormation rather than in the CodeDeploy console, and you release changes through stack updates. This may be exactly what others were looking for, but I was expecting to be able to create a deployment group (with CF) using the ECS blue/green deployment type and then manage the deployment process (once created by CF) in the CodeDeploy console.

Here's an example that I hoped would work after this change

  DeploymentGroup:
    Type: AWS::CodeDeploy::DeploymentGroup
    Properties:
      ApplicationName: !Ref CodeDeployApplication
      AutoRollbackConfiguration:
        Enabled: true
        Events:
          - DEPLOYMENT_FAILURE
      # This is the property CloudFormation rejected (see the error below).
      BlueGreenDeploymentConfiguration:
        DeploymentReadyOption:
          ActionOnTimeout: CONTINUE_DEPLOYMENT
          WaitTimeInMinutes: 0
        TerminateBlueInstancesOnDeploymentSuccess:
          Action: TERMINATE
          TerminationWaitTimeInMinutes: 5
      DeploymentConfigName: CodeDeployDefault.ECSAllAtOnce
      DeploymentGroupName: !Ref CodeDeployDeploymentGroupName
      DeploymentStyle:
        DeploymentOption: WITH_TRAFFIC_CONTROL
        DeploymentType: BLUE_GREEN
      LoadBalancerInfo:
        # A list holding one pair of target groups (blue and green).
        TargetGroupPairInfoList:
          - TargetGroups:
              - Name: !Sub ${ProjectPrefix}-tg1
              - Name: !Sub ${ProjectPrefix}-tg2
            ProdTrafficRoute:
              ListenerArns:
                - !Ref ALBProductionListener
      ServiceRoleArn: !GetAtt CodeDeployServiceRole.Arn
      ECSServices:
        # One entry that names both the service and its cluster.
        - ServiceName: !Sub ${ProjectPrefix}-service
          ClusterName: !Sub ${ProjectPrefix}-ecs-fargate-cluster

Based on the YouTube video, I thought this is what we were trying to accomplish, because it shows several Deploy stages with ECS Blue/Green Deploy (not a CloudFormation deploy).

Could someone let me know whether this is being worked on as part of the roadmap, or whether I am doing something wrong in expecting this to work? (I based this CF template on a working CLI command that I have.)

Thanks!

PS: The error I get is "Encountered unsupported property BlueGreenDeploymentConfiguration"
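
For comparison, here is a rough sketch of the same deployment group expressed as arguments to the CodeDeploy `CreateDeploymentGroup` API, which is what a CLI command like the one mentioned above would call. It only builds the request dictionary using the boto3 parameter names. The helper name `build_deployment_group_request` and every sample value are hypothetical, and actually creating the group would mean passing the result to `boto3.client("codedeploy").create_deployment_group(**request)`.

```python
from typing import Any, Dict, Tuple


def build_deployment_group_request(
    application_name: str,
    group_name: str,
    service_role_arn: str,
    cluster_name: str,
    service_name: str,
    target_group_names: Tuple[str, str],
    prod_listener_arn: str,
) -> Dict[str, Any]:
    """Build keyword arguments for an ECS blue/green deployment group.

    Hypothetical helper: it mirrors the template above but calls no AWS API.
    """
    blue, green = target_group_names
    return {
        "applicationName": application_name,
        "deploymentGroupName": group_name,
        "serviceRoleArn": service_role_arn,
        "deploymentConfigName": "CodeDeployDefault.ECSAllAtOnce",
        "autoRollbackConfiguration": {
            "enabled": True,
            "events": ["DEPLOYMENT_FAILURE"],
        },
        "deploymentStyle": {
            "deploymentType": "BLUE_GREEN",
            "deploymentOption": "WITH_TRAFFIC_CONTROL",
        },
        "blueGreenDeploymentConfiguration": {
            "deploymentReadyOption": {
                "actionOnTimeout": "CONTINUE_DEPLOYMENT",
                "waitTimeInMinutes": 0,
            },
            "terminateBlueInstancesOnDeploymentSuccess": {
                "action": "TERMINATE",
                "terminationWaitTimeInMinutes": 5,
            },
        },
        "loadBalancerInfo": {
            # One pair of target groups plus the production listener.
            "targetGroupPairInfoList": [
                {
                    "targetGroups": [{"name": blue}, {"name": green}],
                    "prodTrafficRoute": {"listenerArns": [prod_listener_arn]},
                }
            ]
        },
        # The service and its cluster belong to the same list entry.
        "ecsServices": [
            {"serviceName": service_name, "clusterName": cluster_name}
        ],
    }
```

The service name and cluster name share one `ecsServices` entry, and the target groups and listener share one `targetGroupPairInfoList` entry, matching the list shapes the API expects.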


@KiamarzFallahi KiamarzFallahi commented May 27, 2020

@sethstone You are correct. This feature does not support creating and configuring a CodeDeploy deployment group to natively perform blue/green deployments for ECS. That is a coverage gap being tracked: aws-cloudformation/cloudformation-coverage-roadmap#37

@clareliguori clareliguori (Member) commented May 27, 2020

That issue was also closed:
aws-cloudformation/cloudformation-coverage-roadmap#37 (comment)

@sethstone A new issue was created on the CloudFormation roadmap here, requesting coverage in AWS::CodeDeploy::DeploymentGroup for ECS blue/green. It would be great if you could comment there with your example template snippet. Thanks!
aws-cloudformation/cloudformation-coverage-roadmap#483


@mwarkentin mwarkentin commented Jun 2, 2020

There are a couple of limitations which mean that this feature won't work with our ECS platform (convox):

  • Declaring output values or importing values from other stacks is not currently supported for templates defining blue/green ECS deployments.
  • You cannot use the AWS::CodeDeploy::BlueGreen hook in a template that includes nested stack resources.
  • You cannot use the AWS::CodeDeploy::BlueGreen hook in a nested stack.

Are there plans / work in progress to handle these use cases? Or for more complex use cases like this would it be meant to be handled by the CodeDeploy DeploymentGroup support described above?
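
A minimal sketch of how the three limitations listed above could be checked before adopting the hook. It assumes the template has already been parsed into a plain dictionary in long-form syntax (so `!ImportValue` appears as `Fn::ImportValue`). The function name `find_hook_blockers` is hypothetical and not part of any AWS tooling.

```python
from typing import Any, Dict, List


def _contains_key(node: Any, key: str) -> bool:
    """Return True if `key` appears anywhere inside nested dicts and lists."""
    if isinstance(node, dict):
        return key in node or any(_contains_key(v, key) for v in node.values())
    if isinstance(node, list):
        return any(_contains_key(item, key) for item in node)
    return False


def find_hook_blockers(
    template: Dict[str, Any], is_nested_stack: bool = False
) -> List[str]:
    """List the documented blue/green hook limitations a template would hit."""
    problems: List[str] = []
    if template.get("Outputs"):
        problems.append("template declares output values")
    if _contains_key(template, "Fn::ImportValue"):
        problems.append("template imports values from other stacks")
    nested = sorted(
        name
        for name, resource in template.get("Resources", {}).items()
        if resource.get("Type") == "AWS::CloudFormation::Stack"
    )
    if nested:
        problems.append("template includes nested stacks: " + ", ".join(nested))
    if is_nested_stack:
        problems.append("template is itself deployed as a nested stack")
    return problems
```

An empty result only means that none of the three limitations quoted above applies. The hook may have other constraints that this sketch does not cover.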


@Lasim Lasim commented Jun 2, 2020

Thanks for the analysis @mwarkentin.

To me, this means there is no point in using this feature until these capabilities are implemented.


@vinay-nadig-0042 vinay-nadig-0042 commented Jul 6, 2020

I see that the issue has been closed since CloudFormation support was announced. But, as @mwarkentin pointed out, the limitations are quite crippling. Is there a separate issue where these limitations and their fixes are tracked?


@zzenonn zzenonn commented Jul 6, 2020

> I see that the issue has been closed since CloudFormation support was announced. But, as @mwarkentin pointed out, the limitations are quite crippling. Is there a separate issue where these limitations and their fixes are tracked?

I opened this issue on the CloudFormation roadmap

aws-cloudformation/cloudformation-coverage-roadmap#483

ligaz added a commit to ligaz/aws-cdk that referenced this issue Sep 21, 2020
This change adds the option to set the `DeploymentController` on `ApplicationLoadBalanced`,
`NetworkLoadBalanced` and `QueueProcessing` Fargate services. `MultipleTargetGroups` services are not updated
because this option might not be applicable for them - one service might be using the default deployment
controller while others might use `CODE_DEPLOY` or `EXTERNAL`.

By default, if this option is not passed, it will use the default one from the `FargateService` construct, which
is `ECS`'s rolling updates.

Related to aws/containers-roadmap#130
mergify bot pushed a commit to aws/aws-cdk that referenced this issue Dec 14, 2020
…es (#10452)

This change adds the option to set the `DeploymentController` on `ApplicationLoadBalanced`,
`NetworkLoadBalanced` and `QueueProcessing` ECS Services (both EC2 and Fargate). `MultipleTargetGroups` services are not updated because this option might not be applicable for them - one service might be using the default deployment
controller while others might use `CODE_DEPLOY` or `EXTERNAL`.

By default, if this option is not passed, it will use the default one from the respective Service construct, which
is `ECS`'s rolling updates.

Related to aws/containers-roadmap#130

Closes #10971.

----

*By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license*
flochaz pushed a commit to flochaz/aws-cdk that referenced this issue Jan 5, 2021
…es (aws#10452)


@simi-obs simi-obs commented Jun 24, 2022

June 24th 2022.

Does anybody know how to control blue/green ECS deployments using CloudFormation (CF) without all the stupid limitations that the hook imposes (no outputs, no dynamic references, no other updates ...)?

I mean at this point I am OK with implementing my own custom resource or even custom resource type. I am willing to bend and punish CF in ugly ways, the only thing I want is a way to perform ECS blue/green deployment as a part of CF deployment (update) process without all the stupid limits.

Edit: I was thinking about implementing a custom CloudFormation resource type. This resource would be in control of performing the deployment (let's call it the "control resource"). It would use the AWS CodeDeploy SDK to control the deployment. Do you think that would be possible, or do you see any obstacles?

What I am worried about are error cases, for example when the stack update is interrupted while the "control resource" is still in progress. I am not sure how CloudFormation behaves in such a case (will the "control resource" be notified that the stack update was interrupted?). And there are more cases like that. I guess this implementation would require a deep dive into CloudFormation's internals, which might be pretty exhausting. And it still might end up being a dead end.


@teekennedy teekennedy commented Jun 28, 2022

@simi-obs I can assure you that using a custom resource as a workaround for this is not a dead end. I implemented one myself (my very first lambda actually) back in 2019 when I started watching this issue and it's been used thousands of times over in production with a high success rate.

I unfortunately can't post or give you the code without express permission from my employer, but I'll try to give you the rundown of the overall process and the gotchas / lessons learned along the way.

In 2019, it was not possible to create a CloudFormation custom resource that was responsible for just the deployment itself (the "control resource" as you call it), because some of the configuration required to create a CodeDeploy-enabled ECS service was not allowed by the existing AWS::ECS::Service resource type. I ended up replacing the ECS service resource itself with my custom resource lambda. In addition to managing the ECS service, it managed the CodeDeploy application and deployment group resources. Whether that was required or just made my life easier, I can't recall anymore.

I suggest you approach writing your lambda on an event-by-event basis:

  • The create and delete events are likely no-ops unless you end up managing other resources.
  • The update event should:
    • Use some kind of locking or retry mechanism to avoid creating a new deployment while one is already in progress. I ended up enforcing that all deployments in the CI/CD pipeline go through CloudFormation to try to prevent this situation, and then wrote a simple polling loop to wait for any in-progress deployments to finish before initiating a new one as a fallback measure. There are better ways, but this solution has yet to cause issues so it is still in use.
    • Collect the parameters for the deployment from the properties of the resource, including at least partially generating the app spec for the deployment.
    • Create the deployment, then wait for it to enter the Failed or Succeeded status.
    • Return the corresponding error or success signal to CloudFormation.
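
The update steps above could be sketched roughly as follows. This is not the implementation described in this comment, only an illustration under stated assumptions: `codedeploy` is expected to behave like `boto3.client("codedeploy")`, the function name `run_deployment` and the `sleep` parameter are hypothetical, and the sketch has no overall deadline, so a real Lambda would also need to watch its remaining execution time.

```python
import json
import time
from typing import Any, Callable, Dict

# Statuses that mean a deployment is still occupying the deployment group.
IN_FLIGHT = ["Created", "Queued", "InProgress", "Baking", "Ready"]
# Statuses after which a deployment no longer changes.
TERMINAL = {"Succeeded", "Failed", "Stopped"}


def run_deployment(
    codedeploy: Any,
    application: str,
    group: str,
    app_spec: Dict[str, Any],
    poll_seconds: float = 15.0,
    sleep: Callable[[float], Any] = time.sleep,
) -> str:
    """Wait for running deployments, start a new one, return its final status."""
    # Step 1: simple polling loop so deployments never overlap.
    while codedeploy.list_deployments(
        applicationName=application,
        deploymentGroupName=group,
        includeOnlyStatuses=IN_FLIGHT,
    )["deployments"]:
        sleep(poll_seconds)

    # Step 2: create the deployment from an inline AppSpec document.
    deployment_id = codedeploy.create_deployment(
        applicationName=application,
        deploymentGroupName=group,
        revision={
            "revisionType": "AppSpecContent",
            "appSpecContent": {"content": json.dumps(app_spec)},
        },
    )["deploymentId"]

    # Step 3: poll until the deployment reaches a terminal status.
    while True:
        info = codedeploy.get_deployment(deploymentId=deployment_id)
        status = info["deploymentInfo"]["status"]
        if status in TERMINAL:
            return status
        sleep(poll_seconds)
```

The caller would map `"Succeeded"` to a success signal for CloudFormation and anything else to a failure. Passing `sleep` as a parameter is only there to keep the loop testable without real waiting.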

The caveats to watch out for:

  • Make sure you send a response to CloudFormation in every scenario. Not sending a response means waiting for CloudFormation to time out against the custom resource, which IIRC defaults to 4 hours. If you end up in this situation, there's nothing you can do but wait, so do everything you can to avoid it. Catch any exceptions and return an error response. Send an error response if the lambda is about to time out. Carefully review the logic used to form and send the response to ensure it always sends something.
  • As mentioned above, make sure you handle the use case of overlapping deployments gracefully. The details of this depend on how the deployments fit into the overall release process.
  • Deployments can run for much longer than lambdas. There are some CloudFormation custom resource libraries that chain multiple lambda executions together using continuation passing lambda events to work around this. You could also stop the in-progress deployment if the lambda is getting too close to timing out.
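
To illustrate the first caveat, here is a minimal sketch of a handler shape that sends a response on every path. The field names follow the CloudFormation custom resource response format. The names `handle`, `send_response`, and `do_update`, and the choice of the logical ID as a fallback physical ID, are assumptions made for this example. The timeout guard mentioned above is deliberately left out.

```python
import json
import urllib.request
from typing import Any, Callable, Dict


def send_response(
    event: Dict[str, Any], status: str, reason: str, physical_id: str
) -> None:
    """PUT the result to the pre-signed URL CloudFormation supplied."""
    body = json.dumps(
        {
            "Status": status,  # "SUCCESS" or "FAILED"
            "Reason": reason,
            "PhysicalResourceId": physical_id,
            "StackId": event["StackId"],
            "RequestId": event["RequestId"],
            "LogicalResourceId": event["LogicalResourceId"],
        }
    ).encode("utf-8")
    request = urllib.request.Request(
        event["ResponseURL"],
        data=body,
        method="PUT",
        headers={"Content-Type": ""},
    )
    urllib.request.urlopen(request)


def handle(
    event: Dict[str, Any],
    context: Any,
    do_update: Callable[[Dict[str, Any]], None],
    respond: Callable[..., None] = send_response,
) -> None:
    """Run the update and always report back, even when it raises."""
    # Assumption: reuse the logical ID when no physical ID exists yet.
    physical_id = event.get("PhysicalResourceId", event["LogicalResourceId"])
    status, reason = "FAILED", "handler exited without a result"
    try:
        if event["RequestType"] == "Update":
            do_update(event["ResourceProperties"])
        # Create and Delete are treated as no-ops in this sketch.
        status, reason = "SUCCESS", "ok"
    except Exception as exc:
        status, reason = "FAILED", f"{type(exc).__name__}: {exc}"
    finally:
        respond(event, status, reason, physical_id)
```

Putting the call to `respond` in the `finally` block means a response goes out whether `do_update` returns or raises. It does not cover the Lambda being killed at its own timeout, which would need a separate check against the remaining execution time.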

Hope this helps!


@simi-obs simi-obs commented Jun 28, 2022

@teekennedy Very nice breakdown, thank you.
Some of the caveats you mentioned have solutions nowadays:

  • For example, instead of using a Lambda-backed custom resource, I will implement a CloudFormation custom resource type, which does not have some of these limitations (I have done that in the past).

Regarding the overlapping deployments:

  • We are using isolated environments, which are updated only through CloudFormation. What I intend to do is enforce that if no CloudFormation stack update is ongoing, NO CodeDeploy deployment (of the stack's services) is ongoing either. This is in line with what you mentioned: I will make sure I always wait for the CodeDeploy deployment to either fail or succeed.
    If I can enforce this, I should be able to rule out overlapping deployments (since by default there cannot be overlapping CloudFormation stack updates).

Also, I believe I will still have to manage the AWS::ECS::Service as part of the custom resource. Otherwise, I might run into CloudFormation trying to replace the service (which would probably end in an error) once the TaskDefinition changes.

Overall, thank you for all the tips. I will let the community here know how it turns out.
