CodeDeploy: Blue/Green deployments and In-Place deployments with traffic control. #1162

Merged

Contributor

@niclic niclic commented Jul 15, 2017

Issue: #504

This PR adds Blue/Green Deployment support to CodeDeploy Deployment Groups, as well as supporting In-Place deployments with Traffic Control (i.e. Load Balancer).

It was originally opened on February 4, 2017, as hashicorp/terraform#11700.

This PR adds the following resources to a Deployment Group:

  • DeploymentStyle
  • LoadBalancerInfo
  • BlueGreenDeploymentConfiguration

The original work is contained in a single new commit. The original PR retains all original commits and discussion.

These enhancements can also be used for in-place deployments with traffic control.

deployment_style {
  deployment_option = "WITH_TRAFFIC_CONTROL"
  deployment_type = "IN_PLACE"
}

Since this scenario was not supported when the original PR was created, I created a new commit that includes a test for this specific configuration, as well as updating the documentation to reflect this recent change. No code changes were required to support this scenario.

  - Adds DeploymentStyle, LoadBalancerInfo, and BlueGreenDeploymentConfiguration
…affic control.

  - There was no explicit test for this scenario and the docs referred only to a blue/green configuration.
@radeksimko radeksimko added the enhancement Requests to existing resources that expand the functionality or scope. label Jul 20, 2017

* `deployment_type` - (Optional) Indicates whether to run a standard deployment or a blue/green deployment. Valid Values are `IN_PLACE` or `BLUE_GREEN`.

* `IN_PLACE` deployment type is not supported with the `WITH_TRAFFIC_CONTROL` deployment option.

This is no longer true, right?

Contributor Author

You are correct. I will fix this.

  - change "standard" to "in-place"
  - remove deployment_type restrictions
@niclic
Contributor Author

niclic commented Jul 27, 2017

Documentation updated to reflect support of traffic control with in-place deployments.

@zdobrenen

Will this PR be merged in the next version release? If so, when can that be expected to happen? This functionality will be much appreciated.

@aredridel

Ooh, this just bit us, too!

@niclic
Contributor Author

niclic commented Aug 22, 2017

Thanks for the heads up. @LegNeato

There is now an optional targetGroupInfoList array on loadBalancerInfo to support using Application Load Balancer as part of a deployment.

I can add support for this feature, although I'd prefer if the PR was merged as is. It has been six months after all. ;)

@PauloMigAlmeida

It just hit us too.

Is there any plan for when this will be merged to master?

  - Use an ALB in a deployment by specifying a Target Group.
@niclic
Contributor Author

niclic commented Sep 10, 2017

Added support for using an Application Load Balancer in a deployment.

A note on load_balancer_info exception tests.

  • two elb_info?

    • aws_codedeploy_deployment_group.foo_group: InvalidLoadBalancerInfoException: The specification for load balancing in the deployment group is invalid. The elbInfoList string contains more than one load balancer name, but only one load balancer is supported per deployment group.
  • two target_group_info?

    • aws_codedeploy_deployment_group.foo_group: InvalidLoadBalancerInfoException: The specification for load balancing in the deployment group is invalid. The targetGroupInfoList string contains more than one target group name, but only one target group is supported per deployment group.
  • one each?

    • aws_codedeploy_deployment_group.foo_group: InvalidLoadBalancerInfoException: The specification for load balancing in the deployment group is invalid. Both a load balancer and a target group have been specified in loadBalancerInfo, but only one can be used in a single deployment.

These exceptions are mentioned in the documentation for this resource.
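
The three rejections above amount to one rule: a deployment group takes at most one classic load balancer or at most one target group, never both. A minimal sketch of that rule as a local check follows. `LoadBalancerInfo` and `validateLoadBalancerInfo` are hypothetical names used only for illustration; they are not SDK or provider types, and nothing in this thread says the provider validates this client-side.

```go
package main

import (
	"errors"
	"fmt"
)

// LoadBalancerInfo is a hypothetical, simplified stand-in for the
// load_balancer_info block; it is not the SDK or provider type.
type LoadBalancerInfo struct {
	ELBNames         []string // names taken from elb_info blocks
	TargetGroupNames []string // names taken from target_group_info blocks
}

// validateLoadBalancerInfo mirrors the three API rejections quoted above.
func validateLoadBalancerInfo(info LoadBalancerInfo) error {
	switch {
	case len(info.ELBNames) > 0 && len(info.TargetGroupNames) > 0:
		return errors.New("specify either elb_info or target_group_info, not both")
	case len(info.ELBNames) > 1:
		return fmt.Errorf("only one elb_info is supported, got %d", len(info.ELBNames))
	case len(info.TargetGroupNames) > 1:
		return fmt.Errorf("only one target_group_info is supported, got %d", len(info.TargetGroupNames))
	}
	return nil
}

func main() {
	// One ELB alone passes; one of each is rejected.
	fmt.Println(validateLoadBalancerInfo(LoadBalancerInfo{ELBNames: []string{"example-elb"}}))
	fmt.Println(validateLoadBalancerInfo(LoadBalancerInfo{
		ELBNames:         []string{"example-elb"},
		TargetGroupNames: []string{"example-tg"},
	}))
}
```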

@PauloMigAlmeida

PauloMigAlmeida commented Sep 12, 2017

@niclic Good work! It is undoubtedly a long-awaited feature for both CodeDeploy and Terraform. 🥇

ping @radeksimko: is there anything else that needs to be done in order to get this merged?

Member

@radeksimko radeksimko left a comment

Hi @niclic
thank you for taking the time to submit this PR and being thorough in testing. It is looking very good and very close to mergeable state.

I left you comments there, some are more important (about optional fields & nil checks), some are just nitpicks.

I'm not feeling overly comfortable about making all the fields Computed, but it seems there's no way to remove either loadBalancerInfo or blueGreenDeploymentConfiguration. I'll raise a support ticket with AWS about this.

That said I believe that deployment_style doesn't really need to be Computed.

One more note - the codebase may not be entirely consistent at this point, but we have a naming convention we use for functions which convert SDK structs to schema-compatible []interface{} etc. and back. We call them flatteners and expanders.

Using this convention buildDeploymentStyle would be called expandDeploymentStyle, deploymentStyleToMap would be called flattenDeploymentStyle.

It's no big deal, but it helps to keep it aligned.

Let me know what you think.

if len(list) == 1 {
	style := list[0].(map[string]interface{})
	result.DeploymentOption = aws.String(style["deployment_option"].(string))
	result.DeploymentType = aws.String(style["deployment_type"].(string))
Member

Given that these options are both optional I think we'll need to wrap this in something like this

if v, ok := style["deployment_option"]; ok {
  result.DeploymentOption = aws.String(v.(string))
}
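
Applied to the whole expander, the suggestion above could look roughly like the sketch below. It is self-contained, so `DeploymentStyle` and `str` are hypothetical stand-ins for `codedeploy.DeploymentStyle` and `aws.String`. It also assumes that an unset optional string can arrive as `""` rather than as a missing key, which is why empty values are skipped too.

```go
package main

import "fmt"

// DeploymentStyle is a hypothetical stand-in for codedeploy.DeploymentStyle.
// Pointer fields let "unset" be told apart from an empty string.
type DeploymentStyle struct {
	DeploymentOption *string
	DeploymentType   *string
}

// str returns a pointer to s, playing the role of aws.String.
func str(s string) *string { return &s }

// expandDeploymentStyle converts the schema-style list into the struct,
// setting only the keys that are present and non-empty. It returns nil
// when there is no usable input.
func expandDeploymentStyle(list []interface{}) *DeploymentStyle {
	if len(list) == 0 || list[0] == nil {
		return nil
	}
	style, ok := list[0].(map[string]interface{})
	if !ok {
		return nil
	}
	result := &DeploymentStyle{}
	if v, ok := style["deployment_option"].(string); ok && v != "" {
		result.DeploymentOption = str(v)
	}
	if v, ok := style["deployment_type"].(string); ok && v != "" {
		result.DeploymentType = str(v)
	}
	return result
}

func main() {
	// Only deployment_option is configured; deployment_type stays nil.
	cfg := []interface{}{
		map[string]interface{}{"deployment_option": "WITH_TRAFFIC_CONTROL"},
	}
	s := expandDeploymentStyle(cfg)
	fmt.Println(*s.DeploymentOption, s.DeploymentType == nil)
}
```

The comma-ok type assertion covers both a missing key and a value of the wrong type, so neither case panics.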

m := attr.([]interface{})[0].(map[string]interface{})
deploymentReadyOption := &codedeploy.DeploymentReadyOption{
	ActionOnTimeout:   aws.String(m["action_on_timeout"].(string)),
	WaitTimeInMinutes: aws.Int64(int64(m["wait_time_in_minutes"].(int))),
Member

As mentioned above - these are also optional fields, so we should only be passing them to the API if they're actually defined in the config.

m := attr.([]interface{})[0].(map[string]interface{})
blueInstanceTerminationOption := &codedeploy.BlueInstanceTerminationOption{
	Action:                       aws.String(m["action"].(string)),
	TerminationWaitTimeInMinutes: aws.Int64(int64(m["termination_wait_time_in_minutes"].(int))),
Member

As mentioned above - these are also optional fields, so we should only be passing them to the API if they're actually defined in the config.


item := make(map[string]interface{})
item["deployment_option"] = *style.DeploymentOption
item["deployment_type"] = *style.DeploymentType
Member

Are you sure that none of these options will ever be nil? Otherwise this would cause a crash here.
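
A nil-safe version of this flattener might look like the sketch below. `DeploymentStyle` is again a hypothetical stand-in for the SDK struct, and the function name follows the flatten/expand convention discussed in this review rather than any confirmed final code.

```go
package main

import "fmt"

// DeploymentStyle is a hypothetical stand-in for codedeploy.DeploymentStyle.
type DeploymentStyle struct {
	DeploymentOption *string
	DeploymentType   *string
}

// flattenDeploymentStyle converts the struct into the list-of-one-map shape,
// skipping nil pointers instead of dereferencing them.
func flattenDeploymentStyle(style *DeploymentStyle) []map[string]interface{} {
	if style == nil {
		return nil
	}
	item := make(map[string]interface{})
	if style.DeploymentOption != nil {
		item["deployment_option"] = *style.DeploymentOption
	}
	if style.DeploymentType != nil {
		item["deployment_type"] = *style.DeploymentType
	}
	return []map[string]interface{}{item}
}

func main() {
	opt := "WITH_TRAFFIC_CONTROL"
	// DeploymentType is nil here; the unguarded version would panic on it.
	fmt.Println(flattenDeploymentStyle(&DeploymentStyle{DeploymentOption: &opt}))
	fmt.Println(flattenDeploymentStyle(nil) == nil)
}
```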


lbInfo := make(map[string]interface{})
lbInfo["elb_info"] = schema.NewSet(loadBalancerInfoHash, elbs)
lbInfo["target_group_info"] = schema.NewSet(loadBalancerInfoHash, targetGroups)
Member

It may look unintuitive, but I think we should be able to just use []interface{} here, i.e. elbs and targetGroups directly. We can avoid the custom hashing function that way.

blueInstanceTerminationOption["action"] = *config.TerminateBlueInstancesOnDeploymentSuccess.Action
blueInstanceTerminationOption["termination_wait_time_in_minutes"] = *config.TerminateBlueInstancesOnDeploymentSuccess.TerminationWaitTimeInMinutes
m["terminate_blue_instances_on_deployment_success"] = append(c, blueInstanceTerminationOption)

Member

Are you sure that none of these options will ever be nil? Otherwise this would cause a crash here.

@@ -670,6 +1015,21 @@ func resourceAwsCodeDeployTriggerConfigHash(v interface{}) int {
return hashcode.String(buf.String())
}

func loadBalancerInfoHash(v interface{}) int {
Member

This custom hash function shouldn't be necessary as the nested schema seems quite simple, so the default one (which is used when Set is not defined) should suffice.

@radeksimko
Member

Just attaching an example config which ends up in a crash:

resource "aws_codedeploy_deployment_group" "example" {
  app_name              = "${aws_codedeploy_app.example.name}"
  deployment_group_name = "example-group"
  service_role_arn      = "${aws_iam_role.example.arn}"

  deployment_style {
    deployment_option = "WITH_TRAFFIC_CONTROL"
  }

  load_balancer_info {
    elb_info {
      name = "example-elb"
    }
  }

  blue_green_deployment_config {
    deployment_ready_option {
      action_on_timeout    = "STOP_DEPLOYMENT"
    }
  }
}
panic: runtime error: index out of range
2017-09-15T11:27:00.454+0100 [DEBUG] plugin.terraform-provider-aws:
2017-09-15T11:27:00.454+0100 [DEBUG] plugin.terraform-provider-aws: goroutine 159 [running]:
2017-09-15T11:27:00.454+0100 [DEBUG] plugin.terraform-provider-aws: github.com/terraform-providers/terraform-provider-aws/aws.buildBlueGreenDeploymentConfig(0xc42012ff50, 0x1, 0x1, 0x26239e0)
2017-09-15T11:27:00.454+0100 [DEBUG] plugin.terraform-provider-aws: 	/Users/radeksimko/gopath/src/github.com/terraform-providers/terraform-provider-aws/aws/resource_aws_codedeploy_deployment_group.go:788 +0x79d
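
The trace does not show which expression at line 788 failed. One plausible cause, stated here as an assumption and not something the thread confirms, is indexing element 0 of an empty list for a nested block the config omits, as in the `attr.([]interface{})[0]` pattern from the reviewed hunks. A small guard such as the hypothetical `firstMap` helper below would avoid that class of panic:

```go
package main

import "fmt"

// firstMap returns the first element of a schema-style list as a map.
// ok is false when the value is missing, empty, or of another type.
// The helper name is hypothetical; it is not part of the provider.
func firstMap(v interface{}) (m map[string]interface{}, ok bool) {
	list, isList := v.([]interface{})
	if !isList || len(list) == 0 {
		return nil, false
	}
	m, ok = list[0].(map[string]interface{})
	return m, ok
}

func main() {
	// Mirrors the config above: only deployment_ready_option is set.
	config := map[string]interface{}{
		"deployment_ready_option": []interface{}{
			map[string]interface{}{"action_on_timeout": "STOP_DEPLOYMENT"},
		},
		"terminate_blue_instances_on_deployment_success": []interface{}{},
	}
	for _, key := range []string{
		"deployment_ready_option",
		"terminate_blue_instances_on_deployment_success",
	} {
		if m, ok := firstMap(config[key]); ok {
			fmt.Println(key, "is set:", m)
		} else {
			fmt.Println(key, "is not set; skipping")
		}
	}
}
```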

@radeksimko radeksimko added the waiting-response Maintainers are waiting on response from community or contributor. label Sep 15, 2017
@niclic
Contributor Author

niclic commented Sep 15, 2017

@radeksimko

Thank you so much for the review! I really appreciate getting great feedback.

I will review your comments in detail and address them with new changes as soon as possible.

Naming
I chose to be consistent with the naming convention of similar functions in this file, but I was aware that they were inconsistent with those used elsewhere. I am happy to change them to be consistent with the desired convention.

Question: Do you think I should change the names of only the functions in this PR, or for all similar functions in this resource file?

Optional v Required
The optional or required nature of things has bothered me right from the very start of this PR. I commented on this in the original PR.

All types (and their sub types) are indicated as being optional in the SDK, which obviously is not the case (your simple test proves that). However, this was the reason why I hesitated marking them as required in terraform - to avoid a mismatch between the official documentation and terraform, which might be confusing. But if they are going to be indicated as being optional to users of terraform, then more robust null checking is mandatory.

Question: For these cases, would it be better to mark them with Required: true using schema attributes, rather than adding null checks in the code? The former option seems simpler and requires less code, but the latter has the added advantage of remaining consistent with the SDK.

@radeksimko
Member

Question: Do you think I should change the names of only the functions in this PR, or for all similar functions in this resource file?

I'd only change the ones you have created as part of this PR and deal with the rest in a separate PR as this one is already getting quite big in terms of LOC. 😅

Related to this - we also tend to store all flatteners & expanders in here: https://github.com/terraform-providers/terraform-provider-aws/blob/master/aws/structure.go to keep the LOC in resource_* files with the actual CRUD logic as low as possible.

Again - no big deal, but it's a convention we developed over time and it's nice to stick to it, if we can.

Question: For these cases, would it be better to mark them with Required: true using schema attributes, rather than adding null checks in the code? The former option seems simpler and requires less code, but the latter has the added advantage of remaining consistent with the SDK.

I'm honestly not sure, but I remember marking some fields Required in the past in a different API, despite them being documented as optional, only to discover later that they actually were optional, just under certain context - e.g. in combination with some other fields and/or values of those fields. I don't know if that's the case here, but I'd double check this with AWS support before drifting away from documentation.

Keeping it optional and adding nil checks seems safer at this point, but if you can get confirmation from AWS that the docs are wrong, then making the fields required is obviously a much cleaner solution.

@niclic
Contributor Author

niclic commented Sep 23, 2017

TODO

  • Don't pass "optional" fields to the api:

    • buildDeploymentStyle
    • buildBlueGreenDeploymentConfig
  • nil checks on "optional" fields:

    • deploymentStyleToMap
    • blueGreenDeploymentConfigToMap
  • Rename flatteners and expanders to be consistent with established convention.

  • Remove loadBalancerInfoHash function.

  • Does deployment_style need to be Computed?

  • Move flatteners and expanders to structure.go?

  - Don't pass "optional" fields to the api.
  - Check for nil on "optional" fields.
  - be consistent about returning nil when there is no valid input
  - returning empty objects can produce exceptions for required fields
@niclic
Contributor Author

niclic commented Sep 25, 2017

@radeksimko

RE: loadBalancerInfo
Can you clarify which option you are suggesting, since two options are presented:

  • use schema.HashString in place of loadBalancerInfoHash, or
  • model elb_info and target_group_info as TypeList (with MaxItems = 1), instead of as schema.TypeSet

On reflection, I'm not sure why elb_info is TypeSet and not a TypeList, so I think I will change that. I guess I got tired of dealing with lists that could only have one item. Is there a better way of modelling such simple structs?

Also, I did not add any nil checks for the name field on elb or targetGroup, since you didn't explicitly call those out. This field is supposed to be required, even though it appears as optional in the docs. When making the above changes, I can add these additional checks to expandLoadBalancerInfo and flattenLoadBalancerInfo.

@radeksimko
Member

@niclic What I meant originally is to keep both fields as TypeSet (because ordering doesn't/shouldn't matter) and set value as []interface{} so we can remove the custom hash function and let helper/schema use the default one.

lbInfo["elb_info"] = elbs
lbInfo["target_group_info"] = targetGroups

See how we do the slice -> set conversion here: https://github.com/hashicorp/terraform/blob/master/helper/schema/field_writer_map.go#L271-L309
and how we use the default hash function here:
https://github.com/hashicorp/terraform/blob/004f6cc9e29ccecb25676f162584db113813e91c/helper/schema/schema.go#L275-L281 which is the best we can do in this particular case anyway.

schema.HashString would be useful in case we had something like this in the schema:

"elb_info": &schema.Schema{
	Type:     schema.TypeSet,
	Optional: true,
	Set:      schema.HashString,
	Elem:     &schema.Schema{Type: schema.TypeString},
},

The nested structure in our case here isn't a simple string though, it's a Resource with name field.

Let me know if that makes sense and/or whether you need any further help.

@niclic
Contributor Author

niclic commented Sep 28, 2017

@radeksimko

Okay. When I remove the custom hash function, I get this error in any test that includes load_balancer_info:

* Invalid address to set: []string{"load_balancer_info", "0", "elb_info"}

I may be missing something, but if I change elb_info and target_group_info to TypeList all tests pass. Kinda like in this prior issue.

See the sequence of commits above.

Note that currently, only one elb_info or target_group_info is allowed by the SDK.

@niclic
Contributor Author

niclic commented Sep 28, 2017

Also, I'm considering removing the flatten/expand function tests. Here are the tests I added for this PR (there are many others from earlier contributions, never mind all the brittle validation functions as well, but that's for another time).

TestAWSCodeDeployDeploymentGroup_expandDeploymentStyle
TestAWSCodeDeployDeploymentGroup_flattenDeploymentStyle
TestAWSCodeDeployDeploymentGroup_expandLoadBalancerInfo
TestAWSCodeDeployDeploymentGroup_flattenLoadBalancerInfo
TestAWSCodeDeployDeploymentGroup_expandBlueGreenDeploymentConfig
TestAWSCodeDeployDeploymentGroup_flattenBlueGreenDeploymentConfig

I'm not sure if these sorts of tests add much value. I wrote them to guide the writing of flatten and expand functions. There are plenty of acceptance tests covering these resources now and besides, despite their lines of code, the tests are very basic, and do not cover every possible code path.

Thoughts?

@niclic
Contributor Author

niclic commented Sep 28, 2017

@radeksimko

RE: Computed on DeploymentStyle
I can't get the TestAccAWSCodeDeployDeploymentGroup_deploymentStyle_delete test to pass without using Computed: true. If I send nil, I get the error below:

--- FAIL: TestAccAWSCodeDeployDeploymentGroup_deploymentStyle_delete (34.05s)
  testing.go:427: Step 1 error: After applying this step, the plan was not empty:

    DIFF:

    UPDATE: aws_codedeploy_deployment_group.foo_group
      deployment_style.#:                   "1" => "0"
      deployment_style.0.deployment_option: "WITH_TRAFFIC_CONTROL" => ""
      deployment_style.0.deployment_type:   "BLUE_GREEN" => ""

    STATE:

    aws_codedeploy_app.foo_app:
      ID = 44aaa267-7e15-4642-a5a4-3552f34ff7a5:foo-app-18afv
      name = foo-app-18afv
    aws_codedeploy_deployment_group.foo_group:
      ID = a75207ec-a2b7-4acd-98d6-9a292ab2c5b9
      alarm_configuration.# = 0
      app_name = foo-app-18afv
      auto_rollback_configuration.# = 0
      autoscaling_groups.# = 0
      blue_green_deployment_config.# = 0
      deployment_config_name = CodeDeployDefault.OneAtATime
      deployment_group_name = foo-group-18afv
      deployment_style.# = 1
      deployment_style.0.deployment_option = WITH_TRAFFIC_CONTROL
      deployment_style.0.deployment_type = BLUE_GREEN

If I send an empty &codedeploy.DeploymentStyle{}, I get an exception:

* aws_codedeploy_deployment_group.foo_group:
InvalidDeploymentStyleException: Deployment type or deployment option cannot be null or empty

I'm probably missing something here. But it seems that removing deployment_style from my configuration does not trigger a reversion to the default configuration, but instead retains the previous state.

I have sent a question to AWS SDK support regarding the use of Required: No in the SDK documentation. I have yet to hear back from them. I should probably try a more direct line of communication.

@radeksimko
Member

When I remove the custom hash function, I get this error in any test that includes load_balancer_info:

You're right and I apologise for providing misleading info and letting you spend the extra time on this! 😞
As I discovered myself, and confirmed after chatting with a colleague from the core team, we only support this behaviour for Sets in the 1st level, not nested sets, like here. I frankly wasn't aware of this limitation.

if I change elb_info and target_group_info to TypeList all tests pass.

One of the reasons we should use TypeSet instead of TypeList here is that the names of ELBs or Target Groups may come from the API in an unpredictable order (AFAIK), and the user would see a spurious diff because the ordering in the config doesn't match the one in the API response.

To sum this up - do you mind removing the last two commits? Thanks and sorry again!
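
The ordering argument above can be illustrated without Terraform at all. The sketch below compares the same two names as an ordered list and as an unordered collection. `equalAsList` and `equalAsSet` are hypothetical helpers, not helper/schema functions, and the two-name input is purely illustrative, since the API currently accepts only one name per deployment group.

```go
package main

import (
	"fmt"
	"sort"
)

// equalAsList compares element by element, so order matters
// (the behaviour a TypeList-style comparison would have).
func equalAsList(a, b []string) bool {
	if len(a) != len(b) {
		return false
	}
	for i := range a {
		if a[i] != b[i] {
			return false
		}
	}
	return true
}

// equalAsSet compares sorted copies, so order is ignored
// (a rough analogue of a TypeSet-style comparison).
func equalAsSet(a, b []string) bool {
	x := append([]string(nil), a...)
	y := append([]string(nil), b...)
	sort.Strings(x)
	sort.Strings(y)
	return equalAsList(x, y)
}

func main() {
	config := []string{"tg-a", "tg-b"} // order written in the config
	api := []string{"tg-b", "tg-a"}    // order returned by the API
	fmt.Println("list comparison equal:", equalAsList(config, api))
	fmt.Println("set comparison equal: ", equalAsSet(config, api))
}
```

With the list comparison the two values differ, which is the spurious diff described above; with the set comparison they match.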


Also, I'm considering removing the flatten/expand function tests.

I think they're useful, I'd keep them there.

There are plenty of acceptance tests covering these resources now and besides, despite their lines of code, the tests are very basic, and do not cover every possible code path.

I think the intention of a test isn't to cover every possible codepath, that's almost impossible in reality. It's however useful to have tests which test different parts/functions of the program, so that if that small part breaks we can have a small, isolated test making it obvious which part is broken.

If you had an acceptance test failure you'd need to dig deeper to understand that it's caused by flattener/expander.


I can't get the TestAccAWSCodeDeployDeploymentGroup_deploymentStyle_delete test to pass without using Computed: true,

I see, technically Terraform does what it's told - Computed means "anything is ok", including the previous value the user had in config prior to the update (because that's what comes from the API). Whether that's what the user expects is questionable. Either way I have an open AWS support ticket about the inability to remove/reset blue-green configuration.

Here's what I received on 17th September:

I am engaging with our internal team to optimize the output of deployment group for you. Right now there seems no way to remove B/G configuration from a deployment group once added. I will update this ticket when I have the feedback from the internal team.

I don't expect quick response or solution there, so I'd keep it Computed for the time being.


I have sent a question to AWS SDK support regarding the use of Required: No in the SDK documentation.

How do you feel about adding the extra nil checks and keeping it Optional, so that we can merge this PR and eventually follow up later, once AWS comes back with an answer?

@radeksimko radeksimko removed the waiting-response Maintainers are waiting on response from community or contributor. label Oct 4, 2017
@niclic
Contributor Author

niclic commented Oct 4, 2017

To sum this up - do you mind removing the last two commits? Thanks and sorry again!

No problem!

How do you feel about adding the extra nil checks and keeping it Optional, so that we can merge this PR and eventually follow up later, once AWS comes back with an answer?

Sounds good to me.

I will remove the last couple of commits and revisit the overall PR before signing off.

Thanks again for all your help. @radeksimko

@radeksimko
Member

@niclic Hey,
I can take this across the finish line if you want. I know you already spent quite some time on this and part of that was because of my mistake, so I feel I should pay it back 😅

I'd like to cut next release of the provider this week and have this PR in it, so let me know if you're ok with me making the proposed changes and pushing them to your branch.

@niclic
Contributor Author

niclic commented Oct 10, 2017

@radeksimko

I had planned on getting this completed this past weekend, but best laid plans, etc.

I am totally fine with you completing this and finally seeing it merged. Thanks for all your help!

@radeksimko radeksimko force-pushed the codedeploy-blue-green-deployments-redux branch from 06258ff to deccd09 Compare October 11, 2017 10:30
@radeksimko radeksimko force-pushed the codedeploy-blue-green-deployments-redux branch from deccd09 to 78777df Compare October 11, 2017 10:34
@radeksimko radeksimko merged commit ca04743 into hashicorp:master Oct 11, 2017
@JimtotheB

JimtotheB commented Feb 22, 2018

@radeksimko or @niclic Since you guys seem to be behind this functionality, what is the preferred way to keep track of the ASG created during a deploy with

green_fleet_provisioning_option {
  action = "COPY_AUTO_SCALING_GROUP"
}

Currently I am bringing in an ASG from another module, which is destroyed when I deploy the app. I have tried to terraform import the new ASG in, but of course running terraform apply attempts to recreate the original named ASG from the other module. Is there a correct way that either of you guys had in mind on how to keep track of this?

@SvanBoxel

Hey @JimtotheB! Did you find out how to fix this? Running into the same problem.

@niclic
Contributor Author

niclic commented Sep 10, 2018

Please see this comment, which explains this issue in detail, plus some options for working around it.

@ghost

ghost commented Apr 3, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. Thanks!

@hashicorp hashicorp locked and limited conversation to collaborators Apr 3, 2020
8 participants