Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure desired_vcpus is set before including it in update #4855

Merged
merged 3 commits into from Jun 25, 2020

Conversation

hectcastro
Copy link
Contributor

Fixes #4671 #4788

When updating an AWS Batch compute environment, do not specify DesiredvCpus in the call to UpdateComputeEnvironment unless desired_vcpus is set.

Output from acceptance testing:

$ make testacc TESTARGS='-run=TestAccAWSBatchComputeEnvironment_createEc2'
==> Checking that code complies with gofmt requirements...
TF_ACC=1 go test ./... -v -run=TestAccAWSBatchComputeEnvironment_createEc2 -timeout 120m
?       github.com/terraform-providers/terraform-provider-aws   [no test files]
=== RUN   TestAccAWSBatchComputeEnvironment_createEc2
--- PASS: TestAccAWSBatchComputeEnvironment_createEc2 (49.55s)
=== RUN   TestAccAWSBatchComputeEnvironment_createEc2WithTags
--- PASS: TestAccAWSBatchComputeEnvironment_createEc2WithTags (46.84s)
=== RUN   TestAccAWSBatchComputeEnvironment_createEc2WithoutComputeResources
--- PASS: TestAccAWSBatchComputeEnvironment_createEc2WithoutComputeResources (21.39s)
PASS
ok      github.com/terraform-providers/terraform-provider-aws/aws       117.838s

When updating an AWS Batch compute environment, do not specify
`DesiredvCpus` in the call to `UpdateComputeEnvironment` unless
`desired_vcpus` is set.
@ghost ghost added the size/XS Managed by automation to categorize the size of a PR. label Jun 16, 2018
@bflad bflad added bug Addresses a defect in current functionality. service/batch Issues and PRs that pertain to the batch service. labels Jun 18, 2018
@bflad
Copy link
Member

bflad commented Jun 18, 2018

Hey @hectcastro! Can you add an acceptance test which tries to cover this behavior? I have a feeling the proposed fix should actually be adding a DiffSuppressFunc to the desired_vcpus attribute (returning true when the new value is 0) rather than trying to adjust the read/update logic. Please feel free to reach out if you have any questions.

Here's some relevant documentation:

@bflad bflad added the waiting-response Maintainers are waiting on response from community or contributor. label Jun 18, 2018
Set min_vcpus to something greater than the default number of desired
vCPUs when unset.
@ghost ghost added size/M Managed by automation to categorize the size of a PR. and removed size/XS Managed by automation to categorize the size of a PR. labels Jun 19, 2018
@hectcastro
Copy link
Contributor Author

I think I've got a decent test case for not setting desired_vcpus and setting min_vcpus to greater than zero:

$ make testacc TESTARGS='-run=TestAccAWSBatchComputeEnvironment_updateMinvCpus'               
==> Checking that code complies with gofmt requirements...
TF_ACC=1 go test ./... -v -run=TestAccAWSBatchComputeEnvironment_updateMinvCpus -timeout 120m                                                                                    
?       github.com/terraform-providers/terraform-provider-aws   [no test files]
=== RUN   TestAccAWSBatchComputeEnvironment_updateMinvCpus
--- FAIL: TestAccAWSBatchComputeEnvironment_updateMinvCpus (63.36s)
        testing.go:518: Step 1 error: Error applying: 1 error(s) occurred:

                * aws_batch_compute_environment.ec2: 1 error(s) occurred:

                * aws_batch_compute_environment.ec2: : Error executing request, Exception : desiredvCpus should be between minvCpus and maxvCpus, RequestId: 7f8fdc3b-73b1-11e8-bbb8-d18d8e472e6e
                        status code: 400, request id: 7f8fdc3b-73b1-11e8-bbb8-d18d8e472e6e                                                                                       
FAIL
FAIL    github.com/terraform-providers/terraform-provider-aws/aws       63.403s
make: *** [testacc] Error 1

I had a hard time implementing a fix with DiffSuppressFunc (thanks for the documentation pointers—I didn't know they existed, and they were helpful). Even if I return true all of the time from that function, this error still occurs. I think that's because although it suppress the diff, it still passes along the value via the API call.

In trying to make a better sense of desired_vcpus, it behaves a bit like a computed value (AWS can change it out from under you in a MANAGED CE depending on jobs in the queue). From the Terraform configuration DSL, we've been able to suppress errors by looking up the current desired_vcpu count via the AWS CLI and then threading that through a variable that is set as the value for desired_vcpu.

@@ -447,7 +473,8 @@ resource "aws_iam_role_policy_attachment" "aws_ec2_spot_fleet_role" {
########## security group ##########

resource "aws_security_group" "test_acc" {
name = "tf_acc_test_batch_sg_%d"
name = "tf_acc_test_batch_sg_%d"
vpc_id = "${aws_vpc.test_acc.id}"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was added to help produce a valid CE when desired_vcpu is non-zero. If I made code changes to the resource such that it wouldn't send a value for desired vCPUs, the CE would get into an invalid state citing that its associated security group wasn't in the specified VPC.

@ratanos
Copy link

ratanos commented Jun 20, 2018

@hectcastro Thanks. I was planning to file a bug and came across this issue.
My terraform configuration snippet

min_vcpus = 2
desired_vcpus = 2
max_vcpus = 8
lifecycle {
ignore_changes=["desired_vcpus"]
}

Even if AWS Batch does increase desired_vcpus,
terraform apply should ignore desired_vcpus.

For some reason,
Terraform doesn't ignore "desired_vcpus".

Any idea if this is the expected behaviour or related the issue you're fixing ?

@euclideansphere
Copy link
Contributor

seems like the value needs to be excluded from the api request (or included if we pass back the value retrieved from aws). any update on this? anything I can do to move this along?

@rajbettaswamy
Copy link

Is there any current workaround for this issue ?

@theothertom
Copy link

theothertom commented Nov 9, 2018

Is there any current workaround for this issue ?

I do something like this:
-var "batch_desired_cpu_workaround=$(aws batch describe-compute-environments --region eu-west-2 | jq '.computeEnvironments[0].computeResources.desiredvCpus')"
and then pass the desired CPU count to my batch compute environment

@rajbettaswamy
Copy link

Thanks for the workaround @theothertom. This solves my issue. But I had to modify the implementation to fit my requirement for couple of reasons, like:

  1. I use TF in CI/CD pipeline.
  2. We have multiple AWS Batch compute environments, so I can't really query by computeEnvironments[0,1,2...n] using jq.
  3. Also, all my AWS Batch compute environments name ends with random strings due to following issue name prefix request to aws_batch_compute_environment #3207 . So I can't even query my compute environment by name.

So I used external data source template in Terraform:

data "external" "aws_batch_workaround" {
  program = ["bash", "${path.module}/scripts/desired_vcpus.sh"]

  query = {
    environment = "${var.environment}"
    name        = "${var.config["name"]}"
  }
}

Script:

cat desired_vcpus.sh

#!/bin/bash

function check_dependency() {
  test -f $(which jq) || error_exit "jq command not be detected in the path, please install it"
}

function parse_input() {
  eval "$(jq -r '@sh "export ENVIRONMENT=\(.environment) NAME=\(.name)"')"
  if [[ -z "${ENVIRONMENT}" ]]; then export ENVIRONMENT=none; fi
  if [[ -z "${NAME}" ]]; then export NAME=none; fi
}

function fetch_desired_vcpus() {
  desired_vcpus_contents=`aws batch describe-compute-environments --compute-environments $(aws batch describe-compute-environments | grep 'computeEnvironmentName' | awk '{print $2}' | grep '${ENVIRONMENT}-${NAME}.*' | cut -c 2- | rev | cut -c 3- | rev) | jq '.computeEnvironments[0].computeResources.desiredvCpus'`
}

function output() {
  jq -n \
    --arg desired_vcpus "$desired_vcpus_contents" \
    '{"desired_vcpus":$desired_vcpus}'
}

# main ()
check_dependency
parse_input
fetch_desired_vcpus
output

And referenced like this:

resource "aws_batch_compute_environment" "batch" {
  ...
  compute_resources {
  ...
    desired_vcpus      = "${data.external.aws_batch_workaround.result.desired_vcpus}"
  }
}

Inspired from this gist

@ratanos
Copy link

ratanos commented Nov 11, 2018

@rajbettaswamy This one ignores desired_vcpus,
resource "aws_batch_compute_environment" "batch" {
lifecycle {
ignore_changes = ["compute_resources.0.desired_vcpus"]
}
}

@rajbettaswamy
Copy link

@ratanos Wow ! This solves my issue and implementation is way cleaner and simple compared to mine. Thanks for your help.

@kthompson
Copy link

Using the lifecycle ignore changes no longer appears to work in 0.12 of terraform

ie

  lifecycle {
    ignore_changes = [
      "compute_resources.0.desired_vcpus"
    ]
  }

@ghost ghost removed the waiting-response Maintainers are waiting on response from community or contributor. label May 30, 2019
@ratanos
Copy link

ratanos commented May 30, 2019

@hectcastro @bflad Can we merge this PR?

@aeschright aeschright requested a review from a team June 25, 2019 21:31
@chasezimmy
Copy link

chasezimmy commented Jul 29, 2019

If anyone is on Terraform 0.12 and looking for a solution => hashicorp/terraform#21433 (comment)

ignore_changes docs

@nediml
Copy link

nediml commented Sep 27, 2019

If anyone is on Terraform 0.12 and looking for a solution => hashicorp/terraform#21433 (comment)

ignore_changes docs

This only solves the issue when compute env is scaling down (not in all cases too).
When scaling up happens it is needed to set new value for desired vCPUs which has to be equal or higher than minimum vCPUs, and if we ignore desired vCPUs value, 0 will be used since it is default value (which is lower than current minimum vCPUs value).

Therefore, we still need this PR merged ASAP.

@bflad bflad added this to the v2.68.0 milestone Jun 25, 2020
Copy link
Member

@bflad bflad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi everyone 👋 Apologies for the delay with fixing this bug. Using this additional test (thanks @hectcastro) and some minor logic adjustments (ignore sending 0 on creation and only send the value on update if the value is updated) the resource will now support fully optional desired_vcpus.

Output from acceptance testing:

--- PASS: TestAccAWSBatchComputeEnvironment_createEc2WithoutComputeResources (29.11s)
--- PASS: TestAccAWSBatchComputeEnvironment_createSpotWithoutBidPercentage (32.25s)
--- PASS: TestAccAWSBatchComputeEnvironment_createUnmanaged (57.26s)
--- PASS: TestAccAWSBatchComputeEnvironment_createUnmanagedWithComputeResources (58.43s)
--- PASS: TestAccAWSBatchComputeEnvironment_createEc2WithTags (59.85s)
--- PASS: TestAccAWSBatchComputeEnvironment_createSpotWithAllocationStrategy (63.04s)
--- PASS: TestAccAWSBatchComputeEnvironment_createSpot (63.87s)
--- PASS: TestAccAWSBatchComputeEnvironment_createWithNamePrefix (72.05s)
--- PASS: TestAccAWSBatchComputeEnvironment_createEc2 (73.24s)
--- PASS: TestAccAWSBatchComputeEnvironment_launchTemplate (75.56s)
--- PASS: TestAccAWSBatchComputeEnvironment_updateState (87.29s)
--- PASS: TestAccAWSBatchComputeEnvironment_UpdateLaunchTemplate (100.51s)
--- PASS: TestAccAWSBatchComputeEnvironment_updateInstanceType (100.57s)
--- PASS: TestAccAWSBatchComputeEnvironment_updateComputeEnvironmentName (103.34s)
--- PASS: TestAccAWSBatchComputeEnvironment_ComputeResources_MaxVcpus (115.60s)
--- PASS: TestAccAWSBatchComputeEnvironment_ComputeResources_DesiredVcpus_Computed (159.84s)
--- PASS: TestAccAWSBatchComputeEnvironment_ComputeResources_MinVcpus (225.13s)

bflad added a commit that referenced this pull request Jun 25, 2020
@bflad bflad merged commit d416c66 into hashicorp:master Jun 25, 2020
c4po added a commit to c4po/terraform-provider-aws that referenced this pull request Jun 25, 2020
* origin/master: (59 commits)
  Update CHANGELOG for hashicorp#13935
  resource/aws_batch_compute_environment: Remove resource from Terraform state when not found instead of returning error (hashicorp#13935)
  resource/aws_dynamodb_table: Return error instead of panic on empty CreateTable response (hashicorp#13925)
  Update CHANGELOG for hashicorp#13918
  New Data Source: aws_efs_access_points (hashicorp#13918)
  tests/resource/aws_instance: Ensure sweeper has dependencies on resources that manage EC2 Instances (hashicorp#13917)
  Update CHANGELOG for hashicorp#13937
  Update CHANGELOG for hashicorp#5448
  resource/aws_cloudtrail: Handle single event selector in cloudtrail with non-default read_write_type (hashicorp#5448)
  Update CHANGELOG for hashicorp#13892
  correct retry message to match in error handling
  correct import resource name
  accept empty string in volume_type validation
  Update CHANGELOG for hashicorp#4855
  resource/aws_batch_compute_environment: Support fully optional desired_vcpus and wait for updates
  add retry error handling for SLR
  remove unused WebACL resource name in disappears test
  reference current AWS partition in iam role policy stmt
  remove duplicated disappears test step
  Update CHANGELOG for hashicorp#13926
  ...
@ghost
Copy link

ghost commented Jun 26, 2020

This has been released in version 2.68.0 of the Terraform AWS provider. Please see the Terraform documentation on provider versioning or reach out if you need any assistance upgrading.

For further feature requests or bug reports with this functionality, please create a new GitHub issue following the template for triage. Thanks!

@ghost
Copy link

ghost commented Jul 25, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. Thanks!

@ghost ghost locked and limited conversation to collaborators Jul 25, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
bug Addresses a defect in current functionality. service/batch Issues and PRs that pertain to the batch service. size/M Managed by automation to categorize the size of a PR.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

resource aws_batch_compute_environment gives error if optional desired_vcpus is not set
9 participants