ClientException: Too many concurrent attempts to create a new revision of the specified family. #9777

Open
Dzhuneyt opened this issue Aug 15, 2019 · 16 comments
Labels
bug Addresses a defect in current functionality. service/ecs Issues and PRs that pertain to the ecs service.

Comments

@Dzhuneyt
Contributor

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Terraform Version

Terraform v0.12.5

Affected Resource(s)

  • aws_ecs_task_definition

Terraform Configuration Files

data "template_file" "task_definition__backend" {
  template = file("${path.module}/task_definitions/backend.json")

  vars = {
    image_url = "1111111111111111.dkr.ecr.us-east-1.amazonaws.com/my-repo-here/backend:${var.version_tag}"
    container_name = "backend"

    log_group_region = data.aws_region.current.name
    log_group_name = aws_cloudwatch_log_group.app.name
  }
}

data "template_file" "task_definition__frontend" {
  template = file("${path.module}/task_definitions/frontend.json")

  vars = {
    image_url = "1111111111111111.dkr.ecr.us-east-1.amazonaws.com/my-repo-here/frontend:${var.version_tag}"
    container_name = "frontend"

    log_group_region = data.aws_region.current.name
    log_group_name = aws_cloudwatch_log_group.app.name
  }
}

resource "aws_ecs_task_definition" "backend" {
  family = local.ecs_cluster_name
  container_definitions = data.template_file.task_definition__backend.rendered
  network_mode = "awsvpc"
}
resource "aws_ecs_task_definition" "frontend" {
  family = local.ecs_cluster_name
  container_definitions = data.template_file.task_definition__frontend.rendered
  network_mode = "awsvpc"
}

resource "aws_ecs_service" "backend" {
  name = "${local.ecs_cluster_name}_backend"
  cluster = aws_ecs_cluster.ecs_cluster.id
  task_definition = aws_ecs_task_definition.backend.arn
  desired_count = "1"
  deployment_minimum_healthy_percent = 100
  deployment_maximum_percent = 300
  network_configuration {
    subnets = aws_subnet.private_subnet.*.id
    security_groups = [
      aws_security_group.sg_for_ec2_instances.id]
  }

  load_balancer {
    # Register the ECS service within the ALB target group
    # This makes the service participate in health checks
    # and receive traffic when healthy
    target_group_arn = aws_alb_target_group.target_group_backend.arn
    container_name = "backend"
    container_port = "80"
  }

  service_registries {
    registry_arn = aws_service_discovery_service.service_discovery.arn
    container_name = "backend"
    container_port = 80
  }

  depends_on = [
    aws_alb_listener.http_traffic,
  ]
}
resource "aws_ecs_service" "frontend" {
  name = "${local.ecs_cluster_name}_frontend"
  cluster = aws_ecs_cluster.ecs_cluster.id
  task_definition = aws_ecs_task_definition.frontend.arn
  desired_count = "2"
  deployment_minimum_healthy_percent = 100
  deployment_maximum_percent = 300
  network_configuration {
    subnets = aws_subnet.private_subnet.*.id
    security_groups = [
      aws_security_group.sg_for_ec2_instances.id]
  }

  load_balancer {
    target_group_arn = aws_alb_target_group.target_group_frontend.arn
    container_name = "frontend"
    container_port = "80"
  }

  service_registries {
    registry_arn = aws_service_discovery_service.service_discovery.arn
    container_name = "frontend"
    container_port = 80
  }

  depends_on = [
    aws_alb_listener.http_traffic,
    aws_ecs_service.backend,
  ]
}

Expected Behavior

Running terraform apply again and again should not cause any errors. I expect that AWS task definitions get updated properly.

Actual Behavior

AWS task definitions don't get updated, and an error is thrown in approximately 1 out of 5 attempts. If I rerun terraform apply, it usually works.

Error: ClientException: Too many concurrent attempts to create a new revision of the specified family.
        status code: 400, request id: efce29cc-a021-4d6b-b603-d84c8b7a91fa

Steps to Reproduce

  1. terraform apply

Important Factoids

Nothing special. Just two ECS services and the corresponding task definitions for them. It's worth noting that they are both within the same "family". Maybe this has some impact?

@ghost ghost added the service/ecs Issues and PRs that pertain to the ecs service. label Aug 15, 2019
@github-actions github-actions bot added the needs-triage Waiting for first response or review from a maintainer. label Aug 15, 2019
@juls

juls commented Aug 31, 2019

I'm running into the same issue. I reduced the Terraform configuration to make it easier to reproduce (the details I left out are the same as in the initial post):

Terraform Version

Terraform v0.12.7
+ provider.aws v2.26.0

Terraform Configuration Files

resource "aws_ecs_task_definition" "this" {
  count  = 2
  family = "test-family"

  container_definitions = jsonencode([{
    name   = "test"
    image  = "dummy"
    memory = 512
  }])
}

Important Factoids

The error can be circumvented by running terraform apply -parallelism=1, but this slows execution by up to a factor of 10 compared to the default parallelism.

When you set count = 1 it applies without errors, but of course only generates a single resource.

@zeik

zeik commented Nov 26, 2019

I had a similar issue. I was able to fix it by using a different family for each task definition.
For example, use for_each on a map instead of count, then
family = "${local.ecs_cluster_name}-${each.key}"
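
A minimal sketch of this suggestion, applied to the reduced reproduction from the earlier comment. The `local.tasks` map and its keys are hypothetical names introduced for illustration, and resource-level for_each assumes Terraform 0.12.6 or later. Reports later in this thread differ on whether distinct families always avoid the error, so treat it as a workaround to try rather than a confirmed fix.

```hcl
# Hypothetical map of task names to images; each key becomes a family suffix.
locals {
  tasks = {
    backend  = "dummy"
    frontend = "dummy"
  }
}

resource "aws_ecs_task_definition" "this" {
  for_each = local.tasks

  # One family per key, so parallel creates do not register
  # revisions in the same family.
  family = "test-family-${each.key}"

  container_definitions = jsonencode([{
    name   = each.key
    image  = each.value
    memory = 512
  }])
}
```

Unlike count, for_each keys the instances by name, so adding or removing a map entry does not shift the other instances.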

@hatch-carl

hatch-carl commented Dec 11, 2019

You should have two task definitions with different values for family. One for the frontend, one for the backend.
https://docs.aws.amazon.com/AmazonECS/latest/userguide/task_definition_parameters.html#family

When you register a task definition, you give it a family, which is similar to a name for multiple versions of the task definition.

A task definition has nothing to do with your cluster; you can use the same one in many clusters or in many services. But if each service runs a different set of containers, that calls for a different task definition.
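
Applied to the configuration in the original post, that advice might look like the sketch below. The "-backend" and "-frontend" suffixes are assumed names, not something the reporter used; everything else is copied from the original resources.

```hcl
resource "aws_ecs_task_definition" "backend" {
  # Assumed suffix: gives the backend its own family.
  family                = "${local.ecs_cluster_name}-backend"
  container_definitions = data.template_file.task_definition__backend.rendered
  network_mode          = "awsvpc"
}

resource "aws_ecs_task_definition" "frontend" {
  # Assumed suffix: gives the frontend its own family.
  family                = "${local.ecs_cluster_name}-frontend"
  container_definitions = data.template_file.task_definition__frontend.rendered
  network_mode          = "awsvpc"
}
```

Changing family forces Terraform to replace both task definitions, and the services that reference their ARNs will be updated to the new revisions on the next apply.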

@justinTM

justinTM commented Oct 14, 2021

still an issue 2 years later lol

for terragrunt, use:
--terragrunt-parallelism 4

see https://terragrunt.gruntwork.io/docs/features/execute-terraform-commands-on-multiple-modules-at-once/#limiting-the-module-execution-parallelism

@justinretzolk
Member

Hey y'all 👋 Thank you for taking the time to file this issue and for the continued discussion! Given that there have been a number of AWS provider releases since this was initially filed (and since the last update), can anyone confirm whether you're still experiencing this behavior?

@justinretzolk justinretzolk added waiting-response Maintainers are waiting on response from community or contributor. and removed needs-triage Waiting for first response or review from a maintainer. labels Dec 9, 2021
@github-actions github-actions bot removed the waiting-response Maintainers are waiting on response from community or contributor. label Dec 17, 2021
@andir

andir commented Dec 23, 2021

I've just seen this issue when deploying via Terraform Cloud :(
TF version is 1.0 and the AWS provider is specified as "~> 3.63.0"

@kuritz

kuritz commented Dec 28, 2021

Yes, I'm also still seeing this. Just hit it now actually which brought me here. I have a root module that deploys 2 target groups of tasks with different versions of the same task family so we can switch back and forth via the load balancer if needed for a blue/green style deployment. Whenever there is a change to our terraform code and both container groups are active we run into this issue.
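
For setups like this one, where both task definitions must stay in the same family, one option that nobody in this thread has confirmed is to serialize the two registrations with an explicit dependency instead of lowering parallelism for the whole run. The sketch below assumes hypothetical resource names "blue" and "green" and reuses the dummy container from the reduced reproduction above.

```hcl
# Hypothetical blue/green pair that intentionally shares one family.
resource "aws_ecs_task_definition" "blue" {
  family = "test-family"

  container_definitions = jsonencode([{
    name   = "test"
    image  = "dummy"
    memory = 512
  }])
}

resource "aws_ecs_task_definition" "green" {
  family = "test-family"

  container_definitions = jsonencode([{
    name   = "test"
    image  = "dummy"
    memory = 512
  }])

  # Explicit ordering: Terraform creates "green" only after "blue"
  # has finished, so the two registrations do not run concurrently.
  depends_on = [aws_ecs_task_definition.blue]
}
```

The design choice here is to restrict ordering to the resources that share a family, leaving the rest of the graph at the default parallelism.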

@justinretzolk justinretzolk added the bug Addresses a defect in current functionality. label Jan 13, 2022
@sjsadowski

Still an issue, ran into it today with provider 4.1

@tim-habitat

tim-habitat commented Jul 22, 2022

Still an issue with hashicorp/aws 4.14
Re-applying a couple of times made it work for me (some tasks were created at each apply).

@mswezey23

Still running into this issue with creating 1 ECS cluster. Running the apply back to back usually gets over it.

@ga-tb

ga-tb commented Sep 15, 2022

Still an issue when using hashicorp/aws v4.15.1.

@promenadeviki

Yep, still receiving this issue as well with TF 1.2.9 and the latest aws provider. Just have to re-run the apply to fix it.

@github-gael-soude

Same for me. The only solution would be to create many families, as proposed in other comments.

@luispabon

Still a problem on provider v4.32.0

@prashankprince

Same issue for me as well.
Re-running works fine with the same family.

@drewdunne

drewdunne commented Jan 10, 2023

Encountered this problem today as well while applying just three containers sharing the same family into a single cluster. Had to give each of them unique family IDs to circumvent it, which, admittedly, is not a terrible workaround.

Edit: Appending an incremental integer to each family name did not work for me. Hm...
