taufort
Contributor

@taufort taufort commented Sep 15, 2025

Terraform sometimes tries to recreate the ECS service's security group even though the VPC has not actually changed.

To avoid this, let's apply the dependency inversion principle (described at https://developer.hashicorp.com/terraform/language/modules/develop/composition#dependency-inversion) and pass the VPC ID to the module as an input.

Before this PR, here is what could happen during the plan:

  # module.ecs_service_backend.module.service.data.aws_subnet.this[0] will be read during apply
  # (depends on a resource or a module with changes pending)
 <= data "aws_subnet" "this" {
      + arn                                            = (known after apply)
      + assign_ipv6_address_on_creation                = (known after apply)
      + availability_zone                              = (known after apply)
      + availability_zone_id                           = (known after apply)
      + available_ip_address_count                     = (known after apply)
      + cidr_block                                     = (known after apply)
      + customer_owned_ipv4_pool                       = (known after apply)
      + default_for_az                                 = (known after apply)
      + enable_dns64                                   = (known after apply)
      + enable_lni_at_device_index                     = (known after apply)
      + enable_resource_name_dns_a_record_on_launch    = (known after apply)
      + enable_resource_name_dns_aaaa_record_on_launch = (known after apply)
      + id                                             = "subnet-123456abcdef"
      + ipv6_cidr_block                                = (known after apply)
      + ipv6_cidr_block_association_id                 = (known after apply)
      + ipv6_native                                    = (known after apply)
      + map_customer_owned_ip_on_launch                = (known after apply)
      + map_public_ip_on_launch                        = (known after apply)
      + outpost_arn                                    = (known after apply)
      + owner_id                                       = (known after apply)
      + private_dns_hostname_type_on_launch            = (known after apply)
      + region                                         = (known after apply)
      + state                                          = (known after apply)
      + tags                                           = (known after apply)
      + vpc_id                                         = (known after apply)
    }

# ...

# module.ecs_service_backend.module.service.aws_security_group.this[0] must be replaced
+/- resource "aws_security_group" "this" {
      ~ arn                    = "arn:aws:ec2:eu-west-1:123456789123:security-group/sg-abcd123" -> (known after apply)
      ~ egress                 = [
          - {
              - cidr_blocks      = [
                  - "0.0.0.0/0",
                ]
              - description      = "Example"
              - from_port        = 443
              - ipv6_cidr_blocks = []
              - prefix_list_ids  = []
              - protocol         = "tcp"
              - security_groups  = []
              - self             = false
              - to_port          = 443
            },
        ] -> (known after apply)
      ~ id                     = "sg-abcd123" -> (known after apply)
      ~ ingress                = [
          - {
              - cidr_blocks      = []
              - description      = "Ingress example"
              - from_port        = 8080
              - ipv6_cidr_blocks = []
              - prefix_list_ids  = []
              - protocol         = "tcp"
              - security_groups  = [
                  - "sg-edfgh456",
                ]
              - self             = false
              - to_port          = 8080
            },
        ] -> (known after apply)
      ~ name                   = "example" -> (known after apply)
      ~ owner_id               = "123456789123" -> (known after apply)
      ~ vpc_id                 = "vpc-ijkl789" -> (known after apply) # forces replacement
        # (4 unchanged attributes hidden)
    }

The module derives the VPC ID from the aws_subnet data source. When that data source is only read during apply, vpc_id becomes (known after apply), which forces the replacement of the aws_security_group. Passing the VPC ID as an input of the module fixes this issue.
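
As a rough sketch of what this looks like from the caller's side, the root module can hand the VPC ID to the service module directly instead of letting the module look it up from a subnet. The `module.vpc` reference and the omitted arguments are assumptions made for illustration; only the `vpc_id` input comes from this PR.

```hcl
module "ecs_service" {
  source = "terraform-aws-modules/ecs/aws//modules/service"

  # Other service arguments are omitted from this sketch.

  # Assumption: a VPC module named "vpc" exists in the same root module.
  subnet_ids = module.vpc.private_subnets

  # New input. The intent is that the security group takes its VPC from this
  # value rather than from the aws_subnet data source inside the module.
  vpc_id = module.vpc.vpc_id
}
```

Because the value flows in from the caller, the module no longer has to discover it on its own, which is the dependency inversion idea referenced above.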

How Has This Been Tested?

  • I have updated at least one of the examples/* to demonstrate and validate my change(s)
  • I have tested and validated these changes using one or more of the provided examples/* projects
  • I have executed pre-commit run -a on my pull request
  • I have checked the Security Group is no longer recreated in the Terraform plans on my environments using this Terraform module

@taufort taufort changed the title from "feat: add new vpc_id input" to "feat: Add new vpc_id input" Sep 15, 2025
@taufort taufort force-pushed the feat/network-dependency-inversion branch from 81eac13 to 4d16e1f September 15, 2025 10:58
@taufort
Contributor Author

taufort commented Sep 15, 2025

Hey @antonbabenko 👋

I submitted this PR to improve the service module. When you have a few minutes, it'd be super cool to have your review 😸

Thanks a lot.

@bryantbiggs
Member

Do you have a reproduction of the issue?

@taufort
Contributor Author

taufort commented Sep 15, 2025

> Do you have a reproduction of the issue?

Hi @bryantbiggs 👋

Yeah, this issue is triggered when you upgrade from AWS provider v5 to v6. I was able to reproduce it from this repository, on one of my accounts, with the examples/fargate root module:

  • Check out tag v5.12.1
  • Apply the code in root module examples/fargate
  • Check out master
  • Re-apply the code in root module examples/fargate and you'll get this in the plan:
# ...

  # module.ecs_service.aws_security_group.this[0] must be replaced
+/- resource "aws_security_group" "this" {
      ~ arn                    = "arn:aws:ec2:XXXXX:XXXXX:security-group/sg-00bad43955f5c1eac" -> (known after apply)
      ~ egress                 = [
          - {
              - cidr_blocks      = [
                  - "0.0.0.0/0",
                ]
              - from_port        = 0
              - ipv6_cidr_blocks = []
              - prefix_list_ids  = []
              - protocol         = "-1"
              - security_groups  = []
              - self             = false
              - to_port          = 0
                # (1 unchanged attribute hidden)
            },
        ] -> (known after apply)
      ~ id                     = "sg-00bad43955f5c1eac" -> (known after apply)
      ~ ingress                = [
          - {
              - cidr_blocks      = []
              - description      = "Service port"
              - from_port        = 3000
              - ipv6_cidr_blocks = []
              - prefix_list_ids  = []
              - protocol         = "tcp"
              - security_groups  = [
                  - "sg-06e0a07950ba06f91",
                ]
              - self             = false
              - to_port          = 3000
            },
        ] -> (known after apply)
      ~ name                   = "ex-fargate-2025091513463824650000000f" -> (known after apply)
      ~ owner_id               = "XXXXX" -> (known after apply)
        tags                   = {
            "Example"    = "ex-fargate"
            "Name"       = "ex-fargate"
            "Repository" = "https://github.com/terraform-aws-modules/terraform-aws-ecs"
        }
      ~ vpc_id                 = "vpc-004b7cd5593b0695f" -> (known after apply) # forces replacement
        # (5 unchanged attributes hidden)
    }

# ...

@bryantbiggs
Member

you can work around that with:

terraform apply -target='module.ecs_service.data.aws_subnet.this[0]' \
  -target="module.ecs_service.data.aws_caller_identity.current" \
  -target="module.ecs_service.data.aws_partition.current" \
  -target="module.ecs_service.data.aws_region.current" \
  -target="module.ecs_task_definition.data.aws_caller_identity.current" \
  -target="module.ecs_task_definition.data.aws_partition.current" \
  -target="module.ecs_task_definition.data.aws_region.current"

that ensures any data sources have been updated before proceeding with resource updates. data sources are updated due to the new AWS provider version (v6.0) and the new attributes that the provider has added

@bryantbiggs
Member

to be clear, you are migrating across a major version of both the module and the AWS provider, both of which are breaking changes. we aim to minimize disruption as much as possible but it is not guaranteed - some manual intervention may be required during upgrades

@taufort
Contributor Author

taufort commented Sep 16, 2025

> to be clear, you are migrating across a major version of both the module and the AWS provider, both of which are breaking changes. we aim to minimize disruption as much as possible but it is not guaranteed - some manual intervention may be required during upgrades

This may be due to the migration to version 6 in this case, but we hit this bug before with AWS provider v5 and had to fork this repository to fix it by adding a vpc_id input. I'll try to provide you with a reproduction of this problem.

@dpandotmr

I also experience the issue sporadically, but I'm unsure what causes it (I thought it could be due to the ordering of subnet_ids, as I'm using a list comprehension to generate the value, but changing the order doesn't trigger it). I always use the latest version 6 of the AWS provider for this project.

It is slightly annoying when it happens, as Terraform re-creates the ECS service as well.

@bryantbiggs
Member

any update on a reproduction? is depends_on being specified in the module configuration?
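
For context on the `depends_on` question: when a module call carries `depends_on`, Terraform defers reading the data sources inside that module until pending changes to the dependencies have been applied, which matches the "depends on a resource or a module with changes pending" note in the first plan output. A hypothetical configuration that would behave this way is sketched below. `aws_lb.this` and `module.vpc` are illustrative names, and nothing in this thread confirms that this is the cause here.

```hcl
module "ecs_service" {
  source = "terraform-aws-modules/ecs/aws//modules/service"

  # Other service arguments are omitted from this sketch.
  subnet_ids = module.vpc.private_subnets

  # Hypothetical dependency. Whenever aws_lb.this has pending changes, the
  # data sources inside the module (including aws_subnet.this) are read during
  # apply, so values derived from them show up as "(known after apply)".
  depends_on = [aws_lb.this]
}
```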

taufort and others added 2 commits October 1, 2025 09:21
Sometimes, it happens that Terraform tries to recreate the security group of the ECS service whereas the VPC did not actually change. To avoid this issue, let's use the dependency inversion principle (described here https://developer.hashicorp.com/terraform/language/modules/develop/composition#dependency-inversion) by passing the VPC ID as an input.
@bryantbiggs bryantbiggs changed the title from "feat: Add new vpc_id input" to "feat: Allow specifying VPC ID used by service security group in lieu of deriving from subnets provided" Oct 1, 2025
@bryantbiggs bryantbiggs force-pushed the feat/network-dependency-inversion branch from 4d16e1f to 139222e October 1, 2025 14:26
@bryantbiggs bryantbiggs merged commit ac8f420 into terraform-aws-modules:master Oct 1, 2025
12 checks passed
antonbabenko pushed a commit that referenced this pull request Oct 1, 2025
## [6.6.0](v6.5.0...v6.6.0) (2025-10-01)

### Features

* Allow specifying VPC ID used by service security group in lieu of deriving from subnets provided ([#353](#353)) ([ac8f420](ac8f420))
@antonbabenko
Member

This PR is included in version 6.6.0 🎉

@taufort
Contributor Author

taufort commented Oct 1, 2025

Hi @bryantbiggs.

Sorry for the delay, I'm currently on vacation. I have not been able to reproduce the issue with AWS provider version 6 yet.

Anyway, thanks a lot for merging this new feature. It will allow us to avoid the problem if it comes back.

@JoeyBG JoeyBG left a comment

Hello! I ran into an issue tonight when deploying, and I just realized this was newly released. I think there is an issue with the default logic. I was migrating services, and a security group ended up being created in the default VPC instead of the correct VPC (where the subnets are located). It took me a while to figure this out. I managed to fix it by specifying the VPC manually, but I think this breaks otherwise.

  name_prefix = var.security_group_use_name_prefix ? "${local.security_group_name}-" : null
  description = var.security_group_description
- vpc_id = data.aws_subnet.this[0].vpc_id
+ vpc_id = try(data.aws_subnet.this[0].vpc_id, var.vpc_id)

This should be the other way around

Suggested change
- vpc_id = try(data.aws_subnet.this[0].vpc_id, var.vpc_id)
+ vpc_id = coalesce(var.vpc_id, data.aws_subnet.this[0].vpc_id)


  data "aws_subnet" "this" {
-   count = local.create_security_group ? 1 : 0
+   count = local.create_security_group && var.vpc_id != null ? 1 : 0

Suggested change
- count = local.create_security_group && var.vpc_id != null ? 1 : 0
+ count = local.create_security_group && var.vpc_id == null ? 1 : 0

And the data source should only be read when vpc_id is not set.
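
Putting the two suggestions together, a minimal module-side sketch could look like the following. It is an illustration under stated assumptions, not the module's actual code: `var.create_security_group` stands in for `local.create_security_group`, and the subnet `id` argument and the `name_prefix` value are made up for the example. It also uses `one()` with a splat in place of `[0]`, since indexing a data source whose `count` is 0 is expected to fail even when wrapped in `coalesce`.

```hcl
variable "create_security_group" {
  type    = bool
  default = true
}

variable "subnet_ids" {
  type = list(string)
}

variable "vpc_id" {
  description = "Optional VPC ID. When set, the subnet lookup is skipped."
  type        = string
  default     = null
}

# Derive the VPC from the first subnet only when the caller did not pass vpc_id.
data "aws_subnet" "this" {
  count = var.create_security_group && var.vpc_id == null ? 1 : 0

  id = element(var.subnet_ids, 0)
}

resource "aws_security_group" "this" {
  count = var.create_security_group ? 1 : 0

  name_prefix = "example-"

  # Prefer the explicit input. one() returns null when the lookup has
  # count = 0, so the expression never indexes into an empty list.
  vpc_id = coalesce(var.vpc_id, one(data.aws_subnet.this[*].vpc_id))
}
```

With this shape, setting `vpc_id` removes the data source from the graph entirely, and leaving it unset keeps the earlier behavior of deriving the VPC from the subnets.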

@bryantbiggs
Member

I think you should provide a reproduction before assuming - makes it easier to triage 😬
