
terraform should destroy and create resources if there is change in settings of aks cluster #389

Closed
1 task done
nayaksuraj opened this issue Jun 8, 2023 · 21 comments
Labels: breaking-change, bug
Comments

@nayaksuraj

nayaksuraj commented Jun 8, 2023

Is there an existing issue for this?

  • I have searched the existing issues

Greenfield/Brownfield provisioning

brownfield

Terraform Version

1.4.6

Module Version

7.1.0

AzureRM Provider Version

3.58.0

Affected Resource(s)/Data Source(s)

azurerm_kubernetes_cluster

Terraform Configuration Files

resource "random_id" "prefix" {
  byte_length = 8
}

resource "azurerm_resource_group" "main" {
  count = var.create_resource_group ? 1 : 0

  location = var.location
  name     = coalesce(var.resource_group_name, "${random_id.prefix.hex}-rg")
}

locals {
  resource_group = {
    name     = var.create_resource_group ? azurerm_resource_group.main[0].name : var.resource_group_name
    location = var.location
  }
}

resource "azurerm_virtual_network" "test" {
  address_space       = ["10.52.0.0/16"]
  location            = local.resource_group.location
  name                = "${random_id.prefix.hex}-vn"
  resource_group_name = local.resource_group.name
}

resource "azurerm_subnet" "test" {
  address_prefixes                               = ["10.52.0.0/24"]
  name                                           = "${random_id.prefix.hex}-sn"
  resource_group_name                            = local.resource_group.name
  virtual_network_name                           = azurerm_virtual_network.test.name
  enforce_private_link_endpoint_network_policies = true
}

locals {
  nodes = {
    for i in range(1) : "w${i}" => {
      name           = substr("worker${i}${random_id.prefix.hex}", 0, 8)
      vm_size        = "Standard_D2s_v3"
      node_count     = 1
      vnet_subnet_id = azurerm_subnet.test.id
    }
  }
}

module "aks" {
  source = "../.."

  prefix                        = "prefix-${random_id.prefix.hex}"
  resource_group_name           = local.resource_group.name
  os_disk_size_gb               = 50
  public_network_access_enabled = true
  sku_tier                      = "Standard"
  rbac_aad                      = false
  vnet_subnet_id                = azurerm_subnet.test.id
  node_pools                    = {}
  agents_pool_name = "np"
}

tfvars variables values

variable "create_resource_group" {
  type     = bool
  default  = false
  nullable = false
}

variable "location" {
  default = "westeurope"
}

variable "resource_group_name" {
  type    = string
  default = "common"
}

Debug Output/Panic Output

module.aks.azurerm_kubernetes_cluster.main: Creating...
╷
│ Error: A resource with the ID "/subscriptions/ggjjkkkk-333-44-5555-cfrwwwwww/resourceGroups/core-common/providers/Microsoft.ContainerService/managedClusters/aks-demo-cls" already exists - to be managed via Terraform this resource needs to be imported into the State. Please see the resource documentation for "azurerm_kubernetes_cluster" for more information.
│ 
│   with module.aks.azurerm_kubernetes_cluster.main,
│   on .terraform/modules/aks/main.tf line 17, in resource "azurerm_kubernetes_cluster" "main":
│   17: resource "azurerm_kubernetes_cluster" "main" {
│

Expected Behaviour

Terraform should destroy the existing cluster first and then create the replacement. Instead, it tries to create the new cluster first.

Actual Behaviour

No response

Steps to Reproduce

  1. First create the cluster using one example available in the repo
  2. Change any setting that forces cluster replacement, for example the default node pool name, and apply. You will see the error above: Terraform tries to create the new cluster first instead of destroying the existing one and then recreating it.

Important Factoids

No response

References

No response

@nayaksuraj added the bug label on Jun 8, 2023
@zioproto
Contributor

Hello @nayaksuraj

Can you please confirm that you are using the same Terraform state file when you run Terraform the second time?

@nayaksuraj
Author

Hello @nayaksuraj

Can you please confirm that you are using the same Terraform state file when you run Terraform the second time?

Hi @zioproto ,

Yes, I am using the same Terraform state file. I tried several times with both the main branch and the 7.0.0 tag; the result is the same.

@zioproto
Contributor

Hello,

I am not able to reproduce this problem with the information you provided.

In the error Terraform complains about a cluster called aks-demo-cls already existing in the subscription.

However, in the Terraform configuration files you shared above you have:

  prefix                        = "prefix-${random_id.prefix.hex}"

This will create a cluster called prefix-<random>-aks.

Also, the error refers to a resource group called core-common, but the code you shared would create a resource group called common.

Could you please double-check the data you shared and provide a minimal example that reproduces the problem?

Thanks

@lonegunmanb
Member

Hi @nayaksuraj ,

module "aks" {
  source = "../.."

  prefix                        = "prefix-${random_id.prefix.hex}"
  resource_group_name           = local.resource_group.name
  os_disk_size_gb               = 50
  public_network_access_enabled = true
  sku_tier                      = "Standard"
  rbac_aad                      = false
  vnet_subnet_id                = azurerm_subnet.test.id
  node_pools                    = {}
  agents_pool_name = "np"
}

Your source is "../.."; is that folder correct? Are you placing your own example inside the module's examples folder? I just want to make sure that you're using the right module.

According to the module's implementation, when var.cluster_name is set, the cluster name is not decorated with the prefix. According to your error message:

Error: A resource with the ID "/subscriptions/ggjjkkkk-333-44-5555-cfrwwwwww/resourceGroups/core-common/providers/Microsoft.ContainerService/managedClusters/aks-demo-cls" already exists

It looks like you've set var.cluster_name to aks-demo-cls, but we cannot see that in the config you've posted.
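
For context, the naming behavior described above boils down to something like this (an illustrative sketch, not the module's exact code):

variable "cluster_name" {
  type    = string
  default = null
}

variable "prefix" {
  type = string
}

locals {
  # When cluster_name is set it is used verbatim; otherwise the name is
  # derived from the prefix. Names here are illustrative only.
  cluster_name = coalesce(var.cluster_name, "${var.prefix}-aks")
}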

@avekrivoy

I'm having the same error. Should the prefix variable contain random characters in order for the cluster to be replaced?

@lonegunmanb
Member

Hi @avekrivoy, would you please give us a minimal example that reproduces your issue so we can try it? Thanks!

@avekrivoy

Hi @avekrivoy, would you please give us a minimal example that reproduces your issue so we can try it? Thanks!

I'm using Terraform 1.5.4 (I tried 1.4.6 at first as well).
Here is my module call definition:

module "aks-cluster" {
  source                                     = "Azure/aks/azurerm"
  version                                    = "7.1.0"
  resource_group_name                        = azurerm_resource_group.rg.name
  location                                   = azurerm_resource_group.rg.location
  kubernetes_version                         = var.kubernetes_version
  orchestrator_version                       = var.kubernetes_version
  private_cluster_enabled                    = var.private_cluster_enabled
  public_network_access_enabled              = var.public_network_access_enabled
  prefix                                     = local.aks_prefix

  agents_pool_name                           = "system"
  agents_size                                = var.system_nodepool_agents_size
  agents_count                               = var.system_nodepool_agents_count
  agents_max_pods                            = var.system_nodepool_max_pods
  agents_availability_zones                  = var.system_nodepool_agents_availability_zones
  agents_labels                              = var.system_nodepool_agents_labels
  only_critical_addons_enabled               = var.only_critical_addons_enabled
  rbac_aad                                   = var.rbac_aad
  role_based_access_control_enabled          = var.role_based_access_control_enabled
  rbac_aad_managed                           = var.rbac_aad_managed
  create_role_assignment_network_contributor = var.create_role_assignment_network_contributor

  network_plugin                             = var.network_plugin
  network_policy                             = var.network_policy
  vnet_subnet_id                             = local.aks_vnet_subnet_id
  #pod_subnet_id                              = local.aks_pod_subnet_id
  ###
  net_profile_service_cidr                   = var.net_profile_service_cidr
  net_profile_dns_service_ip                 = var.net_profile_dns_service_ip
  attached_acr_id_map                        = local.attached_acr_id_map
  maintenance_window                         = var.maintenance_window
  public_ssh_key                             = var.public_ssh_key
  # Addons
  key_vault_secrets_provider_enabled         = var.key_vault_secrets_provider_enabled
  secret_rotation_enabled                    = var.secret_rotation_enabled
  secret_rotation_interval                   = var.secret_rotation_interval
  oidc_issuer_enabled                        = var.oidc_issuer_enabled
  workload_identity_enabled                  = var.workload_identity_enabled

  node_pools                                 = local.node_pools

  tags                                       = local.common_tags

  depends_on = [
    module.vnet.vnet_subnets_name_id
  ]
}

If I change the network_policy variable, say, from "azure" to "calico", Terraform shows that the AKS cluster in the module is going to be replaced. But then the apply fails with the following error:

module.aks-cluster.azurerm_kubernetes_cluster.main: Creating...
╷
│ Error: A resource with the ID "/subscriptions/0000000-111111-00000-11111-111111/resourceGroups/DEV-rg/providers/Microsoft.ContainerService/managedClusters/development-aks" already exists - to be managed via Terraform this resource needs to be imported into the State. Please see the resource documentation for "azurerm_kubernetes_cluster" for more information.
│ 
│   with module.aks-cluster.azurerm_kubernetes_cluster.main,
│   on .terraform/modules/aks-cluster/main.tf line 17, in resource "azurerm_kubernetes_cluster" "main":
│   17: resource "azurerm_kubernetes_cluster" "main" {

Please let me know if you need more details. I don't really want to paste everything here, since my Terraform configuration contains a lot of variables.

@zioproto
Contributor

zioproto commented Aug 7, 2023

@avekrivoy, could you please share your provider block?

You just have:

provider "azurerm" {
  features {}
}

Or do you have any additional feature configuration?

@avekrivoy

I don't have any additional features enabled

provider "azurerm" {
  features {}
  subscription_id = "0000000-111111-00000-11111-111111"
  use_msi = true
}

@avekrivoy

@lonegunmanb @zioproto do you need any additional information?

@lonegunmanb
Member

Thanks @avekrivoy , I can reproduce this issue now.

@lonegunmanb
Member

lonegunmanb commented Aug 9, 2023

I think I've figured out the reason.

resource "azurerm_kubernetes_cluster_node_pool" "node_pool" {
  for_each = var.node_pools

  ...

  lifecycle {
    create_before_destroy = true

azurerm_kubernetes_cluster_node_pool.node_pool depends on the AKS cluster resource, and create_before_destroy = true is set on azurerm_kubernetes_cluster_node_pool.node_pool, so the AKS cluster effectively becomes create_before_destroy = true as well. I can demonstrate this with the following sample code:

terraform {
  required_providers {
    azurerm = {
      version = ">3.56.0"
    }
  }
}
provider "azurerm" {
  features {}
}

resource "azurerm_resource_group" "example" {
  name     = "example-aks-${random_pet.suffix.id}"
  location = "West Europe"
}

resource "random_pet" "suffix" {}

resource "azurerm_kubernetes_cluster" "example" {
  name                = "example-aks-${random_pet.suffix.id}"
  location            = azurerm_resource_group.example.location
  resource_group_name = azurerm_resource_group.example.name
  dns_prefix          = "exampleaks1"
  network_profile {
    network_plugin = "azure"
    network_policy = "calico"
  }

  default_node_pool {
    name       = "default"
    node_count = 1
    vm_size    = "Standard_D2_v2"
  }

  identity {
    type = "SystemAssigned"
  }

  tags = {
    Environment = "Production"
  }
}

After the apply, we change network_policy to azure and run terraform plan:

Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
-/+ destroy and then create replacement

Terraform will perform the following actions:

  # azurerm_kubernetes_cluster.example must be replaced
-/+ resource "azurerm_kubernetes_cluster" "example" {

Notice the -/+ destroy and then create replacement: the default behavior for a replacement is to destroy the old resource first, then re-create it.

But if we add a resource that declares create_before_destroy = true:

terraform {
  required_providers {
    azurerm = {
      version = ">3.56.0"
    }
  }
}
provider "azurerm" {
  features {}
}

resource "azurerm_resource_group" "example" {
  name     = "zjhe-aks"
  location = "West Europe"
}

resource "random_pet" "suffix" {}

resource "azurerm_kubernetes_cluster" "example" {
  name                = "zjhe-aks-${random_pet.suffix.id}"
  location            = azurerm_resource_group.example.location
  resource_group_name = azurerm_resource_group.example.name
  dns_prefix          = "exampleaks1"
  network_profile {
    network_plugin = "azure"
    network_policy = "calico"
  }

  default_node_pool {
    name       = "default"
    node_count = 1
    vm_size    = "Standard_D2_v2"
  }

  identity {
    type = "SystemAssigned"
  }

  tags = {
    Environment = "Production"
  }
}

resource "null_resource" "test" {
  count = 0
  triggers = {
    input = azurerm_kubernetes_cluster.example.name
  }
  lifecycle {
    create_before_destroy = true
  }
}

Notice that the null_resource's count is 0. Now let's change network_policy to azure and plan again:

Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
+/- create replacement and then destroy

Terraform will perform the following actions:

  # azurerm_kubernetes_cluster.example must be replaced

This time Terraform tries to create the replacement resource first and only then destroy the old one, and that is what causes the error.

I don't think this is a Terraform Core bug: once a downstream resource is create_before_destroy, there is no way to create the new downstream resource first without also creating the corresponding new upstream resource first when both must be re-created.

For now, the only solution I can offer is adding a random string to the AKS cluster's name, as we've already done for the azurerm_kubernetes_cluster_node_pool resource:

resource "azurerm_kubernetes_cluster_node_pool" "node_pool" {
  for_each = var.node_pools

  kubernetes_cluster_id         = azurerm_kubernetes_cluster.main.id
  name                          = "${each.value.name}${substr(md5(jsonencode(each.value)), 0, 4)}"

But that would be considered a breaking change, so I will provide a feature toggle variable so the module's caller can decide whether or not to use this random name suffix.
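
As a rough sketch, such a toggle could look like the following (the variable and resource names are hypothetical, not the module's final API):

variable "cluster_name_random_suffix_enabled" {
  type    = bool
  default = false
}

variable "prefix" {
  type = string
}

variable "network_policy" {
  type    = string
  default = "azure"
}

resource "random_string" "cluster_suffix" {
  count   = var.cluster_name_random_suffix_enabled ? 1 : 0
  length  = 4
  special = false
  upper   = false

  # Regenerate the suffix whenever a replacement-forcing setting changes;
  # network_policy is just one example of such a setting.
  keepers = {
    network_policy = var.network_policy
  }
}

locals {
  # Empty string when the toggle is off, the generated suffix when it is on.
  cluster_suffix = join("", random_string.cluster_suffix[*].result)

  # With the toggle on, a replacement gets a new cluster name, so the new
  # cluster can be created before the old one is destroyed without a name
  # collision.
  cluster_name = var.cluster_name_random_suffix_enabled ? "${var.prefix}-aks-${local.cluster_suffix}" : "${var.prefix}-aks"
}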

@avekrivoy

So this random suffix should be generated inside the module? Would it help if I generated it outside the module and passed it in?

resource "random_id" "aks_prefix" {
  byte_length = 8
}

locals {
  aks_prefix = "${var.customer}-${var.environment}-${random_id.aks_prefix.hex}"
}

module "aks-cluster" {
  source                                     = "Azure/aks/azurerm"
  version                                    = "7.1.0"
  prefix                                     = local.aks_prefix
...

@avekrivoy

Nope, that didn't work. I just checked.

@lonegunmanb
Member

Hi @avekrivoy, could you try the following approach?

resource "null_resource" "name_keeper" {
  triggers = {
    network_policy = var.network_policy
  }
}
resource "random_id" "aks_prefix" {
  byte_length = 8
  lifecycle {
    replace_triggered_by = [null_resource.name_keeper.id]
  }
}
locals {
  aks_prefix = "${var.customer}-${var.environment}-${random_id.aks_prefix.hex}"
}

I've tried different approaches in the module, but none of them would be a non-breaking change. I may have to defer this patch to our next major version upgrade, which could be months away, so for now it's better to work around it outside this module.

@avekrivoy

Well, it looks like that might work, since a new unique id will be generated on changes. But network policy is not the only parameter that triggers cluster replacement, and keeping every such parameter in a null_resource as triggers is quite messy in my opinion.
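
One way to keep that workaround manageable (a sketch, assuming you maintain the list of replacement-forcing inputs yourself) is to hash them into a single trigger:

locals {
  # Every input that forces the cluster to be replaced goes into this map;
  # which inputs those are is something the caller has to keep track of.
  replacement_keepers = {
    network_plugin = var.network_plugin
    network_policy = var.network_policy
  }
}

resource "null_resource" "name_keeper" {
  triggers = {
    config_hash = md5(jsonencode(local.replacement_keepers))
  }
}

resource "random_id" "aks_prefix" {
  byte_length = 8

  lifecycle {
    # A new prefix, and therefore a new cluster name, is generated whenever
    # the hash above changes.
    replace_triggered_by = [null_resource.name_keeper.id]
  }
}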

@avekrivoy

@lonegunmanb what about making the lifecycle create_before_destroy setting for the node pool optional?

create_before_destroy = true

something like this

variable "create_node_pool_before_destroying" {
  default = true
}
...
  lifecycle {
    create_before_destroy = var.create_node_pool_before_destroying
...

Could this potentially work?

@lonegunmanb
Member

The reason we're facing this issue is that the AKS cluster resource becomes create_before_destroy = true because that setting is turned on for the extra node pool resource.

I understand you'd like to add a toggle variable for this argument so the module caller could decide its destroy behavior, but unfortunately create_before_destroy is a so-called "meta-argument" that doesn't accept an input variable. And even if we set create_before_destroy = false explicitly on the AKS cluster resource, it would not help, because the extra node pool's destroy behavior overrides it.
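
A minimal illustration of that constraint (the comment in the sketch paraphrases the behavior; it is not an exact error message):

resource "null_resource" "example" {
  lifecycle {
    # This must be a constant known at parse time; Terraform rejects a
    # variable reference such as `create_before_destroy = var.some_toggle`.
    create_before_destroy = true
  }
}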

For now, I don't see a non-breaking way to solve this issue (we would need at least a new null_resource as an AKS cluster name keeper, as we've done for the node pool resource). However, we're planning a new provider and a corresponding new resource so we can collect usage telemetry data for these Azure Verified Terraform modules. That will be a breaking change, so we're about to release a new major version in 1-2 months. Maybe we can solve this issue in that major version upgrade.

@avekrivoy

avekrivoy commented Aug 10, 2023

Why was this meta-argument added in the first place? To upgrade a node pool without downtime for the apps running in that pool?

@lonegunmanb
Member

Why was this meta-argument added in the first place? To upgrade a node pool without downtime for the apps running in that pool?

Yes, exactly. When we want to change the VM size of a node pool, we can provision the new pool first and then delete the old one; the deletion triggers eviction of all pods running on the old pool, and they are relaunched on the new pool.
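
That is what the config-hash name combined with create_before_destroy gives you (a simplified sketch of the pattern, not the module's full resource definition):

resource "azurerm_kubernetes_cluster_node_pool" "pool" {
  for_each = var.node_pools

  kubernetes_cluster_id = azurerm_kubernetes_cluster.main.id
  # Hashing the pool's configuration into its name means any change (for
  # example a new vm_size) yields a new name, which forces replacement...
  name       = "${each.value.name}${substr(md5(jsonencode(each.value)), 0, 4)}"
  vm_size    = each.value.vm_size
  node_count = each.value.node_count

  lifecycle {
    # ...and create_before_destroy makes Terraform provision the new pool
    # before deleting the old one, so evicted pods can be rescheduled onto
    # the new pool immediately.
    create_before_destroy = true
  }
}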

@lonegunmanb
Member

I'm closing this issue since I think it will be solved by v8.

Since Terraform Core treats the AKS cluster resource as create_before_destroy = true, there is nothing we can do about it in the current major version.
