
Changing Subscription ID causes existing resources to be untracked and state to go missing #11662

Open
daviddob opened this issue May 11, 2021 · 4 comments

Comments

@daviddob

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Terraform (and AzureRM Provider) Version

terraform -v
Terraform v0.15.2
on darwin_amd64
+ provider registry.terraform.io/hashicorp/azurerm v2.58.0

Affected Resource(s)

All azurerm resources appear to be affected by this issue. Thus far it has been confirmed with:

  • azurerm_storage_account
  • azurerm_mysql_server
  • azurerm_postgresql_server
  • azurerm_cosmosdb_account

Terraform Configuration Files

terraform {
  required_version = ">= 0.14.1"
  required_providers {
    azurerm = {
      version = "~>2.57"
    }
  }
}

provider "azurerm" {
  features {}
  subscription_id = "my-azure-subscription-id-1"
}

resource "azurerm_storage_account" "stg" {
  name                      = "testterraformtmp"
  resource_group_name       = "my-resource-group"
  location                  = "eastus"
  account_tier              = "Standard"
  account_replication_type  = "LRS"
}

Debug Output

Contains sensitive info - reach out if further debug info is required to reproduce.

Panic Output

N/A

Expected Behaviour

When changing the subscription_id from my-azure-subscription-id-1 to my-azure-subscription-id-2 after deploying resources, I would expect a valid diff stating that the relevant existing resources need to be replaced where necessary, and on apply the existing resources to be destroyed and new resources to be created in the new subscription.

Actual Behaviour

After changing the subscription_id and running plan or apply, Terraform loses all reference to the existing azurerm resources and reports that the relevant resources will be created (X to add, 0 to change, 0 to destroy), as if this were a clean deployment with no existing state. On deploying with the new subscription_id, if there is an error (and there often is, because names that must be unique conflict with the existing resources in the other subscription), Terraform errors out and updates the tfstate to reflect the "current state", in which no resources were successfully created. This results in a situation where the existing resources from the previous deployment are no longer tracked in the tfstate and the "new resources" can't be deployed. Changing the subscription_id back to the original after the apply does not restore the state; you either need to recover a previous version of the state or re-import all of the missing resources.

Steps to Reproduce

  1. terraform apply using subscription_id A
  2. Change subscription_id to B
  3. terraform apply using subscription_id B (Notice the diff is all new resources and has no reference to the previously deployed resources)
  4. Inspect the state after the apply errors out due to unique-name conflicts and notice the previously deployed resources are missing (see the CLI sketch after this list)
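
A minimal CLI sketch of these steps, using the storage account config above and placeholder subscription IDs (the same applies when switching the subscription via ARM_SUBSCRIPTION_ID instead of the provider block):

# Apply with subscription_id = "my-azure-subscription-id-1" in the provider block
terraform apply          # creates azurerm_storage_account.stg in subscription A

# Edit the provider block so subscription_id = "my-azure-subscription-id-2", then:
terraform plan           # reports "1 to add, 0 to change, 0 to destroy" with no reference to the existing resource
terraform apply          # fails on the globally unique storage account name; the original resource is now gone from state
terraform state list     # azurerm_storage_account.stg is no longer listed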

Important Factoids

  • Nothing special about the environment other than having multiple subscriptions
  • Similar to the issue referenced below, though this issue is experienced when using the Azure CLI auth method combined with either ARM_SUBSCRIPTION_ID or the subscription_id parameter changing to a new subscription.

References

  • This appears related to Terraform destroys the wrong resources when subscription_id is changed #9012, where, when changing subscription IDs, Terraform does not track that the existing resources were created in a different subscription and instead finds resources with the same name (though a different ID, due to the subscription) and affects/destroys them.
  • Both issues have the potential to lose track of existing resources, as well as to unintentionally and unknowingly destroy or modify Terraform-deployed and non-Terraform-deployed resources alike.
@magodo
Collaborator

magodo commented May 17, 2021

@daviddob Thank you for opening this issue!

What happens in your case is that when you run terraform apply using sub_b, Terraform will by default refresh the resources recorded in the state by syncing with the remote. During this step, the provider makes a GET call against the resource's URL, where the URL is constructed from the ID recorded in the state file. Because the current AzureRM provider always assumes the subscription is the one configured (i.e. sub_b), regardless of the existing resource's actual subscription, the provider checks whether the resource URL constructed with sub_b exists in Azure. As that resource was not created in your case (but it is in the case of #9012), the provider concludes the resource no longer exists and clears it from the state file (note that this does not cause a destroy). The next step of the apply is to compare the diff between the config and the (refreshed) state to generate a plan, which in this case is to create a new resource.
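
To illustrate the behaviour described above with the storage account example from this issue (placeholder subscription IDs, request path simplified):

# ID recorded in the state file (the resource was created under sub_a):
#   /subscriptions/<sub_a>/resourceGroups/my-resource-group/providers/Microsoft.Storage/storageAccounts/testterraformtmp
#
# Path the provider effectively checks during refresh when configured with sub_b:
#   /subscriptions/<sub_b>/resourceGroups/my-resource-group/providers/Microsoft.Storage/storageAccounts/testterraformtmp
#
# The second path does not exist, so the resource is treated as deleted and removed from the state.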

Perhaps one thing we can/should do is to respect the subscription ID when doing operations on existing resources, though this needs more consideration.

@aristosvo
Collaborator

I don't want to interfere here, but for this exact reason I created https://github.com/aristosvo/aztfmove. Resources can easily be moved around without the need to recreate them.

@daviddob
Author

Hi @magodo - this is pretty much exactly what I figured was occurring. I just wanted to point out that this is a relatively common issue when working in a multi-subscription environment, and the fact that Terraform does not check when subscriptions are changing has the potential for unexpected behavior. It would be ideal if Terraform could detect that the subscription is changing and trigger a destroy of the old resource and a build of the new one (with proper diff output), in line with the usual expected Terraform behavior when changing a parameter requires recreation.

@daveleescc

I think I've stumbled across the same issue working with Virtual Network Peerings, so didn't want to raise it as a new issue.

I have a configuration where, if a value is passed into the variable "peered_vnet_resource_id", it creates a VNET peering from the VNET it is deploying and also creates the matching peering in the remote VNET.

This is with 2 separate resources like this:

resource "azurerm_virtual_network_peering" "peering_outbound" {
 
  # Only create the peering if value was passed in the variable peered_vnet_resource_id.  The default of "nopeer" means no peering is setup.
  count               = var.peered_vnet_resource_id != "nopeer" ? 1 : 0
  
  name = "peer-to-${split("/", var.peered_vnet_resource_id)[length(split("/", var.peered_vnet_resource_id))-1]}"
  
  resource_group_name = var.resource_group_name
  virtual_network_name = azurerm_virtual_network.hub_vnet.name
  remote_virtual_network_id = var.peered_vnet_resource_id

  allow_virtual_network_access = true
  allow_forwarded_traffic      = true
}

resource "azurerm_virtual_network_peering" "peering_inbound" {
 
  provider = azurerm.peered_sub

  # Only create the peering on the remote vnet if value was passed in the variable peered_vnet_resource_id.  The default of "nopeer" means no peering is setup.
  count               = var.peered_vnet_resource_id != "nopeer" ? 1 : 0
  
  name = "peer-to-${azurerm_virtual_network.hub_vnet.name}"

  resource_group_name = split("/", var.peered_vnet_resource_id)[4]
  virtual_network_name = split("/", var.peered_vnet_resource_id)[length(split("/", var.peered_vnet_resource_id))-1]
  remote_virtual_network_id = azurerm_virtual_network.hub_vnet.id

  allow_virtual_network_access = true
  allow_forwarded_traffic      = true
}

So that I can deploy the remote vnet peer into the right Azure Subscription, I have an alias on the azurerm provider. I pull the subscription ID out of the peered_vnet_resource_id variable into a local and then use that to configure the "peered_sub" version of the azurerm provider.

locals {
  # Was a Peered Virtual Network resource ID passed? If so, get the subscription ID it's in (from its resource ID) so we can use that to deploy the
  # inbound VNET Peering to the remote VNET. Otherwise, we'll just populate this with the current subscription ID so that it has something in it,
  # otherwise it would cause an error when we tried to use it with the azurerm provider.
  peered_sub_id = var.peered_vnet_resource_id == "nopeer" ? var.subscription_id : split("/", var.peered_vnet_resource_id)[2]
}
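
For reference, splitting an Azure resource ID on "/" puts the subscription GUID at index 2 and the resource group name at index 4, because the leading slash produces an empty element at index 0. A quick check in terraform console (hypothetical placeholder ID):

> split("/", "/subscriptions/<sub-id>/resourceGroups/<rg>/providers/Microsoft.Network/virtualNetworks/<vnet>")[2]
"<sub-id>"
> split("/", "/subscriptions/<sub-id>/resourceGroups/<rg>/providers/Microsoft.Network/virtualNetworks/<vnet>")[4]
"<rg>"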

# Azure Provider for the Azure Subscription containing the VNET that this hub network will be peered with. If nothing is supplied, we assume we're not
# peering and use the current subscription again as a placeholder (as calculated in the locals block above).
provider "azurerm" {

  subscription_id = local.peered_sub_id
  tenant_id       = var.tenant_id

  alias = "peered_sub"

  features {}
}

The default value for peered_vnet_resource_id is "nopeer", so if nothing is passed into that variable and it is left as the default, peered_sub_id is set to the current subscription, just so that there's a valid subscription in it; otherwise the provider block for "peered_sub" gets unhappy.

This works fine and creates the peerings on both sides. I run into issues if I subsequently remove the peering. Terraform destroys the resource azurerm_virtual_network_peering.peering_outbound but doesn't destroy the resource azurerm_virtual_network_peering.peering_inbound. It then loses track of the peering_inbound resource, so you get an error if you later want to re-establish the peering, as the resource already exists.

I think this is because the subscription ID in local.peered_sub_id changed, so the subscription that azurerm_virtual_network_peering.peering_inbound is associated with changed. But it didn't trigger Terraform to destroy and recreate the resource (in this case, I would expect a destroy but not a recreate).
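
One way to recover in that situation (a sketch with hypothetical placeholder names and IDs, not a confirmed fix) is to re-import the orphaned peering into state so Terraform can manage or destroy it again, or to remove it out-of-band:

# Re-import the orphaned peering using its full resource ID (placeholders shown);
# peered_vnet_resource_id needs to be set so count = 1 and the resource address exists in config
terraform import 'azurerm_virtual_network_peering.peering_inbound[0]' "/subscriptions/<remote-sub>/resourceGroups/<remote-rg>/providers/Microsoft.Network/virtualNetworks/<remote-vnet>/virtualNetworkPeerings/peer-to-<hub-vnet>"

# ...or delete the leftover peering in Azure directly so a later apply can recreate it
az network vnet peering delete --subscription <remote-sub> --resource-group <remote-rg> --vnet-name <remote-vnet> --name peer-to-<hub-vnet>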
