Subnets on same vnet fail due to parrallel setup #3780

Open
dalee-bis opened this issue Jul 3, 2019 · 30 comments

Comments

@dalee-bis

dalee-bis commented Jul 3, 2019

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Terraform (and AzureRM Provider) Version

Terraform v0.12.3
+ provider.azurerm v1.31.0

Affected Resource(s)

  • azurerm_subnet
  • azurerm_subnet_network_security_group_association
  • Possibly azurerm_virtual_network_peering

Terraform Configuration Files

I'm not able to use the full configuration due to sensitive information in them. Please see this example:

resource "azurerm_virtual_network" "vnet" {
  name                = "my_vnet"
  location            = "northeurope"
  resource_group_name = "vnet_example"
  address_space       = ["10.0.0.0/24"]
}

resource "azurerm_subnet" "subnet1" {
  name                 = "subnet1"
  virtual_network_name = azurerm_virtual_network.vnet.name
  resource_group_name  = "vnet_example"
  address_prefix       = "10.0.0.0/28"
}

resource "azurerm_subnet" "subnet2" {
  name                 = "subnet2"
  virtual_network_name = azurerm_virtual_network.vnet.name
  resource_group_name  = "vnet_example"
  address_prefix       = "10.0.0.64/28"
}

resource "azurerm_subnet" "subnet3" {
  name                 = "subnet3"
  virtual_network_name = azurerm_virtual_network.vnet.name
  resource_group_name  = "vnet_example"
  address_prefix       = "10.0.0.128/28"
}

Expected Behavior

The subnets and other configuration are created on the VNet.

Actual Behavior

Terraform seems to be attempting to apply the separate configurations to the VNet at the same time. I've tried adding dependencies between the subnets, which solved the problem there, but it appeared again when attempting to use an azurerm_subnet_network_security_group_association resource:

Error: Error updating Route Table Association for Subnet "(redacted)" (Virtual Network "(redacted)" / Resource Group "(redacted)"): network.SubnetsClient#CreateOrUpdate: Failure sending request: StatusCode=0 -- Original Error: autorest/azure: Service returned an error. Status=<nil> Code="AnotherOperationInProgress" Message="Another operation on this or dependent resource is in progress. To retrieve status of the operation use uri: https://management.azure.com/subscriptions/(redacted)." Details=[]

Important Factoids

In our configuration, each subnet/security group combination is in its own module. We have explored using dependencies to ensure they run sequentially, but this is not possible with the azurerm_subnet_network_security_group_association resource.
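
For readers on newer Terraform releases, here is a minimal sketch of how per-subnet modules could be forced to run one after another. It assumes Terraform 0.13 or later, where `depends_on` is accepted on `module` blocks; that was not available in the v0.12.3 used in this report. The module path and input variable names are hypothetical, and this does not help when the modules are applied from separate state files.

```hcl
# Hypothetical layout: each module creates one subnet, one security group,
# and the azurerm_subnet_network_security_group_association between them.
module "subnet_a" {
  source = "./modules/subnet_with_nsg" # hypothetical path

  # Hypothetical input variables
  subnet_name    = "subnet1"
  address_prefix = "10.0.0.0/28"
}

module "subnet_b" {
  source = "./modules/subnet_with_nsg" # hypothetical path

  subnet_name    = "subnet2"
  address_prefix = "10.0.0.64/28"

  # Module-level depends_on requires Terraform 0.13 or later.
  # Everything in subnet_b waits until everything in subnet_a is done.
  depends_on = [module.subnet_a]
}
```

The trade-off is the same as with resource-level `depends_on`: the modules no longer run in parallel, so the apply takes longer.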

References

Possibly related to these?

@bcline760

I'm getting this same error on Terraform 0.11.13 and AzureRM provider 1.33.1

Status=<nil> Code="AnotherOperationInProgress" Message="Another operation on this or dependent resource is in progress. To retrieve status of the operation use uri:

This looks related to #3673

To work around it, we're setting explicit dependencies using "depends_on", but that's an awful hack.
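
A minimal sketch of that depends_on workaround, reusing the resource names from the original report's example. It is written in Terraform 0.12 syntax (on 0.11 the references inside `depends_on` would be quoted strings) and assumes a provider version that still accepts `address_prefix`; later major versions use `address_prefixes` instead.

```hcl
resource "azurerm_subnet" "subnet1" {
  name                 = "subnet1"
  virtual_network_name = azurerm_virtual_network.vnet.name
  resource_group_name  = "vnet_example"
  address_prefix       = "10.0.0.0/28"
}

resource "azurerm_subnet" "subnet2" {
  name                 = "subnet2"
  virtual_network_name = azurerm_virtual_network.vnet.name
  resource_group_name  = "vnet_example"
  address_prefix       = "10.0.0.64/28"

  # Wait for subnet1 so the two never modify the VNet concurrently.
  depends_on = [azurerm_subnet.subnet1]
}

resource "azurerm_subnet" "subnet3" {
  name                 = "subnet3"
  virtual_network_name = azurerm_virtual_network.vnet.name
  resource_group_name  = "vnet_example"
  address_prefix       = "10.0.0.128/28"

  # Chain on subnet2, which in turn waits for subnet1.
  depends_on = [azurerm_subnet.subnet2]
}
```

The chain has to be extended by hand for every new subnet, which is why it feels like a hack.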

@bcline760

UPDATE
I'm able to replicate the problem even on Terraform 0.12.7 and AzureRM 1.33.1 with a minimal configuration.

Configuration

locals {

}

provider "azurerm" {

}

resource "azurerm_resource_group" "rg" {
  name     = "bug-3780-rg"
  location = "eastus"
}

resource "azurerm_virtual_network" "vnet" {
  name                = "bug-3780-vnet"
  location            = "eastus"
  resource_group_name = "bug-3780-rg"
  address_space       = ["192.168.14.0/24"]
}

resource "azurerm_subnet" "subnet1" {
  name                 = "bug-3780-subnet1"
  address_prefix       = "192.168.14.0/28"
  resource_group_name  = azurerm_resource_group.rg.name
  virtual_network_name = azurerm_virtual_network.vnet.name
}

resource "azurerm_subnet" "subnet2" {
  name                 = "bug-3780-subnet2"
  address_prefix       = "192.168.14.16/28"
  resource_group_name  = azurerm_resource_group.rg.name
  virtual_network_name = azurerm_virtual_network.vnet.name
}

resource "azurerm_subnet" "subnet3" {
  name                 = "bug-3780-subnet3"
  address_prefix       = "192.168.14.32/28"
  resource_group_name  = azurerm_resource_group.rg.name
  virtual_network_name = azurerm_virtual_network.vnet.name
}

resource "azurerm_subnet" "subnet4" {
  name                 = "bug-3780-subnet4"
  address_prefix       = "192.168.14.48/28"
  resource_group_name  = azurerm_resource_group.rg.name
  virtual_network_name = azurerm_virtual_network.vnet.name
}

Output

azurerm_subnet.subnet1: Creation complete after 2s [<Redacted>/resourceGroups/bug-3780-rg/providers/Microsoft.Network/virtualNetworks/bug-3780-vnet/subnets/bug-3780-subnet1]
azurerm_subnet.subnet2: Creation complete after 2s [<Redacted>/resourceGroups/bug-3780-rg/providers/Microsoft.Network/virtualNetworks/bug-3780-vnet/subnets/bug-3780-subnet2]
azurerm_subnet.subnet4: Still creating... [10s elapsed]
azurerm_subnet.subnet4: Creation complete after 12s [<Redacted>/resourceGroups/bug-3780-rg/providers/Microsoft.Network/virtualNetworks/bug-3780-vnet/subnets/bug-3780-subnet4]

Error: Error Creating/Updating Subnet "bug-3780-subnet3" (Virtual Network "bug-3780-vnet" / Resource Group "bug-3780-rg"): network.SubnetsClient#CreateOrUpdate: Failure sending request: StatusCode=0 -- Original Error: autorest/azure: Service returned an error. Status=<nil> Code="AnotherOperationInProgress" Message="Another operation on this or dependent resource is in progress. To retrieve status of the operation use uri: *<Redacted>* Details=[]

@florianrusch

Same on terraform destroy:

Error: Error deleting Subnet "test-subnet" (Virtual Network "test-vn" / Resource Group "playground"): network.SubnetsClient#Delete: Failure sending request: StatusCode=409 -- Original Error: Code="AnotherOperationInProgress" Message="Another operation on this or dependent resource is in progress. To retrieve status of the operation use uri: https://management.azure.com/subscriptions/xxx/providers/Microsoft.Network/locations/westeurope/operations/xxx?api-version=2018-12-01." Details=[]

@hbuckle
Contributor

hbuckle commented Sep 6, 2019

#3673 has updated the code so that the subnet is locked instead of the vnet, but I think the vnet still needs to be locked as well

	locks.ByName(virtualNetworkName, virtualNetworkResourceName)
	defer locks.UnlockByName(virtualNetworkName, virtualNetworkResourceName)

@Moeser
Contributor

Moeser commented Sep 6, 2019

@bcline760 and @florianrusch I'm curious whether the "dependent resource" referred to in the error is actually the vnet, or the resource group. Do you see the same error if you use a different resource group for each subnet?

I'm not suggesting that as a fix, just as a test to narrow down which resource azure is still modifying behind the scenes.

@Moeser
Contributor

Moeser commented Sep 6, 2019

Looks like this error is happening in more places than just Terraform; azure-powershell, for example: Azure/azure-powershell#1817. Adding the vnet lock back forces the subnets to be created serially (and running serially is the workaround the PowerShell folks seem to be using for now), but that might not carry over to modules running in parallel like the original issue reporter mentioned (note that the original report was for v1.31.0, which still had vnet locks in the subnet resource).

Any way to treat these as retryable errors instead of aborting the run? In all cases I've seen so far, they have been temporary.

@suonto

suonto commented Sep 10, 2019

I'm also seeing this issue regularly. I have multiple terraform state files for different module tests but they all share the same resource group and vnet. Interestingly, I'm using 1.33.1. Today's test log:

Initializing the backend...

Successfully configured the backend "azurerm"! Terraform will automatically
use this backend unless the backend configuration changes.

Initializing provider plugins...
- Checking for available provider plugins...
- Downloading plugin for provider "azurerm" (hashicorp/azurerm) 1.33.1...
- Downloading plugin for provider "dns" (hashicorp/dns) 2.2.0...
- Downloading plugin for provider "null" (hashicorp/null) 2.1.2...
...

Error: Error updating Route Table Association for Subnet "moduletest-gitlabrunner-subnet" (Virtual Network "moduletest-vnet" / Resource Group "moduletest-rg"): network.SubnetsClient#CreateOrUpdate: Failure sending request: StatusCode=409 -- Original Error: Code="AnotherOperationInProgress" Message="Another operation on this or dependent resource is in progress. To retrieve status of the operation use uri: https://management.azure.com/...

@suonto

suonto commented Sep 11, 2019

@tombuildsstuff no, it does not. I've been experiencing this for a long time and have always used the latest provider on ephemeral CI executors. I just wanted to confirm this issue in the hope of getting a proper solution at some point. I'm running 10 module tests in parallel each night. They all have their own tfstate but they share the VNET as a data object. Every morning I see that at least one of the tests has failed with this specific issue. The state of affairs has been like this for at least 3 months.

@Moeser I'm not sure if your change caused any new issues since 409 errors were seen both pre and post the change.

@Moeser
Contributor

Moeser commented Sep 12, 2019

For those running 1.33.1 and running into the AnotherOperationInProgress error while creating subnets, try pinning your terraform azurerm provider to 1.33.0 until the next version is released.

@tombuildsstuff wrote:

"FWIW the SDK exposes the HTTP Status Codes so it should be possible to pull that information out/retry in the resources which need it."

That would be great. I still feel like the Azure API should be adding retry headers when these 409s are retryable, but adding code to the resources would be a much quicker short term fix.

@tombuildsstuff
Member

👋

The recent locking changes have been rolled back in #4320, which has been merged into master, so that fix will ship in version 1.34.0 of the Azure Provider. As such, I'm going to hide the comments about this recent bug so that this issue can stay focused on the original problem.

Thanks!

@Clausewitz45

My setup is quite similar (and also got the same results with v1.33.0):

Terraform v0.12.8
+ provider.azurerm v1.33.1

I have a single virtual network with 4 subnets and 4 security groups. With all four group/subnet associations commented out, everything runs fine. If I add back one security group association and run terraform apply, it tries to associate the security group with the subnet. If I run terraform apply again (without changing the code itself), the next run tries to remove the associated security group from the subnet.

@suonto

suonto commented Sep 17, 2019

@Clausewitz45 I don't think that's related. And FYI, you can fix that by adding a lifecycle ignore_changes entry for the security group id.
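
A minimal sketch of what that lifecycle suggestion might look like on a 1.x provider. The attribute name `network_security_group_id` is an assumption here (the comment above only says "security group id"), so check the `azurerm_subnet` documentation for the provider version in use. The subnet values are borrowed from the original report's example.

```hcl
resource "azurerm_subnet" "subnet1" {
  name                 = "subnet1"
  virtual_network_name = azurerm_virtual_network.vnet.name
  resource_group_name  = "vnet_example"
  address_prefix       = "10.0.0.0/28"

  lifecycle {
    # Assumed attribute name. Ignoring it stops Terraform from planning to
    # remove the security group that the separate
    # azurerm_subnet_network_security_group_association resource attached.
    ignore_changes = [network_security_group_id]
  }
}
```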

@Clausewitz45

@suonto thanks for mentioning it - it's a nice workaround for the issue (at least I can continue the provisioning of the infrastructure), but this should not be the default behavior. I thought it is related because I got the same error message, but it is working with just a single subnet - not with 3.

@grgouveia-everis

Anyone still having this?
I'm using terraform 0.12.29 and azurerm 2.30.0.

Error: Error updating Subnet "subnet-name" (Virtual Network "vnet-name" / Resource Group "BI-Test"): network.SubnetsClient#CreateOrUpdate: Failure sending request: StatusCode=0 -- Original Error: autorest/azure: Service returned an error. Status=<nil> Code="AnotherOperationInProgress" Message="Another operation on this or dependent resource is in progress. To retrieve status of the operation use uri: https://management.azure.com/subscriptions/<subscription-id>providers/Microsoft.Network/locations/brazilsouth/operations/<subscription-id>?api-version=2020-05-01." Details=[]

@ahmddp

ahmddp commented Dec 16, 2020

I'm getting the following error with the 2.33.0 provider:

Error: Error updating Subnet "test-subnet" (Virtual Network "test-vnet" / Resource Group "rg-test): network.SubnetsClient#CreateOrUpdate: Failure sending request: StatusCode=0 -- Original Error: autorest/azure: Service returned an error. Status=<nil> Code="AnotherOperationInProgress" Message="Another operation on this or dependent resource is in progress.

@srinathrangaramanujam

Confirming this also happens when creating subnets in a loop with Terraform 0.14 and azurerm version 2.41.
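
One alternative sometimes used when subnets are generated in a loop is to declare them inline on the virtual network, so there are no separate `azurerm_subnet` resources to race against each other. The following is only a sketch, assuming an azurerm 2.x provider where the inline `subnet` block takes `name` and `address_prefix`. The variable `subnets` is hypothetical, and the VNet values come from the original report's example. Inline blocks expose fewer settings than `azurerm_subnet`, so check the provider documentation before relying on this for service endpoints or delegations, and avoid mixing inline subnets with standalone `azurerm_subnet` resources on the same VNet.

```hcl
# Hypothetical input: subnet name => address prefix
variable "subnets" {
  type = map(string)
  default = {
    subnet1 = "10.0.0.0/28"
    subnet2 = "10.0.0.64/28"
    subnet3 = "10.0.0.128/28"
  }
}

resource "azurerm_virtual_network" "vnet" {
  name                = "my_vnet"
  location            = "northeurope"
  resource_group_name = "vnet_example"
  address_space       = ["10.0.0.0/24"]

  # One inline subnet block per map entry, managed as part of the VNet
  # resource instead of as separate azurerm_subnet resources.
  dynamic "subnet" {
    for_each = var.subnets
    content {
      name           = subnet.key
      address_prefix = subnet.value
    }
  }
}
```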

@MmAtBosch

Still having the problem with TF 0.14.7 and azurerm version 2.50...

I need several subnets with serviceEndpoints and/or delegations configured, which ARM requires to be done in sequence.
AnotherOperationInProgress" Message="Another operation on this or dependent resource is in progress.

The only solution ATM is using
depends_on = [azurerm_subnet.mysub]

@hterik
Contributor

hterik commented Oct 18, 2021

Getting same problems with azurerm 2.65

Found a workaround: use terraform apply --parallelism=1. It's a lot slower, but so far it's been working for me.

@AresiusXP

AresiusXP commented Jun 16, 2022

This is still an issue, happening when associating subnets with route tables, or even when creating service endpoints. -parallelism=1 is not an option when I'm using Terraform to deploy a full solution. I'm going to try a workaround like this for now, but it'd be nice to have a solution after so many years of the same bug, even now using Terraform 1.1.2 with AzureRM 3.10.0.

@Shrishti943

I have tried terraform apply --parallelism=1 and depends_on, but nothing seems to work for me. Has anyone found a solution for this error?

@epuckop

epuckop commented May 8, 2023

Similar error with the latest Terraform and provider versions:

Terraform v1.4.6
on linux_amd64

  • provider registry.terraform.io/hashicorp/azuread v2.38.0
  • provider registry.terraform.io/hashicorp/azurerm v3.55.0
  • provider registry.terraform.io/hashicorp/random v3.5.1
  • provider registry.terraform.io/hashicorp/time v0.9.1
  • provider registry.terraform.io/microsoft/azuredevops v0.5.0

While creating "azurerm_dns_a_record" resources in a "for_each" loop, I receive a 409 error:
AA is locked by record set BB, Retry last operation on record set to mitigate.
On the second run everything is fine, but sometimes the deploy needs to be run 2 times ...

Does anyone know whether it is possible to switch a specific for_each loop from asynchronous to synchronous deployment?
