azurerm_private_endpoint fails intermittently with retriable error #2 #21293

Open
1 task done
tjuoz opened this issue Apr 5, 2023 · 2 comments
Labels
service/network upstream/microsoft Indicates that there's an upstream issue blocking this issue/PR v/3.x

Comments

@tjuoz

tjuoz commented Apr 5, 2023

Is there an existing issue for this?

  • I have searched the existing issues

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Terraform Version

1.2.3

AzureRM Provider Version

3.50.0

Affected Resource(s)/Data Source(s)

azurerm_private_endpoint

Terraform Configuration Files

resource "azurerm_private_endpoint" "pe_1_db" {
  name                = "mwe-test-1db-pe"
  resource_group_name = azurerm_resource_group.main.name
  location            = azurerm_resource_group.main.location
  subnet_id           = azurerm_subnet.subnet_1_db.id

  private_service_connection {
    name                           = "1db-pe"
    is_manual_connection           = false
    private_connection_resource_id = module.database_1_db.sql_server.id
    subresource_names              = ["sqlServer"]
  }
}

Debug Output/Panic Output

`Error: waiting for creation of Private Endpoint "mwe-test-1db-pe" (Resource Group "mwe-test-01-rg"): Code="RetryableError" Message="A retryable error occurred." Details=[{"code":"ReferencedResourceNotProvisioned","message":"Cannot proceed with operation because resource /subscriptions/***/resourceGroups/mwe-test-01-rg/providers/Microsoft.Network/virtualNetworks/mwe-test-01-vnet/subnets/mwe-test-1db-subnet used by resource /subscriptions/***/resourceGroups/mwe-test-01-rg/providers/Microsoft.Network/networkInterfaces/mwe-test-1db-pe.nic.771d4852-46ce-48b3-80ed-f98344f7f778 is not in Succeeded state. Resource is in Updating state and the last operation that updated/is updating the resource is PutSubnetOperation."}]`

Expected Behaviour

Either Terraform should automatically retry retryable errors instead of failing, or the Private Endpoint should interact with the Subnet only once the Subnet is in the Succeeded state (a dependency issue?).

Actual Behaviour

Sometimes, while provisioning a Private Endpoint, we see the error shown under Debug Output. In the Azure portal, the Private Endpoint exists and works. However, we cannot simply run terraform apply again, because the Private Endpoint is not in the Terraform state. We first have to delete the Private Endpoint manually (or import it manually).
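If importing is preferred over deleting, the orphaned Private Endpoint could probably be adopted into state along the following lines. This is an untested sketch under two assumptions: the `import` block needs Terraform 1.5 or later (newer than the 1.2.3 used in this report), and the subscription ID is a placeholder, since the real one is masked in the error output.

```hcl
# Sketch only: `import` blocks require Terraform >= 1.5.
# On older versions, the `terraform import` CLI command takes the same address and ID.
import {
  to = azurerm_private_endpoint.pe_1_db

  # Placeholder subscription ID (assumption). The resource group and
  # endpoint name are taken from the error message in this report.
  id = "/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/mwe-test-01-rg/providers/Microsoft.Network/privateEndpoints/mwe-test-1db-pe"
}
```

After a successful apply with this block in place, the block can be removed again.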

Terraform logs show that the subnet resource creation was completed before the creation of the Private Endpoint.

The issue was first encountered with azurerm version 3.39.1 and is still present in 3.50.0, the latest version at the time of writing.

Steps to Reproduce

The issue appears intermittently, both when creating multiple Private Endpoints and when creating a single one.

Important Factoids

As mentioned in issue #16182, the error is more likely when multiple Private Endpoints are created in parallel, but it also happens when creating a single Private Endpoint. We are trying to work around the issue by adding a time_sleep resource that depends on the Subnet resource, and by adding a depends_on argument to the Private Endpoint resource.

References

The bug is essentially the same as the one described in the already closed issue #16182.

@tjuoz
Author

tjuoz commented Apr 7, 2023

We added time_sleep between the Subnet resource and the Private Endpoint, and as expected this solved the initial issue.

resource "time_sleep" "pe_1_db_ts" {
  depends_on      = [azurerm_subnet.subnet_1_db]
  create_duration = "60s"
}
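For completeness, the Private Endpoint side of this workaround looks roughly like the following. It is the original configuration from the report with one added `depends_on` line, so that the endpoint is created only after the sleep has elapsed. The `time_sleep` resource comes from the hashicorp/time provider, and the 60s duration is simply the value from the snippet above, not a documented threshold.

```hcl
resource "azurerm_private_endpoint" "pe_1_db" {
  name                = "mwe-test-1db-pe"
  resource_group_name = azurerm_resource_group.main.name
  location            = azurerm_resource_group.main.location
  subnet_id           = azurerm_subnet.subnet_1_db.id

  private_service_connection {
    name                           = "1db-pe"
    is_manual_connection           = false
    private_connection_resource_id = module.database_1_db.sql_server.id
    subresource_names              = ["sqlServer"]
  }

  # Wait for the sleep (and therefore the Subnet) before creating the endpoint.
  depends_on = [time_sleep.pe_1_db_ts]
}
```

The reference to `azurerm_subnet.subnet_1_db.id` already orders the endpoint after the Subnet; the explicit `depends_on` only adds the extra delay on top of that.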

However, another issue with the Subnet resource appeared when creating an azurerm_mssql_virtual_network_rule resource for an azurerm_mssql_server. The MSSQL Virtual Network Rule is configured with the Subnet ID of a VM that is created alongside the MSSQL Server. The error we receive is:

Error: creating MSSQL Virtual Network Rule: (Name "mwe-test-2mo-subnet" / Server Name "mwe-test-1db-sql" / Resource Group "mwe-test-01-rg"): sql.VirtualNetworkRulesClient#CreateOrUpdate: Failure sending request: StatusCode=400 -- Original Error: Code="VirtualNetworkRuleBadRequest" Message="Azure SQL Server Virtual Network Rule encountered an user error: Cannot proceed with operation because subnets mwe-test-2mo-subnet of the virtual network /subscriptions/***/resourceGroups/mwe-test-01-rg/providers/Microsoft.Network/virtualNetworks/mwe-test-01-vnet are not provisioned. They are in Updating state."

It seems that, although both the Private Endpoint and the MSSQL Virtual Network Rule wait for their respective Subnet resources to be created, the Subnet remains in the Updating state for some time after Terraform reports its creation as complete. I suppose this time frame varies, hence the intermittent nature of these errors.
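Presumably the same time_sleep approach could be applied to the MSSQL Virtual Network Rule, as sketched below. This has not been confirmed in this thread. The resource labels `subnet_2_mo`, `subnet_2_mo_ts`, and `vnet_rule_2_mo` are hypothetical names chosen for illustration; the rule name and server reference are taken from the error message and the configuration above.

```hcl
# Hypothetical names: azurerm_subnet.subnet_2_mo stands in for the VM's Subnet.
resource "time_sleep" "subnet_2_mo_ts" {
  depends_on      = [azurerm_subnet.subnet_2_mo]
  create_duration = "60s" # same guess as for the Private Endpoint
}

resource "azurerm_mssql_virtual_network_rule" "vnet_rule_2_mo" {
  name      = "mwe-test-2mo-subnet"
  server_id = module.database_1_db.sql_server.id
  subnet_id = azurerm_subnet.subnet_2_mo.id

  # Delay rule creation until the Subnet has had time to leave the Updating state.
  depends_on = [time_sleep.subnet_2_mo_ts]
}
```

A fixed delay only narrows the window; if the Subnet stays in the Updating state for longer than the sleep, the error can still occur.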

@dannyger97

Any update on this? I still see this error with Private Endpoints on 3.90.
