[Bug]: Error destroying aws_ssoadmin resources #33337
Comments
@justinretzolk I just submitted a PR for this that I think should fix it. Can someone review it when they get a chance? |
Hey @novekm - The error message you shared in the issue body is a failure which can only be present during an update operation for the Here is the function in which the terraform-provider-aws/internal/service/ssoadmin/permission_set.go Lines 290 to 308 in e313e8f
And this is only referenced once during the update operation here: terraform-provider-aws/internal/service/ssoadmin/permission_set.go Lines 214 to 217 in e313e8f
Grepping through all of the SSO Admin resources, it does look like some others (boundary_attachment, permission_set_inline_policy, customer_managed_policy_attachment, and managed_policy_attachment) call this function during delete operations, so it's possible the fix proposed in #33384 is still valid but needs to be applied to a different resource.
% rg provisionPermissionSet
internal/service/ssoadmin/permissions_boundary_attachment.go
122: if err := provisionPermissionSet(ctx, conn, permissionSetARN, instanceARN, d.Timeout(schema.TimeoutCreate)); err != nil {
184: if err := provisionPermissionSet(ctx, conn, permissionSetARN, instanceARN, d.Timeout(schema.TimeoutDelete)); err != nil {
internal/service/ssoadmin/permission_set_inline_policy.go
96: if err := provisionPermissionSet(ctx, conn, permissionSetARN, instanceARN, d.Timeout(schema.TimeoutCreate)); err != nil {
161: if err := provisionPermissionSet(ctx, conn, permissionSetARN, instanceARN, d.Timeout(schema.TimeoutDelete)); err != nil {
internal/service/ssoadmin/customer_managed_policy_attachment.go
108: if err := provisionPermissionSet(ctx, conn, permissionSetARN, instanceARN, d.Timeout(schema.TimeoutCreate)); err != nil {
175: if err := provisionPermissionSet(ctx, conn, permissionSetARN, instanceARN, d.Timeout(schema.TimeoutDelete)); err != nil {
internal/service/ssoadmin/managed_policy_attachment.go
101: if err := provisionPermissionSet(ctx, conn, permissionSetARN, instanceARN, d.Timeout(schema.TimeoutCreate)); err != nil {
163: if err := provisionPermissionSet(ctx, conn, permissionSetARN, instanceARN, d.Timeout(schema.TimeoutDelete)); err != nil {
internal/service/ssoadmin/permission_set.go
215: if err := provisionPermissionSet(ctx, conn, permissionSetARN, instanceARN, d.Timeout(schema.TimeoutUpdate)); err != nil {
290:func provisionPermissionSet(ctx context.Context, conn *ssoadmin.SSOAdmin, permissionSetARN, instanceARN string, timeout time.Duration) error {
Are you able to provide a more complete configuration and/or logs to determine which resource this is failing on during the |
Hi @jar-b, thanks for taking a look into this! Like mentioned, the error listed above appears when running For context, I am using a TF module I created, but I believe the issue persists whether or not I use the module. It also appears others are having the same issue. I'm not sure if they are using a module or not, but it is not my module since it is not public yet, so a module-specific issue can likely be ruled out. Here's my main.tf:
1.
2.
I am not sure why the error mentions a "provision" when the resources are being destroyed. The only way to resolve the error currently is to run Here is my
3. After running
As another note, after re-applying and destroying again, this time only a single error message appears, instead of two errors as listed above:
After a second
Perhaps in addition to using the retry logic I added, the order in which resources are destroyed could also be modified? Looking at the plan for my destroy, the permission sets are always the last items in the list. Maybe if the permission sets were deleted first, the error would not appear either? Let me know if you need any more detail. Thanks! |
Thanks for the extra detail. Inspecting the error logs, it looks like the two failing resource types during the first Since it's reaching the waiter step, this implies the detachment of the managed policies is successful, but the provisioning step which occurs after is failing. terraform-provider-aws/internal/service/ssoadmin/managed_policy_attachment.go Lines 152 to 165 in e313e8f
If you inspect the state after the first destroy (do not refresh), I'm guessing you'll see that both the permission set resources AND the managed policy attachment resources are still present. When the second terraform-provider-aws/internal/service/ssoadmin/managed_policy_attachment.go Lines 117 to 123 in e313e8f
Seeing the full resource definition and |
Thanks for the additional context, that makes sense. Upon checking
Running
What is odd to me is that it appears the I will take a look at |
Can you share the resource definitions in your module? The managed policy attachment and permission set resources would be most helpful, along with any other resources they reference. Or an equivalent standalone configuration that produces the same result is fine if you prefer not to share the module content at this time. |
Sure, here they are:
resource "aws_ssoadmin_managed_policy_attachment" "pset_aws_managed_policy" {
  # iterate over the permission_sets map of maps, and set the result to be pset_name and pset_index
  # ONLY if the policy for each pset_index is valid.
  for_each = { for pset in local.pset_aws_managed_policy_maps : "${pset.pset_name}.${pset.policy_arn}" => pset }

  instance_arn       = local.ssoadmin_instance_arn
  managed_policy_arn = each.value.policy_arn
  permission_set_arn = aws_ssoadmin_permission_set.pset[each.value.pset_name].arn
}

# - SSO Permission Set -
resource "aws_ssoadmin_permission_set" "pset" {
  for_each = var.permission_sets
  name     = each.key

  # lookup function retrieves the value of a single element from a map, when provided its key.
  # if the given key does not exist, the default value (null) is returned instead
  instance_arn     = local.ssoadmin_instance_arn
  description      = lookup(each.value, "description", null)
  relay_state      = lookup(each.value, "relay_state", null)      // (Optional) URL used to redirect users within the application during the federation authentication process
  session_duration = lookup(each.value, "session_duration", null) // The length of time that the application user sessions are valid in the ISO-8601 standard
  tags             = lookup(each.value, "tags", {})
}

# - Permission Sets and Policies -
locals {
  # - Fetch SSO Instance ARN and SSO Instance ID -
  ssoadmin_instance_arn = tolist(data.aws_ssoadmin_instances.sso_instance.arns)[0]
  sso_instance_id       = tolist(data.aws_ssoadmin_instances.sso_instance.identity_store_ids)[0]

  # Iterate over the objects in var.permission_sets, then evaluate the expression's 'pset_name'
  # and 'pset_index' with 'pset_name' and 'pset_index' only if the pset_index.managed_policies (AWS Managed Policy ARN)
  # produces a result without an error (i.e. if the ARN is valid). If any of the ARNs for any of the objects
  # in the map are invalid, the for loop will fail.
  # pset_name is the attribute name for each permission set map/object
  # pset_index is the corresponding index of the map of maps (which is the variable permission_sets)
  aws_managed_permission_sets      = { for pset_name, pset_index in var.permission_sets : pset_name => pset_index if can(pset_index.aws_managed_policies) }
  customer_managed_permission_sets = { for pset_name, pset_index in var.permission_sets : pset_name => pset_index if can(pset_index.customer_managed_policies) }

  # ! NOT CURRENTLY SUPPORTED !
  # inline_policy_permission_sets = { for pset_name, pset_index in var.permission_sets : pset_name => pset_index if can(pset_index.inline_policy) }

  # When using the 'for' expression in Terraform:
  # [ and ] produces a tuple
  # { and } produces an object, and you must provide two result expressions separated by the => symbol
  # The 'flatten' function takes a list and replaces any elements that are lists with a flattened sequence of the list contents
  # create pset_name and managed policy maps list. flatten is needed because the result is a list of maps.
  # This nested for loop will run only if each of the managed_policies are valid ARNs.

  # - AWS Managed Policies -
  pset_aws_managed_policy_maps = flatten([
    for pset_name, pset_index in local.aws_managed_permission_sets : [
      for policy in pset_index.aws_managed_policies : {
        pset_name  = pset_name
        policy_arn = policy
      } if pset_index.aws_managed_policies != null && can(pset_index.aws_managed_policies)
    ]
  ])

  # - Customer Managed Policies -
  pset_customer_managed_policy_maps = flatten([
    for pset_name, pset_index in local.customer_managed_permission_sets : [
      for policy in pset_index.customer_managed_policies : {
        pset_name   = pset_name
        policy_name = policy
        # path = path
      } if pset_index.customer_managed_policies != null && can(pset_index.customer_managed_policies)
    ]
  ])

  # ! NOT CURRENTLY SUPPORTED !
  # - Inline Policy -
  # pset_inline_policy_maps = flatten([
  #   for pset_name, pset_index in local.inline_policy_permission_sets : [
  #     for policy in pset_index.inline_policy : {
  #       pset_name     = pset_name
  #       inline_policy = policy
  #       # path = path
  #     } if pset_index.inline_policy != null && can(pset_index.inline_policy)
  #   ]
  # ])
}
I can also create a new standalone configuration and post that here if needed |
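To make the locals above easier to follow, here is a hedged example of an input they could process. The shape of var.permission_sets is an assumption inferred from the lookup() calls and the aws_managed_policies attribute; the actual variable declaration was not posted in this thread. The permission set name and description are made up for illustration.

```hcl
# terraform.tfvars (hypothetical input, not taken from the module)
permission_sets = {
  ViewOnly = {
    description          = "Read-only access"
    session_duration     = "PT3H"
    aws_managed_policies = ["arn:aws:iam::aws:policy/job-function/ViewOnlyAccess"]
  }
}

# With this input, local.pset_aws_managed_policy_maps would be expected to hold
# one object per (permission set, policy) pair:
#   pset_name  = "ViewOnly"
#   policy_arn = "arn:aws:iam::aws:policy/job-function/ViewOnlyAccess"
#
# The managed policy attachment's for_each key for that pair would then be:
#   "ViewOnly.arn:aws:iam::aws:policy/job-function/ViewOnlyAccess"
```

Because the entry has no customer_managed_policies attribute, the can() filter leaves it out of local.customer_managed_permission_sets.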
Thanks - A minimal configuration would be helpful as it can be re-used for an acceptance test. |
Minimal configuration:
# Fetch existing SSO Instance
data "aws_ssoadmin_instances" "sso_instance" {}

locals {
  # - Fetch SSO Instance ARN and SSO Instance ID -
  ssoadmin_instance_arn = tolist(data.aws_ssoadmin_instances.sso_instance.arns)[0]
  sso_instance_id       = tolist(data.aws_ssoadmin_instances.sso_instance.identity_store_ids)[0]
}

# Create IAM IDC Group
resource "aws_identitystore_group" "example" {
  identity_store_id = local.sso_instance_id
  display_name      = "Admin"
  description       = "Admin Group"
}

# Create IAM IDC User
resource "aws_identitystore_user" "example" {
  identity_store_id = local.sso_instance_id
  display_name      = "Naruto Uzumaki"
  user_name         = "nuzumaki"

  name {
    given_name  = "Naruto"
    family_name = "Uzumaki"
  }

  emails {
    value   = "nuzumaki@hokage.village"
    primary = true
  }
}

# Create IAM IDC Group Membership
resource "aws_identitystore_group_membership" "sso_group_membership" {
  identity_store_id = local.sso_instance_id
  group_id          = aws_identitystore_group.example.group_id
  member_id         = aws_identitystore_user.example.user_id
}

# Create Permission Set
resource "aws_ssoadmin_permission_set" "example" {
  name             = "ExamplePermissionSet"
  instance_arn     = local.ssoadmin_instance_arn
  description      = "ExamplePermissionSet"
  session_duration = "PT3H"
}

# Create Managed Policy Attachment
resource "aws_ssoadmin_managed_policy_attachment" "pset_aws_managed_policy" {
  instance_arn       = local.ssoadmin_instance_arn
  managed_policy_arn = "arn:aws:iam::aws:policy/job-function/ViewOnlyAccess"
  permission_set_arn = aws_ssoadmin_permission_set.example.arn
}

# Create Account Assignment
resource "aws_ssoadmin_account_assignment" "account_assignment" {
  instance_arn       = local.ssoadmin_instance_arn
  permission_set_arn = aws_ssoadmin_permission_set.example.arn
  principal_id       = aws_identitystore_group.example.group_id
  principal_type     = "GROUP"
  target_id          = "000000000000"
  target_type        = "AWS_ACCOUNT"
}
1.
Apply complete! Resources: 6 added, 0 changed, 0 destroyed.
2.
╷
│ Error: waiting for SSO Permission Set (arn:aws:sso:::permissionSet/ssoins-xxx/ps-xxx) provision: unexpected state 'FAILED', wanted target 'SUCCEEDED'. last error: Received a 404 status error: Permission set provision not found in AWS account 000000000000.
│
│
╵
3. re-run
Plan: 0 to add, 0 to change, 1 to destroy.
aws_ssoadmin_permission_set.example: Destroying... [id=arn:aws:sso:::permissionSet/ssoins-xxx/ps-xxx,arn:aws:sso:::instance/ssoins-xxx]
aws_ssoadmin_permission_set.example: Destruction complete after 0s
Destroy complete! Resources: 1 destroyed.
The same issue is happening with the simplified configuration as well. |
Thanks @novekm - I was able to reproduce with the configuration above.
Reproduction and Cause
My current understanding of the issue is that the deletion of both the managed policy attachment and account assignment simultaneously causes problems when the Delete operation of the policy attachment attempts to re-provision the permission set:
terraform-provider-aws/internal/service/ssoadmin/managed_policy_attachment.go Lines 162 to 165 in cc558c7
Because the account assignment no longer exists, the provision step fails with an error like:
Solution
I was able to resolve this by creating an explicit dependency between the two resources using the
resource "aws_ssoadmin_managed_policy_attachment" "pset_aws_managed_policy" {
  depends_on = [aws_ssoadmin_account_assignment.account_assignment]

  instance_arn       = local.ssoadmin_instance_arn
  managed_policy_arn = "arn:aws:iam::aws:policy/job-function/ViewOnlyAccess"
  permission_set_arn = aws_ssoadmin_permission_set.example.arn
}
Because of this explicit dependency, the destroy operation will completely destroy
Provider Impact
At this time I'd propose not making provider-side changes to ignore or retry this particular error. This appears to be a function of the relationship between the account assignment and managed policy attachment when destruction of both is triggered simultaneously. The meaning of this error could change depending on the combination of resources being destroyed, so suppressing it could result in incorrect behavior under other conditions. Resolution of the issue with an explicit
Please let us know if you have any concerns resolving the original issue with this approach. |
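The same explicit dependency can be carried over to the module resources posted earlier in the thread. The following is a minimal sketch, assuming the module declares its account assignments in a resource addressed as aws_ssoadmin_account_assignment.account_assignment (a hypothetical address; the module's account assignment resource was not shown in this thread). A depends_on reference to a resource that uses for_each covers all of its instances.

```hcl
resource "aws_ssoadmin_managed_policy_attachment" "pset_aws_managed_policy" {
  for_each = { for pset in local.pset_aws_managed_policy_maps : "${pset.pset_name}.${pset.policy_arn}" => pset }

  # Assumption: the module's account assignment resource has this address.
  depends_on = [aws_ssoadmin_account_assignment.account_assignment]

  instance_arn       = local.ssoadmin_instance_arn
  managed_policy_arn = each.value.policy_arn
  permission_set_arn = aws_ssoadmin_permission_set.pset[each.value.pset_name].arn
}
```

Terraform destroys resources in reverse dependency order, so with this in place the policy attachments are removed, and the permission set re-provisioned, while the account assignments still exist.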
Thanks @jar-b for the detailed response! I will try the adding |
Hi @jar-b! Sorry for the delay, last week was quite busy with re:Invent :) I have tested your recommendation and can confirm it resolves the error for me. I have tested both with the simplified configuration I posted above and within a module I created. The two affected resources are indeed I have created a PR, #34751, that updates the public docs for these resources, adding clear documentation on the error and resolution. I have submitted many docs for the AWSCC provider, but this is my first for the AWS provider. It seems it uses a different format/structure in the repo. Let me know if the PR needs to be updated. Thanks again for the help resolving this! This will help many customers. |
This functionality has been released in v5.30.0 of the Terraform AWS Provider. Please see the Terraform documentation on provider versioning or reach out if you need any assistance upgrading. For further feature requests or bug reports with this functionality, please create a new GitHub issue following the template. Thank you! |
I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. |
Terraform Core Version
1.5.2
AWS Provider Version
5.15.0
Affected Resource(s)
aws_ssoadmin_permission_set
Expected Behavior
Successful destroy of resources
Actual Behavior
Failure/Error:
Error: waiting for SSO Permission Set (arn:aws:sso:::permissionSet/ssoins-xxx/ps-xxx) provision: unexpected state 'FAILED', wanted target 'SUCCEEDED'. last error: Received a 404 status error: Permission set provision not found in AWS account 123456789012.
This is related to a fix that was merged in v5.14, but the issue persists. The only way to get past this error is to re-run terraform destroy a second time.
This led me to think through the possibility of adding a retry, as it seems Terraform is attempting to destroy a resource that is already destroyed. Adjusting the new timeouts block has no effect, as the error occurs within 60 seconds in my testing.
Taking a deeper look into the permission_set.go file for the resource, I found this:
Existing Code
From the docs about retries, it appears that this could be modified to retry if these errors occur instead of just returning the error message. I believe it could look something like this:
Potential New Code
I'd like to try to implement and submit the PR for the fix for this, as it seems it's been open for a while and multiple customers are having this issue. It is also a blocker for a module I created and am trying to release that manages AWS IAM Identity Center resources. I just haven't worked with retry logic in terraform before. Happy for any guidance on testing/implementing this fix.
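For reference, the timeouts adjustment mentioned above would look roughly like the sketch below. The update key is inferred from the provider passing schema.TimeoutUpdate to provisionPermissionSet for this resource; the resource name, the instance ARN local, and the 20m value are placeholders chosen for illustration.

```hcl
resource "aws_ssoadmin_permission_set" "example" {
  name         = "ExamplePermissionSet"
  instance_arn = local.ssoadmin_instance_arn # assumes a local holding the SSO instance ARN

  timeouts {
    update = "20m" # arbitrary value; raising it did not avoid the error
  }
}
```

A longer timeout does not help in this case because the provisioning status is reported as FAILED within about a minute, well before any timeout elapses.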
Relevant Error/Panic Output Snippet
No response
Terraform Configuration Files
Steps to Reproduce
terraform apply
terraform destroy
Debug Output
No response
Panic Output
No response
Important Factoids
No response
References
#23585
Would you like to implement a fix?
Yes