azurerm_pim_eligible_role_assignment attempts to recreate after circa 45 days - error "already exists" #24118

Closed
1 task done
andy170583 opened this issue Dec 5, 2023 · 40 comments · Fixed by #24524 or #25956

Comments

@andy170583

Is there an existing issue for this?

  • I have searched the existing issues

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment and review the contribution guide to help.

Terraform Version

1.5.0

AzureRM Provider Version

3.70.0

Affected Resource(s)/Data Source(s)

azurerm_pim_eligible_role_assignment

Terraform Configuration Files

resource "time_static" "now" {}

resource "azurerm_pim_eligible_role_assignment" "contributor" {
  scope              = data.azurerm_subscription.current.id
  role_definition_id = "${data.azurerm_subscription.current.id}${data.azurerm_role_definition.contributor.id}"
  principal_id       = var.contributor_pim_group

  schedule {
    start_date_time = time_static.now.rfc3339
    expiration {
      duration_hours = 0
    }
  }
}
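
The configuration above references two data sources and a variable that are not shown in the report. A minimal sketch of what they might look like follows. It assumes the built-in Contributor role (the role definition GUID in the error output matches it) and a string variable holding the group's object ID; the description text is illustrative and not taken from the reporter's code.

```hcl
# Assumed supporting declarations; not part of the original report.
data "azurerm_subscription" "current" {}

# Looks up the built-in role by name. Its id is the unscoped
# "/providers/Microsoft.Authorization/roleDefinitions/<guid>" path,
# which is why the resource prefixes it with the subscription ID.
data "azurerm_role_definition" "contributor" {
  name = "Contributor"
}

variable "contributor_pim_group" {
  type        = string
  description = "Object ID of the group that becomes eligible for the role (hypothetical description)."
}
```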

Debug Output/Panic Output

Error: A resource with the ID "/subscriptions/xxxxxxx|/subscriptions/xxxxxxx/providers/Microsoft.Authorization/roleDefinitions//b24988ac-6180-42a0-ab88-20f7382dd24c|yyyyyyy" already exists - to be managed via Terraform this resource needs to be imported into the State. Please see the resource documentation for "azurerm_pim_eligible_role_assignment" for more information.
│ 
│   with azurerm_pim_eligible_role_assignment.contributor,
│   on main.tf line 549, in resource "azurerm_pim_eligible_role_assignment" "contributor":
│  549: resource "azurerm_pim_eligible_role_assignment" "contributor" {
│ 
│ A resource with the ID
│ "/subscriptions/xxxxxxx|/subscriptions/xxxxxxx/providers/Microsoft.Authorization/roleDefinitions//b24988ac-6180-42a0-ab88-20f7382dd24c|yyyyyyy8"
│ already exists - to be managed via Terraform this resource needs to be
│ imported into the State. Please see the resource documentation for"azurerm_pim_eligible_role_assignment" for more information.

Expected Behaviour

The PIM role assignment should be created on the first apply; on subsequent applies it should be recognised as existing, with no recreation attempted.

Actual Behaviour

After a period of time, circa 45 days, Terraform appears to think the assignment has been deleted, though it still exists and can be seen in the portal in its original state. Terraform attempts to recreate it and fails because it already exists.

Steps to Reproduce

  1. Update PIM settings to allow permanent role assignments.
  2. Create a PIM role assignment without expiry.
  3. Wait circa 45 days, then reapply the Terraform config.

Important Factoids

No response

References

No response

@harshavmb
Contributor

Hi @andy170583 ,

This is the first time I am seeing 45 days related to resource lifecycle.

Are you sure azurerm_pim_eligible_role_assignment.contributor is present in the state file? What does terraform plan report when you run it?

@andy170583
Author

andy170583 commented Dec 5, 2023

Hi @harshavmb Thanks for the quick response,

According to the time stamp the PIM assignments were originally created at 2023-10-16T12:33:28Z;

I have just checked back through the State file version backups and can see the following;

28/11/2023 In State
29/11/2023 In State
29/11/2023, 01:08:38 pm in State
04/12/2023, 12:13:45 pm in State
04/12/2023, 12:16:43 pm in State
04/12/2023, 12:21:59 pm Missing from State

Checking back through the pipelines, the first sign of an issue is a plan at 04/12/2023 12:03 showing that the assignments need recreating. The output is shown below:

# azurerm_pim_eligible_role_assignment.contributor will be created
  + resource "azurerm_pim_eligible_role_assignment" "contributor" {
      + id                 = (known after apply)
      + principal_id       = "xxxx"
      + principal_type     = (known after apply)
      + role_definition_id = "/subscriptions/yyyyy/providers/Microsoft.Authorization/roleDefinitions/b24988ac-6180-42a0-ab88-20f7382dd24c"
      + scope              = "/subscriptions/yyyyy"

      + schedule {
          + start_date_time = "2023-10-16T12:33:28Z"

          + expiration {
              + duration_days  = (known after apply)
              + duration_hours = 0
              + end_date_time  = (known after apply)
            }
        }
    }

The deploy pipeline then failed at 12:16 with the error I shared originally.

I have three subscriptions deployed from identical code, with three separate state files. Two were deployed on 16 October and both have now hit this issue; the third was deployed last week, and its PIM role assignments are still in the state file and the pipelines are running without issue.

@harshavmb
Contributor

Hi @andy170583 ,

Thanks for sharing more details. Can you send TRACE logs by setting TF_LOG=TRACE in your environment before running terraform plan?

@MohnJadden

This looks a lot like the issues which I and other people are running into that were reported in #23672, #22513, and #23775 - the PIM assignments seem to somehow drop between when Terraform creates them and when it actually adds them to the state, thus leading to a disconnect where it exists in Azure but not TF.

@andy170583
Author

Hi @andy170583 ,

Thanks for sharing more details. Can you send TRACE logs by setting TF_LOG=TRACE in your environment before running terraform plan?

I have enabled Trace Logs but unfortunately I can't share them as it's a Live environment. Is there anything in particular I could search and extract?

@xuzhang3
Contributor

xuzhang3 commented Dec 6, 2023

@harshavmb can you help review #24077? This issue looks related to that PR.

@andy170583
Author

Quick update: this morning I tried to add the resources manually back into the state file by copying the blocks from the older backups.

The Plan still attempts to recreate the objects. I have run the plan with Trace logging with the state file updated and tried to pull out anything that looks relevant.

2023-12-06T08:11:24.448Z [TRACE] readResourceInstanceState: reading state for azurerm_pim_eligible_role_assignment.owner
2023-12-06T08:11:24.448Z [TRACE] upgradeResourceState: schema version of azurerm_pim_eligible_role_assignment.owner is still 0; calling provider "azurerm" for any other minor fixups
2023-12-06T08:11:24.448Z [TRACE] GRPCProvider: UpgradeResourceState

2023-12-06T08:11:24.448Z [TRACE] GRPCProvider: UpgradeResourceState
2023-12-06T08:11:24.449Z [TRACE] provider.terraform-provider-azurerm_v3.70.0_x5: Received request: @caller=github.com/hashicorp/terraform-plugin-go@v0.14.3/tfprotov5/tf5server/server.go:708 tf_proto_version=5.3 tf_provider_addr=provider tf_req_id=63671cb7-4c97-7634-e4b7-c7f236348c76 tf_resource_type=azurerm_pim_eligible_role_assignment @module=sdk.proto tf_rpc=UpgradeResourceState timestamp=2023-12-06T08:11:24.449Z
2023-12-06T08:11:24.449Z [TRACE] provider.terraform-provider-azurerm_v3.70.0_x5: Sending request downstream: @caller=github.com/hashicorp/terraform-plugin-go@v0.14.3/tfprotov5/internal/tf5serverlogging/downstream_request.go:17 tf_req_id=63671cb7-4c97-7634-e4b7-c7f236348c76 tf_resource_type=azurerm_pim_eligible_role_assignment tf_rpc=UpgradeResourceState @module=sdk.proto tf_proto_version=5.3 tf_provider_addr=provider timestamp=2023-12-06T08:11:24.449Z
2023-12-06T08:11:24.449Z [TRACE] provider.terraform-provider-azurerm_v3.70.0_x5: Upgrading JSON state: @module=sdk.helper_schema tf_req_id=63671cb7-4c97-7634-e4b7-c7f236348c76 @caller=github.com/hashicorp/terraform-plugin-sdk/v2@v2.26.1/helper/schema/grpc_provider.go:323 tf_provider_addr=provider tf_resource_type=azurerm_pim_eligible_role_assignment tf_rpc=UpgradeResourceState timestamp=2023-12-06T08:11:24.449Z
2023-12-06T08:11:24.449Z [TRACE] provider.terraform-provider-azurerm_v3.70.0_x5: Received downstream response: tf_req_id=63671cb7-4c97-7634-e4b7-c7f236348c76 tf_resource_type=azurerm_pim_eligible_role_assignment tf_rpc=UpgradeResourceState @module=sdk.proto tf_proto_version=5.3 tf_req_duration_ms=0 diagnostic_warning_count=0 tf_provider_addr=provider @caller=github.com/hashicorp/terraform-plugin-go@v0.14.3/tfprotov5/internal/tf5serverlogging/downstream_request.go:37 diagnostic_error_count=0 timestamp=2023-12-06T08:11:24.449Z
2023-12-06T08:11:24.449Z [TRACE] provider.terraform-provider-azurerm_v3.70.0_x5: Served request: @module=sdk.proto tf_proto_version=5.3 tf_provider_addr=provider tf_resource_type=azurerm_pim_eligible_role_assignment @caller=github.com/hashicorp/terraform-plugin-go@v0.14.3/tfprotov5/tf5server/server.go:728 tf_req_id=63671cb7-4c97-7634-e4b7-c7f236348c76 tf_rpc=UpgradeResourceState timestamp=2023-12-06T08:11:24.449Z
2023-12-06T08:11:24.450Z [TRACE] NodeAbstractResouceInstance.writeResourceInstanceState to prevRunState for azurerm_pim_eligible_role_assignment.owner
2023-12-06T08:11:24.450Z [TRACE] NodeAbstractResouceInstance.writeResourceInstanceState: writing state object for azurerm_pim_eligible_role_assignment.owner
2023-12-06T08:11:24.451Z [TRACE] NodeAbstractResouceInstance.writeResourceInstanceState to refreshState for azurerm_pim_eligible_role_assignment.owner
2023-12-06T08:11:24.451Z [TRACE] NodeAbstractResouceInstance.writeResourceInstanceState: writing state object for azurerm_pim_eligible_role_assignment.owner
2023-12-06T08:11:24.451Z [TRACE] NodeAbstractResourceInstance.refresh for azurerm_pim_eligible_role_assignment.owner
azurerm_pim_eligible_role_assignment.owner: Refreshing state... [id=/subscriptions/yyyyyyyyyyyy|/subscriptions/yyyyyyyyyyyy/providers/Microsoft.Authorization/roleDefinitions/8e3af657-a8ff-443c-a75c-2fe8c4bcb635|xxxxxxxxx]
2023-12-06T08:11:24.452Z [TRACE] GRPCProvider: ReadResource

2023-12-06T08:11:27.626Z [WARN]  Provider "registry.terraform.io/hashicorp/azurerm" produced an unexpected new value for azurerm_pim_eligible_role_assignment.owner during refresh.
      - Root resource was present, but now absent

2023-12-06T08:11:27.628Z [TRACE] NodeAbstractResouceInstance.writeResourceInstanceState to refreshState for azurerm_pim_eligible_role_assignment.owner
2023-12-06T08:11:27.628Z [TRACE] NodeAbstractResouceInstance.writeResourceInstanceState: removing state object for azurerm_pim_eligible_role_assignment.owner
2023-12-06T08:11:27.628Z [TRACE] Re-validating config for "azurerm_pim_eligible_role_assignment.owner"

2023-12-06T08:11:27.653Z [TRACE] writeChange: recorded Create change for azurerm_pim_eligible_role_assignment.owner

2023-12-06T08:11:29.965Z [TRACE] DiffTransformer: found Create change for azurerm_pim_eligible_role_assignment.owner
2023-12-06T08:11:29.965Z [TRACE] DiffTransformer: azurerm_pim_eligible_role_assignment.owner will be represented by azurerm_pim_eligible_role_assignment.owner

2023-12-06T08:11:29.973Z [DEBUG] Resource state not found for node "azurerm_pim_eligible_role_assignment.owner", instance azurerm_pim_eligible_role_assignment.owner

@xuzhang3
Contributor

xuzhang3 commented Dec 6, 2023

@andy170583 have you recently assigned roles to a sub-resource within this subscription? The current AzureRM provider does not check role assignments by scope. One result is that it treats a role assignment on a sub-resource as if it were assigned on the current resource, and when you try to import the resource it reports that the resource does not exist.

@unique-dominik
Contributor

Heavily related to the recent discussion in #23111 👀

@srnebu

srnebu commented Jan 8, 2024

I have updated to 3.86 and am still facing this issue. Today some resources cannot be managed anymore. Even the import fails.

@srnebu

srnebu commented Jan 8, 2024

Some more details:
I have a number of azurerm_pim_eligible_role_assignment items in my configuration. Most of them were created on different dates. Some of them are now failing as described above, while others are still being updated. Those that are not found cannot be imported either.

@harshavmb
Contributor

Hi @srnebu,

As @xuzhang3 mentioned, this appears to be fixed in #24077. My guess is that your tf state files were updated (references to the resources were removed) before you started using the v3.85.0 azurerm plugin, and that's why the issue could still persist.

If possible, for the ones that aren't working, you could fall back to a state file that contains azurerm_pim_eligible_role_assignment and try again with an azurerm version higher than v3.85.0.

@srnebu

srnebu commented Jan 8, 2024

@harshavmb still facing the issue:
what I have done:

  1. recover tfstate with existing resource azurerm_pim_eligible_role_assignment
  2. init (update to 3.86 was made)
  3. plan and/or apply -> both show that a new creation is necessary.
  4. Removed resource from tfstate and tried to import but failed as well (seems to be same issue as importing azurerm_pim_eligible_role_assignment fails with resource not existant #23111)

resource "azurerm_pim_eligible_role_assignment" "contributor_teamadminaz" {
  principal_id       = data.azuread_group.team_adminaz.object_id
  role_definition_id = "${var.current_subscription_id}${data.azurerm_role_definition.role_contributor.id}"
  scope              = var.current_subscription_id

  schedule {
    start_date_time = var.pim_start_date
    expiration {}
  }
}

@harshavmb
Contributor

Hi @srnebu ,

Thanks for sharing the details. In that case, I'd ask @xuzhang3 to comment on whether the referenced PR #24077 is intended to address this issue.

@srnebu

srnebu commented Jan 8, 2024

Did a quick check with 3.84 to 3.86. The behaviour is exactly the same.

@unique-dominik
Contributor

Folks, I suspect that is not the root cause. I outlined what I believe is the root cause of the 45d problem here…
#23111 (comment)

I might need to open a new issue for that tomorrow if @xuzhang3 or @MohnJadden reply (or don't) 🚀

@srnebu

srnebu commented Jan 10, 2024

any news?

@xuzhang3
Contributor

@harshavmb #24077 fixed part of the issue but not all of it. I am still trying to reproduce this error.

@unique-dominik
Contributor

You can only reproduce it sadly by waiting 45 days 😢

@srnebu

srnebu commented Jan 12, 2024

Can we help you with any logs or anything else?

@xuzhang3
Contributor

@srnebu One issue I found is that the API used by the current AzureRM version cannot retrieve role assignments that will be activated in the future. As for the issue mentioned here, I cannot reproduce it.

@jakubslonxlab

We are also experiencing this issue. Our plan wants to create role assignments that were previously created by Terraform and exist in both the state and the Azure portal. We also tried using the import block, but Terraform ignores it entirely.

  + resource "azurerm_pim_eligible_role_assignment" "pim_eligible_role_assignment" { 
      + id                 = (known after apply) 
      + principal_id       = "<principal-id>" 
      + principal_type     = (known after apply) 
      + role_definition_id = "<role-definition-id>" 
      + scope              = "<scope>" 
      + schedule { 
          + start_date_time = "2023-10-10T16:00:54Z" 
          + expiration { 
              + duration_days  = 365 
              + duration_hours = (known after apply) 
              + end_date_time  = (known after apply) 
            } 
        } 
    } 

azurerm version: 3.87.0
terraform version: 1.5.7
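
For reference, an import block for this resource would take roughly the shape sketched below. The three-part, pipe-separated ID (scope, role definition ID, principal ID) is inferred from the error message and refresh log earlier in this thread, not from the provider documentation, so treat the format as an assumption. The placeholders mirror the redacted values in the plan above.

```hcl
# Sketch only: import blocks need Terraform 1.5 or later, and in 1.5.x
# the id must be a literal string.
import {
  to = azurerm_pim_eligible_role_assignment.pim_eligible_role_assignment

  # Assumed format, based on IDs shown earlier in the thread:
  # "<scope>|<role-definition-id>|<principal-id>"
  id = "<scope>|<role-definition-id>|<principal-id>"
}
```

As the reports in this thread indicate, importing did not help on the affected provider versions, because the provider could not read the assignment back.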

@rikribbers
Contributor

Same here; another thing is we cannot remove the PIM Role assignment through the portal manually anymore.

I have opened a support ticket with Microsoft; if anything useful comes out of it, I will post it here.

@xuzhang3
Contributor

xuzhang3 commented Jan 17, 2024

@rikribbers
Contributor

@xuzhang3 You can also remove it from the IAM blade on resource level.

@srnebu

srnebu commented Jan 25, 2024

hoping the pull request will be included into the next version :-)

@jakubslonxlab

Is there any update on this?

@srnebu

srnebu commented Feb 12, 2024

still waiting for this function to work correctly.

@tomaaron

Still facing the same issue with provider version v3.99.

@audunsolemdal
Contributor

I created new azurerm_pim_eligible_role_assignment resources with provider version 3.94 just after it was released, and I am sorry to say that this week (~45 days later) the bug is still occurring. This issue should be reopened.

@unique-dominik
Contributor

I also still see it 😢

But we should also once mention some kudos for @xuzhang3 and all those that need to deal with PIM. If you read the PIM docs and even use PIM in the portal, you recognise that it's just somehow shabby. Feels like some large enterprises put so much pressure onto M$ (we buy AWS instead if you don't …) that they just stitched on some hobbo entities and processes to normal role assignments with drawbacks left and right and now all API or IaC maintainers need to keep up with this 💩

Thank you all that try to IaC PIM ❤️

@manicminer
Member

Improvements and bug fixes to the azurerm_pim_active_role_assignment and azurerm_pim_eligible_role_assignment have been merged for v3.104.0, which includes the resource being able to continue refreshing after the 45 day window (the point in time when one of the APIs stops returning the original request). I believe this should resolve this issue, so I have reopened/reclosed it and moved it to the v3.104.0 milestone.

@tomaaron

Thanks for the update @manicminer. Could you share whether we need to recreate the resources or can stick with the existing ones?

@manicminer
Member

@tomaaron I've been able to keep the resource IDs consistent so it should just start working without needing to recreate or reimport.

@tomaaron

tomaaron commented May 23, 2024

@manicminer Thanks for getting back to me. I have had partial success with the new provider. On one environment TF could reference and maintain the resources, reporting "No changes. Your infrastructure matches the configuration."

But on another one TF unfortunately wants to recreate the resources for no reason:

  # azurerm_pim_eligible_role_assignment.crypto_officer must be replaced
-/+ resource "azurerm_pim_eligible_role_assignment" "crypto_officer" {
      ~ id                 = "<REDACTED>" -> (known after apply)
      + justification      = (known after apply)
      ~ principal_type     = "Group" -> (known after apply)
        # (3 unchanged attributes hidden)

      - schedule {
          - start_date_time = "2024-04-09T05:48:40.389606+00:00" -> null

          - expiration {
              - duration_days  = 0 -> null
              - duration_hours = 0 -> null
            }
        }

      - ticket {} # forces replacement
    }

Plan: X to add, 0 to change, X to destroy.

I have compared both states and there is no real diff, except for the IDs of course. The ticket block is empty in both states. I have also tried removing the resource from the state and reimporting it, without luck.
Your help would be much appreciated.

@fgebhardt

Same issue as @tomaaron, ticket on azurerm_pim_eligible_role_assignment resource forces a full resource replacement when running a new plan with:

Terraform v1.8.3

  • provider registry.terraform.io/hashicorp/azurerm v3.104.2

We just redeployed everything a couple of days ago, after facing the issue with the 45 days window. Versions used back then:

Terraform v1.8.2

  • provider registry.terraform.io/hashicorp/azurerm v3.101.0

We do not supply any ticket related settings, as the deployment is simply to enable PIM for basic subscription scoped RBAC, in conjunction with a role assignment policy.

Advice or help would be very much appreciated, thanks a lot.

@tomaaron

Also worth mentioning: even if you recreate the resources in full, the provider then wants to recreate them on every run.

@sunevnuahs

With reference to the comment by @fgebhardt above, we were facing 135 resources being destroyed and created again using this version of Terraform and the azurerm provider.

Terraform v1.8.3

  • provider registry.terraform.io/hashicorp/azurerm v3.104.2

As per the comment by @tomaaron the ticket is the attribute change which is forcing replacement:

ticket {} # forces replacement

I compared the tfstate and tfstate-prev files from the tfplan and can see this difference:

"ticket": []

and

"ticket": [
  {
    "number": "",
    "system": ""
  }
]

The previous state was from a plan/apply with azurerm 3.102.0 and Terraform 1.8.2.

Even though we do not specify a ticket (it is optional in the docs), the existing and planned ticket values differ in how the unset block is represented.

This is also the same for the azurerm_pim_active_role_assignment resource type (we had both showing as being removed due to the ticket 'change').

To get around this, as we do not use the ticket, we have added a lifecycle ignore to these resources:

  lifecycle {
    ignore_changes = [
      ticket
    ]
  }

Now when we run a plan there are no changes seen due to the 'changes' of the tickets.

Of course you could also modify your existing state with these new defaults for the ticket, but I would never publicly recommend that!

@manicminer
Member

manicminer commented May 24, 2024

Thanks @tomaaron, @fgebhardt, @sunevnuahs for the feedback - your real world usage reports are very much appreciated.

I am hoping to have fixed this persistent diff issue with the ticket block in #26059. The fix will be included in this week's provider release, v3.105.0 (which is running a little late but should be out shortly today).

@tomaaron

@manicminer I can confirm that v3.105.0 resolved the issue. 🎉 Thanks for your work!
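
For anyone landing here later, one way to make sure a configuration picks up both fixes discussed above (refresh after the 45-day window in v3.104.0 and the ticket diff in v3.105.0) is a provider version constraint. This is a minimal sketch; the lower bound comes from this thread, and any upper bound is left to the reader.

```hcl
terraform {
  required_providers {
    azurerm = {
      source = "hashicorp/azurerm"
      # v3.105.0 is the release reported in this thread as fixing the
      # persistent ticket diff; v3.104.0 addressed the 45-day refresh.
      version = ">= 3.105.0"
    }
  }
}
```

After changing the constraint, running terraform init -upgrade lets Terraform select a provider release that satisfies it.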
