workload-identity module triggers unnecessary deletion and re-creation of IAM binding #1935

Closed
AlejandroSanchezGeotab opened this issue May 8, 2024 · 1 comment
Labels
bug Something isn't working Stale

AlejandroSanchezGeotab commented May 8, 2024

TL;DR

When the workload-identity module is used to bind an existing GCP service account to a Kubernetes service account, any update, even one that does not affect the binding, produces a plan in which the IAM binding is deleted and re-created. This introduces considerable risk to our deployments: when an apply fails partway through on a change that should not touch workload identity, the binding can be deleted and never re-created, and this has already left us with broken applications.

Expected behavior

If a change to a deployment that uses the workload-identity module to bind an existing GCP service account to a Kubernetes service account does not affect the IAM binding, the binding should not be destroyed and re-created.

Observed behavior

Any change to a deployment that uses the workload-identity module to bind an existing GCP service account to a Kubernetes service account results in the IAM binding being destroyed and re-created, even when the change does not affect it.

Terraform Configuration

resource "kubernetes_config_map" "config" {
  metadata {
    name      = "config"
    namespace = "default"
  }
  data = var.labels
}

resource "kubernetes_namespace" "default" {
  metadata {
    name = var.namespace
  }
}

module "workload_identity" {
  source  = "terraform-google-modules/kubernetes-engine/google//modules/workload-identity"
  version = ">= 30.2.0"

  name                = "k8s-workload-identity"
  namespace           = kubernetes_namespace.default.metadata[0].name
  project_id          = var.project_id
  use_existing_gcp_sa = true
  gcp_sa_name         = var.gservice_account

  depends_on = [kubernetes_config_map.config]
}

Update the value of var.labels and run terraform plan. The result is a plan similar to the following:

  # module.workload_identity.data.google_service_account.cluster_service_account[0] will be read during apply
  # (depends on a resource or a module with changes pending)
 <= data "google_service_account" "cluster_service_account" {
      + account_id   = "my-account@my-project.iam.gserviceaccount.com"
      + display_name = (known after apply)
      + email        = (known after apply)
      + id           = (known after apply)
      + member       = (known after apply)
      + name         = (known after apply)
      + project      = "my-project"
      + unique_id    = (known after apply)
    }

  # module.workload_identity.google_service_account_iam_member.main must be replaced
-/+ resource "google_service_account_iam_member" "main" {
      ~ etag               = "BwYX9MzNsSM=" -> (known after apply)
      ~ id                 = "projects/my-project/serviceAccounts/my-account@my-project.iam.gserviceaccount.com/roles/iam.workloadIdentityUser/serviceAccount:my-project.svc.id.goog[mynamespace/k8s-workload-identity]" -> (known after apply)
      ~ service_account_id = "projects/my-project/serviceAccounts/my-account@my-project.iam.gserviceaccount.com" # forces replacement -> (known after apply) # forces replacement
        # (2 unchanged attributes hidden)
    }

  # module.workload_identity.kubernetes_service_account.main[0] will be updated in-place
  ~ resource "kubernetes_service_account" "main" {
        id                              = "mynamespace/k8s-workload-identity"
        # (2 unchanged attributes hidden)

      ~ metadata {
          ~ annotations      = {
              - "iam.gke.io/gcp-service-account" = "my-account@my-project.iam.gserviceaccount.com"
            } -> (known after apply)
            name             = "k8s-workload-identity"
            # (6 unchanged attributes hidden)
        }
    }

The example may look contrived, since no one would make the workload-identity module depend on an unrelated resource, but it is the simplest way I found to reproduce what happens when a module that contains the workload-identity module depends on another module or resource that has pending changes. For example, we have a module that creates GCP bindings and depends on a module that creates node pools. When the node pool module is updated to increase the node count, the service account bindings are destroyed and re-created.
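The real-world layout described above might look roughly like the following sketch. The module names, local paths, and the node_count input are hypothetical and only illustrate the dependency shape; they are not taken from our actual configuration.

```hcl
# Hypothetical module that manages node pools. Changing node_count
# gives this module pending changes in the plan.
module "node_pools" {
  source     = "./modules/node-pools" # hypothetical path
  node_count = var.node_count         # hypothetical input
}

# Hypothetical wrapper module that calls the workload-identity module internally.
module "gcp_bindings" {
  source = "./modules/gcp-bindings" # hypothetical path

  # With this module-level dependency, data sources inside the module
  # (including the one in workload-identity) are read during apply whenever
  # node_pools has pending changes, so their outputs are unknown at plan time.
  depends_on = [module.node_pools]
}
```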

Terraform Version

Terraform v1.8.3

Additional information

We have reviewed the module and traced the issue to the use of data.google_service_account.cluster_service_account as the input to the service_account_id field of the google_service_account_iam_member.main resource. Because service_account_id comes from a data source whose read is deferred to apply time, Terraform does not know its value at plan time and decides that the binding must be replaced, even if the values returned by the data source have not changed.

The same applies to the kubernetes_service_account.main resource, which takes the value of its iam.gke.io/gcp-service-account annotation from the data source. That resource, however, is updated in-place rather than deleted and re-created like the binding.

Our review suggests that everything needed to derive both the service account ID and the service account email is already available to the module at plan time through its variables. Our suggestion is to stop feeding the data source's outputs into other resources and instead compute those values directly in the module from its own variable inputs, using a few ternaries and regexes.

If desired, the data source could be kept as a form of validation (perhaps through check blocks or lifecycle postconditions) to confirm that the pre-existing GCP service account actually exists and that the derived values supplied to the resources are accurate.
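A minimal sketch of that suggestion follows. It is not the module's actual code: the local names are made up for illustration, the variable names mirror the module inputs used in the configuration above, and it assumes that gcp_sa_name is either a bare account ID or a full email and that the service account lives in var.project_id. The role and member format are copied from the plan output above.

```hcl
variable "project_id" { type = string }
variable "namespace" { type = string }
variable "name" { type = string }
variable "gcp_sa_name" { type = string }

locals {
  # Assumption: a value containing "@" is already a full service account email.
  gcp_sa_is_email = can(regex("@", var.gcp_sa_name))

  # Derived entirely from variables, so both values are known at plan time.
  gcp_sa_email = local.gcp_sa_is_email ? var.gcp_sa_name : "${var.gcp_sa_name}@${var.project_id}.iam.gserviceaccount.com"
  gcp_sa_id    = "projects/${var.project_id}/serviceAccounts/${local.gcp_sa_email}"
}

resource "google_service_account_iam_member" "main" {
  # No data source reference here, so a deferred read cannot force replacement.
  service_account_id = local.gcp_sa_id
  role               = "roles/iam.workloadIdentityUser"
  member             = "serviceAccount:${var.project_id}.svc.id.goog[${var.namespace}/${var.name}]"
}

# Optional validation (check blocks require Terraform 1.5 or later).
# The scoped data source is only used by the assertion, not by any resource.
check "existing_gcp_sa" {
  data "google_service_account" "existing" {
    account_id = local.gcp_sa_email
    project    = var.project_id
  }

  assert {
    condition     = data.google_service_account.existing.email == local.gcp_sa_email
    error_message = "The derived service account email does not match the existing GCP service account."
  }
}
```

Because a failed check is reported as a warning rather than blocking the run, this keeps the validation without putting the data source back on the dependency path of the binding.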

AlejandroSanchezGeotab added the bug label May 8, 2024

github-actions bot commented Jul 7, 2024

This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 7 days

github-actions bot added the Stale label Jul 7, 2024
github-actions bot closed this as not planned Jul 15, 2024