Does tf-controller support workload identity for Azure? #561

Closed
masashi-noguchi opened this issue Apr 3, 2023 · 8 comments · Fixed by #1153
Labels
good first issue Good for newcomers kind/docs Improvements or additions to documentation

Comments

@masashi-noguchi

Does Terraform Controller support workload identity for Azure? (See https://github.com/Azure/azure-workload-identity)

After deploying the Terraform Controller to Azure with a HelmRelease, I get the following authentication error in the tf-runner pod when I deploy a Terraform resource.
The appropriate Azure managed identity privileges are granted to the workload identity for the service account, so if Terraform Controller supported workload identity, no authentication error should occur. Is it currently not supported?

❯ kubectl get pod tf-runner -o yaml
    message: |+
      error running Init: rpc error: code = Internal desc = exit status 1

      Error: Failed to get existing workspaces: Error retrieving keys for Storage Account "<storageaccount>": azure.BearerAuthorizer#WithAuthorization: Failed to refresh the Token for request to https://management.azure.com/subscriptions/<subscriptionid>/resourceGroups/<resourceGroups>/providers/Microsoft.Storage/storageAccounts/<storageaccount>/listKeys?api-version=2021-01-01: StatusCode=404 -- Original Error: adal: Refresh request failed. Status Code = '404'. Response body: getting assigned identities for pod <namespace>/tf-runner in CREATED state failed after 16 attempts, retry duration [5]s, error: <nil>. Check MIC pod logs for identity assignment errors
       Endpoint http://xxx.xxx.xxx.xxx/metadata/identity/oauth2/token?api-version=2018-02-01&resource=https%3A%2F%2Fmanagement.azure.com%2F
@k0da
Contributor

k0da commented Apr 3, 2023

It looks like a Terraform error. To me it seems like the backend isn't configured properly.

@masashi-noguchi
Author

Thank you for the quick answer. When I tried the approach that injects the migration sidecar, I was able to get the backend working.
(see https://learn.microsoft.com/en-us/azure/aks/workload-identity-migrate-from-pod-identity?source=recommendations#deploy-the-workload-with-migration-sidecar)

kind: Terraform
spec:
  runnerPodTemplate:
    metadata:
      annotations:
        azure.workload.identity/inject-proxy-sidecar: "true"
    spec:
      env:
      - name: ARM_CLIENT_ID
        value: <client_id of MID>

With the configuration above, the authentication error disappeared, so the cause seems to be not the backend settings but that Azure workload identity is not supported.

If anyone has Azure workload identity working successfully, I would be happy to hear how.

@maciekdude

It does. I have tf-controller running on AKS with workload identity with no problem.
The trick is to use the OIDC flag and explicitly point to the token.
BTW there's a bug in azurerm 3.44.x. Hence, use anything 3.47 onwards.
Some additional info is here: hashicorp/terraform-provider-azurerm#20671
Some examples are below. Hope that helps.

Env variables that should be set on the runner pod.

        - name: ARM_USE_OIDC
          value: "true"
        - name: ARM_OIDC_TOKEN_FILE_PATH
          value: "/var/run/secrets/azure/tokens/azure-identity-token"

Example yaml:

apiVersion: infra.contrib.fluxcd.io/v1alpha1
kind: Terraform
metadata:
  name: terraformhello
  namespace: default
spec:
  tfstate:
    forceUnlock: auto
  backendConfig:
    customConfiguration: |
      backend "azurerm" {
        resource_group_name  = "l"
        storage_account_name = ""
        container_name       = "tfstate"
        key                  = "helloworld.tfstate"
        use_oidc             = true
      }
  interval: 1m
  serviceAccountName: service_account_registered_in_aad
  approvePlan: auto
  destroy: true
  path: ./tests/fixture
  sourceRef:
    kind: GitRepository
    name: terraformhello
    namespace: flux-system
  runnerPodTemplate:
    spec:
      image: azure_cli_runner.xxx
      env:
        - name: ARM_USE_OIDC
          value: "true"
        - name: ARM_SUBSCRIPTION_ID
          value: ""
        - name: ARM_TENANT_ID
          value: ""
        - name: ARM_CLIENT_ID
          value: ""
        - name: ARM_OIDC_TOKEN_FILE_PATH
          value: "/var/run/secrets/azure/tokens/azure-identity-token"
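
The token file that ARM_OIDC_TOKEN_FILE_PATH points to only exists if the Azure Workload Identity webhook has mutated the runner pod. The fragment below is a minimal sketch of the two pieces that typically enable this, assuming the webhook is installed in the cluster and a federated identity credential already links the service account to the managed identity. The angle-bracket placeholders are assumptions to be replaced with real values, and the Terraform fragment is meant to be merged into the full example above rather than applied on its own.

```yaml
# Sketch only: placeholder values are assumptions, not values from this thread.
apiVersion: v1
kind: ServiceAccount
metadata:
  # Must match spec.serviceAccountName of the Terraform object.
  name: "<service-account-name>"
  namespace: default
  annotations:
    # Client ID of the managed identity or app registration to federate with.
    azure.workload.identity/client-id: "<client_id of managed identity>"
---
# Fragment: merge these fields into the Terraform object shown above.
apiVersion: infra.contrib.fluxcd.io/v1alpha1
kind: Terraform
metadata:
  name: terraformhello
  namespace: default
spec:
  serviceAccountName: "<service-account-name>"
  runnerPodTemplate:
    metadata:
      labels:
        # Opts the runner pod in to mutation by the workload identity webhook,
        # which projects the federated token into the pod.
        azure.workload.identity/use: "true"
```

If the label or the annotation is missing, the runner pod may start without the projected token, and the provider can then fall back to other credential sources such as the instance metadata endpoint seen in the original error.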

@mingmingshiliyu

Hey, if I want to use it to import existing resources from Azure, and to create, update, and delete Azure resources and Tencent Cloud resources, would that work for me? The docs are too brief to understand.

@maciekdude

I do not see why not. Import existing resources to some tfstate stored on a storage account and that's it. I haven't been looking into the tf controller recently, and I am unsure how to integrate existing infra with TFstate stored in K8s as a secret.

@mingmingshiliyu

I want to find a mature solution. However, there is so little documentation on best practices, and there is no guidance on multicloud management best practices.

@squaremo
Contributor

squaremo commented Nov 2, 2023

Candidate for writing up in "How to tf-controller with Azure", or does it need more investigation @chanwit?

@LappleApple LappleApple added the kind/docs Improvements or additions to documentation label Nov 2, 2023
@chanwit
Collaborator

chanwit commented Nov 3, 2023

@squaremo No further investigation needed. We can go straight to writing docs.
