Service principal with working Azure Roles as tf context is unable to authenticate via kubernetes provider block but az aks get-credentials and kubectl get pods -n xy works #20843

Closed
1 task done
slzmruepp opened this issue Mar 8, 2023 · 11 comments · Fixed by #20927

Comments

@slzmruepp

Is there an existing issue for this?

  • I have searched the existing issues

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Terraform Version

1.3.7

AzureRM Provider Version

3.45.0

Affected Resource(s)/Data Source(s)

data.azurerm_kubernetes_cluster

Terraform Configuration Files

terraform {
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = ">= 3.39.1"
    }
    azuread = {
      source  = "hashicorp/azuread"
      version = ">= 2.33.0"
    }
    kubernetes = {
      source = "hashicorp/kubernetes"
      version = ">= 2.16.1"
    }
  }
  required_version = ">= 1.3.7"
  backend "azurerm" {
  }
}

data "azurerm_kubernetes_cluster" "aks_provider_config" {
  name                = var.env_config[var.ENV][ "aks_cluster_name" ]
  resource_group_name = var.env_config[var.ENV][ "aks_rg_name" ]
}

provider "kubernetes" {
  host                   = data.azurerm_kubernetes_cluster.aks_provider_config.kube_config.0.host
  username               = data.azurerm_kubernetes_cluster.aks_provider_config.kube_config.0.username
  password               = data.azurerm_kubernetes_cluster.aks_provider_config.kube_config.0.password
  client_certificate     = base64decode(data.azurerm_kubernetes_cluster.aks_provider_config.kube_config.0.client_certificate)
  client_key             = base64decode(data.azurerm_kubernetes_cluster.aks_provider_config.kube_config.0.client_key)
  cluster_ca_certificate = base64decode(data.azurerm_kubernetes_cluster.aks_provider_config.kube_config.0.cluster_ca_certificate)
}

resource "azurerm_role_assignment" "role_cluster_rbac_admin" {
  scope                = "${var.aks_cluster_id}/namespaces/${var.aks_proj_ns}"
  role_definition_name = "Azure Kubernetes Service RBAC Admin"
  principal_id         = azuread_group.sg.id
  depends_on           = [azuread_group.sg]
}

Debug Output/Panic Output

{"error":{"code":"AuthorizationFailed","message":"The client '<<sp-2 objectid>>' with object id '<<sp-2 objectid>>' does not have authorization to perform action 'Microsoft.ContainerService/managedClusters/accessProfiles/listCredential/action' over scope '/subscriptions/<<subscriptionid>>/resourceGroups/<<aks-resource-group-name>>/providers/Microsoft.ContainerService/managedClusters/<<aks-cluster-name>>/accessProfiles/clusterUser' or the scope is invalid. If access was recently granted, please refresh your credentials."}}: timestamp=2023-01-19T15:56:02.124Z
2023-01-19T15:56:02.125Z [ERROR] provider.terraform-provider-azurerm_v3.39.1_x5: Response contains error diagnostic: diagnostic_detail= diagnostic_severity=ERROR tf_rpc=ReadDataSource tf_req_id=1e60db36-ddcc-1dd4-386c-a0cd68dc1a86 @caller=github.com/hashicorp/terraform-plugin-go@v0.14.1/tfprotov5/internal/diag/diagnostics.go:55 @module=sdk.proto diagnostic_summary="retrieving Access Profile for Managed Cluster (Subscription: "<<subscriptionid>>"
Resource Group Name: "<<aks-resource-group-name>>"
Resource Name: "<<aks-cluster-name>>"): managedclusters.ManagedClustersClient#GetAccessProfile: Failure responding to request: StatusCode=403 -- Original Error: autorest/azure: Service returned an error. Status=403 Code="AuthorizationFailed" Message="The client '<<sp-2 objectid>>' with object id '<<sp-2 objectid>>' does not have authorization to perform action 'Microsoft.ContainerService/managedClusters/accessProfiles/listCredential/action' over scope '/subscriptions/<<subscriptionid>>/resourceGroups/<<aks-resource-group-name>>/providers/Microsoft.ContainerService/managedClusters/<<aks-cluster-name>>/accessProfiles/clusterUser' or the scope is invalid. If access was recently granted, please refresh your credentials."" tf_data_source_type=azurerm_kubernetes_cluster tf_proto_version=5.3 tf_provider_addr=provider timestamp=2023-01-19T15:56:02.124Z
2023-01-19T15:56:02.125Z [ERROR] vertex "data.azurerm_kubernetes_cluster.aks_provider_config" error: retrieving Access Profile for Managed Cluster (Subscription: "<<subscriptionid>>"
Resource Group Name: "<<aks-resource-group-name>>"
Resource Name: "<<aks-cluster-name>>"): managedclusters.ManagedClustersClient#GetAccessProfile: Failure responding to request: StatusCode=403 -- Original Error: autorest/azure: Service returned an error. Status=403 Code="AuthorizationFailed" Message="The client '<<sp-2 objectid>>' with object id '<<sp-2 objectid>>' does not have authorization to perform action 'Microsoft.ContainerService/managedClusters/accessProfiles/listCredential/action' over scope '/subscriptions/<<subscriptionid>>/resourceGroups/<<aks-resource-group-name>>/providers/Microsoft.ContainerService/managedClusters/<<aks-cluster-name>>/accessProfiles/clusterUser' or the scope is invalid. If access was recently granted, please refresh your credentials."
2023-01-19T15:56:02.125Z [ERROR] vertex "data.azurerm_kubernetes_cluster.aks_provider_config (expand)" error: retrieving Access Profile for Managed Cluster (Subscription: "<<subscriptionid>>"
Resource Group Name: "<<aks-resource-group-name>>"
Resource Name: "<<aks-cluster-name>>"): managedclusters.ManagedClustersClient#GetAccessProfile: Failure responding to request: StatusCode=403 -- Original Error: autorest/azure: Service returned an error. Status=403 Code="AuthorizationFailed" Message="The client '<<sp-2 objectid>>' with object id '<<sp-2 objectid>>' does not have authorization to perform action 'Microsoft.ContainerService/managedClusters/accessProfiles/listCredential/action' over scope '/subscriptions/<<subscriptionid>>/resourceGroups/<<aks-resource-group-name>>/providers/Microsoft.ContainerService/managedClusters/<<aks-cluster-name>>/accessProfiles/clusterUser' or the scope is invalid. If access was recently granted, please refresh your credentials."
2023-01-19T15:56:02.126Z [INFO]  backend/local: plan operation completed

│ Error: retrieving Access Profile for Managed Cluster (Subscription: "<<subscriptionid>>"
│ Resource Group Name: "<<aks-resource-group-name>>"
│ Resource Name: "<<aks-cluster-name>>"): managedclusters.ManagedClustersClient#GetAccessProfile: Failure responding to request: StatusCode=403 -- Original Error: autorest/azure: Service returned an error. Status=403 Code="AuthorizationFailed" Message="The client '<<sp-2 objectid>>' with object id '<<sp-2 objectid>>' does not have authorization to perform action 'Microsoft.ContainerService/managedClusters/accessProfiles/listCredential/action' over scope '/subscriptions/<<subscriptionid>>/resourceGroups/<<aks-resource-group-name>>/providers/Microsoft.ContainerService/managedClusters/<<aks-cluster-name>>/accessProfiles/clusterUser' or the scope is invalid. If access was recently granted, please refresh your credentials."

│   with data.azurerm_kubernetes_cluster.aks_provider_config,
│   on var-proj.tf line 12, in data "azurerm_kubernetes_cluster" "aks_provider_config":
│   12: data "azurerm_kubernetes_cluster" "aks_provider_config" {

Expected Behaviour

What should have happened?
We want sp-2, which has limited permissions, to see and manage only the project namespace for which it already holds RBAC Admin rights, and to deploy Kubernetes objects only to that namespace through Terraform.
We want the provider configuration to work as documented: the sp-2 used as the tf context has the Kubernetes User Role, which should allow it to download the certs and authenticate to act on the specific namespace.

Actual Behaviour

What actually happened?
Although sp-2 has the appropriate roles, verified by using az aks and kubectl commands to download the kubeconfig and act on the namespace it has the RBAC Admin role for, the kubernetes provider configuration fails with a 403 error.

Steps to Reproduce

  1. We create an AKS cluster with service principal sp-1.
  2. We create a project service principal, sp-2.
  3. sp-2 is a project-based SP which has no IAM roles except the following:
  4. We grant sp-2 the Azure Kubernetes Service Cluster User Role (this should allow it to fetch the kubeconfig).
  5. We grant sp-2 RBAC Admin on the project namespace:
resource "azurerm_role_assignment" "role_cluster_rbac_admin" {
  scope                = "${var.aks_cluster_id}/namespaces/${var.aks_proj_ns}"
  role_definition_name = "Azure Kubernetes Service RBAC Admin"
  principal_id         = azuread_group.sg.id
  depends_on           = [azuread_group.sg]
}

(This allows sp-2 to do everything in its namespace: kubectl get all -n var.aks_proj_ns works, while kubectl get all without a namespace does not.)
This was tested by logging in as sp-2 with az login and running kubectl commands in Azure Pipelines; it works.

  6. If we set up a second Terraform project, authenticate as sp-2, and use the configuration provided above, the plan fails with the same 403 AuthorizationFailed error reproduced in full under Debug Output/Panic Output: sp-2 is not authorized to perform 'Microsoft.ContainerService/managedClusters/accessProfiles/listCredential/action' on the clusterUser access profile, and the failure is reported for data.azurerm_kubernetes_cluster.aks_provider_config.

If I grant sp-2 Contributor role on the aks-resource group, it works without error, but if we then do:

data "kubernetes_namespace" "example" {
  metadata {
    name = var.aks_proj_ns
  }
}

we get the error kubernetes_namespace.example unauthenticated (or similar).

Only if we then change the provider setup to the following:

provider "kubernetes" {
  host                   = data.azurerm_kubernetes_cluster.aks_provider_config.kube_admin_config.0.host
  username               = data.azurerm_kubernetes_cluster.aks_provider_config.kube_admin_config.0.username
  password               = data.azurerm_kubernetes_cluster.aks_provider_config.kube_admin_config.0.password
  client_certificate     = base64decode(data.azurerm_kubernetes_cluster.aks_provider_config.kube_admin_config.0.client_certificate)
  client_key             = base64decode(data.azurerm_kubernetes_cluster.aks_provider_config.kube_admin_config.0.client_key)
  cluster_ca_certificate = base64decode(data.azurerm_kubernetes_cluster.aks_provider_config.kube_admin_config.0.cluster_ca_certificate)
}

everything works as expected. But this means granting the project sp-2, which should have limited permissions, Contributor rights on the AKS resource group (which is a no-go) and also RBAC admin on the cluster. I don't know where the latter comes from; I only suspect that it is inherited from the Contributor role on the resource group.

Important Factoids

This issue was already filed against the kubernetes provider, but I was told to file it here because the azurerm provider causes the bug.

References

Linked Issue: #issue-1551239910

@slzmruepp slzmruepp added the bug label Mar 8, 2023
@github-actions github-actions bot removed the bug label Mar 8, 2023
@slzmruepp
Author

Linked comment with root cause analysis and advice to file it with the azurerm provider: #issuecomment-1435228138

@lonegunmanb
Contributor

lonegunmanb commented Mar 9, 2023

Hello @slzmruepp, thanks for opening this issue. According to your error message:

Resource Name: "<<aks-cluster-name>>"): managedclusters.ManagedClustersClient#GetAccessProfile: Failure responding to request: StatusCode=403 -- Original Error: autorest/azure: Service returned an error. Status=403 Code="AuthorizationFailed" Message="The client '<<sp-2 objectid>>' with object id '<<sp-2 objectid>>' does not have authorization to perform action 'Microsoft.ContainerService/managedClusters/accessProfiles/listCredential/action' over scope '/subscriptions/<<subscriptionid>>/resourceGroups/<<aks-resource-group-name>>/providers/Microsoft.ContainerService/managedClusters/<<aks-cluster-name>>/accessProfiles/clusterUser' or the scope is invalid. If access was recently granted, please refresh your credentials."" tf_data_source_type=azurerm_kubernetes_cluster tf_proto_version=5.3 tf_provider_addr=provider timestamp=2023-01-19T15:56:02.124Z
2023-01-19T15:56:02.125Z [ERROR] vertex "data.azurerm_kubernetes_cluster.aks_provider_config" error: retrieving Access Profile for Managed Cluster (Subscription: "<<subscriptionid>>"
Resource Group Name: "<<aks-resource-group-name>>"

According to the provider code:

func dataSourceKubernetesClusterRead(d *pluginsdk.ResourceData, meta interface{}) error {
	client := meta.(*clients.Client).Containers.KubernetesClustersClient
	subscriptionId := meta.(*clients.Client).Account.SubscriptionId
	ctx, cancel := timeouts.ForRead(meta.(*clients.Client).StopContext, d)
	defer cancel()

	id := managedclusters.NewManagedClusterID(subscriptionId, d.Get("resource_group_name").(string), d.Get("name").(string))
	resp, err := client.Get(ctx, id)
	if err != nil {
		if response.WasNotFound(resp.HttpResponse) {
			return fmt.Errorf("%s was not found", id)
		}

		return fmt.Errorf("retrieving %s: %+v", id, err)
	}

	profileId := managedclusters.NewAccessProfileID(subscriptionId, d.Get("resource_group_name").(string), d.Get("name").(string), "clusterUser")
	profile, err := client.GetAccessProfile(ctx, profileId)
	if err != nil {
		return fmt.Errorf("retrieving Access Profile for %s: %+v", id, err)
	}

It looks like your SP does have permission to query the cluster's information, but it failed when it tried to get the AccessProfile. Unlike the Azure CLI, the Terraform provider must query more information to make the data source complete. Personally, I wouldn't consider this a bug since it's by design (please correct me if HashiCorp has a different opinion).

A workaround I can suggest is to store the K8s credential data in a Key Vault, then grant sp2 permission to read the secret and certificate from it.
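
A minimal sketch of what the Key Vault workaround could look like on the consuming side, assuming a privileged pipeline has already stored a kubeconfig as a secret. The vault name, resource group, and secret name are hypothetical placeholders, and the sketch assumes the stored kubeconfig embeds client certificate data; a kubeconfig that authenticates through an exec plugin would need a different provider block.

```hcl
# Hypothetical names: "example-kv", "example-rg" and "aks-kubeconfig" are placeholders.
data "azurerm_key_vault" "creds" {
  name                = "example-kv"
  resource_group_name = "example-rg"
}

# sp-2 only needs permission to read this secret, not any AKS credential action.
data "azurerm_key_vault_secret" "kubeconfig" {
  name         = "aks-kubeconfig"
  key_vault_id = data.azurerm_key_vault.creds.id
}

locals {
  # A kubeconfig is YAML, so yamldecode turns it into an object.
  kubeconfig = yamldecode(data.azurerm_key_vault_secret.kubeconfig.value)
  cluster    = local.kubeconfig.clusters[0].cluster
  user       = local.kubeconfig.users[0].user
}

provider "kubernetes" {
  host                   = local.cluster.server
  cluster_ca_certificate = base64decode(local.cluster["certificate-authority-data"])
  # Assumption: the stored kubeconfig carries an embedded client certificate and key.
  client_certificate     = base64decode(local.user["client-certificate-data"])
  client_key             = base64decode(local.user["client-key-data"])
}
```

With this layout the azurerm_kubernetes_cluster data source is not needed in the sp-2 project at all, so the GetAccessProfile call is never made.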

@slzmruepp
Author

According to the root cause analysis by @browley86, the Terraform azurerm provider queries the wrong (and soon-to-be-deprecated) API, which I don't think is "by design". The detailed analysis and write-up follows. According to @browley86, this belongs in terraform-provider-azurerm because that is where the code for the API query lives. That's why I moved this issue and linked the original one. Please read here:

"I just wanted to shed a bit more light on the issue, the TLDR is that Terraform is calling a soon-to-be deprecated API. More specifically, based off the error message, the endpoint is calling: https://management.azure.com/subscriptions/{subscriptionId}/resourceGroups/{resourceGroupName}/providers/Microsoft.ContainerService/managedClusters/{resourceName}/accessProfiles/{roleName}/listCredential which the link above calls out will soon be deprecated and to use either the ListClusterUserCredentials or the ListClusterAdminCredentials API. While normally the "soon-to-be deprecated" will imply time to update the underlying APIs, the issue is that newer Azure Service Principal permissions are going to be scoped against the non-deprecated API which means that newer workflows with newly created service principals, such as calling the Azure az cli, will work and Terraform will fail unless the service principal was created with permissions scoped to the old API. I wanted to post the steps to re-create but I really struggled to get curl running. Finally I found a post that illustrates az rest and hitting endpoints so I did the following"
Here is the link to the originally filed issue in the terraform-provider-kubernetes repo: hashicorp/terraform-provider-kubernetes#1964

Also, the tf code follows the documentation, but it does not work as documented. Therefore I think it should be considered a bug.

@lonegunmanb
Contributor

@slzmruepp I see your point. I'll try to reproduce this issue on my side and, if everything goes smoothly, I'll attempt a fix.

@lonegunmanb
Contributor

Hi @slzmruepp, I've done some research on AKS permissions. The role you've assigned to your sp2 is Azure Kubernetes Service RBAC Admin, which contains the following permissions:

{
    "id": "/providers/Microsoft.Authorization/roleDefinitions/3498e952-d568-435e-9b2c-8d77e338d7f7",
    "properties": {
        "roleName": "Azure Kubernetes Service RBAC Admin",
        "description": "Lets you manage all resources under cluster/namespace, except update or delete resource quotas and namespaces.",
        "assignableScopes": [
            "/"
        ],
        "permissions": [
            {
                "actions": [
                    "Microsoft.Authorization/*/read",
                    "Microsoft.Resources/subscriptions/operationresults/read",
                    "Microsoft.Resources/subscriptions/read",
                    "Microsoft.Resources/subscriptions/resourceGroups/read",
                    "Microsoft.ContainerService/managedClusters/listClusterUserCredential/action"
                ],
                "notActions": [],
                "dataActions": [
                    "Microsoft.ContainerService/managedClusters/*"
                ],
                "notDataActions": [
                    "Microsoft.ContainerService/managedClusters/resourcequotas/write",
                    "Microsoft.ContainerService/managedClusters/resourcequotas/delete",
                    "Microsoft.ContainerService/managedClusters/namespaces/write",
                    "Microsoft.ContainerService/managedClusters/namespaces/delete"
                ]
            }
        ]
    }
}

I've tried to reproduce this issue on my side, but I got the error message "does not have authorization to perform action 'Microsoft.ContainerService/managedClusters/read'", which is correct according to the permissions of Azure Kubernetes Service RBAC Admin. Have you assigned other roles to sp2?

I used the following code to reproduce this issue. Would you please help me by correcting my code so I can reproduce the error? Thanks:

variable "client_id" {
  default = ""
}
variable "client_secret" {
  default = ""
}
variable "subscription_id" {
  default = ""
}
variable "tenant_id" {
  default = ""
}
provider "azurerm" {
  features {}
  client_id       = var.client_id
  client_secret   = var.client_secret
  subscription_id = var.subscription_id
  tenant_id       = var.tenant_id
}

provider "azuread" {
  client_id       = var.client_id
  client_secret   = var.client_secret
  tenant_id       = var.tenant_id
}

resource "azurerm_resource_group" "example" {
  name     = "f-20843-1"
  location = "West Europe"
}

resource "azurerm_kubernetes_cluster" "example" {
  name                              = "20843-aks1"
  location                          = azurerm_resource_group.example.location
  resource_group_name               = azurerm_resource_group.example.name
  dns_prefix                        = "exampleaks1"
  role_based_access_control_enabled = true

  default_node_pool {
    name       = "default"
    node_count = 1
    vm_size    = "Standard_D2_v2"
  }

  identity {
    type = "SystemAssigned"
  }
}

resource "azuread_application" "temp_aks" {
  display_name = "temp_aks1"
}

resource "azuread_application_password" "example" {
  application_object_id = azuread_application.temp_aks.object_id
}

resource "azuread_service_principal" "sp" {
  application_id = azuread_application.temp_aks.application_id
}

provider "kubernetes" {
  host                   = azurerm_kubernetes_cluster.example.kube_config.0.host
  client_certificate     = base64decode(azurerm_kubernetes_cluster.example.kube_config.0.client_certificate)
  client_key             = base64decode(azurerm_kubernetes_cluster.example.kube_config.0.client_key)
  cluster_ca_certificate = base64decode(azurerm_kubernetes_cluster.example.kube_config.0.cluster_ca_certificate)
}

resource "kubernetes_namespace" "test1" {
  metadata {
    name = "test1"
  }
}

resource "azurerm_role_assignment" "binding" {
  principal_id         = azuread_service_principal.sp.object_id
  scope                = "${azurerm_kubernetes_cluster.example.id}/namespaces/test1"
  role_definition_name = "Azure Kubernetes Service RBAC Admin"
  depends_on           = [kubernetes_namespace.test1]
}

output "azuread_app_id" {
  value = azuread_application.temp_aks.application_id
}

output "azuread_application_password" {
  sensitive = true
  value     = azuread_application_password.example.value
}

@slzmruepp
Author

slzmruepp commented Mar 10, 2023

Hi, yes, it is in the description, but I did not copy it into the tf code. I wrote in the Expected Behaviour section: "(the sp-2 of the tf context has Kubernetes User Role which should allow it to download the certs and auth for acting on the specific namespace." So sp-2 also has the following role assigned (to a security group the SP is a member of):

resource "azurerm_role_assignment" "role_cluster_user" {
  scope                = var.aks_cluster_id
  role_definition_name = "Azure Kubernetes Service Cluster User Role"
  principal_id         = azuread_group.sg.id
  depends_on           = [azuread_group.sg]
}

resource "azurerm_role_assignment" "role_cluster_rbac_admin" {
  scope                = "${var.aks_cluster_id}/namespaces/${var.aks_proj_ns}"
  role_definition_name = "Azure Kubernetes Service RBAC Admin"
  principal_id         = azuread_group.sg.id
  depends_on           = [azuread_group.sg]
}

resource "azuread_group" "sg" {
  description      = "This group grants access to the specific namespace in the aks environment"
  display_name     = var.sg_name
  owners           = distinct(concat([data.azuread_client_config.current.object_id],var.sg_owners))
  security_enabled = var.sg_security_enabled
  members          = distinct(concat([azuread_service_principal.sp.object_id],var.sg_members))
}

So the User Role should allow sp-2 to fetch the cluster config (kubeconfig) according to the documentation. According to @browley86, this does not work with the following (soon-to-be-deprecated) API, because it does not respect RBAC roles: https://management.azure.com/subscriptions/{subscriptionId}/resourceGroups/{resourceGroupName}/providers/Microsoft.ContainerService/managedClusters/{resourceName}/accessProfiles/{roleName}/listCredential
It does work with this API, because IAM is implemented there:
https://learn.microsoft.com/en-us/rest/api/aks/managedclusters/listclusterusercredentials

Here is the code which tests the assumption:

# [OPTIONAL] Pull latest docker container for run
docker run -it mcr.microsoft.com/azure-cli /bin/bash

# Set ENV for below commands (placeholders are quoted so the shell does not treat < and > as redirections)
export AZURE_CLIENT_ID="<replace w/your Service Principal's clientId>"
export AZURE_CLIENT_SECRET="<replace w/your Service Principal's clientSecret>"
export AZURE_SUBSCRIPTION_ID="<replace w/your Service Principal's subscriptionId>"
export AZURE_TENANT_ID="<replace w/your Service Principal's tenantId>"
export AZURE_RESOURCE_GROUP="<replace w/the Resource Group of the AKS cluster>"
export AZURE_RESOURCE_NAME="<replace w/name of target AKS cluster>"

# Get SP token
az login --service-principal -u "$AZURE_CLIENT_ID" -p "$AZURE_CLIENT_SECRET" --tenant "$AZURE_TENANT_ID"

# Hit accessProfiles endpoint (double quotes so the shell expands the variables)
az rest -m post --header "Accept=application/json" -u "https://management.azure.com/subscriptions/${AZURE_SUBSCRIPTION_ID}/resourceGroups/${AZURE_RESOURCE_GROUP}/providers/Microsoft.ContainerService/managedClusters/${AZURE_RESOURCE_NAME}/accessProfiles/clusterUser/listCredential?api-version=2022-11-01"
## This results in a 403: 
## Forbidden({"error":{"code":"AuthorizationFailed","message":"The client '<CLIENT>' with object id '<SP_OBJECT_ID>' does not have authorization to perform action 'Microsoft.ContainerService/managedClusters/accessProfiles/listCredential/action' over scope '/subscriptions/<SUBSCRIPTION>/resourceGroups/<RESOURCE_GROUP>/providers/Microsoft.ContainerService/managedClusters/<AZURE_RESOURCE_NAME>/accessProfiles/clusterUser' or the scope is invalid. If access was recently granted, please refresh your credentials."}})

# Hit listClusterUserCredential endpoint
az rest -m post --header "Accept=application/json" -u "https://management.azure.com/subscriptions/${AZURE_SUBSCRIPTION_ID}/resourceGroups/${AZURE_RESOURCE_GROUP}/providers/Microsoft.ContainerService/managedClusters/${AZURE_RESOURCE_NAME}/listClusterUserCredential?api-version=2022-11-01"
## Returns 200 w/JSON {"kubeconfigs":[{"name":"clusterUser","value": "<BASE64 ENCODED KUBECONFIG STRING>"}]}

Unfortunately, terraform-provider-azurerm uses the former API, which does not respect IAM roles.
So two questions arise:

  1. Is there a workaround (for example, adding MS Graph API permissions instead of IAM roles, and if so, which permissions)?
  2. Is there a timeline for switching this API in the provider?
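
On the workaround question, one possible stopgap for provider versions that still call the access-profile endpoint is a custom role granting exactly the action named in the 403 error. This is an untested sketch: the role name and resource labels are hypothetical, var.aks_cluster_id and azuread_group.sg are reused from the configuration in this thread, and whether the assignment actually unblocks the data source is an assumption.

```hcl
data "azurerm_subscription" "current" {}

# Hypothetical custom role; the name and description are placeholders.
resource "azurerm_role_definition" "aks_access_profile_reader" {
  name        = "AKS Access Profile Credential Reader"
  scope       = data.azurerm_subscription.current.id
  description = "Grants only the access-profile credential action named in the 403 error."

  permissions {
    # Action string copied from the AuthorizationFailed message above.
    actions     = ["Microsoft.ContainerService/managedClusters/accessProfiles/listCredential/action"]
    not_actions = []
  }

  assignable_scopes = [data.azurerm_subscription.current.id]
}

# Assign the custom role at cluster scope to the same group that holds the other two roles.
resource "azurerm_role_assignment" "aks_access_profile_reader" {
  scope              = var.aks_cluster_id
  role_definition_id = azurerm_role_definition.aks_access_profile_reader.role_definition_resource_id
  principal_id       = azuread_group.sg.object_id
}
```

This keeps sp-2 far below Contributor on the resource group, at the cost of maintaining a custom role that exists only to satisfy the older API call.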

@lonegunmanb
Contributor

Thanks @slzmruepp for your detailed explanation, I'll try a pr to switch the API, but cannot promise a timeline.

@slzmruepp
Author

slzmruepp commented Mar 10, 2023

Thanks @slzmruepp for your detailed explanation, I'll try a pr to switch the API, but cannot promise a timeline.

I added the missing pieces to the comment. Basically, we create a group, add the service principal as a member, and assign the group two roles: Kubernetes User on the cluster and Kubernetes RBAC Admin on the namespace. We expect the roles to be inherited by the group members. (The group also acts as a break-glass group to which human users can be added if kubectl interactions become necessary.)

@browley86

Hi @lonegunmanb instead of switching the backend API for the existing resource, it might be easier, faster, and less risky to just add a param to specify the API backend in the provider config. That way, for now, the default would be to keep the existing working accessProfiles functionality to the old backend API and, for people who have newer SPs, give them a mechanism to override to use the newer listClusterUserCredential API. This provides a workaround for this issue and also would provide a workaround in the future if Microsoft decides to update its API endpoints. It would also allow testing with the override endpoint to make sure any surrounding code works. Thanks in advance for helping with this.

lonegunmanb added a commit to lonegunmanb/terraform-provider-azurerm that referenced this issue Mar 14, 2023
lonegunmanb added a commit to lonegunmanb/terraform-provider-azurerm that referenced this issue Mar 15, 2023
lonegunmanb added a commit to lonegunmanb/terraform-provider-azurerm that referenced this issue Mar 20, 2023
@github-actions github-actions bot added this to the v3.49.0 milestone Mar 20, 2023
stephybun pushed a commit that referenced this issue Mar 20, 2023
…le (#20927)

* use list API instead of GetAccessProfile to fix #20843

* change GetAccessProfile to List Credentials API for kubernetes cluster resource

* remove unused functions
@github-actions

This functionality has been released in v3.49.0 of the Terraform Provider. Please see the Terraform documentation on provider versioning or reach out if you need any assistance upgrading.

For further feature requests or bug reports with this functionality, please create a new GitHub issue following the template. Thank you!
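
Since the fix shipped in v3.49.0, a configuration like the one in the original report could raise its azurerm constraint so that older releases are never selected. A minimal sketch, changing only the azurerm entry of the original required_providers block:

```hcl
terraform {
  required_providers {
    azurerm = {
      source = "hashicorp/azurerm"
      # v3.49.0 is the release named in this thread as containing the fix.
      version = ">= 3.49.0"
    }
  }
}
```

The original constraint (">= 3.39.1") already permits 3.49.0, but an existing dependency lock file may keep an older release selected until the providers are upgraded, for example with terraform init -upgrade.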

@github-actions

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Apr 24, 2023