"Error building AzureRM Client" with device login #7289

Closed
degerrit opened this issue Jun 10, 2020 · 5 comments

degerrit commented Jun 10, 2020

We've been using terraform from within our CI for a year or two, authenticating with device code login (since our CI cannot open a browser window).

az login --use-device-code
az account set --subscription "${SUBSCRIPTION}"
# example of working az command
az resource tag --tags gitlabPipelineId=${CI_PIPELINE_ID} --resource-group "AA" ...
terraform init -input=false
TF_LOG=DEBUG OCI_GO_SDK_DEBUG=v terraform plan -input=false

(We're not using Service Principals in the apply stage, instead relying on the user authentication of the DevOps engineer who takes ownership for the change)

For about a month now, this (unchanged) setup has been broken: Error building AzureRM Client. We upgraded to the latest terraform and the latest 1.x azurerm provider, to no avail. This happens on multiple subscriptions (which are corporately managed).

az commands definitely work in our CI (incl. write operations) - so we are logged in properly using device code and there is successful communication with the Azure API.

But terraform fails to use those same credentials properly now, for some new reason.

$ az login --use-device-code
 WARNING: To sign in, use a web browser to open the page https://microsoft.com/devicelogin and enter the code AAAAAAAAA to authenticate.
 [
   {
     "cloudName": "AzureCloud",
     "id": "a-x-x-x-x",
     "isDefault": true,
     "name": "MyCompany",
     "state": "Enabled",
     "tenantId": "aa-x-x-x-x",
     "user": {
       "name": "me@mycompany.onmicrosoft.com",
       "type": "user"
     }
   },

Additional testing shows that the exact same terraform config works in Azure's cloud shell. Same user, same terraform versions, same config, and same remote backend (terraform state). The only difference I see is our CI's location (behind a proxy) and its use of device login.

Or, maybe, the Azure API behaves differently, and introduced a breaking change for this combination of terraform and device login.

I obviously lack additional debug output from a successful run in our CI for comparison (i.e. from a month ago).

Comparing debug logs, the main difference I see between success and failure is that the line "Using Managed Service Identity for Authentication", which is present in the successful Azure cloud shell run, is missing from the CI run.
From that point on, it seems terraform assumes it should use a service principal, and fails.

How does terraform poll what authentication method(s) are available?

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Terraform (and AzureRM Provider) Version

  • Terraform v0.12.26
  • provider "template" (hashicorp/template) 2.1.2
  • provider "azurerm" (hashicorp/azurerm) 1.44.0

Affected Resource(s)

  • No resources yet, just initial authentication.

Terraform Configuration Files

# This is a minimal, slightly obfuscated config - I've tested that it fails in the same way as our full config

terraform {
  backend "azurerm" {
    storage_account_name = "terraformstateAA"
    container_name       = "tfstate"
    key                  = "terraform.tfstate"
    access_key           = "aaa"
  }
}
provider "azurerm" {
  version = "=1.44.0"
  skip_provider_registration = "true"
  subscription_id = "aa-x-x-x-x"
}
provider "template" {
  version = "~> 2.1"
}
terraform {
  required_version = ">= 0.12"
}
resource "azurerm_application_security_group" "K8sCompute" {
  name                = "AAK8sQualV2Compute"
  location            = "North Europe"
  resource_group_name = "RG_AA"
  tags = {
  }
}

Debug Output

Working Azure Shell terraform run

2020-06-09T16:56:17.350Z [WARN]  plugin.stdio: received EOF, stopping recv loop: err="rpc error: code = Unimplemented desc = unknown service plugin.GRPCStdio"
2020/06/09 16:56:17 [TRACE] GRPCProvider: Configure
2020-06-09T16:56:17.390Z [DEBUG] plugin.terraform-provider-azurerm_v1.35.0_x4: Testing if Service Principal / Client Certificate is applicable for Authentication..
2020-06-09T16:56:17.390Z [DEBUG] plugin.terraform-provider-azurerm_v1.35.0_x4: Testing if Multi Tenant Service Principal / Client Secret is applicable for Authentication..
2020-06-09T16:56:17.390Z [DEBUG] plugin.terraform-provider-azurerm_v1.35.0_x4: Testing if Service Principal / Client Secret is applicable for Authentication..
2020-06-09T16:56:17.390Z [DEBUG] plugin.terraform-provider-azurerm_v1.35.0_x4: Testing if Managed Service Identity is applicable for Authentication..
2020-06-09T16:56:17.390Z [DEBUG] plugin.terraform-provider-azurerm_v1.35.0_x4: Using Managed Service Identity for Authentication
2020-06-09T16:56:17.390Z [DEBUG] plugin.terraform-provider-azurerm_v1.35.0_x4: [DEBUG] Using MSI msiEndpoint "http://localhost:50342/oauth2/token"
2020-06-09T16:56:17.390Z [DEBUG] plugin.terraform-provider-azurerm_v1.35.0_x4: Getting OAuth config for endpoint https://login.microsoftonline.com/ with  tenant aaaaaaaa-bbbb-4444-dddd-777777777777
2020-06-09T16:56:17.390Z [DEBUG] plugin.terraform-provider-azurerm_v1.35.0_x4: [DEBUG] getAuthorizationToken with MSI msiEndpoint "http://localhost:50342/oauth2/token", ClientID "" for msiEndpoint "https://management.azure.com/"
2020-06-09T16:56:17.390Z [DEBUG] plugin.terraform-provider-azurerm_v1.35.0_x4: [DEBUG] getAuthorizationToken with MSI msiEndpoint "http://localhost:50342/oauth2/token", ClientID "" for msiEndpoint "https://graph.windows.net/"

Broken CI terraform run:

2020-06-09T18:53:13.599+0200 [WARN]  plugin.stdio: received EOF, stopping recv loop: err="rpc error: code = Unimplemented desc = unknown service plugin.GRPCStdio"
 2020/06/09 18:53:13 [TRACE] GRPCProvider: Configure
 2020-06-09T18:53:13.654+0200 [DEBUG] plugin.terraform-provider-azurerm_v1.35.0_x4: Testing if Service Principal / Client Certificate is applicable for Authentication..
 2020-06-09T18:53:13.654+0200 [DEBUG] plugin.terraform-provider-azurerm_v1.35.0_x4: Testing if Multi Tenant Service Principal / Client Secret is applicable for Authentication..
 2020-06-09T18:53:13.654+0200 [DEBUG] plugin.terraform-provider-azurerm_v1.35.0_x4: Testing if Service Principal / Client Secret is applicable for Authentication..
 2020-06-09T18:53:13.654+0200 [DEBUG] plugin.terraform-provider-azurerm_v1.35.0_x4: Using Service Principal / Client Secret for Authentication
 2020/06/09 18:53:13 [ERROR] <root>: eval: *terraform.EvalConfigProvider, err: Error building AzureRM Client: 2 errors occurred:
 	* A Client ID must be configured when authenticating as a Service Principal using a Client Secret.
 	* A Tenant ID must be configured when authenticating as a Service Principal using a Client Secret.
 2020/06/09 18:53:13 [ERROR] <root>: eval: *terraform.EvalSequence, err: Error building AzureRM Client: 2 errors occurred:
 	* A Client ID must be configured when authenticating as a Service Principal using a Client Secret.
 	* A Tenant ID must be configured when authenticating as a Service Principal using a Client Secret.
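
Both logs above contain a single "Using ... for Authentication" line that names the method the provider settled on. A minimal sketch for pulling that line out of a debug run, assuming the log format shown here (the function name `selected_auth_method` is made up for illustration):

```shell
# Print the authentication method the azurerm provider reports choosing.
# Reads a Terraform debug log on stdin. Only lines of the form
# "Using <method> for Authentication" match; the "Testing if ..." probe
# lines are ignored because they do not contain "Using".
selected_auth_method() {
  sed -n 's/.*Using \(.*\) for Authentication.*/\1/p'
}

# Possible usage (Terraform writes its debug log to stderr):
#   TF_LOG=DEBUG terraform plan -input=false 2>&1 | selected_auth_method
```

Run against the broken CI log above, this prints "Service Principal / Client Secret", which is the first hint that a client secret is being picked up from somewhere.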

Expected Behavior

Terraform should use the saved az authentication.

Actual Behavior

Terraform could not authenticate.

Steps to Reproduce

  1. az login --use-device-code
  2. Browse to https://microsoft.com/devicelogin and authorize the "device"
  3. terraform plan
  4. fail

Important Factoids

  • There are Azure custom roles in place (corporate) over which we have no control
  • Private peering over ExpressRoute; internet access (incl. the Azure API) goes out through a corporate proxy (but note that 'az' commands work)

References

  • #0000

@degerrit
Author

Just upgraded the azure-cli package - I had missed that difference - it was 2.0.80 - now 2.7.0
Unfortunately, this does not help.

Anything I can do to check the accessTokens.json? I assume that's what the terraform provider uses somehow?

@degerrit
Author

degerrit commented Jun 11, 2020

I'm thinking this could be similar to:

We do have multiple subscriptions.

And I believe I may have recently been added to a new subscription, presumably with the same tenant, which may have triggered this bug?
On the other hand, terraform in the Azure Cloud Shell doesn't have a problem with all this.

@degerrit
Author

I think I finally found the problem.

A colleague had set ARM_CLIENT_SECRET in our CI, i.e. stored outside our codebase (in GitLab CI project settings), when he was working with Service Principals in another branch.

Apparently our production branch was inferring credential mechanisms based on that, and failing.

Close? Or can the terraform output be improved to inform us that it has found this variable and intends to use it?
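
A stray variable like this is easy to miss because it lives in CI settings rather than in the repository. A minimal sketch of a pre-flight check that lists which ARM_* variables are present without printing their values (the function name `list_arm_vars` is hypothetical, and it assumes variable values do not span multiple lines):

```shell
# List the names (never the values) of ARM_* variables in the environment.
# Only the part before the first "=" is printed, so secrets stay out of
# the CI job log.
list_arm_vars() {
  env | sed -n 's/^\(ARM_[A-Za-z0-9_]*\)=.*/\1/p' | sort
}

# Possible usage in a CI job, before "terraform plan":
#   echo "ARM_* variables visible to this job:"; list_arm_vars
```

Seeing ARM_CLIENT_SECRET in that list on a branch that is supposed to rely on az login would point straight at the cause described above.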

@tombuildsstuff
Member

hey @degerrit

Thanks for opening this issue.

When running in CloudShell Terraform uses MSI rather than the Azure CLI for authentication - which is why this behaviour is different. Terraform has a list of supported authentication methods (shared by both the Azure Backend and the Azure Provider) which get tried in turn (if enabled) - as such we work down the list until we find one that applies.
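
To make the "work down the list" idea concrete, here is a simplified and hypothetical model of it in shell. It is not the provider's code: the order is taken from the "Testing if ..." lines in the debug output earlier in this issue, it looks only at environment variables (not at values set inline in the provider block), it leaves out the multi-tenant check, and it assumes the Azure CLI is the last fallback. The function name `guess_auth_method` is made up for illustration.

```shell
# Simplified, hypothetical model of the selection order seen in the debug log.
# NOT the provider's implementation: environment variables only, no inline
# provider settings, no multi-tenant service principal check.
guess_auth_method() {
  if [ -n "${ARM_CLIENT_CERTIFICATE_PATH:-}" ]; then
    echo "Service Principal / Client Certificate"
  elif [ -n "${ARM_CLIENT_SECRET:-}" ]; then
    # A client secret alone is enough to select this method, even when
    # the client ID and tenant ID are missing (as in the failing CI run).
    echo "Service Principal / Client Secret"
  elif [ "${ARM_USE_MSI:-}" = "true" ]; then
    echo "Managed Service Identity"
  else
    # Assumed fallback: credentials from "az login".
    echo "Azure CLI"
  fi
}
```

Under this model the first applicable method wins and later ones are never tried, which matches the failing log: the provider stops at Service Principal / Client Secret and reports the missing Client ID and Tenant ID instead of falling through to the Azure CLI.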

In this instance, since we're logging "Using Service Principal / Client Secret for Authentication", it appears that a Client Secret is being specified (since this is the criterion for using this auth method) - so I'd recommend double-checking the Environment Variables being used here (since from the Terraform Configuration it appears that nothing's being passed inline).

There have been some changes to the way that the Azure CLI authentication works throughout the 1.x lifecycle, where we've gone from parsing the accessTokens.json file to shelling out to the Azure CLI instead; whilst this behaviour will change in an upcoming release of 2.x, this doesn't appear to be the root cause since it appears that a Client Secret is being specified.

It's worth noting that the behaviour of the Azure CLI is unpredictable when being run in a headless environment (for example, we frequently see spurious output from the Azure CLI). Whilst this may work as expected when fully configured (for example, configuring any preferences such as data collection for the Azure CLI prior to running Terraform), this behaviour can change under our feet, so unfortunately this isn't something we officially support when running in an automated environment (although I can understand your use-case).

Since this should be fixed by removing the Client Secret being used here - I'm going to close this issue for the moment - but should you have further questions I believe you should be able to get an answer for this using one of the Community Resources.

Thanks!
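
If the variable cannot be removed from the CI settings right away (for example because another branch still needs it), one possible workaround is to hide it for a single command. A minimal sketch, assuming a POSIX shell; the wrapper name `without_sp_credentials` is hypothetical:

```shell
# Run a command with the service principal variables hidden from it.
# The unset happens in a subshell, so the calling shell (and any other
# job step that needs the service principal) is left untouched.
without_sp_credentials() {
  (
    unset ARM_CLIENT_SECRET ARM_CLIENT_CERTIFICATE_PATH
    "$@"
  )
}

# Possible usage:
#   without_sp_credentials terraform plan -input=false
```

Removing the variable at its source remains the cleaner fix; the wrapper only scopes the workaround to the commands that are meant to use the az login credentials.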

@ghost

ghost commented Jul 11, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. If you feel I made an error 🤖 🙉 , please reach out to my human friends 👉 hashibot-feedback@hashicorp.com. Thanks!

ghost locked and limited conversation to collaborators Jul 11, 2020