[ISSUE] Azure authentication stopped working with azurerm v2.71.0+ #781

Closed
mark-greene opened this issue Aug 18, 2021 · 13 comments · Fixed by #791
Labels
azure Occurring on Azure cloud

Comments

@mark-greene

mark-greene commented Aug 18, 2021

Terraform Version

terraform v1.0.4
databricks v0.3.7
azurerm v2.71.0+ (works on ≤2.70.0) cc @favoretti

Quick Mitigation: use azurerm ≤ v2.70.0 or azurerm ≥ v2.73.0
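
As a rough sketch of that mitigation, the azurerm provider can be constrained in a required_providers block. The source addresses are taken from the registry paths that appear elsewhere in this thread; the exact databricks pin is only an illustration, and the constraint shown is one of the two options (the other being "<= 2.70.0").

```hcl
terraform {
  required_providers {
    azurerm = {
      source = "hashicorp/azurerm"
      # Skips the releases reported as affected in this issue (2.71.0 and later,
      # before 2.73.0). Use "<= 2.70.0" instead to stay on the older line.
      version = ">= 2.73.0"
    }
    databricks = {
      source = "databrickslabs/databricks"
      # Illustrative pin: the version this issue was reported against.
      version = "0.3.7"
    }
  }
}
```

After changing the constraint, terraform init -upgrade re-resolves the provider versions recorded in the dependency lock file.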

Terraform Configuration Files

provider "databricks" {
  azure_workspace_resource_id = azurerm_databricks_workspace.this.id
  azure_client_id             = var.client_id
  azure_client_secret         = var.client_secret
  azure_tenant_id             = var.tenant_id
}

or another example

provider "databricks" {
  azure_workspace_resource_id = azurerm_databricks_workspace.databricks_ws.id
  azure_client_id             = azuread_application.databricks.application_id
  azure_client_secret         = random_string.password.result
  azure_tenant_id             = data.azurerm_client_config.current.tenant_id
}

Actual Behavior

Error: authentication is not configured for provider. Please configure it
│ through one of the following options:
│ 1. DATABRICKS_HOST + DATABRICKS_TOKEN environment variables.
│ 2. host + token provider arguments.
│ 3. azure_databricks_workspace_id + AZ CLI authentication.
│ 4. azure_databricks_workspace_id + azure_client_id + azure_client_secret + azure_tenant_id for Azure Service Principal authentication.
│ 5. Run `databricks configure --token` that will create ~/.databrickscfg file.
│ 
│ Please check https://registry.terraform.io/providers/databrickslabs/databricks/latest/docs#authentication for details

The logs show Explicit and implicit attributes: azure_client_id, azure_client_secret, azure_tenant_id. However, azure_databricks_workspace_id is empty, and it should not be.

Steps to Reproduce

Please list the steps required to reproduce the issue, for example:

  1. terraform plan

Important Factoids

Azure methods 3 and 4 were working fine until recently. Now only 1 and 2 are working. I have gone back to v0.3.5 with the same results so I'm guessing this is a change on the Azure side but I'm not positive.

The text azure_databricks_workspace_id appears only in the string for this error message, so it would seem that either:

  1. the error message is wrong, or
  2. some other issue is causing the error message

I hit this issue while trying to upgrade our databricks provider from 0.3.2 to 0.3.7. However, that caused TF to also pull in this upgrade:

 provider "registry.terraform.io/hashicorp/azurerm" {
-  version     = "2.67.0"
+  version     = "2.72.0"

And that seems to be the operative change: even if I revert the databricks provider back to 0.3.2, I get this error if I am on the 2.72.0 azurerm, and I don't if I'm on the 2.67.0 azurerm. I did some bisecting, and it seems like it's at 2.71.0 where I get this error. (I don't get it with 2.70.0)

@alexott
Contributor

alexott commented Aug 18, 2021

Have you looked through the behavior changes documented in the changelog?

@mark-greene
Author

That's always the first place I look, and I did not see anything obvious. No behavior changes have been added there since v0.3.4, and this ran successfully with v0.3.6. This is a recent breakage. The error was first seen in our Azure DevOps pipeline that deploys and manages our workspace. To debug the issue, I ran our Terraform automation locally, with the same results. The Azure DevOps pipeline authenticates with a service principal, while locally I use my personal Azure Active Directory credentials via the Azure CLI (az login). When I switched to a Databricks PAT locally, it started working, but we need our automation to use the service principal.

@alexott
Contributor

alexott commented Aug 18, 2021

Interesting. I really thought it was caused by this change, introduced in 0.3.6:

Azure auth with SPN now uses AAD token by default instead of PAT. Previous behavior (using PAT) could be restored by setting azure_use_pat_for_spn to true (#721)

but since it was running with 0.3.6, it must be something else. So we need to look into what could be wrong there.
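
For reference, the changelog entry quoted above describes an opt-out flag. A minimal sketch of how it might be set, reusing the provider block from the issue description (the variable names come from that block; this was ruled out as the cause here, so it is only relevant if PAT-based SPN auth is wanted):

```hcl
provider "databricks" {
  azure_workspace_resource_id = azurerm_databricks_workspace.this.id
  azure_client_id             = var.client_id
  azure_client_secret         = var.client_secret
  azure_tenant_id             = var.tenant_id

  # Per the 0.3.6 changelog entry (#721): restores the previous behavior of
  # using a PAT for the service principal instead of an AAD token.
  azure_use_pat_for_spn = true
}
```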

@alexott changed the title from "Azure authentication stopped working Provider bug" to "[ISSUE] Azure authentication stopped working Provider bug" on Aug 19, 2021
@nfx
Contributor

nfx commented Aug 19, 2021

It might be related to the Google OIDC authorizer then, which was added in 0.3.7.

Roll back to 0.3.6 for now.

@nfx
Contributor

nfx commented Aug 19, 2021

@mark-greene please run TF_LOG=DEBUG terraform apply to enable debug mode through the TF_LOG environment variable. Look specifically for the "Explicit and implicit attributes" lines, which should indicate the authentication attributes used. Please paste those here.

@alexott
Contributor

alexott commented Aug 19, 2021

Interesting. I just retested 0.3.7 with SPN auth (with AAD & PAT), and also with AZ CLI authentication, and it works for me.

For SPN auth with an AAD token you should see:

2021-08-19T08:38:01.144+0200 [INFO]  provider.terraform-provider-databricks_v0.3.7: Using Azure Service Principal client secret authentication: timestamp=2021-08-19T08:38:01.144+0200
2021-08-19T08:38:01.144+0200 [INFO]  provider.terraform-provider-databricks_v0.3.7: Generating AAD token for Azure Service Principal: timestamp=2021-08-19T08:38:01.144+0200
2021-08-19T08:38:01.144+0200 [DEBUG] provider.terraform-provider-databricks_v0.3.7: Getting Workspace ID via management token.: timestamp=2021-08-19T08:38:01.144+0200

For SPN auth with PAT you should see:

2021-08-19T08:32:46.463+0200 [INFO]  provider.terraform-provider-databricks_v0.3.7: Using Azure Service Principal client secret authentication: timestamp=2021-08-19T08:32:46.463+0200
2021-08-19T08:32:46.463+0200 [INFO]  provider.terraform-provider-databricks_v0.3.7: Generating PAT token Azure Service Principal client secret authentication: timestamp=2021-08-19T08:32:46.463+0200
2021-08-19T08:32:46.463+0200 [DEBUG] provider.terraform-provider-databricks_v0.3.7: Getting Workspace ID via management token.: timestamp=2021-08-19T08:32:46.463+0200
2021-08-19T08:32:47.004+0200 [DEBUG] provider.terraform-provider-databricks_v0.3.7: GET https://management.azure.com/subscriptions/....

@mark-greene
Author

mark-greene commented Aug 19, 2021

@nfx Here is the output of the plan. It fails on the plan, so I'm not trying an apply. It does not show any output like the lines @alexott mentioned.

2021-08-19T09:59:32.053-0400 [INFO]  provider.terraform-provider-databricks_v0.3.7: Explicit and implicit attributes:: timestamp=2021-08-19T09:59:32.053-0400
2021-08-19T09:59:32.058-0400 [INFO]  provider.terraform-provider-databricks_v0.3.7: ~/.databrickscfg not found on current host: timestamp=2021-08-19T09:59:32.

Again, I can get a plan to work with methods 1, 2 and 5 (Databricks), just not with 3 or 4 (Azure).

I have tried it with both explicit and implicit Azure credentials.

provider "databricks" {
  azure_workspace_resource_id = var.workspace_resource_id
  # azure_client_id          = "***********"
  # azure_client_secret      = "***********"
  # azure_tenant_id          = "***********"
}

@alexott
Contributor

alexott commented Aug 20, 2021

Please send us debug logs (with TF_LOG=DEBUG DATABRICKS_DEBUG_TRUNCATE_BYTES=250000 set) for both implicit & explicit authentication (make sure that the client secret is removed from the logs for explicit auth). You can send them to alexey.ott at databricks.com

@nfx
Contributor

nfx commented Aug 20, 2021

@nfx changed the title from "[ISSUE] Azure authentication stopped working Provider bug" to "[ISSUE] Azure authentication stopped working with azurerm v2.71.0+" on Aug 20, 2021
@favoretti

favoretti commented Aug 20, 2021

// cc @tombuildsstuff
// cc holiday-mode @WodansSon

@nfx
Contributor

nfx commented Aug 20, 2021

Researching those things now.

@mark-greene
Author

With the release of hashicorp/azurerm v2.73.0 everything is now working as expected.

@nfx
Contributor

nfx commented Aug 20, 2021

I'll improve error reporting in the next release so that it contains the field and environment variable names.

@nfx linked a pull request on Aug 24, 2021 that will close this issue
@nfx added the azure (Occurring on Azure cloud) label on Feb 23, 2022