[ISSUE] Issue with databricks_metastore resource #2615

Closed
probrittle opened this issue Aug 25, 2023 · 21 comments

Comments

@probrittle

Hello, I'm trying to deploy the databricks_metastore resource to our Databricks account but I'm encountering a weird error when trying to run terraform apply.

Configuration

provider "databricks" {
  alias = "workspace"
  host  = module.workspace.workspace_url
}

provider "databricks" {
  alias      = "account"
  host       = module.workspace.workspace_url
  account_id = module.workspace.account_id
}

resource "databricks_metastore" "this" {
  name          = "primary"
  storage_root  = "s3://${data.aws_s3_bucket.this.id}/metastore"
  owner         = "admin"
  region        = "us-east-1"
  force_destroy = true

  provider = databricks.account
}

Expected Behavior

Metastore resource is created in Databricks account.

Actual Behavior

Error: cannot create metastore: invalid character '<' looking for beginning of value

  with databricks_metastore.this,
  on root.tf line 60, in resource "databricks_metastore" "this":
 60: resource "databricks_metastore" "this" {

Steps to Reproduce

1. terraform plan
2. terraform apply

Terraform and provider versions

terraform {
  required_version = "~> 1.5.0"
}

provider "registry.terraform.io/databricks/databricks" {
  version     = "1.24.0"
}

provider "registry.terraform.io/hashicorp/aws" {
  version     = "4.67.0"
}

@nkvuong
Contributor

nkvuong commented Aug 25, 2023

@probrittle could you check two things:

  • whether your account provider's host is pointing to the workspace URL (host = module.workspace.workspace_url)
  • whether the identity Terraform uses is an account admin

@probrittle
Author

@nkvuong Yes to both.

@probrittle
Author

@nkvuong Correction on my last comment:

  1. The host argument in the account-alias databricks provider points to our Databricks account URL.

  2. The owner argument of the metastore resource points to the account admin email.

@nkvuong
Contributor

nkvuong commented Aug 29, 2023

@probrittle

you also need to configure the account provider alias with an account admin authentication (usually an OAuth token generated from the account console)

the error message you are getting is because the authentication has failed

@probrittle
Author

@nkvuong I added the authentication piece:

provider "databricks" {
  alias         = "account"
  host          = var.host
  account_id    = module.workspace.account_id
  client_id     = var.client_id
  client_secret = var.client_secret
}

However, I'm still getting the same error:

│ Error: cannot create metastore: invalid character '<' looking for beginning of value
│ 
│   with databricks_metastore.this,
│   on main.tf line 8, in resource "databricks_metastore" "this":
│    8: resource "databricks_metastore" "this" {

@probrittle
Author

@nkvuong I've also tried with the username & password fields, as well as the token field by itself in the account databricks provider section and I still got the same error.

@probrittle
Author

@nkvuong I was able to create the metastore resource by using the workspace level databricks provider as opposed to account like this:

provider "databricks" {
  alias = "workspace"
  host  = module.workspace.workspace_url
}

resource "databricks_metastore" "this" {
  name          = "primary"
  storage_root  = "s3://${data.aws_s3_bucket.this.id}/metastore"
  owner         = var.owner
  region        = "us-east-1"
  force_destroy = true

  provider = databricks.workspace
}

This is a bug, correct? I'm asking because you mentioned that the databricks provider needs to be at the account level.

Also, I'm now encountering the error message invalid character '<' looking for beginning of value when creating the metastore assignment resource. Here is my current configuration:

provider "databricks" {
  alias      = "account"
  host       = var.host
  account_id = module.workspace.account_id
}

provider "databricks" {
  alias = "workspace"
  host  = module.workspace.workspace_url
}

resource "databricks_metastore" "this" {
  name          = "primary"
  storage_root  = "s3://${data.aws_s3_bucket.this.id}/metastore"
  owner         = var.owner
  region        = "us-east-1"
  force_destroy = true

  provider = databricks.workspace
}

resource "databricks_metastore_assignment" "this" {
  metastore_id = databricks_metastore.this.id
  workspace_id = module.workspace.workspace_id

  provider = databricks.account
}

Please let me know what can be done to resolve this.

@nkvuong
Contributor

nkvuong commented Sep 4, 2023

@probrittle the error message shows that the provider cannot authenticate with the account-level API - could you check that you are able to do this with just the sdk or api?

@probrittle
Author

@nkvuong This works using an account admin token for SDK and API.

Additionally, this works in Terraform IF we use a workspace-level databricks provider block as opposed to an account-level one.

However, these resources are supposed to be deployed using an account-level databricks provider block, correct?

Or have I misunderstood and these resources are supposed to be created using a workspace-level databricks provider block?

@nkvuong
Contributor

nkvuong commented Sep 6, 2023

@probrittle from 1.24.0 onwards, these resources can be deployed (and it is recommended) using an account-level databricks provider block
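
As a rough sketch of what this could look like (not a confirmed working configuration), the metastore from the original report can be attached to the account-level alias, with the provider pinned to 1.24.0 or later. The ">= 1.24.0" constraint is an assumption based on the version mentioned above; databricks.account, data.aws_s3_bucket.this, and the remaining values are taken from the configuration earlier in this thread, and the owner argument is left out for brevity.

```hcl
terraform {
  required_providers {
    databricks = {
      source = "databricks/databricks"
      # Assumed constraint: account-level metastore support is described
      # in this thread as available from 1.24.0 onwards.
      version = ">= 1.24.0"
    }
  }
}

# Metastore managed through the account-level provider alias.
# The alias itself must be configured with account admin credentials.
resource "databricks_metastore" "this" {
  provider      = databricks.account
  name          = "primary"
  storage_root  = "s3://${data.aws_s3_bucket.this.id}/metastore"
  region        = "us-east-1"
  force_destroy = true
}
```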

@probrittle
Author

@nkvuong I'm running version 1.24.0 of the databricks provider and I'm still having issues deploying these resources using an account-level databricks provider block:

terraform {
  required_version = "~> 1.5.0"
}

provider "registry.terraform.io/databricks/databricks" {
  version     = "1.24.0"
}

provider "registry.terraform.io/hashicorp/aws" {
  version     = "4.67.0"
}

@probrittle
Author

@nkvuong Has there been any update on this?

Additionally, I want to add one thing to what I mentioned about the SDK and API.

This works using an account admin token for SDK and API but only if you use the workspace URL, if you use the account URL then this does not work.

Could you please let us know as soon as this gets resolved? This seems more and more like a bug and this is blocking us from being able to deploy Unity Catalog/Databricks resources via Terraform at the account-level.

Thanks.

@nkvuong
Contributor

nkvuong commented Sep 8, 2023

> This works using an account admin token for SDK and API but only if you use the workspace URL, if you use the account URL then this does not work.

there is clearly an issue with the token you are using to authenticate against the account-level API, and the Terraform provider just returns the error message. I am not sure if there is anything to be done on this project to help

@grusin-db

grusin-db commented Sep 12, 2023

I think the issue is with the provider setup:

provider "databricks" {
  alias      = "account"
  host       = module.workspace.workspace_url
  account_id = module.workspace.account_id
}

The host there should point to the account console host, not the workspace host as it does now; that's why you are getting the auth error (the '<' error is an auth error).

For Azure, it should look like the block below. The account endpoint host is static per cloud, so you can hardcode it if you don't plan to run the code on multiple clouds.

provider "databricks" {
  alias      = "account"
  host       = "https://accounts.azuredatabricks.net"
  account_id = "c3d0c960-58a1-4b23-b7f5-de6ca6fc1e2b"
}

The Azure docs are here: https://registry.terraform.io/providers/databricks/databricks/latest/docs/guides/unity-catalog-azure#provider-initialization. See the other guides for AWS and Google Cloud, which all show the right account console endpoint.
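
For AWS, a minimal sketch of the equivalent account-level provider block might look like the following. The host is the AWS account console URL that is also quoted later in this thread; var.databricks_account_id is a hypothetical variable name used only for illustration, and authentication settings are deliberately left out.

```hcl
# Hypothetical variable holding the Databricks account ID (not from the thread).
variable "databricks_account_id" {
  type = string
}

# Account-level provider on AWS: host is the account console,
# not a workspace URL.
provider "databricks" {
  alias      = "account"
  host       = "https://accounts.cloud.databricks.com"
  account_id = var.databricks_account_id
}
```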

@probrittle
Author

@grusin-db

We are using the account console host for the host variable. I specified this in an earlier comment:

provider "databricks" {
  alias      = "account"
  host       = var.host
  account_id = module.workspace.account_id
}

Where var.host is the account console host URL just like the example you gave me regarding Azure and we are still getting the authentication error.

@grusin-db

grusin-db commented Sep 12, 2023

Indeed, you did mention that you changed it; I did not notice. Could you please confirm the exact value of var.host?

@grusin-db

also, please try running in debug mode, it should help to see what API calls are made:

TF_LOG=DEBUG DATABRICKS_DEBUG_TRUNCATE_BYTES=250000 terraform apply -no-color 2>&1 |tee tf-debug.log

more hints here: https://registry.terraform.io/providers/databricks/databricks/latest/docs/guides/troubleshooting

@probrittle
Author

@grusin-db

Here is the exact value of var.host:

provider "databricks" {
  alias      = "account"
  host       = "https://accounts.cloud.databricks.com"
  account_id = module.workspace.account_id
}

@probrittle
Author

We also ran the debug command, but I cannot provide the full output as it shows personal credentials and the account ID.

However, the recurring error is this: Error: default auth: cannot configure default credentials.

@bhupendra-patil

We were able to use the CLI with cfg and verified that the client_id and client_secret combination works; however, the Terraform provider still seems to run into the authentication issue.

@probrittle
Author

We are no longer running into the authentication issue for terraform.

We found that authentication was conflicting because we had set the same credentials in the databricks provider block and also exported them as environment variables.

Example:

provider "databricks" {
  alias         = "account"
  host          = "https://accounts.cloud.databricks.com"
  account_id    = module.workspace.account_id
  client_id     = "12345"
  client_secret = "abcdefg"
}

export DATABRICKS_CLIENT_ID="12345"
export DATABRICKS_CLIENT_SECRET="abcdefg"

We resolved this by keeping the alias, host, and account_id variables in the databricks provider block, and removing the client_id and client_secret variables from the block and exporting them as environment variables instead.
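
A sketch of the resulting setup, under the assumption that DATABRICKS_CLIENT_ID and DATABRICKS_CLIENT_SECRET are exported in the shell that runs Terraform, could look like this. The module.workspace outputs and databricks_metastore.this are the names used earlier in this thread, not new definitions.

```hcl
# Credentials are intentionally absent from this block. In this sketch
# they are read from the environment instead:
#   DATABRICKS_CLIENT_ID, DATABRICKS_CLIENT_SECRET
provider "databricks" {
  alias      = "account"
  host       = "https://accounts.cloud.databricks.com"
  account_id = module.workspace.account_id
}

# Account-level resource that uses the alias above.
resource "databricks_metastore_assignment" "this" {
  provider     = databricks.account
  metastore_id = databricks_metastore.this.id
  workspace_id = module.workspace.workspace_id
}
```

The point of the design is to have a single source for each credential, so the provider block and the environment cannot disagree.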

@bhupendra-patil thank you for all the help!
