In both of the specified staging and prod workspaces, this module:
- Creates and configures a service principal with appropriate permissions and entitlements to run CI/CD for a project.
- Creates a workspace directory as a container for project-specific resources.
The service principals are granted CAN_MANAGE permissions on the created workspace directories.
NOTE:
- This module is in preview so it is still experimental and subject to change. Feedback is welcome!
- The Databricks providers that are passed into the module must be configured with workspace admin permissions.
- The module assumes that the MLOps AWS Infrastructure Module has already been applied, namely that service principal groups with token usage permissions have been created, either with the default name "mlops-service-principals" or with a custom name passed through the service_principal_group_name field.
- The service principal tokens are created with a default expiration of 100 days (8640000 seconds), and the module will need to be re-applied after this time to refresh the tokens.
provider "databricks" {
  alias = "staging"
  # Authenticate using preferred method as described in Databricks provider
}
provider "databricks" {
  alias = "prod"
  # Authenticate using preferred method as described in Databricks provider
}
module "mlops_aws_project" {
  source = "databricks/mlops-aws-project/databricks"
  providers = {
    databricks.staging = databricks.staging
    databricks.prod    = databricks.prod
  }
  service_principal_name = "example-name"
  project_directory_path = "/dir-name"
}
Usage example with MLOps AWS Infrastructure Module
provider "databricks" {
  alias = "dev"
  # Authenticate using preferred method as described in Databricks provider
}
provider "databricks" {
  alias = "staging"
  # Authenticate using preferred method as described in Databricks provider
}
provider "databricks" {
  alias = "prod"
  # Authenticate using preferred method as described in Databricks provider
}
module "mlops_aws_infrastructure" {
  source = "databricks/mlops-aws-infrastructure/databricks"
  providers = {
    databricks.dev     = databricks.dev
    databricks.staging = databricks.staging
    databricks.prod    = databricks.prod
  }
  staging_workspace_id          = "123456789"
  prod_workspace_id             = "987654321"
  additional_token_usage_groups = ["users"] # This field is optional.
}
module "mlops_aws_project" {
  source = "databricks/mlops-aws-project/databricks"
  providers = {
    databricks.staging = databricks.staging
    databricks.prod    = databricks.prod
  }
  service_principal_name       = "example-name"
  project_directory_path       = "/dir-name"
  service_principal_group_name = module.mlops_aws_infrastructure.service_principal_group_name
  # The field above is optional, since service_principal_group_name will be "mlops-service-principals" either way,
  # but referencing the module output also creates an implicit dependency. It can be replaced with the following
  # line to create an explicit dependency instead:
  # depends_on = [module.mlops_aws_infrastructure]
}
Using the service principal tokens output by this module to configure additional Databricks providers can be helpful for common use cases such as Git authorization for Remote Git Jobs.
data "databricks_current_user" "staging_user" {
  provider = databricks.staging
}
data "databricks_current_user" "prod_user" {
  provider = databricks.prod
}
provider "databricks" {
  alias = "staging_sp"
  host  = data.databricks_current_user.staging_user.workspace_url
  token = module.mlops_aws_project.staging_service_principal_token
}
provider "databricks" {
  alias = "prod_sp"
  host  = data.databricks_current_user.prod_user.workspace_url
  token = module.mlops_aws_project.prod_service_principal_token
}
resource "databricks_git_credential" "staging_git" {
  provider              = databricks.staging_sp
  git_username          = var.git_username
  git_provider          = var.git_provider
  personal_access_token = var.git_token # This should be configured with `repo` scope for Databricks Repos.
}
resource "databricks_git_credential" "prod_git" {
  provider              = databricks.prod_sp
  git_username          = var.git_username
  git_provider          = var.git_provider
  personal_access_token = var.git_token # This should be configured with `repo` scope for Databricks Repos.
}
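The Git credential resources above reference var.git_username, var.git_provider, and var.git_token without showing their declarations. A minimal sketch of those declarations, assuming they live in the same root module; the descriptions and the choice to mark the token as sensitive are illustrative and not taken from the module itself:

```hcl
# Hypothetical declarations for the variables used by the Git credential resources.
variable "git_username" {
  type        = string
  description = "Username of the Git account used by the service principals."
}

variable "git_provider" {
  type        = string
  description = "Git provider identifier accepted by databricks_git_credential, such as gitHub."
}

variable "git_token" {
  type        = string
  description = "Git personal access token (repo scope for Databricks Repos)."
  sensitive   = true # Keeps the value out of plan and apply output.
}
```

The token can then be supplied outside of version control, for example through the TF_VAR_git_token environment variable.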
Name | Version |
---|---|
terraform | >=1.1.6 |
databricks | >=0.5.8 |
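The version constraints above can be pinned in the calling configuration. A minimal sketch, assuming the Databricks provider is pulled from the public Terraform Registry; the source address is an assumption and should be confirmed against the provider documentation for the version in use:

```hcl
terraform {
  required_version = ">= 1.1.6"

  required_providers {
    databricks = {
      # Assumption: registry source address of the Databricks provider.
      source  = "databricks/databricks"
      version = ">= 0.5.8"
    }
  }
}
```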
Name | Description | Type | Default | Required |
---|---|---|---|---|
service_principal_name | The display name for the service principals. | string | N/A | yes |
project_directory_path | Path/Name of Databricks workspace directory to be created for the project. NOTE: The parent directories in the path must already be created. | string | N/A | yes |
service_principal_group_name | The name of the service principal group in the staging and prod workspace. The created service principals will be added to this group. | string | "mlops-service-principals" | no |
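Because the parent directories in project_directory_path must already exist, one option is to manage them in the same configuration. The following is a sketch only: the "/projects" parent path and the resource names are hypothetical, and it assumes the staging and prod provider aliases from the usage examples.

```hcl
# Hypothetical parent directory, created in each workspace before the module runs.
resource "databricks_directory" "staging_parent" {
  provider = databricks.staging
  path     = "/projects"
}

resource "databricks_directory" "prod_parent" {
  provider = databricks.prod
  path     = "/projects"
}

module "mlops_aws_project" {
  source = "databricks/mlops-aws-project/databricks"
  providers = {
    databricks.staging = databricks.staging
    databricks.prod    = databricks.prod
  }
  service_principal_name = "example-name"
  project_directory_path = "/projects/example-project"

  # Ensure both parent directories exist before the project directory is created.
  depends_on = [
    databricks_directory.staging_parent,
    databricks_directory.prod_parent,
  ]
}
```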
Name | Description | Type | Sensitive |
---|---|---|---|
project_directory_path | Path/Name of Databricks workspace directory created for the project. | string | no |
staging_service_principal_application_id | Application ID of the created Databricks service principal in the staging workspace. | string | no |
staging_service_principal_token | Sensitive personal access token (PAT) value of the created Databricks service principal in the staging workspace. | string | yes |
prod_service_principal_application_id | Application ID of the created Databricks service principal in the prod workspace. | string | no |
prod_service_principal_token | Sensitive personal access token (PAT) value of the created Databricks service principal in the prod workspace. | string | yes |
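If the calling configuration needs to expose these values, for example to hand a token to a CI/CD system, the outputs can be re-exported from the root module. A minimal sketch, assuming the module is instantiated as mlops_aws_project as in the usage examples:

```hcl
output "staging_service_principal_application_id" {
  value = module.mlops_aws_project.staging_service_principal_application_id
}

output "staging_service_principal_token" {
  value     = module.mlops_aws_project.staging_service_principal_token
  sensitive = true # Terraform rejects a sensitive value in an output not marked sensitive.
}
```

The token can then be read with `terraform output -raw staging_service_principal_token`. Note that the value is still stored in the Terraform state, so the state backend should be access-controlled.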
Name | Authentication | Use |
---|---|---|
databricks.staging | Provided by the user. | Look up the service principal group and create the directory and service principal module in the staging workspace. |
databricks.prod | Provided by the user. | Look up the service principal group and create the directory and service principal module in the prod workspace. |
Name | Type |
---|---|
databricks_group.staging_sp_group | data source |
databricks_group.prod_sp_group | data source |
databricks_directory.staging_directory | resource |
databricks_permissions.staging_directory_usage | resource |
databricks_directory.prod_directory | resource |
databricks_permissions.prod_directory_usage | resource |
aws-service-principal.staging_sp | module |
aws-service-principal.prod_sp | module |