This module sets up a multi-workspace model registry across a development (dev) workspace, a staging workspace, and a production (prod) workspace, allowing READ access from the dev and staging workspaces to the staging and prod model registries.
The module performs this setup by creating AAD applications and associating them with newly created Azure Databricks service principals in the staging and prod workspaces, then giving those service principals READ-only access to their respective model registries. It also creates secret scopes and stores the necessary secrets in the dev and staging workspaces, granting READ access to these secret scopes only to the "users" group and the generated service principals group. The module outputs the secret scope names and prefixes, because these values are needed to access the remote model registries.
NOTE:
- This module is in preview so it is still experimental and subject to change. Feedback is welcome!
- The Databricks providers that are passed into the module must be configured with workspace admin permissions.
- The Azure Active Directory (AzureAD) provider that is passed into the module must be configured with Application.ReadWrite.All permissions to allow AAD application creation to link to an Azure Databricks service principal. This provider can be authenticated via an AAD service principal with the Application.ReadWrite.All permission.
- To create tokens for service principals, the service principals are added to a group, which is then given the `token_usage` permission. However, setting this permission requires at least one personal access token in the workspace, and the permission strictly overwrites existing permissions. Currently, running this module overwrites permissions so that token usage is allowed only for members of the generated service principals group in the staging and prod workspaces. To give additional groups `token_usage` permissions, set the `additional_token_usage_groups` input variable.
- The service principal tokens stored for remote model registry access are created with a default expiration of 100 days (8640000 seconds), and the module must be re-applied after this time to refresh the tokens.
Usage example
```hcl
provider "databricks" {
  alias = "dev" # Authenticate using preferred method as described in Databricks provider
}

provider "databricks" {
  alias = "staging" # Authenticate using preferred method as described in Databricks provider
}

provider "databricks" {
  alias = "prod" # Authenticate using preferred method as described in Databricks provider
}

provider "azuread" {} # Authenticate using preferred method as described in AzureAD provider

module "mlops_azure_infrastructure_with_sp_creation" {
  source = "databricks/mlops-azure-infrastructure-with-sp-creation/databricks"
  providers = {
    databricks.dev     = databricks.dev
    databricks.staging = databricks.staging
    databricks.prod    = databricks.prod
    azuread            = azuread
  }
  staging_workspace_id          = "123456789"
  prod_workspace_id             = "987654321"
  azure_tenant_id               = "a1b2c3d4-e5f6-g7h8-i9j0-k9l8m7n6o5p4"
  additional_token_usage_groups = ["users"] # This field is optional.
}
```
Usage example with MLOps Azure Project Module with Service Principal Creation
```hcl
provider "databricks" {
  alias = "dev" # Authenticate using preferred method as described in Databricks provider
}

provider "databricks" {
  alias = "staging" # Authenticate using preferred method as described in Databricks provider
}

provider "databricks" {
  alias = "prod" # Authenticate using preferred method as described in Databricks provider
}

provider "azuread" {} # Authenticate using preferred method as described in AzureAD provider

module "mlops_azure_infrastructure_with_sp_creation" {
  source = "databricks/mlops-azure-infrastructure-with-sp-creation/databricks"
  providers = {
    databricks.dev     = databricks.dev
    databricks.staging = databricks.staging
    databricks.prod    = databricks.prod
    azuread            = azuread
  }
  staging_workspace_id          = "123456789"
  prod_workspace_id             = "987654321"
  azure_tenant_id               = "a1b2c3d4-e5f6-g7h8-i9j0-k9l8m7n6o5p4"
  additional_token_usage_groups = ["users"] # This field is optional.
}

module "mlops_azure_project_with_sp_creation" {
  source = "databricks/mlops-azure-project-with-sp-creation/databricks"
  providers = {
    databricks.staging = databricks.staging
    databricks.prod    = databricks.prod
    azuread            = azuread
  }
  service_principal_name       = "example-name"
  project_directory_path       = "/dir-name"
  azure_tenant_id              = "a1b2c3d4-e5f6-g7h8-i9j0-k9l8m7n6o5p4"
  service_principal_group_name = module.mlops_azure_infrastructure_with_sp_creation.service_principal_group_name
  # The above field is optional, especially since in this case service_principal_group_name will be mlops-service-principals either way,
  # but this also serves to create an implicit dependency. Can also be replaced with the following line to create an explicit dependency:
  # depends_on = [module.mlops_azure_infrastructure_with_sp_creation]
}
```
Name | Version |
---|---|
terraform | >=1.1.6 |
databricks | >=0.5.8 |
azuread | >=2.15.0 |
python | >=3.8 |
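
The Terraform and provider constraints in the table above could be pinned in the root configuration roughly as follows. This is a sketch only: the provider source addresses (`databricks/databricks`, `hashicorp/azuread`) are assumptions about where the providers are published and should be checked against the module's own `required_providers` block. The Python requirement cannot be expressed in Terraform and must be met on the machine that runs `terraform apply`.

```hcl
terraform {
  # Minimum Terraform version from the requirements table.
  required_version = ">= 1.1.6"

  required_providers {
    databricks = {
      source  = "databricks/databricks" # assumed registry address
      version = ">= 0.5.8"
    }
    azuread = {
      source  = "hashicorp/azuread" # assumed registry address
      version = ">= 2.15.0"
    }
  }
}
```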
Name | Description | Type | Default | Required |
---|---|---|---|---|
staging_workspace_id | Workspace ID of the staging workspace (can often be found in the URL) used for remote model registry setup. | string | N/A | yes |
prod_workspace_id | Workspace ID of the prod workspace (can often be found in the URL) used for remote model registry setup. | string | N/A | yes |
azure_tenant_id | The Azure tenant ID of the AAD subscription. Must match the one used for the AzureAD Provider. | string | N/A | yes |
additional_token_usage_groups | List of groups that should have token usage permissions in the staging and prod workspaces, along with the created service principal group (mlops-service-principals). | list(string) | [] | no |
Name | Description | Type | Sensitive |
---|---|---|---|
dev_secret_scope_name_for_staging | The name of the secret scope created in the dev workspace that is used for remote model registry access to the staging workspace. | string | no |
dev_secret_scope_name_for_prod | The name of the secret scope created in the dev workspace that is used for remote model registry access to the prod workspace. | string | no |
staging_secret_scope_name_for_prod | The name of the secret scope created in the staging workspace that is used for remote model registry access to the prod workspace. | string | no |
dev_secret_scope_prefix_for_staging | The prefix used in the dev workspace secret scope for remote model registry access to the staging workspace. | string | no |
dev_secret_scope_prefix_for_prod | The prefix used in the dev workspace secret scope for remote model registry access to the prod workspace. | string | no |
staging_secret_scope_prefix_for_prod | The prefix used in the staging workspace secret scope for remote model registry access to the prod workspace. | string | no |
service_principal_group_name | The name of the service principal group created in the staging and prod workspaces. | string | no |
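
As an illustration of how the scope name and prefix outputs above are typically consumed, the sketch below combines them into remote registry URIs of the form `databricks://<scope>:<prefix>`, which is the format Databricks documents for cross-workspace model registry access. The output names (`dev_to_staging_registry_uri` and so on) are hypothetical, and the module label is assumed to match the usage example.

```hcl
# Hypothetical root-module outputs; only the module output names come from this module.
output "dev_to_staging_registry_uri" {
  description = "Registry URI for accessing the staging model registry from the dev workspace."
  value       = "databricks://${module.mlops_azure_infrastructure_with_sp_creation.dev_secret_scope_name_for_staging}:${module.mlops_azure_infrastructure_with_sp_creation.dev_secret_scope_prefix_for_staging}"
}

output "dev_to_prod_registry_uri" {
  description = "Registry URI for accessing the prod model registry from the dev workspace."
  value       = "databricks://${module.mlops_azure_infrastructure_with_sp_creation.dev_secret_scope_name_for_prod}:${module.mlops_azure_infrastructure_with_sp_creation.dev_secret_scope_prefix_for_prod}"
}

output "staging_to_prod_registry_uri" {
  description = "Registry URI for accessing the prod model registry from the staging workspace."
  value       = "databricks://${module.mlops_azure_infrastructure_with_sp_creation.staging_secret_scope_name_for_prod}:${module.mlops_azure_infrastructure_with_sp_creation.staging_secret_scope_prefix_for_prod}"
}
```

A notebook or job in the dev workspace would then pass the resulting URI to its MLflow client as the registry URI when reading models from the remote registry.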
Name | Authentication | Use |
---|---|---|
databricks.dev | Provided by the user. | Generate all resources in the dev workspace. |
databricks.staging | Provided by the user. | Generate all resources in the staging workspace. |
databricks.prod | Provided by the user. | Generate all resources in the prod workspace. |
azuread | Provided by the user. Can be authenticated via azure_client_id, azure_client_secret, azure_tenant_id. | Create an AAD application and client secret for the service principal. |
databricks.staging_sp | Authenticated via host and generated AAD token for service principal. | Obtain service principal PAT. |
databricks.prod_sp | Authenticated via host and generated AAD token for service principal. | Obtain service principal PAT. |
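
For the user-provided `azuread` provider, one possible configuration authenticates with an AAD service principal's client ID and client secret. This is a minimal sketch: the variable names are hypothetical and simply mirror the credential names in the table above, while `client_id`, `client_secret`, and `tenant_id` are the AzureAD provider's own arguments. The service principal is assumed to already hold the Application.ReadWrite.All permission.

```hcl
# Hypothetical input variables holding the service principal credentials.
variable "azure_client_id" {
  type        = string
  description = "Client (application) ID of the AAD service principal."
}

variable "azure_client_secret" {
  type        = string
  description = "Client secret of the AAD service principal."
  sensitive   = true # keeps the secret out of plan and apply output
}

variable "azure_tenant_id" {
  type        = string
  description = "Azure tenant ID; must match the azure_tenant_id passed to the module."
}

provider "azuread" {
  client_id     = var.azure_client_id
  client_secret = var.azure_client_secret
  tenant_id     = var.azure_tenant_id
}
```

Supplying the values through environment variables or a secrets manager, rather than committing them to version control, keeps the client secret out of the repository.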
Name | Type |
---|---|
databricks_current_user.staging_user | data source |
databricks_current_user.prod_user | data source |
databricks_group.staging_sp_group | resource |
databricks_permissions.staging_token_usage | resource |
databricks_group.prod_sp_group | resource |
databricks_permissions.prod_token_usage | resource |
databricks_token.staging_sp_token | resource |
databricks_token.prod_sp_token | resource |
azure-create-service-principal.create_staging_sp | module |
azure-create-service-principal.create_prod_sp | module |
remote-model-registry.remote_model_registry_dev_to_staging | module |
remote-model-registry.remote_model_registry_dev_to_prod | module |
remote-model-registry.remote_model_registry_staging_to_prod | module |
- AAD token generation occasionally fails with `"HTTP Error 400: Bad Request"` even though the query is not actually invalid.
  - Solution: Re-run `terraform apply` and the error should disappear.