terraform-azurerm-res-databricks-workspace

Manages a Databricks Workspace

Requirements

The following requirements are needed by this module:

terraform (>= 1.6.0)
azurerm (>= 3.71.0)
random (>= 3.5.0)
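
As a rough sketch, the version constraints above could be declared in a calling configuration as follows. The provider source addresses (hashicorp/azurerm, hashicorp/random) are the standard registry addresses; the empty features block is the minimum the azurerm provider accepts.

```hcl
# Sketch of a root configuration satisfying the module's requirements.
terraform {
  required_version = ">= 1.6.0"

  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = ">= 3.71.0"
    }
    random = {
      source  = "hashicorp/random"
      version = ">= 3.5.0"
    }
  }
}

# The azurerm provider requires a features block, even if empty.
provider "azurerm" {
  features {}
}
```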

Providers

The following providers are used by this module:

azurerm (>= 3.71.0)
random (>= 3.5.0)

Resources

The following resources are used by this module:

azurerm_databricks_access_connector.this (resource)
azurerm_databricks_virtual_network_peering.this (resource)
azurerm_databricks_workspace.this (resource)
azurerm_databricks_workspace_root_dbfs_customer_managed_key.this (resource)
azurerm_management_lock.this (resource)
azurerm_monitor_diagnostic_setting.this (resource)
azurerm_private_endpoint.this (resource)
azurerm_private_endpoint_application_security_group_association.this (resource)
azurerm_resource_group_template_deployment.telemetry (resource)
azurerm_role_assignment.this (resource)
random_id.telem (resource)
azurerm_resource_group.parent (data source)

Required Inputs

The following input variables are required:

name

Description: Specifies the name of the Databricks Workspace resource. Changing this forces a new resource to be created.

Type: string

resource_group_name

Description: The name of the Resource Group in which the Databricks Workspace should exist. Changing this forces a new resource to be created.

Type: string

sku

Description: The 'sku' value must be one of 'standard', 'premium', or 'trial'.
NOTE: Downgrading to a trial sku from a standard or premium sku will force a new resource to be created.

Type: string
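
A minimal sketch of a module call that sets only the three required inputs. The module source address is an assumption based on the repository name, and the workspace and resource group names are hypothetical placeholders. The resource group is read through a data source, so it is assumed to exist already.

```hcl
# Minimal call: only the required inputs are set.
module "databricks_workspace" {
  source = "Azure/avm-res-databricks-workspace/azurerm" # assumed registry address

  name                = "dbw-example" # hypothetical name
  resource_group_name = "rg-example"  # hypothetical, must already exist
  sku                 = "premium"     # one of standard, premium, trial
}
```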

Optional Inputs

The following input variables are optional (have default values):

access_connector

Description:
Configuration options for the Databricks Access Connector resource. This map includes the following attributes:

name (Required): Specifies the name of the Databricks Access Connector resource. Changing this forces a new resource to be created.
resource_group_name (Optional): The name of the Resource Group in which the Databricks Access Connector should exist. Defaults to the resource group of the databricks instance.
location (Optional): Specifies the supported Azure location where the resource has to be created. Defaults to the location of the databricks instance.
identity (Optional): An identity block. This block supports the following:
- type (Required): Specifies the type of Managed Service Identity that should be configured on the Databricks Access Connector. Possible values include SystemAssigned or UserAssigned.
- identity_ids (Optional): Specifies a list of User Assigned Managed Identity IDs to be assigned to the Databricks Access Connector. Only one User Assigned Managed Identity ID is supported per Databricks Access Connector resource. Note: identity_ids are required when type is set to UserAssigned.
tags (Optional): A mapping of tags to assign to the resource.

Type:

map(object({
    name                = string
    resource_group_name = optional(string, null)
    location            = optional(string, null)
    identity = optional(object({
      type         = string
      identity_ids = optional(list(string))
    }))
    tags = optional(map(string))
  }))

Default: {}
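
For illustration, an access connector with a system-assigned identity might be declared like this. The map key, names, and tag values are hypothetical, and the module source address is an assumption.

```hcl
module "databricks_workspace" {
  source = "Azure/avm-res-databricks-workspace/azurerm" # assumed registry address

  name                = "dbw-example" # hypothetical
  resource_group_name = "rg-example"  # hypothetical
  sku                 = "premium"

  access_connector = {
    # The map key is arbitrary; it only identifies the entry.
    primary = {
      name = "dbac-example" # hypothetical
      identity = {
        type = "SystemAssigned"
      }
      tags = {
        environment = "dev"
      }
      # resource_group_name and location are omitted, so they default
      # to those of the Databricks workspace.
    }
  }
}
```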

custom_parameters

Description: A map of custom parameters for configuring the Databricks Workspace. This object allows for detailed configuration, with each attribute representing a specific setting:

machine_learning_workspace_id - (Optional) The ID of an Azure Machine Learning workspace to link with the Databricks workspace.
nat_gateway_name - (Optional) Name of the NAT gateway for Secure Cluster Connectivity (No Public IP) workspace subnets. Defaults to 'nat-gateway'.
public_ip_name - (Optional) Name of the Public IP for No Public IP workspace with managed vNet. Defaults to 'nat-gw-public-ip'.
no_public_ip - (Optional) Specifies whether public IP Addresses are not allowed. Defaults to false. Note: Updating this parameter is only allowed if the value is changing from false to true and only for VNet-injected workspaces.
public_subnet_name - (Optional) The name of the Public Subnet within the Virtual Network.
public_subnet_network_security_group_association_id - (Optional) The resource ID of the azurerm_subnet_network_security_group_association which is referred to by the public_subnet_name field.
private_subnet_name - (Optional) The name of the Private Subnet within the Virtual Network.
private_subnet_network_security_group_association_id - (Optional) The resource ID of the azurerm_subnet_network_security_group_association which is referred to by the private_subnet_name field.
storage_account_name - (Optional) Default Databricks File Storage account name. Defaults to a randomized name.
storage_account_sku_name - (Optional) Storage account SKU name. Defaults to 'Standard_GRS'.
virtual_network_id - (Optional) The ID of a Virtual Network where the Databricks Cluster should be created. More information about VNet injection can be found in the Azure Databricks documentation.
vnet_address_prefix - (Optional) Address prefix for Managed virtual network. Defaults to '10.139'.

Note: Databricks requires that a network security group is associated with the public and private subnets when a virtual_network_id has been defined.

Type:

object({
    machine_learning_workspace_id                        = optional(string, null)
    nat_gateway_name                                     = optional(string)
    public_ip_name                                       = optional(string)
    no_public_ip                                         = optional(bool, false)
    public_subnet_name                                   = optional(string, null)
    public_subnet_network_security_group_association_id  = optional(string, null)
    private_subnet_name                                  = optional(string, null)
    private_subnet_network_security_group_association_id = optional(string, null)
    storage_account_name                                 = optional(string, null) # Defaults to a randomized name
    storage_account_sku_name                             = optional(string, "Standard_GRS")
    virtual_network_id                                   = optional(string, null)
    vnet_address_prefix                                  = optional(string)
  })

Default: {}
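
The following sketch shows one way custom_parameters could be set for a VNet-injected workspace without public IPs. The input variables and subnet names are hypothetical; they stand in for a virtual network, two subnets, and their network security group associations that are assumed to be managed elsewhere. The module source address is also an assumption.

```hcl
# Hypothetical inputs for networking resources managed elsewhere.
variable "virtual_network_id" { type = string }
variable "public_subnet_nsg_association_id" { type = string }
variable "private_subnet_nsg_association_id" { type = string }

module "databricks_workspace" {
  source = "Azure/avm-res-databricks-workspace/azurerm" # assumed registry address

  name                = "dbw-example" # hypothetical
  resource_group_name = "rg-example"  # hypothetical
  sku                 = "premium"

  custom_parameters = {
    no_public_ip       = true
    virtual_network_id = var.virtual_network_id

    # Both subnets need an NSG association when virtual_network_id is set.
    public_subnet_name                                   = "snet-dbw-public" # hypothetical
    public_subnet_network_security_group_association_id  = var.public_subnet_nsg_association_id
    private_subnet_name                                  = "snet-dbw-private" # hypothetical
    private_subnet_network_security_group_association_id = var.private_subnet_nsg_association_id
  }
}
```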

customer_managed_key_enabled

Description: Is the workspace enabled for customer managed key encryption? If true this enables the Managed Identity for the managed storage account.
Possible values are true or false. Defaults to false.
This field is only valid if the Databricks Workspace sku is set to premium.

Type: bool

Default: false

dbfs_root_cmk_key_vault_key_id

Description: The ID of the customer-managed key for DBFS root.
This is required when customer_managed_key_enabled is set to true.

Type: string

Default: null
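
A sketch of enabling customer-managed key encryption for the DBFS root, assuming a Key Vault key that is managed elsewhere and passed in through a hypothetical variable. The sku is set to premium because customer_managed_key_enabled is only valid for that sku. The module source address is an assumption.

```hcl
# Hypothetical input: the ID of an existing Key Vault key.
variable "dbfs_root_key_id" { type = string }

module "databricks_workspace" {
  source = "Azure/avm-res-databricks-workspace/azurerm" # assumed registry address

  name                = "dbw-example" # hypothetical
  resource_group_name = "rg-example"  # hypothetical
  sku                 = "premium"     # required for customer managed keys

  customer_managed_key_enabled   = true
  dbfs_root_cmk_key_vault_key_id = var.dbfs_root_key_id
}
```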

diagnostic_settings

Description: A map of diagnostic settings to create on the Databricks Workspace. The map key is deliberately arbitrary to avoid issues where map keys may be unknown at plan time.

name - (Optional) The name of the diagnostic setting. One will be generated if not set, however this will not be unique if you want to create multiple diagnostic setting resources.
log_categories - (Optional) A set of log categories to send to the log analytics workspace. Defaults to [].
log_groups - (Optional) A set of log groups to send to the log analytics workspace. Defaults to ["allLogs"].
metric_categories - (Optional) A set of metric categories to send to the log analytics workspace. Defaults to [].
log_analytics_destination_type - (Optional) The destination type for the diagnostic setting. Possible values are Dedicated and AzureDiagnostics. Defaults to Dedicated.
workspace_resource_id - (Optional) The resource ID of the log analytics workspace to send logs and metrics to.
storage_account_resource_id - (Optional) The resource ID of the storage account to send logs and metrics to.
event_hub_authorization_rule_resource_id - (Optional) The resource ID of the event hub authorization rule to send logs and metrics to.
event_hub_name - (Optional) The name of the event hub. If none is specified, the default event hub will be selected.
marketplace_partner_resource_id - (Optional) The full ARM resource ID of the Marketplace resource to which you would like to send diagnostic logs.

Type:

map(object({
    name                                     = optional(string, null)
    log_categories                           = optional(set(string), [])
    log_groups                               = optional(set(string), ["allLogs"])
    metric_categories                        = optional(set(string), [])
    log_analytics_destination_type           = optional(string, "Dedicated")
    workspace_resource_id                    = optional(string, null)
    storage_account_resource_id              = optional(string, null)
    event_hub_authorization_rule_resource_id = optional(string, null)
    event_hub_name                           = optional(string, null)
    marketplace_partner_resource_id          = optional(string, null)
  }))

Default: {}
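
As an illustration, a single diagnostic setting that sends the default log group to a Log Analytics workspace could look like the sketch below. The variable, map key, and names are hypothetical, and the module source address is an assumption.

```hcl
# Hypothetical input: resource ID of an existing Log Analytics workspace.
variable "log_analytics_workspace_id" { type = string }

module "databricks_workspace" {
  source = "Azure/avm-res-databricks-workspace/azurerm" # assumed registry address

  name                = "dbw-example" # hypothetical
  resource_group_name = "rg-example"  # hypothetical
  sku                 = "premium"

  diagnostic_settings = {
    # Arbitrary map key.
    to_log_analytics = {
      name                  = "diag-dbw-example" # hypothetical
      workspace_resource_id = var.log_analytics_workspace_id
      # log_groups is left at its default of ["allLogs"].
    }
  }
}
```

Setting an explicit name matters once more than one entry is defined, since generated names are not unique across entries.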

enable_telemetry

Description: This variable controls whether or not telemetry is enabled for the module.
For more information see https://aka.ms/avm/telemetryinfo.
If it is set to false, then no telemetry will be collected.

Type: bool

Default: true

infrastructure_encryption_enabled

Description: By default, Azure encrypts storage account data at rest. Infrastructure encryption adds a second layer of encryption to your storage account's data.
Possible values are true or false. Defaults to false.
This field is only valid if the Databricks Workspace sku is set to premium.
Changing this forces a new resource to be created.

Type: bool

Default: false

load_balancer_backend_address_pool_id

Description: Resource ID of the Outbound Load balancer Backend Address Pool for Secure Cluster Connectivity (No Public IP) workspace. Changing this forces a new resource to be created.

Type: string

Default: null

location

Description: Azure region where the resource should be deployed. If null, the location will be inferred from the resource group location.

Type: string

Default: null

lock

Description: The lock level to apply to the databricks workspace. Default is None. Possible values are None, CanNotDelete, and ReadOnly.

Type:

object({
    name = optional(string, null)
    kind = optional(string, "None")
  })

Default: {}
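
A short sketch of applying a delete lock to the workspace. The lock name is hypothetical and the module source address is an assumption.

```hcl
module "databricks_workspace" {
  source = "Azure/avm-res-databricks-workspace/azurerm" # assumed registry address

  name                = "dbw-example" # hypothetical
  resource_group_name = "rg-example"  # hypothetical
  sku                 = "standard"

  lock = {
    name = "lock-dbw-example" # hypothetical; optional
    kind = "CanNotDelete"     # None, CanNotDelete, or ReadOnly
  }
}
```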

managed_disk_cmk_key_vault_key_id

Description: Customer managed encryption properties for the Databricks Workspace managed disks.

Once the Databricks Workspace is created, the managed disk encryption set must be added to the Key Vault access policy. The encryption set can be found in the managed resource group under the name 'databricks-encryption-set-'.
Its resource ID can be used to create a Key Vault access policy for the managed disk encryption set. The RBAC role 'Key Vault Crypto Officer' is required to create the access policy.
The Key Vault access policy should be created with the following permissions: 'Get', 'Wrap Key', 'Unwrap Key', 'Sign', 'Verify', 'List'. Alternatively, assign the 'Key Vault Crypto User' role.

NOTE: Disabling CMK for Disk is currently not supported. If you want to disable CMK for managed disks, you must delete the workspace and create a new one.

Type: string

Default: null

managed_disk_cmk_rotation_to_latest_version_enabled

Description: Whether customer managed keys for disk encryption will automatically be rotated to the latest version. Optional.

Type: bool

Default: false
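
A sketch of enabling customer-managed keys for managed disks with automatic rotation, and of surfacing the values needed afterwards to grant the encryption set access to the key. The key ID variable is hypothetical, the premium sku is a choice made for this example, and the module source address is an assumption. The two output names come from the Outputs section of this document.

```hcl
# Hypothetical input: the ID of an existing Key Vault key.
variable "managed_disk_key_id" { type = string }

module "databricks_workspace" {
  source = "Azure/avm-res-databricks-workspace/azurerm" # assumed registry address

  name                = "dbw-example" # hypothetical
  resource_group_name = "rg-example"  # hypothetical
  sku                 = "premium"

  managed_disk_cmk_key_vault_key_id                   = var.managed_disk_key_id
  managed_disk_cmk_rotation_to_latest_version_enabled = true
}

# After creation, these values identify the disk encryption set that
# still needs access to the key.
output "disk_encryption_set_id" {
  value = module.databricks_workspace.databricks_workspace_disk_encryption_set_id
}

output "managed_disk_identity" {
  value = module.databricks_workspace.databricks_workspace_managed_disk_identity
}
```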

managed_identities

Description: Managed identities to be created for the resource.

Type:

object({
    system_assigned            = optional(bool, false)
    user_assigned_resource_ids = optional(set(string), [])
  })

Default: {}

managed_resource_group_name

Description: The name of the resource group where Azure should place the managed Databricks resources.
Changing this forces a new resource to be created.

NOTE: Make sure that this field is unique if you have multiple Databricks Workspaces deployed in your subscription and choose not to have the managed_resource_group_name auto-generated by the Azure Resource Provider. Having multiple Databricks Workspaces deployed in the same subscription with the same managed_resource_group_name may result in some resources that cannot be deleted.

Type: string

Default: null

managed_services_cmk_key_vault_key_id

Description: Databricks Workspace Customer Managed Keys for Managed Services (e.g. Notebooks and Artifacts).

To find the correct Object ID to use for the Key Vault access policy for managed services, follow these steps:
1. Go to portal -> Azure Active Directory.
2. In the "Search your tenant" bar, enter the value 2ff814a6-3304-4ab8-85cb-cd0e6f879c1d.
3. Under the Enterprise application results you will see AzureDatabricks. Click on the AzureDatabricks search result.
4. This opens the Enterprise Application overview blade, which shows three values: the name of the application, the application ID, and the object ID.
5. The value you want is the object ID.
6. The Key Vault access policy should be created with the following permissions: 'Get', 'Wrap Key', 'Unwrap Key', 'Sign', 'Verify', 'List'. Alternatively, assign the 'Key Vault Crypto User' role.

NOTE: Disabling Managed Services (aka CMK for Notebook) is currently not supported. If you want to disable Managed Services, you must delete the workspace and create a new one.

Type: string

Default: null
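
The portal steps above could probably also be done in Terraform. The sketch below is one possible approach, not part of this module: it assumes the azuread provider is configured in addition to azurerm (it is not listed in the module requirements), that the provider version is recent enough to accept the client_id argument, and that the Key Vault uses access policies. The variable and resource names are hypothetical.

```hcl
# Hypothetical input: resource ID of the Key Vault holding the key.
variable "key_vault_id" { type = string }

data "azurerm_client_config" "current" {}

# Look up the AzureDatabricks enterprise application by its application ID
# (the value searched for in step 2) to obtain its object ID.
data "azuread_service_principal" "azure_databricks" {
  client_id = "2ff814a6-3304-4ab8-85cb-cd0e6f879c1d"
}

# Grant the permissions listed in step 6.
resource "azurerm_key_vault_access_policy" "databricks_managed_services" {
  key_vault_id = var.key_vault_id
  tenant_id    = data.azurerm_client_config.current.tenant_id
  object_id    = data.azuread_service_principal.azure_databricks.object_id

  key_permissions = ["Get", "WrapKey", "UnwrapKey", "Sign", "Verify", "List"]
}
```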

network_security_group_rules_required

Description: Does the data plane (clusters) to control plane communication happen over private link endpoint only or publicly?
Possible values are AllRules, NoAzureDatabricksRules, or NoAzureServiceRules.
Required when public_network_access_enabled is set to false.

Type: string

Default: null

private_endpoints

Description: A map of private endpoints to create on the Databricks workspace. The map key is deliberately arbitrary to avoid issues where map keys may be unknown at plan time.

name - (Optional) The name of the private endpoint. One will be generated if not set.
subresource_name - The subresource name for the private endpoint. Must be one of "databricks_ui_api" or "browser_authentication".
role_assignments - (Optional) A map of role assignments to create on the private endpoint. The map key is deliberately arbitrary to avoid issues where map keys may be unknown at plan time. See var.role_assignments for more information.
lock - (Optional) The lock level to apply to the private endpoint. Default is None. Possible values are None, CanNotDelete, and ReadOnly.
tags - (Optional) A mapping of tags to assign to the private endpoint.
subnet_resource_id - The resource ID of the subnet to deploy the private endpoint in.
private_dns_zone_group_name - (Optional) The name of the private DNS zone group. One will be generated if not set.
private_dns_zone_resource_ids - (Optional) A set of resource IDs of private DNS zones to associate with the private endpoint. If not set, no zone groups will be created and the private endpoint will not be associated with any private DNS zones. DNS records must be managed external to this module.
application_security_group_associations - (Optional) A map of resource IDs of application security groups to associate with the private endpoint. The map key is deliberately arbitrary to avoid issues where map keys may be unknown at plan time.
private_service_connection_name - (Optional) The name of the private service connection. One will be generated if not set.
network_interface_name - (Optional) The name of the network interface. One will be generated if not set.
location - (Optional) The Azure location where the resources will be deployed. Defaults to the location of the resource group.
resource_group_name - (Optional) The resource group where the resources will be deployed. Defaults to the resource group of the databricks instance.
ip_configurations - (Optional) A map of IP configurations to create on the private endpoint. If not specified, the platform will create one. The map key is deliberately arbitrary to avoid issues where map keys may be unknown at plan time.
- name - The name of the IP configuration.
- private_ip_address - The private IP address of the IP configuration.

Type:

map(object({
    name             = optional(string, null)
    subresource_name = string
    role_assignments = optional(map(object({
      role_definition_id_or_name             = string
      principal_id                           = string
      description                            = optional(string, null)
      skip_service_principal_aad_check       = optional(bool, false)
      condition                              = optional(string, null)
      condition_version                      = optional(string, null)
      delegated_managed_identity_resource_id = optional(string, null)
    })), {})
    lock = optional(object({
      name = optional(string, null)
      kind = optional(string, "None")
    }), {})
    tags                                    = optional(map(any), null)
    subnet_resource_id                      = string
    private_dns_zone_group_name             = optional(string, "default")
    private_dns_zone_resource_ids           = optional(set(string), [])
    application_security_group_associations = optional(map(string), {})
    private_service_connection_name         = optional(string, null)
    network_interface_name                  = optional(string, null)
    location                                = optional(string, null)
    resource_group_name                     = optional(string, null)
    ip_configurations = optional(map(object({
      name               = string
      private_ip_address = string
      subresource_name   = optional(string)
      member_name        = optional(string)

    })), {})
  }))

Default: {}
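
A sketch of a single private endpoint for the workspace UI and API, linked to an existing private DNS zone. The variables, map key, and names are hypothetical, and the module source address is an assumption.

```hcl
# Hypothetical inputs for resources managed elsewhere.
variable "private_endpoint_subnet_id" { type = string }
variable "private_dns_zone_id" { type = string }

module "databricks_workspace" {
  source = "Azure/avm-res-databricks-workspace/azurerm" # assumed registry address

  name                = "dbw-example" # hypothetical
  resource_group_name = "rg-example"  # hypothetical
  sku                 = "premium"

  private_endpoints = {
    # Arbitrary map key.
    ui_api = {
      name               = "pe-dbw-example-ui-api" # hypothetical; optional
      subresource_name   = "databricks_ui_api"     # or "browser_authentication"
      subnet_resource_id = var.private_endpoint_subnet_id

      # Without zone IDs, no zone group is created and DNS records
      # must be managed outside the module.
      private_dns_zone_resource_ids = [var.private_dns_zone_id]
    }
  }
}
```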

public_network_access_enabled

Description: Allow public access for accessing workspace. Set value to false to access workspace only via private link endpoint.
Possible values include true or false. Defaults to true.
Creation of workspace with PublicNetworkAccess property set to false is only supported for VNet Injected workspace.

Type: bool

Default: true
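
The sketch below combines public_network_access_enabled with network_security_group_rules_required for a workspace reachable only over private link. Because disabling public access is only supported for VNet-injected workspaces, custom_parameters is populated too. The variables and subnet names are hypothetical, the chosen rule value is just one of the allowed options, and the module source address is an assumption.

```hcl
# Hypothetical inputs for networking resources managed elsewhere.
variable "virtual_network_id" { type = string }
variable "public_subnet_nsg_association_id" { type = string }
variable "private_subnet_nsg_association_id" { type = string }

module "databricks_workspace" {
  source = "Azure/avm-res-databricks-workspace/azurerm" # assumed registry address

  name                = "dbw-example" # hypothetical
  resource_group_name = "rg-example"  # hypothetical
  sku                 = "premium"

  public_network_access_enabled = false
  # Required once public network access is disabled.
  network_security_group_rules_required = "NoAzureDatabricksRules"

  custom_parameters = {
    no_public_ip                                         = true
    virtual_network_id                                   = var.virtual_network_id
    public_subnet_name                                   = "snet-dbw-public" # hypothetical
    public_subnet_network_security_group_association_id  = var.public_subnet_nsg_association_id
    private_subnet_name                                  = "snet-dbw-private" # hypothetical
    private_subnet_network_security_group_association_id = var.private_subnet_nsg_association_id
  }
}
```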

role_assignments

Description: A map of role assignments to create on the Databricks workspace. The map key is deliberately arbitrary to avoid issues where map keys may be unknown at plan time.

role_definition_id_or_name - The ID or name of the role definition to assign to the principal.
principal_id - The ID of the principal to assign the role to.
description - The description of the role assignment.
skip_service_principal_aad_check - If set to true, skips the Azure Active Directory check for the service principal in the tenant. Defaults to false.
condition - The condition which will be used to scope the role assignment.
condition_version - The version of the condition syntax. If you are using a condition, the only valid value is '2.0'.

Note: only set skip_service_principal_aad_check to true if you are assigning a role to a service principal.

Type:

map(object({
    role_definition_id_or_name             = string
    principal_id                           = string
    description                            = optional(string, null)
    skip_service_principal_aad_check       = optional(bool, false)
    condition                              = optional(string, null)
    condition_version                      = optional(string, null)
    delegated_managed_identity_resource_id = optional(string, null)
  }))

Default: {}
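
For illustration, a role assignment granting a built-in role on the workspace might be written as follows. The principal ID variable, map key, and description are hypothetical, and the module source address is an assumption.

```hcl
# Hypothetical input: object ID of the user, group, or service principal.
variable "reader_principal_id" { type = string }

module "databricks_workspace" {
  source = "Azure/avm-res-databricks-workspace/azurerm" # assumed registry address

  name                = "dbw-example" # hypothetical
  resource_group_name = "rg-example"  # hypothetical
  sku                 = "standard"

  role_assignments = {
    # Arbitrary map key.
    workspace_reader = {
      role_definition_id_or_name = "Reader"
      principal_id               = var.reader_principal_id
      description                = "Read access to the example workspace" # hypothetical
    }
  }
}
```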

virtual_network_peering

Description: A map of virtual network peering configurations. The map key is deliberately arbitrary to avoid issues where map keys may be unknown at plan time.

name - (Optional) Specifies the name of the Databricks Virtual Network Peering resource. Changing this forces a new resource to be created.
resource_group_name - (Optional) The name of the Resource Group in which the Databricks Virtual Network Peering should exist. Defaults to the resource group of the databricks instance.
remote_address_space_prefixes - (Required) A list of address blocks reserved for the remote virtual network in CIDR notation. Changing this forces a new resource to be created.
remote_virtual_network_id - (Required) The ID of the remote virtual network. Changing this forces a new resource to be created.
allow_virtual_network_access - (Optional) Can the VMs in the local virtual network space access the VMs in the remote virtual network space? Defaults to true.
allow_forwarded_traffic - (Optional) Can the forwarded traffic from the VMs in the local virtual network be forwarded to the remote virtual network? Defaults to false.
allow_gateway_transit - (Optional) Can the gateway links be used in the remote virtual network to link to the Databricks virtual network? Defaults to false.
use_remote_gateways - (Optional) Can remote gateways be used on the Databricks virtual network? Defaults to false.
If the use_remote_gateways is set to true, and allow_gateway_transit on the remote peering is also true, the virtual network will use the gateways of the remote virtual network for transit. Only one peering can have this flag set to true. use_remote_gateways cannot be set if the virtual network already has a gateway.

Type:

map(object({
    name                          = optional(string, null)
    resource_group_name           = optional(string, null)
    remote_address_space_prefixes = list(string)
    remote_virtual_network_id     = string
    allow_virtual_network_access  = optional(bool, true)
    allow_forwarded_traffic       = optional(bool, false)
    allow_gateway_transit         = optional(bool, false)
    use_remote_gateways           = optional(bool, false)
  }))

Default: {}
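
A sketch of peering the workspace's virtual network with a remote virtual network. The variable, map key, peering name, and address range are hypothetical, and the module source address is an assumption.

```hcl
# Hypothetical input: resource ID of the remote virtual network.
variable "hub_virtual_network_id" { type = string }

module "databricks_workspace" {
  source = "Azure/avm-res-databricks-workspace/azurerm" # assumed registry address

  name                = "dbw-example" # hypothetical
  resource_group_name = "rg-example"  # hypothetical
  sku                 = "standard"

  virtual_network_peering = {
    # Arbitrary map key.
    to_hub = {
      name                          = "peer-dbw-to-hub" # hypothetical; optional
      remote_virtual_network_id     = var.hub_virtual_network_id
      remote_address_space_prefixes = ["10.1.0.0/16"] # hypothetical remote range
      allow_forwarded_traffic       = true
      # allow_virtual_network_access defaults to true; the gateway
      # options default to false.
    }
  }
}
```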

Outputs

The following outputs are exported:

databricks_id

Description: The ID of the Databricks Workspace in the Azure management plane.

databricks_virtual_network_peering_address_space_prefixes

Description: A list of address blocks reserved for this virtual network in CIDR notation.

databricks_virtual_network_peering_id

Description: The IDs of the internal Virtual Networks used by the Databricks Workspace.

databricks_virtual_network_peering_virtual_network_id

Description: The ID of the internal Virtual Network used by the Databricks Workspace.

databricks_workspace_disk_encryption_set_id

Description: The ID of Managed Disk Encryption Set created by the Databricks Workspace.

databricks_workspace_id

Description: The unique identifier of the Databricks workspace in the Databricks control plane.

databricks_workspace_managed_disk_identity

Description: A managed_disk_identity block with the following attributes:

principal_id - The principal UUID for the internal databricks disks identity needed to provide access to the workspace for enabling Customer Managed Keys.
tenant_id - The UUID of the tenant where the internal databricks disks identity was created.
type - The type of the internal databricks disks identity.

databricks_workspace_managed_resource_group_id

Description: The ID of the Managed Resource Group created by the Databricks Workspace.

databricks_workspace_storage_account_identity

Description: A storage_account_identity block with the following attributes:

principal_id - The principal UUID for the internal databricks storage account needed to provide access to the workspace for enabling Customer Managed Keys.
tenant_id - The UUID of the tenant where the internal databricks storage account was created.
type - The type of the internal databricks storage account.

databricks_workspace_url

Description: The workspace URL which is of the format 'adb-{workspaceId}.{random}.azuredatabricks.net'.

private_endpoints

Description: A map of private endpoints. The map key is the supplied input to var.private_endpoints. The map value is the entire azurerm_private_endpoint resource.

resource

Description: This is the full output for the resource.
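
As a sketch, the outputs above can be re-exported or passed to other resources from the calling configuration. The module source address and input values are assumptions. The workspace URL format shown above has no scheme, so the example prefixes https:// on the assumption that a full URL is wanted.

```hcl
module "databricks_workspace" {
  source = "Azure/avm-res-databricks-workspace/azurerm" # assumed registry address

  name                = "dbw-example" # hypothetical
  resource_group_name = "rg-example"  # hypothetical
  sku                 = "premium"
}

# Azure resource ID of the workspace (management plane).
output "workspace_resource_id" {
  value = module.databricks_workspace.databricks_id
}

# Workspace identifier in the Databricks control plane.
output "workspace_id" {
  value = module.databricks_workspace.databricks_workspace_id
}

# Browser-ready URL built from the hostname-style output.
output "workspace_url" {
  value = "https://${module.databricks_workspace.databricks_workspace_url}"
}
```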

Modules

No modules.

Data Collection

The software may collect information about you and your use of the software and send it to Microsoft. Microsoft may use this information to provide services and improve our products and services. You may turn off the telemetry as described in the repository. There are also some features in the software that may enable you and Microsoft to collect data from users of your applications. If you use these features, you must comply with applicable law, including providing appropriate notices to users of your applications together with a copy of Microsoft’s privacy statement. Our privacy statement is located at https://go.microsoft.com/fwlink/?LinkID=824704. You can learn more about data collection and use in the help documentation and our privacy statement. Your use of the software operates as your consent to these practices.
