This terraform module is designed to create Azure Databricks resources. Azure Databricks is a fully managed first-party service that enables an open data lakehouse in Azure.

clouddrove/terraform-azure-databricks

Terraform AZURE DATABRICKS

Terraform module to create Azure Databricks service resources on Azure.


We eat, drink, sleep and, most importantly, love DevOps. We are working towards strategies for standardizing architecture while ensuring security for the infrastructure. We are strong believers in the philosophy that bigger problems are always solved by breaking them into smaller, manageable problems. In line with microservices architecture, it is considered best practice to run databases, clusters, and storage as smaller, connected yet manageable pieces within the infrastructure.

This module is a combination of open-source Terraform code and includes automated tests and examples. It also helps you create and improve your infrastructure with minimal code instead of maintaining the whole infrastructure code yourself.

We have more than fifty Terraform modules. A few of them are completed and available for open-source use, while others are in progress.

Prerequisites

This module has a few dependencies:

Examples

IMPORTANT: Since the master branch used in source varies based on new modifications, we suggest that you use the release versions here.
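As a minimal sketch of pinning to a release rather than the master branch, a module block can reference a Git tag through Terraform's `?ref=` source syntax. The tag name `1.0.0` is an assumption taken from the version used in this README's example; check the repository's releases for the tag you actually want. The `version` argument applies only to registry sources, so it is omitted here.

```hcl
module "databricks" {
  # Assumption: a tag named "1.0.0" exists in the repository.
  # Replace it with the release tag you intend to use.
  source = "git::https://github.com/clouddrove/terraform-azure-databricks.git?ref=1.0.0"

  # ... remaining inputs as listed in the Inputs section ...
}
```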

Here are some examples of how you can use this module in your inventory structure:

Azure Databricks

  # Basic
  module "databricks" {
    source                                               = "terraform/databricks/azure"
    version                                              = "1.0.0"
    name                                                 = "app"
    environment                                          = "test"
    label_order                                          = ["name", "environment"]
    enable                                               = true
    resource_group_name                                  = module.resource_group.resource_group_name
    location                                             = module.resource_group.resource_group_location
    sku                                                  = "standard"
    network_security_group_rules_required                = "NoAzureDatabricksRules"
    public_network_access_enabled                        = false
    managed_resource_group_name                          = "databricks-resource-group"
    virtual_network_id                                   = module.vnet.vnet_id[0]
    public_subnet_name                                   = module.subnet_pub.default_subnet_name[0]
    private_subnet_name                                  = module.subnet_pvt.default_subnet_name[0]
    public_subnet_network_security_group_association_id  = module.network_security_group_public.id
    private_subnet_network_security_group_association_id = module.network_security_group_private.id
    no_public_ip                                         = true
    storage_account_name                                 = "databrickstestingcd"

    cluster_enable          = true
    autotermination_minutes = 20
    # spark_version = "11.3.x-scala2.12" # Set a Spark version manually; otherwise the latest version is chosen
    # num_workers   = 0                  # Required when enable_autoscale is false

    enable_autoscale = true
    min_workers      = 1
    max_workers      = 2

    cluster_profile = "multiNode"
  }
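The comments in the example mention a fixed-size cluster and a manually chosen Spark version. The fragment below is a sketch of how those cluster inputs might look with autoscaling turned off. The worker count of 2 is illustrative, and the workspace and network inputs are omitted for brevity. The Inputs table lists `min_workers` and `max_workers` as required, so they are still set; how the module treats them when autoscaling is off is not stated in this README.

```hcl
module "databricks" {
  source  = "terraform/databricks/azure"
  version = "1.0.0"

  # ... workspace and network inputs as in the Basic example ...

  cluster_enable          = true
  autotermination_minutes = 20
  spark_version           = "11.3.x-scala2.12" # pinned manually instead of using the latest version

  # Fixed-size cluster: autoscaling off, so num_workers is required.
  enable_autoscale = false
  num_workers      = 2 # illustrative value

  # Listed as required inputs, so they are set even with autoscaling off.
  min_workers = 1
  max_workers = 2

  cluster_profile = "multiNode"
}
```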

Inputs

| Name | Description | Type | Default | Required |
|------|-------------|------|---------|----------|
| autotermination_minutes | Minutes after which the cluster is auto-terminated if it is unhealthy. | number | n/a | yes |
| cluster_enable | Set to false to prevent the module from creating the Databricks cluster resources. | bool | false | no |
| cluster_profile | The profile of the cluster. Possible values are 'singleNode' and 'multiNode'. | string | "" | no |
| enable | Set to false to prevent the module from creating any resources. | bool | false | no |
| enable_autoscale | Set to false to disable the autoscale feature of the cluster. | bool | false | no |
| environment | Environment (e.g. prod, dev, staging). | string | "" | no |
| label_order | Label order, e.g. the sequence of application name and environment: name, environment, attribute [webserver, qa, devops, public]. | list(any) | [] | no |
| location | The location/region where the virtual network is created. Changing this forces a new resource to be created. | string | "" | no |
| managed_resource_group_name | Name to give the managed resource group that is created. | string | "" | no |
| managedby | Managed by, e.g. Clouddrove, Anmol Nagpal. | string | "" | no |
| max_workers | Maximum number of workers to create with the Databricks cluster. | number | n/a | yes |
| min_workers | Minimum number of workers to create with the Databricks cluster. | number | n/a | yes |
| name | Name (e.g. app or cluster). | string | "" | no |
| network_security_group_rules_required | Whether data plane (clusters) to control plane communication happens over a private link endpoint only or publicly. Possible values are AllRules, NoAzureDatabricksRules or NoAzureServiceRules. Required when public_network_access_enabled is set to false. | string | "" | no |
| no_public_ip | Set to true to disable the public IP. | string | "" | no |
| num_workers | Number of workers to create with the Databricks cluster. | number | 0 | no |
| private_subnet_name | Name of the private subnet to attach to Databricks. | string | "" | no |
| private_subnet_network_security_group_association_id | Network security group association ID of the private subnet of the virtual network to attach to Databricks. | string | "" | no |
| public_network_access_enabled | Set to false to disable public network access to Databricks. | bool | n/a | yes |
| public_subnet_name | Name of the public subnet to attach to Databricks. | string | "" | no |
| public_subnet_network_security_group_association_id | Network security group association ID of the public subnet of the virtual network to attach to Databricks. | string | "" | no |
| repository | Terraform current module repo. | string | "" | no |
| resource_group_name | The name of the resource group in which to create the virtual network. Changing this forces a new resource to be created. | string | "" | no |
| sku | The SKU to use for the Databricks workspace. Possible values are standard, premium, or trial. | string | "" | no |
| spark_version | The Spark version to use when creating the Databricks cluster. | string | null | no |
| storage_account_name | Name of the storage account to attach to Databricks. | string | "" | no |
| virtual_network_id | ID of the virtual network to attach to Databricks. | string | "" | no |
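Several inputs accept only a fixed set of strings. One way to catch a typo before it reaches the module is to validate the values in the calling configuration. The sketch below assumes two root-level variables, `databricks_sku` and `databricks_nsg_rules`; both names are hypothetical and not part of this module. The allowed values are taken from the descriptions above.

```hcl
# Hypothetical root-module variables; the names are illustrative only.
variable "databricks_sku" {
  type        = string
  description = "SKU passed to the databricks module."
  default     = "standard"

  validation {
    condition     = contains(["standard", "premium", "trial"], var.databricks_sku)
    error_message = "The databricks_sku value must be one of: standard, premium, trial."
  }
}

variable "databricks_nsg_rules" {
  type        = string
  description = "Value passed to network_security_group_rules_required."
  default     = "NoAzureDatabricksRules"

  validation {
    condition = contains(
      ["AllRules", "NoAzureDatabricksRules", "NoAzureServiceRules"],
      var.databricks_nsg_rules
    )
    error_message = "The databricks_nsg_rules value must be AllRules, NoAzureDatabricksRules or NoAzureServiceRules."
  }
}
```

The module call would then use `sku = var.databricks_sku` and `network_security_group_rules_required = var.databricks_nsg_rules`, so an invalid value fails at plan time with a readable message.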

Outputs

| Name | Description |
|------|-------------|
| id | Specifies the resource id of the Workspace. |
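A short sketch of consuming this output from the calling configuration, assuming the module is instantiated as `module "databricks"` as in the example. The output name `databricks_workspace_id` is a hypothetical choice.

```hcl
# Re-export the workspace ID from the root module.
# "databricks_workspace_id" is an illustrative name.
output "databricks_workspace_id" {
  description = "Resource ID of the Databricks workspace created by the module."
  value       = module.databricks.id
}
```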

Testing

Testing for this module is performed with terratest. It creates a small piece of infrastructure, checks outputs such as the resource ID and tag names, and then destroys the infrastructure in your Azure account. The tests are written in Go, so you need a Go environment on your system.

You need to run the following command in the testing folder:

  go test -run Test

Feedback

If you come across a bug or have any feedback, please log it in our issue tracker, or feel free to drop us an email at hello@clouddrove.com.

If you have found it worth your time, go ahead and give us a ★ on our GitHub!

About us

At CloudDrove, we offer expert guidance, implementation support and services to help organisations accelerate their journey to the cloud. Our services include docker and container orchestration, cloud migration and adoption, infrastructure automation, application modernisation and remediation, and performance engineering.

We are The Cloud Experts!


We ❤️ Open Source and you can check out our other modules to get help with your new Cloud ideas.