
intel/terraform-intel-databricks-cluster


Intel Optimized Cloud Modules for Terraform


Intel Optimized Databricks Cluster

This module deploys an Intel Optimized Databricks cluster. Instance selection and Intel optimizations are defaulted in the code.

Learn more about Intel optimizations:

Performance Data


Usage

All of the examples in the examples folder show how to create an Intel Optimized Databricks cluster using this module along with the Intel Cloud Optimization Module for Databricks Workspace on AWS and Azure.
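As a minimal sketch only (the module label, registry source, and input values below are assumptions, not taken from the examples folder), a call to this module could look like:

module "optimized_databricks_cluster" {
  # Assumed registry source for this module; verify against the Terraform Registry page.
  source = "intel/databricks-cluster/databricks"

  # Required inputs (placeholder values).
  dbx_cloud = "aws"
  dbx_host  = "https://<your-workspace>.cloud.databricks.com"
}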


Run Terraform

terraform init  
terraform plan
terraform apply 

Considerations

More information regarding deploying a Databricks Workspace is available in the Databricks documentation.

Requirements

Name Version
aws ~> 5.31
azurerm ~> 3.48
databricks ~> 1.14.2
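These constraints would typically be pinned in the calling configuration's terraform block. The following is a sketch only, assuming the standard hashicorp/aws, hashicorp/azurerm, and databricks/databricks provider sources:

terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.31"
    }
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~> 3.48"
    }
    databricks = {
      source  = "databricks/databricks"
      version = "~> 1.14.2"
    }
  }
}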

Providers

Name Version
databricks ~> 1.14.2

Modules

No modules.

Resources

Name Type
databricks_cluster.dbx_cluster resource
databricks_token.pat resource
databricks_spark_version.latest_lts data source

Inputs

Name (type, default, required) and description

aws_dbx_node_type_id (string, default "i4i.2xlarge", optional)
The AWS compute machine type, which must be supported by Databricks.

azure_dbx_node_type_id (string, default "Standard_E8ds_v5", optional)
The Azure compute machine type, which must be supported by Databricks.

dbx_auto_terminate_min (number, default 120, optional)
Automatically terminate the cluster after it has been inactive for this many minutes. If specified, the threshold must be between 10 and 10000 minutes. Set this value to 0 to explicitly disable automatic termination. The underlying Databricks default is 60. We highly recommend setting this for interactive/BI clusters.

dbx_cloud (string, required)
Flag that decides which cloud to use for the instance type in the Databricks cluster.

dbx_cluster_name (string, default "dbx_optimized_cluster", optional)
Cluster name, which does not have to be unique. If not specified at creation, the cluster name will be an empty string.

dbx_host (string, required)
URL of the Databricks workspace.

dbx_num_workers (number, default 8, optional)
Number of worker nodes that this cluster should have. A cluster has one Spark driver and num_workers executors, for a total of num_workers + 1 Spark nodes.

dbx_runtime_engine (string, default "PHOTON", optional)
The type of runtime engine to use. If not specified, the runtime engine type is inferred from the spark_version value. Allowed values: PHOTON, STANDARD.

dbx_spark_config (map(string), optional)
Key-value pairs of Intel optimizations for the Spark configuration. Default:
{
  "spark.databricks.adaptive.autoOptimizeShuffle.enabled": "true",
  "spark.databricks.delta.preview.enabled": "true",
  "spark.databricks.io.cache.enabled": "true",
  "spark.databricks.io.cache.maxDiskUsage": "100g",
  "spark.databricks.io.cache.maxMetaDataCache": "10g",
  "spark.databricks.passthrough.enabled": "true"
}

enable_intel_tags (bool, default true, optional)
If true, adds additional Intel tags to resources.

intel_tags (map(string), optional)
Intel tags. Default:
{
  "intel-module": "terraform-intel-databricks-cluster",
  "intel-registry": "https://registry.terraform.io/namespaces/intel"
}

tags (map(string), optional)
Tags. Default:
{
  "key": "value"
}
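As an illustration only (the module label, source, and values below are assumptions), optional inputs can be overridden in the module block alongside the required ones:

module "optimized_databricks_cluster" {
  source = "intel/databricks-cluster/databricks"   # assumed registry source

  dbx_cloud = "azure"
  dbx_host  = "https://adb-1234567890123456.7.azuredatabricks.net"   # placeholder workspace URL

  # Optional overrides; any input left out keeps the default listed above.
  dbx_cluster_name       = "intel-optimized-analytics"
  dbx_num_workers        = 4
  dbx_auto_terminate_min = 60
  tags = {
    "owner" = "data-platform-team"   # placeholder tag
  }
}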

Outputs

Name Description
dbx_cluster_autoterminate_min Auto-termination time, in minutes, of the Databricks cluster
dbx_cluster_custom_tags Custom tags of the Databricks cluster
dbx_cluster_name Name of the Databricks cluster
dbx_cluster_node_type_id Instance type of the Databricks cluster
dbx_cluster_num_workers Number of worker nodes of the Databricks cluster
dbx_cluster_runtime_engine Runtime engine of the Databricks cluster
dbx_cluster_spark_conf Spark configuration of the Databricks cluster
dbx_cluster_spark_version Spark version of the Databricks cluster
dbx_pat Databricks personal access token
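A sketch of surfacing selected module outputs from the calling configuration, assuming the module block is labeled optimized_databricks_cluster as in the earlier examples:

output "dbx_cluster_name" {
  value = module.optimized_databricks_cluster.dbx_cluster_name
}

output "dbx_pat" {
  value     = module.optimized_databricks_cluster.dbx_pat
  sensitive = true   # avoid printing the personal access token in CLI output
}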