This module is demo code only; it is not suitable for production use and comes with no guarantees.
- Install the Azure CLI and log in:

  ```shell
  az login
  ```

- Install pip:

  ```shell
  sudo apt-get install python-pip
  ```

  Use pip to install the Databricks CLI. Note that this may require you to update your path following the pip install, so test the install with `databricks -h`.

- Install Terraform.
- Install zip (on Linux):

  ```shell
  sudo apt install zip
  ```

- Install version 0.2.4 of the Databricks Labs Terraform provider:

  ```shell
  curl https://raw.githubusercontent.com/databrickslabs/databricks-terraform/master/godownloader-databricks-provider.sh | bash -s -- -b $HOME/.terraform.d/plugins 0.2.4
  ```
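The installation steps above can be sanity-checked with a small shell helper. This is only a sketch; the `check_tools` function is a hypothetical convenience, not part of the module.

```shell
# check_tools: print any of the given commands that are NOT found on PATH.
# Empty output means every tool was found.
check_tools() {
  missing=""
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || missing="$missing $tool"
  done
  # Strip the single leading space before printing.
  echo "${missing# }"
}

# After completing the steps above, this should print nothing:
check_tools az databricks terraform zip
```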
The `examples` directory contains ready-to-run usage examples for the module.
Before running, in order for the Databricks provider to complete authentication, you must set several environment variables on your machine:

```shell
# ID of the Azure subscription in which the Databricks workspace is located
export ARM_SUBSCRIPTION_ID="your-sub-id"

# ID of the Azure tenant in which the Databricks workspace is located
export ARM_TENANT_ID="your-tenant-id"

# Service Principal or App Registration client ID for workspace auth
export ARM_CLIENT_ID="your-client-id"

# Service Principal or App Registration client secret for workspace auth
export ARM_CLIENT_SECRET="your-client-secret"
```
The module also provides the option to deploy a pre-created local Jupyter notebook to the workspace; this can contain any valid Jupyter notebook content. Setting the variable `notebook_path` triggers this resource deployment, but, as in the minimal example, it can be omitted entirely, deploying just the clusters to the workspace. For ease of getting started, an example notebook is provided at `./notebooks/notebook.ipynb`, and the complete example uploads this by default.
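For reference, a minimal usage sketch of the module is shown below. The module source path and the referenced workspace and APIM resources are assumptions; adapt them to your own configuration.

```terraform
# Hypothetical usage sketch -- the source path and the referenced
# azurerm resources are assumptions, not part of this module.
module "databricks_demo" {
  source = "path/to/this/module"

  databricks_workspace = azurerm_databricks_workspace.example
  apim                 = azurerm_api_management.example

  # Optional: deploy the bundled example notebook to the workspace.
  notebook_path = "./notebooks/notebook.ipynb"
  notebook_name = "mynotebook"
}
```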
If a notebook is deployed, three APIM APIs are automatically deployed with it, which together use the Databricks Jobs API to invoke the notebook and retrieve its results:

- The first API, `create_job_api`, sets up a notebook task job in your Databricks instance.
- The second API, `invoke_notebook_api`, runs this job given its ID (1 by default).
- The third API, `notebook_output_api`, retrieves any outputs from a notebook run, given the ID of that run.
All APIs can be used with optional custom parameters immediately after deployment, through the "Test" window of APIM in the Azure Portal.
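For illustration, the sketch below shows roughly what these APIs do behind the scenes, by calling the Databricks Jobs API (`run-now` and `runs/get-output` endpoints) directly. The host URL, token, and `run_now_payload` helper are assumptions for the example, not part of the module.

```shell
# Placeholder workspace URL and default job ID created by create_job_api.
DATABRICKS_HOST="https://adb-1234567890123456.7.azuredatabricks.net"
JOB_ID=1

# Build the JSON payload that a run-now request carries
# (what invoke_notebook_api sends for a given job ID).
run_now_payload() {
  printf '{"job_id": %s}' "$1"
}

# Trigger a run of the notebook job (equivalent to invoke_notebook_api):
#   curl -s -X POST "$DATABRICKS_HOST/api/2.0/jobs/run-now" \
#        -H "Authorization: Bearer $DATABRICKS_TOKEN" \
#        -d "$(run_now_payload "$JOB_ID")"

# Retrieve the output of a completed run (equivalent to notebook_output_api):
#   curl -s "$DATABRICKS_HOST/api/2.0/jobs/runs/get-output?run_id=<run-id>" \
#        -H "Authorization: Bearer $DATABRICKS_TOKEN"

run_now_payload "$JOB_ID"  # prints: {"job_id": 1}
```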
| Name | Version |
|---|---|
| databricks | n/a |
| null | n/a |
| time | n/a |
| Name | Description | Type | Default | Required |
|---|---|---|---|---|
| cluster_default_packages | List of URIs for any custom Python packages (.whl) to install on clusters by default. | list(string) | [] | no |
| databricks_workspace | Databricks workspace to deploy resources to. | any | n/a | yes |
| apim | API Management resource object to deploy notebook invocation API(s) to. | any | n/a | yes |
| prefix | A naming prefix to be used in the creation of unique names for deployed Databricks resources. | list(string) | [] | no |
| suffix | A naming suffix to be used in the creation of unique names for deployed Databricks resources. | list(string) | [] | no |
| whl_upload_script_path | Path to a bash script which downloads the .whl files in cluster_default_packages and uploads them to DBFS. | string | "" | no |
| notebook_path | Optional relative path to a local Jupyter notebook to deploy to the workspace. | string | "" | no |
| notebook_name | If deploying, the desired name of the deployed notebook as it will appear in the workspace. | string | "mynotebook" | no |
| Name | Description |
|---|---|
| high_concurrency_cluster | n/a |
| standard_cluster | n/a |
This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.
When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.
This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.