Azure Databricks is a managed Spark offering on Azure that customers already use for advanced analytics. It provides a collaborative, notebook-based environment with CPU- or GPU-based compute clusters.
In this section, you will find sample notebooks that show how to use the Azure Machine Learning SDK with Azure Databricks. You can train a model using Spark MLlib and then deploy the model to ACI/AKS from within Azure Databricks. You can also use the Automated ML capability (public preview) of the Azure ML SDK with Azure Databricks.
- Customers who use Azure Databricks for advanced analytics can now use the same cluster to run experiments with or without automated machine learning.
- You can keep the data within the same cluster.
- You can leverage the local worker nodes with autoscale and auto termination capabilities.
- You can use multiple cores of your Azure Databricks cluster to perform simultaneous training.
- You can further tune the model generated by automated machine learning if you choose to.
- Every run (including the best run) is available as a pipeline, which you can tune further if needed.
- A model trained on Azure Databricks can be registered in an Azure ML workspace and then deployed to Azure managed compute (ACI or AKS) using the Azure Machine Learning SDK.
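
The last point in the list above, registering a model and deploying it to ACI, could look roughly like the sketch below. It assumes a v1 release of the Azure ML Python SDK that provides `Model.deploy`; the helper names (`aci_settings`, `register_and_deploy`), the container sizes, and the description string are illustrative choices, not taken from the sample notebooks. The caller supplies the workspace object and an `InferenceConfig` for their own scoring script.

```python
def aci_settings(cpu_cores=1, memory_gb=1,
                 description="Model trained on Azure Databricks"):
    """Keyword arguments for AciWebservice.deploy_configuration.

    The default sizes are placeholders; pick values that fit your model.
    """
    if cpu_cores <= 0 or memory_gb <= 0:
        raise ValueError("cpu_cores and memory_gb must be positive")
    return {
        "cpu_cores": cpu_cores,
        "memory_gb": memory_gb,
        "description": description,
    }


def register_and_deploy(ws, model_path, model_name, service_name,
                        inference_config):
    """Register a saved model in workspace `ws` and deploy it to ACI.

    `inference_config` is an azureml.core.model.InferenceConfig that points
    at your own scoring script and environment; it is not defined here.
    """
    # Imported inside the function so aci_settings() works without the SDK.
    from azureml.core.model import Model
    from azureml.core.webservice import AciWebservice

    model = Model.register(workspace=ws,
                           model_path=model_path,
                           model_name=model_name)
    aci_config = AciWebservice.deploy_configuration(**aci_settings())
    service = Model.deploy(ws, service_name, [model],
                           inference_config, aci_config)
    service.wait_for_deployment(show_output=True)
    return service
```

Keeping the deployment settings in a small function makes it easy to swap in an AKS configuration later without touching the registration step.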
Please follow our Azure doc to install the SDK in your Azure Databricks cluster before trying any of the sample notebooks.
Single file - The following archive contains all the sample notebooks. After importing the DBC archive into your Databricks workspace, you can run the notebooks without downloading them individually.
Notebooks 1-4 must be run sequentially. They cover an income prediction experiment based on this dataset and demonstrate how to prepare data, then train and operationalize a Spark ML model with the Azure ML Python SDK from within Azure Databricks.
Notebook 6 is an Automated ML sample notebook for Classification.
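
As a rough idea of what an Automated ML classification run on a Databricks cluster involves, here is a minimal sketch. It assumes a v1 SDK release whose `AutoMLConfig` accepts `training_data`, `label_column_name`, and `spark_context`; the helper names, the metric, the iteration count, and the cross-validation setting are illustrative and not taken from the sample notebook. On Databricks, `max_concurrent_iterations` is chosen based on the number of worker nodes in the cluster, and `spark_context` is the notebook's default `sc`.

```python
def automl_settings(max_concurrent_iterations, iterations=10,
                    primary_metric="AUC_weighted"):
    """Keyword arguments for AutoMLConfig (values are illustrative)."""
    if max_concurrent_iterations < 1:
        raise ValueError("max_concurrent_iterations must be at least 1")
    return {
        "task": "classification",
        "primary_metric": primary_metric,
        "iterations": iterations,
        "max_concurrent_iterations": max_concurrent_iterations,
        "n_cross_validations": 5,
    }


def submit_automl(ws, experiment_name, training_data, label_column_name,
                  spark_context, max_concurrent_iterations):
    """Submit an Automated ML run from a Databricks notebook.

    `ws` is an azureml.core.Workspace and `spark_context` is the notebook's
    `sc`; both are supplied by the caller.
    """
    # Imported lazily so automl_settings() can be used without the SDK.
    from azureml.core.experiment import Experiment
    from azureml.train.automl import AutoMLConfig

    config = AutoMLConfig(
        training_data=training_data,
        label_column_name=label_column_name,
        spark_context=spark_context,
        **automl_settings(max_concurrent_iterations),
    )
    run = Experiment(ws, experiment_name).submit(config, show_output=True)
    # get_output() returns the best child run and its fitted pipeline.
    best_run, fitted_model = run.get_output()
    return best_run, fitted_model
```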
Learn more about how to use Azure Databricks as a development environment for Azure Machine Learning service.
Databricks as a compute target from AML Pipelines - You can use Azure Databricks as a compute target from Azure Machine Learning Pipelines. Take a look at this notebook for details: aml-pipelines-use-databricks-as-compute-target.ipynb.
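
For orientation, attaching a Databricks workspace as a compute target and running a notebook as a pipeline step might look like the sketch below. It assumes the v1 SDK classes `DatabricksCompute` and `DatabricksStep`; the helper names, step name, and run name are hypothetical, and the resource group, workspace name, access token, cluster ID, and notebook path are all values you supply. The referenced notebook remains the authoritative walkthrough.

```python
def notebook_step_settings(notebook_path, cluster_id, notebook_params=None):
    """Keyword arguments for DatabricksStep (names are illustrative)."""
    settings = {
        "name": "databricks_notebook_step",
        "existing_cluster_id": cluster_id,
        "notebook_path": notebook_path,
        "run_name": "databricks_notebook_run",
        "allow_reuse": True,
    }
    if notebook_params is not None:
        settings["notebook_params"] = dict(notebook_params)
    return settings


def get_or_attach_databricks(ws, compute_name, resource_group,
                             databricks_workspace_name, access_token):
    """Return the named Databricks compute target, attaching it if needed."""
    # Imported lazily so notebook_step_settings() works without the SDK.
    from azureml.core.compute import ComputeTarget, DatabricksCompute
    from azureml.exceptions import ComputeTargetException

    try:
        return DatabricksCompute(workspace=ws, name=compute_name)
    except ComputeTargetException:
        attach_config = DatabricksCompute.attach_configuration(
            resource_group=resource_group,
            workspace_name=databricks_workspace_name,
            access_token=access_token,
        )
        target = ComputeTarget.attach(ws, compute_name, attach_config)
        target.wait_for_completion(show_output=True)
        return target


def build_pipeline(ws, compute_target, notebook_path, cluster_id):
    """Build a one-step pipeline that runs a notebook on an existing cluster."""
    from azureml.pipeline.core import Pipeline
    from azureml.pipeline.steps import DatabricksStep

    step = DatabricksStep(compute_target=compute_target,
                          **notebook_step_settings(notebook_path, cluster_id))
    return Pipeline(workspace=ws, steps=[step])
```

The resulting pipeline can then be submitted through an `Experiment` in the same workspace.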
For more on SDK concepts, please refer to the notebooks.
Please let us know your feedback.