
<div style="text-align: center; line-height: 0; padding-top: 9px;">
  <img
    src="https://databricks.com/wp-content/uploads/2018/03/db-academy-rgb-1200px.png"
    alt="Databricks Learning"
  >
</div>



# LAB - AutoML

Welcome to the AutoML Lab! In this lab, you will explore the capabilities of AutoML using the Databricks AutoML UI and the AutoML API.


**Lab Outline:**

In this lab, you will complete the following tasks:

* **Task 1:** Load the dataset.

* **Task 2:** Create a classification experiment using the AutoML UI.

* **Task 3:** Create a classification experiment with the AutoML API using a feature table.

* **Task 4:** Retrieve the best run and show the model URI.

* **Task 5:** Import the notebook for a run.



## REQUIRED - SELECT CLASSIC COMPUTE
Before executing cells in this notebook, please select your classic compute cluster in the lab. Be aware that **Serverless** is enabled by default.
Follow these steps to select the classic compute cluster:
1. Navigate to the top-right of this notebook and click the drop-down menu to select your cluster. By default, the notebook will use **Serverless**.
1. If your cluster is available, select it and continue to the next cell. If the cluster is not shown:
   - In the drop-down, select **More**.
   - In the **Attach to an existing compute resource** pop-up, select the first drop-down. You will see a unique cluster name in that drop-down. Please select that cluster.

**NOTE:** If your cluster has terminated, you might need to restart it in order to select it. To do this:
1. Right-click on **Compute** in the left navigation pane and select *Open in new tab*.
1. Find the triangle icon to the right of your compute cluster name and click it.
1. Wait a few minutes for the cluster to start.
1. Once the cluster is running, complete the steps above to select your cluster.

## Requirements

Please review the following requirements before starting the lab:

* To run this notebook, use the following Databricks runtime: **17.3.x-cpu-ml-scala2.13**
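
As an optional sanity check before starting, the sketch below reports the runtime version and whether the AutoML module can be found. It assumes the cluster exposes its version through the `DATABRICKS_RUNTIME_VERSION` environment variable; `parse_runtime_version` and `automl_available` are hypothetical helper names, not part of the classroom setup.

```python
import importlib.util
import os
import re


def parse_runtime_version(raw):
    """Return (major, minor) from a string like '17.3', or None if it does not match."""
    match = re.match(r"^(\d+)\.(\d+)", raw or "")
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def automl_available():
    """Report whether the databricks.automl module can be found in this environment."""
    try:
        return importlib.util.find_spec("databricks.automl") is not None
    except ImportError:
        # Raised when the parent 'databricks' package itself is missing.
        return False


# Assumption: the environment variable is set on the attached cluster.
runtime = parse_runtime_version(os.environ.get("DATABRICKS_RUNTIME_VERSION"))
print(f"Runtime version: {runtime}, AutoML importable: {automl_available()}")
```

If the version prints as `None` or AutoML is not importable, the notebook is probably not attached to the ML runtime listed above.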


## Classroom Setup

Before starting the lab, run the provided classroom setup script. This script will define configuration variables necessary for the lab. Execute the following cell:

In [0]:
%run ../Includes/Classroom-Setup-3.2

**Other Conventions:**

Throughout this lab, we'll refer to the object `DA`. This object, provided by Databricks Academy, contains variables such as your username, catalog name, schema name, working directory, and dataset locations. Run the code block below to view these details:

In [0]:
print(f"Username:          {DA.username}")
print(f"Catalog Name:      {DA.catalog_name}")
print(f"Schema Name:       {DA.schema_name}")
print(f"Working Directory: {DA.paths.working_dir}")
print(f"Dataset Location:  {DA.paths.datasets}")

## Task 1: Load the Dataset

Load the dataset that will be used for the AutoML experiment.

* Load and display the dataset from the table **`bank_loan`**.

* Load and display the feature table **`bank_loan_features`**.

In [0]:
loan_data = <FILL_IN>
display<FILL_IN>

In [0]:
%skip
loan_data = spark.sql("SELECT * FROM bank_loan")
display(loan_data)

In [0]:
loan_features_data = <FILL_IN>
display <FILL_IN>

In [0]:
%skip
loan_features_data = spark.sql("SELECT * FROM bank_loan_features")
display(loan_features_data)
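
The queries above refer to the tables by bare name, which presumably works because the classroom setup script sets the current catalog and schema. A minimal sketch of building a fully qualified three-level name instead, in the same spirit as `features_table_path` in Task 3. The helper `qualified_table_name` is hypothetical, and the catalog and schema values below are placeholders.

```python
def qualified_table_name(catalog: str, schema: str, table: str) -> str:
    """Join catalog, schema, and table into a backtick-quoted three-level name."""
    parts = [catalog, schema, table]
    for part in parts:
        if not part:
            raise ValueError("catalog, schema, and table must all be non-empty")
    # Escape embedded backticks by doubling them, then quote each part.
    return ".".join("`" + p.replace("`", "``") + "`" for p in parts)


# Placeholder values; in the lab these would come from DA.catalog_name and DA.schema_name.
example_name = qualified_table_name("my_catalog", "my_schema", "bank_loan")
print(example_name)

# In a Databricks notebook (where `spark` and `DA` are defined), this could be used as:
# loan_data = spark.table(qualified_table_name(DA.catalog_name, DA.schema_name, "bank_loan"))
```

Quoting each part keeps the name valid even when a catalog or schema name contains characters such as hyphens.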

## Task 2: Create Classification Experiment Using AutoML UI

Follow these steps to create an AutoML experiment using the UI:

  ***Step 1.*** Navigate to the **Experiments** section.

  ***Step 2.*** Click on **Classification**.

  ***Step 3.*** Choose a cluster for experiment execution.

  ***Step 4.*** Select the input training dataset as **`catalog > database > bank_loan`**.

  ***Step 5.*** Specify **`Personal_Loan`** as the prediction target.

  ***Step 6.*** Deselect the **`ID`** and **`ZIP_Code`** fields, as they are not needed as features.

  ***Step 7.*** Enter a name for your experiment, like `Bank_Loan_Prediction_AutoML_Experiment`.

  ***Step 8.*** In the **Advanced Configuration** section, set the **Timeout** to **5 minutes**.

  ***Step 9.*** Click on **Start AutoML**.

## Task 3: Create a Classification Experiment with the AutoML API

Utilize the AutoML API to set up and run a classification experiment. Follow these steps:

1. **Setting up the Experiment:**

   - **Specify the Dataset:** Specify the dataset using the Spark table name, which is **`bank_loan`**.

   - **Set Target Column:** Set `target_col` to the column you want to predict, which is **`Personal_Loan`**.

   - **Adjust Exclude Columns:** Provide a list of columns to exclude from the modeling process after reviewing the displayed dataset.

   - **Use a Feature Table:** Include the feature table **`bank_loan_features`** as part of training.

   - **Set Timeout Duration:** Set `timeout_minutes` for the AutoML experiment, for example `5` minutes.

2. **Running AutoML:**
   - Use the AutoML API to explore various machine learning models.



In [0]:
from databricks import automl
from datetime import datetime

features_table_path = f"{DA.catalog_name}.{DA.schema_name}.bank_loan_features"

# Define the feature store lookups
feature_store_lookups = <FILL_IN>

summary = automl.classify(
    dataset = <FILL_IN>,
    target_col = <FILL_IN>,
    exclude_cols =<FILL_IN>, 
    timeout_minutes = <FILL_IN>,
    feature_store_lookups = <FILL_IN>
    )

In [0]:
%skip
from databricks import automl
from datetime import datetime

features_table_path = f"{DA.catalog_name}.{DA.schema_name}.bank_loan_features"

# Define the feature store lookups
feature_store_lookups = [
    {
        "table_name": features_table_path,
        "lookup_key": ["ID"]
    }
]

summary = automl.classify(
    dataset = loan_data,
    target_col = "Personal_Loan",
    exclude_cols = ["ID", "ZIP_Code"],  # Exclude columns as needed
    timeout_minutes = 5,
    feature_store_lookups = feature_store_lookups
)
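
Because an AutoML run takes several minutes, it can help to check the lookup definitions before calling `automl.classify`. The sketch below is one possible pre-flight check, not part of the AutoML API: `validate_feature_lookups` is a hypothetical helper, and the sample column list is made up. It assumes each lookup is a dictionary with `table_name` and `lookup_key`, as in the cell above, and that `lookup_key` may be a single name or a list of names.

```python
def validate_feature_lookups(lookups, dataset_columns):
    """Return a list of problems found in the lookup definitions (empty if none)."""
    problems = []
    columns = set(dataset_columns)
    for i, lookup in enumerate(lookups):
        if not lookup.get("table_name"):
            problems.append(f"lookup {i}: missing 'table_name'")
        keys = lookup.get("lookup_key")
        if not keys:
            problems.append(f"lookup {i}: missing 'lookup_key'")
            continue
        # Accept either a single column name or a list of names.
        if isinstance(keys, str):
            keys = [keys]
        for key in keys:
            if key not in columns:
                problems.append(f"lookup {i}: key '{key}' not in dataset columns")
    return problems


# Illustrative data only; in the lab the columns would come from loan_data.columns.
sample_columns = ["ID", "Age", "Income", "Personal_Loan"]
sample_lookups = [{"table_name": "catalog.schema.bank_loan_features", "lookup_key": ["ID"]}]
print(validate_feature_lookups(sample_lookups, sample_columns))
```

An empty list means every lookup names a table and joins on columns that exist in the training dataset.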

## Task 4: Retrieve the best run and show the model URI

Identify the best model generated by AutoML according to a chosen metric. Retrieve information about the best run, including the model URI, so you can explore and analyze the model further.

* Find the experiment ID associated with your AutoML experiment.
* Define a filter string to select runs by status, such as `FINISHED` or `RUNNING`.
* Specify the run view type to return only active (non-deleted) runs or all runs.
* Provide the metric to order by, and specify whether to sort the runs in descending or ascending order.

In [0]:
import mlflow
from mlflow.entities import ViewType

## Find the best run ...
automl_runs_pd = mlflow.search_runs(
  experiment_ids=<FILL_IN>,
  filter_string=f<FILL_IN>,
  run_view_type=<FILL_IN>,
  order_by=<FILL_IN>
)

In [0]:
%skip
import mlflow
from mlflow.entities import ViewType

## Find the best run ...
automl_runs_pd = mlflow.search_runs(
  experiment_ids=[summary.experiment.experiment_id],
  filter_string="attributes.status = 'FINISHED'",
  run_view_type=ViewType.ACTIVE_ONLY,
  order_by=["metrics.val_f1_score DESC"]
)
)

In [0]:
## Print information about the best trial
print(<FILL_IN>)

In [0]:
%skip
## Print information about the best trial
print(summary.best_trial)
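
The DataFrame returned by `mlflow.search_runs` can also be used to pick the best run and build a model URI by hand. The following is a minimal sketch under stated assumptions: `pick_best_run` and `runs_model_uri` are hypothetical helpers, the sample records are invented for illustration, and `model` as the artifact path is an assumption about where the run logged its model.

```python
import math


def pick_best_run(run_records, metric_key, higher_is_better=True):
    """Return the record with the best metric value, skipping missing or NaN values."""
    scored = []
    for record in run_records:
        value = record.get(metric_key)
        if value is None or (isinstance(value, float) and math.isnan(value)):
            continue
        scored.append(record)
    if not scored:
        raise ValueError(f"no run has a value for {metric_key!r}")
    chooser = max if higher_is_better else min
    return chooser(scored, key=lambda r: r[metric_key])


def runs_model_uri(run_id, artifact_path="model"):
    """Build an MLflow 'runs:/' URI; the default artifact path is an assumption."""
    return f"runs:/{run_id}/{artifact_path}"


# Invented sample records; in the lab they could come from
# automl_runs_pd.to_dict("records").
sample_runs = [
    {"run_id": "run_a", "metrics.val_f1_score": 0.81},
    {"run_id": "run_b", "metrics.val_f1_score": float("nan")},
    {"run_id": "run_c", "metrics.val_f1_score": 0.93},
]
best = pick_best_run(sample_runs, "metrics.val_f1_score")
print(best["run_id"], runs_model_uri(best["run_id"]))
```

Skipping NaN values matters because runs that never logged the chosen metric would otherwise make the comparison unreliable.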


## Task 5: Import Notebook for a Run

AutoML automatically generates the best run's notebook and makes it available for you. To access the notebooks for other runs, you need to import them.

In this task, you will import the **5th run's notebook** to the **`destination_path`**. 

Show the `url` and `path` of the imported notebook.

In [0]:
destination_path = f"/Users/{DA.username}/imported_notebooks/lab.3-{datetime.now().strftime('%Y%m%d%H%M%S')}"

## Get the path and url for the generated notebook
result = <FILL_IN>
print(result.path)
print(result.url)

In [0]:
%skip
destination_path = f"/Users/{DA.username}/imported_notebooks/lab.3-{datetime.now().strftime('%Y%m%d%H%M%S')}"

## Get the path and url for the generated notebook
result = automl.import_notebook(summary.trials[4].artifact_uri, destination_path)  # index 4 is the 5th trial
print(result.path)
print(result.url)
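
Python lists are zero-indexed, so the 5th trial sits at index 4, and a short experiment may produce fewer trials than expected. The sketch below shows one way to guard against that and to build the timestamped destination path. Both `nth_trial` and `timestamped_destination` are hypothetical helpers, and the sample trial names are placeholders.

```python
from datetime import datetime


def nth_trial(trials, n):
    """Return the n-th trial (1-based), with a clear error if too few trials exist."""
    if n < 1:
        raise ValueError("n must be 1 or greater")
    if n > len(trials):
        raise IndexError(f"requested trial {n}, but only {len(trials)} trials are available")
    return trials[n - 1]


def timestamped_destination(username, prefix="lab.3", now=None):
    """Build a workspace path in the same form as the lab's destination_path."""
    now = now or datetime.now()
    return f"/Users/{username}/imported_notebooks/{prefix}-{now.strftime('%Y%m%d%H%M%S')}"


# Placeholder trials; in the lab this list would be summary.trials.
sample_trials = ["trial_1", "trial_2", "trial_3", "trial_4", "trial_5"]
print(nth_trial(sample_trials, 5))
print(timestamped_destination("someone@example.com"))
```

In the notebook, `nth_trial(summary.trials, 5).artifact_uri` could then be passed to `automl.import_notebook` together with the destination path.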


## Conclusion

In this lab, you got hands-on experience with Databricks AutoML. You loaded a dataset and created classification experiments using the AutoML UI and the AutoML API. You then searched the experiment's runs with specific filters to find the best run, retrieved the best model along with its model URI, and imported the notebook for a run.

&copy; 2026 Databricks, Inc. All rights reserved. Apache, Apache Spark, Spark, the Spark Logo, Apache Iceberg, Iceberg, and the Apache Iceberg logo are trademarks of the <a href="https://www.apache.org/" target="_blank">Apache Software Foundation</a>.<br/><br/><a href="https://databricks.com/privacy-policy" target="_blank">Privacy Policy</a> | <a href="https://databricks.com/terms-of-use" target="_blank">Terms of Use</a> | <a href="https://help.databricks.com/" target="_blank">Support</a>