
<div style="text-align: center; line-height: 0; padding-top: 9px;">
  <img src="https://databricks.com/wp-content/uploads/2018/03/db-academy-rgb-1200px.png" alt="Databricks Learning">
</div>



# LAB - Hyperparameter Tuning with Optuna

Welcome to the Hyperparameter Tuning with Optuna lab! In this hands-on session, you'll gain practical insights into **optimizing machine learning models using Optuna**. Throughout the lab, we'll cover key steps, from loading the dataset and creating training/test sets to **defining a hyperparameter search space and running optimization trials with Spark**. The primary objective is to equip you with the skills to fine-tune models effectively using Spark, Optuna, and MLflow.

**Lab Outline:**
1. Load the dataset and create training/test sets for a scikit-learn model. 
1. Define the hyperparameter search space for optimization.
1. Define the optimization function to fine-tune the model.
1. Run hyperparameter tuning trials. 
1. Search for runs using the MLflow API and visualize all runs within the MLflow experiment.
1. Identify the best run based on the model's precision value programmatically and visually.
1. Register the model with Unity Catalog.

## REQUIRED - SELECT CLASSIC COMPUTE
Before executing cells in this notebook, please select your classic compute cluster in the lab. Be aware that **Serverless** is enabled by default.
Follow these steps to select the classic compute cluster:
1. Navigate to the top-right of this notebook and click the drop-down menu to select your cluster. By default, the notebook will use **Serverless**.
1. If your cluster is available, select it and continue to the next cell. If the cluster is not shown:
   - In the drop-down, select **More**.
   - In the **Attach to an existing compute resource** pop-up, select the first drop-down. You will see a unique cluster name in that drop-down. Please select that cluster.
  
**NOTE:** If your cluster has terminated, you might need to restart it in order to select it. To do this:
1. Right-click on **Compute** in the left navigation pane and select *Open in new tab*.
1. Find the triangle icon to the right of your compute cluster name and click it.
1. Wait a few minutes for the cluster to start.
1. Once the cluster is running, complete the steps above to select your cluster.

## Requirements

Please review the following requirements before starting the lesson:

* To run this notebook, you need to use one of the following Databricks runtime(s): **16.3.x-cpu-ml-scala2.12**


## Classroom Setup

Before starting the lab, run the provided classroom setup script. This script will define configuration variables necessary for the lab. Execute the following cell:

In [0]:
%pip install -U -qq optuna
%restart_python

Note: you may need to restart the kernel using %restart_python or dbutils.library.restartPython() to use updated packages.


In [0]:
%run ../Includes/Classroom-Setup-2.2

Collecting databricks-sdk==0.36.0
  Using cached databricks_sdk-0.36.0-py3-none-any.whl.metadata (38 kB)
Using cached databricks_sdk-0.36.0-py3-none-any.whl (569 kB)
Installing collected packages: databricks-sdk
  Attempting uninstall: databricks-sdk
    Found existing installation: databricks-sdk 0.30.0
    Not uninstalling databricks-sdk at /databricks/python3/lib/python3.12/site-packages, outside environment /local_disk0/.ephemeral_nfs/envs/pythonEnv-1118aec7-0c85-46c9-b754-ea465bc6df62
    Can't uninstall 'databricks-sdk'. No files were found to uninstall.
Successfully installed databricks-sdk-0.36.0
Note: you may need to restart the kernel using %restart_python or dbutils.library.restartPython() to use updated packages.


**Other Conventions:**

Throughout this lab, we'll refer to the object `DA`. This object, provided by Databricks Academy, contains variables such as your username, catalog name, schema name, working directory, and dataset locations. Run the code block below to view these details:

In [0]:
print(f"Username:          {DA.username}")
print(f"Catalog Name:      {DA.catalog_name}")
print(f"Schema Name:       {DA.schema_name}")
print(f"Working Directory: {DA.paths.working_dir}")
print(f"Dataset Location:  {DA.paths.datasets}")

Username:          labuser11091541_1754532261@vocareum.com
Catalog Name:      dbacademy
Schema Name:       labuser11091541_1754532261
Working Directory: /Volumes/dbacademy/ops/labuser11091541_1754532261@vocareum_com
Dataset Location:  NestedNamespace (california_housing='/Volumes/dbacademy_california_housing/v02', cdc_diabetes='/Volumes/dbacademy_cdc_diabetes/v01', telco='/Volumes/dbacademy_telco/v01', banking='/Volumes/dbacademy_banking/v01')


## Prepare Dataset

This lab uses a fictional dataset from a telecom company. It contains **customer demographics** (such as gender) and internet subscription details (such as subscription plans and payment methods).

We will create and tune a model that predicts customer churn, using the **`Churn`** field as the target.

A table with all features is already created for you.

**Table name: `customer_churn`**

In [0]:
import pandas as pd
from sklearn.model_selection import train_test_split
## Load the customer_churn table from Unity Catalog
table_name = f"{DA.catalog_name}.{DA.schema_name}.customer_churn"
## Read into a PySpark DataFrame, drop the ID column, and convert to a Pandas DataFrame
churn_dataset = spark.read.table(table_name)
customer_pd = churn_dataset.drop('CustomerID').toPandas()

## split dataset between features and targets. The target variable is Churn
target_col = "Churn"
X_all = customer_pd.drop(labels=target_col, axis=1)
y_all = customer_pd[target_col]

## test / train split using 95% train/5% test
X_train, X_test, y_train, y_test = train_test_split(X_all, y_all, train_size=0.95, random_state=42)
print(f"We have {X_train.shape[0]} records in our training dataset")
print(f"We have {X_test.shape[0]} records in our test dataset")

We have 6690 records in our training dataset
We have 353 records in our test dataset
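
The record counts above follow from how the split sizes are rounded. The sketch below reproduces the arithmetic with the standard library only. It assumes that, when only `train_size` is given as a fraction, the training count is rounded down and the test set receives every remaining row. The total row count is simply the sum of the two printed counts.

```python
import math

# Row counts printed by the split cell: 6690 training + 353 test records.
n_total = 6690 + 353
train_size = 0.95

# Assumption: with only train_size given, the training count is rounded down
# and the test set receives every remaining row.
n_train = math.floor(train_size * n_total)
n_test = n_total - n_train

print(f"total={n_total}, train={n_train}, test={n_test}")
print(f"actual test share: {n_test / n_total:.2%}")
```

Because of the rounding, the test share is slightly above the nominal 5%.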


## Step 1: Define the Search Space and Optimization Function

Define the parameter search space for Optuna.

Your objective function should meet the following requirements:

1. Define the search space using the hyperparameters `criterion`, `max_depth`, and `max_features`. Search `max_depth` over the integers 5 to 50 and `max_features` over the integers 5 to 10 (both ranges inclusive). For `criterion`, search over `gini`, `entropy`, and `log_loss`.
1. Start each trial as a nested MLflow run.
1. For each run, log the cross-validation results for `accuracy`, `precision`, `recall`, and `f1`.
1. Use **3-fold** cross-validation. Be sure to average the fold results using `.mean()`.
1. The objective will be to _maximize_ **`precision`**.
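
To make the cross-validation and averaging requirements concrete, the following sketch shows how a per-fold precision is computed and then averaged into the single value an objective function would return. The true-positive and false-positive counts are hypothetical and chosen only to keep the arithmetic easy to follow; in the lab itself, `cross_validate` produces the per-fold scores.

```python
from statistics import mean


def precision(true_positives: int, false_positives: int) -> float:
    """Precision = TP / (TP + FP); treated as 0.0 when nothing is predicted positive."""
    predicted_positive = true_positives + false_positives
    return true_positives / predicted_positive if predicted_positive else 0.0


# Hypothetical (TP, FP) counts for the three validation folds.
fold_counts = [(60, 40), (55, 45), (50, 50)]

fold_precisions = [precision(tp, fp) for tp, fp in fold_counts]
mean_precision = mean(fold_precisions)  # the single value to maximize

print(fold_precisions)
print(round(mean_precision, 4))
```

Averaging across folds gives one score per trial, which is what the optimizer compares between trials.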

In [0]:
import optuna
import mlflow
import mlflow.sklearn
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_validate
from mlflow.models.signature import infer_signature

## Define the objective function
def optuna_objective_function(trial):
    params = {
        'criterion': trial.suggest_categorical('criterion', ['gini', 'entropy', 'log_loss']),
        'max_depth': trial.suggest_int('max_depth', 5, 50),
        'max_features': trial.suggest_int('max_features', 5, 10)
    }
    
    with mlflow.start_run(nested=True, run_name=f"Optuna Trial {trial.number}"):
        
        ## Train model
        dtc = DecisionTreeClassifier(**params)
        dtc.fit(X_train, y_train)

        ## Perform cross-validation
        scoring_metrics = ['accuracy', 'precision', 'recall', 'f1']
        cv_results = cross_validate(dtc, X_train, y_train, cv=3, scoring=scoring_metrics)

        ## Create input signature using the first row of X_train
        input_example = X_train.iloc[[0]]
        signature = infer_signature(input_example, dtc.predict(input_example))

        ## Compute and log average scores
        cv_results_avg = {metric: cv_results[f'test_{metric}'].mean() for metric in scoring_metrics}
        mlflow.log_metrics(cv_results_avg)
        mlflow.log_params(params)
        mlflow.sklearn.log_model(dtc, "lab_optuna_decision_tree_model", signature = signature, input_example=input_example)

        ## Return precision to maximize it
        return cv_results_avg['precision']
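
For a sense of scale, the search space defined in `optuna_objective_function` can be enumerated directly. This is only an illustrative sketch using the standard library. It relies on `suggest_int` including both endpoints, and it does not reproduce how Optuna chooses parameters, since Optuna samples from the space rather than walking a grid.

```python
from itertools import product

# Search space as defined in the objective function; both integer ranges
# include their endpoints.
criteria = ["gini", "entropy", "log_loss"]
max_depths = range(5, 51)    # 5..50
max_features = range(5, 11)  # 5..10

grid = list(product(criteria, max_depths, max_features))
n_trials = 10

print(f"distinct configurations: {len(grid)}")
print(f"share covered by {n_trials} trials: at most {n_trials / len(grid):.1%}")
```

With only 10 trials, a small fraction of the possible configurations is evaluated, so results can vary noticeably from one study to the next.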

## Step 2: Create an Optuna Study and Log with MLflow

First, we will delete all previous runs to keep our workspace and experiment tidy. Second, you will create an Optuna study and run the experiment with MLflow.

In [0]:
## Set the MLflow experiment name and get the id
experiment_name = f"/Users/{DA.username}/Lab_Optuna_Experiment_{DA.schema_name}"
print(f"Experiment Name: {experiment_name}")
mlflow.set_experiment(experiment_name)
experiment_id = mlflow.get_experiment_by_name(experiment_name).experiment_id
print(f"Experiment ID: {experiment_id}")

print("Clearing out old runs (If you want to add more runs, change the n_trials parameter in the next cell) ...")
## Get all runs
runs = mlflow.search_runs(experiment_ids=[experiment_id], output_format="pandas")

if runs.empty:
    print("No runs found in the experiment.")
else:
    ## Iterate and delete each run
    for run_id in runs["run_id"]:
        mlflow.delete_run(run_id)
        print(f"Deleted run: {run_id}")

    print("All runs have been deleted.")

Experiment Name: /Users/labuser11091541_1754532261@vocareum.com/Lab_Optuna_Experiment_labuser11091541_1754532261
Experiment ID: 4297320214106148
Clearing out old runs (If you want to add more runs, change the n_trials parameter in the next cell) ...
Deleted run: 39ce6b43585d442b97046ce86fb0f2c5
Deleted run: e9fab673aca4451fa72cdb1edf62ff2c
Deleted run: dca00a43054a445e959378034aa1689b
Deleted run: dedecd1c856f4297b9f496d890cf162c
Deleted run: 4b3982f804ad4aca96995090d0196638
Deleted run: ca9b9cbd7eac49d583f91de6d0f224d3
Deleted run: 6991f0d0f6184e30a045c416b5d49583
Deleted run: 904b0a8b7a6a422b9aebe6decd0b35a0
Deleted run: 7a0a6b688e334139bee523df777c2b69
Deleted run: 780c0f0e84bd41d6bf7c445122447b35
Deleted run: b907949e1cf142aea12b7581b95f2651
All runs have been deleted.


### Create the Study and Log with MLflow

#### Instructions:

1. Create an Optuna study with name `lab_optuna_hpo`.
1. Maximize the objective function. 
1. Give the parent run the name `Lab_Optuna_Hyperparameter_Optimization`.
1. Only run 10 trials with Optuna.

In [0]:
study = optuna.create_study(
    study_name="lab_optuna_hpo",
    direction="maximize"
)

with mlflow.start_run(run_name='Lab_Optuna_Hyperparameter_Optimization') as parent_run:
    ## Run optimization
    study.optimize(
        optuna_objective_function, 
        n_trials=10,
        )

[I 2025-08-07 04:07:48,431] A new study created in memory with name: lab_optuna_hpo


2025/08/07 04:07:54 INFO mlflow.tracking._tracking_service.client: 🏃 View run Optuna Trial 0 at: dbc-8b9f7bce-656b.cloud.databricks.com/ml/experiments/4297320214106148/runs/94feb8776dfa434f80c652fb42b70f40.
2025/08/07 04:07:54 INFO mlflow.tracking._tracking_service.client: 🧪 View experiment at: dbc-8b9f7bce-656b.cloud.databricks.com/ml/experiments/4297320214106148.
[I 2025-08-07 04:07:54,229] Trial 0 finished with value: 0.5647793116863332 and parameters: {'criterion': 'gini', 'max_depth': 10, 'max_features': 8}. Best is trial 0 with value: 0.5647793116863332.


2025/08/07 04:07:59 INFO mlflow.tracking._tracking_service.client: 🏃 View run Optuna Trial 1 at: dbc-8b9f7bce-656b.cloud.databricks.com/ml/experiments/4297320214106148/runs/cf5500c8d2904e9c8f39b486ff738871.
2025/08/07 04:07:59 INFO mlflow.tracking._tracking_service.client: 🧪 View experiment at: dbc-8b9f7bce-656b.cloud.databricks.com/ml/experiments/4297320214106148.
[I 2025-08-07 04:07:59,602] Trial 1 finished with value: 0.5055753042920798 and parameters: {'criterion': 'entropy', 'max_depth': 18, 'max_features': 9}. Best is trial 0 with value: 0.5647793116863332.


2025/08/07 04:08:04 INFO mlflow.tracking._tracking_service.client: 🏃 View run Optuna Trial 2 at: dbc-8b9f7bce-656b.cloud.databricks.com/ml/experiments/4297320214106148/runs/6c1de4e822534437bc9b9435f3c4d0e2.
2025/08/07 04:08:04 INFO mlflow.tracking._tracking_service.client: 🧪 View experiment at: dbc-8b9f7bce-656b.cloud.databricks.com/ml/experiments/4297320214106148.
[I 2025-08-07 04:08:04,890] Trial 2 finished with value: 0.5319827420618483 and parameters: {'criterion': 'entropy', 'max_depth': 16, 'max_features': 5}. Best is trial 0 with value: 0.5647793116863332.


2025/08/07 04:08:10 INFO mlflow.tracking._tracking_service.client: 🏃 View run Optuna Trial 3 at: dbc-8b9f7bce-656b.cloud.databricks.com/ml/experiments/4297320214106148/runs/2bacaaaf8d584e92a40256e47c5a8b00.
2025/08/07 04:08:10 INFO mlflow.tracking._tracking_service.client: 🧪 View experiment at: dbc-8b9f7bce-656b.cloud.databricks.com/ml/experiments/4297320214106148.
[I 2025-08-07 04:08:10,387] Trial 3 finished with value: 0.49650374222975463 and parameters: {'criterion': 'gini', 'max_depth': 38, 'max_features': 5}. Best is trial 0 with value: 0.5647793116863332.


2025/08/07 04:08:15 INFO mlflow.tracking._tracking_service.client: 🏃 View run Optuna Trial 4 at: dbc-8b9f7bce-656b.cloud.databricks.com/ml/experiments/4297320214106148/runs/66257ffa8d0f470781925d672909c665.
2025/08/07 04:08:15 INFO mlflow.tracking._tracking_service.client: 🧪 View experiment at: dbc-8b9f7bce-656b.cloud.databricks.com/ml/experiments/4297320214106148.
[I 2025-08-07 04:08:15,629] Trial 4 finished with value: 0.4885201373040247 and parameters: {'criterion': 'gini', 'max_depth': 46, 'max_features': 6}. Best is trial 0 with value: 0.5647793116863332.


2025/08/07 04:08:20 INFO mlflow.tracking._tracking_service.client: 🏃 View run Optuna Trial 5 at: dbc-8b9f7bce-656b.cloud.databricks.com/ml/experiments/4297320214106148/runs/5e027c7274e748b4ad35ea920428d681.
2025/08/07 04:08:20 INFO mlflow.tracking._tracking_service.client: 🧪 View experiment at: dbc-8b9f7bce-656b.cloud.databricks.com/ml/experiments/4297320214106148.
[I 2025-08-07 04:08:21,114] Trial 5 finished with value: 0.5092658818985419 and parameters: {'criterion': 'log_loss', 'max_depth': 18, 'max_features': 7}. Best is trial 0 with value: 0.5647793116863332.


2025/08/07 04:08:26 INFO mlflow.tracking._tracking_service.client: 🏃 View run Optuna Trial 6 at: dbc-8b9f7bce-656b.cloud.databricks.com/ml/experiments/4297320214106148/runs/a5277cb67923409dbb31abc5bbce3c8b.
2025/08/07 04:08:26 INFO mlflow.tracking._tracking_service.client: 🧪 View experiment at: dbc-8b9f7bce-656b.cloud.databricks.com/ml/experiments/4297320214106148.
[I 2025-08-07 04:08:26,330] Trial 6 finished with value: 0.4905523288438103 and parameters: {'criterion': 'log_loss', 'max_depth': 44, 'max_features': 9}. Best is trial 0 with value: 0.5647793116863332.


2025/08/07 04:08:31 INFO mlflow.tracking._tracking_service.client: 🏃 View run Optuna Trial 7 at: dbc-8b9f7bce-656b.cloud.databricks.com/ml/experiments/4297320214106148/runs/b27e67ba1fa949aaa7aefd98b2ede321.
2025/08/07 04:08:31 INFO mlflow.tracking._tracking_service.client: 🧪 View experiment at: dbc-8b9f7bce-656b.cloud.databricks.com/ml/experiments/4297320214106148.
[I 2025-08-07 04:08:31,551] Trial 7 finished with value: 0.4857366037437891 and parameters: {'criterion': 'gini', 'max_depth': 38, 'max_features': 10}. Best is trial 0 with value: 0.5647793116863332.


2025/08/07 04:08:36 INFO mlflow.tracking._tracking_service.client: 🏃 View run Optuna Trial 8 at: dbc-8b9f7bce-656b.cloud.databricks.com/ml/experiments/4297320214106148/runs/19f632fb547f4cd2b374b5365064b985.
2025/08/07 04:08:36 INFO mlflow.tracking._tracking_service.client: 🧪 View experiment at: dbc-8b9f7bce-656b.cloud.databricks.com/ml/experiments/4297320214106148.
[I 2025-08-07 04:08:36,649] Trial 8 finished with value: 0.4967086126250772 and parameters: {'criterion': 'entropy', 'max_depth': 37, 'max_features': 8}. Best is trial 0 with value: 0.5647793116863332.


2025/08/07 04:08:41 INFO mlflow.tracking._tracking_service.client: 🏃 View run Optuna Trial 9 at: dbc-8b9f7bce-656b.cloud.databricks.com/ml/experiments/4297320214106148/runs/189c57bea9304d2ea085b25c8ee97790.
2025/08/07 04:08:41 INFO mlflow.tracking._tracking_service.client: 🧪 View experiment at: dbc-8b9f7bce-656b.cloud.databricks.com/ml/experiments/4297320214106148.
[I 2025-08-07 04:08:41,965] Trial 9 finished with value: 0.49183582507704454 and parameters: {'criterion': 'log_loss', 'max_depth': 38, 'max_features': 9}. Best is trial 0 with value: 0.5647793116863332.
2025/08/07 04:08:42 INFO mlflow.tracking._tracking_service.client: 🏃 View run Lab_Optuna_Hyperparameter_Optimization at: dbc-8b9f7bce-656b.cloud.databricks.com/ml/experiments/4297320214106148/runs/59eb4d82251647deb33f0f668e856997.
2025/08/07 04:08:42 INFO mlflow.tracking._tracking_service.client: 🧪 View experiment at: dbc-8b9f7bce-656b.cloud.databricks.com/ml/experiments/4297320214106148.
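
Each Optuna log line above ends with the best trial seen so far. The sketch below reproduces that bookkeeping for a `maximize` study, using the ten trial values copied from the log. It is a plain-Python illustration of the comparison only, not Optuna's internal implementation.

```python
# Trial values copied from the Optuna log (mean cross-validated precision).
trial_values = [
    0.5647793116863332,
    0.5055753042920798,
    0.5319827420618483,
    0.49650374222975463,
    0.4885201373040247,
    0.5092658818985419,
    0.4905523288438103,
    0.4857366037437891,
    0.4967086126250772,
    0.49183582507704454,
]

best_trial = None
running_best = []  # index of the best trial after each trial finishes
for number, value in enumerate(trial_values):
    # direction="maximize": a trial becomes the new best only if it scores higher
    if best_trial is None or value > trial_values[best_trial]:
        best_trial = number
    running_best.append(best_trial)

print(running_best)
print(f"Best is trial {best_trial} with value {trial_values[best_trial]:.4f}")
```

In this run, no later trial beats trial 0, which matches the "Best is trial 0" message repeated in every log line.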


## Step 3: Visual Inspection of Precision Values

Here, we can view all 10 trial runs. After completing the code and running the following cell, scroll to the right and locate the column `metrics.precision`. Use the UI to sort this column in descending order so that the largest precision score appears at the top. Next, you will create a visual to help you understand the distribution of scores by trial.


### Creating a precision score visual

1. **Run the next cell** to generate the table output.  
1. Click on the **plus (+) symbol** in the output cell.  
1. Select **Visualization** from the options.  
1. In the visualization settings, choose **Bar** and ensure **Horizontal Chart** toggle is **on**.  
1. Configure the **Y-axis**:  
   - Set **Y Column** to `tags.mlflow.runName`.  
1. Configure the **X-axis**:  
   - Set **X Columns** to `metrics.precision`.  
   - Choose **Sum** as the aggregation method.  
1. Click on the **Y-axis tab**:  
   - Ensure **Show Labels** is **on**.  
1. Apply the settings and visualize the data.


After following the above instructions, visually inspect which trial had the best run according to `precision`.
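
The same ranking can also be done in code once the runs are in a pandas DataFrame. The sketch below uses a small hand-built stand-in for the DataFrame that `mlflow.search_runs()` returns, with precision values copied from the trial log and rounded to four decimals. It assumes the parent run has no `metrics.precision` value and therefore appears as a missing value that should be dropped before sorting.

```python
import pandas as pd

# Stand-in for the DataFrame returned by mlflow.search_runs(); only the two
# columns needed for ranking are included. The parent run is assumed to have
# no precision metric (NaN).
trial_names = [f"Optuna Trial {i}" for i in range(10)]
runs = pd.DataFrame({
    "tags.mlflow.runName": trial_names + ["Lab_Optuna_Hyperparameter_Optimization"],
    "metrics.precision": [0.5648, 0.5056, 0.5320, 0.4965, 0.4885,
                          0.5093, 0.4906, 0.4857, 0.4967, 0.4918, float("nan")],
})

# Drop runs without the metric, then sort from highest to lowest precision.
ranked = (
    runs.dropna(subset=["metrics.precision"])
        .sort_values("metrics.precision", ascending=False)
        .reset_index(drop=True)
)
best_run_name = ranked.loc[0, "tags.mlflow.runName"]

print(ranked.head(3))
print(f"Best run: {best_run_name}")
```

The first row of the sorted frame should agree with the tallest bar in the chart you build from the visualization steps.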

In [0]:
import mlflow
import pandas as pd

## Reuse the experiment ID from the parent run created in Step 2
experiment_id = parent_run.info.experiment_id

## Fetch all runs from the experiment
df_runs = mlflow.search_runs(
  experiment_ids=[experiment_id]
  )

display(df_runs)

run_id,experiment_id,status,artifact_uri,start_time,end_time,metrics.f1,metrics.recall,metrics.accuracy,metrics.precision,params.criterion,params.max_features,params.max_depth,tags.mlflow.databricks.cluster.info,tags.mlflow.rootRunId,tags.mlflow.user,tags.mlflow.source.name,tags.mlflow.runName,tags.mlflow.runColor,tags.mlflow.databricks.notebook.commandID,tags.mlflow.databricks.workspaceURL,tags.mlflow.databricks.notebookRevisionID,tags.sparkDatasourceInfo,tags.mlflow.log-model.history,tags.mlflow.databricks.cluster.libraries,tags.mlflow.databricks.cluster.id,tags.mlflow.parentRunId,tags.mlflow.databricks.notebookID,tags.mlflow.databricks.notebookPath,tags.mlflow.databricks.workspaceID,tags.mlflow.databricks.webappURL,tags.mlflow.source.type
189c57bea9304d2ea085b25c8ee97790,4297320214106148,FINISHED,dbfs:/databricks/mlflow-tracking/4297320214106148/189c57bea9304d2ea085b25c8ee97790/artifacts,2025-08-07T04:08:36.781Z,2025-08-07T04:08:41.828Z,0.4887967936069273,0.4864931874487805,0.729745889387145,0.4918358250770445,log_loss,9.0,38.0,"{""cluster_name"":""labuser11091541_1754532261"",""spark_version"":""16.3.x-cpu-ml-scala2.12"",""node_type_id"":""i3.xlarge"",""driver_node_type_id"":""i3.xlarge"",""autotermination_minutes"":120,""disk_spec"":{},""num_workers"":0}",59eb4d82251647deb33f0f668e856997,labuser11091541_1754532261@vocareum.com,/Users/labuser11091541_1754532261@vocareum.com/machine-learning-model-development-2.1.4/M02 - Hyperparameter Tuning/2.2 Lab Solution - Hyperparameter Tuning with Optuna,Optuna Trial 9,#229487,1754532603628_6841699981224735058_51395fde30134abb94f299931c04b150,dbc-8b9f7bce-656b.cloud.databricks.com,1754539721968,"path=dbfs:/Volumes/dbacademy_telco/v01/telco/telco-customer-churn.csv,format=csv path=s3://unity-catalogs-us-west-2/metastore/3812518-root/1de8b107-0623-45f2-a2ab-de9793bf8c9f/tables/4f417dc7-994a-43f5-acfd-b3aa99ece837,version=0,format=delta","[{""artifact_path"":""lab_optuna_decision_tree_model"",""flavors"":{""python_function"":{""predict_fn"":""predict"",""model_path"":""model.pkl"",""loader_module"":""mlflow.sklearn"",""env"":{""conda"":""conda.yaml"",""virtualenv"":""python_env.yaml""},""python_version"":""3.12.3""},""sklearn"":{""pickled_model"":""model.pkl"",""sklearn_version"":""1.4.2"",""serialization_format"":""cloudpickle"",""code"":null}},""utc_time_created"":""2025-08-07 04:08:37.203429""}]","{""installable"":[],""redacted"":[]}",0807-020503-n8pcq51u,59eb4d82251647deb33f0f668e856997,4297320214104202,/Users/labuser11091541_1754532261@vocareum.com/machine-learning-model-development-2.1.4/M02 - Hyperparameter Tuning/2.2 Lab Solution - Hyperparameter Tuning with Optuna,182135318479115,https://oregon.cloud.databricks.com,NOTEBOOK
19f632fb547f4cd2b374b5365064b985,4297320214106148,FINISHED,dbfs:/databricks/mlflow-tracking/4297320214106148/19f632fb547f4cd2b374b5365064b985/artifacts,2025-08-07T04:08:31.679Z,2025-08-07T04:08:36.52Z,0.5020483687653235,0.5078459316795461,0.732286995515695,0.4967086126250772,entropy,8.0,37.0,"{""cluster_name"":""labuser11091541_1754532261"",""spark_version"":""16.3.x-cpu-ml-scala2.12"",""node_type_id"":""i3.xlarge"",""driver_node_type_id"":""i3.xlarge"",""autotermination_minutes"":120,""disk_spec"":{},""num_workers"":0}",59eb4d82251647deb33f0f668e856997,labuser11091541_1754532261@vocareum.com,/Users/labuser11091541_1754532261@vocareum.com/machine-learning-model-development-2.1.4/M02 - Hyperparameter Tuning/2.2 Lab Solution - Hyperparameter Tuning with Optuna,Optuna Trial 8,#5bc5db,1754532603628_6841699981224735058_51395fde30134abb94f299931c04b150,dbc-8b9f7bce-656b.cloud.databricks.com,1754539716649,"path=dbfs:/Volumes/dbacademy_telco/v01/telco/telco-customer-churn.csv,format=csv path=s3://unity-catalogs-us-west-2/metastore/3812518-root/1de8b107-0623-45f2-a2ab-de9793bf8c9f/tables/4f417dc7-994a-43f5-acfd-b3aa99ece837,version=0,format=delta","[{""artifact_path"":""lab_optuna_decision_tree_model"",""flavors"":{""python_function"":{""predict_fn"":""predict"",""model_path"":""model.pkl"",""loader_module"":""mlflow.sklearn"",""env"":{""conda"":""conda.yaml"",""virtualenv"":""python_env.yaml""},""python_version"":""3.12.3""},""sklearn"":{""pickled_model"":""model.pkl"",""sklearn_version"":""1.4.2"",""serialization_format"":""cloudpickle"",""code"":null}},""utc_time_created"":""2025-08-07 04:08:32.070666""}]","{""installable"":[],""redacted"":[]}",0807-020503-n8pcq51u,59eb4d82251647deb33f0f668e856997,4297320214104202,/Users/labuser11091541_1754532261@vocareum.com/machine-learning-model-development-2.1.4/M02 - Hyperparameter Tuning/2.2 Lab Solution - Hyperparameter Tuning with Optuna,182135318479115,https://oregon.cloud.databricks.com,NOTEBOOK
b27e67ba1fa949aaa7aefd98b2ede321,4297320214106148,FINISHED,dbfs:/databricks/mlflow-tracking/4297320214106148/b27e67ba1fa949aaa7aefd98b2ede321/artifacts,2025-08-07T04:08:26.461Z,2025-08-07T04:08:31.423Z,0.4888887564561573,0.4921218934709659,0.7263079222720479,0.4857366037437891,gini,10.0,38.0,"{""cluster_name"":""labuser11091541_1754532261"",""spark_version"":""16.3.x-cpu-ml-scala2.12"",""node_type_id"":""i3.xlarge"",""driver_node_type_id"":""i3.xlarge"",""autotermination_minutes"":120,""disk_spec"":{},""num_workers"":0}",59eb4d82251647deb33f0f668e856997,labuser11091541_1754532261@vocareum.com,/Users/labuser11091541_1754532261@vocareum.com/machine-learning-model-development-2.1.4/M02 - Hyperparameter Tuning/2.2 Lab Solution - Hyperparameter Tuning with Optuna,Optuna Trial 7,#edb732,1754532603628_6841699981224735058_51395fde30134abb94f299931c04b150,dbc-8b9f7bce-656b.cloud.databricks.com,1754539711553,"path=dbfs:/Volumes/dbacademy_telco/v01/telco/telco-customer-churn.csv,format=csv path=s3://unity-catalogs-us-west-2/metastore/3812518-root/1de8b107-0623-45f2-a2ab-de9793bf8c9f/tables/4f417dc7-994a-43f5-acfd-b3aa99ece837,version=0,format=delta","[{""artifact_path"":""lab_optuna_decision_tree_model"",""flavors"":{""python_function"":{""predict_fn"":""predict"",""model_path"":""model.pkl"",""loader_module"":""mlflow.sklearn"",""env"":{""conda"":""conda.yaml"",""virtualenv"":""python_env.yaml""},""python_version"":""3.12.3""},""sklearn"":{""pickled_model"":""model.pkl"",""sklearn_version"":""1.4.2"",""serialization_format"":""cloudpickle"",""code"":null}},""utc_time_created"":""2025-08-07 04:08:26.902086""}]","{""installable"":[],""redacted"":[]}",0807-020503-n8pcq51u,59eb4d82251647deb33f0f668e856997,4297320214104202,/Users/labuser11091541_1754532261@vocareum.com/machine-learning-model-development-2.1.4/M02 - Hyperparameter Tuning/2.2 Lab Solution - Hyperparameter Tuning with Optuna,182135318479115,https://oregon.cloud.databricks.com,NOTEBOOK
a5277cb67923409dbb31abc5bbce3c8b,4297320214106148,FINISHED,dbfs:/databricks/mlflow-tracking/4297320214106148/a5277cb67923409dbb31abc5bbce3c8b/artifacts,2025-08-07T04:08:21.258Z,2025-08-07T04:08:26.188Z,0.4919032476021858,0.4937978246016469,0.7289985052316892,0.4905523288438103,log_loss,9.0,44.0,"{""cluster_name"":""labuser11091541_1754532261"",""spark_version"":""16.3.x-cpu-ml-scala2.12"",""node_type_id"":""i3.xlarge"",""driver_node_type_id"":""i3.xlarge"",""autotermination_minutes"":120,""disk_spec"":{},""num_workers"":0}",59eb4d82251647deb33f0f668e856997,labuser11091541_1754532261@vocareum.com,/Users/labuser11091541_1754532261@vocareum.com/machine-learning-model-development-2.1.4/M02 - Hyperparameter Tuning/2.2 Lab Solution - Hyperparameter Tuning with Optuna,Optuna Trial 6,#c565c7,1754532603628_6841699981224735058_51395fde30134abb94f299931c04b150,dbc-8b9f7bce-656b.cloud.databricks.com,1754539706331,"path=dbfs:/Volumes/dbacademy_telco/v01/telco/telco-customer-churn.csv,format=csv path=s3://unity-catalogs-us-west-2/metastore/3812518-root/1de8b107-0623-45f2-a2ab-de9793bf8c9f/tables/4f417dc7-994a-43f5-acfd-b3aa99ece837,version=0,format=delta","[{""artifact_path"":""lab_optuna_decision_tree_model"",""flavors"":{""python_function"":{""predict_fn"":""predict"",""model_path"":""model.pkl"",""loader_module"":""mlflow.sklearn"",""env"":{""conda"":""conda.yaml"",""virtualenv"":""python_env.yaml""},""python_version"":""3.12.3""},""sklearn"":{""pickled_model"":""model.pkl"",""sklearn_version"":""1.4.2"",""serialization_format"":""cloudpickle"",""code"":null}},""utc_time_created"":""2025-08-07 04:08:21.660545""}]","{""installable"":[],""redacted"":[]}",0807-020503-n8pcq51u,59eb4d82251647deb33f0f668e856997,4297320214104202,/Users/labuser11091541_1754532261@vocareum.com/machine-learning-model-development-2.1.4/M02 - Hyperparameter Tuning/2.2 Lab Solution - Hyperparameter Tuning with Optuna,182135318479115,https://oregon.cloud.databricks.com,NOTEBOOK
5e027c7274e748b4ad35ea920428d681,4297320214106148,FINISHED,dbfs:/databricks/mlflow-tracking/4297320214106148/5e027c7274e748b4ad35ea920428d681/artifacts,2025-08-07T04:08:15.757Z,2025-08-07T04:08:20.921Z,0.5130938172283945,0.5173990230201585,0.7390134529147981,0.5092658818985419,log_loss,7.0,18.0,"{""cluster_name"":""labuser11091541_1754532261"",""spark_version"":""16.3.x-cpu-ml-scala2.12"",""node_type_id"":""i3.xlarge"",""driver_node_type_id"":""i3.xlarge"",""autotermination_minutes"":120,""disk_spec"":{},""num_workers"":0}",59eb4d82251647deb33f0f668e856997,labuser11091541_1754532261@vocareum.com,/Users/labuser11091541_1754532261@vocareum.com/machine-learning-model-development-2.1.4/M02 - Hyperparameter Tuning/2.2 Lab Solution - Hyperparameter Tuning with Optuna,Optuna Trial 5,#87cebf,1754532603628_6841699981224735058_51395fde30134abb94f299931c04b150,dbc-8b9f7bce-656b.cloud.databricks.com,1754539701114,"path=dbfs:/Volumes/dbacademy_telco/v01/telco/telco-customer-churn.csv,format=csv path=s3://unity-catalogs-us-west-2/metastore/3812518-root/1de8b107-0623-45f2-a2ab-de9793bf8c9f/tables/4f417dc7-994a-43f5-acfd-b3aa99ece837,version=0,format=delta","[{""artifact_path"":""lab_optuna_decision_tree_model"",""flavors"":{""python_function"":{""predict_fn"":""predict"",""model_path"":""model.pkl"",""loader_module"":""mlflow.sklearn"",""env"":{""conda"":""conda.yaml"",""virtualenv"":""python_env.yaml""},""python_version"":""3.12.3""},""sklearn"":{""pickled_model"":""model.pkl"",""sklearn_version"":""1.4.2"",""serialization_format"":""cloudpickle"",""code"":null}},""utc_time_created"":""2025-08-07 04:08:16.219305""}]","{""installable"":[],""redacted"":[]}",0807-020503-n8pcq51u,59eb4d82251647deb33f0f668e856997,4297320214104202,/Users/labuser11091541_1754532261@vocareum.com/machine-learning-model-development-2.1.4/M02 - Hyperparameter Tuning/2.2 Lab Solution - Hyperparameter Tuning with Optuna,182135318479115,https://oregon.cloud.databricks.com,NOTEBOOK
66257ffa8d0f470781925d672909c665,4297320214106148,FINISHED,dbfs:/databricks/mlflow-tracking/4297320214106148/66257ffa8d0f470781925d672909c665/artifacts,2025-08-07T04:08:10.543Z,2025-08-07T04:08:15.498Z,0.4970622064126015,0.5061804100589935,0.7276532137518684,0.4885201373040247,gini,6.0,46.0,"{""cluster_name"":""labuser11091541_1754532261"",""spark_version"":""16.3.x-cpu-ml-scala2.12"",""node_type_id"":""i3.xlarge"",""driver_node_type_id"":""i3.xlarge"",""autotermination_minutes"":120,""disk_spec"":{},""num_workers"":0}",59eb4d82251647deb33f0f668e856997,labuser11091541_1754532261@vocareum.com,/Users/labuser11091541_1754532261@vocareum.com/machine-learning-model-development-2.1.4/M02 - Hyperparameter Tuning/2.2 Lab Solution - Hyperparameter Tuning with Optuna,Optuna Trial 4,#e57439,1754532603628_6841699981224735058_51395fde30134abb94f299931c04b150,dbc-8b9f7bce-656b.cloud.databricks.com,1754539695634,"path=dbfs:/Volumes/dbacademy_telco/v01/telco/telco-customer-churn.csv,format=csv path=s3://unity-catalogs-us-west-2/metastore/3812518-root/1de8b107-0623-45f2-a2ab-de9793bf8c9f/tables/4f417dc7-994a-43f5-acfd-b3aa99ece837,version=0,format=delta","[{""artifact_path"":""lab_optuna_decision_tree_model"",""flavors"":{""python_function"":{""predict_fn"":""predict"",""model_path"":""model.pkl"",""loader_module"":""mlflow.sklearn"",""env"":{""conda"":""conda.yaml"",""virtualenv"":""python_env.yaml""},""python_version"":""3.12.3""},""sklearn"":{""pickled_model"":""model.pkl"",""sklearn_version"":""1.4.2"",""serialization_format"":""cloudpickle"",""code"":null}},""utc_time_created"":""2025-08-07 04:08:10.950906""}]","{""installable"":[],""redacted"":[]}",0807-020503-n8pcq51u,59eb4d82251647deb33f0f668e856997,4297320214104202,/Users/labuser11091541_1754532261@vocareum.com/machine-learning-model-development-2.1.4/M02 - Hyperparameter Tuning/2.2 Lab Solution - Hyperparameter Tuning with Optuna,182135318479115,https://oregon.cloud.databricks.com,NOTEBOOK
2bacaaaf8d584e92a40256e47c5a8b00,4297320214106148,FINISHED,dbfs:/databricks/mlflow-tracking/4297320214106148/2bacaaaf8d584e92a40256e47c5a8b00/artifacts,2025-08-07T04:08:05.039Z,2025-08-07T04:08:10.261Z,0.4987508543078893,0.501112871264642,0.7321375186846039,0.4965037422297546,gini,5.0,38.0,"{""cluster_name"":""labuser11091541_1754532261"",""spark_version"":""16.3.x-cpu-ml-scala2.12"",""node_type_id"":""i3.xlarge"",""driver_node_type_id"":""i3.xlarge"",""autotermination_minutes"":120,""disk_spec"":{},""num_workers"":0}",59eb4d82251647deb33f0f668e856997,labuser11091541_1754532261@vocareum.com,/Users/labuser11091541_1754532261@vocareum.com/machine-learning-model-development-2.1.4/M02 - Hyperparameter Tuning/2.2 Lab Solution - Hyperparameter Tuning with Optuna,Optuna Trial 3,#e87b9f,1754532603628_6841699981224735058_51395fde30134abb94f299931c04b150,dbc-8b9f7bce-656b.cloud.databricks.com,1754539690506,"path=dbfs:/Volumes/dbacademy_telco/v01/telco/telco-customer-churn.csv,format=csv path=s3://unity-catalogs-us-west-2/metastore/3812518-root/1de8b107-0623-45f2-a2ab-de9793bf8c9f/tables/4f417dc7-994a-43f5-acfd-b3aa99ece837,version=0,format=delta","[{""artifact_path"":""lab_optuna_decision_tree_model"",""flavors"":{""python_function"":{""predict_fn"":""predict"",""model_path"":""model.pkl"",""loader_module"":""mlflow.sklearn"",""env"":{""conda"":""conda.yaml"",""virtualenv"":""python_env.yaml""},""python_version"":""3.12.3""},""sklearn"":{""pickled_model"":""model.pkl"",""sklearn_version"":""1.4.2"",""serialization_format"":""cloudpickle"",""code"":null}},""utc_time_created"":""2025-08-07 04:08:05.567596""}]","{""installable"":[],""redacted"":[]}",0807-020503-n8pcq51u,59eb4d82251647deb33f0f668e856997,4297320214104202,/Users/labuser11091541_1754532261@vocareum.com/machine-learning-model-development-2.1.4/M02 - Hyperparameter Tuning/2.2 Lab Solution - Hyperparameter Tuning with Optuna,182135318479115,https://oregon.cloud.databricks.com,NOTEBOOK
6c1de4e822534437bc9b9435f3c4d0e2,4297320214106148,FINISHED,dbfs:/databricks/mlflow-tracking/4297320214106148/6c1de4e822534437bc9b9435f3c4d0e2/artifacts,2025-08-07T04:07:59.741Z,2025-08-07T04:08:04.757Z,0.5318328810681657,0.5325637866391099,0.7508221225710016,0.5319827420618483,entropy,5.0,16.0,"{""cluster_name"":""labuser11091541_1754532261"",""spark_version"":""16.3.x-cpu-ml-scala2.12"",""node_type_id"":""i3.xlarge"",""driver_node_type_id"":""i3.xlarge"",""autotermination_minutes"":120,""disk_spec"":{},""num_workers"":0}",59eb4d82251647deb33f0f668e856997,labuser11091541_1754532261@vocareum.com,/Users/labuser11091541_1754532261@vocareum.com/machine-learning-model-development-2.1.4/M02 - Hyperparameter Tuning/2.2 Lab Solution - Hyperparameter Tuning with Optuna,Optuna Trial 2,#7d54b2,1754532603628_6841699981224735058_51395fde30134abb94f299931c04b150,dbc-8b9f7bce-656b.cloud.databricks.com,1754539684891,"path=dbfs:/Volumes/dbacademy_telco/v01/telco/telco-customer-churn.csv,format=csv path=s3://unity-catalogs-us-west-2/metastore/3812518-root/1de8b107-0623-45f2-a2ab-de9793bf8c9f/tables/4f417dc7-994a-43f5-acfd-b3aa99ece837,version=0,format=delta","[{""artifact_path"":""lab_optuna_decision_tree_model"",""flavors"":{""python_function"":{""predict_fn"":""predict"",""model_path"":""model.pkl"",""loader_module"":""mlflow.sklearn"",""env"":{""conda"":""conda.yaml"",""virtualenv"":""python_env.yaml""},""python_version"":""3.12.3""},""sklearn"":{""pickled_model"":""model.pkl"",""sklearn_version"":""1.4.2"",""serialization_format"":""cloudpickle"",""code"":null}},""utc_time_created"":""2025-08-07 04:08:00.155883""}]","{""installable"":[],""redacted"":[]}",0807-020503-n8pcq51u,59eb4d82251647deb33f0f668e856997,4297320214104202,/Users/labuser11091541_1754532261@vocareum.com/machine-learning-model-development-2.1.4/M02 - Hyperparameter Tuning/2.2 Lab Solution - Hyperparameter Tuning with Optuna,182135318479115,https://oregon.cloud.databricks.com,NOTEBOOK
cf5500c8d2904e9c8f39b486ff738871,4297320214106148,FINISHED,dbfs:/databricks/mlflow-tracking/4297320214106148/cf5500c8d2904e9c8f39b486ff738871/artifacts,2025-08-07T04:07:54.367Z,2025-08-07T04:07:59.471Z,0.5066239696273639,0.5089502860722647,0.7373692077727952,0.5055753042920798,entropy,9.0,18.0,"{""cluster_name"":""labuser11091541_1754532261"",""spark_version"":""16.3.x-cpu-ml-scala2.12"",""node_type_id"":""i3.xlarge"",""driver_node_type_id"":""i3.xlarge"",""autotermination_minutes"":120,""disk_spec"":{},""num_workers"":0}",59eb4d82251647deb33f0f668e856997,labuser11091541_1754532261@vocareum.com,/Users/labuser11091541_1754532261@vocareum.com/machine-learning-model-development-2.1.4/M02 - Hyperparameter Tuning/2.2 Lab Solution - Hyperparameter Tuning with Optuna,Optuna Trial 1,#479a5f,1754532603628_6841699981224735058_51395fde30134abb94f299931c04b150,dbc-8b9f7bce-656b.cloud.databricks.com,1754539679606,"path=dbfs:/Volumes/dbacademy_telco/v01/telco/telco-customer-churn.csv,format=csv path=s3://unity-catalogs-us-west-2/metastore/3812518-root/1de8b107-0623-45f2-a2ab-de9793bf8c9f/tables/4f417dc7-994a-43f5-acfd-b3aa99ece837,version=0,format=delta","[{""artifact_path"":""lab_optuna_decision_tree_model"",""flavors"":{""python_function"":{""predict_fn"":""predict"",""model_path"":""model.pkl"",""loader_module"":""mlflow.sklearn"",""env"":{""conda"":""conda.yaml"",""virtualenv"":""python_env.yaml""},""python_version"":""3.12.3""},""sklearn"":{""pickled_model"":""model.pkl"",""sklearn_version"":""1.4.2"",""serialization_format"":""cloudpickle"",""code"":null}},""utc_time_created"":""2025-08-07 04:07:54.803199""}]","{""installable"":[],""redacted"":[]}",0807-020503-n8pcq51u,59eb4d82251647deb33f0f668e856997,4297320214104202,/Users/labuser11091541_1754532261@vocareum.com/machine-learning-model-development-2.1.4/M02 - Hyperparameter Tuning/2.2 Lab Solution - Hyperparameter Tuning with Optuna,182135318479115,https://oregon.cloud.databricks.com,NOTEBOOK
94feb8776dfa434f80c652fb42b70f40,4297320214106148,FINISHED,dbfs:/databricks/mlflow-tracking/4297320214106148/94feb8776dfa434f80c652fb42b70f40/artifacts,2025-08-07T04:07:48.85Z,2025-08-07T04:07:54.09Z,0.5347489889735407,0.5095161848956116,0.7650224215246637,0.5647793116863332,gini,8.0,10.0,"{""cluster_name"":""labuser11091541_1754532261"",""spark_version"":""16.3.x-cpu-ml-scala2.12"",""node_type_id"":""i3.xlarge"",""driver_node_type_id"":""i3.xlarge"",""autotermination_minutes"":120,""disk_spec"":{},""num_workers"":0}",59eb4d82251647deb33f0f668e856997,labuser11091541_1754532261@vocareum.com,/Users/labuser11091541_1754532261@vocareum.com/machine-learning-model-development-2.1.4/M02 - Hyperparameter Tuning/2.2 Lab Solution - Hyperparameter Tuning with Optuna,Optuna Trial 0,#da4c4c,1754532603628_6841699981224735058_51395fde30134abb94f299931c04b150,dbc-8b9f7bce-656b.cloud.databricks.com,1754539674344,"path=dbfs:/Volumes/dbacademy_telco/v01/telco/telco-customer-churn.csv,format=csv path=s3://unity-catalogs-us-west-2/metastore/3812518-root/1de8b107-0623-45f2-a2ab-de9793bf8c9f/tables/4f417dc7-994a-43f5-acfd-b3aa99ece837,version=0,format=delta","[{""artifact_path"":""lab_optuna_decision_tree_model"",""flavors"":{""python_function"":{""predict_fn"":""predict"",""model_path"":""model.pkl"",""loader_module"":""mlflow.sklearn"",""env"":{""conda"":""conda.yaml"",""virtualenv"":""python_env.yaml""},""python_version"":""3.12.3""},""sklearn"":{""pickled_model"":""model.pkl"",""sklearn_version"":""1.4.2"",""serialization_format"":""cloudpickle"",""code"":null}},""utc_time_created"":""2025-08-07 04:07:49.286149""}]","{""installable"":[],""redacted"":[]}",0807-020503-n8pcq51u,59eb4d82251647deb33f0f668e856997,4297320214104202,/Users/labuser11091541_1754532261@vocareum.com/machine-learning-model-development-2.1.4/M02 - Hyperparameter Tuning/2.2 Lab Solution - Hyperparameter Tuning with Optuna,182135318479115,https://oregon.cloud.databricks.com,NOTEBOOK



## Step 4. Find the Best Run Programmatically

In this step, you will use the Optuna study to find the best precision score and the hyperparameter values that produced it. You will then use the MLflow API to find the best run by the same metric.

#### Instructions
1. Use the Optuna study to find the best precision score. 
1. Use the Optuna study to find the best hyperparameter values. 
1. Use the MLflow API to find the best run based on precision score.

In [0]:
## Display the best hyperparameters and metric
print(f"Best hyperparameters: {study.best_params}")
print(f"Best precision score: {study.best_value}")

Best hyperparameters: {'criterion': 'gini', 'max_depth': 10, 'max_features': 8}
Best precision score: 0.5647793116863332


In [0]:
search_runs_pd = (mlflow.search_runs(
    experiment_ids=[experiment_id],
    order_by=["metrics.precision DESC"],
    max_results=1))

## Convert search_runs_pd to a PySpark DataFrame
search_runs_sd = spark.createDataFrame(search_runs_pd)
display(search_runs_pd)

run_id,experiment_id,status,artifact_uri,start_time,end_time,metrics.f1,metrics.recall,metrics.accuracy,metrics.precision,params.criterion,params.max_features,params.max_depth,tags.mlflow.databricks.cluster.info,tags.mlflow.rootRunId,tags.mlflow.user,tags.mlflow.source.name,tags.mlflow.runName,tags.mlflow.runColor,tags.mlflow.databricks.notebook.commandID,tags.mlflow.databricks.workspaceURL,tags.mlflow.databricks.notebookRevisionID,tags.sparkDatasourceInfo,tags.mlflow.log-model.history,tags.mlflow.databricks.cluster.libraries,tags.mlflow.parentRunId,tags.mlflow.databricks.cluster.id,tags.mlflow.databricks.notebookID,tags.mlflow.databricks.notebookPath,tags.mlflow.databricks.workspaceID,tags.mlflow.databricks.webappURL,tags.mlflow.source.type
2b5878c62ffe44e39cd529dc56616033,4297320214106148,FINISHED,dbfs:/databricks/mlflow-tracking/4297320214106148/2b5878c62ffe44e39cd529dc56616033/artifacts,2025-08-07T04:11:01.439Z,2025-08-07T04:11:06.443Z,0.5656753356212866,0.5449018950986347,0.7784753363228699,0.5909665044548491,entropy,6,8,"{""cluster_name"":""labuser11091541_1754532261"",""spark_version"":""16.3.x-cpu-ml-scala2.12"",""node_type_id"":""i3.xlarge"",""driver_node_type_id"":""i3.xlarge"",""autotermination_minutes"":120,""disk_spec"":{},""num_workers"":0}",a9057d0fc1d84658a649b4a326930e24,labuser11091541_1754532261@vocareum.com,/Users/labuser11091541_1754532261@vocareum.com/machine-learning-model-development-2.1.4/M02 - Hyperparameter Tuning/2.2 Lab - Hyperparameter Tuning with Optuna,Optuna Trial 7,#e87b9f,1754532603627_7527455431020846362_8b4a4edff31f42ba92202ed65116fb55,dbc-8b9f7bce-656b.cloud.databricks.com,1754539866584,"path=dbfs:/Volumes/dbacademy_telco/v01/telco/telco-customer-churn.csv,format=csv path=s3://unity-catalogs-us-west-2/metastore/3812518-root/1de8b107-0623-45f2-a2ab-de9793bf8c9f/tables/7af8cd24-68b5-48a9-9fe1-b46710ca4781,version=0,format=delta","[{""artifact_path"":""lab_optuna_decision_tree_model"",""flavors"":{""python_function"":{""predict_fn"":""predict"",""model_path"":""model.pkl"",""loader_module"":""mlflow.sklearn"",""env"":{""conda"":""conda.yaml"",""virtualenv"":""python_env.yaml""},""python_version"":""3.12.3""},""sklearn"":{""pickled_model"":""model.pkl"",""sklearn_version"":""1.4.2"",""serialization_format"":""cloudpickle"",""code"":null}},""utc_time_created"":""2025-08-07 04:11:01.825432""}]","{""installable"":[],""redacted"":[]}",a9057d0fc1d84658a649b4a326930e24,0807-020503-n8pcq51u,4297320214104172,/Users/labuser11091541_1754532261@vocareum.com/machine-learning-model-development-2.1.4/M02 - Hyperparameter Tuning/2.2 Lab - Hyperparameter Tuning with Optuna,182135318479115,https://oregon.cloud.databricks.com,NOTEBOOK
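
In the output shown here, the run returned by `mlflow.search_runs` (precision 0.5910) has a different `tags.mlflow.rootRunId` and notebook path than the six trials in the earlier runs table, whose best precision matches `study.best_value` (0.5648). One reading is that the experiment holds runs from more than one study and the unfiltered search ranks all of them. The sketch below reproduces that ranking in plain Python, using run IDs, root run IDs, and precision values copied from the tables above. The `best_run` helper and the tuple layout are hypothetical and are not part of MLflow or Optuna.

```python
# Root run IDs as they appear in the tags.mlflow.rootRunId column of the output.
SOLUTION_ROOT = "59eb4d82251647deb33f0f668e856997"
LAB_ROOT = "a9057d0fc1d84658a649b4a326930e24"

# (run_id, root_run_id, precision) copied from the run tables in this lab.
RUNS = [
    ("94feb8776dfa434f80c652fb42b70f40", SOLUTION_ROOT, 0.5647793116863332),
    ("cf5500c8d2904e9c8f39b486ff738871", SOLUTION_ROOT, 0.5055753042920798),
    ("6c1de4e822534437bc9b9435f3c4d0e2", SOLUTION_ROOT, 0.5319827420618483),
    ("2bacaaaf8d584e92a40256e47c5a8b00", SOLUTION_ROOT, 0.4965037422297546),
    ("66257ffa8d0f470781925d672909c665", SOLUTION_ROOT, 0.4885201373040247),
    ("5e027c7274e748b4ad35ea920428d681", SOLUTION_ROOT, 0.5092658818985419),
    ("2b5878c62ffe44e39cd529dc56616033", LAB_ROOT, 0.5909665044548491),
]


def best_run(rows, root_run_id=None):
    """Mimic order_by precision DESC with max_results=1, optionally per root run."""
    if root_run_id is not None:
        rows = [row for row in rows if row[1] == root_run_id]
    if not rows:
        raise ValueError("no runs match the given root run ID")
    # Sort by precision, highest first, and keep only the top row.
    return sorted(rows, key=lambda row: row[2], reverse=True)[0]


print(best_run(RUNS))                             # best across the whole experiment
print(best_run(RUNS, root_run_id=SOLUTION_ROOT))  # best within one study's runs
```

With MLflow itself, a similar restriction could likely be expressed through the `filter_string` argument of `mlflow.search_runs`, for example by filtering on the parent run tag of the study you ran.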


## Load the Best Model and Parameters and Register to Unity Catalog

#### Instructions
1. Either copy and paste the `run_id` and `experiment_id` from the results above, or retrieve them programmatically using `.collect()` on the `search_runs_sd` PySpark DataFrame.
1. Load the model from MLflow.
1. Display the results for the best model and parameters.

In [0]:
## Get the run_id and experiment_id string values from the PySpark DataFrame search_runs_sd
run_id = search_runs_sd.select("run_id").collect()[0][0]
experiment_id = search_runs_sd.select("experiment_id").collect()[0][0]

print(f"Run ID: {run_id}")
print(f"Experiment ID: {experiment_id}")

Run ID: 2b5878c62ffe44e39cd529dc56616033
Experiment ID: 4297320214106148


In [0]:
import mlflow
import json
from mlflow.models import Model

## Grab an input example from the test set
input_example = X_test.iloc[[0]]

model_path = f"dbfs:/databricks/mlflow-tracking/{experiment_id}/{run_id}/artifacts/lab_optuna_decision_tree_model"

## Load the model from the artifact path built from the experiment ID and run ID
loaded_model = mlflow.pyfunc.load_model(model_path)

## Retrieve the model parameters using the MLflow client and its get_run() method
client = mlflow.tracking.MlflowClient()
params = client.get_run(run_id).data.params

## Display model parameters
print("Best Model Parameters:")
print(json.dumps(params, indent=4))
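
The artifact folder name `lab_optuna_decision_tree_model` is hard-coded in the cell above. In this lab's run table it also appears inside the `tags.mlflow.log-model.history` tag, so it could be read from there instead. The following is a minimal sketch, assuming the tag holds a JSON list of entries with `artifact_path` and `utc_time_created` fields as in the output above. `model_uri_from_history` is a hypothetical helper, and the tag value is trimmed to the two fields it uses.

```python
import json

# Tag value copied from the best run's row, trimmed to the fields used here.
log_model_history = (
    '[{"artifact_path": "lab_optuna_decision_tree_model", '
    '"utc_time_created": "2025-08-07 04:11:01.825432"}]'
)


def model_uri_from_history(run_id: str, history_json: str) -> str:
    """Build a runs:/ URI from the newest entry in the log-model history tag."""
    entries = json.loads(history_json)
    if not entries:
        raise ValueError("run has no logged models")
    # Timestamps in this format sort correctly as plain strings.
    latest = max(entries, key=lambda entry: entry["utc_time_created"])
    return f"runs:/{run_id}/{latest['artifact_path']}"


model_uri = model_uri_from_history(
    "2b5878c62ffe44e39cd529dc56616033", log_model_history
)
print(model_uri)
# runs:/2b5878c62ffe44e39cd529dc56616033/lab_optuna_decision_tree_model
```

A `runs:/` URI of this form could be passed to `mlflow.pyfunc.load_model` in place of the `dbfs:` path, which avoids depending on the workspace's artifact storage layout.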

### Register the Model to Unity Catalog

Register your model to Unity Catalog under the name `lab_optuna_model`. 

> _You can get the catalog name and schema name using `DA.catalog_name` and `DA.schema_name`, respectively._
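
Unity Catalog addresses a registered model by a three-level name of the form `catalog.schema.model`. As a small sketch of how that name is assembled, the hypothetical helper below joins the three parts and rejects empty parts or parts containing a dot (a simplifying check assumed here, not an exhaustive validation). The catalog and schema values are the ones that appear in this lab's output and stand in for `DA.catalog_name` and `DA.schema_name`.

```python
def uc_model_name(catalog: str, schema: str, model: str) -> str:
    """Join catalog, schema, and model into a three-level Unity Catalog name."""
    parts = [catalog, schema, model]
    for part in parts:
        # A dot inside a part would change the number of name levels.
        if not part or "." in part:
            raise ValueError(f"invalid name part: {part!r}")
    return ".".join(parts)


# Values taken from this lab's output; in the notebook they come from the DA object.
full_name = uc_model_name(
    "dbacademy", "labuser11091541_1754532261", "lab_optuna_model"
)
print(full_name)
# dbacademy.labuser11091541_1754532261.lab_optuna_model
```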

In [0]:
mlflow.set_registry_uri("databricks-uc")
model_uri = f'runs:/{run_id}/lab_optuna_decision_tree_model'
mlflow.register_model(model_uri=model_uri, name=f"{DA.catalog_name}.{DA.schema_name}.lab_optuna_model")

Registered model 'dbacademy.labuser11091541_1754532261.lab_optuna_model' already exists. Creating a new version of this model...


Created version '2' of model 'dbacademy.labuser11091541_1754532261.lab_optuna_model'.


<ModelVersion: aliases=[], creation_timestamp=1754540472673, current_stage=None, description='', last_updated_timestamp=1754540473634, name='dbacademy.labuser11091541_1754532261.lab_optuna_model', run_id='2b5878c62ffe44e39cd529dc56616033', run_link=None, source='dbfs:/databricks/mlflow-tracking/4297320214106148/2b5878c62ffe44e39cd529dc56616033/artifacts/lab_optuna_decision_tree_model', status='READY', status_message='', tags={}, user_id='labuser11091541_1754532261@vocareum.com', version='2'>


## Conclusion

In this lab, you learned about Optuna and how to integrate Optuna trials and studies with MLflow. You also demonstrated the ability to programmatically and visually inspect the best trial. Finally, you showed how to load the MLflow model and register it to Unity Catalog.


&copy; 2025 Databricks, Inc. All rights reserved. Apache, Apache Spark, Spark, the Spark Logo, Apache Iceberg, Iceberg, and the Apache Iceberg logo are trademarks of the <a href="https://www.apache.org/" target="blank">Apache Software Foundation</a>.<br/>
<br/><a href="https://databricks.com/privacy-policy" target="blank">Privacy Policy</a> | 
<a href="https://databricks.com/terms-of-use" target="blank">Terms of Use</a> | 
<a href="https://help.databricks.com/" target="blank">Support</a>