
<div style="text-align: center; line-height: 0; padding-top: 9px;">
  <img src="https://databricks.com/wp-content/uploads/2018/03/db-academy-rgb-1200px.png" alt="Databricks Learning">
</div>


# Lab - Deploying Models with Jobs and the Databricks CLI

In this lab, you will update the alias of a previously created model to **"Champion"**, marking it as ready for deployment. Using the **Databricks CLI**, you will create and run a **Lakeflow Job**. The job checks whether the latest model version marked **"Champion"** meets the production-ready criteria; if it does, the job deploys that version and then runs a **batch inference** process.

**Lab Outline:**

_In this lab, you will complete the following tasks:_
- **Task 1:** Identify and update a model's alias to **"Champion"**.
- **Task 2:** Configure and use the Databricks CLI to manage jobs.
- **Task 3:** Create and run a job for model deployment and batch inference.
- **Task 4:** Monitor and explore the executing job.


## REQUIRED - SELECT CLASSIC COMPUTE
Before executing cells in this notebook, please select your classic compute cluster in the lab. Be aware that **Serverless** is enabled by default.
Follow these steps to select the classic compute cluster:
1. Navigate to the top-right of this notebook and click the drop-down menu to select your cluster. By default, the notebook will use **Serverless**.
1. If your cluster is available, select it and continue to the next cell. If the cluster is not shown:
   - In the drop-down, select **More**.
   - In the **Attach to an existing compute resource** pop-up, select the first drop-down. You will see a unique cluster name in that drop-down. Please select that cluster.
  
**NOTE:** If your cluster has terminated, you might need to restart it in order to select it. To do this:
1. Right-click on **Compute** in the left navigation pane and select *Open in new tab*.
1. Find the triangle icon to the right of your compute cluster name and click it.
1. Wait a few minutes for the cluster to start.
1. Once the cluster is running, complete the steps above to select your cluster.

## Requirements

Please review the following requirements before starting the lesson:

* To run this notebook, you need to use one of the following Databricks runtime(s): **16.3.x-cpu-ml-scala2.12**

## Classroom Setup

Before starting the lab, run the provided classroom setup script. This script defines the configuration variables the lab needs. Execute the following cell:


In [0]:
%run ../Includes/Classroom-Setup-02Lab

Collecting databricks-sdk==0.36.0
  Using cached databricks_sdk-0.36.0-py3-none-any.whl.metadata (38 kB)
Using cached databricks_sdk-0.36.0-py3-none-any.whl (569 kB)
Installing collected packages: databricks-sdk
  Attempting uninstall: databricks-sdk
    Found existing installation: databricks-sdk 0.30.0
    Not uninstalling databricks-sdk at /databricks/python3/lib/python3.12/site-packages, outside environment /local_disk0/.ephemeral_nfs/envs/pythonEnv-b3292ff6-6449-49c2-8164-d51a69240874
    Can't uninstall 'databricks-sdk'. No files were found to uninstall.
Successfully installed databricks-sdk-0.36.0
Note: you may need to restart the kernel using %restart_python or dbutils.library.restartPython() to use updated packages.


**Other Conventions:**

Throughout this lab, we'll refer to the object `DA`. This object, provided by Databricks Academy, contains variables such as your username, catalog name, schema name, working directory, and dataset locations. Run the code block below to view these details:

In [0]:
print(f"Username:          {DA.username}")
print(f"Catalog Name:      {DA.catalog_name}")
print(f"Schema Name:       {DA.schema_name}")
print(f"Working Directory: {DA.paths.working_dir}")
print(f"Dataset Location:  {DA.paths.datasets}")

Username:          labuser10859621_1751977661@vocareum.com
Catalog Name:      dbacademy
Schema Name:       labuser10859621_1751977661
Working Directory: /Volumes/dbacademy/ops/labuser10859621_1751977661@vocareum_com
Dataset Location:  NestedNamespace (banking='/Volumes/dbacademy_banking/v01', cdc_diabetes='/Volumes/dbacademy_cdc_diabetes/v01', monitoring='/Volumes/dbacademy_monitoring/v01', telco='/Volumes/dbacademy_telco/v01')


### Authentication

Normally, you would have to set up authentication for the CLI yourself. In this training environment, that is already taken care of if you ran through the accompanying **'Generate Tokens'** notebook. If you did, your credentials are already loaded into the **`DATABRICKS_HOST`** and **`DATABRICKS_TOKEN`** environment variables.

##### *If you did not, run through it now, then restart this notebook.*

In [0]:
DA.get_credentials()

interactive(children=(Text(value='https://dbc-ef9ac081-5ec6.cloud.databricks.com', continuous_update=False, de…
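
Before moving on, it can help to confirm that both environment variables are actually set. The following is a minimal sketch, not part of the course tooling: the helper name `missing_credentials` is hypothetical, and only the two variable names come from this notebook.

```python
import os
from typing import List, Mapping

# The two variables this lab expects the CLI to read.
REQUIRED_VARS = ("DATABRICKS_HOST", "DATABRICKS_TOKEN")


def missing_credentials(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the names of required variables that are unset or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


missing = missing_credentials()
if missing:
    print("Missing:", ", ".join(missing), "- run the 'Generate Tokens' notebook first.")
else:
    # Never print the token itself; the host is safe to show.
    print("Credentials found for", os.environ["DATABRICKS_HOST"])
```

The check only reports presence. It does not verify that the token is still valid.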


### Install CLI

Install the Databricks CLI using the following cell. This procedure removes any existing `databricks` executable found on the `PATH` and then installs the newer [Databricks CLI](https://docs.databricks.com/en/dev-tools/cli/index.html) from an install script pinned to v0.211.0. A legacy version is also distributed through **`pip`**, but we recommend following the procedure here to install the newer one.

In [0]:
%sh rm -f $(which databricks); curl -fsSL https://raw.githubusercontent.com/databricks/setup-cli/v0.211.0/install.sh | sh

Installed Databricks CLI v0.211.0 at /usr/local/bin/databricks.
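
To confirm which CLI ended up on the `PATH`, you could parse its version string. This is a rough sketch under two assumptions: that `databricks --version` prints a string containing a dotted version number, and that the newer CLI corresponds to versions 0.205 and above, which is how the Databricks documentation describes it. The helper names are hypothetical.

```python
import re
import shutil
import subprocess
from typing import Optional, Tuple


def parse_cli_version(text: str) -> Optional[Tuple[int, ...]]:
    """Extract (major, minor, patch) from text such as 'Databricks CLI v0.211.0'."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    return tuple(int(part) for part in match.groups()) if match else None


def is_new_cli(version: Tuple[int, ...]) -> bool:
    """Assumption: the newer CLI is documented as versions 0.205 and above."""
    return version >= (0, 205, 0)


if shutil.which("databricks"):
    result = subprocess.run(["databricks", "--version"], capture_output=True, text=True)
    version = parse_cli_version(result.stdout or result.stderr)
    print(version, "new CLI" if version and is_new_cli(version) else "legacy or unknown")
else:
    print("databricks executable not found on PATH")
```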


### Notebook Path Setup Continued

This code cell performs the following setup tasks:
- Retrieves the current Databricks **cluster ID** and displays it.
- Identifies the path of the currently running notebook.
- Constructs **paths to related notebooks** for checking model status, deploying the model, performing model inference, and handling cases where the model is not ready for production. These paths are printed to confirm their accuracy.


In [0]:
# Retrieve the current cluster ID
cluster_id = spark.conf.get("spark.databricks.clusterUsageTags.clusterId")
print(cluster_id)

# Get the current notebook path
notebook_path = dbutils.notebook.entry_point.getDbutils().notebook().getContext().notebookPath().get()

# Base path for related notebooks
base_path = notebook_path.rsplit('/', 1)[0] + "/2.3 Lab Pipeline - Status Check and Deployment"

# Paths for specific process notebooks
check_status_notebook_path = f"{base_path}/2.3a LAB - Checking Model Status"
print(check_status_notebook_path)

deploy_notebook_path = f"{base_path}/2.3b LAB - Model Deployment"
print(deploy_notebook_path)

inference_notebook_path = f"{base_path}/2.3c LAB - Batch Inferencing"
print(inference_notebook_path)

notready_notebook_path = f"{base_path}/2.3d LAB - Not Ready For Production"
print(notready_notebook_path)

0708-122829-du6hmtbj
/Users/labuser10859621_1751977661@vocareum.com/machine-learning-operations-2.2.4/M02 - Architecting MLOps Solutions/2.3 Lab Pipeline - Status Check and Deployment/2.3a LAB - Checking Model Status
/Users/labuser10859621_1751977661@vocareum.com/machine-learning-operations-2.2.4/M02 - Architecting MLOps Solutions/2.3 Lab Pipeline - Status Check and Deployment/2.3b LAB - Model Deployment
/Users/labuser10859621_1751977661@vocareum.com/machine-learning-operations-2.2.4/M02 - Architecting MLOps Solutions/2.3 Lab Pipeline - Status Check and Deployment/2.3c LAB - Batch Inferencing
/Users/labuser10859621_1751977661@vocareum.com/machine-learning-operations-2.2.4/M02 - Architecting MLOps Solutions/2.3 Lab Pipeline - Status Check and Deployment/2.3d LAB - Not Ready For Production



## Task 1: Identify Model as Ready for Deployment

1. Navigate to **Models** in the left sidebar.
2. Apply the **Owned by Me** filter to the list of models.
3. Locate and select the model named **churn-prediction**.
   - **Note:** The model is created by the job in **Notebook: 1.2 LAB - Creating and Managing Lakeflow Jobs using UI**, so complete that job first. If the model isn't listed, verify that your first job run has finished.
4. Review the information recorded for your model in the Unity Catalog model registry.
5. Click the pencil icon next to the alias **"baseline"** and change it to **"champion"**. Save the alias.

**_This step identifies your model as ready for deployment._**

<!-- <img src="https://s3.us-west-2.amazonaws.com/files.training.databricks.com/images/Model_Deploy_Alias_Change.png" alt="Alias Change" width="700"/> -->

![Baseline](../Includes/images/Baseline.png)

![Edit_alias](../Includes/images/Edit_alias.png)

![Champion](../Includes/images/Champion.png)
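
The same alias change can also be made programmatically. The sketch below is an assumption-laden illustration, not the lab's required method: `promote_alias` is a hypothetical helper, it expects a client exposing the `MlflowClient` alias methods (`get_model_version_by_alias`, `set_registered_model_alias`, `delete_registered_model_alias`), and the three-level model name in the usage comment is a guess at how the model is registered in your catalog and schema. A small in-memory stand-in is included so the logic can be dry-run without a workspace.

```python
from types import SimpleNamespace


def promote_alias(client, model_name: str, old_alias: str, new_alias: str) -> str:
    """Move an alias to a new name on the same model version; return that version."""
    version = client.get_model_version_by_alias(model_name, old_alias).version
    client.set_registered_model_alias(model_name, new_alias, version)
    client.delete_registered_model_alias(model_name, old_alias)
    return version


class InMemoryRegistry:
    """Stand-in for MlflowClient, for a dry run only (not a real registry)."""

    def __init__(self, aliases):
        self.aliases = dict(aliases)

    def get_model_version_by_alias(self, name, alias):
        return SimpleNamespace(version=self.aliases[alias])

    def set_registered_model_alias(self, name, alias, version):
        self.aliases[alias] = version

    def delete_registered_model_alias(self, name, alias):
        del self.aliases[alias]


registry = InMemoryRegistry({"baseline": "1"})
promote_alias(registry, "churn-prediction", "baseline", "champion")
print(registry.aliases)

# In a Databricks notebook (assumed names, adjust to your environment):
#   import mlflow
#   from mlflow import MlflowClient
#   mlflow.set_registry_uri("databricks-uc")
#   full_name = f"{DA.catalog_name}.{DA.schema_name}.churn-prediction"  # assumption
#   promote_alias(MlflowClient(), full_name, "baseline", "champion")
```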



## Task 2: Configuration of a Lakeflow Job

This code cell constructs a JSON configuration string for a Lakeflow Job. The configuration defines a series of tasks that handle the deployment and inference stages for a model called **"churn-prediction"**. The key components are:

- **General Settings**:
  - The job runs sequentially with a **maximum of one concurrent run** and is named using the current user's username with the suffix `-deploy-workflow-job`.
  - It includes **email notifications** on failure, configured to alert the user.

- **Tasks Defined**:
  - **Check Status**: Runs a notebook that checks the model and reports its status as the `model_status` task value.
  - **Production Ready**: A condition task that tests whether `model_status` equals `ready_for_production`.
  - **Deploy**: Deploys the model when **Production Ready** evaluates to true.
  - **Batch Inference**: Executes a batch inference process following successful deployment.
  - **Not Ready**: Runs when **Production Ready** evaluates to false, handling the case where the model is not ready for production.

- **Conditional Execution**:
  - Each task, except for the initial status check, declares a `depends_on` entry and only runs when the tasks it depends on succeed.
  - `Deploy` depends on the `true` outcome of `Production_Ready` and is followed by `Batch_Inference`; `Not_Ready` depends on the `false` outcome and runs instead.

- **File Operations**:
  - The constructed JSON string is written to a file named `workflow-job-lab.json` in write mode, ensuring that the entire Job configuration is saved externally for deployment purposes.

This setup is essential for automating model deployment workflows in a controlled and predictable manner, allowing for efficient scaling and maintenance of machine learning models.
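
As an alternative to writing the JSON as an f-string, where every literal brace must be doubled, the same structure can be built as a Python dictionary and serialized with `json.dumps`. The sketch below is a trimmed illustration rather than a drop-in replacement: it omits the notification and timeout fields, the helper names `notebook_task` and `build_job_config` are hypothetical, and the username, cluster ID, and paths are placeholders.

```python
import json


def notebook_task(task_key, notebook_path, cluster_id, depends_on=None):
    """Build one notebook task entry for the job's "tasks" list."""
    task = {
        "task_key": task_key,
        "existing_cluster_id": cluster_id,
        "notebook_task": {"notebook_path": notebook_path, "source": "WORKSPACE"},
        "run_if": "ALL_SUCCESS",
    }
    if depends_on:
        task["depends_on"] = depends_on
    return task


def build_job_config(username, cluster_id, paths):
    """Assemble the job definition with the same task graph as this lab."""
    condition = {
        "task_key": "Production_Ready",
        "depends_on": [{"task_key": "check_status"}],
        "condition_task": {
            # Plain double braces: no f-string escaping is needed in a dict.
            "left": "{{tasks.check_status.values.model_status}}",
            "op": "EQUAL_TO",
            "right": "ready_for_production",
        },
        "run_if": "ALL_SUCCESS",
    }
    return {
        "name": f"{username}-deploy-workflow-job",
        "max_concurrent_runs": 1,
        "email_notifications": {"on_failure": [username]},
        "tasks": [
            notebook_task("check_status", paths["check_status"], cluster_id),
            condition,
            notebook_task("Deploy", paths["deploy"], cluster_id,
                          [{"task_key": "Production_Ready", "outcome": "true"}]),
            notebook_task("Batch_Inference", paths["inference"], cluster_id,
                          [{"task_key": "Deploy"}]),
            notebook_task("Not_Ready", paths["not_ready"], cluster_id,
                          [{"task_key": "Production_Ready", "outcome": "false"}]),
        ],
        "queue": {"enabled": True},
        "run_as": {"user_name": username},
    }


# Hypothetical placeholder values, for illustration only.
example_paths = {key: f"/Workspace/example/{key}"
                 for key in ("check_status", "deploy", "inference", "not_ready")}
config_text = json.dumps(
    build_job_config("user@example.com", "0000-000000-example", example_paths), indent=2
)
print(config_text)
```

Because `json.dumps` handles quoting and the `true`/`false` literals, this approach avoids the most common mistakes in hand-written JSON templates.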


In [0]:
workflow_config = f"""
{{
  "email_notifications": {{
    "on_failure": [
      "{DA.username}"
    ]
  }},
  "format": "MULTI_TASK",
  "max_concurrent_runs": 1,
  "name": "{DA.username}-deploy-workflow-job",
  "notification_settings": {{
    "alert_on_last_attempt": false,
    "no_alert_for_canceled_runs": false,
    "no_alert_for_skipped_runs": false
  }},
  "tasks": [
    {{
      "existing_cluster_id": "{cluster_id}",
      "notebook_task": {{
        "notebook_path": "{check_status_notebook_path}",
        "source": "WORKSPACE"
      }},
      "run_if": "ALL_SUCCESS",
      "task_key": "check_status"
    }},
    {{
      "condition_task": {{
        "left": "{{{{tasks.check_status.values.model_status}}}}",
        "op": "EQUAL_TO",
        "right": "ready_for_production"
      }},
      "depends_on": [
        {{
          "task_key": "check_status"
        }}
      ],
      "email_notifications": {{}},
      "notification_settings": {{
        "alert_on_last_attempt": false,
        "no_alert_for_canceled_runs": false,
        "no_alert_for_skipped_runs": false
      }},
      "run_if": "ALL_SUCCESS",
      "task_key": "Production_Ready",
      "timeout_seconds": 0,
      "webhook_notifications": {{}}
    }},
    {{
      "depends_on": [
        {{
          "outcome": "true",
          "task_key": "Production_Ready"
        }}
      ],
      "email_notifications": {{}},
      "existing_cluster_id": "{cluster_id}",
      "notebook_task": {{
        "notebook_path": "{deploy_notebook_path}",
        "source": "WORKSPACE"
      }},
      "notification_settings": {{
        "alert_on_last_attempt": false,
        "no_alert_for_canceled_runs": false,
        "no_alert_for_skipped_runs": false
      }},
      "run_if": "ALL_SUCCESS",
      "task_key": "Deploy",
      "timeout_seconds": 0,
      "webhook_notifications": {{}}
    }},
    {{
      "depends_on": [
        {{
          "task_key": "Deploy"
        }}
      ],
      "email_notifications": {{}},
      "existing_cluster_id": "{cluster_id}",
      "notebook_task": {{
        "notebook_path": "{inference_notebook_path}",
        "source": "WORKSPACE"
      }},
      "notification_settings": {{
        "alert_on_last_attempt": false,
        "no_alert_for_canceled_runs": false,
        "no_alert_for_skipped_runs": false
      }},
      "run_if": "ALL_SUCCESS",
      "task_key": "Batch_Inference",
      "timeout_seconds": 0,
      "webhook_notifications": {{}}
    }},
    {{
      "depends_on": [
        {{
          "outcome": "false",
          "task_key": "Production_Ready"
        }}
      ],
      "email_notifications": {{}},
      "existing_cluster_id": "{cluster_id}",
      "notebook_task": {{
        "notebook_path": "{notready_notebook_path}",
        "source": "WORKSPACE"
      }},
      "notification_settings": {{
        "alert_on_last_attempt": false,
        "no_alert_for_canceled_runs": false,
        "no_alert_for_skipped_runs": false
      }},
      "run_if": "ALL_SUCCESS",
      "task_key": "Not_Ready",
      "timeout_seconds": 0,
      "webhook_notifications": {{}}
    }}
  ],
  "queue": {{
    "enabled": true
  }},
  "run_as": {{
    "user_name": "{DA.username}"
  }}
}}
"""

with open('workflow-job-lab.json', 'w') as file:
    file.write(workflow_config)
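
Before handing the file to the CLI, a quick sanity check can catch malformed JSON or a `depends_on` entry that names a task that does not exist. This is a minimal sketch with hypothetical helper names; it checks only the dependency graph, not the full Jobs schema.

```python
import json


def check_dependencies(config: dict) -> list:
    """Return one message per depends_on entry that names an undefined task."""
    tasks = config.get("tasks", [])
    defined = {task["task_key"] for task in tasks}
    problems = []
    for task in tasks:
        for dep in task.get("depends_on", []):
            if dep["task_key"] not in defined:
                problems.append(
                    f'{task["task_key"]} depends on unknown task {dep["task_key"]}'
                )
    return problems


def load_and_check(path: str) -> list:
    """Parse the file (raises json.JSONDecodeError if malformed) and check it."""
    with open(path) as file:
        return check_dependencies(json.load(file))


# In this lab you could call: load_and_check("workflow-job-lab.json")
sample = {"tasks": [{"task_key": "a"},
                    {"task_key": "b", "depends_on": [{"task_key": "c"}]}]}
print(check_dependencies(sample))  # ['b depends on unknown task c']
```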

## Task 3: Creating and Running a Lakeflow Job

This section details the process of creating and executing a Lakeflow Job using the Databricks CLI:

1. **Creating the Job**:
   - The job is created by passing the JSON configuration file `workflow-job-lab.json` to the `databricks jobs create` command. This command returns a JSON object containing details of the created job, including the `job_id`.

2. **Extracting the Job ID**:
   - The `job_id` is extracted from the JSON output using a combination of `grep` and `awk`. `grep -o` isolates the `"job_id":<number>` fragment, and `awk`, using `:` as the field separator, prints the second field, which is the ID value.

3. **Running the Job**:
   - With the `job_id` extracted, the job is initiated using `databricks jobs run-now`. This command triggers the execution of the workflow defined in the job configuration file.
   
Together, these commands create and start the job from the command line, so the status check, deployment, and batch inference tasks run without manual steps in the UI.
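
Text tools such as `grep` and `awk` work for this output, but parsing the JSON directly is less sensitive to whitespace or field order. The following sketch shows the same three steps from Python. The helper names are hypothetical; the two CLI invocations mirror the commands used in this lab, and `create_and_run` assumes the CLI is installed and authenticated.

```python
import json
import subprocess


def extract_job_id(create_output: str) -> int:
    """Parse the JSON printed by `databricks jobs create` and return its job_id."""
    return int(json.loads(create_output)["job_id"])


def create_and_run(config_file: str = "workflow-job-lab.json") -> int:
    """Create the job from the config file, start a run, and return the job_id."""
    created = subprocess.run(
        ["databricks", "jobs", "create", "--json", f"@{config_file}"],
        capture_output=True, text=True, check=True,
    )
    job_id = extract_job_id(created.stdout)
    # check=True raises CalledProcessError if the CLI exits with a non-zero status.
    subprocess.run(["databricks", "jobs", "run-now", str(job_id)], check=True)
    return job_id


# Sample `jobs create` output taken from this lab.
print(extract_job_id('{ "job_id":816473192443836 }'))  # 816473192443836
```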


In [0]:
%sh
## Create the job and capture the output
output=$(databricks jobs create --json @workflow-job-lab.json)
echo $output
## Extract the job_id from the output
job_id=$(echo $output | grep -o '"job_id":[0-9]*' | awk -F':' '{print $2}')
echo "Extracted job_id: $job_id"

## Run the job using the extracted job_id
databricks jobs run-now $job_id

{ "job_id":816473192443836 }
Extracted job_id: 816473192443836


Error: failed to reach TERMINATED or SKIPPED, got INTERNAL_ERROR: Task check_status failed with message: Workload failed, see run output for details. This caused all downstream tasks to get skipped.


## Task 4: Monitoring and Exploring the Running Lakeflow Job

To monitor your running Lakeflow Job and inspect its results, follow these steps:

1. **Access the Jobs Console**:
   - From the Databricks sidebar, navigate to the **Job Runs** section, which lists all configured jobs.

2. **Find and View the Job**:
   - Use the `job_id` printed in Task 3 or the job name `{DA.username}-deploy-workflow-job` to locate your job. Click the job name to open its details page.

3. **Explore Tasks**:
   - The job's details page displays its current status (e.g., *running*, *success*, *failure*). Click the **Tasks** tab to view the tasks of the created job, then click each task to see its details.

4. **Explore Run Outputs**:
   - Go back to the **Runs** tab and click a run. Inspect the output logs and metrics of each task to debug failures or to verify successful execution.
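
The same information can be read programmatically with the Databricks SDK for Python that the classroom setup installs. The sketch below is an assumption-based illustration: `task_states` and `latest_run_summary` are hypothetical helpers, the attribute names (`tasks`, `state`, `result_state`, `life_cycle_state`) reflect the SDK's run objects as I understand them and may differ between SDK versions, and the sample run is made-up data shaped like the failure shown in Task 3.

```python
from types import SimpleNamespace


def task_states(run) -> dict:
    """Map each task_key to its result state, or life-cycle state if no result."""
    summary = {}
    for task in run.tasks or []:
        state = task.state
        if state is None:
            continue
        value = state.result_state or state.life_cycle_state
        summary[task.task_key] = getattr(value, "value", value)  # enum -> str
    return summary


def latest_run_summary(job_id: int) -> dict:
    """Fetch the most recent run of a job and summarize its tasks."""
    from databricks.sdk import WorkspaceClient  # uses DATABRICKS_HOST / DATABRICKS_TOKEN

    workspace = WorkspaceClient()
    latest = next(iter(workspace.jobs.list_runs(job_id=job_id, limit=1)), None)
    return task_states(workspace.jobs.get_run(run_id=latest.run_id)) if latest else {}


# Made-up sample shaped like a run whose first task failed.
sample_run = SimpleNamespace(tasks=[
    SimpleNamespace(task_key="check_status",
                    state=SimpleNamespace(result_state="FAILED",
                                          life_cycle_state="INTERNAL_ERROR")),
    SimpleNamespace(task_key="Deploy",
                    state=SimpleNamespace(result_state=None,
                                          life_cycle_state="SKIPPED")),
])
print(task_states(sample_run))
```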


# Conclusion
In this lab, you successfully set up and executed a Lakeflow Job using the Databricks CLI. You configured the job to check the model status, deploy the model, perform batch inference, and handle scenarios where the model is not ready for production.


&copy; 2025 Databricks, Inc. All rights reserved. Apache, Apache Spark, Spark, the Spark Logo, Apache Iceberg, Iceberg, and the Apache Iceberg logo are trademarks of the <a href="https://www.apache.org/" target="blank">Apache Software Foundation</a>.<br/>
<br/><a href="https://databricks.com/privacy-policy" target="blank">Privacy Policy</a> | 
<a href="https://databricks.com/terms-of-use" target="blank">Terms of Use</a> | 
<a href="https://help.databricks.com/" target="blank">Support</a>