
<div style="text-align: center; line-height: 0; padding-top: 9px;">
  <img src="https://databricks.com/wp-content/uploads/2018/03/db-academy-rgb-1200px.png" alt="Databricks Learning">
</div>


# Lab: Modular Orchestration

In this lab, you'll configure a multi-task job whose three tasks each run an existing job.

## Learning Objectives
By the end of this lab, you should be able to:
* Schedule a master job that consists of sub-jobs (**Run Job** tasks)

## REQUIRED - SELECT CLASSIC COMPUTE

Before executing cells in this notebook, please select your classic compute cluster in the lab. Be aware that **Serverless** is enabled by default.

Follow these steps to select the classic compute cluster:

1. Navigate to the top-right of this notebook and click the drop-down menu to select your cluster. By default, the notebook will use **Serverless**.

1. If your cluster is available, select it and continue to the next cell. If the cluster is not shown:

  - In the drop-down, select **More**.

  - In the **Attach to an existing compute resource** pop-up, select the first drop-down. You will see a unique cluster name in that drop-down. Please select that cluster.

**NOTE:** If your cluster has terminated, you might need to restart it in order to select it. To do this:

1. Right-click on **Compute** in the left navigation pane and select *Open in new tab*.

1. Find the triangle icon to the right of your compute cluster name and click it.

1. Wait a few minutes for the cluster to start.

1. Once the cluster is running, complete the steps above to select your cluster.

## A. Classroom Setup

Run the following cell to configure your working environment for this course. It will also set your default catalog to **dbacademy** and the schema to your specific schema name shown below using the `USE` statements.
<br></br>
```
USE CATALOG dbacademy;
USE SCHEMA dbacademy.<your unique schema name>;
```

**NOTE:** The **DA** object is only used in Databricks Academy courses and is not available outside of these courses.

In [0]:
%run ./Includes/Classroom-Setup-5L

Note: you may need to restart the kernel using %restart_python or dbutils.library.restartPython() to use updated packages.


0,1
Course Catalog:,
Your Schema:,


## B. Create a Starter Job

Run the cell below to use the custom `DA` object to create the starter jobs for this lab. When the cell completes, it will have created three individual jobs named **\<your-schema>_Lesson_5_Job_1**, **\<your-schema>_Lesson_5_Job_2**, and **\<your-schema>_Lesson_5_Job_3**.

**NOTE:** The following custom method uses the Databricks SDK to programmatically create a job for demonstration purposes. You can find the method definition that uses the Databricks SDK to create the job in the [Classroom-Setup-Common]($./Includes/Classroom-Setup-Common) notebook. However, the [Databricks SDK](https://databricks-sdk-py.readthedocs.io/en/latest/) is outside the scope of this course.

In [0]:
DA.create_job_lesson05()

Created the job: labuser10356537_1748007971_Lesson_5_Job_1
Job ID: 1018178797453911
Created the job: labuser10356537_1748007971_Lesson_5_Job_2
Job ID: 1032584844568060
Created the job: labuser10356537_1748007971_Lesson_5_Job_3
Job ID: 622737496028843
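
The job names in the output above follow the pattern `<your-schema>_Lesson_5_Job_<n>`. As a minimal sketch, assuming that pattern holds for your own schema, the names to look for in the Jobs list can be generated as follows. The helper `expected_job_names` is hypothetical and is not part of the `DA` object.

```python
def expected_job_names(schema_name: str, count: int = 3) -> list:
    """Build the job names the setup cell is expected to create.

    Assumes the '<schema>_Lesson_5_Job_<n>' pattern seen in the sample output.
    """
    return [f"{schema_name}_Lesson_5_Job_{n}" for n in range(1, count + 1)]


# Schema name copied from the sample output above; yours will differ.
names = expected_job_names("labuser10356537_1748007971")
for name in names:
    print(name)
```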


## C. Using the `Run Job` Task Type
We are going to configure a job that has three "sub-jobs", where each sub-job will be a task. The setup cell we just ran created these sub-jobs for us.

To confirm this, from the fly-in menu, go to *Workflows* > *Jobs*. You will see **..._Job_1**, **..._Job_2**, and **..._Job_3**.

To set up the full job, complete the following:


### C1. Creating a Master Job and adding Run Job as a task
1. Right-click on **Workflows** in the left navigation bar, and open the link in a new tab.
2. Click **Create job** and name it **[your-schema] - Modular Orchestration Job**.
3. For the first task, configure the fields as follows:

| Setting | Instructions |
|--|--|
| Task name | Enter **Ingest_From_Source_1** |
| Type | Choose **Run Job** |
| Job | Start typing "job_1". You should see a job named **[your-schema]_Lesson_5_Job_1**. Select this job. |

4. Click **Create task**

![Lesson05_RunJob1](files/images/deploy-workloads-with-databricks-workflows-2.0.2/Lesson05_RunJob1.png)
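
For orientation, the task configured in the steps above can also be written as a job definition. The sketch below only builds and prints such a definition as a Python dictionary; it does not call Databricks. The field names (`task_key`, `run_job_task`, `job_id`) are assumed from the public Jobs API reference, the helper `run_job_task` is hypothetical, and the job ID is a placeholder that you would replace with the ID printed for your own `..._Job_1`.

```python
import json


def run_job_task(task_key: str, job_id: int) -> dict:
    """Describe one task of type 'Run Job' that triggers an existing job."""
    return {"task_key": task_key, "run_job_task": {"job_id": job_id}}


JOB_1_ID = 111  # placeholder, not a real job ID

master_job = {
    "name": "your-schema - Modular Orchestration Job",
    "tasks": [run_job_task("Ingest_From_Source_1", JOB_1_ID)],
}

print(json.dumps(master_job, indent=2))
```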


### C2. Adding another Run Job as a task
Now configure the second task in the same way as the first. The job for this task has already been created as **[your-schema]_Lesson_5_Job_2**.
1. Complete the fields as follows:

| Setting | Instructions |
|--|--|
| Task name | Enter **Ingest_From_Source_2** |
| Type | Choose **Run Job** |
| Job | Start typing "job_2". You should see a job named **[your-schema]_Lesson_5_Job_2**. Select this job. |
| Depends on | Click the "x" to remove **Ingest_From_Source_1** from the list, so that this task does not wait for the first one. |

<br>

2. Click **Create task**

![Lesson05_RunJob2](files/images/deploy-workloads-with-databricks-workflows-2.0.2/Lesson05_RunJob2.png)
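
Removing the dependency is what lets both ingest tasks start at the same time. The sketch below models this with plain dictionaries. It assumes the `depends_on` field name from the Jobs API reference and assumes, as the instructions above imply, that the UI pre-fills the previous task as a dependency. The helper `root_tasks` and the job IDs are hypothetical.

```python
def root_tasks(tasks: list) -> list:
    """Return the keys of tasks with no upstream dependency.

    These are the tasks that can start as soon as the job run begins.
    """
    return [t["task_key"] for t in tasks if not t.get("depends_on")]


tasks = [
    {"task_key": "Ingest_From_Source_1", "run_job_task": {"job_id": 111}},
    {
        "task_key": "Ingest_From_Source_2",
        "run_job_task": {"job_id": 222},
        # Assumed UI default: the new task depends on the previous one.
        "depends_on": [{"task_key": "Ingest_From_Source_1"}],
    },
]

before = root_tasks(tasks)

# Equivalent of clicking the "x" next to Ingest_From_Source_1.
tasks[1].pop("depends_on")
after = root_tasks(tasks)

print("Can start immediately (before):", before)
print("Can start immediately (after): ", after)
```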

### C3. Adding a Dependent Run Job as a task
In our scenario, the two tasks configured so far run jobs that ingest data from two different sources (these example jobs do not actually ingest any data). We will now configure a third task that runs a different job, designed to perform data cleaning:
1. Complete the fields as follows:

| Setting | Instructions |
|--|--|
| Task name | Enter **Cleaning_Data** |
| Type | Choose **Run Job** |
| Job | Start typing "job_3". You should see a job named **[your-schema]_Lesson_5_Job_3**. Select this job. |
| Depends on | Click inside the field and select both **`Ingest_From_Source_1`** and **`Ingest_From_Source_2`** to add them to the list. |
| Dependencies | Verify that "All succeeded" is selected. |

<br>

2. Click **`Create task`**


![Lesson05_RunJob3](files/images/deploy-workloads-with-databricks-workflows-2.0.2/Lesson05_RunJob3.png)
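
The resulting task graph has two stages: both ingest tasks first, then the cleaning task. The sketch below reproduces that ordering with Python's standard `graphlib` module. It is a simplified model for illustration, not how Databricks schedules tasks internally. The functions `execution_stages` and `should_run` are hypothetical, and `should_run` only approximates the "All succeeded" setting.

```python
from graphlib import TopologicalSorter

# Each task maps to the set of tasks it depends on.
depends_on = {
    "Ingest_From_Source_1": set(),
    "Ingest_From_Source_2": set(),
    "Cleaning_Data": {"Ingest_From_Source_1", "Ingest_From_Source_2"},
}


def execution_stages(graph: dict) -> list:
    """Group tasks into stages; tasks within a stage can run in parallel."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()
    stages = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        stages.append(ready)
        sorter.done(*ready)
    return stages


def should_run(task: str, results: dict) -> bool:
    """Simplified 'All succeeded': every upstream task must report SUCCESS."""
    return all(results.get(up) == "SUCCESS" for up in depends_on[task])


stages = execution_stages(depends_on)
print(stages)
# [['Ingest_From_Source_1', 'Ingest_From_Source_2'], ['Cleaning_Data']]

print(should_run("Cleaning_Data", {
    "Ingest_From_Source_1": "SUCCESS",
    "Ingest_From_Source_2": "FAILED",
}))
# False
```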

## D. Job Parameters
In a previous lesson, we configured "Task parameters" that passed key/value data to individual tasks. In the job we are currently configuring, we want to pass key/value data to *all* tasks. We can use "Job parameters" to perform this action.



1. On the right side of the job configuration page, find the section called **`Job parameters`**, and click **`Edit parameters`**.
1. Add a parameter as follows:
  * Key: **test_value** --- Value: **Succeed**
3. Click **`Save`**.

![Lesson05_MasterJob_1](files/images/deploy-workloads-with-databricks-workflows-2.0.2/Lesson05_MasterJob_1.png)
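
In a job definition, the parameter added in the steps above sits at the job level rather than inside any single task. The sketch below shows one way to express it, assuming the `parameters` field with `name` and `default` keys described in the Jobs API reference. The helper `to_job_parameters` is hypothetical.

```python
import json


def to_job_parameters(values: dict) -> list:
    """Convert plain key/value pairs into job-level parameter definitions."""
    return [{"name": key, "default": value} for key, value in values.items()]


job_settings = {
    "name": "your-schema - Modular Orchestration Job",
    # Defined once on the job, not once per task.
    "parameters": to_job_parameters({"test_value": "Succeed"}),
}

print(json.dumps(job_settings, indent=2))
```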

## E. Run the Job
Click **`Run now`** to run the job
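
The same run can in principle be triggered programmatically. The sketch below only builds the request body and does not send it. It assumes the `job_id` and `job_parameters` fields of the Jobs API `run-now` request; the helper `run_now_request` and the job ID are placeholders for illustration.

```python
import json
from typing import Optional


def run_now_request(job_id: int, overrides: Optional[dict] = None) -> dict:
    """Build a run-now request body, optionally overriding job parameters."""
    body = {"job_id": job_id}
    if overrides:
        body["job_parameters"] = dict(overrides)
    return body


MASTER_JOB_ID = 999  # placeholder, use the ID of your own master job

default_run = run_now_request(MASTER_JOB_ID)
custom_run = run_now_request(MASTER_JOB_ID, {"test_value": "Succeed"})

print(json.dumps(default_run))
print(json.dumps(custom_run))
```

Omitting the overrides leaves the job to use the default value saved in the **Job parameters** section.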


&copy; 2025 Databricks, Inc. All rights reserved.<br/>
Apache, Apache Spark, Spark, the Spark Logo, Apache Iceberg, Iceberg, and the Apache Iceberg logo are trademarks of the <a href="https://www.apache.org/">Apache Software Foundation</a>.<br/>
<br/><a href="https://databricks.com/privacy-policy">Privacy Policy</a> | 
<a href="https://databricks.com/terms-of-use">Terms of Use</a> | 
<a href="https://help.databricks.com/">Support</a>