
<div style="text-align: center; line-height: 0; padding-top: 9px;">
  <img
    src="https://databricks.com/wp-content/uploads/2018/03/db-academy-rgb-1200px.png"
    alt="Databricks Learning"
  >
</div>




# Loops


In this lesson you:<br>

- Explore for-loops, a new way to change control flow
- Use for-loops to filter lists

## REQUIRED - SELECT CLASSIC COMPUTE
Before executing cells in this notebook, please select your classic compute cluster in the lab. Be aware that **Serverless** is enabled by default.
Follow these steps to select the classic compute cluster:
1. Navigate to the top-right of this notebook and click the drop-down menu to select your cluster. By default, the notebook will use **Serverless**.
1. If your cluster is available, select it and continue to the next cell. If the cluster is not shown:
    - In the drop-down, select **More**.
    - In the **Attach to an existing compute resource** pop-up, select the first drop-down. You will see a unique cluster name in that drop-down. Please select that cluster.
**NOTE:** If your cluster has terminated, you might need to restart it in order to select it. To do this:
1. Right-click on **Compute** in the left navigation pane and select *Open in new tab*.
1. Find the triangle icon to the right of your compute cluster name and click it.
1. Wait a few minutes for the cluster to start.
1. Once the cluster is running, complete the steps above to select your cluster.



### For-loops

Loops are a way to repeat a block of code while iterating over a sequence.

We write a [**for-loop**](https://www.w3schools.com/python/python_for_loops.asp) like this:
```
for var_name in list:
    code_block
```

Python will execute the code in the **`code_block`** once per item in the list. Each time, it will assign **`var_name`** to be the next item in the list, starting at the beginning.



In the example below, **`var_name`** is **`number`**. Each time the code loops, **`number`** is set to the next item in the list.

Let's break this down:

Step 1. **`number`** is set to the first element in the list, so **`number = 0`**, and the loop prints **`0`**.

Step 2. **`number`** is set to the next element in the list, so **`number = 1`**, and the loop prints **`1`**.

Step 3. **`number`** is set to the next element in the list again, so **`number = 2`**, and the loop prints **`2`**.

Using this, we can have the code in the **`for`** loop act on every item in the list.

In [0]:
for number in [0, 1, 2]:
    print(number)
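
As a further illustration, the loop body can do more than print. The sketch below keeps a running total while looping over a list. The names **`prices`**, **`price`**, and **`total`** are made up for this example and are not part of the lesson's data.

```python
# Hypothetical list of prices, used only for illustration.
prices = [3, 5, 2]
total = 0

for price in prices:
    # price is 3 on the first pass, then 5, then 2
    total = total + price

print(total)  # 10
```

The variable **`total`** is created before the loop so that each iteration can update it.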



If you want to execute a code block many times but don't need to iterate over a list, you can instead use [**range()**](https://www.w3schools.com/python/ref_func_range.asp).

**`range()`** takes a start and a stop value (the stop value is exclusive, not inclusive). By default, it increments by one, starting at start and ending at stop - 1. So **`range(0, 4)`** would iterate over the values: 0, 1, 2, 3.
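
To see exactly which values a range produces, you can wrap it in **`list()`**. The short sketch below also shows two variations that this lesson does not otherwise use: calling **`range()`** with only a stop value, and passing an optional third argument as the step. The variable names are chosen just for this example.

```python
# Start is inclusive, stop is exclusive.
default_step = list(range(0, 4))
print(default_step)  # [0, 1, 2, 3]

# With a single argument, the start defaults to 0.
only_stop = list(range(4))
print(only_stop)  # [0, 1, 2, 3]

# An optional third argument sets the step size.
every_other = list(range(0, 10, 2))
print(every_other)  # [0, 2, 4, 6, 8]
```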

For example, let's print **`"Hello!"`** 10 times.

In [0]:
for element in range(0, 10):
    print("Hello!")



On each iteration, **`element`** is assigned the next number in the range. The previous cell never used it, but we can print it to see its value each time.

In [0]:
for element in range(0, 10):
    print(element)


Question: How can we change the code to print 1-10, not 0-9?



You can use loops to filter a list. For example, let's say we wanted to filter a list of numbers to only keep the numbers greater than 4. 

We can accomplish this by creating an empty new list, looping over our list of numbers, and adding numbers to the empty list if they are greater than 4.

In [0]:
numbers_list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
final_list = []

for element in numbers_list:
    if element > 4:
        final_list.append(element)
        
final_list


Creating a new list from filtering one with a loop is a common enough problem that Python actually provides a very useful shortcut. 

The shortcut is called [list comprehension](https://www.w3schools.com/python/python_lists_comprehension.asp) and it looks like this: **`[var_name for var_name in list if (boolean condition)]`**

Let's use this shortcut to do the exact same thing we did in the previous cell.

In [0]:
final_list_shortcut = [element for element in numbers_list if element > 4]
final_list_shortcut


You can read this left to right as including **`element`** in the final list for each **`element`** in **`numbers_list`** if the **`element`** is greater than 4.

The first part of the comprehension does not have to be the element itself. For example, instead of including every **`element`** in **`numbers_list`** that is greater than 4, we might want to include **`2 * element`** for every **`element`** in **`numbers_list`** that is greater than 4. Let's look at the code below.

In [0]:
doubled_list = [2 * element for element in numbers_list if element > 4]
doubled_list
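
To connect the shortcut back to the for-loop version, here is a sketch that builds the same doubled list both ways and compares the results. The names **`doubled_with_loop`** and **`doubled_with_comprehension`** are introduced only for this comparison.

```python
numbers_list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# Long form: an empty list, a loop, a condition, and an append.
doubled_with_loop = []
for element in numbers_list:
    if element > 4:
        doubled_with_loop.append(2 * element)

# Short form: the same three pieces written on one line.
doubled_with_comprehension = [2 * element for element in numbers_list if element > 4]

print(doubled_with_loop)  # [10, 12, 14, 16, 18, 20]
print(doubled_with_loop == doubled_with_comprehension)  # True
```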


The boolean expression is actually optional. Let's double every element in the list.

In [0]:
[2 * element for element in numbers_list]


### `break`

If you want to exit a loop early, before the loop has finished iterating over its sequence, you can use **`break`**.

**`break`** is written on its own line inside a loop code block. When Python executes that line, it exits the loop code block and stops iterating over the list.

Let's use this to stop iterating over **`numbers_list`** once we reach the number 4. Because **`break`** runs before **`print`**, only 1, 2, and 3 are printed.

In [0]:
for element in numbers_list:
    if element == 4:
        break
    print(element)
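
One common use of **`break`** is to stop searching as soon as a match is found. The sketch below is one possible illustration: it looks for the first multiple of 3 in the list. The names **`first_multiple_of_three`** and **`checked`** are made up for this example.

```python
numbers_list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

first_multiple_of_three = None  # stays None if no match is found
checked = 0                     # counts how many items the loop looked at

for element in numbers_list:
    checked = checked + 1
    if element % 3 == 0:
        first_multiple_of_three = element
        break  # stop here; the remaining items are never visited

print(first_multiple_of_three)  # 3
print(checked)                  # 3
```

Only three items are checked even though the list has ten, because the loop ends at the first match.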


### `continue`

**`break`** exits the loop code block when it is executed and stops iterating over the list. If you instead want to skip the rest of the loop code block for the current item, but still move on to the next item in the sequence, use **`continue`**.

**`continue`** is also written on its own line. When it is executed, Python stops executing the loop code block for the current item and then continues to iterate over the sequence.

In [0]:
for element in numbers_list:
    if element == 4:
        continue # 4 is not printed, but the numbers after are
    print(element)
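
**`continue`** can skip more than one item. As a rough sketch, the loop below skips every even number and keeps only the odd ones. The name **`odd_numbers`** is introduced just for this example.

```python
numbers_list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
odd_numbers = []

for element in numbers_list:
    if element % 2 == 0:
        continue  # skip the append below for even numbers
    odd_numbers.append(element)

print(odd_numbers)  # [1, 3, 5, 7, 9]
```

Unlike **`break`**, the loop still visits all ten items; it only skips the rest of the code block for the even ones.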


### While-loops

In addition to for-loops, there is another kind of loop called a while-loop.

We write a [**while-loop**](https://www.w3schools.com/python/python_while_loops.asp) like this:
```
while boolean_expression:
    code_block
```

Python will loop and execute the code in the **`code_block`** until the boolean expression evaluates to **`False`**. Before each iteration, it reevaluates the boolean expression: if it is **`True`**, it executes the code again; otherwise, it exits the loop.

**NOTE:** You need to be careful not to have infinite loops here. If the boolean expression never evaluates to **`False`**, this code will keep running and never stop.

In [0]:
count = 10

while count > 0:
    print(count)
    count = count - 1
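
While-loops are useful when you do not know in advance how many iterations you need. The sketch below doubles a value until it reaches at least 100, and it also shows one possible safeguard against an infinite loop: a cap on the number of iterations. The names **`value`**, **`iterations`**, and **`max_iterations`** (and the limit of 1000) are assumptions made for this example.

```python
value = 1
iterations = 0
max_iterations = 1000  # hypothetical safety limit, chosen arbitrarily

while value < 100:
    value = value * 2
    iterations = iterations + 1
    if iterations >= max_iterations:
        break  # give up instead of looping forever

print(value)       # 128
print(iterations)  # 7
```

Here the condition becomes **`False`** long before the cap is reached, so the **`break`** never runs. It is only there in case the loop condition is written incorrectly.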

### More fun with repetition

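Python also supports nested loops: a loop placed inside another loop's code block. The inner loop runs from start to finish on every iteration of the outer loop.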

In [0]:
for a in range(0, 10):
    for b in range(0, 10):
        for c in range(0, 10):
            print(str(a) + str(b) + str(c))
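
The cell above prints 10 × 10 × 10 = 1,000 lines, from 000 to 999. To make the order of iteration easier to follow, here is a smaller sketch with two loops and shorter ranges. The name **`pairs`** is introduced only for this example.

```python
pairs = []

for a in range(0, 2):        # outer loop: a is 0, then 1
    for b in range(0, 3):    # inner loop restarts for each value of a
        pairs.append(str(a) + str(b))

print(pairs)       # ['00', '01', '02', '10', '11', '12']
print(len(pairs))  # 6
```

The inner variable **`b`** cycles through all of its values before **`a`** moves on, so the total number of iterations is 2 × 3 = 6.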

If it is not already apparent, the general beauty of a loop is that it decouples the amount of code you write from the size of the data being processed.
- You can process a list or range of 50,000,000 elements with no more code than it takes to process a list or range of 5 elements.
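
As a small sketch of this idea, the two loops below are identical except for the size of the range. The names **`total_small`** and **`total_large`** are made up for this example, and 50,000 is used instead of 50,000,000 only so the cell finishes quickly.

```python
# Sum the numbers 0 through 4.
total_small = 0
for n in range(0, 5):
    total_small = total_small + n

# Sum the numbers 0 through 49,999 with exactly the same loop body.
total_large = 0
for n in range(0, 50000):
    total_large = total_large + n

print(total_small)  # 10
print(total_large)  # 1249975000
```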

&copy; 2025 Databricks, Inc. All rights reserved. Apache, Apache Spark, Spark, the Spark Logo, Apache Iceberg, Iceberg, and the Apache Iceberg logo are trademarks of the <a href="https://www.apache.org/" target="_blank">Apache Software Foundation</a>.<br/><br/><a href="https://databricks.com/privacy-policy" target="_blank">Privacy Policy</a> | <a href="https://databricks.com/terms-of-use" target="_blank">Terms of Use</a> | <a href="https://help.databricks.com/" target="_blank">Support</a>