# Loops

<div class="alert alert-warning" fade in>

**In this notebook you will learn how to use a loop to calculate the average of a list of numbers.**
    
</div>

## Looping through a list of numbers

To help you understand the usefulness of lists, let's find the average diameter of the types of white blood cells found in human blood. The diameters are given in this table.

<div>
<img src="attachment:WhiteBloodCells.png" title="Blausen.com staff (2014). Medical gallery of Blausen Medical 2014. WikiJournal of Medicine 1 (2). DOI:10.15347/wjm/2014.010. ISSN 2002-4436. - Own work"/>
</div>

Cell type | Diameter ($\mu\text{m}$)
:---|:---:
Neutrophil | 11
Eosinophil | 11
Basophil | 13.5
Small lymphocyte | 7.5
Large lymphocyte | 13.5
Monocyte | 22.5

We can store all the diameters in a single list variable like so:
```python
diameters = [11, 11, 13.5, 7.5, 13.5, 22.5]
```

Using this list let's find the average diameter and assign it to a variable called `average_diameter`.

The following code shows one way - but not the best way - of doing this.

The average is the sum of all the values, which we access using each item's index within the list, divided by the number of items in the list, which is six. Remember that `diameters[0]` refers to the first item in the list, in this case 11.

<div class="alert alert-info">

1. Look at the code below and see if you can understand what it is trying to do.
2. Run the code.
</div>

In [1]:
diameters = [11, 11, 13.5, 7.5, 13.5, 22.5]

# Calculate the average diameter: sum the values and divide by the number of values.
average_diameter = (diameters[0] + diameters[1] + diameters[2] + diameters[3] + diameters[4] + diameters[5]) / 6

print( f'Average diameter = {average_diameter:.2f} micrometers' )

Average diameter = 13.17 micrometers


Calculating the average diameter like this is tedious and inefficient. It is also not reusable: if we have a different set of data with a different number of values, we would have to write the whole code over again.

What we really want is code that will take a list of numbers **of any size** and calculate the average of those numbers. That's where loops come in handy.

## How to calculate an average by looping

To calculate an average of a list of numbers we need two things: 
1. the number of values in the list and 
2. the sum of all the values in the list.

Step 1 is simple. The number of values, or items, in a list is its length which is obtained with the `len()` function described in the last notebook. Something like:

```python
n = len(diameters)

```
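As a quick illustration, `len()` gives the number of items whatever the size of the list. The second list here, `other_values`, is made up for this sketch and is not part of the white blood cell data.

```python
diameters = [11, 11, 13.5, 7.5, 13.5, 22.5]
other_values = [2, 4, 6]  # Hypothetical list, for illustration only.

# len() counts the items in each list.
n = len(diameters)
m = len(other_values)

print( f'diameters has {n} items' )
print( f'other_values has {m} items' )
```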

Next let's see how looping can help solve Step 2: the sum of all the values in the list.

Before we do something as complicated as that, let's first write some code to loop through the list `diameters` and print out each of its values.

<div class="alert alert-info">

1. Look at the following code and see if you can predict what it will do.
2. Run the code and see if your prediction is correct.

</div>

In [2]:
diameters = [11, 11, 13.5, 7.5, 13.5, 22.5]

# Loop through diameters and print each item.

for d in diameters:
    print(d)

11
11
13.5
7.5
13.5
22.5


The loop starts at the line
```python
for d in diameters:

```
The colon at the end of this line is important; it tells Python that what follows is within the loop.

The line
```python
    print(d)

```
is **indented**. This tells Python that it is within the loop. If we remove the indentation (try it) we'll get an `IndentationError` because Python is expecting to find indented code just after we write a `for` loop. 

The code works like this. The first item in the list variable `diameters` is the number 11. The line of code `for d in diameters:` assigns 11 to the variable called `d`, so the value of `d` is 11. We then print the value of `d`, i.e., we print 11.

The loop moves on to the second item in `diameters`, which is the number 11. The variable `d` is assigned the value 11 and then printed.

The loop moves on to the third item in `diameters`, which is 13.5. The variable `d` is assigned the value 13.5 and then printed.

And so on until the end of `diameters` is reached and the last item, which is 22.5, is printed. At that point the code exits the loop and the program is finished.
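One way to watch this happening is to number each pass through the loop. The sketch below is only an illustration: the counter variable `iteration` is not part of the original example and is introduced here purely to label the output.

```python
diameters = [11, 11, 13.5, 7.5, 13.5, 22.5]

# Hypothetical counter used only to label each pass through the loop.
iteration = 0

for d in diameters:
    # Add one to the counter, then show which value d holds on this pass.
    iteration = iteration + 1
    print( f'Iteration {iteration}: d = {d}' )

# This line is not indented, so it runs once, after the loop has finished.
print( f'The loop finished after {iteration} iterations' )
```

After the loop ends, `d` still holds the last value it was assigned, which is 22.5.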

## Indentation matters: IndentationError

Python requires us to indent statements within a loop. Indentation makes code easier to read.

If we forget to indent code then Python complains with an `IndentationError`.

<div class="alert alert-info">


In the code above, un-indent the `print(d)` statement so the code looks like this:

```python
for d in diameters:
print(d)
```
and run it.
</div>

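We get an `IndentationError: expected an indented block`. Recent versions of Python extend the message to say which statement the indented block was expected after.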
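If you would like to see the error without breaking your own code cell, one possible approach is to store the un-indented loop as text and ask Python to check it with the built-in `compile()` function. This is only a sketch: the names `bad_code` and `error_name` are made up for this example, and `try`/`except` has not been covered in these notebooks yet.

```python
# The un-indented loop, stored as text so that this cell itself still runs.
bad_code = "for d in diameters:\nprint(d)\n"

error_name = None
try:
    # compile() checks the syntax of the text without running it.
    compile(bad_code, "<example>", "exec")
except IndentationError as err:
    error_name = type(err).__name__
    print( f'{error_name}: {err.msg}' )
```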

There's a quick way to indent and un-indent code. 
1. First select the lines of code you want to indent or un-indent, either with your mouse or using **Shift-down arrow**.
2. Press **Ctrl-]** to indent the selected lines or **Ctrl-[** to un-indent the selected lines.

## The iterating variable

The variable `d` in `for d in diameters:` has a special name because it forms part of the `for` loop. It is called an **iterating variable**.

Many newcomers to Python believe that the name of the iterating variable influences the function of the code. This is not the case. The name of the iterating variable can be any valid variable name. It is usually chosen to indicate what the variable contains. 

<div class="alert alert-info">

1. Look at the following code and see if you can predict what it will do.
2. Run the code and see if your prediction is correct.

</div>

In [3]:
diameters = [11, 11, 13.5, 7.5, 13.5, 22.5]

for d in diameters:
    print(d)
    
print() # Print a blank line.

for t_shirt in diameters:
    print(t_shirt)

11
11
13.5
7.5
13.5
22.5

11
11
13.5
7.5
13.5
22.5


Both loops are equivalent and do exactly the same thing. 

It is best to name the iterating variable something meaningful. The name `d` is meaningful in this loop, whereas `t_shirt` is clearly not meaningful.

<div class="alert alert-info">

Experiment with the above examples and you will see that any valid variable name that is not a reserved word can be used as an iterating variable, as long as you change the variable within the loop to match.
</div>
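For comparison, here is a sketch of the same loop with a longer, self-describing name. The name `diameter` is just one reasonable choice; the loop behaves exactly as before.

```python
diameters = [11, 11, 13.5, 7.5, 13.5, 22.5]

# The singular name 'diameter' says that each item is one diameter from the list.
for diameter in diameters:
    print( f'{diameter} micrometers' )
```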

## Sum all the values in a list

Now let's see how to sum all the values in the list `diameters`.

Let's plan how we might do this by writing down something called an algorithm. An algorithm is a step-by-step procedure for solving a certain task.

1. We need a variable that keeps a running sum of the diameters. This variable will initially be set to zero.
2. Loop through the list one diameter at a time.
3. Add the current diameter to the running sum.

The following code shows how this is implemented.

<div class="alert alert-info">

1. Look at the following code and see if you can predict what it will do.
2. Run the code and see if your prediction is correct.

</div>

In [4]:
diameters = [11, 11, 13.5, 7.5, 13.5, 22.5]

# Initialise the running sum of diameters to zero.
sum_of_diameters = 0

# Loop through the list one diameter at a time.
for d in diameters:

    # Add the current diameter to the running sum of diameters.
    sum_of_diameters += d

    # Print out the running sum to see how it grows.
    print( f'This diameter is {d} so the running sum of diameters = {sum_of_diameters}' )

print( f'The final sum of diameters is {sum_of_diameters} micrometers' )

This diameter is 11 so the running sum of diameters = 11
This diameter is 11 so the running sum of diameters = 22
This diameter is 13.5 so the running sum of diameters = 35.5
This diameter is 7.5 so the running sum of diameters = 43.0
This diameter is 13.5 so the running sum of diameters = 56.5
This diameter is 22.5 so the running sum of diameters = 79.0
The final sum of diameters is 79.0 micrometers


The first thing we need is a variable to store the sum of the diameters as we loop through the list one item at a time. Let's call this variable `sum_of_diameters` so that it's clear what this variable contains. 

We have to initialise its value to zero:
```python
sum_of_diameters = 0
```

The loop starts at the line `for d in diameters:`. The colon at the end of the line tells Python that anything indented below it is within the loop.

The iterating variable `d` is assigned the value of each item in the list one at a time. On the first iteration of the loop `d` is assigned the value 11.

The running sum `sum_of_diameters` is incremented by the value of `d`. Notice that the line `sum_of_diameters += d` is indented. This tells Python that it is within the loop.

We print out the values of `d` and `sum_of_diameters` to show what is happening in the loop. Normally we wouldn't do this.

As there are six items in the list, the loop will execute six times, each time incrementing `sum_of_diameters` by the current value of `d`.

Once the loop has finished, the code drops out of the bottom of the loop (to the first non-indented line below the loop) and prints the final sum using an f-string.
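As a sanity check, the running sum from the loop can be compared with Python's built-in `sum()` function. This is a swapped-in shortcut used here only to confirm the result; it is not the looping technique this notebook is teaching. The name `builtin_total` is made up for this sketch.

```python
diameters = [11, 11, 13.5, 7.5, 13.5, 22.5]

# Sum the diameters with a loop, as above.
sum_of_diameters = 0
for d in diameters:
    sum_of_diameters += d

# Sum the same list with the built-in function (hypothetical variable name).
builtin_total = sum(diameters)

print( f'Loop sum = {sum_of_diameters}, built-in sum = {builtin_total}' )
```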

## Average value of items in a list

We're almost there. We have the number of values in the list `diameters` and we've summed them. We now need to put it all together into a small program to calculate the average.

<div class="alert alert-info">

Look at the following code to make sure you understand what it does. Then run it.
    
The `print()` statement within the loop has been removed as it is not needed.
</div>

In [5]:
diameters = [11, 11, 13.5, 7.5, 13.5, 22.5]

# The number of items in the list diameters.
n = len( diameters )

# Initialise the running sum of diameters to zero.
sum_of_diameters = 0

# Loop through the list summing the diameters one at a time.
for d in diameters:
    sum_of_diameters += d

# Calculate the average diameter.
average_diameter = sum_of_diameters / n

print( f'There are {n} items in the list' )
print( f'The sum of diameters is {sum_of_diameters} micrometers' )
print( f'The average diameter of white blood cell types is {average_diameter:.2f} micrometers' )

There are 6 items in the list
The sum of diameters is 79.0 micrometers
The average diameter of white blood cell types is 13.17 micrometers


This code produces exactly the same average as the code at the top of the notebook. It looks more complicated - it is - but it has two important advantages: it is general and it is reusable.

It is general because we can calculate the average of **any** list of numbers however long it may be. It is reusable because we don't need to edit the code each time we want to calculate the average of a list of numbers.
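One way to make the reuse concrete is to wrap the loop in a function. The sketch below assumes you are happy to meet `def` a little early: the function name `average` and the second list `other_values` are made up for illustration and do not come from the data above.

```python
def average(values):
    """Return the average of a non-empty list of numbers (hypothetical helper)."""
    total = 0
    for v in values:
        total += v
    return total / len(values)

diameters = [11, 11, 13.5, 7.5, 13.5, 22.5]
other_values = [2, 4, 6, 8]  # Made-up data with a different length.

print( f'Average diameter = {average(diameters):.2f} micrometers' )
print( f'Average of other_values = {average(other_values):.2f}' )
```

The same code handles both lists without being edited. Note that an empty list would cause a division by zero, so this sketch assumes the list has at least one item.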

## Exercise Notebook

[Loops](Exercises/2.2%20-%20Loops.ipynb)