**Python Supplement: Loops. V.1** Author: B. Hogan. Date last edited: 10/25/18

In [116]:
import datetime
print("Date last edited:", datetime.datetime.today().strftime("%x"))

Date last edited: 10/25/18


# Loops: The Missing Lecture. 

This lecture will cover a great deal of information about loops in patient detail. It is recommended to run the cells in order as you read along, but each one should be able to run independently of the others. Much of this has already been covered implicitly or explicitly, but it is here to be a little more explicit. 

The purpose of using loops is to do something in particular to each item in a collection of items. If you have multiple items in a collection, you will often want to operate on all of the items. This is typically accomplished by going over the items one-by-one. We call this process 'looping'. As a verb, we _loop_ over a collection; as a noun, 'the loop' refers to the set of iterations. So we could _loop_ over a collection of items, but if we stop halfway through because of an error we have just broken the _loop_.   

Loops are a foundational, crucial part of programming. In fact, much of programming really comes down to iterating through collections of things and doing something with each element under some condition. Often we put layers of abstraction on top of this process, but this process is pretty much the universal one. In fact, British mathematician Alan Turing demonstrated that, loosely speaking, any mechanism by which we can take a list, do something for some of the cases and remember where we are can do any computation. Formally this is referred to as the [universal computing machine](https://en.wikipedia.org/wiki/Universal_Turing_machine), but it is often called a Turing machine after its creator/discoverer. 

## Iterating using a ```for``` loop. 
Sometimes the collection is ordered, such as a ```list```, and sometimes the collection is unordered, such as a ```set``` or a ```dictionary```. Regardless, you are able to perform some action with each element of the collection and know that once you've finished you've iterated through every element of the collection. The process of going through the items is called **iterating**. 

The way to iterate through a collection in Python is to use a "for" loop. It is called a "for" loop because we use it to do something for each element in a collection. In fact, the way it is written is meant to be similar to written English. 

Since we iterate through the collection, it is convention to use the letter ```i``` to stand in as a variable representing each element of the collection as we operate on them one at a time. In code you will typically see a pattern like the following: 
``` python
for i in range(n):
    print(i)
```

In this case, we use a function called ```range```. It produces the numbers from ```0``` up to but not including ```n```. To do something one hundred times you would write ```range(100)``` and it would return an object that operates like a list with a hundred elements, ```[0,1,2...98,99]```.
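
A quick way to check what ```range``` produces is to convert it to a list. The sketch below is only illustrative; the names ```first_five``` and ```hundred``` are made up for this example.

```python
# range(n) starts at 0 and stops just before n.
first_five = list(range(5))
print(first_five)               # [0, 1, 2, 3, 4]

# range(100) has one hundred elements: 0 through 99.
hundred = range(100)
print(len(hundred))             # 100
print(hundred[0], hundred[-1])  # 0 99

# range also accepts a start and a step: range(start, stop, step).
print(list(range(0, 100, 25)))  # [0, 25, 50, 75]
```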

Python is a language that uses the spacing of the characters (the indentation at the start of a line) to represent semantics. In this case, we have the idea of being 'inside' something or 'outside'. Being 'inside' a ```for``` loop means that we are doing some action for each element of the collection. So for each element in the range zero to ninety-nine we would print the number. 
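
To make the 'inside' versus 'outside' distinction concrete, here is a small sketch. The list ```seen``` is a hypothetical name, used only to record what the loop did.

```python
seen = []  # hypothetical list, used only to record each step

for i in range(3):
    # These indented lines are 'inside' the loop: they run once per element.
    seen.append(i)
    print("inside the loop, i =", i)

# This line is not indented, so it is 'outside' the loop.
# It runs once, after the loop has finished.
print("outside the loop, seen =", seen)
```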

If we want to do something for some items in the collection but not others, we would use an ```if``` statement. This similarly uses spacing to denote what to do in that condition. So if we only want to print multiples of five we can use the if statement to do this:

``` python
for i in range(100):
    if i%5==0:
        print(i)
```
In this case the %5 means modulus 5 or "get the remainder of dividing by 5". What is the remainder of dividing by five? Just have a look by running the code below: 

In [3]:
print("i","i//5","i%5",sep="\t") 
# "sep" means separate the elements using the
# "\t" which in this case is a tab character. 

for i in range(100):    
    print(i, i//5, i%5, sep="\t")

i	i//5	i%5
0	0	0
1	0	1
2	0	2
3	0	3
4	0	4
5	1	0
6	1	1
7	1	2
8	1	3
9	1	4
10	2	0
11	2	1
12	2	2
13	2	3
14	2	4
15	3	0
16	3	1
17	3	2
18	3	3
19	3	4
20	4	0
21	4	1
22	4	2
23	4	3
24	4	4
25	5	0
26	5	1
27	5	2
28	5	3
29	5	4
30	6	0
31	6	1
32	6	2
33	6	3
34	6	4
35	7	0
36	7	1
37	7	2
38	7	3
39	7	4
40	8	0
41	8	1
42	8	2
43	8	3
44	8	4
45	9	0
46	9	1
47	9	2
48	9	3
49	9	4
50	10	0
51	10	1
52	10	2
53	10	3
54	10	4
55	11	0
56	11	1
57	11	2
58	11	3
59	11	4
60	12	0
61	12	1
62	12	2
63	12	3
64	12	4
65	13	0
66	13	1
67	13	2
68	13	3
69	13	4
70	14	0
71	14	1
72	14	2
73	14	3
74	14	4
75	15	0
76	15	1
77	15	2
78	15	3
79	15	4
80	16	0
81	16	1
82	16	2
83	16	3
84	16	4
85	17	0
86	17	1
87	17	2
88	17	3
89	17	4
90	18	0
91	18	1
92	18	2
93	18	3
94	18	4
95	19	0
96	19	1
97	19	2
98	19	3
99	19	4


In the above output, the first column is the count, ```i```. The second column is the integer division ```i//5```. The last column is the remainder ```i%5```. So in the final row here, we have 

```99	19	4```

And we can check that nineteen fives plus a remainder of four gives back 99 in the code below:

In [103]:
print(19 * 5 + 4)

99
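
The same check can be run for every row of the table rather than only the last one. This is a minimal sketch; the variable ```all_match``` is an invented name, and ```divmod``` is a Python built-in (not used elsewhere in this lecture) that returns the quotient and the remainder together.

```python
all_match = True
for i in range(100):
    quotient = i // 5    # how many whole fives fit into i
    remainder = i % 5    # what is left over
    if quotient * 5 + remainder != i:
        all_match = False

print(all_match)         # True

# divmod gives both numbers in one call.
print(divmod(99, 5))     # (19, 4)
```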


In a Jupyter notebook you can click on the long vertical (likely blue) rectangle on the left to collapse the panel. This can be helpful with long stretches of output. Speaking of which, why not print out only every fifth line instead? Now we can see why it's handy to use modulus: it allows us to do something every $n$th time. 

In [21]:
for i in range(100):
    if i%5==0:
        print(i)

0
5
10
15
20
25
30
35
40
45
50
55
60
65
70
75
80
85
90
95


## Iterating Strings 

If we have a list of strings we can iterate through them in the same way. See below: 

In [22]:
list_of_strings = ["apple","banana","cherry","date"]

for i in list_of_strings: 
    print(i)

apple
banana
cherry
date


In this case ```i``` stands in for the word. Interestingly, you might remember that a string itself is a collection, a collection of characters. Therefore, we can iterate through that collection as well. Iterating through a collection and then iterating within each element of that collection is an example of a **double loop**. Double loops are very common because they allow you to iterate in two dimensions. See below how we can iterate through the above list, first by string then by character. Below we will use ```word``` instead of ```i```. For the second loop variable, I use the name ```char``` to stand in for character. 

To note, we did not need to call the items ```word``` and ```char```. It's just helpful. Any name would work so long as it is not a 'reserved' word. You will know that a word is reserved if you type it and it turns green instead of black. Words like ```import``` are reserved words. They are green. 
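
If you are unsure whether a name is reserved, Python's standard ```keyword``` module can tell you. This is a minimal sketch; the lists ```candidates``` and ```reserved``` are invented for the example.

```python
import keyword

candidates = ["word", "char", "import", "for", "i"]  # hypothetical names to test

reserved = []
for name in candidates:
    # keyword.iskeyword returns True for reserved words such as 'import'.
    is_reserved = keyword.iskeyword(name)
    print(name, is_reserved, sep="\t")
    if is_reserved:
        reserved.append(name)

print(reserved)  # ['import', 'for']
```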

In [117]:
import time

list_of_strings = ["apple","banana","cherry","date"]

for word in list_of_strings: 
    print(word)

    for char in word: 
        print(char,end="_")
        time.sleep(0.2)
    print() # try commenting this out

apple
a_p_p_l_e_
banana
b_a_n_a_n_a_
cherry
c_h_e_r_r_y_
date
d_a_t_e_


Notice it was animated to demonstrate what it was doing. That was using the ```time``` module. Here we used it to sleep for ```0.2``` seconds before continuing the loop. See how it does this for each ```char``` and then prints the next whole word? Also notice how I used ```end="_"```? In a ```print``` statement, using the additional ```end``` _argument_ allows us to change what happens at the end of a print statement. By default it is ```"\n"```, as in go to the next line after printing this. Most of the time, this is the expected behavior. Here we changed it so that it was ```"_"```, the underscore character, to demonstrate how it prints character-by-character. 

Notice that I put another print between the inner loop and the next iteration of the outer loop? Delete that print statement and rerun the code. What happens? The full word (from the ```print(word)``` at the top) shows up at the end of each line of ```_``` characters. That's because when we changed the ```end``` argument we no longer add a new line at the end of ```w_o_r_d_```. So the next thing that gets printed is appended immediately adjacent to the previous thing that was printed.

## More on the double loop

Above we used a 'double loop'. You will encounter a lot of double loops. Later on in Python you will see some different ways to manage iterating through multiple collections of different types. We want to avoid a double loop if we can since it is considered **computationally expensive**: the inner loop runs in full for every element of the outer loop. Sometimes, through clever tricks, we can avoid double loops, but many more times a double loop makes sense conceptually. Ultimately, our code has to be clear and functional. Correct code comes first and efficient code comes second. 

The reason why the double loop in particular is common is that it can be used with tabular data. We go through each row one at a time and then for each row we go through each column one at a time. For example, if we have a table of numbers 
```
       M1 M2 M3 M4
St_1   30 18 25 25
St_2   20 28 17 22
St_3   21 22 24 18
```
We could do something for each row and then do something for each column within the row. For example, these numbers might be marks out of ```30``` for each of the four assignments (```[M1, M2, M3, M4]```). If we wanted to get the _percentage_ for each student for each assignment we would divide all those numbers by 30. Let's see this below. See that I represent the table of marks as a list of lists, with one list per student.

In [101]:
import time 

raw_scores = [[30, 18, 25, 25],
              [20, 28, 17, 22],
              [21, 22, 24, 18]]
# I didn't have to make it look squared up, but I did it because 
# it is clearer this way. 
# Also, notice that python lets you line break after a comma.

for row in raw_scores: 
    for mark in row: 
        print(mark, end=", ")
        time.sleep(0.2)
    print()
    
print()
for row in raw_scores:
    for mark in row: 
        print("{:.2f}".format(mark/30), end=", ")
        time.sleep(0.2)
    print()

30, 18, 25, 25, 
20, 28, 17, 22, 
21, 22, 24, 18, 

1.00, 0.60, 0.83, 0.83, 
0.67, 0.93, 0.57, 0.73, 
0.70, 0.73, 0.80, 0.60, 


Above we did a few things to note:
- We printed by row then column (twice). We first printed the original table (up top) then the table with the percentages down below. 
- We calculated the percentages one by one using ```mark/30```. 
- We again used ```time.sleep``` to show how it goes by row then by column; more specifically it is working within each inner list before going on to the next.
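
Printing is not the only thing a double loop can do. As a sketch, the same row-then-column pattern can build a new table and keep it for later. The names ```percentages``` and ```new_row``` are made up for this example, and the built-in ```round``` is used here in place of ```format``` so that the stored values stay as numbers.

```python
raw_scores = [[30, 18, 25, 25],
              [20, 28, 17, 22],
              [21, 22, 24, 18]]

percentages = []                  # will become a new list of lists
for row in raw_scores:
    new_row = []                  # start a fresh row for each student
    for mark in row:
        new_row.append(round(mark / 30, 2))
    percentages.append(new_row)   # the finished row joins the new table

for row in percentages:
    print(row)
```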

Also notice that we didn't print the entire ```mark/30```. For example, if it was 17/30, the full number would be 0.5666666666666...etc. So instead we rounded that number to two decimal places by using ```format```. Format works by getting called from a string; in this case the entire string is just the code for how to format the number, ```{:.2f}```. 

That string uses braces, ```{``` and ```}```, to encase the formatting codes. Inside the braces is ```:.2f```. The ```:``` is not so important now. We make use of it when we pass ```.format(<object>)``` a collection rather than a number. We will address that later. For now, just put a number in your ```.format(<number>)``` and then after the colon give 
```
<minimum total width>.<digits after decimal><numeric type>
```
The width is optional. So since we do not need any padding and want two digits after the decimal for a **f**loating point number, we write ```:.2f``` inside the braces. 
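
As a rough illustration of how the pieces of the format code change the result, the sketch below formats the same value a few different ways. The variable names are assumptions for the example. The ```%``` type, which multiplies by 100 and adds a percent sign, is part of Python's format codes but was not used above.

```python
value = 17 / 30   # 0.5666...

two_places = "{:.2f}".format(value)    # two digits after the decimal
four_places = "{:.4f}".format(value)   # four digits after the decimal
padded = "{:8.2f}".format(value)       # padded with spaces to a width of 8
as_percent = "{:.1%}".format(value)    # multiplied by 100, with a % sign

print(two_places)    # 0.57
print(four_places)   # 0.5667
print(repr(padded))  # '    0.57'
print(as_percent)    # 56.7%
```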

## Enumerate

Recall how useful it was to have a counter? We used modulus on the iterator as it was a number. But in the case of the list of strings you can't ask ```"apple" / 5```. So how do we get every $n$th element here? We can create a counter and have the counter iterate through numbers and use that alongside whatever collection you were iterating through. 

The basic way to get a counter is to literally initialize a counter by doing ```counter=0``` and then once every loop you add one to the counter ```counter += 1```. When we add something to a variable and we don't want to create a new variable we can say ```+=```. This is easier than saying ```counter = counter + 1```. See this below: 

In [123]:
list_of_strings = ["apple","banana","cherry","date"]

counter = 0
for i in list_of_strings: 
    print(i,str(counter),i[counter],sep="\t")
    counter += 1 

apple	0	a
banana	1	a
cherry	2	e
date	3	e


A more pythonic way of doing this would be to use a special function called ```enumerate()```. This function takes in your collection and then during the for loop returns **both** a counter and one element at a time from your collection. See enumerate in action below:  

In [55]:
list_of_strings = ["apple","banana","cherry","date"]

for counter, i in enumerate(list_of_strings):
    print(counter,i,sep=". ")

0. apple
1. banana
2. cherry
3. date


Instead of counter, it is common to see people using just the letter ```c```. Then when you want to report something every tenth or thousandth time the loop runs you would insert in your code: 

``` python
if c%1000 == 0: print("Line %s" % c)
``` 
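
Putting the two ideas together, the sketch below uses ```enumerate``` with a modulus check to act on every second element. The list ```every_second``` is a made-up name. The ```start=1``` argument is an optional parameter of ```enumerate``` that begins counting from 1 instead of 0.

```python
list_of_strings = ["apple", "banana", "cherry", "date"]

every_second = []
for c, fruit in enumerate(list_of_strings):
    if c % 2 == 0:               # true when c is 0, 2, 4, ...
        every_second.append(fruit)
        print("Item %s: %s" % (c, fruit))

print()
# enumerate can start counting from a number other than 0.
for c, fruit in enumerate(list_of_strings, start=1):
    print(c, fruit, sep=". ")
```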

## Iterating dictionaries

Since we know that an iterator can return multiple things, this can help us use an iterator with something like a dictionary. Recall that dictionaries have ```key```,```value``` pairs? We can return these pairs using the ```{}.items()``` method. Below we will
- Create a dictionary. 
- Just print the items() method. 
- See how this can be used in the syntax of a ```for``` loop 

In [58]:
ingredients = {"salt":"parmesean", 
               "fat":"olive oil",
               "acid":"vinegar",
               "heat":"toast"
              }

print( ingredients.items() ,end="\n\n")

for quality,food in ingredients.items():
    print(quality,food,sep=" --> ")

dict_items([('salt', 'parmesean'), ('fat', 'olive oil'), ('acid', 'vinegar'), ('heat', 'toast')])

salt --> parmesean
fat --> olive oil
acid --> vinegar
heat --> toast


## Iterating a table of values

Returning to the double loop, it is good to get used to using a double loop in clever ways. I like to practice using double loops with little squares of ASCII art. For example, here I'll show a series of double loops of increasing complexity. 

In [61]:
# Square one
for row in range(5): 
    for col in range(5):
        print(row ,end=" ")
    print()

0 0 0 0 0 
1 1 1 1 1 
2 2 2 2 2 
3 3 3 3 3 
4 4 4 4 4 


In [62]:
# Square two
for row in range(5): 
    for col in range(5):
        print(col,end=" ")
    print()

0 1 2 3 4 
0 1 2 3 4 
0 1 2 3 4 
0 1 2 3 4 
0 1 2 3 4 


In [66]:
# Square three
for row in range(5):
    for col in range(5):
        print(row+col,end=" ")
    print()

0 1 2 3 4 
1 2 3 4 5 
2 3 4 5 6 
3 4 5 6 7 
4 5 6 7 8 


In [71]:
# Square four
for row in range(5):
    for col in range(5):
        if (row+col)%2==0: 
            print('x',end=" ")
        else:
            print('o',end = " ")
    print()

x o x o x 
o x o x o 
x o x o x 
o x o x o 
x o x o x 


In [76]:
# Square five
word = "point"
for i in range(5):
    print(word[i:],word[:i],sep='')

point
ointp
intpo
ntpoi
tpoin


## Clarifying continue, pass and break. 

You do not have to do everything inside a loop every time. We already saw how an ```if``` statement can be used to do something only under specific conditions. But we can also use keywords, often inside an ```if``` statement, to direct the flow of a loop. 

Typically a loop will finish under the two following conditions: 
- At the end of the iterator in a ```for``` loop
- when the condition is ```False``` in a ```while``` loop. 

So, in: ```for i in range(10):```
it will leave the loop after the iteration where ```i``` is 9, since ```range(10)``` has no more elements to give. 

And in: ``` while foundWord == False: ```
it will leave the while loop once the variable ```foundWord``` is assigned a value of ```True```. (Yes you read that correctly). Normally you would see code like this:

~~~ python
foundWord = False
while foundWord == False:
    foundWord = doSomething()
~~~
If ```doSomething()``` returns ```False``` it just keeps looping. When ```doSomething()``` returns ```True``` it is then assigned to ```foundWord```. On the next check, ```foundWord``` is true, so ```foundWord == False``` through substitution becomes ```True == False``` which obviously evaluates to ```False``` and the loop ends. 
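
The pattern above does not run on its own because ```doSomething()``` is never defined. Here is one self-contained sketch of the same idea, in which the work done on each pass is simply checking the next word in a list. The list ```words```, the target ```"cherry"``` and the counter ```position``` are all invented for the example.

```python
words = ["apple", "banana", "cherry", "date"]  # hypothetical data
position = 0                                   # remembers where we are

foundWord = False
while foundWord == False:
    # Stand-in for doSomething(): is the current word the one we want?
    foundWord = (words[position] == "cherry")
    position += 1

print("Found 'cherry' after", position, "checks")  # Found 'cherry' after 3 checks
```

Note that this sketch assumes the target is in the list. If it were not, ```position``` would run past the end and Python would raise an ```IndexError```. Many people also write the condition as ```while not foundWord:```, which means the same thing.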
    
There are other ways to leave a loop. 
- **Throw an error**. If the error isn't caught the program will stop, and by implication the loop will stop too. 
- **Use a break statement**. This will stop the loop. If you are in an 'inner' loop, it stops only that loop. For example: 

``` python
for i in range(10):
    for j in range(10):
        # do_something is a placeholder for your own function
        do_something(i, j)
        break
```
    This will stop the ```j``` loop but not the outer ```i``` loop.  
- **Use a continue statement**. Continue doesn't stop the loop, but it tells the program to go straight to the next iteration, so that anything under the continue statement inside that loop is not executed for the current element.
- **Not by using a pass statement**. Sometimes we want to have an if or else condition that simply does nothing (perhaps as a placeholder for later). We cannot just say:

``` python
for i in range(10): 
    if i%2==0:
    else:
        print(i)
```
Instead, you would have to place a ```pass``` statement under the ```if``` line. 
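
For instance, a working version of the snippet above might look like the following sketch. The list ```odds``` is a made-up name, added only so the result can be inspected afterwards.

```python
odds = []
for i in range(10):
    if i % 2 == 0:
        pass            # do nothing for even numbers (a placeholder)
    else:
        odds.append(i)
        print(i)
```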

You can see all four of these (an error, ```break```, ```continue``` and ```pass```) in action below.

In [106]:
print("Leave by throwing an error")

for i in range(0,4):
    print(i)
    for j in range(0,100,25):
        print("value: %s\t value / (value-75): %s" % (j,j / (j-75) ))
    print()
    
print("Hi, it's the end!")

Leave by throwing an error
0
value: 0	 value / (value-75): -0.0
value: 25	 value / (value-75): -0.5
value: 50	 value / (value-75): -2.0


ZeroDivisionError: division by zero

In [107]:
print("Now with a break statement",end="\n\n")
for i in range(0,4):
    for j in range(100,110,2):
        if str(j)[-1] == '4':
            break
        print("outer loop %s: inner loop %s" % (i, j))
    print()

print("Now with a continue statement",end="\n\n")
for i in range(0,4):
    for j in range(100,110,2):        
        if str(j)[-1] == '4':
            continue
        print("outer loop %s: inner loop %s" % (i, j))

    print()

    
print("Now with a pass statement:",end="\n\n")

for i in range(0,4):

    for j in range(100,110,2):
        if str(j)[-1] == '2':
            pass
        print("outer loop %s: inner loop %s" % (i, j))
    print()
    
print("Hi, it's the end!")

Now with a break statement

outer loop 0: inner loop 100
outer loop 0: inner loop 102

outer loop 1: inner loop 100
outer loop 1: inner loop 102

outer loop 2: inner loop 100
outer loop 2: inner loop 102

outer loop 3: inner loop 100
outer loop 3: inner loop 102

Now with a continue statement

outer loop 0: inner loop 100
outer loop 0: inner loop 102
outer loop 0: inner loop 106
outer loop 0: inner loop 108

outer loop 1: inner loop 100
outer loop 1: inner loop 102
outer loop 1: inner loop 106
outer loop 1: inner loop 108

outer loop 2: inner loop 100
outer loop 2: inner loop 102
outer loop 2: inner loop 106
outer loop 2: inner loop 108

outer loop 3: inner loop 100
outer loop 3: inner loop 102
outer loop 3: inner loop 106
outer loop 3: inner loop 108

Now with a pass statement:

outer loop 0: inner loop 100
outer loop 0: inner loop 102
outer loop 0: inner loop 104
outer loop 0: inner loop 106
outer loop 0: inner loop 108

outer loop 1: inner loop 100
outer loop 1: inner loop 102
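outer loop 1: inner loop 104
outer loop 1: inner loop 106
outer loop 1: inner loop 108

outer loop 2: inner loop 100
outer loop 2: inner loop 102
outer loop 2: inner loop 104
outer loop 2: inner loop 106
outer loop 2: inner loop 108

outer loop 3: inner loop 100
outer loop 3: inner loop 102
outer loop 3: inner loop 104
outer loop 3: inner loop 106
outer loop 3: inner loop 108

Hi, it's the end!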

Notice that in the first case, the error happened when we tried to calculate 
$$ Value / (Value - 75)$$
$$75 / (75 - 75)$$
$$75 / 0 $$

Then it threw an error and exited the entire cell (or in a standalone program it would exit the entire program). That's really not a great approach. 
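
One common alternative is to catch the error with ```try``` and ```except```, so that the loop can carry on. The following is a sketch rather than the only way to handle this; the ```results``` list and the choice to record ```None``` for the failed case are assumptions made for illustration.

```python
results = []
for j in range(0, 100, 25):
    try:
        ratio = j / (j - 75)
    except ZeroDivisionError:
        # When j is 75 we would divide by zero; record None and keep looping.
        ratio = None
    results.append(ratio)
    print("value: %s\t value / (value-75): %s" % (j, ratio))

print("Hi, it's the end!")
```

This time the final line is reached, because the error was caught inside the loop instead of stopping the whole cell.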

In the second case with a ```break``` statement, every time the last digit in the ```j``` iterator was a ```4``` it would break the loop. But notice that the outer loop kept going. ```break``` will break the current loop, not the entire program.  

In the third case, ```continue``` would skip the rest of the code _on that iteration_, but not break the loop. In the fourth case, ```pass``` just acts as a placeholder and does not affect the flow. 

## Advanced topics in loops

These are some basic skills about loops. For further concepts about loops, here are some topics worth investigating:
- **List comprehensions**: This is how we return a new list by running every element of a collection through a for loop.
- **Yield**: This is when we return a result but next time we call a function it starts back where it was in the loop rather than starting from the beginning. 
- **Generators**: This is what you call a function with a yield statement in it. 
- **Creating an iterator**: This is how you can make your own data structures iterable. You might create your own ```class``` of object and then want to call it directly in a for loop.
- **Halting problem**: This is a [well known](https://en.wikipedia.org/wiki/Halting_problem) proof in computer science that helps us reflect on when a loop can finish, if ever. It's an extremely tricky idea, but also very profound as it shows the limits of computability. It is, some have argued, roughly equivalent to Gödel's landmark incompleteness theorem, but the halting problem is demonstrated using an algorithm.
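
As a small taste of the first three items in the list above, here is a sketch of a list comprehension and a generator. The function name ```multiples_of_five``` and the variable names are hypothetical, chosen to echo the modulus example from earlier.

```python
# List comprehension: build a new list in a single line.
multiples = [i for i in range(30) if i % 5 == 0]
print(multiples)   # [0, 5, 10, 15, 20, 25]

# Generator: a function with a yield statement in it. It pauses at each
# yield and picks up from the same spot when the next value is requested.
def multiples_of_five(limit):
    for i in range(limit):
        if i % 5 == 0:
            yield i

gen = multiples_of_five(30)
print(next(gen))   # 0
print(next(gen))   # 5
print(list(gen))   # [10, 15, 20, 25]
```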