# Chapter 8: More on loops

In the previous chapters, we have often used the powerful concept of looping in Python. Using loops, we can easily repeat certain actions when coding. With for loops, for instance, it is really easy to visit the items in a list and print them. In this chapter, we will discuss some more advanced forms of looping, as well as new, quick ways to create and deal with lists and other iterable data sequences.

### Range

The first new function that we will discuss here is `range()`. Using this function, we can quickly generate a series of numbers in a specific range:

In [4]:
series = 12
for i in range(series):
    print(i)
    

0
1
2
3
4
5
6
7
8
9
10
11


Here, `range()` produces a series of integers, starting from zero, up to (but not including) the number which we pass as an argument to the function. Using `range()` is of course a much more convenient way to generate such a series of numbers than writing e.g. a while-loop to achieve the same result. Note that we can pass more than one argument to `range()` if we want to start counting from a number other than zero (zero is the default when you only pass a single argument to the function):

In [5]:
for i in range(300, 306):
    print(i)

300
301
302
303
304
305


We can even specify a 'step size' as a third argument, which controls how much a variable will increase with each step:

In [6]:
for i in range(15, 26, 3):
    print(i)

15
18
21
24


If you don't specify the step size explicitly, it will default to 1. If you want to store or print the result of calling `range()`, you have to cast it explicitly, for instance, to a list:

In [8]:
numbers = list(range(10))
print(numbers[3:])

[3, 4, 5, 6, 7, 8, 9]
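To make the need for casting a bit more concrete, here is a minimal sketch (the variable names `lazy_numbers` and `countdown` are only illustrative). It shows what printing a bare `range()` looks like, and that the step size may also be negative if you want to count down:

```python
# A range object is lazy: printing it shows the recipe, not the numbers.
lazy_numbers = range(5)
print(lazy_numbers)        # range(0, 5)

# Casting to a list materializes the numbers.
print(list(lazy_numbers))  # [0, 1, 2, 3, 4]

# A negative step counts down; the stop value (0) is still excluded.
countdown = list(range(10, 0, -2))
print(countdown)           # [10, 8, 6, 4, 2]
```

Note that looping over `lazy_numbers` directly works fine without any casting; the cast is only needed when you want to see or slice all the numbers at once.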


### Enumerate

Of course, `range()` can also be used to iterate over the items in a list or tuple, typically in combination with calling `len()` to avoid `IndexErrors`: 

In [9]:
words = "Be yourself; everyone else is already taken".split()
for i in range(len(words)):
    print(words[i])

Be
yourself;
everyone
else
is
already
taken


Naturally, the same result can more easily be obtained by looping over `words` directly:

In [10]:
for word in words:
    print(word)

Be
yourself;
everyone
else
is
already
taken


One drawback of such an easy-to-write loop, however, is that it doesn't keep track of the index of the word that we are printing in each iteration. Suppose that we would like to print the index of each word in our example above: we would then have to work with a counter...

In [12]:
counter = 0
for word in words:
    print(word, counter)
    counter += 1  # keep track of the index by hand


Be 0
yourself; 1
everyone 2
else 3
is 4
already 5
taken 6


... or indeed use a call to `range()` and `len()`:

In [None]:
for i in range(len(words)):
    print(words[i], ": index", i)      

A function that makes life in Python much easier in this respect is `enumerate()`. If we pass a list to `enumerate()`, it will produce a series of mini-tuples: each mini-tuple will contain as its first element the index of an item, and as its second element the actual item:

In [13]:
print(list(enumerate(words)))

[(0, 'Be'), (1, 'yourself;'), (2, 'everyone'), (3, 'else'), (4, 'is'), (5, 'already'), (6, 'taken')]


Here -- as with `range()` -- we have to cast the result of `enumerate()` to e.g. a list before we can actually print it. Iterating over the result of `enumerate()`, on the other hand, is not a problem. Here, we print out each mini-tuple, consisting of an index and an item in a for loop:

In [14]:
for mini_tuple in enumerate(words):
    print(mini_tuple)

(0, 'Be')
(1, 'yourself;')
(2, 'everyone')
(3, 'else')
(4, 'is')
(5, 'already')
(6, 'taken')


When using such for loops and `enumerate()`, we can do something really cool. Remember that we can 'unpack' tuples with multiple assignment: if a tuple consists of two elements, we can unpack it on one line of code to two different variables via the assignment operator: 

In [15]:
item = (5, 'already')
index, word = item # this is the same as: index, word = (5, "already")
print(index)
print(word)

5
already


In our for loop example, we can apply the same kind of unpacking in each iteration:

In [16]:
for item in enumerate(words):
    index, word = item
    print(index)
    print(word)
    print("=======")

0
Be
=======
1
yourself;
=======
2
everyone
=======
3
else
=======
4
is
=======
5
already
=======
6
taken
=======


However, there is also a super-convenient shortcut for this in Python, where we unpack each item in the for-statement already:

In [19]:
d = {1: "hi", 2: "my", 3: "name", 4: "is"}
for key, value in d.items():
    print(key, value)

1 hi
2 my
3 name
4 is


How cool is that? Note how easy it becomes now, to solve our problem with the index above:

In [20]:
for i, word in enumerate(words):
    print(word, ": index", i)

Be : index 0
yourself; : index 1
everyone : index 2
else : index 3
is : index 4
already : index 5
taken : index 6
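If you would rather count from one (for instance for line numbers), `enumerate()` accepts an optional `start` argument. The sketch below is only an illustration; the names `numbered` and `position` are made up for this example:

```python
words = "Be yourself; everyone else is already taken".split()

# start=1 makes the counter begin at 1 instead of the default 0.
numbered = list(enumerate(words, start=1))
for position, word in numbered:
    print(position, word)
```

The items themselves are untouched; only the number that is paired with each item shifts.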


#### DIY 1
Let's put that into practice. First, extract the lines from the file `f` and put them in the variable `lines`. Then, loop over `lines` and print the line number followed by the line itself. Remember that `enumerate()` uses zero-based indexing...

In [25]:
# The with-statement closes the file for us once the lines have been read
with open("data/austen-emma-excerpt.txt", "r", encoding="utf-8") as f:
    lines = f.readlines()
for i, line in enumerate(lines):
    print(f"This is line number {i}: {line}")

This is line number 0: Emma by Jane Austen 1816

This is line number 1: 

This is line number 2: VOLUME I

This is line number 3: 

This is line number 4: CHAPTER I

This is line number 5: 

This is line number 6: 

This is line number 7: Emma Woodhouse, handsome, clever, and rich, with a comfortable home

This is line number 8: and happy disposition, seemed to unite some of the best blessings

This is line number 9: of existence; and had lived nearly twenty-one years in the world

This is line number 10: with very little to distress or vex her.

This is line number 11: 

This is line number 12: She was the youngest of the two daughters of a most affectionate,

This is line number 13: indulgent father; and had, in consequence of her sister's marriage,

This is line number 14: been mistress of his house from a very early period.  Her mother

This is line number 15: had died too long ago for her to have more than an indistinct

This is line number 16: remembrance of her caresses; and her

### Zip

Obviously, `enumerate()` can be really useful when you're working with lists or other kinds of data sequences. Another helpful function in this respect is `zip()`. Suppose that we have a small database of 5 books in the form of three lists: the first list contains the titles of the books, the second the authors, while the third list contains the dates of publication:

In [27]:
titles = ["Emma", "Stoner", "Inferno", "1984", "Aeneid"]
authors = ["J. Austen", "J. Williams", "D. Alighieri", "G. Orwell", "P. Vergilius"]
dates = ["1815", "2006", "Ca. 1321", "1949", "before 19 BC"]

In each of these lists, the third item always corresponds to Dante's masterpiece and the last item to the Aeneid by Vergil, which inspired him. The use of `zip()` can now easily be illustrated:

In [28]:
print(list(zip(titles, authors)))
print(list(zip(titles, dates)))
print(list(zip(authors, dates)))

[('Emma', 'J. Austen'), ('Stoner', 'J. Williams'), ('Inferno', 'D. Alighieri'), ('1984', 'G. Orwell'), ('Aeneid', 'P. Vergilius')]
[('Emma', '1815'), ('Stoner', '2006'), ('Inferno', 'Ca. 1321'), ('1984', '1949'), ('Aeneid', 'before 19 BC')]
[('J. Austen', '1815'), ('J. Williams', '2006'), ('D. Alighieri', 'Ca. 1321'), ('G. Orwell', '1949'), ('P. Vergilius', 'before 19 BC')]


Do you see what happened here? In fact, `zip()` really functions like a 'zipper' in the real world: it zips together multiple lists, and returns a series of mini-tuples, in which the correct authors, titles and dates will be combined with each other. Moreover, you can pass more than two sequences at once to `zip()`:

In [30]:
print(list(zip(authors, titles, dates)))
zipped_book_info = zip(authors, titles, dates)
for bookinf in zipped_book_info:
    print(bookinf)

[('J. Austen', 'Emma', '1815'), ('J. Williams', 'Stoner', '2006'), ('D. Alighieri', 'Inferno', 'Ca. 1321'), ('G. Orwell', '1984', '1949'), ('P. Vergilius', 'Aeneid', 'before 19 BC')]
('J. Austen', 'Emma', '1815')
('J. Williams', 'Stoner', '2006')
('D. Alighieri', 'Inferno', 'Ca. 1321')
('G. Orwell', '1984', '1949')
('P. Vergilius', 'Aeneid', 'before 19 BC')


How awesome is that? Here too: don't forget to cast the result of `zip()` to a list or tuple, e.g. if you want to print it. As with `enumerate()`, we can now also unpack each mini-tuple when declaring a for-loop:

In [None]:
for author, title in zip(authors, titles):
    print(author)
    print(title)
    print("===")

As you can understand, this is really useful functionality for dealing with long, complex lists and especially combinations of them.
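One thing to keep in mind: `zip()` stops as soon as its shortest input runs out, without raising an error. The following sketch uses two deliberately mismatched lists (`some_titles` and `some_authors` are invented for this illustration) to show the effect:

```python
some_titles = ["Emma", "Stoner", "Inferno"]
some_authors = ["J. Austen", "J. Williams"]  # one author deliberately missing

# zip() silently drops "Inferno", because there is no author to pair it with.
pairs = list(zip(some_titles, some_authors))
print(pairs)  # [('Emma', 'J. Austen'), ('Stoner', 'J. Williams')]

# A simple length check before zipping makes such a mismatch visible.
same_length = len(some_titles) == len(some_authors)
print(same_length)  # False
```

When your lists are supposed to line up item by item, a quick check like this can save you from silently losing data.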

#### DIY 2
Suppose you want to make a `rot13` dictionary for the Caesar cipher (which maps each character to the character 13 places up or down the alphabet). Can you try making this dictionary using `letters1` and `letters2` and `zip()`?


In [44]:
letters1 = list("abcdefghijklm")
letters2 = list("nopqrstuvwxyz")
rot13 = {}

## Your code goes here
letterszip1 = list(zip(letters1, letters2))
letterszip2 = list(zip(letters2, letters1))
letterszip1.extend(letterszip2)
print(letterszip1)

rot13 = dict(letterszip1)

for letter, encoding in letterszip1:
    rot13[letter] = encoding
    
print(rot13)
## Test
message = "pbatenghyngvbaf, lbh oebxr gur pbqr!"
decrypted_message = "".join(rot13.get(l, l) for l in message)
    # FYI: the .get dictionary method looks up the first argument as a key, and returns the second argument if that key is not in the dictionary
print(decrypted_message)

[('a', 'n'), ('b', 'o'), ('c', 'p'), ('d', 'q'), ('e', 'r'), ('f', 's'), ('g', 't'), ('h', 'u'), ('i', 'v'), ('j', 'w'), ('k', 'x'), ('l', 'y'), ('m', 'z'), ('n', 'a'), ('o', 'b'), ('p', 'c'), ('q', 'd'), ('r', 'e'), ('s', 'f'), ('t', 'g'), ('u', 'h'), ('v', 'i'), ('w', 'j'), ('x', 'k'), ('y', 'l'), ('z', 'm')]
{'a': 'n', 'b': 'o', 'c': 'p', 'd': 'q', 'e': 'r', 'f': 's', 'g': 't', 'h': 'u', 'i': 'v', 'j': 'w', 'k': 'x', 'l': 'y', 'm': 'z', 'n': 'a', 'o': 'b', 'p': 'c', 'q': 'd', 'r': 'e', 's': 'f', 't': 'g', 'u': 'h', 'v': 'i', 'w': 'j', 'x': 'k', 'y': 'l', 'z': 'm'}
congratulations, you broke the code!


### Bonus material: comprehensions

Lists and for loops are used all the time in programming. If you are interested, let's have a look at a more concise and easy way to create and fill new lists in Python: _list comprehensions_. They are also often used to change one list into another. Typically, comprehensions can be written in a single line of Python code, which is why people often feel like they are more readable than normal Python for loops. Let's start with an example. Say that we would like to fill a list of numbers that represent the length of each word in a sentence, but only if that word isn't a punctuation mark. By now, we can of course easily create such a list using a for loop:

In [45]:
import string
words = "I have not failed . I’ve just found 10,000 ways that won’t work .".split()
word_lengths = []
for word in words:
    if word not in string.punctuation:
        word_lengths.append(len(word))
print(word_lengths)

[1, 4, 3, 6, 4, 4, 5, 6, 4, 4, 5, 4]


We can create the exact same list of numbers using a list comprehension which only takes up one line of Python code:

In [46]:
word_lengths = [len(word) for word in words if word not in string.punctuation]
print(word_lengths)

[1, 4, 3, 6, 4, 4, 5, 6, 4, 4, 5, 4]


OK, impressive, but there are a lot of new things going on here. Let's go through this step by step. The first step is easy: we initialize a variable `word_lengths` to which we assign a value using the assignment operator. The type of that value will eventually be a list: this is indicated by the square brackets which enclose the list comprehension:

In [None]:
print(type(word_lengths))

Inside the square brackets, we can find the actual comprehension which will determine what goes inside our new list. Note that it is not always possible to read these comprehensions from left to right, so you will have to get used to the way they are built up from a syntactic point of view. First of all, all the way on the left, we add an expression that determines which elements will make it into our list, in this case: `len(word)`. The variable `word`, in this case, is generated by the following for-statement: `for word in words`. Finally, we add a condition to our statement that will determine whether or not `len(word)` should be added to our list. In this case, `len(word)` will only be included in our list if the word is not a punctuation mark: `if word not in string.punctuation`. This is a full list comprehension, but simpler ones exist. We could for instance not have called `len()` on `word` before appending it to our list. Like this, we could, for example, easily remove all punctuation from our word list:

In [49]:
words_without_punc = [word for word in words if word not in string.punctuation]
print(words_without_punc)

['I', 'have', 'not', 'failed', 'I’ve', 'just', 'found', '10,000', 'ways', 'that', 'won’t', 'work']


Moreover, we don't have to include the if-statement at the end (it is always optional):

In [50]:
all_word_lengths = [len(word) for word in words]
print(all_word_lengths)

[1, 4, 3, 6, 1, 4, 4, 5, 6, 4, 4, 5, 4, 1]


In the comprehensions above, `words` is the only pre-existing input to our comprehension; all the other variables are created and manipulated inside the comprehension. The new `range()` function which we saw at the beginning of this chapter is also often used as the input for a comprehension:

In [51]:
square_numbers = [x*x for x in range(10)]
print(square_numbers)

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]


Good programmers can do amazing things with comprehensions. With list comprehensions, it becomes really easy, for example, to create nested lists (lists that themselves consist of lists or tuples). Can you figure out what is happening in the following code block?

In [53]:
nested_list = [[x, x+2] for x in range(10, 22, 3)]
print(nested_list)
print(type(nested_list))
print(type(nested_list[3]))

[[10, 12], [13, 15], [16, 18], [19, 21]]
<class 'list'>
<class 'list'>


In the first line above, we create a new list (`nested_list`) but we don't fill it with single numbers, but instead with mini-lists that contain two values. We could just as easily have done this with mini-tuples, by using round brackets. Can you spot the differences below? 

In [54]:
nested_tuple = [(x,x+2) for x in range(10, 22, 3)]
print(nested_tuple)
print(type(nested_tuple))
print(type(nested_tuple[3]))

[(10, 12), (13, 15), (16, 18), (19, 21)]
<class 'list'>
<class 'tuple'>


Note that `zip()` can also be very useful in this respect, because you can unpack items inside the comprehension. Do you understand what is going on in the following code block?

In [55]:
a = [2, 3, 5, 7, False, 2, 8]
b = [3, 2, 1, 7, False, False, 9]
diffs = [a-b for a, b in zip(a, b)]
print(diffs)

[-1, 1, 4, 0, 0, 2, -1]


Again, more complex comprehensions are thinkable:

In [60]:
diffs = [abs(a-b) for a,b in zip(a, b) if (a or b)] # abs converts negative numbers to positive ones
print(diffs)

[1, 1, 4, 0, 2, 1]


Lots of things going on on that one line - you are starting to become a real pro at comprehensions!
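A common source of confusion is the position of the `if`. Placed at the end, it filters items out; written in front as `x if condition else y`, it is a conditional expression that replaces items instead. The sketch below (with made-up names `numbers`, `positives` and `clipped`) puts the two side by side:

```python
numbers = [3, -1, 4, -5, 9]

# Filter: the trailing `if` drops items, so the result can be shorter.
positives = [n for n in numbers if n > 0]
print(positives)  # [3, 4, 9]

# Conditional expression: `x if cond else y` in front replaces items,
# so the result always has the same length as the input.
clipped = [n if n > 0 else 0 for n in numbers]
print(clipped)    # [3, 0, 4, 0, 9]
```

If a comprehension unexpectedly returns as many items as went in, it is worth checking whether the `if` ended up in front of the `for` instead of behind it.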

Finally, we should mention that dictionaries can also be filled in a one-liner using such comprehensions. Since dictionaries consist of key-value pairs, the syntax is slightly more complicated. Here, you have to make sure that you link the correct key to the correct value using a colon, in the very first part of the comprehension. The following example will make this clearer:

In [61]:
counts = {word:len(word) for word in words}
print(counts)

{'I': 1, 'have': 4, 'not': 3, 'failed': 6, '.': 1, 'I’ve': 4, 'just': 4, 'found': 5, '10,000': 6, 'ways': 4, 'that': 4, 'won’t': 5, 'work': 4}
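Dictionary comprehensions also combine nicely with `zip()`, since each zipped pair can be unpacked into a key and a value. Here is a small sketch, assuming two illustrative lists `book_titles` and `book_authors` of equal length:

```python
book_titles = ["Emma", "1984", "Aeneid"]
book_authors = ["J. Austen", "G. Orwell", "P. Vergilius"]

# Unpack each (title, author) pair and link key to value with a colon.
author_of = {title: author for title, author in zip(book_titles, book_authors)}
print(author_of["1984"])  # G. Orwell

# For a plain key-value pairing, dict(zip(...)) gives the same dictionary.
same = dict(zip(book_titles, book_authors))
print(author_of == same)  # True
```

The comprehension form becomes the more useful one as soon as you want to transform or filter the keys and values along the way.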


#### DIY 3
If you want to try your hand at making some list comprehensions, here you go! Try rewriting the for loops below as comprehensions! The check at the end will print `True` if you did it correctly!

In [63]:
words = ["This", "is", "a", "short", "list", "of", "WORDS", "!"]

## For loop
lowercased_words1 = []
for word in words:
    lowercased_words1.append(word.lower())

## Comprehension
lowercased_words2 = [word.lower() for word in words]

## Check
print(lowercased_words2)
print(lowercased_words1 == lowercased_words2)

['this', 'is', 'a', 'short', 'list', 'of', 'words', '!']
True


In [68]:
alphabet = "abcdefghijklmnopqrstuvwxyz"

## For loop
consonants1 = []
for letter in alphabet:
    if letter not in "aeiouy":
        consonants1.append(letter)

## Comprehension
consonants2 = [letter for letter in alphabet if letter not in "aeiouy"]

## Check
print(consonants2)
print(consonants1 == consonants2)

['b', 'c', 'd', 'f', 'g', 'h', 'j', 'k', 'l', 'm', 'n', 'p', 'q', 'r', 's', 't', 'v', 'w', 'x', 'z']
True


In [65]:
## For loop
laughter1 = {}
for i in range(1, 10):
    laughter1[i] = "ha"*i

## Comprehension
laughter2 = {i: "ha"*i for i in range(1, 10)}

## Check
print(laughter1)
print(laughter1 == laughter2)

{1: 'ha', 2: 'haha', 3: 'hahaha', 4: 'hahahaha', 5: 'hahahahaha', 6: 'hahahahahaha', 7: 'hahahahahahaha', 8: 'hahahahahahahaha', 9: 'hahahahahahahahaha'}
True


------------------------------

You've reached the end of Chapter 8! You can safely ignore the code below, it's only there to make the page pretty:

In [None]:
from IPython.display import HTML
def css_styling():
    # Read the stylesheet and close the file again afterwards
    with open("styles/custom.css", "r") as f:
        styles = f.read()
    return HTML(styles)
css_styling()