# List comprehension, generators, iteration

## BProf Python course

### June 25-29, 2018

#### Judit Ács

# List comprehension

- transform any iterable into a list in one line
- syntactic sugar
- example: create a list of the first N odd numbers starting from 1

In [None]:
l = []
for i in range(10):
    l.append(2*i+1)
l

one-liner equivalent

In [None]:
l = [2*i+1 for i in range(10)]
l

## The general form of list comprehension is

~~~
[<expression> for <element> in <sequence>]
~~~

conditional expressions can be added to filter the sequence:

~~~
[<expression> for <element> in <sequence> if <condition>]
~~~

In [None]:
even = [n*n for n in range(20) if n % 2 == 0]
even

which is equivalent to

In [None]:
even = []
for n in range(20):
    if n % 2 == 0:
        even.append(n*n)
even

- since this expression implements a filtering mechanism, there is no `else` clause

- an if-else clause can be used as the first expression though:

In [None]:
l = [1, 0, -2, 3, -1, -5, 0]

signum_l = [int(n / abs(n)) if n != 0 else 0 for n in l]
signum_l

In [None]:
n = -3.2
int(n / abs(n)) if n != 0 else 0

More than one sequence may be traversed. Is this depth-first or breadth-first traversal?

In [None]:
l1 = [1, 2, 3]
l2 = [4, 5, 6]

[(i, j) for i in l1 for j in l2]

In [None]:
[(i, j) for j in l2 for i in l1]
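The order of the `for` clauses can be checked against explicit nested loops. The sketch below (the name `pairs` is only illustrative) rebuilds the first comprehension by hand, treating the leftmost `for` as the outer loop:

```python
l1 = [1, 2, 3]
l2 = [4, 5, 6]

# The leftmost `for` clause becomes the outer loop,
# the rightmost one becomes the inner loop.
pairs = []
for i in l1:
    for j in l2:
        pairs.append((i, j))

print(pairs == [(i, j) for i in l1 for j in l2])
```

Swapping the two `for` clauses swaps which loop is the outer one, so the same pairs come out in a different order.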

List comprehensions may be nested by replacing the first expression with another list comprehension:

In [None]:
matrix = [
    [1, 2, 3],
    [5, 6, 7]
]

[[e*e for e in row] for row in matrix]

## What is the type of a (list) comprehension?

In [None]:
i = (i for i in range(10))
type(i)
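Only the brackets differ between the two forms, so it may help to compare them side by side. A minimal sketch, with illustrative variable names:

```python
squares_list = [n * n for n in range(5)]   # square brackets
squares_gen = (n * n for n in range(5))    # parentheses

print(type(squares_list).__name__)
print(type(squares_gen).__name__)

# The parenthesized form has to be consumed to see its elements.
print(squares_list == list(squares_gen))
```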

# Generator expressions

Generator expressions are a generalization of list comprehensions. They were proposed in PEP 289 in 2002 and became part of the language in Python 2.4.

Check out the memory consumption of these cells.

In [None]:
12

In [None]:
N = 8
s = sum([i*2 for i in range(int(10**N))])
print(s)

In [None]:
s = sum(i*2 for i in range(int(10**N)))
print(s)
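One rough way to see the difference without a memory monitor is `sys.getsizeof`. This is only a sketch: `getsizeof` reports the size of the container object itself, not of the integers it refers to, and a smaller `N` is assumed here to keep it fast.

```python
import sys

N = 5  # smaller than in the cells above, to keep the sketch fast

as_list = [i * 2 for i in range(10**N)]   # all elements exist at once
as_gen = (i * 2 for i in range(10**N))    # elements are produced on demand

print("list:     ", sys.getsizeof(as_list), "bytes")
print("generator:", sys.getsizeof(as_gen), "bytes")

# Both produce the same sum; summing consumes the generator.
same_total = sum(as_list) == sum(as_gen)
print(same_total)
```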

Generators do not generate a list in memory

In [None]:
even_numbers = (2*n for n in range(10))
even_numbers

therefore they can only be traversed once

In [None]:
for num in even_numbers:
    print(num)

the generator is empty after the first run

In [None]:
for num in even_numbers:
    print(num)

calling `next()` on an exhausted generator raises a `StopIteration` exception

In [None]:
even_numbers = (2*n for n in range(10))

while True:
    try:
        print(next(even_numbers))
    except StopIteration:
        break

In [None]:
# next(even_numbers)  # raises StopIteration
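If the exception is not wanted, the built-in `next` also accepts a default value as its second argument. A short sketch (the variable name `values` is illustrative):

```python
even_numbers = (2*n for n in range(3))

values = list(even_numbers)        # consumes the generator
leftover = next(even_numbers, None)  # default is returned instead of raising

print(values)
print(leftover)
```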

these are actually the defining properties of the **iteration protocol**

# Iteration protocol

A class satisfies the iteration protocol if:

1. it has an `__iter__` method that returns an iterator, which
2. has a `__next__` method (this method is called `next` in Python 2), which
3. raises `StopIteration` when there are no more elements to return

For loops use the iteration protocol.

In [None]:
class MyIterator:
    def __init__(self):
        self.iter_no = 5
        
    def __iter__(self):
        return self
    
    def __next__(self):
        if self.iter_no <= 0:
            raise StopIteration()
        self.iter_no -= 1
        print("Returning {}".format(self.iter_no))
        return self.iter_no
    
myiter = MyIterator()

for i in myiter:
    print(i)
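To make the connection with `for` loops explicit, here is roughly what a loop does behind the scenes, written out with the built-ins `iter` and `next`. This is a simplified sketch, and the names `numbers` and `seen` are illustrative:

```python
numbers = [10, 20, 30]

# Roughly equivalent to: for n in numbers: print(n)
iterator = iter(numbers)       # calls numbers.__iter__()
seen = []
while True:
    try:
        n = next(iterator)     # calls iterator.__next__()
    except StopIteration:
        break                  # a for loop ends silently at this point
    seen.append(n)
    print(n)
```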

# Set and dict comprehension

Sets and dictionaries can be instantiated via generator expressions too.

A generator expression between curly brackets instantiates a set:

In [None]:
fruit_list = ["apple", "plum", "apple", "pear"]

fruits = {fruit.title() for fruit in fruit_list}

type(fruits), len(fruits), fruits

if the expression in the generator is a key-value pair separated by a colon, it instantiates a dictionary:

In [None]:
word_list = ["apple", "plum", "pear", "apple", "apple"]
word_length = {word: len(word) for word in word_list}
type(word_length), len(word_length), word_length

In [None]:
word_list = ["apple", "plum", "pear", "avocado"]
first_letters = {word[0]: word for word in word_list}
first_letters
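Dictionary keys are unique, so in the cell above a later word replaces an earlier one that starts with the same letter. If every word should be kept, one possible approach is to nest a list comprehension inside the dict comprehension. The name `by_letter` is illustrative:

```python
word_list = ["apple", "plum", "pear", "avocado"]

# Outer dict comprehension: one key per distinct first letter.
# Inner list comprehension: all words starting with that letter.
by_letter = {
    letter: [word for word in word_list if word[0] == letter]
    for letter in {word[0] for word in word_list}
}
print(by_letter)
```

This scans `word_list` once per distinct letter, which is fine for short lists like this one.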

# `yield` keyword

- if a function uses `yield` instead of `return`, it becomes a **generator function**
- `yield` hands a value to the caller and temporarily gives back the execution
- the generator function continues from the same point when the next element is requested
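The hand-over between the generator function and its caller can be made visible by logging each step. In this sketch `countdown` and `log` are hypothetical names used only for the illustration:

```python
log = []

def countdown(n):
    # Hypothetical generator function, used only to trace execution.
    while n > 0:
        log.append("generator: about to yield {}".format(n))
        yield n   # execution pauses here until the next element is requested
        log.append("generator: resumed after yielding {}".format(n))
        n -= 1

for value in countdown(2):
    log.append("caller: received {}".format(value))

print("\n".join(log))
```

The log entries of the generator and the caller alternate, which shows that the function body is not run to completion in one go.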

In [None]:
def hungarian_vowels():
    alphabet = ("a", "á", "e", "é", "i", "í", "o", "ó",
                "ö", "ő", "u", "ú", "ü", "ű")
    for vowel in alphabet:
        yield vowel

this function returns a generator object

In [None]:
type(hungarian_vowels())

In [None]:
for vowel in hungarian_vowels():
    print(vowel)

In [None]:
gen = hungarian_vowels()

print("first iteration: {}".format(", ".join(gen)))
print("second iteration: {}".format(", ".join(gen)))

The `next` function returns the next element of the generator.
A `StopIteration` is raised when no more elements are left:

In [None]:
gen = hungarian_vowels()

while True:
    try:
        print("The next element is {}".format(next(gen)))
    except StopIteration:
        print("No more elements left :(")
        break

the generator function returns a new generator object every time it's called

In [None]:
gen1 = hungarian_vowels()
gen2 = hungarian_vowels()

print(gen1 is gen2)
print("gen1 first time:", list(gen1))
print("gen1 second time:", list(gen1))
print("gen2 first time:", list(gen2))

iterators can only be traversed forward, but we can easily wrap an iterator to have memory:

In [None]:
def iter_with_memory(orig_iter):
    prev = None
    for current in orig_iter:
        yield current, prev
        prev = current

In [None]:
for i in iter_with_memory(hungarian_vowels()):
    print(i)

## Q. Add a `memory_size` parameter to the previous function which specifies how many of the previous elements are stored.

You can yield them in a list or better, wrap it in a class.
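One possible solution sketch uses `collections.deque` with `maxlen`, which drops the oldest element automatically once it is full. The function name `iter_with_memory_n` is an assumption chosen to avoid overwriting the function above:

```python
from collections import deque

def iter_with_memory_n(orig_iter, memory_size=1):
    # Holds at most `memory_size` of the most recent elements.
    memory = deque(maxlen=memory_size)
    for current in orig_iter:
        # Yield a copy, so later appends do not change what the caller got.
        yield current, list(memory)
        memory.append(current)

result = list(iter_with_memory_n("abcd", memory_size=2))
for item in result:
    print(item)
```

Here the previous elements are yielded as a list, oldest first; wrapping the same logic in a class is left as described in the exercise.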

# Exercises

Generator expressions can be particularly useful for formatted output. We will demonstrate this through a few examples.

In [None]:
numbers = [1, -2, 3, 1]

# print(", ".join(numbers))  # raises TypeError
print(", ".join(str(number) for number in numbers))

In [None]:
shopping_list = ["apple", "plum", "pear"]

Print the shopping list in the following format:

~~~
The shopping list is:
item 1: apple
item 2: plum
item 3: pear
~~~

In [None]:
shopping_list = ["apple", "plum", "pear"]

print("The shopping list is:\n{}".format(
    "\n".join("item {0}: {1}".format(idx+1, element) for idx, element in enumerate(shopping_list))
))

## Q. Print the following shopping list with quantities.

For example:

~~~
item 1: apple, quantity: 2
item 2: pear, quantity: 1
~~~

In [None]:
shopping_list = {
    "apple": 2,
    "pear": 1,
    "plum": 5,
}
print("\n".join(
    "item {0}: {1}, quantity: {2}".format(idx+1, item, quantity)
    for idx, (item, quantity) in enumerate(shopping_list.items())
))

## Q. Print the same format in alphabetical order.

- Then print it in decreasing order by quantity

In [None]:
shopping_list = {
    "apple": 2,
    "pear": 1,
    "plum": 5,
}
# sort the items first, then number them, so the indices follow the sorted order
print("\n".join("item {0}: {1}, quantity: {2}".format(idx+1, item, quantity)
                for idx, (item, quantity) in enumerate(sorted(shopping_list.items()))
))

In [None]:
print("\n".join(
    "item {0}: {1}, quantity: {2}".format(idx+1, item, quantity) for idx, (item, quantity) in
    enumerate(sorted(shopping_list.items(), key=lambda x: -x[1]))))

## Q. Print the list of students. 

In [None]:
students = [
    ["Joe", "John", "Mary"],
    ["Tina", "Tony", "Jeff", "Béla"],
    ["Pete", "Dave"],
]

## Q. Print one class-per-line and print the size of the class too

Example:
~~~
class 1, size: 3, students: Joe, John, Mary
class 2, size: 2, students: Pete, Dave
~~~
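A possible solution sketch, following the same `enumerate` and `join` pattern as the shopping list examples. The names `lines` and `group` are illustrative:

```python
students = [
    ["Joe", "John", "Mary"],
    ["Tina", "Tony", "Jeff", "Béla"],
    ["Pete", "Dave"],
]

# One formatted line per class; the inner join lists the students.
lines = [
    "class {0}, size: {1}, students: {2}".format(
        idx + 1, len(group), ", ".join(group))
    for idx, group in enumerate(students)
]
print("\n".join(lines))
```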

## Q. Sort the classes by size in increasing order

Example:
~~~
class 1, size: 2, students: Pete, Dave
class 2, size: 3, students: Joe, John, Mary
~~~
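A possible solution sketch: sort the classes with `key=len` before numbering them, so that the class numbers follow the sorted order. The names `lines` and `group` are illustrative:

```python
students = [
    ["Joe", "John", "Mary"],
    ["Tina", "Tony", "Jeff", "Béla"],
    ["Pete", "Dave"],
]

# sorted(..., key=len) orders the classes by the number of students.
lines = [
    "class {0}, size: {1}, students: {2}".format(
        idx + 1, len(group), ", ".join(group))
    for idx, group in enumerate(sorted(students, key=len))
]
print("\n".join(lines))
```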