# Iterators and generator expressions

by Koenraad De Smedt at UiB


---
Python has several constructions that iterate on data, such as `for`-loops and comprehensions. So what data types can we iterate on?
Sequences (strings, lists, tuples) and sets are some of the data types that qualify as *iterables*, that is, types that we can iterate on.

In addition, Python has *iterators*, which we can also iterate on.
The difference is that while sequences and sets have a length, iterators do not, because they represent a stream of data in which elements are produced one by one, only when the program asks for them. In this sense, iterators are *lazy* and this is potentially more efficient.
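
As a minimal sketch of this difference, the built-in `iter` function can be used to obtain an iterator from a list. The names `words` and `word_iterator` are only illustrative.

```python
# A list is an iterable: it has a length and can be traversed many times.
words = ['a', 'rose', 'by']
print(len(words))            # 3

# iter() asks an iterable for a fresh iterator over its elements.
word_iterator = iter(words)
print(next(word_iterator))   # a
print(next(word_iterator))   # rose

# An iterator remembers its position, so only the unread part is left.
remaining = list(word_iterator)
print(remaining)             # ['by']

# The list itself is unchanged and can be iterated again from the start.
print(list(words))           # ['a', 'rose', 'by']
```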

This notebook aims to provide an understanding of iterators and shows how to use generator expressions.

---

## Iterators

Consider the `enumerate` function. It creates an iterator that produces tuples, each pairing a count (starting at 0) with the next value obtained from iterating over its argument.

In [None]:
sentence = 'A rose by any other name would smell as sweet.'
sentence_enumerator = enumerate(sentence)
sentence_enumerator

We cannot yet see any elements in the enumerate object, because they have not yet been produced. For the same reason, we cannot ask for the length of the enumerate object. The following will raise a `TypeError`.

In [None]:
len(sentence_enumerator)
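
If the error should not interrupt the program, it can be caught. The following sketch uses a separate, illustrative variable `fresh_enumerator` so that the enumerator above is left untouched.

```python
sentence = 'A rose by any other name would smell as sweet.'
fresh_enumerator = enumerate(sentence)

try:
    len(fresh_enumerator)
except TypeError as error:
    # enumerate objects do not define a length
    length_error = str(error)
    print('TypeError:', length_error)

# The underlying string is a sequence, so it does have a length.
print(len(sentence))  # 46
```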

We can, however, ask for the next value to be produced. This is like a ticket machine that prints a number whenever a button is pushed. Execute the following cell repeatedly until the iterator is exhausted, at which point `next` raises `StopIteration`.


In [None]:
print(next(sentence_enumerator))
print(next(sentence_enumerator))
print(next(sentence_enumerator))
print(next(sentence_enumerator))
print(next(sentence_enumerator))
print(next(sentence_enumerator))
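
What happens at exhaustion can be seen more quickly on a short string. This sketch, with the illustrative name `short_enumerator`, also shows that `next` accepts an optional second argument that is returned instead of raising an error.

```python
short_enumerator = enumerate('ab')
print(next(short_enumerator))   # (0, 'a')
print(next(short_enumerator))   # (1, 'b')

try:
    next(short_enumerator)
except StopIteration:
    print('The iterator is exhausted.')

# With a default value, next() returns it instead of raising StopIteration.
fallback = next(short_enumerator, None)
print(fallback)                 # None
```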

Normally there is no need to use `next`, because programming constructions for iteration, such as `for`-loops and comprehensions, stop by themselves when the iterator is exhausted.

In [None]:
# Use `char` rather than `chr`, which would shadow the built-in function chr()
for num, char in enumerate(sentence):
    if char.lower() in 'aeiouy':
        print(char, 'at position', num)
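
To see what a `for`-loop does on our behalf, here is a rough hand-written equivalent on the short word `'rose'`. It is only a sketch of the idea, not how one would normally write the loop.

```python
word = 'rose'
positions = []

letter_enumerator = enumerate(word)
while True:
    try:
        num, char = next(letter_enumerator)
    except StopIteration:
        break  # a for-loop ends silently at this point
    if char in 'aeiouy':
        positions.append(num)

print(positions)  # [1, 3]
```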

We can test if something is in the iterator. The test consumes elements one by one and stops as soon as a match is found; the elements after the match remain in the iterator.

In [None]:
sentence_enumerator = enumerate(sentence)
(24, ' ') in sentence_enumerator


Check how much is left by unpacking the rest into a list. This shows how much we avoided computing in the previous cell.

In [None]:
[*sentence_enumerator]
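
The same steps can be put in one self-contained sketch that also counts what is left. The name `position_enumerator` is illustrative.

```python
sentence = 'A rose by any other name would smell as sweet.'
position_enumerator = enumerate(sentence)

# The membership test consumes the tuples for positions 0 to 24.
found = (24, ' ') in position_enumerator
rest = list(position_enumerator)

print(found)       # True
print(rest[0])     # (25, 'w')
print(len(rest))   # 21
```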

---
## Generator expressions

We can make our own iterator by means of a generator expression. This is like a list comprehension, but it is written with parentheses instead of square brackets. It results in a generator object, which is a kind of iterator: we can obtain subsequent values with `next`.

In [None]:
vowels = 'aeiouy'
vowels_generator = (char.upper() for char in sentence if char.lower() in vowels)
vowels_generator
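
For comparison, this sketch builds the same values once as a list and once as a generator. The names `vowel_list` and `vowel_stream` are illustrative.

```python
sentence = 'A rose by any other name would smell as sweet.'
vowels = 'aeiouy'

# Square brackets: all values are computed immediately and stored.
vowel_list = [char.upper() for char in sentence if char.lower() in vowels]
# Parentheses: nothing is computed until a value is requested.
vowel_stream = (char.upper() for char in sentence if char.lower() in vowels)

print(type(vowel_list).__name__)            # list
print(type(vowel_stream).__name__)          # generator
print(len(vowel_list))                      # 16

# A generator is its own iterator: iter() returns the same object.
print(iter(vowel_stream) is vowel_stream)   # True

# The first value is produced only now.
print(next(vowel_stream))                   # A
```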

Like the enumerate object above, the generator produces the next element every time we ask.
If we try to take the next element and none is left, a `StopIteration` error is raised. Generators may be efficient if the sequence could be very long or infinite, and we need only a few elements.

Execute the following cell several times.

In [None]:
print(next(vowels_generator))
print(next(vowels_generator))
print(next(vowels_generator))
print(next(vowels_generator))
print(next(vowels_generator))

After taking out some or all elements in the previous cell, unpack all remaining elements with a starred expression.

In [None]:
[*vowels_generator]
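
The earlier remark about infinite streams can be illustrated with a sketch that is not part of the original notebook. It assumes the standard library functions `itertools.count` and `itertools.islice`. Note that unpacking such a generator with a starred expression would never finish, so we take only a slice of it.

```python
from itertools import count, islice

# count(1) yields 1, 2, 3, ... without end, so the squares never run out.
squares = (n * n for n in count(1))

# islice takes only the first five values; the rest is never computed.
first_five = list(islice(squares, 5))
print(first_five)      # [1, 4, 9, 16, 25]

# The generator resumes where it stopped.
print(next(squares))   # 36
```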

If a generator is exhausted, it cannot be restarted; we need to make a new one.

In [None]:
vowels_generator = (char.upper() for char in sentence if char.lower() in vowels)
vowels_generator

When we check if something is in the generator, it will only produce as many values as needed to find what we are looking for, and production pauses.

In [None]:
'Y' in vowels_generator

Now unpack all remaining elements into a list. This shows which elements were *not* computed by the code in the previous cell.

In [None]:
[*vowels_generator]
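
Because a membership test consumes elements, repeating a test on the same generator may give a different answer. A small sketch with the illustrative name `letters`:

```python
letters = (char for char in 'abcde')

first_check = 'c' in letters    # consumes 'a', 'b' and 'c'
second_check = 'a' in letters   # 'a' is gone; the search uses up 'd' and 'e'
leftover = list(letters)

print(first_check)    # True
print(second_check)   # False
print(leftover)       # []
```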

There is a built-in function `any` which checks if any element in an iterable is true. Values are generated until a true one is found, and then it stops.

In [None]:
vowels_generator = (char.upper() for char in sentence if char.lower() in vowels)
any(v in {'U', 'Y'} for v in vowels_generator)

Again, we can check which elements of the generator were *not* computed by the above expression. This shows that a generator can be more efficient than a sequence.

In [None]:
[*vowels_generator]
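
Another way to observe that `any` stops early is to record every value that gets tested. The helper `is_u_or_y` and the list `examined` are hypothetical names introduced only for this sketch.

```python
examined = []

def is_u_or_y(vowel):
    # Hypothetical helper that records every value it is asked about.
    examined.append(vowel)
    return vowel in {'U', 'Y'}

result = any(is_u_or_y(v) for v in 'AOEYAYOE')

print(result)     # True
print(examined)   # ['A', 'O', 'E', 'Y']
```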

There is also a built-in function `all` which checks if all elements of an iterable are true. It stops as soon as it finds an element that is false.

In [None]:
vowels_generator = (char.upper() for char in sentence if char.lower() in vowels)
all(v not in {'U', 'Y'} for v in vowels_generator)

Check what is left in the generator.

In [None]:
[*vowels_generator]
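
The same behaviour can be sketched with numbers. The example also shows the results of `all` and `any` on an empty iterable, which may be surprising at first. The name `numbers` is illustrative.

```python
numbers = iter([2, 4, 5, 6, 8])

# all() stops at 5, the first element that fails the test.
all_even = all(n % 2 == 0 for n in numbers)
unread = list(numbers)

print(all_even)   # False
print(unread)     # [6, 8]

# With no elements at all, all() is True and any() is False.
print(all([]))    # True
print(any([]))    # False
```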

In conclusion, generator expressions look a lot like comprehensions, but they are *lazy* and therefore potentially more efficient.
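
As a rough illustration of the potential saving, the following sketch compares the size of a list with that of an equivalent generator. It assumes CPython, where `sys.getsizeof` reports the size of the container object itself; exact numbers vary between versions, so none are shown.

```python
import sys

n = 100_000
square_list = [i * i for i in range(n)]        # all values stored at once
square_generator = (i * i for i in range(n))   # values produced on demand

list_size = sys.getsizeof(square_list)
generator_size = sys.getsizeof(square_generator)
print(list_size > generator_size)              # True on CPython

# Both give the same result when consumed by sum().
total_from_list = sum(square_list)
total_from_generator = sum(square_generator)
print(total_from_list == total_from_generator)  # True
```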