# Content:

- 4.1, 4.2: What are Iterators?
- 4.3: Generators
- 4.7: Slicing Iterators
- 4.8: Dropwhile and Generator Comprehensions
- 4.5: Reverse Iteration
- 4.9: Permutations
- 4.12: Itertools Chain
- 4.14: Flatten Nested Sequence

## What is an Iterator?

In [9]:
class Incrementer:
    def __init__(self, start):
        self.value = start
        
    def __iter__(self):
        return self

    def __next__(self):
        self.value += 1
        return self.value

In [137]:
incrementer = Incrementer(10)
iterator = iter(incrementer)

print(next(iterator))
print(next(iterator))
print(next(iterator))

11
12
13
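
The `Incrementer` above never signals that it is finished, so a `for` loop over it would run forever. A minimal sketch of a bounded variant, assuming a hypothetical `BoundedIncrementer` class with a `stop` argument (neither is part of the original example):

```python
class BoundedIncrementer:
    """Like Incrementer, but stops once `stop` is reached (hypothetical variant)."""

    def __init__(self, start, stop):
        self.value = start
        self.stop = stop

    def __iter__(self):
        # An iterator returns itself from __iter__
        return self

    def __next__(self):
        if self.value >= self.stop:
            # Tells for loops, list(), etc. that there are no more items
            raise StopIteration
        self.value += 1
        return self.value


values = list(BoundedIncrementer(10, 13))
print(values)  # [11, 12, 13]
```

Raising `StopIteration` from `__next__` is what lets `list()` and `for` know when to stop.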


## Under the Hood

<img src="https://encrypted-tbn0.gstatic.com/images?q=tbn%3AANd9GcTLoeIGUgiRG6fL6cYhXGuqhAWROpuqv2zj-w&usqp=CAU">

In [13]:
class Incrementer:
    def __init__(self, start):
        self.value = start
        
    def __iter__(self):
        return self

    def __next__(self):
        self.value += 1
        return self.value

## Examples 

In [179]:
class Foo:
    def __init__(self):
        self.children = [1, 2, 3, 4]
        self.otherstuff = {'complicated': 'stuff'}
        
    def __iter__(self):
        return iter(self.children)

In [139]:
foo = Foo()
my_iterator = iter(foo)

try:
    while True:
        print(next(my_iterator))
except StopIteration:
    print("---- Iterator consumed ----")

1
2
3
4
---- Iterator consumed ----


## Using a for loop?

In [180]:
for child in foo:
    print(child)

1
2
3
4
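
The `for` loop can be read as shorthand for the `iter()` / `next()` / `StopIteration` pattern used earlier. A rough sketch of that equivalence, where `manual_for` is a hypothetical helper written only for illustration:

```python
class Foo:
    def __init__(self):
        self.children = [1, 2, 3, 4]

    def __iter__(self):
        return iter(self.children)


def manual_for(iterable):
    """Collect items the way a for loop walks an iterable (hypothetical helper)."""
    collected = []
    iterator = iter(iterable)          # calls iterable.__iter__()
    while True:
        try:
            item = next(iterator)      # calls iterator.__next__()
        except StopIteration:
            break                      # a for loop ends silently at this point
        collected.append(item)         # stands in for the loop body
    return collected


print(manual_for(Foo()))  # [1, 2, 3, 4]
```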


## What are generators?

In [230]:
def countdown(n):
    print('Counting down from', n)
    while n > 0:
        print('before yield')
        yield n
        print('after yield')
        n -= 1

In [231]:
c = countdown(10)

In [232]:
next(c)

Counting down from 10
before yield


10

In [233]:
next(c)

after yield
before yield


9

In [234]:
next(c)

after yield
before yield


8

- Generators are a type of iterator. Any function that contains `yield` is a generator function, and calling it returns a generator object.

- The body of the generator function does not run when the function is called. It runs only when `next()` is called, and it pauses again at each `yield`.

- This is usually a quicker way to write an iterator for simple purposes. For more complicated uses, where other methods are needed or complex state management is required, a class-based iterator is a better fit.
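
To make the points above concrete, here is a small sketch that compares a generator function with a class-based iterator producing the same values. The `log` list and the `Countdown` class are hypothetical names added for illustration:

```python
log = []


def countdown(n):
    log.append('started')   # runs on the first next(), not when countdown() is called
    while n > 0:
        yield n
        n -= 1


class Countdown:
    """Class-based iterator producing the same values (hypothetical equivalent)."""

    def __init__(self, n):
        self.n = n

    def __iter__(self):
        return self

    def __next__(self):
        if self.n <= 0:
            raise StopIteration
        value = self.n
        self.n -= 1
        return value


gen = countdown(3)
started_before_next = list(log)   # still empty: the body has not run yet
first = next(gen)                 # now the body runs up to the first yield

print(started_before_next, first, log)  # [] 3 ['started']
print(list(gen), list(Countdown(3)))    # [2, 1] [3, 2, 1]
```

The generator version keeps its state in local variables, while the class has to store it explicitly on `self`.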



## Slicing Iterators

In [190]:
def countdown(n):
    while n > 0:
        yield n
        n -= 1
        
c = countdown(10)

c[2:7]

TypeError: 'generator' object is not subscriptable

- A generator can't be sliced because it supports neither indexing nor `len()`.

In [76]:
len(c)

TypeError: object of type 'generator' has no len()

In [77]:
c[0]

TypeError: 'generator' object is not subscriptable

## Itertools `islice`

In [191]:
import itertools

# Iterator object, consume to see content
for x in itertools.islice(c, 2, 7):
    print(x)

8
7
6
5
4


- This works by consuming and throwing away all items up until start index, and then returning the items until end index.
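
The behaviour described above can be sketched in a few lines. This is a simplified stand-in, not the real implementation: `my_islice` is a hypothetical name, and it ignores the `step` argument and `None` bounds that `itertools.islice` supports.

```python
import itertools


def my_islice(iterable, start, stop):
    """Simplified stand-in for itertools.islice (no step, no None bounds)."""
    iterator = iter(iterable)
    # Consume and discard the first `start` items
    for _ in range(start):
        try:
            next(iterator)
        except StopIteration:
            return
    # Yield items until the `stop` index is reached
    for _ in range(stop - start):
        try:
            yield next(iterator)
        except StopIteration:
            return


def countdown(n):
    while n > 0:
        yield n
        n -= 1


c = countdown(10)
print(list(my_islice(c, 2, 7)))  # [8, 7, 6, 5, 4]
print(next(c))                   # 3
```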

Note that this will consume the iterator. Once consumed, the iterator can't go back to the beginning.

In [202]:
c = countdown(10)
for x in itertools.islice(c, 2, 7):
    print(x)

8
7
6
5
4


In [203]:
next(c)

3

<img src="https://img.buzzfeed.com/buzzfeed-static/static/2017-04/3/13/asset/buzzfeed-prod-fastlane-01/sub-buzz-6595-1491241412-2.png?downsize=200%3A%2A&output-quality=auto&output-format=auto&output-quality=auto&output-format=auto&downsize=140:*">

Once an iterator is consumed, it cannot be restored to its initial state. In this case, a new `countdown` generator has to be created.

## Skipping first part of iterable

In [153]:
content = [
    '# This file contains sensitive user information',
    '# Do not distribute',
    'Start Logs:',
    '[IMPORTANT] Data',
]

`itertools.dropwhile` tests items against the condition only until it is false for the first time; from that item on, everything is passed through unchanged.

In [154]:
for line in itertools.dropwhile(lambda s: s.startswith('#'), content):
    print(line)

Start Logs:
[IMPORTANT] Data
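
One consequence worth seeing in code: `dropwhile` only skips the leading run of matching items. The `lines` list and `is_comment` helper below are made-up illustration data, not part of the original example.

```python
import itertools

lines = [
    '# header comment',
    'Start Logs:',
    '# comment in the middle',
    '[IMPORTANT] Data',
]


def is_comment(s):
    return s.startswith('#')


# dropwhile stops testing after the first line that is not a comment,
# so the comment in the middle survives
after_header = list(itertools.dropwhile(is_comment, lines))

# takewhile is the mirror image: it keeps only the leading run
header = list(itertools.takewhile(is_comment, lines))

print(after_header)  # ['Start Logs:', '# comment in the middle', '[IMPORTANT] Data']
print(header)        # ['# header comment']
```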


## Filtering an iterable

In [92]:
# Comments throughout, not just at the start
content = [
    '# This file contains sensitive user information',
    'Start Logs:',
    '[IMPORTANT] Data',
    '# Do not distribute',
    'More important data',
    '# Another comment'
]

In [204]:
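# The outputs below appear to come from a session in which `content` was the
# 4-line list from the previous section (4 * 10000000 = 40000000)
big_list = content * 10000000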
len(big_list)

40000000

In [205]:
list_comprehension = [x for x in big_list if not x.startswith('#')]
len(list_comprehension)

20000000

In [206]:
generator_comprehension = (x for x in big_list if not x.startswith('#'))
generator_comprehension

<generator object <genexpr> at 0x0000023DA1F5FAC0>

In [207]:
import sys

print("List comprehension size:", sys.getsizeof(list_comprehension))
print("Generator comprehension size:", sys.getsizeof(generator_comprehension))

List comprehension size: 165281096
Generator comprehension size: 112
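
The size difference exists because the generator produces items on demand instead of storing them. A scaled-down sketch, using a hypothetical `data` list, that also shows the trade-off: a generator can only be consumed once.

```python
import sys

data = ['# comment', 'keep me', '# another', 'keep me too'] * 1000

as_list = [x for x in data if not x.startswith('#')]
as_gen = (x for x in data if not x.startswith('#'))

# The generator object stays small no matter how many items it will produce
print(sys.getsizeof(as_list) > sys.getsizeof(as_gen))  # True

# Aggregations can consume the generator directly, one item at a time
count = sum(1 for _ in as_gen)
print(count == len(as_list))  # True

# The generator is single-pass: it is now exhausted
leftover = list(as_gen)
print(leftover)  # []
```

Note that `sys.getsizeof` measures only the container itself, not the strings it refers to.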


## Iterating in reverse!

<img src="reverse.jpeg" width="200" height="200">

In [208]:
items = [5, 4, 3, 2, 1]

for x in reversed(items):
    print(x)

1
2
3
4
5
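
`reversed()` works on sequences such as lists, and on objects that define `__reversed__`. A minimal sketch of the second case, using a hypothetical `Countdown` class; a plain generator supports neither, so it has to be turned into a list first.

```python
class Countdown:
    """Counts down from `start` to 1; reversed() counts up (hypothetical class)."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        n = self.start
        while n > 0:
            yield n
            n -= 1

    def __reversed__(self):
        n = 1
        while n <= self.start:
            yield n
            n += 1


print(list(Countdown(5)))            # [5, 4, 3, 2, 1]
print(list(reversed(Countdown(5))))  # [1, 2, 3, 4, 5]

# A generator has no length and no __reversed__, so materialize it first
gen = (x * x for x in range(4))
squares_reversed = list(reversed(list(gen)))
print(squares_reversed)              # [9, 4, 1, 0]
```

Defining `__reversed__` avoids building a full list in memory just to walk it backwards.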


## Iterating over all possible permutations

In [209]:
items = ['a', 'b', 'c']

for x in itertools.permutations(items):
    print(x)

('a', 'b', 'c')
('a', 'c', 'b')
('b', 'a', 'c')
('b', 'c', 'a')
('c', 'a', 'b')
('c', 'b', 'a')


In [113]:
list(itertools.permutations(items, 2))

[('a', 'b'), ('a', 'c'), ('b', 'a'), ('b', 'c'), ('c', 'a'), ('c', 'b')]

## Iterating over all possible combinations

Order does not matter in combinations, so we get the following:

In [210]:
list(itertools.combinations(items, 3))

[('a', 'b', 'c')]

In [211]:
list(itertools.combinations(items, 2))

[('a', 'b'), ('a', 'c'), ('b', 'c')]
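
The relationship between the two can be checked directly: every combination of size `r` corresponds to `r!` permutations. A short sketch (the variable names `perms`, `combs` and `recovered` are illustrative):

```python
import itertools
import math

items = ['a', 'b', 'c']
r = 2

perms = list(itertools.permutations(items, r))
combs = list(itertools.combinations(items, r))

# Each combination of size r can be ordered in r! ways
print(len(perms) == len(combs) * math.factorial(r))  # True

# Sorting each permutation and de-duplicating recovers the combinations
recovered = sorted({tuple(sorted(p)) for p in perms})
print(recovered == combs)  # True
```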

## For combinations with replacement:

In [117]:
list(itertools.combinations_with_replacement(items, 3))

[('a', 'a', 'a'),
 ('a', 'a', 'b'),
 ('a', 'a', 'c'),
 ('a', 'b', 'b'),
 ('a', 'b', 'c'),
 ('a', 'c', 'c'),
 ('b', 'b', 'b'),
 ('b', 'b', 'c'),
 ('b', 'c', 'c'),
 ('c', 'c', 'c')]

Itertools usually has solutions to complex iteration problems, so look there first before trying to write your own solution!

## Enumerate!

In many cases, when iterating over a list we also need the current index.

No need for your own counter variable, just use Python's built-in `enumerate`!

In [222]:
icecream = ['chocolat', 'vanille', 'fraise', 'citron']

for i, value in enumerate(icecream):
    print(f'Item {i}: {value}')

Item 0: chocolat
Item 1: vanille
Item 2: fraise
Item 3: citron


When unpacking a sequence of tuples with `enumerate`, be careful with the unpacking expression: the inner tuple needs its own parentheses.

In [216]:
icecream = [('chocolat', 1.0), ('vanille', 1.2), ('fraise', 0.8), ('citron', 0.9)]

# Be careful about the unpacking expression!
for i, (name, price) in enumerate(icecream):
    print(f'Item {i}: {name}, price: €{price}')

Item 0: chocolat, price: €1.0
Item 1: vanille, price: €1.2
Item 2: fraise, price: €0.8
Item 3: citron, price: €0.9
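
Two details of `enumerate` that the examples above rely on, sketched with the same data: the optional `start` argument changes only the counter, and leaving out the inner parentheses fails because each item is a pair of `(index, tuple)`. The `menu` list is a name introduced here for illustration.

```python
icecream = [('chocolat', 1.0), ('vanille', 1.2), ('fraise', 0.8), ('citron', 0.9)]

# start=1 changes only the counter, not which items are visited
menu = [f'{number}. {name} (€{price})'
        for number, (name, price) in enumerate(icecream, start=1)]

for entry in menu:
    print(entry)

# Without the inner parentheses, Python tries to unpack a 2-item pair into 3 names
try:
    for i, name, price in enumerate(icecream):
        pass
except ValueError as error:
    print('Unpacking failed:', error)
```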


## Chaining multiple iterables

In [217]:
resultsA = [1, 2, 3, 4, 5]
resultsB = [7, 8, 9, 10]

for x in itertools.chain(resultsA, resultsB):
    print(x)

1
2
3
4
5
7
8
9
10


The previous example works, but it doesn't show the real purpose of `itertools.chain`: Python lists already have an `extend` method that produces the same sequence by modifying the first list in place!

In [128]:
resultsA.extend(resultsB)
resultsA

[1, 2, 3, 4, 5, 7, 8, 9, 10]

Okay, this works fine and obviously isn't the best demonstration of `itertools.chain`. What about memory though?

In [220]:
big_results = resultsB * 60000000

listA = [1, 2, 3, 4]
listA.extend(big_results)
print("Extend:", sys.getsizeof(listA))

listA = [1, 2, 3, 4]
print("Chain:", sys.getsizeof(itertools.chain(listA, big_results)))

Extend: 2160000136
Chain: 48
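
The chain object is small because it only keeps references to its inputs and reads from them on demand; nothing is copied. A small sketch of that behaviour, with illustrative lists `first` and `second`:

```python
import itertools

first = [1, 2, 3]
second = [7, 8, 9]

chained = itertools.chain(first, second)   # no new list is built here

# The chain reads from the original lists while it is being consumed,
# so a change made before that point shows up in the result
second.append(10)
result = list(chained)
print(result)  # [1, 2, 3, 7, 8, 9, 10]

# The inputs are untouched, unlike with first.extend(second)
print(first)   # [1, 2, 3]
```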


What if the iterables had different types, though? `itertools.chain` handles this too, since it accepts any iterables. Here a list is chained with a set (whose items come out in arbitrary order).

In [224]:
resultsA = [1, 2, 3]
resultsB = {'x', 'y', 'z'}

for x in itertools.chain(resultsA, resultsB):
    print(x)

1
2
3
x
z
y


## Flatten Nested Sequence

In [225]:
items = [1, [2, 3], [4, [5, 6], [7, 8, 9]]]

In [228]:
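from collections.abc import Iterable

def flatten(items, ignore_types=(str, bytes)):
    for x in items:
        # Strings are iterable too; skip them to avoid infinite recursion
        if isinstance(x, Iterable) and not isinstance(x, ignore_types):
            yield from flatten(x, ignore_types)
        else:
            yield x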

In [229]:
tuple(flatten(items))

(1, 2, 3, 4, 5, 6, 7, 8, 9)
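
When the nesting is exactly one level deep, recursion is not needed: `itertools.chain.from_iterable` is enough. A short sketch with made-up lists, which also shows why deeper nesting still calls for a recursive approach:

```python
import itertools

nested_once = [[1, 2], [3, 4], [5]]
flat = list(itertools.chain.from_iterable(nested_once))
print(flat)  # [1, 2, 3, 4, 5]

# Only the outermost level is removed; inner lists are left as they are
deeper = [[1, [2, 3]], [4]]
partly_flat = list(itertools.chain.from_iterable(deeper))
print(partly_flat)  # [1, [2, 3], 4]
```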

In [235]:
for i in range(10):
    print(i)

0
1
2
3
4
5
6
7
8
9


# Fin