# Three ways to iterate through a container

A tuple called sample is chosen, simply because it is quick to write ;)

In [1]:
sample = 1, 2, 3, 4

## The best way, the standard Python approach:

In [2]:
for element in sample:
    print(element)

1
2
3
4


## The bad way, iterating over the indices and getting the elements via them

Why is it bad? If you are not careful enough, it can lead to IndexErrors, mostly because you try to get an element at an index that isn't in the sequence. It also looks ugly compared to the standard way.

In [3]:
# BAD example

for i in range(len(sample)):
    print(sample[i])

1
2
3
4
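If the index itself is needed, for example to number the output, the index loop is still not required. Here is a minimal sketch using the built-in enumerate, which yields pairs of index and element. The name `numbered` is only chosen for illustration.

```python
sample = 1, 2, 3, 4

# enumerate pairs every element with its position, so no manual indexing is needed
numbered = []
for index, element in enumerate(sample):
    numbered.append((index, element))
    print(index, element)
```

Since no index is ever used to look something up, there is nothing that could go out of range.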


## The worst way, iterating with a while loop over indices to get the elements

First, the main difference between a while loop and a for loop is that you use a for loop when you know what you are iterating over, and thus how many iterations you have to make. The while loop is for cases where you don't know that in advance.

Second, you need to specify a stop condition to prevent IndexErrors.

Third, it just looks ugly compared to the standard way.

In [4]:
# Even WORSE!!!

i = 0
while i < len(sample):
    print(sample[i])
    i += 1

1
2
3
4


## So, what happened in the first for loop?

Python uses the iterator protocol. Let's look at an iterator.

There is a built-in function called iter that creates an iterator from a sequence.

In [5]:
iter(sample)

<tuple_iterator at 0x7f526c3b6320>

Wow, that isn't really helpful, let's investigate a little more.

In [6]:
it = iter(sample)

In [7]:
next(it)

1

Now something happened. Calling next with the iterator gives us a value, the first one to be specific.

In [8]:
next(it)

2

In [9]:
next(it)

3

In [10]:
next(it)

4

Hmm, so we reached the last element. What happens next?

In [11]:
next(it)

StopIteration: 

It raises an exception called StopIteration. The iterator is exhausted, there is nothing more to iterate over.
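If the exception is not wanted, next also accepts a second argument that is returned once the iterator has nothing left. Here is a small sketch. The names `values` and `leftover` are only illustrative.

```python
sample = 1, 2, 3, 4
it = iter(sample)

# pull the four elements out of the iterator one by one
values = [next(it) for _ in range(4)]
print(values)

# with a default, next() returns it instead of raising StopIteration
leftover = next(it, None)
print(leftover)
```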

## Conclusion

An iterator gives us its values one at a time via the next function.

It is also consumable: it can be emptied, and in that case a StopIteration is raised.
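That an iterator can be used up is easy to see when it is consumed twice. Here is a minimal sketch. The variable names are made up for this example.

```python
sample = 1, 2, 3, 4
it = iter(sample)

first_pass = list(it)   # consumes the iterator completely
second_pass = list(it)  # nothing is left, so this list is empty

# the tuple itself is untouched and hands out a fresh iterator every time
fresh_pass = list(iter(sample))

print(first_pass, second_pass, fresh_pass)
```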

Let's build a standard for loop with that knowledge.

In [12]:
it = iter(sample)
while True:  # no condition, let's run this loop until we decide to break out
    try:
        print(next(it))
    except StopIteration:
        break  # terminates the loop in case a StopIteration is caught

1
2
3
4


In fact, the standard for loop works like that. In CPython it is just highly optimized and implemented in C.
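Because the for loop only relies on iter and next, any object that provides `__iter__` and `__next__` can be looped over. Here is a rough sketch with a hypothetical `Countdown` class. The class and its attribute names are assumptions made for illustration only.

```python
class Countdown:
    """Hypothetical example class: counts down from start to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # an iterator returns itself from __iter__
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration  # signals the for loop to stop
        value = self.current
        self.current -= 1
        return value


countdown_values = []
for number in Countdown(3):
    countdown_values.append(number)

print(countdown_values)
```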

The good things about iterators:

We don't need to worry about an IndexError (index out of range).
In fact, we don't need to know the length of a sequence to loop over it.

Also, we don't touch the original sequence, since iter() creates an iterator from it.
Furthermore, an iterator is small: it just knows where it currently is and how to get the next value.
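Not needing the length goes as far as looping over something that has no length at all. Here is a sketch using itertools.count from the standard library, an endless iterator. The threshold of 50 is an arbitrary choice for this example.

```python
from itertools import count

# count(1) is an endless iterator: 1, 2, 3, ... It has no length at all
first_big_square = None
for number in count(1):
    if number ** 2 > 50:
        first_big_square = number ** 2
        break  # we decide when to stop, not the length of a sequence

print(first_big_square)
```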

Let's compare the sizes:

In [13]:
from sys import getsizeof

In [14]:
getsizeof(sample)

80

In [15]:
it = iter(sample)
getsizeof(it)

56

That doesn't look like much, but let's compare way bigger sequences:

In [16]:
big = tuple(range(10**6))
len(big)

1000000

In [17]:
getsizeof(big)

8000048

In [18]:
it = iter(big)
getsizeof(it)

56

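That is more of a difference :D Keep in mind that getsizeof only measures the object itself: the iterator refers to the original sequence instead of copying it.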

The core developers of Python really like iterators, so one of the main differences between Python 2 and Python 3 is the extended use of iterators.

zip, map and filter, which return lists in Python 2, now return iterators, and range now returns a lazy range object instead of a list (enumerate was already lazy in Python 2). And this makes a lot of sense. Why create a new memory-heavy list if they can just pull elements from an already given original sequence?
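What these built-ins return in Python 3 can be checked with the Iterator class from collections.abc. Here is a small sketch. The variable names are only illustrative.

```python
from collections.abc import Iterator

numbers = [1, 2, 3]
letters = ['a', 'b', 'c']

zipped = zip(numbers, letters)
mapped = map(str, numbers)
lazy_range = range(10**6)

# zip and map objects are iterators: they produce values on demand
print(isinstance(zipped, Iterator), isinstance(mapped, Iterator))

# range is not an iterator but a lazy sequence: it supports len() and indexing
print(isinstance(lazy_range, Iterator), len(lazy_range), lazy_range[-1])

# only now are the pairs actually created
pairs = list(zipped)
print(pairs)
```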

# So what's the deal?

Knowing about iterators leads to another advanced topic:

# Generators
The customizable iterator.

Now you can create your own iterators. They look like functions but with one main difference:

In [19]:
def gen():
    yield 1
    yield 2
    yield 3
    
for i in gen():
    print(i)

1
2
3


There is no return; instead there is a yield statement.

Simply put, yield marks a pause point in the routine.

A next call activates the generator, and all code is executed until it hits a yield statement. The value of that statement is returned by the next function, and the generator is paused until the following next call.
After another next call, everything after the yield is executed until it reaches another yield, and so on.
If no further yield is reached, the generator is exhausted and a StopIteration is raised, just like with an iterator.

In [20]:
g = gen()
next(g)

1

In [21]:
next(g)

2

In [22]:
next(g)

3

In [23]:
next(g)

StopIteration: 
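The pausing and resuming can be made visible by recording when each part of the generator body runs. Here is a sketch. `traced_gen` and its `log` list are hypothetical names used only for this demonstration.

```python
def traced_gen(log):
    # 'log' is a plain list used only to record when each part runs
    log.append('started')
    yield 1
    log.append('resumed after first yield')
    yield 2
    log.append('finished')


log = []
g = traced_gen(log)
print(log)  # creating the generator has not run any of its code yet

first = next(g)  # runs up to the first yield
print(first, log)

second = next(g)  # continues after the first yield, up to the second one
print(second, log)

last = next(g, 'done')  # runs the rest; the default replaces StopIteration
print(last, log)
```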

# Generators are coroutines, not functions

The yield behaviour makes them coroutines, because unlike functions they can be paused and resumed.

Generators also have additional functionality: you can send data in, which is also handled via the yield statement, plus a few minor features. That is a little off-topic for now, but it will become relevant with asynchronous IO.

In [24]:
def listener():
    while True:
        msg = yield
        print('Got: {}'.format(msg))

In [25]:
l = listener()

In [26]:
next(l)

In [27]:
l.send('Start')

Got: Start


In [28]:
l.send('Hello World')

Got: Hello World


In [29]:
l.send('Monty Python')

Got: Monty Python
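A yield can receive and hand out data at the same time. Here is a sketch of a generator that keeps a running average of everything sent to it. `running_average` is a made-up example, not something from the standard library.

```python
def running_average():
    # hypothetical example: receives numbers via send() and yields the mean so far
    total = 0
    count = 0
    average = None
    while True:
        value = yield average  # hands out the average, then waits for the next value
        total += value
        count += 1
        average = total / count


averager = running_average()
next(averager)  # advance to the first yield, so the generator can receive data

averages = [averager.send(number) for number in (10, 20, 30)]
print(averages)

averager.close()  # stop the endless generator
```

The first next call is needed for the same reason as in the listener above: a generator can only receive a value while it is paused at a yield.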


# Let us use a generator for something more normal

Let's create a list of squares, traditionally and with a generator.

In [30]:
sample = list(range(10))
sample

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

In [31]:
squared_sample = []
for element in sample:
    squared_sample.append(element**2)
    
squared_sample

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

In [32]:
def square_gen(seq):
    for element in seq:
        yield element**2

In [33]:
list(square_gen(sample))

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

Okay, both give us the same new squared sequence. Let's compare the sizes:

In [34]:
getsizeof(sample)

200

In [35]:
getsizeof(squared_sample)

192

In [36]:
getsizeof(square_gen(sample))

88

In [37]:
sample = list(range(10**6))
len(sample)

1000000

In [38]:
squared_sample = []
for element in sample:
    squared_sample.append(element**2)
    
len(squared_sample)

1000000

In [39]:
getsizeof(sample)

9000112

In [40]:
getsizeof(squared_sample)

8697464

In [41]:
getsizeof(square_gen(sample))

88

The generator is still as small as the iterator.
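For a simple case like squaring, Python also offers generator expressions, which look like list comprehensions with round brackets. Here is a minimal sketch. The variable names and the size of 1000 are only chosen for illustration.

```python
from sys import getsizeof

sample = list(range(1000))

squares_list = [element**2 for element in sample]  # holds every square at once
squares_gen = (element**2 for element in sample)   # produces squares on demand

print(getsizeof(squares_list), getsizeof(squares_gen))

# sum pulls the squares one at a time, no intermediate list is built
total = sum(squares_gen)
print(total)
```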

# The yield from statement

Besides yield, there is also a yield from statement.

In its simplest use, `yield from seq` is just the short form of this:

    for element in seq:
        yield element


In [42]:
def pairwise(seq):
    for i, j in zip(seq, seq[1:]):
        yield i, j

In [43]:
list(pairwise(range(4)))

[(0, 1), (1, 2), (2, 3)]

In [44]:
def pairwise_from(seq):
    yield from zip(seq, seq[1:])

In [45]:
list(pairwise_from(range(4)))

[(0, 1), (1, 2), (2, 3)]
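yield from is especially handy when a generator delegates to another generator, for example to itself. Here is a sketch that flattens nested lists. `flatten` is a hypothetical helper written for this example.

```python
def flatten(nested):
    # hypothetical helper: walks lists inside lists and yields the plain values
    for item in nested:
        if isinstance(item, list):
            yield from flatten(item)  # delegate to the generator for the inner list
        else:
            yield item


flat = list(flatten([1, [2, 3], [4, [5, 6]], 7]))
print(flat)
```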

# Generators are cool, I get it, but what comes next?

They are the concept behind all the nice things you can find in the itertools module, and itertools is very, very useful and cool.
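As a small taste of how generators and itertools work together, here is a sketch that takes the first ten values from an endless generator with itertools.islice. The `fibonacci` generator is a made-up example.

```python
from itertools import islice


def fibonacci():
    # endless generator: there is no last Fibonacci number
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b


# islice takes a slice of any iterator, here the first ten values
first_ten = list(islice(fibonacci(), 10))
print(first_ten)
```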

You can also improve the interface of your functions with them; we'll come to this later.

And as I mentioned before, there is asyncio. This is a very advanced topic, but it grew almost completely out of generators!