## The Wonderful World of Generators!

Python training6


In [3]:
for i in range(0,3):
    print(i)

0
1
2


You might naively think that `range` creates a list, `[0, 1, 2]`, but compare the following.

In [8]:
print([0,1,2])
print(range(0,3))

[0, 1, 2]
range(0, 3)


So, `range` doesn't actually store any integers. It is a lazy object that computes each value on demand, much like a __generator__ does (strictly speaking, `range` is an immutable sequence type rather than a generator, but it is lazy in the same spirit).
Until its internal ```__iter__``` method is called, implicitly or explicitly (see the tutorial on magic functions), it just sits, waiting to be used for useful work. Here is what I mean by explicitly:

In [42]:
x = range(2)
print(x.__iter__())

<range_iterator object at 0x7ff65c30f780>


Okay, that didn't hand us any integers, just a `range_iterator` object. Why? According to the documentation, `range` in Python 3 is its own immutable sequence type, not a generator, and calling `__iter__` on it returns a fresh iterator that only produces values once you call `next` on it. But that's okay, because...we can make our own generators!
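To see what makes `range` a bit special, here is a small sketch that pokes at it. The variable names are just for illustration; the point is that a `range` behaves like a lazy sequence and is not itself an iterator.

```python
r = range(3)

# A range is a sequence: it has a length and supports indexing and membership tests.
length = len(r)
last = r[-1]
has_two = 2 in r

# Each call to iter() hands back a fresh, independent iterator.
it1 = iter(r)
it2 = iter(r)
first_from_it1 = next(it1)
first_from_it2 = next(it2)   # it2 is unaffected by it1

# The range itself is not an iterator, so next(r) raises TypeError.
try:
    next(r)
    range_is_iterator = True
except TypeError:
    range_is_iterator = False

print(length, last, has_two, first_from_it1, first_from_it2, range_is_iterator)
```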

## Making your own generator
So, here we make use of the all-important ```yield``` statement, which turns any old function into a generator function: calling it returns a generator object instead of running the body right away.

In [65]:
def my_gen(alist):
    for i in alist:
        yield i

In [66]:
gen = my_gen([1,2,3])
gen

<generator object my_gen at 0x7ff65c351570>

In [67]:
print(next(gen))
print(next(gen))
print(next(gen))
print(next(gen))

1
2
3


StopIteration: 

There you have it. The generator _yields_ the next element available, on demand, each time `next` is called on it. When there is nothing left (when the function body finishes), the `__next__` magic method being called under the hood raises the StopIteration exception, which is the same exception that allows a `for` loop to gracefully stop iterating over an iterable like `range`. In fact:

In [69]:
x = my_gen([3,4,5])
for i in x:
    print(i)

3
4
5


Voila! No StopIteration exception!
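Why no exception? Roughly speaking, a `for` loop catches it for you. The sketch below spells out by hand what the loop does; it is an approximation of the behaviour, not the interpreter's actual implementation, and `collected` is just a name chosen for the example.

```python
def my_gen(alist):
    for i in alist:
        yield i

collected = []
iterator = iter(my_gen([3, 4, 5]))   # a for loop starts by calling iter()
while True:
    try:
        item = next(iterator)        # then it calls next() over and over
    except StopIteration:
        break                        # and quietly stops when this is raised
    collected.append(item)

print(collected)
```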

So, these are clearly trivial examples just to motivate your understanding. When does this actually come in handy?
Well, imagine you have a massive data set, somewhere in the area of 1 or 2 GB of JSON or XML (or just plain text) data sitting on your laptop. Are you going to read that whole monster into memory just so you can take a look at it? Of course not!
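Here is a minimal sketch of that idea. The function name `read_lines` and the tiny temporary file are assumptions made up for this example (the temp file stands in for the multi-gigabyte monster); the point is that only the lines you actually ask for get read.

```python
import os
import tempfile


def read_lines(path):
    """Yield one line at a time instead of loading the whole file."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:              # file objects hand out lines lazily
            yield line.rstrip("\n")


# A tiny stand-in for the huge data file.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False,
                                 encoding="utf-8") as tmp:
    tmp.write("first\nsecond\nthird\n")
    path = tmp.name

lines = read_lines(path)
preview = [next(lines), next(lines)]     # only two lines are pulled from the file
lines.close()                            # stop early; this also closes the file
os.remove(path)

print(preview)
```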

### Iterators
In case you had no idea what I was talking about when I talked about the `__iter__` method or iterators generally, let me give a quick overview.

An iterator is simply an object that can be...iterated...across. A list is an obvious example, as is any object that has ordered elements. At its most basic, this simply means the object needs to implement some version of `__iter__`. A simple example:

In [111]:
class DogHouse():
    def __init__(self, dogs, states):
        self.dogs = dogs
        self.states = states
    def __iter__(self):
        for dog, state in zip(self.dogs, self.states):
            yield dog, state

doghouse = DogHouse(('Chris', 'Raymond', 'Mohammed'), ('horny', 'hungry', 'hilarious'))

for dog, state in doghouse:
    print(dog + " is " + state)

Chris is horny
Raymond is hungry
Mohammed is hilarious


Try turning the above `yield` statement into a `return` statement, if you're still wondering why we must use `yield`. 
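If you'd rather not break your own class, here is a sketch of that experiment. `BrokenDogHouse` is a hypothetical copy of the class above with `return` swapped in; `__iter__` then hands back a plain tuple rather than an iterator, and Python refuses to loop over it.

```python
class BrokenDogHouse:
    def __init__(self, dogs, states):
        self.dogs = dogs
        self.states = states

    def __iter__(self):
        for dog, state in zip(self.dogs, self.states):
            return dog, state   # returns a plain tuple on the very first pass


broken = BrokenDogHouse(('Chris', 'Raymond'), ('hungry', 'hilarious'))

try:
    for dog, state in broken:
        print(dog + " is " + state)
    error_message = None
except TypeError as exc:
    # The for loop asked __iter__ for an iterator and got a tuple instead.
    error_message = str(exc)

print(error_message)
```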


Okay, so I lied a bit. Technically, this instance of class `DogHouse` is not an iterator but an _iterable_. It has an `__iter__` method that returns an iterator (here, a generator), so it can be iterated over, as we have done above.

We can explicitly get an iterator from it, as it has an `__iter__` method, by calling the built-in `iter` function (which simply calls the magic method `__iter__` for us and returns the result, here a generator object):

In [102]:
dog_iter = iter(doghouse)
print(next(dog_iter))
print(next(dog_iter))
print(next(dog_iter))
print(next(dog_iter))

('Chris', 'horny')
('Raymond', 'hungry')
('Mohammed', 'hilarious')


StopIteration: 

And, we get the StopIteration exception we expect.
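The difference between the doghouse and the thing `iter` gives back can be sketched like this. It reuses the `DogHouse` class from above with shortened sample data, and the variable names are only for illustration.

```python
class DogHouse:
    def __init__(self, dogs, states):
        self.dogs = dogs
        self.states = states

    def __iter__(self):
        for dog, state in zip(self.dogs, self.states):
            yield dog, state


doghouse = DogHouse(('Chris', 'Raymond'), ('hungry', 'hilarious'))

# The doghouse can be walked through as many times as you like...
first_pass = list(doghouse)
second_pass = list(doghouse)

# ...but it has no __next__ of its own.
has_next_method = hasattr(doghouse, "__next__")

# The object returned by iter() is a one-shot deal.
dog_iter = iter(doghouse)
returns_itself = iter(dog_iter) is dog_iter   # an iterator's __iter__ returns itself
drained = list(dog_iter)
leftover = list(dog_iter)                     # already exhausted, so nothing is left

print(first_pass == second_pass, has_next_method, returns_itself, drained, leftover)
```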

### Comprehensions

Okay, comprende so far? Now we move on to comprehensions, a compact syntax for building lists, dictionaries, and sets from anything you can iterate over. They are very useful.

In [110]:
x = [i for i in range(10)]
y = list(range(10))
x == y

True

As the True output shows, these two ways of creating a list from a `range` object lead to the same result. The creation of x simply makes explicit that `range` produces the values on the fly, which are then placed in the list, element by element. Fancier uses involve making lists within lists, as well as dictionaries:

In [106]:
[[i for i in range(3)] for j in range(3)]

[[0, 1, 2], [0, 1, 2], [0, 1, 2]]

In [109]:
{e:i for e, i in zip(['a','b','c'], [1, 2, 3])}

{'a': 1, 'b': 2, 'c': 3}

If you're not sure what `zip` does, play around with it a bit.
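As a starting point for that playing around, here is a small sketch. The list names are made up for the example; it shows `zip` pairing elements up position by position, and that the dictionary comprehension above gives the same result as passing the pairs straight to `dict`.

```python
letters = ['a', 'b', 'c']
numbers = [1, 2, 3]

# zip is lazy too, so wrap it in list() to look at the pairs.
pairs = list(zip(letters, numbers))

# The comprehension and dict(zip(...)) build the same dictionary.
from_comprehension = {e: i for e, i in zip(letters, numbers)}
from_dict = dict(zip(letters, numbers))

# zip stops as soon as the shortest input runs out.
short = list(zip(letters, [1, 2]))

print(pairs)
print(from_comprehension == from_dict)
print(short)
```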