<h1 align="center">ITERATORS & GENERATORS</h1>
<h2 align="left"><ins>Lesson Guide</ins></h2>

- [**ITERATORS AND GENERATORS**](#itergen)
- [**BUILDING GENERATORS WITH THE `YIELD` KEYWORD**](#yield)
- [**BUILT-IN FUNCTIONS: `next()` and `iter()`**](#next)
    - [**A Brief Note on `range()`**](#range)
- [**MORE EXAMPLES**](#examples)
- [**SUMMARY**](#summary)
- [**PYTHON GENERATOR CLASSES AND ITERATORS**](#genclass)
- [**ITERABLES IN PYTHON**](#iterables)

#### [Documentation](https://docs.python.org/3/tutorial/classes.html#generators)

<a id='itergen'></a>
## ITERATORS AND GENERATORS
In this section we will look at some of the differences between iteration and generation in Python and how to construct our own generators with the `yield` statement. Generators allow us to produce values as we go along, instead of holding everything in memory. Several built-in functions are lazy in a similar way: `map()` and `filter()` return iterators, and `range()` returns a lazy sequence that computes its values on demand.

Generator functions allow us to write a function that can send back a value and then later resume to pick up where it left off. This type of function is a generator in Python, allowing us to generate a sequence of values over time. The main difference in syntax will be the use of a <code>yield</code> statement.

In most respects, a generator function looks very similar to a normal function. The main difference is that calling a generator function does not run its body and return a value; instead, the call returns a generator object that supports the iteration protocol. Each time a value is requested, the body runs until it reaches a `yield`, hands back that value, and suspends with its local state intact. On the following request it resumes from that point. The main advantage here is that instead of having to compute an entire series of values up front, the generator computes one value and then suspends its activity awaiting the next request. This feature is known as ***state suspension***.
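To make state suspension concrete, here is a minimal sketch. The `countdown` generator and the `events` list are illustrative names that do not appear elsewhere in this lesson; the list simply records when each part of the function body actually runs.

```python
def countdown(n, events):
    """Yield n, n-1, ..., 1 and record each step in the events list."""
    events.append("body started")
    while n > 0:
        events.append(f"about to yield {n}")
        yield n                      # execution pauses here until the next request
        n -= 1
    events.append("body finished")


events = []
gen = countdown(2, events)
events_after_call = list(events)     # calling the function runs none of the body

first = next(gen)                    # runs the body up to the first yield
events_after_first = list(events)

second = next(gen)                   # resumes after the yield, loops, pauses again
remaining = list(gen)                # runs the rest of the body to completion

print(events_after_call)             # []
print(first, events_after_first)     # 2 ['body started', 'about to yield 2']
print(second, remaining)             # 1 []
print(events[-1])                    # body finished
```

Nothing is recorded until the first `next()` call, and each later call picks up right after the previous `yield`.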

<a id='yield'></a>
## BUILDING GENERATORS WITH THE `YIELD` KEYWORD

In [1]:
# normal function

def create_cubes(n):
    result = []
    for num in range(n):
        result.append(num**3)
    
    return result

# we can print the full list out or loop through the list:
print(create_cubes(10))

# for x in create_cubes(10):
#     print(x)

[0, 1, 8, 27, 64, 125, 216, 343, 512, 729]


In [2]:
# Generator function for the cube of numbers

def gencubes(n):
    for num in range(n):
        yield num**3

# Because of yield, when we call the function
# it does not return a list but rather an object.
print(gencubes(10))

<generator object gencubes at 0x0000018C56AA0848>


In [3]:
# to see all the values of a generator at once, we can convert
# the generator into a list:
print(list(gencubes(10)))

# or we can loop through the generator the same way we would an iterable
# for x in gencubes(10):
#     print(x)

[0, 1, 8, 27, 64, 125, 216, 343, 512, 729]


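We can check the amount of memory required for each of the methods with `sys.getsizeof()`. Note that it reports the size of the container object only, not of the items the container refers to.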

In [4]:
import sys
# help(sys)

In [5]:
big_range = range(1000)
big_list = list(range(1000))

for_list = []
for val in range(1000):
    for_list.append(val)

print("big_range is {} bytes".format(sys.getsizeof(big_range)))
print("big_list is {} bytes".format(sys.getsizeof(big_list)))
print("for_list is {} bytes".format(sys.getsizeof(for_list)))

big_range is 48 bytes
big_list is 9112 bytes
for_list is 9024 bytes


In [6]:
def create_numbers(n):
    result = []
    for num in range(n):
        result.append(num)
    
    return result

my_func = create_numbers(1000)
my_iter = iter(my_func)
print("my_iter is {} bytes".format(sys.getsizeof(my_iter)))

my_iter is 56 bytes


In [7]:
def my_range(n: int):
#     print("my_range starts")
    start = 0
    while start < n:
#         print("my_range is returning {}".format(start))
        yield start
        start += 1
        
my_range_list = my_range(1000)
print("my_range_list is {} bytes".format(sys.getsizeof(my_range_list)))

my_range_list is 120 bytes
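The measurements above can be taken one step further. The sketch below uses a hypothetical `count_up_to` generator to suggest why generators scale well: the generator object stays the same size however many values it will eventually produce, while a list grows with its length. Exact byte counts vary between Python versions and platforms, so only comparisons are shown.

```python
import sys


def count_up_to(n):
    """Yield 0, 1, ..., n - 1 one value at a time."""
    i = 0
    while i < n:
        yield i
        i += 1


small_gen = count_up_to(10)
large_gen = count_up_to(10_000_000)   # no values have been computed yet
small_list = list(range(10))
large_list = list(range(100_000))

gen_sizes_match = sys.getsizeof(small_gen) == sys.getsizeof(large_gen)
list_grows = sys.getsizeof(large_list) > sys.getsizeof(small_list)

print("generator size independent of n:", gen_sizes_match)
print("list size grows with length:", list_grows)
```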


<a id='next'></a>
## BUILT-IN FUNCTIONS: `next()` and `iter()` 
A key to fully understanding generators is the `next()` function and the `iter()` function.
- The `next()` function retrieves the next value from an iterator.
- The `iter()` function returns an iterator for an iterable object.

In [8]:
yield_func = gencubes(10)

print(next(yield_func))
print(next(yield_func))
print(next(yield_func))
print(type(gencubes(10)))
print(type(yield_func))

0
1
8
<class 'generator'>
<class 'generator'>


Since we have a generator function we don't have to keep track of every single cube we created.

Generators are best for calculating large sets of results (particularly in calculations that involve loops themselves) in cases where we don’t want to allocate the memory for all of the results at the same time.

Let's create another example generator which calculates [Fibonacci](https://en.wikipedia.org/wiki/Fibonacci_number) numbers:

In [9]:
def fib1(n):
    """
    Generate the first n values of the Fibonacci sequence
    """
    a = 0
    b = 1
    for i in range(n):
        yield a
        a,b = b,a+b

def fib2():
    current, previous = 0, 1
    while True:
        yield current
        current, previous = current + previous, current        

In [10]:
my_gen = fib1(3)
print(next(my_gen))
print(next(my_gen))
print(next(my_gen))

0
1
1


In [11]:
print(next(my_gen))

StopIteration: 

After yielding all the values, `next()` raised a `StopIteration` exception. This exception tells us that all the values have been yielded.

You might be wondering why we don’t get this error when using a for loop. A for loop automatically catches the exception and stops calling `next()`.

In [12]:
for num in fib1(4):
    print(num)

0
1
1
2
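What the for loop does behind the scenes can be sketched roughly as follows. This is an approximation written out by hand for illustration, not the interpreter's actual implementation; `fib1` is repeated so the block runs on its own.

```python
def fib1(n):
    """Yield the first n Fibonacci numbers, starting from 0."""
    a, b = 0, 1
    for _ in range(n):
        yield a
        a, b = b, a + b


collected = []
iterator = iter(fib1(4))         # a for loop starts by calling iter()
while True:
    try:
        num = next(iterator)     # then it calls next() repeatedly
    except StopIteration:        # and stops quietly once the values run out
        break
    collected.append(num)

print(collected)                 # [0, 1, 1, 2]
```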


What if this was a normal function, what would it look like?

In [13]:
def fibon(n):
    a = 1
    b = 1
    output = []
    
    for i in range(n):
        output.append(a)
        a,b = b,a+b
        
    return output

print(fibon(10))

[1, 1, 2, 3, 5, 8, 13, 21, 34, 55]


Notice that if we call it with some huge value of n (like 100000), the normal function `fibon` has to keep every single result in memory, when we actually only need the last two values to generate the next one!
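One way to take only as many values as needed from a generator is `itertools.islice` from the standard library. In the sketch below, `fib_forever` is an illustrative name for an endless generator in the style of `fib2`; at any moment it holds just two numbers, no matter how far we advance it.

```python
from itertools import islice


def fib_forever():
    """Yield Fibonacci numbers indefinitely, keeping only two values in memory."""
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b


# Take the first ten values from the endless generator
first_ten = list(islice(fib_forever(), 10))
print(first_ten)   # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]

# Skip 999 values and take the 1000th without storing the ones before it
thousandth = next(islice(fib_forever(), 999, None))
print(len(str(thousandth)), "digits")
```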

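We can also step through the list returned by the original `create_cubes(n)` function one value at a time by passing it to the built-in `iter()` function. The result is a list iterator rather than a generator, and the full list is still built in memory first.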

In [14]:
my_iter = iter(create_cubes(10))

print(next(my_iter))
print(next(my_iter))
print(next(my_iter))

print(type(create_cubes(10)))
print(type(my_iter))

0
1
8
<class 'list'>
<class 'list_iterator'>


Recall that strings are iterables.

In [15]:
string = '1234567890'

my_iterator = iter(string)
print(type(my_iterator))
print(my_iterator)
print(next(my_iterator))
print(next(my_iterator))

<class 'str_iterator'>
<str_iterator object at 0x0000018C56B5DF48>
1
2


In [16]:
s = 'hello'

# Iterate over string with a for loop
for letter in s:
    print(letter, end=' ')

h e l l o 

But that doesn't mean the string itself is an *iterator*! We can confirm this using `next()`:

In [17]:
next(s)

TypeError: 'str' object is not an iterator

Interestingly, this means that a string object supports iteration, but we cannot call `next()` on it directly as we could with a generator. The `iter()` function gives us an iterator that allows us to do just that.

We could simply do:
```python
next(iter(s))
```
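but this can be confusing and misleading, because each call creates a brand-new iterator and so always returns the first character. A better way is to store the iterator in a variable: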

In [18]:
s_iter = iter(s)

print(next(s_iter))
print(next(s_iter))
print(next(s_iter))
print(next(s_iter))
print(next(s_iter))
print(next(s_iter))

h
e
l
l
o


StopIteration: 
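The short sketch below illustrates two points about the cells above: calling `next(iter(s))` repeatedly never moves past the first character, and `next()` accepts an optional second argument that is returned instead of raising `StopIteration`. The variable names are illustrative only.

```python
s = 'hello'

# A new iterator is created on every call, so we never get past the first letter
repeated = [next(iter(s)), next(iter(s)), next(iter(s))]
print(repeated)                   # ['h', 'h', 'h']

# Storing the iterator lets it remember its position
s_iter = iter(s)
letters = [next(s_iter), next(s_iter), next(s_iter), next(s_iter), next(s_iter)]
print(letters)                    # ['h', 'e', 'l', 'l', 'o']

# The iterator is now exhausted; a default value avoids the StopIteration
after_end = next(s_iter, 'done')
print(after_end)                  # done
```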

In [19]:
gen_exp = (x for x in range(5))

print(next(gen_exp))
print(next(gen_exp))
print(next(gen_exp))

0
1
2


Now we know how to convert objects that are iterable into iterators themselves.

The main takeaway from this is that using the `yield` keyword in a function turns it into a generator function. This change can save you a lot of memory for large use cases. For more information on generators check out:

[Stack Overflow Answer](http://stackoverflow.com/questions/1756096/understanding-generators-in-python)

[Another StackOverflow Answer](http://stackoverflow.com/questions/231767/what-does-the-yield-keyword-do-in-python)

<a id='range'></a>
### <ins>A Brief Note on `range()`</ins>

In [20]:
test = iter(range(1,12,2))
print(next(test))
print(next(test))
print(next(test))
print(next(test))

1
3
5
7


In [21]:
test = [i for i in range(1,12,2)]
print(next(test))

TypeError: 'list' object is not an iterator

In [22]:
test = (i for i in range(1,12,2))
print(next(test))
print(next(test))
print(next(test))
print(next(test))

1
3
5
7
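`range()` is sometimes described as a generator, but the cells above hint at something else: like a list, a `range` object is an iterable that must be passed to `iter()` before `next()` works on it. The sketch below, with illustrative variable names, shows two further differences: a `range` can be traversed repeatedly and supports `len()` and indexing, while a generator expression is exhausted after one pass.

```python
odds_range = range(1, 12, 2)
odds_gen = (i for i in range(1, 12, 2))

# A range is a lazy sequence: it can be traversed as often as we like
first_pass = list(odds_range)
second_pass = list(odds_range)
print(first_pass == second_pass, len(odds_range), odds_range[2])   # True 6 5

# A generator can only be consumed once
gen_first_pass = list(odds_gen)
gen_second_pass = list(odds_gen)
print(gen_first_pass)    # [1, 3, 5, 7, 9, 11]
print(gen_second_pass)   # []
```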


<a id='summary'></a>
## SUMMARY
A generator in Python is a function that remembers the state it’s in, in between executions. Imagine you wanted to build a list of 100 numbers, like this one:

In [23]:
def hundred_numbers():
    nums = []
    i = 0
    while i < 100:
        nums.append(i)
        i += 1
    return nums

We could use a list comprehension and the `range()` function for this, but for now let’s assume we want to build the list by hand. We construct a list, fill it with the first 100 numbers, and then return it. We now have 100 numbers in a list called `nums`. The entire list is in the computer’s memory, taking up a small amount of space.

If we wanted 10,000,000 numbers, the list would be substantially bigger. As the count grows, the amount of memory taken up by the list also grows. A generator is used to circumvent this problem. Instead of having a list, the first time you ask the generator for a value you would get the first number, `0`. The second time you ask, you’d get `1`. Then `2`, and so on.

You have to ask for each new number when you want it (with `next()` or a for loop); that’s why it’s called a “generator”. It generates numbers (or indeed strings, or anything else you want to generate).

In [24]:
def hundred_numbers():
    num = 0
    while num < 100:
        yield num
        num += 1

The `yield` keyword is very much like a `return`, in that it gives the value back to the caller and returns execution control to them. However, the next time a value is requested from the generator, execution continues from the line right after the `yield`, instead of from the top of the function.
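One detail worth spelling out: the saved position belongs to the generator object, not to the function. The sketch below, with illustrative variable names, creates two generators from the same function and shows that each keeps its own state, while calling the function again always starts from the top.

```python
def hundred_numbers():
    num = 0
    while num < 100:
        yield num
        num += 1


gen_a = hundred_numbers()
gen_b = hundred_numbers()

a_values = [next(gen_a), next(gen_a), next(gen_a)]   # advance gen_a three times
b_values = [next(gen_b)]                             # gen_b is unaffected by gen_a
fresh = next(hundred_numbers())                      # a new generator starts at 0

print(a_values)   # [0, 1, 2]
print(b_values)   # [0]
print(fresh)      # 0
```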

We could rewrite the function as a list comprehension:
```python
hundred_numbers = [num for num in range(100)]
```

Or as a generator expression (sometimes called a generator comprehension). It behaves essentially like the generator function above, with the `yield` handled implicitly.

In [25]:
hundred_numbers = (n for n in range(100))
print(next(hundred_numbers))
print(next(hundred_numbers))

print(list(hundred_numbers))

0
1
[2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99]


Notice that in the cell above, the first `next()` runs the generator up to its first `yield`, which gives us the first value. The following `next()` resumes it, which gives us the second value. Then, turning it into a list consumes the rest and builds a list from the remaining values (only 98 values are left).

As another example, let's look at finding prime numbers by checking whether n is divisible by any number other than 1 and itself.

In [26]:
for n in range(2, 10):
    for x in range(2, n):   # checking divisors up to sqrt(n) would be enough
        if n % x == 0:  # if n is divisible by x, it means it's not a prime number.
            print(f"{n} equals {x} * {n//x}")
            break
    else:  # if n was not divisible by any x, it means it is a prime number.
        print(f"{n} is a prime number.")

2 is a prime number.
3 is a prime number.
4 equals 2 * 2
5 is a prime number.
6 equals 2 * 3
7 is a prime number.
8 equals 2 * 4
9 equals 3 * 3


In [27]:
def prime_generator(bound):
    for n in range(2, bound):
        for x in range(2, n):   
            # if n is divisible by x, it means it's not a prime number.
            if n % x == 0:  
                break
        # if n was not divisible by any x, it means it is a prime number.
        else:  
            yield n
        
g = prime_generator(10)

In [28]:
next(g)

2

In [29]:
next(g)

3
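The earlier comment about stopping at the square root can be sketched as follows. This variant is an illustration rather than part of the original lesson code: `prime_generator_sqrt` is a hypothetical name, and `math.isqrt` (available from Python 3.8) is used to get the integer square root.

```python
from math import isqrt


def prime_generator_sqrt(bound):
    """Yield the primes below bound, testing divisors only up to sqrt(n)."""
    for n in range(2, bound):
        for x in range(2, isqrt(n) + 1):
            if n % x == 0:      # found a divisor, so n is not prime
                break
        else:                   # no divisor found, so n is prime
            yield n


print(list(prime_generator_sqrt(30)))
# [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```

Any composite n has a divisor no larger than its square root, so the shorter inner loop yields the same primes with far fewer checks.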

<a id='genclass'></a>
## PYTHON GENERATOR CLASSES AND ITERATORS

Below is a class which implements `__next__` so that it behaves like a function using the `yield` keyword:

In [30]:
class FirstHundredGenerator(object):
    def __init__(self):
        self.number = 0

    def __next__(self):
        if self.number < 100:
            current = self.number
            self.number += 1
            return current
        else:
            raise StopIteration()

my_gen = FirstHundredGenerator()

print(my_gen.number)       # starting value
print(my_gen.__next__())   # 0
print(next(my_gen))        # 1
print(next(my_gen))        # 2
print(my_gen.number)       # where the generator is up to

0
0
1
2
3


Notice how the object, with its attribute, remembers what the value of `self.number` is at all points in time. In this lesson we call this object a generator because each next number is not read from a stored sequence; it is generated from the object's current state (in this case, by adding 1 to `self.number`). Strictly speaking, Python's built-in `generator` type is only produced by generator functions and generator expressions.

All objects that have this `__next__` method are called iterators. All generators are iterators, but not the other way round. For example, you could have an iterator on which you can call `next()`, but that doesn’t generate its values. Instead, it could take them from a list or from a database.

*Important*: iterators are objects which have a `__next__` method.

Here’s an example of an iterator which is not a generator:

In [31]:
class FirstFiveIterator():
    def __init__(self):
        self.numbers = [1, 2, 3, 4, 5]
        self.i = 0
    
    def __next__(self):
        if self.i < len(self.numbers):
            current = self.numbers[self.i]
            self.i += 1
            return current
        else:
            raise StopIteration()
            
test_gen = FirstFiveIterator()

print(next(test_gen))
print(next(test_gen))
print(test_gen.i)

1
2
2


As you can see it’s returning numbers that are not being generated; instead they’re being returned from a list.

If we run this code though, we will get an error:

In [32]:
sum(FirstHundredGenerator())

TypeError: 'FirstHundredGenerator' object is not iterable

Similarly if we run this code:

In [33]:
for i in FirstHundredGenerator():
    print(i)

TypeError: 'FirstHundredGenerator' object is not iterable

And that’s because in Python, an `iterator` and an `iterable` are different things. You can iterate over an `iterable`. The iterator is used to get the next value (either from a sequence or generated values).
> You can iterate over iterables; an object that only defines `__next__` is not an iterable.
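The distinction can be checked with the abstract base classes in the standard library's `collections.abc` module. The sketch below uses a hypothetical `OnlyNext` class that, like `FirstHundredGenerator` above, defines `__next__` but not `__iter__`. Built-in iterators such as the one returned by `iter()` on a list define both methods, which is why they also work in a for loop.

```python
from collections.abc import Iterable, Iterator


class OnlyNext:
    """Defines __next__ but no __iter__ (illustrative class)."""
    def __next__(self):
        raise StopIteration


numbers = [1, 2, 3]
numbers_iter = iter(numbers)

print(isinstance(numbers, Iterable), isinstance(numbers, Iterator))            # True False
print(isinstance(numbers_iter, Iterable), isinstance(numbers_iter, Iterator))  # True True
print(isinstance(OnlyNext(), Iterable), isinstance(OnlyNext(), Iterator))      # False False

total = sum(numbers_iter)   # a built-in iterator can be iterated over directly
print(total)                # 6
```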

In [34]:
class PrimeGenerator:
    def __init__(self, stop):
        self.stop = stop
        self.start = 2
        
    def __next__(self):
        # always search from current (inclusive) to stop (exclusive)
        for n in range(self.start, self.stop): 
            for x in range(2, n):   
                # not a prime
                if n % x == 0:             
                    break
            # n is a prime because we have gone through the loop without 
            # finding a divisor.
            else:     
                # next time we need to start from n+1 otherwise we will 
                # be trapped on n.
                self.start = n + 1  
                # return n for this round
                return n     
        # end of the generator reached
        raise StopIteration()    

In [35]:
my_gen1 = PrimeGenerator(10)

print(next(my_gen1))
print(next(my_gen1))

2
3


<a id='iterables'></a>
## ITERABLES IN PYTHON
**So what is an iterable?**<br>
An iterable is an object that has an `__iter__` method defined. The `__iter__` method *must return an iterator*.

Here’s an example of using our generator to make an iterable.

In [36]:
class FirstHundredGenerator:
    def __init__(self):
        self.number = 0

    def __next__(self):
        if self.number < 100:
            current = self.number
            self.number += 1
            return current
        else:
            raise StopIteration()


class FirstHundredIterable:
    def __iter__(self):
        return FirstHundredGenerator()

Now we have an iterable which uses the iterator to get the next value of the sequence it generates. We can then do this:

In [37]:
print(sum(FirstHundredIterable()))  # gives 4950

4950


In [38]:
for i in FirstHundredIterable():
    print(i, end=' ')

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 

Recall that earlier we used for loops with objects that had `__len__` and `__getitem__` defined. So how can we use a for loop with this object that doesn’t have either of those? A for loop works with any iterable. An iterable either has:

* `__getitem__` defined so that it accepts integer indexes starting at 0 and raises `IndexError` past the end (usually together with `__len__`); or
* An `__iter__` method that returns an iterator.

If you have either of those two, you have yourself an iterable.

So the `FirstHundredIterable` is returning an object of type `FirstHundredGenerator`. Inside `FirstHundredGenerator`, what is `self`?
(Hint: it’s an object, what is its type?)
(Hint hint: it’s of type `FirstHundredGenerator`.)
Knowing that, we can change the generator to this:

In [39]:
class FirstHundredGenerator:
    def __init__(self):
        self.number = 0

    def __next__(self):
        if self.number < 100:
            current = self.number
            self.number += 1
            return current
        else:
            raise StopIteration()

    def __iter__(self):
        return self

And then we don’t need a separate iterable at all. The generator itself is now both an iterator and an iterable.

In [40]:
print(sum(FirstHundredGenerator()))

4950
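One consequence of an iterator returning itself from `__iter__` is that it can only be traversed once. The sketch below contrasts it with an alternative design, an iterable whose `__iter__` is itself a generator function, so that every loop receives a fresh generator. `FirstHundred` is a hypothetical name introduced for this comparison.

```python
class FirstHundredGenerator:
    """Iterator that is also its own iterable, as in the cell above."""
    def __init__(self):
        self.number = 0

    def __next__(self):
        if self.number < 100:
            current = self.number
            self.number += 1
            return current
        raise StopIteration()

    def __iter__(self):
        return self


class FirstHundred:
    """Illustrative iterable: __iter__ is a generator function."""
    def __iter__(self):
        number = 0
        while number < 100:
            yield number
            number += 1


single_pass = FirstHundredGenerator()
single_totals = (sum(single_pass), sum(single_pass))

reusable = FirstHundred()
reusable_totals = (sum(reusable), sum(reusable))

print(single_totals)     # (4950, 0): the second pass finds the iterator exhausted
print(reusable_totals)   # (4950, 4950): each pass gets a fresh generator
```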


In [41]:
class AnotherIterable:
    def __init__(self):
        self.cars = ['ford', 'fiesta']
        
    def __len__(self):
        return len(self.cars)
    
    def __getitem__(self, i):
        return self.cars[i]
    
for car in AnotherIterable():
    print(car)

ford
fiesta
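As a side note, the for loop relies only on `__getitem__` here: Python calls it with 0, 1, 2, and so on until an `IndexError` is raised. The sketch below uses a hypothetical `Squares` class with no `__len__` at all to illustrate this fallback.

```python
class Squares:
    """Illustrative class that defines only __getitem__."""
    def __getitem__(self, i):
        if i >= 5:
            raise IndexError(i)   # signals the end of the sequence
        return i * i


squares = list(Squares())
print(squares)   # [0, 1, 4, 9, 16]

collected = []
for value in Squares():
    collected.append(value)
print(collected == squares)   # True
```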


- Iterator: used to get the next value
- Iterable: an object that can provide an iterator, which lets a for loop go over all of its values

In [42]:
my_numbers = [x for x in [1,2,3,4,5]]       # list comprehension
my_numbers_gen = (x for x in [1,2,3,4,5])   # generator comprehension 

print(next(my_numbers_gen))
print(next(my_numbers_gen))

1
2


In [43]:
print(next(my_numbers))

TypeError: 'list' object is not an iterator

In [44]:
print(next(iter(my_numbers)))

1
