# Module: Generators
Est. time: 60 min

1. What is a generator?
2. How do they work?
3. Why and when to use generators?

## What is a generator?
- Generators are an easy way to create iterators without the boilerplate of a class or manual state management.
- They are well suited to lazy evaluation, for example when reading from files or a database.

In [2]:
def sample_gen():
    n = 1
    print('Hi I am here!')
    yield n

    n += 1
    print('Now.. I am here')
    yield n

    n += 1
    print('Finally, I am here')
    yield n

Call `sample_gen`, then call `next()` on the result 4 times. Once the three `yield` statements are used up, `next()` raises `StopIteration`.

In [10]:
gen = sample_gen()

In [11]:
gen

<generator object sample_gen at 0x7ff50d415d60>

In [12]:
next(gen)

Hi I am here!


1

In [13]:
next(gen)

Now.. I am here


2

In [14]:
next(gen)

Finally, I am here


3

In [15]:
next(gen)

StopIteration: 

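Note that `n` is remembered between calls! Each `next()` resumes right after the previous `yield`, and once the function body finishes, `next()` raises `StopIteration`.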
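
To see that each generator object keeps its own paused state, the minimal sketch below creates two generators from one function and advances them independently. The `counter` function is a hypothetical stand-in for `sample_gen` (without the prints). The two-argument form `next(gen, default)` is the built-in way to get a default value instead of a `StopIteration`.

```python
def counter():
    """Hypothetical stand-in for sample_gen: yields 1, 2, 3."""
    n = 1
    yield n
    n += 1
    yield n
    n += 1
    yield n


first = counter()
second = counter()

# Advance `first` twice; `second` is untouched and still at the start.
print(next(first))   # 1
print(next(first))   # 2
print(next(second))  # 1

# Drain `first`, then ask again with a default instead of catching StopIteration.
print(next(first))          # 3
print(next(first, 'done'))  # done
```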

### Exercise
Does a generator work in a `for` loop? Try iterating over a generator object with a `for` loop. You can reuse `sample_gen` from above.

In [17]:
gen = sample_gen()
for n in gen:
    print(n)
    

Hi I am here!
1
Now.. I am here
2
Finally, I am here
3
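
One thing the loop above does not show is that a generator can only be consumed once. A minimal sketch, using a hypothetical `three_numbers` generator in place of `sample_gen`:

```python
def three_numbers():
    """Hypothetical generator used only for this illustration."""
    yield 1
    yield 2
    yield 3


gen = three_numbers()

first_pass = [n for n in gen]   # consumes every value
second_pass = [n for n in gen]  # the generator is already exhausted

print(first_pass)   # [1, 2, 3]
print(second_pass)  # []

# Calling the generator function again gives a fresh generator.
print(list(three_numbers()))  # [1, 2, 3]
```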


### Exercise
For comparison, let's create an `Iterator` class that returns the cubes of the integers from 0 up to and including a maximum value.

In [33]:
class CubesIterator:
    def __init__(self, _max):
        self.max = _max
        self.index = 0

    def __iter__(self):
        # An iterator returns itself from __iter__.
        return self

    def __next__(self):
        # Stop once the index has passed the maximum (inclusive).
        if self.index > self.max:
            raise StopIteration
        val = self.index
        self.index += 1
        return val ** 3

In [34]:
for i in CubesIterator(5):
    print(i)

0
1
8
27
64
125


Now let's do the same using `yield`, with a generator.

In [35]:
def gen_cubes(max_num):
    n = 0
    while n <= max_num:
        yield n ** 3
        n += 1

In [36]:
for i in gen_cubes(5):
    print(i)

0
1
8
27
64
125
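
Both versions produce the same values because a generator object implements the same iterator protocol (`__iter__` and `__next__`) that `CubesIterator` spells out by hand. The sketch below checks this with `collections.abc.Iterator`. `gen_cubes` is repeated so the block runs on its own.

```python
from collections.abc import Iterator


def gen_cubes(max_num):
    """Yield the cubes of 0..max_num (same logic as the cell above)."""
    n = 0
    while n <= max_num:
        yield n ** 3
        n += 1


cubes = gen_cubes(5)

print(isinstance(cubes, Iterator))  # True
print(iter(cubes) is cubes)         # True: a generator is its own iterator
print(list(cubes))                  # [0, 1, 8, 27, 64, 125]
```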


## Generator Expressions

A generator expression uses the same syntax as a list comprehension, but with parentheses instead of square brackets. It produces its values lazily instead of building the whole list up front.

In [37]:
[y ** 2 for y in range(20)]

[0,
 1,
 4,
 9,
 16,
 25,
 36,
 49,
 64,
 81,
 100,
 121,
 144,
 169,
 196,
 225,
 256,
 289,
 324,
 361]

In [38]:
(y ** 2 for y in range(20))

<generator object <genexpr> at 0x7ff50d4887b0>

### Passing as function arguments

In [18]:
sum([y**2 for y in range(20)])

2470

In [19]:
sum(y**2 for y in range(20))

2470
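
When a generator expression is the only argument to a call, its own parentheses can be dropped, as in the `sum` call above. With a second argument they are required. A short sketch using only built-ins:

```python
# Sole argument: the call's parentheses double as the generator's.
total = sum(y ** 2 for y in range(20))
print(total)  # 2470

# With a second argument (here sum's `start` value), the generator
# expression needs its own parentheses.
total_from_ten = sum((y ** 2 for y in range(20)), 10)
print(total_from_ten)  # 2480

# any() stops pulling values as soon as it sees a truthy one, so the
# remaining squares are never computed.
has_big_square = any(y ** 2 > 100 for y in range(20))
print(has_big_square)  # True
```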

## Why and when to use generators?
1. Easy to implement
2. Memory efficient
3. Can represent infinite sequences
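
Points 2 and 3 can be illustrated with a small sketch. It compares `sys.getsizeof` for a list and a generator expression (exact byte counts vary by Python version and platform, so none are shown), then takes a few values from a never-ending generator with `itertools.islice`. The name `naturals` is made up for this example.

```python
import itertools
import sys

# A list stores every element; a generator stores only its paused state.
squares_list = [y ** 2 for y in range(100_000)]
squares_gen = (y ** 2 for y in range(100_000))

print(sys.getsizeof(squares_list) > sys.getsizeof(squares_gen))  # True


def naturals():
    """Hypothetical infinite generator: 0, 1, 2, ... without end."""
    n = 0
    while True:
        yield n
        n += 1


# islice takes a finite slice, so the infinite loop is never run to completion.
first_five = list(itertools.islice(naturals(), 5))
print(first_five)  # [0, 1, 2, 3, 4]
```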

## (Optional) Lab: Memory Efficiency

### Step 1: Let's create two files.
Create two files:

- The first file should have 100 lines, with roughly 30% of them empty, chosen at random.
- The second file should be like the first but with 100,000 lines.
- The non-empty lines can be gibberish, or something like "Hello 1", "Hello 2", etc.

(Hint: use `import random`.)
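
One possible way to approach this step is sketched below. The function name `write_sample_file` and its parameters are assumptions made for illustration, not part of the lab.

```python
import random


def write_sample_file(path, num_lines, empty_ratio=0.3, seed=None):
    """Write num_lines lines to path; each is empty with probability empty_ratio."""
    rng = random.Random(seed)  # passing a seed makes the file reproducible
    with open(path, 'w') as f:
        for i in range(1, num_lines + 1):
            if rng.random() < empty_ratio:
                f.write('\n')            # empty line
            else:
                f.write(f'Hello {i}\n')  # non-empty line
```

Calling `write_sample_file('small.txt', 100)` and `write_sample_file('large.txt', 100_000)` would then produce the two files. The file names are placeholders.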

### Step 2: Create a normal function that returns all non-empty lines from a file.
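
A minimal sketch of such a function, assuming that "empty" means a line containing only a newline. The name `read_non_empty_lines` is hypothetical.

```python
def read_non_empty_lines(path):
    """Return a list of all non-empty lines in the file at path."""
    result = []
    with open(path) as f:
        for line in f:
            stripped = line.rstrip('\n')
            if stripped:  # skip empty lines
                result.append(stripped)
    return result  # the whole list is held in memory at this point
```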

### Step 3: Create a generator function that does the same thing.
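
A sketch of the generator version under the same assumption about what counts as an empty line. The name `iter_non_empty_lines` is hypothetical.

```python
def iter_non_empty_lines(path):
    """Yield the non-empty lines of the file at path, one at a time."""
    with open(path) as f:
        for line in f:
            stripped = line.rstrip('\n')
            if stripped:  # skip empty lines
                yield stripped
```

Note that the file stays open while the generator is only partly consumed. It is closed once the generator is exhausted or closed.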

### Step 4: Let's create a function to analyze memory usage

In [20]:
import resource

In [55]:
def get_usage_stats():
    # Query the OS once; `resource` is only available on Unix-like systems.
    usage = resource.getrusage(resource.RUSAGE_SELF)
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    print('Peak Memory Usage =', usage.ru_maxrss)
    print('User Mode Time =', usage.ru_utime)
    print('System Mode Time =', usage.ru_stime)

In [57]:
get_usage_stats()

### Step 5: Let's create two scripts
Put each function into its own script and run each from the notebook.

```
%run -i 'normal_reader.py'
%run -i 'generator_reader.py'
```

### Step 6: Insert `get_usage_stats` into scripts
Call `get_usage_stats` in each script and compare the output.
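
Keep in mind that `ru_maxrss` is a high-water mark for the whole process, so when both scripts run in the same notebook kernel the second reading cannot be lower than the first. As a cross-platform alternative, which is not the lab's prescribed method, the standard-library `tracemalloc` module can measure the peak of Python allocations for each approach separately. The sketch below is self-contained: it builds its own temporary file, and `count_with_list`, `count_with_generator` and `peak_bytes` are hypothetical helpers.

```python
import os
import tempfile
import tracemalloc


def count_with_list(path):
    """Load every non-empty line into a list, then count them."""
    with open(path) as f:
        lines = [line for line in f if line.strip()]
    return len(lines)


def count_with_generator(path):
    """Count non-empty lines while holding only one line at a time."""
    with open(path) as f:
        return sum(1 for line in f if line.strip())


def peak_bytes(func, path):
    """Return (result, peak traced memory in bytes) for func(path)."""
    tracemalloc.start()
    try:
        result = func(path)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak


# Build a throwaway file in which every third line is empty.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    for i in range(30_000):
        tmp.write('\n' if i % 3 == 0 else f'Hello {i}\n')
    sample_path = tmp.name

list_count, list_peak = peak_bytes(count_with_list, sample_path)
gen_count, gen_peak = peak_bytes(count_with_generator, sample_path)
os.remove(sample_path)

print(list_count == gen_count)  # True: both count the same lines
print(list_peak > gen_peak)     # the list version should peak higher
```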
