# Module: Generators
Est. time: 60 min

1. What is a generator?
2. How do they work?
3. Why and when to use generators?

## What is a generator?
- Generators are an easy way to create iterators without the boilerplate of managing iteration state by hand.
- They are well suited to lazy evaluation, such as reading from a file or a database.

In [2]:
def sample_gen():
    n = 1
    print('Hi I am here!')
    yield n

    n += 1
    print('Now.. I am here')
    yield n

    n += 1
    print('Finally, I am here')
    yield n

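Call `sample_gen` and step through the result 4 times using `next()`. The first three calls return a value; the fourth raises `StopIteration` because the generator is exhausted.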

In [2]:
gen = sample_gen()

Note that `n` is remembered between calls to `next()`!
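To make the pause-and-resume behaviour concrete, here is a minimal sketch. It uses a trimmed copy of `sample_gen` (the `print` calls are left out for brevity) and stores each result in a variable; the variable names are illustrative only.

```python
def sample_gen():
    n = 1
    yield n   # execution pauses here until the next call to next()

    n += 1
    yield n

    n += 1
    yield n


gen = sample_gen()

first = next(gen)    # runs the body up to the first yield
second = next(gen)   # resumes after the first yield; n still holds its value
third = next(gen)

try:
    next(gen)        # fourth call: the function body has finished
    exhausted = False
except StopIteration:
    exhausted = True

print(first, second, third, exhausted)  # prints: 1 2 3 True
```

Each `next()` call resumes the function where it last paused, which is why the local variable `n` keeps its value.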

### Exercise
Does it work for a for loop? Try using a for loop to iterate through a generator variable. You can re-use `sample_gen` above.
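One possible way to check this is sketched below. `count_to_three` is a hypothetical stand-in for `sample_gen` so that the block runs on its own; the same pattern should apply to any generator.

```python
def count_to_three():
    """Hypothetical stand-in for sample_gen: yields 1, 2, 3."""
    n = 1
    while n <= 3:
        yield n
        n += 1


gen = count_to_three()

seen = []
for value in gen:   # the for loop calls next() for us and stops on StopIteration
    seen.append(value)

# A generator can only be consumed once: a second pass finds nothing left.
second_pass = [value for value in gen]

print(seen, second_pass)  # prints: [1, 2, 3] []
```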

### Exercise
For comparison, let's create an `Iterator` class that returns the cubes of the integers from 0 up to and including a maximum value `_max`.

In [7]:
class CubesIterator():
    def __init__(self, _max):
        self.max = _max
        self.index = 0
    
    def __iter__(self):
        return self
    
    def __next__(self):
        val = self.index
        # If val is greater than the max, raise StopIteration;
        # otherwise increment the index and return the cube.
        if val > self.max:
            raise StopIteration
        self.index += 1
        return val ** 3

In [8]:
for i in CubesIterator(5):
    print(i)
    
0, 1, 8, 27, 64, 125

Now let's do the same using `yield`, with a generator.
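A minimal sketch of what the generator version might look like. The function name `cubes` and its parameter `maximum` are assumptions chosen for illustration, and the upper bound is treated as inclusive to match the sample output for `CubesIterator(5)`.

```python
def cubes(maximum):
    """Yield the cube of every integer from 0 through maximum, inclusive."""
    for index in range(maximum + 1):
        yield index ** 3


result = list(cubes(5))
print(result)  # prints: [0, 1, 8, 27, 64, 125]
```

Compared with the class, there is no `__iter__`, no `__next__` and no explicit `StopIteration`: the loop variable holds the state, and the generator stops when the function returns.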

## Generator Expressions

In [16]:
[y ** 2 for y in range(20)]

In [17]:
(y ** 2 for y in range(20))

<generator object <genexpr> at 0x7fd4b9d2e970>

### Passing as function arguments

In [18]:
sum([y**2 for y in range(20)])

2470

In [19]:
sum(y**2 for y in range(20))

2470
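The two `sum` calls give the same result, but the list version builds all the values first while the generator expression produces them one at a time. The sketch below compares the container sizes with `sys.getsizeof`, using a larger range (an arbitrary choice) to make the gap visible. Note that `getsizeof` reports only the size of the list object itself, not the integers it refers to, so the real difference is larger still.

```python
import sys

squares_list = [y ** 2 for y in range(100_000)]   # all values built up front
squares_gen = (y ** 2 for y in range(100_000))    # nothing computed yet

list_size = sys.getsizeof(squares_list)
gen_size = sys.getsizeof(squares_gen)
print('list bytes =', list_size)
print('generator bytes =', gen_size)

# Both produce the same total; the generator is consumed in the process.
total_from_list = sum(squares_list)
total_from_gen = sum(squares_gen)
```

Exact byte counts vary by Python version and platform, so none are shown here.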

## Why and when to use generators?
1. Easy to implement
2. Memory efficient
3. Work with infinite sequences
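As an illustration of the last point, here is a minimal sketch of a generator that never finishes. The name `naturals` is an assumption for this example; `itertools.islice` from the standard library is used to take only the first few values.

```python
from itertools import islice


def naturals():
    """Yield 0, 1, 2, ... without end."""
    n = 0
    while True:
        yield n
        n += 1


# Only five values are ever computed, even though the sequence is unbounded.
first_five = list(islice(naturals(), 5))
print(first_five)  # prints: [0, 1, 2, 3, 4]
```

A list could not represent this sequence at all, since it would have to hold every value before the first one could be used.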

## Lab: Memory Efficiency

### Step 1: Let's create two files.
Create 2 files. 

- The first file should have 100 lines, with each line being empty with a probability of roughly 30%.
- The second file should be like the first, but with 100,000 lines.
- The non-empty lines can be gibberish, or something like "Hello 1", "Hello 2", etc.

(Hint: use `import random`.)
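One way this step could be approached is sketched below. The helper name `write_sample_file`, its parameters, the file names and the use of a temporary directory are all assumptions made for illustration, not part of the lab.

```python
import os
import random
import tempfile


def write_sample_file(path, line_count, empty_ratio=0.3, seed=None):
    """Write line_count lines; each is empty with probability empty_ratio."""
    rng = random.Random(seed)  # a seed makes the file reproducible
    with open(path, 'w') as handle:
        for number in range(1, line_count + 1):
            if rng.random() < empty_ratio:
                handle.write('\n')
            else:
                handle.write(f'Hello {number}\n')


# Hypothetical locations; any writable directory would do.
out_dir = tempfile.mkdtemp()
small_path = os.path.join(out_dir, 'small.txt')
large_path = os.path.join(out_dir, 'large.txt')

write_sample_file(small_path, 100, seed=1)
write_sample_file(large_path, 100_000, seed=1)
```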

### Step 2: Create a normal function that returns all non-empty lines from a file.

### Step 3: Create a generator function that does the same thing.
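Steps 2 and 3 might look something like the sketch below. The function names `read_non_empty` and `iter_non_empty` are assumptions, and the tiny temporary file exists only so that the block runs on its own.

```python
import os
import tempfile


def read_non_empty(path):
    """Normal function: build and return a list of all non-empty lines."""
    with open(path) as handle:
        return [line.rstrip('\n') for line in handle if line.strip()]


def iter_non_empty(path):
    """Generator function: yield non-empty lines one at a time."""
    with open(path) as handle:
        for line in handle:
            if line.strip():
                yield line.rstrip('\n')


# Small demo file (hypothetical content) to compare the two approaches.
fd, demo_path = tempfile.mkstemp(suffix='.txt')
with os.fdopen(fd, 'w') as handle:
    handle.write('Hello 1\n\nHello 3\n\n')

as_list = read_non_empty(demo_path)
as_generated = list(iter_non_empty(demo_path))
os.remove(demo_path)

print(as_list)       # prints: ['Hello 1', 'Hello 3']
print(as_generated)  # prints: ['Hello 1', 'Hello 3']
```

The normal function holds every matching line in memory before returning, while the generator holds only the line it is currently yielding.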

### Step 4: Let's create a function to analyze memory usage

In [20]:
import resource

In [55]:
def get_usage_stats():
    # Query once so all three figures come from the same snapshot.
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    usage = resource.getrusage(resource.RUSAGE_SELF)
    print('Peak Memory Usage =', usage.ru_maxrss)
    print('User Mode Time =', usage.ru_utime)
    print('System Mode Time =', usage.ru_stime)

In [57]:
get_usage_stats()

### Step 5: Let's create two scripts
Put each function into its own script and run each from the notebook.

```
%run -i 'normal_reader.py'
%run -i 'generator_reader.py'
```

### Step 6: Insert `get_usage_stats` into scripts
Call the usage stats function in each and see what the difference is.
