In [None]:
%run talktools

# Lazy Sequences

## Processing with List Comprehensions

**Main Problem:** Each intermediate list must be held in memory in full until the next list has been built from it

<img src="https://github.com/yardsale8/STAT489/blob/master/img/eager_evaluation_of_lists.png?raw=true">

## Why use lazy sequences

1. Keeps memory use low
    - Items are produced one at a time, on demand
2. Allows processing large data
    - Even data too big to fit in memory

<img src="https://github.com/yardsale8/STAT489/blob/master/img/lazy_evaluation_of_lists.png?raw=true">
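A rough way to see the memory difference is to compare the size of a list with the size of the equivalent generator. This is only a sketch: `sys.getsizeof` reports the container itself (for a list, its array of references), not the integers it points to, and the value of `n` is an arbitrary choice for illustration.

```python
import sys

n = 100_000  # arbitrary size, chosen only for illustration

# Eager: the list comprehension builds all n squares up front.
squares_list = [i**2 for i in range(n)]

# Lazy: the generator expression stores only its current state.
squares_gen = (i**2 for i in range(n))

list_bytes = sys.getsizeof(squares_list)  # grows with n
gen_bytes = sys.getsizeof(squares_gen)    # small, and does not grow with n

print(list_bytes, gen_bytes)
```

The exact byte counts depend on the Python version and platform, so no expected output is shown here.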

## Methods of creating lazy sequences

#### Method 1 - Use lazy constructs like `map` and `filter`

- Built-in: `map` and `filter`
- From modules
    - `itertools`
    - `toolz`
    - `more_itertools`
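As a small sketch of what the standard-library module offers, the cell below uses `count`, `islice` and `takewhile` from `itertools`. The particular streams (even numbers, squares below 50) are made up for illustration; `islice` plays a role similar to `take` from `toolz`.

```python
from itertools import count, islice, takewhile

# count() is an infinite lazy stream: 0, 1, 2, ...
evens = (n for n in count() if n % 2 == 0)

# islice pulls only as many items as requested, so the infinite stream is safe.
first_five_evens = list(islice(evens, 5))

# takewhile stops pulling items as soon as the predicate fails.
squares = map(lambda n: n * n, count())
small_squares = list(takewhile(lambda sq: sq < 50, squares))

print(first_five_evens)  # [0, 2, 4, 6, 8]
print(small_squares)     # [0, 1, 4, 9, 16, 25, 36, 49]
```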

In [46]:
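from toolz import pipe
# The curried map and filter shadow the built-ins: called with only a
# function, they return a new function that waits for the sequence.
from toolz.curried import map, filter, curry
from operator import add
from functools import reduce

# Curry reduce and add so that they also accept partial application
reduce = curry(reduce)
add = curry(add)

pipe(range(10000000),               # lazy source: no list is built
     filter(lambda n: n % 5 == 0),  # lazy: keep multiples of 5
     map(add(2)),                   # lazy: add 2 to each remaining number
     reduce(add))                   # eager: pulls items through and sums them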

9999999000000

#### Main Point - We only processed one number at a time

- An eager approach would have held millions of numbers in memory at each step.
- The only eager part was `reduce`
    - `reduce` keeps just one extra item (the accumulator, `acc`) in memory
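One way to check the "one number at a time" claim is to log each step. The sketch below uses the built-in `map` and `filter` (not the curried `toolz` versions) on a much smaller range; the helper names `is_multiple_of_5` and `add_2` and the `log` list are hypothetical, introduced only to record the order of events.

```python
from functools import reduce
from operator import add

log = []  # records the order in which items are touched

def is_multiple_of_5(n):
    log.append(f"filter {n}")
    return n % 5 == 0

def add_2(n):
    log.append(f"map {n}")
    return n + 2

# Nothing runs until reduce starts pulling items through the pipeline.
lazy_pipeline = map(add_2, filter(is_multiple_of_5, range(11)))
total = reduce(add, lazy_pipeline)

print(total)    # 21
print(log[:3])  # ['filter 0', 'map 0', 'filter 1']
```

If the steps were eager, every `filter` entry would appear in the log before the first `map` entry. Here each surviving number is mapped right after it passes the filter.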

#### Method 2 - Use generator expression

- Turn list comprehensions (*eager*) into generator expressions (*lazy*)
    - Switch `[]` to `()`

<img src="https://github.com/yardsale8/STAT489/blob/master/img/generator_expression.png?raw=true">
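A minimal side-by-side sketch of the bracket switch, using a made-up range of ten numbers. Both forms give the same total; only the list version materializes every value first.

```python
# Eager: square brackets build the whole list immediately.
eager = [i**2 for i in range(10)]

# Lazy: parentheses build a generator that computes values on demand.
lazy = (i**2 for i in range(10))

eager_total = sum(eager)
lazy_total = sum(lazy)

# When the generator expression is the only argument of a call,
# the extra pair of parentheses can be dropped.
inline_total = sum(i**2 for i in range(10))

print(type(eager).__name__, type(lazy).__name__)  # list generator
print(eager_total, lazy_total, inline_total)      # 285 285 285
```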

In [5]:
sqrs = (i**2 for i in range(10000))
doubled = (2*i for i in sqrs)
plus_2 = (i + 2 for i in doubled)  # chain off doubled, not sqrs
plus_2

<generator object <genexpr> at 0x1041a9410>

In [6]:
next(plus_2)

2

In [7]:
next(plus_2)

4

In [8]:
next(plus_2)

10

#### Note: We have now lost these three results

- We get *one pass* over a lazy stream; values that have been consumed cannot be read again
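The one-pass behavior is easy to demonstrate: a second traversal of the same generator finds nothing. Wrapping the expression in a function, as in the hypothetical `make_squares` below, is one simple way to get a fresh stream each time.

```python
squares = (i**2 for i in range(5))

first_pass = list(squares)   # consumes every item
second_pass = list(squares)  # the generator is already exhausted

print(first_pass)   # [0, 1, 4, 9, 16]
print(second_pass)  # []

# Hypothetical helper: each call builds a brand-new generator.
def make_squares():
    return (i**2 for i in range(5))

again = list(make_squares())
print(again)        # [0, 1, 4, 9, 16]
```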

#### Method 3 - Use the `yield` and `yield from` statements

- This is rarely needed
- `yield` produces a single value
- `yield from` yields each element of a sequence in turn
    - lazily, of course
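To make the "lazily" part concrete, the sketch below records when the body of a generator function actually runs. The function `numbered` and the `events` list are hypothetical names used only for this illustration.

```python
events = []  # records when the generator body runs

def numbered(seq):
    events.append("started")
    yield from seq           # hands out each element of seq, one per request
    events.append("finished")

g = numbered("ab")
before_first_next = list(events)  # calling the function runs none of its body

first = next(g)                   # runs the body up to the first yielded item
after_first_next = list(events)

rest = list(g)                    # drains the remaining items

print(before_first_next)  # []
print(first, rest)        # a ['b']
print(events)             # ['started', 'finished']
```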

In [47]:
def my_gen(seq):
    yield "hi"
    yield "I am about to do some work"
    yield "Wait for it"
    yield from seq
    yield "Wow! That was too much work!"
g = my_gen(range(3))

In [48]:
next(g)

'hi'

In [49]:
next(g)

'I am about to do some work'

In [50]:
next(g)

'Wait for it'

#### Now we `yield from range(3)`

In [41]:
next(g)

0

In [42]:
next(g)

1

In [43]:
next(g)

2

#### `yield from` is complete, now the last `yield`

In [44]:
next(g)

'Wow! That was too much work!'

#### `g` is now empty, another call to `next` causes an exception

In [45]:
next(g)

StopIteration: 
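When calling `next` by hand, the exception can be avoided by passing a default value as the second argument, and a `for` loop handles `StopIteration` on its own. A small sketch, with a made-up generator `greet`:

```python
def greet():
    yield "hi"
    yield "bye"

g = greet()
first = next(g)
second = next(g)

# With a default, next returns it instead of raising StopIteration.
fallback = next(g, "nothing left")

# Without a default, an exhausted generator raises StopIteration.
try:
    next(g)
    raised = False
except StopIteration:
    raised = True

# A for loop stops cleanly when the generator runs out.
collected = []
for word in greet():
    collected.append(word)

print(first, second, fallback)  # hi bye nothing left
print(raised)                   # True
print(collected)                # ['hi', 'bye']
```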

## Note on `next` and `yield`

- You will rarely need these
- Lazy sequences are usually consumed for you
    - by other lazy constructs such as `map` and `filter`
    - by eager consumers such as `reduce` and `consume`
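A sketch of the two eager consumers named above. The `consume` defined here is a simplified stand-in modelled on the recipe in the `itertools` documentation, not the `more_itertools` implementation, which also accepts an optional item count.

```python
from collections import deque
from functools import reduce
from operator import add

def consume(iterator):
    """Run an iterator to exhaustion, discarding every item."""
    deque(iterator, maxlen=0)  # a zero-length deque keeps nothing

seen = []
lazy = map(seen.append, range(3))  # lazy: nothing has been appended yet
seen_before = list(seen)

consume(lazy)                      # forces every step to run
seen_after = list(seen)

# reduce also drives a lazy sequence, keeping only the running total.
total = reduce(add, (i * i for i in range(4)))

print(seen_before)  # []
print(seen_after)   # [0, 1, 2]
print(total)        # 14
```

`consume` is useful when the steps are run for their side effects, while `reduce` is the choice when a single combined value is wanted.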